[Pgpool-general] Several questions about pgpool

Taiki Yamaguchi yamaguchi at sraoss.co.jp
Thu Aug 10 08:22:01 UTC 2006



Michael Ulitskiy wrote:
> On Tuesday 08 August 2006 09:51 pm, Taiki Yamaguchi wrote:
>> Hi,
>>
>> Michael Ulitskiy wrote:
>>> Hello,
>>>
>>> I'm using pgpool 3.1 and recently noticed several behaviors that
>>> I'm not sure are bugs or intended. I'd appreciate it if someone
>>> could enlighten me on that.
>>> 1. Will pgpool fork additional children when the number of available children is exhausted?
>>> I was under the impression it would, but testing shows otherwise. Connecting clients just block
>>> until another client disconnects and a pgpool process becomes available.
>> No. pgpool only forks children up to num_init_children. If there is no 
>> available child left, other connections are queued until one of the 
>> children becomes available.
>>
>>> 2. I thought each pgpool process could serve up to "max_pool" different usernames, multiplexing
>>> queries from those usernames. My testing shows it doesn't work that way. I set
>>> num_init_children=4
>>> max_pool=4
>>> so I expected to be able to connect 16 clients using 4 different usernames, but after 4 clients
>>> connect (with different usernames) the 5th blocks, waiting for an available pgpool process.
>>> Is my understanding wrong? What is the purpose of the max_pool parameter then?
>> A pgpool child process *pools* up to max_pool connections (keyed by 
>> username, database, and protocol version). In your case, pgpool will 
>> pool at most 16 connections, but only 4 clients can connect concurrently.
>>
>>> 3. I'm using load-balancing mode with replication done by Slony-I. It seems that when a client
>>> connects, a pgpool child opens connections to both the primary and secondary backends,
>>> which means that the number of concurrently served clients does not depend on the number
>>> of backends, right? In other words, if I have num_init_children=32 then I won't be able to
>>> connect more than 32 clients, regardless of the number of backends I'm using?
>> As I answered in question 1, pgpool serves at most num_init_children 
>> connections at a time. The number of backends has nothing to do with it.
>>
>>> 4. This is more of a feature request; I have no idea how hard it would be or whether it's possible
>>> in the pgpool architecture. I'm expecting to have hundreds of clients connecting through pgpool
>>> to the database cluster behind it. Some connections will be short-lived (several seconds),
>>> others relatively long-lived (minutes to hours), and others permanent (daemons).
>>> It would be really nice if pgpool children could process more than one connection, multiplexing
>>> queries from different connections. Is that possible?
>> I am not sure why you need this feature. pgpool can handle hundreds of 
>> clients if you set num_init_children to suit your application. If a 
>> short-lived client disconnects, the child process that was serving that 
>> client becomes available for the next incoming connection.
> 
> Thanks for the info.

No problem :)

> I need this feature because at the moment I already have around 80 idle postmaster processes
> on each of my 2 backends, and this number is expected to grow significantly. When I say "idle"
> I mean the client is still connected to the database but is doing some other work. I don't think
> having hundreds of idle postmasters is good.
> Currently I have 2 choices: disconnect after every query, or implement a connection manager
> either on the application side or in middleware. Since pgpool is actually a kind of middleware,
> I think it's very logical for it to have connection-manager functionality.
> I'd imagine it like this: 
> - when a client connection comes in, pgpool checks whether there's another connection
>   with the same username/database. If not, open connections to the backends; if so, do nothing and wait for a query.

pgpool already does this. The only thing that differs from what you are 
probably expecting is that a pgpool child process only checks for an 
existing connection with the same username, database, and protocol version 
within itself, not in other child processes.
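
To put numbers on this, here is a rough sketch of the relevant part of
pgpool.conf, using the values from your test (not a complete config, just
an illustration of the arithmetic):

    num_init_children = 4   # 4 children -> at most 4 clients served at once
    max_pool = 4            # each child caches up to 4 backend connections,
                            # keyed by (username, database, protocol version)

In the worst case pgpool ends up caching 4 * 4 = 16 backend connections,
but a 5th client still waits until one of the 4 children is free, because
each child serves exactly one client at a time.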

> - when a client query comes in, check whether there are any currently idle backend connections with the
>   same credentials. If so, use one; if not, open new connections and use those.

I think it is faster to have another child process handle the new 
incoming connection than to do that check and have a single child 
process handle multiple client connections. Think about the case where 
a daemon starts issuing queries while its child is busy with another, 
time-consuming connection: the daemon's queries would have to wait 
behind it.

> Also, it would be very nice for pgpool to fork additional children as needed, as I believe it's always
> good for an application to be able to adapt to run-time conditions instead of depending solely on
> manual configuration.

pgpool adopts a pre-fork model in which all children are created when 
pgpool starts, just like apache2. The reason is that the overhead of 
forking additional processes degrades performance, and many applications 
connect to the backend thousands of times while each connection lives 
only for a short period of time.

Moreover, others may want to prevent an unlimited number of client 
connections from being made to the backend. To control that, you need a 
parameter limiting the maximum number of backend connections anyway.
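
If you want to reason about that limit, here is a back-of-the-envelope
sketch using the num_init_children=32 and max_pool=4 figures from your
earlier questions (the exact headroom you need depends on your setup):

    num_init_children = 32   # hard cap on concurrent client connections
    max_pool = 4             # cached backend connections per child

    # Each backend can end up holding up to num_init_children * max_pool
    # (= 128 here) connections, so max_connections on each PostgreSQL
    # backend should be sized with that product in mind.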

--
Taiki Yamaguchi

> Again, I have no idea how hard that would be to implement. I realize that complexity and/or the pgpool
> architecture may prohibit it, but I believe this would be a very nice and logical addition.
> What do you think?
> 
> Michael
> 
>> --
>> Taiki Yamaguchi
>>
>>> Thanks,
>>> Michael 


