[pgpool-hackers: 3713] Re: discuss of pgpool enhancement

Tue Jul 14 20:40:47 JST 2020

> Hi Jianshen,
> 
> I think it is a very good idea to have on-demand spawning of the
> child processes and it will enable us to effectively configure the
> Pgpool-II in environments where we get instantaneous connection spikes.
> Currently we have to configure the Pgpool's num_init_children to a
> value of "maximum number of connections expected" and most of the
> time, 50 to 60 percent of child processes keep sitting idle and
> consuming system resources.
> 
> Similarly, I also agree with having an option of the global connection
> pool and I believe that will enable us to have fewer open backend
> connections, and in the future we can build different types of pooling
> options on top of it, like transaction pooling and similar features.

Not sure about this. Suppose we have num_init_children = 4 and max_pool
= 1. We have 4 users u1, u2, u3 and u4. For simplicity, all those
users connect to the same database.

Current Pgpool-II:
1. u1 connects to pgpool child p1 and creates connection pool p_u1.
2. u2 connects to pgpool child p2 and creates connection pool p_u2.
3. u3 connects to pgpool child p3 and creates connection pool p_u3.
4. u4 connects to pgpool child p4 and creates connection pool p_u4.

Pgpool-II with global connection pooling:
1. u1 connects to pgpool child p1 and creates connection pool p_u1.
2. u2 connects to pgpool child p2 and creates connection pool p_u2.
3. u3 connects to pgpool child p3 and creates connection pool p_u3.
4. u4 connects to pgpool child p4 and creates connection pool p_u4.

So there's no difference with/without global connection pooling in the
end.

The case where global connection pooling wins would be when the number
of distinct users and concurrent connections is much smaller than
num_init_children. For example, if there's only one user u1 and
only one concurrent connection, we will have:

Current Pgpool-II:
1. u1 connects to pgpool child p1 and creates connection pool p_u1.
2. u1 connects to pgpool child p2 and creates connection pool p_u2.
3. u1 connects to pgpool child p3 and creates connection pool p_u3.
4. u1 connects to pgpool child p4 and creates connection pool p_u4.

Pgpool-II with global connection pooling:
1. u1 connects to pgpool child p1 and creates connection pool p_u1.
2. u1 connects to pgpool child p2 and reuses connection pool p_u1.
3. u1 connects to pgpool child p3 and reuses connection pool p_u1.
4. u1 connects to pgpool child p4 and reuses connection pool p_u1.

Thus the global connection pool wins by having only 1 connection pool
when the number of distinct users and concurrent connections is much
smaller than num_init_children.

But the question is, if we have only 1 concurrent session,
num_init_children = 1 would be enough in the first place. In that case
we get a similar result with the current Pgpool-II:

1. u1 connects to pgpool child p1 and creates connection pool p_u1.
2. u1 connects to pgpool child p1 and reuses connection pool p_u1.
3. u1 connects to pgpool child p1 and reuses connection pool p_u1.
4. u1 connects to pgpool child p1 and reuses connection pool p_u1.

So there's no point in having a global connection pool here.
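
To make the counting above easy to re-check, here is a small toy model
in Python. It is only a sketch and not Pgpool-II code: the function name
pools_created is made up for illustration, sessions are assumed to land
on children in round-robin order (a simplification of how sessions are
really distributed), and pool eviction (max_pool) is not modeled.

```python
def pools_created(sessions, num_init_children, global_pool):
    """Count distinct connection pools created for a sequence of sessions.

    sessions: list of (user, database) pairs in arrival order.
    Assumption: session i is served by child i % num_init_children.
    """
    pools = set()
    for i, (user, database) in enumerate(sessions):
        child = i % num_init_children
        if global_pool:
            # One shared pool per (user, database), whichever child serves it.
            key = (user, database)
        else:
            # Each child keeps its own private pool per (user, database).
            key = (child, user, database)
        pools.add(key)
    return len(pools)


four_users = [("u1", "db"), ("u2", "db"), ("u3", "db"), ("u4", "db")]
one_user = [("u1", "db")] * 4

print(pools_created(four_users, 4, global_pool=False))  # 4
print(pools_created(four_users, 4, global_pool=True))   # 4
print(pools_created(one_user, 4, global_pool=False))    # 4
print(pools_created(one_user, 4, global_pool=True))     # 1
print(pools_created(one_user, 1, global_pool=False))    # 1
```

The last two lines are the point: with one user and one concurrent
session, the global pool and num_init_children = 1 both end up with a
single pool.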

> IMHO we should take both of these features as a separate project.
> We can start with on-demand child spawning feature and once we have
> that in Pgpool-II we build the global connection pool option on top of that.
> 
> So if you are interested in working on that, you can send a proposal and
> include details such as how you are planning to manage the
> child-process-pool and when Pgpool-II will spawn and destroy the child
> processes. My idea would be to make the child-process-pool as
> configurable as possible.
> Some of the configuration parameters I can think of for the purpose are:
> 
> CPP_batch_size          /* number of child processes to spawn when
>                          * required */
> 
> CPP_downscale_trigger   /* number of idle child processes at which
>                          * Pgpool-II starts killing idle child processes */
> 
> CPP_upscale_trigger     /* number of idle child processes at which
>                          * Pgpool-II starts spawning new child processes */
> 
> CPP here stands for CHILD-PROCESS-POOL. These are just my thoughts, and
> you may want to choose different names and/or different types of
> configurations altogether.

Apache already has similar parameters:

-------------------------------------------------------------------------
# prefork MPM
# StartServers: number of server processes to start
# MinSpareServers: minimum number of server processes which are kept spare
# MaxSpareServers: maximum number of server processes which are kept spare
# MaxRequestWorkers: maximum number of server processes allowed to start

	StartServers          5
	MinSpareServers       5
	MaxSpareServers      10
	MaxRequestWorkers   150
-------------------------------------------------------------------------

I think our num_init_children looks similar to StartServers. So all we
need to add are equivalents of MinSpareServers, MaxSpareServers, and
MaxRequestWorkers. (Probably we should rename them to more appropriate
ones.)
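
As a rough illustration of how such parameters could drive the child
process count, here is a minimal Python sketch. The names adjust_children,
min_spare, max_spare and max_children are hypothetical; this is neither
Apache's actual algorithm nor existing Pgpool-II behavior, only the basic
idea of keeping the number of idle children within a band.

```python
def adjust_children(total, idle, min_spare, max_spare, max_children):
    """Return how many children to spawn (positive) or kill (negative).

    total:        children currently running (busy + idle)
    idle:         children currently waiting for a client
    min_spare:    spawn when idle drops below this (cf. MinSpareServers)
    max_spare:    kill when idle rises above this (cf. MaxSpareServers)
    max_children: hard upper limit on total (cf. MaxRequestWorkers)
    """
    if idle < min_spare:
        wanted = min_spare - idle
        # Never go above the hard limit, and never return a negative spawn.
        headroom = max(0, max_children - total)
        return min(wanted, headroom)
    if idle > max_spare:
        # Only idle children are candidates for removal.
        return -(idle - max_spare)
    return 0


# Using the Apache values quoted above (5 / 10 / 150):
print(adjust_children(total=5, idle=2, min_spare=5, max_spare=10, max_children=150))    # 3
print(adjust_children(total=20, idle=15, min_spare=5, max_spare=10, max_children=150))  # -5
print(adjust_children(total=150, idle=0, min_spare=5, max_spare=10, max_children=150))  # 0
```

A real implementation would also need to decide how often to run this
check and how quickly to ramp up, which the sketch leaves open.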

> Looking forward to getting an actual proposal.
> 
> Thanks
> Best regards
> Muhammad Usama
> 
> 
> 
> On Mon, Jul 13, 2020 at 2:56 PM 周建身 <zhoujianshen at highgo.com> wrote:
> 
>> Hello Usama and Hackers,
>>
>>     I have tested the pgpool connection pool, and I think some parts
>> need to be enhanced.
>>
>>     When you set the parameter num_init_children = 32, only 32 child
>> processes are forked to wait for connections from clients. Each
>> child process can only receive one client connection; therefore, only 32
>> clients can connect to pgpool at the same time. Any extra connections
>> can only wait for a previous connection to be disconnected. So, can we
>> change the waiting connection structure of pgpool? When pgpool starts,
>> we can fork ten child processes to wait for clients to connect. When a
>> child process receives a connection request, it creates a new child
>> process to maintain the session connection.
>>
>>     Another part that should be enhanced is the connection pool. Now,
>> for each connection, the child process can only receive one client
>> connection. Therefore, the connection pool in the child process does not
>> achieve global reuse, and each connection will re-initialize the
>> connection pool. Therefore we should implement a global connection pool
>> to achieve backend reuse. However, we should decide how many
>> connections the global connection pool should maintain. We should also
>> decide how to respond to the arrival of new connections when the
>> connection pool is full. I can come up with two kinds of solutions.
>>
>>     The first one is to wait until a connection in the connection pool
>> is disconnected, and then receive the new connection. The second one is
>> to check the number of connections and the last access time of each
>> connection in the connection pool, and replace the connection with the
>> oldest access time with the new connection. Or we periodically check
>> the access time of each connection and throw away the connections whose
>> access time exceeds a certain value; then we can use the freed space in
>> the connection pool.
>>
>>     In my opinion, these two aspects need to be enhanced. What is your
>> opinion? And what do you think we need to do to enhance these two
>> aspects? Any suggestions and comments are welcome.
>>
>>
>>
>> Thanks
>> Best regards
>> Zhou Jianshen
>> zhoujianshen at highgo.com