[pgpool-hackers: 4192] Re: Dynamic spare process management of Pgpool-II children

Tatsuo Ishii ishii at sraoss.co.jp
Wed Sep 14 06:48:57 JST 2022


Hi Usama,

After applying the patch, I ran the regression tests and encountered 7
timeouts:

testing 008.dbredirect...timeout.
testing 018.detach_primary...timeout.
testing 033.prefer_lower_standby_delay...timeout.
testing 034.promote_node...timeout.
testing 075.detach_primary_left_down_node...timeout.
testing 076.copy_hang...timeout.
testing 077.invalid_failover_node...timeout.

> Hi Ishii San,
> 
> Please find the rebased version attached.
> 
> Best regards
> Muhammad Usama
> 
> On Tue, Sep 13, 2022 at 9:06 AM Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
> 
>> Hi Usama,
>>
>> > Thanks!
>> >
>> > I will look into this and get back to you.
>>
>> Unfortunately, your patch no longer applies because of a recent
>> commit.  Can you please rebase it?
> 
> 
>> $ git apply ~/dynamic_spare_process_management.diff
>> error: patch failed: src/main/pgpool_main.c:115
>> error: src/main/pgpool_main.c: patch does not apply
>>
>> > Best regards,
>> > --
>> > Tatsuo Ishii
>> > SRA OSS LLC
>> > English: http://www.sraoss.co.jp/index_en/
>> > Japanese:http://www.sraoss.co.jp
>> >
>> >> Hi Hackers.
>> >>
>> >> A few years back we had a discussion about implementing on-demand child
>> >> process spawning, and "zhoujianshen at highgo.com" shared a patch for that.
>> >> Ref:
>> >> https://www.sraoss.jp/pipermail/pgpool-hackers/2020-September/003831.html
>> >>
>> >> The patch had a few issues and open review comments, and somehow it never
>> >> made it to a committable state. So I decided to take it up and rework it.
>> >>
>> >> A little background:
>> >> The motivation behind this feature is that, when deciding the value of the
>> >> num_init_children configuration, the administrator has to figure out the
>> >> maximum number of concurrent client connections the setup needs to support,
>> >> even if that maximum is only hit once a day or once a month depending on
>> >> the type of setup, while 90% of the time only 5-10% of the connections are
>> >> actually needed. Because Pgpool-II always spawns num_init_children child
>> >> processes at startup, in such setups a large number of child processes sit
>> >> idle most of the time and consume system resources. This approach is
>> >> suboptimal in terms of resource usage and in some cases also causes
>> >> problems like the 'thundering herd' (although we do have serialize_accept
>> >> to work around that).
>> >>
>> >> So the idea is to keep the spare child processes (processes sitting idle
>> >> in the 'waiting for connection' state) within configured limits, and to
>> >> scale the number of child processes up or down depending on the connected
>> >> client count.
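>> >>
>> >> To illustrate the intended behaviour, here is a rough sketch of the
>> >> scaling decision the main process would make; the names and structure
>> >> below are illustrative only and do not reflect the actual patch code:
>> >>
>> >> /* Illustrative sketch only -- not the actual patch code. */
>> >> typedef struct
>> >> {
>> >>     int num_init_children;  /* upper bound on child processes */
>> >>     int min_spare_children; /* lower bound on idle children   */
>> >>     int max_spare_children; /* upper bound on idle children   */
>> >> } ScalingConfig;
>> >>
>> >> /*
>> >>  * Return how many children to fork (positive) or retire (negative),
>> >>  * given the current totals.  Never exceeds num_init_children.
>> >>  */
>> >> static int
>> >> scaling_delta(const ScalingConfig *cfg, int total_children,
>> >>               int connected_children)
>> >> {
>> >>     int spare = total_children - connected_children;
>> >>
>> >>     if (spare < cfg->min_spare_children)
>> >>     {
>> >>         int want = cfg->min_spare_children - spare;
>> >>         int room = cfg->num_init_children - total_children;
>> >>
>> >>         return (want < room) ? want : room;         /* scale up */
>> >>     }
>> >>     if (spare > cfg->max_spare_children)
>> >>         return -(spare - cfg->max_spare_children);  /* scale down */
>> >>     return 0;                                       /* within limits */
>> >> }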
>> >>
>> >> Attached re-worked patch:
>> >> The original patch had a few shortcomings, but my biggest concern was the
>> >> approach it used to scale down child processes. IMHO the patch was too
>> >> aggressive in bringing down child processes once the spare count exceeded
>> >> max_spare_child, and the victim process identification was not smart.
>> >> Secondly, the responsibility for managing the spare children was not
>> >> properly segregated and was shared between the main and child processes.
>> >>
>> >> So I took up the patch and basically redesigned it from the ground up. The
>> >> attached version gives the main process the responsibility of keeping track
>> >> of spare processes and scaling them, and also implements three scale-down
>> >> strategies. On top of that, it adds a switch that can be used to turn off
>> >> this auto-scaling feature and restore the current behaviour.
>> >>
>> >> Moreover, instead of adding a new configuration parameter (max_children, as
>> >> in the original patch), the attached one uses the existing
>> >> num_init_children config to keep backward compatibility.
>> >>
>> >> To summarise, the patch adds the following new config parameters to
>> >> control the process scaling (an example pgpool.conf fragment follows
>> >> this list):
>> >>
>> >> -- process_management_mode (default = static)
>> >> Can be set to either static or dynamic. static keeps the current
>> >> behaviour, while dynamic enables auto scaling of spare processes.
>> >>
>> >> -- process_management_strategy (default = gentle)
>> >> Configures the process management strategy used to satisfy the spare
>> >> process count. Valid options:
>> >> lazy:
>> >> Scale-down is performed gradually and only gets triggered when the
>> >> excessive spare process count stays high for more than 5 minutes.
>> >> gentle:
>> >> Scale-down is performed gradually and only gets triggered when the
>> >> excessive spare process count stays high for more than 2 minutes.
>> >> aggressive:
>> >> Scale-down is performed aggressively and gets triggered more frequently
>> >> when there are too many spare processes. This mode uses a faster but
>> >> slightly less smart selection criterion to identify the child processes
>> >> that can be terminated to satisfy max_spare_children.
>> >>
>> >> -- min_spare_children
>> >> Minimum number of spare child processes to keep in the 'waiting for
>> >> connection' state. Only effective in dynamic process management mode.
>> >>
>> >> -- max_spare_children
>> >> Maximum number of spare child processes to keep in the 'waiting for
>> >> connection' state. Only effective in dynamic process management mode.
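>> >>
>> >> For example, a pgpool.conf fragment using these parameters could look like
>> >> the following (the values here are purely illustrative):
>> >>
>> >> process_management_mode = dynamic
>> >> process_management_strategy = gentle
>> >> num_init_children = 200       # upper bound on child processes
>> >> min_spare_children = 5        # keep at least 5 idle children waiting
>> >> max_spare_children = 20       # scale down when more than 20 sit idle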
>> >>
>> >> Furthermore, the patch relies on the existing conn_counter to keep track of
>> >> the connected children count, which means it adds no extra overhead for
>> >> computing that information. The documentation updates are not yet part of
>> >> the patch; I will add them once we have an agreement on the approach and
>> >> the usability of the feature.
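>> >>
>> >> As a simplified illustration of that bookkeeping (the structure and names
>> >> below are hypothetical and do not match the actual pgpool-II shared memory
>> >> layout), the main process can derive the spare count by walking per-child
>> >> connection counters:
>> >>
>> >> #include <sys/types.h>  /* pid_t */
>> >>
>> >> /* Hypothetical per-child slot -- for illustration only. */
>> >> typedef struct
>> >> {
>> >>     pid_t pid;          /* child process id, 0 if the slot is unused */
>> >>     int   conn_counter; /* frontend connections currently handled    */
>> >> } ChildSlot;
>> >>
>> >> /* Count children that are alive but not serving any client. */
>> >> static int
>> >> count_spare_children(const ChildSlot *slots, int num_slots)
>> >> {
>> >>     int spare = 0;
>> >>
>> >>     for (int i = 0; i < num_slots; i++)
>> >>     {
>> >>         if (slots[i].pid != 0 && slots[i].conn_counter == 0)
>> >>             spare++;
>> >>     }
>> >>     return spare;
>> >> }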
>> >>
>> >>
>> >> Meanwhile, I am trying to figure out a way to benchmark whether this
>> >> feature adds any performance benefit, but I have not yet found a good way
>> >> to do that. Any suggestions on this topic are welcome.
>> >>
>> >> Thanks
>> >> Best regards
>> >> Muhammad Usama
>> > _______________________________________________
>> > pgpool-hackers mailing list
>> > pgpool-hackers at pgpool.net
>> > http://www.pgpool.net/mailman/listinfo/pgpool-hackers
>>

