[pgpool-hackers: 4246] Re: Issue with failover_require_consensus

Tatsuo Ishii ishii at sraoss.co.jp
Sat Dec 17 21:01:19 JST 2022


>>> On Tue, Nov 29, 2022 at 3:27 AM Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
>>> 
>>>> >> Hi Ishii-San
>>>> >>
>>>> >> Sorry for the delayed response.
>>>> >
>>>> > No problem.
>>>> >
>>>> >> With the attached fix I guess the failover objects will linger on
>>>> forever
>>>> >> in case of a false alarm by a health check or small glitch.
>>>> >
>>>> > That's not good.
>>>> >
>>>> >> One way to get around the issue could be to compute
>>>> >> FAILOVER_COMMAND_FINISH_TIMEOUT based on the maximum value
>>>> >> of health_check_period across the cluster,
>>>> >> something like: failover_command_finish_timeout =
>>>> >>   max(health_check_period) * 2 = 60
>>>>
>>>> After thinking more, I think we need to take into account
>>>> health_check_max_retries and health_check_retry_delay as
>>>> well, i.e. instead of max(health_check_period), something like:
>>>> max(health_check_period + (health_check_retry_delay *
>>>> health_check_max_retries)).
>>>>
>>>> What do you think?
>>>>
>>> 
>>> Thanks for the valuable suggestions.
>>> Can you try out the attached patch to see if it solves the issue?
>> 
>> Unfortunately the patch did not pass my test case.
>> 
>> - 3 watchdog nodes and 2 PostgreSQL servers, streaming replication
>>   cluster (created by watchdog_setup). pgpool0 is the watchdog leader.
>> 
>> - health_check_period = 300, health_check_max_retries = 0
>> 
>> - pgpool1 starts 120 seconds after pgpool0 starts
>> 
>> - pgpool2 does not start
>> 
>> - after the watchdog cluster becomes ready, shut down PostgreSQL node 1 (standby).
>> 
>> - wait for 600 seconds, expecting a failover to happen.
>> 
>> Unfortunately failover did not happen.
>> 
>> Attached is the test script and pgpool0 log.
>> 
>> To run the test:
>> 
>> - unpack test.tar.gz
>> 
>> - run prepare.sh
>>   $ sh prepare.sh
>>   This should create the "testdir" directory with a 3-watchdog-node, 2-PostgreSQL-node cluster.
>> 
>> - cd testdir and run the test
>>   $ sh ../start.sh -o 120
>>   This will start the test; "-o" specifies how long to wait before starting pgpool1.
> 
> After the test failure, I examined the pgpool log on the pgpool leader
> node (node 0). It seems the timeout was not updated as expected.
> 
> 2022-12-17 08:07:11.419: watchdog pid 707483: LOG:  failover request from 1 nodes with ID:42 is expired
> 2022-12-17 08:07:11.419: watchdog pid 707483: DETAIL:  marking the failover object for removal. timeout: 15
> 
> After looking into the code, I found update_failover_timeout() only
> examines "health_check_period".  I think you need to examine
> "health_check_period0" etc. as well and find the largest one for the
> timeout calculation.
> 
> By the way,
> 
>> failover_command_timout
>> g_cluster.failover_command_timout
> 
> I think "timout" should be "timeout".

I was trying to create a proof of concept patch for this:

> After looking into the code, I found update_failover_timeout() only
> examines "health_check_period".  I think you need to examine
> "health_check_period0" etc. as well and find the largest one for the
> timeout calculation.

and noticed that update_failover_timeout() is called by
standard_packet_processor() when a WD_POOL_CONFIG_DATA packet is
received, like: update_failover_timeout(wdNode, standby_config); but
standby_config is created by get_pool_config_from_json(), which does
not seem to create health_check_params in the pool_config data. Also I
wonder why standard_packet_processor() needs to be called when
WD_POOL_CONFIG_DATA is received. Can't we simply omit the call to
update_failover_timeout() in this case?
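
For reference, here is a minimal sketch (plain C, not the actual pgpool-II
code) of the timeout calculation discussed above: take the largest value of
health_check_period + health_check_retry_delay * health_check_max_retries
across all nodes. The struct and function names below are made up for
illustration only and do not correspond to pgpool-II's internal API.

#include <stdio.h>

/* Hypothetical per-node health check parameters (illustration only). */
typedef struct
{
    int health_check_period;       /* seconds between health checks */
    int health_check_max_retries;  /* retries before the node is declared down */
    int health_check_retry_delay;  /* seconds between retries */
} NodeHealthCheckParams;

/*
 * Return the largest per-node value of
 * health_check_period + health_check_retry_delay * health_check_max_retries,
 * which would be used as the failover object timeout.
 */
static int
compute_failover_timeout(const NodeHealthCheckParams *nodes, int num_nodes)
{
    int max_timeout = 0;

    for (int i = 0; i < num_nodes; i++)
    {
        int t = nodes[i].health_check_period +
                nodes[i].health_check_retry_delay * nodes[i].health_check_max_retries;

        if (t > max_timeout)
            max_timeout = t;
    }
    return max_timeout;
}

int
main(void)
{
    /* Example values only; the test case above uses period = 300, retries = 0. */
    NodeHealthCheckParams nodes[] = {
        {300, 0, 10},
        {30, 3, 10},
    };

    printf("failover timeout: %d seconds\n",
           compute_failover_timeout(nodes, 2));
    return 0;
}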

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
