[pgpool-hackers: 4251] Re: Issue with failover_require_consensus

Tatsuo Ishii ishii at sraoss.co.jp
Tue Dec 20 09:16:30 JST 2022


Hi Usama,

Thank you for pushing your fix, but it seems it was only back-patched
down to the 4.2 stable branch.  Since quorum failover was introduced in
3.7, I think we need to push the fix to 4.1 and 4.0 at least (3.7 will
become obsolete soon, so we may not need to push it to 3.7).

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp


> Thank you! Looks perfect (and my test passed).
> 
> Best regards,
> --
> Tatsuo Ishii
> SRA OSS LLC
> English: http://www.sraoss.co.jp/index_en/
> Japanese:http://www.sraoss.co.jp
> 
> 
>> Hi
>> Sorry for the inconvenience. I guess I attached the wrong patch file
>> previously.
>> Please find the latest one that also changes DEBUG to LOG, as suggested.
>> 
>> Thanks
>> Best regards,
>> Muhammad Usama
>> 
>> 
>> On Mon, Dec 19, 2022 at 9:07 AM Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
>> 
>>> > Hi Ishii San
>>> >
>>> > Thanks for the feedback. Please find the updated patch that takes care of
>>> > transferring the per-node health check params through WD_POOL_CONFIG_DATA
>>> > and uses them in the calculations.
>>> > Could you verify if the attached fixes the issue?
>>>
>>> Sure.
>>>
>>> The patch adds trailing whitespace:
>>>
>>> /home/t-ishii/fix_failover_command_timout_v2.diff:68: trailing whitespace.
>>>                         int pn_failover_command_timeout =
>>> pool_config->health_check_params[i].health_check_period +
>>> /home/t-ishii/fix_failover_command_timout_v2.diff:69: trailing whitespace.
>>>
>>> (pool_config->health_check_params[i].health_check_retry_delay *
>>> /home/t-ishii/fix_failover_command_timout_v2.diff:118: trailing whitespace.
>>>                         health_check_params_count =
>>> config->backend_desc->num_backends;
>>> warning: 3 lines add whitespace errors.
>>>
>>> There is a compile error:
>>> watchdog.c:8203:10: error: 'DEBUG' undeclared (first use in this
>>> function); did you mean 'DEBUG1'?
>>>  8203 |  ereport(DEBUG,(errmsg("Setting failover command timeout to
>>> %d",failover_command_timeout)));
>>>       |          ^~~~~
>>>
>>> We'd better use "LOG" instead of DEBUG* here, because:
>>> - the log message is not emitted frequently
>>> - the timeout value is important information
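>>>
>>> i.e. something like this:
>>>
>>>     ereport(LOG,
>>>             (errmsg("Setting failover command timeout to %d",
>>>                     failover_command_timeout)));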
>>>
>>> What do you think?
>>>
>>> > As far as standard_packet_processor() is concerned, it is intended to
>>> > handle messages that do not need to be handled differently for each
>>> > watchdog state.
>>>
>>> Ok. Thank you for the explanation.
>>>
>>> Best regards,
>>> --
>>> Tatsuo Ishii
>>> SRA OSS LLC
>>> English: http://www.sraoss.co.jp/index_en/
>>> Japanese:http://www.sraoss.co.jp
>>>
>>> > Thanks
>>> > Best Regards
>>> > Muhammad Usama
>>> >
>>> >
>>> > On Sat, Dec 17, 2022 at 5:01 PM Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
>>> >
>>> >> >>> On Tue, Nov 29, 2022 at 3:27 AM Tatsuo Ishii <ishii at sraoss.co.jp>
>>> >> >>> wrote:
>>> >> >>>
>>> >> >>>> >> Hi Ishii-San
>>> >> >>>> >>
>>> >> >>>> >> Sorry for the delayed response.
>>> >> >>>> >
>>> >> >>>> > No problem.
>>> >> >>>> >
>>> >> >>>> >> With the attached fix I guess the failover objects will linger on
>>> >> >>>> >> forever in case of a false alarm by a health check or small glitch.
>>> >> >>>> >
>>> >> >>>> > That's not good.
>>> >> >>>> >
>>> >> >>>> >> One way to get around the issue could be to compute
>>> >> >>>> >> FAILOVER_COMMAND_FINISH_TIMEOUT based on the maximum value
>>> >> >>>> >> of health_check_period across the cluster, something like:
>>> >> >>>> >> failover_command_finish_timeout = max(health_check_period) * 2 = 60
>>> >> >>>>
>>> >> >>>> After thinking more, I think we need to take into account
>>> >> >>>> health_check_max_retries and health_check_retry_delay as
>>> >> >>>> well. i.e. instead of max(health_check_period), something like:
>>> >> >>>> max(health_check_period + (health_check_retry_delay *
>>> >> >>>> health_check_max_retries)).
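>>> >> >>>>
>>> >> >>>> For example, a node configured with health_check_period = 30,
>>> >> >>>> health_check_retry_delay = 10 and health_check_max_retries = 3
>>> >> >>>> would contribute 30 + (10 * 3) = 60 seconds to the maximum.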
>>> >> >>>>
>>> >> >>>> What do you think?
>>> >> >>>>
>>> >> >>>
>>> >> >>> Thanks for the valuable suggestions.
>>> >> >>> Can you try out the attached patch to see if it solves the issue?
>>> >> >>
>>> >> >> Unfortunately the patch did not pass my test case.
>>> >> >>
>>> >> >> - 3 watchdog nodes and 2 PostgreSQL servers, streaming replication
>>> >> >>   cluster (created by watchdog_setup). pgpool0 is the watchdog
>>> >> >>   leader.
>>> >> >>
>>> >> >> - health_check_period = 300, health_check_max_retries = 0
>>> >> >>
>>> >> >> - pgpool1 starts 120 seconds after pgpool0 starts
>>> >> >>
>>> >> >> - pgpool2 does not start
>>> >> >>
>>> >> >> - after the watchdog cluster becomes ready, shut down PostgreSQL
>>> >> >>   node 1 (standby).
>>> >> >>
>>> >> >> - wait for 600 seconds, expecting a failover.
>>> >> >>
>>> >> >> Unfortunately failover did not happen.
>>> >> >>
>>> >> >> Attached is the test script and pgpool0 log.
>>> >> >>
>>> >> >> To run the test:
>>> >> >>
>>> >> >> - unpack test.tar.gz
>>> >> >>
>>> >> >> - run prepare.sh
>>> >> >>   $ sh prepare.sh
>>> >> >>   This should create a "testdir" directory with a 3 watchdog node +
>>> >> >>   2 PostgreSQL node cluster.
>>> >> >>
>>> >> >> - cd testdir and run the test
>>> >> >>   $ sh ../start.sh -o 120
>>> >> >>   This will start the test; "-o" specifies how long to wait before
>>> >> >>   starting pgpool1.
>>> >> >
>>> >> > After the test failure, I examined the pgpool log on the pgpool leader
>>> >> > node (node 0). It seems the timeout was not updated as expected.
>>> >> >
>>> >> > 2022-12-17 08:07:11.419: watchdog pid 707483: LOG:  failover request
>>> >> > from 1 nodes with ID:42 is expired
>>> >> > 2022-12-17 08:07:11.419: watchdog pid 707483: DETAIL:  marking the
>>> >> > failover object for removal. timeout: 15
>>> >> >
>>> >> > After looking into the code, I found update_failover_timeout() only
>>> >> > examines "health_check_period".  I think you need to examine
>>> >> > "health_check_period0" etc. as well and find the larget one for the
>>> >> > timeout caliculation.
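>>> >> >
>>> >> > Something along these lines perhaps (untested sketch, assuming the
>>> >> > per node values are carried in pool_config->health_check_params[i],
>>> >> > including health_check_max_retries):
>>> >> >
>>> >> >     /* find the largest per node health check timeout */
>>> >> >     int max_timeout = 0;
>>> >> >     int i;
>>> >> >
>>> >> >     for (i = 0; i < pool_config->backend_desc->num_backends; i++)
>>> >> >     {
>>> >> >         int t = pool_config->health_check_params[i].health_check_period +
>>> >> >             (pool_config->health_check_params[i].health_check_retry_delay *
>>> >> >              pool_config->health_check_params[i].health_check_max_retries);
>>> >> >
>>> >> >         if (t > max_timeout)
>>> >> >             max_timeout = t;
>>> >> >     }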
>>> >> >
>>> >> > By the way,
>>> >> >
>>> >> >> failover_command_timout
>>> >> >> g_cluster.failover_command_timout
>>> >> >
>>> >> > I think "timout" should be "timeout".
>>> >>
>>> >> I was trying to create a proof of concept patch for this:
>>> >>
>>> >> > After looking into the code, I found update_failover_timeout() only
>>> >> > examines "health_check_period".  I think you need to examine
>>> >> > "health_check_period0" etc. as well and find the larget one for the
>>> >> > timeout caliculation.
>>> >>
>>> >> and noticed that update_failover_timeout() is called by
>>> >> standard_packet_processor() when a WD_POOL_CONFIG_DATA packet is
>>> >> received, like: update_failover_timeout(wdNode, standby_config); but
>>> >> standby_config was created by get_pool_config_from_json(), which does
>>> >> not seem to create health_check_params in the pool_config data. Also I
>>> >> wonder why standard_packet_processor() needs to be called when
>>> >> WD_POOL_CONFIG_DATA is received. Can't we simply omit the call to
>>> >> update_failover_timeout() in this case?
>>> >>
>>> >> Best regards,
>>> >> --
>>> >> Tatsuo Ishii
>>> >> SRA OSS LLC
>>> >> English: http://www.sraoss.co.jp/index_en/
>>> >> Japanese:http://www.sraoss.co.jp
>>> >>
>>>
> _______________________________________________
> pgpool-hackers mailing list
> pgpool-hackers at pgpool.net
> http://www.pgpool.net/mailman/listinfo/pgpool-hackers

