[Pgpool-hackers] Health check retries (patch)

Sun Nov 20 06:45:07 UTC 2011

> Thanks!  English documentation patch is attached.

Thanks! Patch committed along with Japanese doc changes.
(I didn't change the docs in other languages since I don't understand them.)

> One comment:  I wrote "You need to reload pgpool.conf
> if you change these settings", but in fact, I have no idea if this is true :-).  Please feel free to change as necessary.

This is true, so there is no need to change it.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

> On Nov 18, 2011, at 8:53 PM, Tatsuo Ishii wrote:
> 
>> Matt,
>> 
>> Thank you! The patch looks pretty good.  Patch committed with a few
>> modifications.
>> http://git.postgresql.org/gitweb/?p=pgpool2.git;a=commit;h=55199bdfa7630cf9a5142703ef85ee7695bb4221
>> 
>> 1) While retrying, emit a log message (rather than a debug message). This
>> is more useful for DBAs because it makes clear that pgpool is trying to
>> recover. Here is a sample message:
>> 2011-11-19 13:23:12 LOG:   pid 10375: health check retry sleep time: 1 second(s)
>> 
>> 2) After a successful retry, emit a log message:
>> 2011-11-19 13:23:19 LOG:   pid 10375: after some retrying backend returned to healthy state
>> 
>> BTW, I think that to make the new feature work better, it's best to turn
>> off fail_over_on_backend_error, because even if health checking retries,
>> a failed write to the backend socket causes immediate failover if
>> fail_over_on_backend_error is set to on.
>> 
>> Also, new_connection() was fixed because it caused immediate failover
>> when trying to connect to a backend even though fail_over_on_backend_error
>> was set to off.
>> 
>> Could you provide English documentation for this?
>> 
>>> Hi everyone.  In August, I wrote to the pgpool-general list (see below) asking if there was any
>>> way to have pgpool-II retry a failed health check before promoting the slave.
>>> 
>>> I'm attaching a patch that adds this functionality.  Would anyone care to review it?  We've been
>>> using it successfully in production for about 3 months now, and it's working great.
>>> 
>>> This is my first time submitting a patch to PostgreSQL or PgPool, so go easy :-).
>>> 
>>> Some comments:
>>> - The purpose of this feature is to allow pgpool-II to handle brief networking interruptions
>>>   without being "fooled" into thinking that the master node is down and the slave needs to
>>>   be promoted.
>>> - This patch adds two new configuration settings.
>>> - The "health_check_max_retries" setting is the maximum number of times to retry a health
>>>   check before giving up.
>>> - The "health_check_retry_delay" setting is the amount of time (in seconds) to sleep between retries.
>>> - The feature is turned *off* by default (health_check_max_retries defaults to 0, or no retries).
>>> 
>>> Patch is against git HEAD revision (commit 58043c962b8305507de0f450be74c24cbe4c8430).
>>> 
>>> Please let me know if you have any questions or comments.
>>> 
>>> -- Matt
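
For readers following the thread, the retry behaviour discussed above can be sketched roughly as follows. This is only an illustration in Python, not pgpool-II's actual C code: the function name health_check_with_retries and the check, sleep, and log parameters are hypothetical. Only the two setting names and the wording of the log messages come from the messages in this thread.

```python
import time


def health_check_with_retries(check, max_retries=0, retry_delay=1,
                              sleep=time.sleep, log=print):
    """Return True if the backend is healthy, False if failover is warranted.

    check       -- hypothetical callable, returns True when the backend responds
    max_retries -- mirrors health_check_max_retries (0 means no retries)
    retry_delay -- mirrors health_check_retry_delay, in seconds
    """
    attempts = 0
    while True:
        if check():
            if attempts > 0:
                # Only logged when at least one retry was needed.
                log("after some retrying backend returned to healthy state")
            return True
        if attempts >= max_retries:
            # Retries exhausted (or disabled): the caller would start failover.
            return False
        attempts += 1
        log(f"health check retry sleep time: {retry_delay} second(s)")
        sleep(retry_delay)
```

With max_retries left at 0 the first failed check returns False immediately, which corresponds to the feature being off by default. With a non-zero value, a brief network interruption that clears within max_retries * retry_delay seconds of sleeping no longer leads to failover.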