[pgpool-general: 6064] Re: "health check timer expired" on local machine
ishii at sraoss.co.jp
Sat Apr 28 09:10:09 JST 2018
I noticed you set health_check_max_retries = 0. If the error is a
transient one, setting health_check_max_retries to a positive number
should help. Also, I would be interested in seeing an strace log from
when the failover occurs.
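
As a rough sketch of the retry suggestion, the health check section of
pgpool.conf could look something like the fragment below. The other values
are the ones quoted from the configuration further down; the retry count of
3 is only an illustrative assumption, not a tuned recommendation.

```
# Health check settings (sketch; only health_check_max_retries is changed)
health_check_period = 2          # seconds between health checks
health_check_timeout = 6         # seconds before a health check attempt gives up
health_check_max_retries = 3     # assumed value: retry before requesting failover
health_check_retry_delay = 1     # seconds to wait between retries
connect_timeout = 10000          # milliseconds allowed for connect() to a backend
```

With retries enabled, a single failed check no longer triggers failover by
itself. The trade-off is that a backend that is really down takes longer to
be detached.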
SRA OSS, Inc. Japan
> Oh, I forgot the configuration. Here it is:
> health_check_period = 2
> health_check_timeout = 6
> health_check_max_retries = 0
> health_check_retry_delay = 1
> connect_timeout = 10000
> No individual health check settings.
> So of course I could increase connect_timeout, but 10 seconds is already a
> long time to wait before triggering failover on a production server
> receiving ~10 inserts per second.
> 2018-04-26 21:23 GMT+02:00 Bud Curly <psyckow.prod at gmail.com>:
>> Hi and thanks for your work.
>> I use pgpool2 3.7.2 (latest git) with 2 backends in master-slave mode with
>> native streaming replication.
>> I think I have an issue concerning the health check process.
>> In the last two days I have had two "health check timer expired" errors,
>> one yesterday around 9 am and one today around 8 pm.
>> The weird thing is... Pgpool and the backend in question are on the same
>> machine. This backend is the master. Here is the log:
>> 2018-04-26 20:59:29: pid 2153:LOG: failed to connect to PostgreSQL server
>> on "x.x.x.x:xxx" using INET socket
>> 2018-04-26 20:59:29: pid 2153:DETAIL: health check timer expired
>> 2018-04-26 20:59:29: pid 2153:ERROR: failed to make persistent db
>> 2018-04-26 20:59:29: pid 2153:DETAIL: connection to host:" x.x.x.x:xxx"
>> 2018-04-26 20:59:29: pid 2153:LOG: health check failed on node 0
>> 2018-04-26 20:59:29: pid 2153:LOG: received degenerate backend request
>> for node_id: 0 from pid 
>> 2018-04-26 20:59:29: pid 2104:LOG: Pgpool-II parent process has received
>> failover request
>> 2018-04-26 20:59:29: pid 2104:LOG: starting degeneration. shutdown host
>> 2018-04-26 20:59:29: pid 2104:LOG: Restart all children
>> Although they are on the same machine, I use the public IP for backend0
>> rather than 127.0.0.1, because the failover process requires this IP.
>> Do you think this could be caused by network conditions on the server
>> itself, or is it an actual issue?