[pgpool-general: 6066] Re: "health check timer expired" on local machine
psyckow.prod at gmail.com
Mon Apr 30 22:23:55 JST 2018
So on the 28th I set health_check_max_retries = 1 and had no problem for
a whole day. So I set health_check_max_retries = 0 again yesterday (the
29th) to confirm the problem, and the problem hasn't shown up since.
So I'd like to think it was some network connection reset in my server's
datacenter... The problem first appeared after one week of running with
health_check_max_retries = 0 without any problem.
I'm sorry, I don't have the log anymore. I will wait until tomorrow with
health_check_max_retries = 0, but then I will set
health_check_max_retries = 1 to start the pre-prod tests.
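As a rough illustration of what one retry changes, the sketch below estimates the worst-case time between a backend becoming unreachable and the health check giving up. It assumes a simplified model based on the documented meaning of the parameters (one attempt bounded by health_check_timeout, retries separated by health_check_retry_delay, checks started every health_check_period); the class and function names are hypothetical, and Pgpool-II's actual scheduling may differ.

```python
from dataclasses import dataclass


@dataclass
class HealthCheckConfig:
    """Hypothetical container for the pgpool.conf values quoted in this thread."""
    period: int       # health_check_period, seconds between checks
    timeout: int      # health_check_timeout, seconds allowed per attempt
    max_retries: int  # health_check_max_retries
    retry_delay: int  # health_check_retry_delay, seconds between retries


def worst_case_detection_seconds(cfg: HealthCheckConfig) -> int:
    """Upper bound under the simplified model described above (an assumption).

    Wait up to one full period for the next check to start, then let every
    attempt run into its timeout, sleeping retry_delay between attempts.
    """
    attempts = cfg.max_retries + 1
    return cfg.period + attempts * cfg.timeout + cfg.max_retries * cfg.retry_delay


no_retry = HealthCheckConfig(period=2, timeout=6, max_retries=0, retry_delay=1)
one_retry = HealthCheckConfig(period=2, timeout=6, max_retries=1, retry_delay=1)

print(worst_case_detection_seconds(no_retry))   # 8
print(worst_case_detection_seconds(one_retry))  # 15
```

Under this model a single retry roughly doubles the detection window, which is the price paid for not failing over on one transient timeout.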
Thanks for your help, have a nice day !
2018-04-28 2:10 GMT+02:00 Tatsuo Ishii <ishii at sraoss.co.jp>:
> I noticed you set health_check_max_retries = 0. If the error was a
> transient one, setting health_check_max_retries to some positive number
> might help.
> Also I am interested in a strace log when the failover occurs.
> Best regards,
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> > Oh I forgot the configuration, here it is :
> > health_check_period = 2
> > health_check_timeout = 6
> > health_check_max_retries = 0
> > health_check_retry_delay = 1
> > connect_timeout = 10000
> > No per-node health check settings.
> > So of course I could increase connect_timeout, but 10 seconds is already
> > a lot to wait before triggering the failover process on a production
> > server receiving ~10 inserts / second.
> > 2018-04-26 21:23 GMT+02:00 Bud Curly <psyckow.prod at gmail.com>:
> >> Hi and thanks for your work.
> >> I use pgpool2 3.7.2 (latest git) with 2 backends in master-slave mode
> >> with native streaming replication.
> >> I think I have an issue concerning the health check process.
> >> Over the last two days I have had two "health check timer expired"
> >> errors, which appeared yesterday around 9 am and today around 8 pm.
> >> The weird thing is... Pgpool and the backend in question are on the same
> >> machine. This backend is the master. Here is the log :
> >> 2018-04-26 20:59:29: pid 2153:LOG: failed to connect to PostgreSQL on "x.x.x.x:xxx" using INET socket
> >> 2018-04-26 20:59:29: pid 2153:DETAIL: health check timer expired
> >> 2018-04-26 20:59:29: pid 2153:ERROR: failed to make persistent db connection
> >> 2018-04-26 20:59:29: pid 2153:DETAIL: connection to host:" x.x.x.x:xxx" failed
> >> 2018-04-26 20:59:29: pid 2153:LOG: health check failed on node 0 (timeout:1)
> >> 2018-04-26 20:59:29: pid 2153:LOG: received degenerate backend request for node_id: 0 from pid
> >> 2018-04-26 20:59:29: pid 2104:LOG: Pgpool-II parent process has received failover request
> >> 2018-04-26 20:59:29: pid 2104:LOG: starting degeneration. shutdown host x.x.x.x:xxx
> >> 2018-04-26 20:59:29: pid 2104:LOG: Restart all children
> >> Despite the fact that these are on the same machine, I use public IP for
> >> the backend0 and not 127.0.0.1, because of failover process that
> >> this ip.
> >> Do you think this could be a problem from network conditions on the
> >> itself or an actual issue ?
> >> Thanks
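To help tell a transient network reset apart from a real backend outage, an independent TCP probe run next to Pgpool-II can record whether plain connects to the backend address also stall. The following is only a minimal sketch: the probe function and its return shape are hypothetical, it checks TCP reachability only (it does not speak the PostgreSQL protocol), and the host and port must be supplied by the reader.

```python
import socket
import time


def probe(host: str, port: int, timeout_s: float = 6.0):
    """Attempt one TCP connect and report (ok, elapsed_seconds, error_text).

    timeout_s defaults to 6 to mirror health_check_timeout from this thread;
    that pairing is an assumption of this sketch, not something Pgpool-II does.
    """
    start = time.monotonic()
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True, time.monotonic() - start, ""
    except OSError as exc:
        return False, time.monotonic() - start, str(exc)


def format_result(host: str, port: int, result) -> str:
    """Render one probe result as a timestamped log line."""
    ok, elapsed, err = result
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    status = "ok" if ok else f"FAILED ({err})"
    return f"{stamp} {host}:{port} {status} after {elapsed:.3f}s"
```

Calling probe() every couple of seconds against the backend's public address and against 127.0.0.1, and logging format_result() for each, would show whether only the public-IP path suffers from stalls.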