[pgpool-general: 6069] failover_require_consensus does not work.

Vlad G omenvlad at gmail.com
Thu May 3 21:20:04 JST 2018


Hey Guys,
I have a cluster with Pgpool-II-pg96-3.7.3 and postgresql-9.6
(3 x pgpool and 3 x postgresql).
It follows the same scheme as:
http://www.pgpool.net/docs/latest/en/html/example-cluster.html

When the PostgreSQL master node (pgpoolpsql-1) goes down, the pgpool master node (pgpool-1) does not get a second vote from either of the standby pgpool nodes (pgpool-2 and pgpool-3), so the failover never reaches consensus and the backend is quarantined instead.

If I set:
failover_require_consensus = off
Everything works fine.
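
To make clear what I expect to happen: with three pgpool nodes, I assume the failover needs votes from two different nodes, and repeated requests from the same node count only once (the log below reports a "Duplicate failover request" for the second request from pgpool-1). This is only a sketch of my understanding, not pgpool-II's actual code; the function name `has_majority` is made up for illustration.

```python
# Illustrative only: this is NOT pgpool-II's implementation, just the
# majority rule as I understand it from the log messages.
def has_majority(voters, cluster_size):
    """Return True when distinct voters form a strict majority of the cluster."""
    distinct = set(voters)  # repeated requests from one node count once
    return len(distinct) > cluster_size // 2


# Two requests, both from pgpool-1: still a single vote, no consensus.
print(has_majority(["pgpool-1", "pgpool-1"], 3))
# A second node agreeing would be enough in a three-node cluster.
print(has_majority(["pgpool-1", "pgpool-2"], 3))
```

In my case only the first situation ever occurs, because neither pgpool-2 nor pgpool-3 seems to send its own vote.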

May 03 13:02:45 pgpool-1 pgpool[24216]: 2018-05-03 13:02:45: pid 24237: LOG:  failed to connect to PostgreSQL server on "pgpoolpsql-1:5432", getsockopt() detected error "Connection refused"
May 03 13:02:45 pgpool-1 pgpool[24216]: 2018-05-03 13:02:45: pid 24237: LOG:  received degenerate backend request for node_id: 0 from pid [24237]
May 03 13:02:45 pgpool-1 pgpool[24216]: 2018-05-03 13:02:45: pid 24217: LOG:  new IPC connection received
May 03 13:02:45 pgpool-1 pgpool[24216]: 2018-05-03 13:02:45: pid 24217: LOG:  watchdog received the failover command from local pgpool-II on IPC interface
May 03 13:02:45 pgpool-1 pgpool[24216]: 2018-05-03 13:02:45: pid 24217: LOG:  watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
May 03 13:02:45 pgpool-1 pgpool[24216]: 2018-05-03 13:02:45: pid 24217: LOG:  failover requires the majority vote, waiting for consensus
May 03 13:02:45 pgpool-1 pgpool[24216]: 2018-05-03 13:02:45: pid 24217: DETAIL:  failover request noted
May 03 13:02:45 pgpool-1 pgpool[24216]: 2018-05-03 13:02:45: pid 24217: LOG:  failover command [DEGENERATE_BACKEND_REQUEST] request from pgpool-II node "pgpool-1:9999 Linux pgpool-1" is queued, waiting for the confirmation from other nodes
May 03 13:02:45 pgpool-1 pgpool[24216]: 2018-05-03 13:02:45: pid 24237: LOG:  degenerate backend request for node_id: 0 from pid [24237], will be handled by watchdog, which is building consensus for request
May 03 13:02:45 pgpool-1 pgpool[24216]: 2018-05-03 13:02:45: pid 24237: FATAL:  failed to create a backend connection
May 03 13:02:45 pgpool-1 pgpool[24216]: 2018-05-03 13:02:45: pid 24237: DETAIL:  executing failover on backend
May 03 13:02:45 pgpool-1 pgpool[24216]: 2018-05-03 13:02:45: pid 24216: LOG:  child process with pid: 24237 exits with status 256
May 03 13:02:45 pgpool-1 pgpool[24216]: 2018-05-03 13:02:45: pid 24216: LOG:  fork a new child process with pid: 24268
May 03 13:02:46 pgpool-1 pgpool[24216]: 2018-05-03 13:02:46: pid 24228: LOG:  failed to connect to PostgreSQL server on "pgpoolpsql-1:5432", getsockopt() detected error "Connection refused"
May 03 13:02:46 pgpool-1 pgpool[24216]: 2018-05-03 13:02:46: pid 24228: LOG:  received degenerate backend request for node_id: 0 from pid [24228]
May 03 13:02:46 pgpool-1 pgpool[24216]: 2018-05-03 13:02:46: pid 24217: LOG:  new IPC connection received
May 03 13:02:46 pgpool-1 pgpool[24216]: 2018-05-03 13:02:46: pid 24217: LOG:  watchdog received the failover command from local pgpool-II on IPC interface
May 03 13:02:46 pgpool-1 pgpool[24216]: 2018-05-03 13:02:46: pid 24217: LOG:  watchdog is processing the failover command [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
May 03 13:02:46 pgpool-1 pgpool[24216]: 2018-05-03 13:02:46: pid 24217: LOG:  Duplicate failover request from "pgpool-1:9999 Linux pgpool-1" node
May 03 13:02:46 pgpool-1 pgpool[24216]: 2018-05-03 13:02:46: pid 24217: DETAIL:  request ignored
May 03 13:02:46 pgpool-1 pgpool[24216]: 2018-05-03 13:02:46: pid 24217: LOG:  failover requires the majority vote, waiting for consensus
May 03 13:02:46 pgpool-1 pgpool[24216]: 2018-05-03 13:02:46: pid 24217: DETAIL:  failover request noted
May 03 13:02:46 pgpool-1 pgpool[24216]: 2018-05-03 13:02:46: pid 24228: LOG:  degenerate backend request for 1 node(s) from pid [24228], is changed to quarantine node request by watchdog
May 03 13:02:46 pgpool-1 pgpool[24216]: 2018-05-03 13:02:46: pid 24228: DETAIL:  watchdog is taking time to build consensus
May 03 13:02:46 pgpool-1 pgpool[24216]: 2018-05-03 13:02:46: pid 24228: FATAL:  failed to create a backend connection
May 03 13:02:46 pgpool-1 pgpool[24216]: 2018-05-03 13:02:46: pid 24228: DETAIL:  executing failover on backend
May 03 13:02:46 pgpool-1 pgpool[24216]: 2018-05-03 13:02:46: pid 24216: LOG:  Pgpool-II parent process has received failover request
May 03 13:02:46 pgpool-1 pgpool[24216]: 2018-05-03 13:02:46: pid 24217: LOG:  new IPC connection received
May 03 13:02:46 pgpool-1 pgpool[24216]: 2018-05-03 13:02:46: pid 24217: LOG:  received the failover indication from Pgpool-II on IPC interface
May 03 13:02:46 pgpool-1 pgpool[24216]: 2018-05-03 13:02:46: pid 24217: LOG:  watchdog is informed of failover end by the main process
May 03 13:02:46 pgpool-1 pgpool[24216]: 2018-05-03 13:02:46: pid 24216: LOG:  starting quarantine. shutdown host pgpoolpsql-1(5432)
May 03 13:02:46 pgpool-1 pgpool[24216]: 2018-05-03 13:02:46: pid 24216: LOG:  Restart all children
May 03 13:02:46 pgpool-1 pgpool[24216]: 2018-05-03 13:02:46: pid 24216: LOG:  failover: set new primary node: -1
May 03 13:02:46 pgpool-1 pgpool[24216]: 2018-05-03 13:02:46: pid 24216: LOG:  failover: set new master node: 1
May 03 13:02:46 pgpool-1 pgpool[24216]: 2018-05-03 13:02:46: pid 24252: LOG:  worker process received restart request
May 03 13:02:46 pgpool-1 pgpool[24216]: 2018-05-03 13:02:46: pid 24217: LOG:  new IPC connection received
May 03 13:02:46 pgpool-1 pgpool[24216]: 2018-05-03 13:02:46: pid 24217: LOG:  received the failover indication from Pgpool-II on IPC interface
May 03 13:02:46 pgpool-1 pgpool[24216]: 2018-05-03 13:02:46: pid 24217: LOG:  watchdog is informed of failover start by the main process
May 03 13:02:46 pgpool-1 pgpool[24216]: 2018-05-03 13:02:46: pid 24216: LOG:  quarantine done. shutdown host pgpoolpsql-1(5432)
May 03 13:02:47 pgpool-1 pgpool[24216]: 2018-05-03 13:02:47: pid 24251: LOG:  restart request received in pcp child process
May 03 13:02:47 pgpool-1 pgpool[24216]: 2018-05-03 13:02:47: pid 24216: LOG:  PCP child 24251 exits with status 0 in failover()
May 03 13:02:47 pgpool-1 pgpool[24216]: 2018-05-03 13:02:47: pid 24216: LOG:  fork a new PCP child pid 24301 in failover()
May 03 13:02:47 pgpool-1 pgpool[24216]: 2018-05-03 13:02:47: pid 24216: LOG:  child process with pid: 24219 exits with status 0
May 03 13:02:47 pgpool-1 pgpool[24216]: 2018-05-03 13:02:47: pid 24216: LOG:  child process with pid: 24219 exited with success and will not be restarted
May 03 13:02:47 pgpool-1 pgpool[24216]: 2018-05-03 13:02:47: pid 24216: LOG:  child process with pid: 24220 exits with status 0
May 03 13:02:47 pgpool-1 pgpool[24216]: 2018-05-03 13:02:47: pid 24216: LOG:  child process with pid: 24220 exited with success and will not be restarted
May 03 13:02:47 pgpool-1 pgpool[24216]: 2018-05-03 13:02:47: pid 24216: LOG:  child process with pid: 24221 exits with status 0

Around a month ago it worked fine (I believe I tested it on pgpool-3.7.2), but now it does not. Could you tell me which parameters this behaviour depends on, or share any other thoughts?

Best regards,
Vladyslav


