View Revisions: Issue #536

Summary 0000536: Recovery of standby node in streaming replication mode causes all connections to be dropped
Revision 2019-08-15 16:32 by t-ishii
Additional Information Log events:

Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 624658: LOG: node recovery, node: 4 restarted
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 624658: LOG: received failback request for node_id: 4 from pid [624658]
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597090: LOG: new IPC connection received
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597090: LOG: watchdog received the failover command from local pgpool-II on IPC interface
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597090: LOG: watchdog is processing the failover command [FAILBACK_REQUEST] received from local pgpool-II on IPC interface
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597090: LOG: The failover request does not need quorum to hold
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597090: DETAIL: proceeding with the failover
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597090: HINT: REQ_DETAIL_CONFIRMED
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597088: LOG: Pgpool-II parent process has received failover request
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 598089: WARNING: failover/failback is in progress
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 598089: DETAIL: executing failover or failback on backend
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 598089: HINT: In a moment you should be able to reconnect to the database
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597318: WARNING: failover/failback is in progress
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597318: DETAIL: executing failover or failback on backend
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597318: HINT: In a moment you should be able to reconnect to the database
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597090: LOG: new IPC connection received
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597971: WARNING: failover/failback is in progress
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597971: DETAIL: executing failover or failback on backend
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597971: HINT: In a moment you should be able to reconnect to the database
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597993: WARNING: failover/failback is in progress
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597993: DETAIL: executing failover or failback on backend
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597993: HINT: In a moment you should be able to reconnect to the database
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 598089: WARNING: failover/failback is in progress
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 598089: DETAIL: executing failover or failback on backend
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 598089: HINT: In a moment you should be able to reconnect to the database
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597090: LOG: received the failover indication from Pgpool-II on IPC interface
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597318: WARNING: failover/failback is in progress
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597318: DETAIL: executing failover or failback on backend
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597318: HINT: In a moment you should be able to reconnect to the database
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597090: LOG: watchdog is informed of failover start by the main process
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597971: WARNING: failover/failback is in progress
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597971: DETAIL: executing failover or failback on backend
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597971: HINT: In a moment you should be able to reconnect to the database
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597993: WARNING: failover/failback is in progress
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597993: DETAIL: executing failover or failback on backend
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597993: HINT: In a moment you should be able to reconnect to the database
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597088: LOG: starting fail back. reconnect host db005(5432)
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597088: LOG: Node 0 is not down (status: 2)
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597088: LOG: Do not restart children because we are failing back node id 4 host: db005 port: 5432 and we are in streaming replication mode and not all backends were down
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597088: LOG: find_primary_node_repeatedly: waiting for finding a primary node
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597088: LOG: find_primary_node: checking backend no 0
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597088: LOG: find_primary_node: checking backend no 1
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597088: LOG: find_primary_node: checking backend no 2
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 598130: WARNING: failover/failback is in progress
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 598130: DETAIL: executing failover or failback on backend
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 598130: HINT: In a moment you should be able to reconnect to the database
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 598130: WARNING: failover/failback is in progress
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 598130: DETAIL: executing failover or failback on backend
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 598130: HINT: In a moment you should be able to reconnect to the database
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597088: LOG: find_primary_node: checking backend no 3
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597088: LOG: find_primary_node: primary node id is 3
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597088: LOG: failover: set new primary node: 3
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597088: LOG: failover: set new master node: 0
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597090: LOG: new IPC connection received
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597090: LOG: received the failover indication from Pgpool-II on IPC interface
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597090: LOG: watchdog is informed of failover end by the main process
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597088: LOG: failback done. reconnect host db005(5432)
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597088: LOG: start health check process for host db005(5432)
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597088: LOG: start health check process for host db005(5432)
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 624658: LOG: recovery done
Aug 09 10:43:55 db006 cgexec[597088]: 2019-08-09 10:43:55: pid 614305: LOG: failback event detected
Aug 09 10:43:55 db006 cgexec[597088]: 2019-08-09 10:43:55: pid 614305: DETAIL: restarting myself
Aug 09 10:43:55 db006 cgexec[597088]: 2019-08-09 10:43:55: pid 598356: LOG: restart request received in pcp child process
Aug 09 10:43:55 db006 cgexec[597088]: 2019-08-09 10:43:55: pid 597088: LOG: PCP child 598356 exits with status 0 in failover()
Aug 09 10:43:55 db006 cgexec[597088]: 2019-08-09 10:43:55: pid 597088: LOG: fork a new PCP child pid 624833 in failover()
Aug 09 10:43:55 db006 cgexec[597088]: 2019-08-09 10:43:55: pid 617874: LOG: failback event detected
Aug 09 10:43:55 db006 cgexec[597088]: 2019-08-09 10:43:55: pid 617874: DETAIL: restarting myself
Aug 09 10:43:55 db006 cgexec[597088]: 2019-08-09 10:43:55: pid 597137: LOG: failback event detected
(repeat failback many times)
Revision 2019-08-09 20:11 by gmartinez
Additional Information Log events:

Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 624658: LOG: node recovery, node: 4 restarted
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 624658: LOG: received failback request for node_id: 4 from pid [624658]
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597090: LOG: new IPC connection received
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597090: LOG: watchdog received the failover command from local pgpool-II on IPC interface
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597090: LOG: watchdog is processing the failover command [FAILBACK_REQUEST] received from local pgpool-II on IPC interface
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597090: LOG: The failover request does not need quorum to hold
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597090: DETAIL: proceeding with the failover
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597090: HINT: REQ_DETAIL_CONFIRMED
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597088: LOG: Pgpool-II parent process has received failover request
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 598089: WARNING: failover/failback is in progress
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 598089: DETAIL: executing failover or failback on backend
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 598089: HINT: In a moment you should be able to reconnect to the database
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597318: WARNING: failover/failback is in progress
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597318: DETAIL: executing failover or failback on backend
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597318: HINT: In a moment you should be able to reconnect to the database
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597090: LOG: new IPC connection received
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597971: WARNING: failover/failback is in progress
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597971: DETAIL: executing failover or failback on backend
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597971: HINT: In a moment you should be able to reconnect to the database
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597993: WARNING: failover/failback is in progress
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597993: DETAIL: executing failover or failback on backend
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597993: HINT: In a moment you should be able to reconnect to the database
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 598089: WARNING: failover/failback is in progress
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 598089: DETAIL: executing failover or failback on backend
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 598089: HINT: In a moment you should be able to reconnect to the database
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597090: LOG: received the failover indication from Pgpool-II on IPC interface
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597318: WARNING: failover/failback is in progress
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597318: DETAIL: executing failover or failback on backend
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597318: HINT: In a moment you should be able to reconnect to the database
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597090: LOG: watchdog is informed of failover start by the main process
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597971: WARNING: failover/failback is in progress
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597971: DETAIL: executing failover or failback on backend
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597971: HINT: In a moment you should be able to reconnect to the database
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597993: WARNING: failover/failback is in progress
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597993: DETAIL: executing failover or failback on backend
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597993: HINT: In a moment you should be able to reconnect to the database
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597088: LOG: starting fail back. reconnect host db005(5432)
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597088: LOG: Node 0 is not down (status: 2)
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597088: LOG: Do not restart children because we are failing back node id 4 host: db005 port: 5432 and we are in streaming replication mode and not all backends were down
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597088: LOG: find_primary_node_repeatedly: waiting for finding a primary node
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597088: LOG: find_primary_node: checking backend no 0
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597088: LOG: find_primary_node: checking backend no 1
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597088: LOG: find_primary_node: checking backend no 2
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 598130: WARNING: failover/failback is in progress
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 598130: DETAIL: executing failover or failback on backend
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 598130: HINT: In a moment you should be able to reconnect to the database
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 598130: WARNING: failover/failback is in progress
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 598130: DETAIL: executing failover or failback on backend
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 598130: HINT: In a moment you should be able to reconnect to the database
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597088: LOG: find_primary_node: checking backend no 3
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597088: LOG: find_primary_node: primary node id is 3
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597088: LOG: failover: set new primary node: 3
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597088: LOG: failover: set new master node: 0
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597090: LOG: new IPC connection received
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597090: LOG: received the failover indication from Pgpool-II on IPC interface
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597090: LOG: watchdog is informed of failover end by the main process
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597088: LOG: failback done. reconnect host db005(5432)
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597088: LOG: start health check process for host db005(5432)
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 597088: LOG: start health check process for host db005(5432)
Aug 09 10:43:54 db006 cgexec[597088]: 2019-08-09 10:43:54: pid 624658: LOG: recovery done
Aug 09 10:43:55 db006 cgexec[597088]: 2019-08-09 10:43:55: pid 614305: LOG: failback event detected
Aug 09 10:43:55 db006 cgexec[597088]: 2019-08-09 10:43:55: pid 614305: DETAIL: restarting myself
Aug 09 10:43:55 db006 cgexec[597088]: 2019-08-09 10:43:55: pid 598356: LOG: restart request received in pcp child process
Aug 09 10:43:55 db006 cgexec[597088]: 2019-08-09 10:43:55: pid 597088: LOG: PCP child 598356 exits with status 0 in failover()
Aug 09 10:43:55 db006 cgexec[597088]: 2019-08-09 10:43:55: pid 597088: LOG: fork a new PCP child pid 624833 in failover()
Aug 09 10:43:55 db006 cgexec[597088]: 2019-08-09 10:43:55: pid 617874: LOG: failback event detected
Aug 09 10:43:55 db006 cgexec[597088]: 2019-08-09 10:43:55: pid 617874: DETAIL: restarting myself
Aug 09 10:43:55 db006 cgexec[597088]: 2019-08-09 10:43:55: pid 597137: LOG: failback event detected
(repeat failback many times)