View Issue Details
| ID | Project | Category | View Status | Date Submitted | Last Update |
|---|---|---|---|---|---|
| 0000219 | Pgpool-II | Bug | public | 2016-07-15 19:41 | 2016-07-26 07:32 |
| Reporter | stivi21 | Assigned To | |||
| Priority | high | Severity | major | Reproducibility | always |
| Status | closed | Resolution | open | ||
| OS | Ubuntu | OS Version | 14.04 | ||
| Summary | 0000219: Inconsistent status of Postgresql nodes in pgPool instances after restart. Watchdog not syncing status. | ||||
| Description | Pgpool: 3.5.3 Postgresql: 9.4 Pgpool configuration: replication mode, using watchdog but without VIP, haproxy instead. Environment: Amazon Web Services, EC2 Host0 (vir-stag-pg-00): pgpool and postgres instance (master) Host1 (vir-stag-pg-01): pgpool and postgres instance (slave) Host2 (vir-stag-pg-02): pgpool instance (for quorum) | ||||
| Steps To Reproduce | 1) All hosts are working with statuses: # Status od postgresql (the same status of all instances od pgpool): node_id | hostname | port | status | lb_weight | role | select_cnt ---------+----------------+------+--------+-----------+---------+------------ 0 | vir-stag-pg-00 | 5432 | 2 | 0.500000 | primary | 1023309 1 | vir-stag-pg-01 | 5432 | 3 | 0.500000 | standby | 0 # Watchdog status (from Host0): 3 YES Linux_ip-10-0-1-129_5433 vir-stag-pg-00 Linux_ip-10-0-1-129_5433 vir-stag-pg-00 5433 9000 4 MASTER Linux_ip-10-0-112-190_5433 vir-stag-pg-01 5433 9000 7 STANDBY Linux_ip-10-0-2-132_5433 vir-stag-pg-02 5433 9000 7 STANDBY 2) Host1 going to stop for few minutes, then start again 3) After this, on Host0 and Host2 are the same status, where postgres from host2 have status 3: node_id | hostname | port | status | lb_weight | role | select_cnt ---------+----------------+------+--------+-----------+---------+------------ 0 | vir-stag-pg-00 | 5432 | 2 | 0.500000 | primary | 1026367 1 | vir-stag-pg-01 | 5432 | 3 | 0.500000 | standby | 0 On Host1 both postgres nodes has status 2: node_id | hostname | port | status | lb_weight | role | select_cnt ---------+----------------+------+--------+-----------+---------+------------ 0 | vir-stag-pg-00 | 5432 | 2 | 0.500000 | primary | 5 1 | vir-stag-pg-01 | 5432 | 2 | 0.500000 | standby | 0 Status of watchdog is ok on all pgpool instances: 3 YES Linux_ip-10-0-1-129_5433 vir-stag-pg-00 Linux_ip-10-0-1-129_5433 vir-stag-pg-00 5433 9000 4 MASTER Linux_ip-10-0-112-190_5433 vir-stag-pg-01 5433 9000 7 STANDBY Linux_ip-10-0-2-132_5433 vir-stag-pg-02 5433 9000 7 STANDBY | ||||
| Additional Information | The same situation, when I restarted Host0, but after restart pgpool on this host see 2 masters postgresql (backause slave was promoted to master), but both are active, another pgpool instances have first pgpool node as detached. Logs from pgpool on Host2 (node "Linux_ip-10-0-112-190_5433" is pgpool on restarted Host1): ### Host1 stopped Jul 15 09:15:07 ip-10-0-2-132 pgpool[7154]: [40-1] 2016-07-15 09:15:07: pid 7154: LOG: read from socket failed, remote end closed the connection Jul 15 09:15:07 ip-10-0-2-132 pgpool: 2016-07-15 09:15:07: pid 7154: LOG: read from socket failed, remote end closed the connection Jul 15 09:15:07 ip-10-0-2-132 pgpool[7154]: [41-1] 2016-07-15 09:15:07: pid 7154: LOG: remote node "Linux_ip-10-0-112-190_5433" is shutting down Jul 15 09:15:07 ip-10-0-2-132 pgpool: 2016-07-15 09:15:07: pid 7154: LOG: remote node "Linux_ip-10-0-112-190_5433" is shutting down Jul 15 09:15:07 ip-10-0-2-132 pgpool[7154]: [42-1] 2016-07-15 09:15:07: pid 7154: LOG: read from socket failed, remote end closed the connection Jul 15 09:15:07 ip-10-0-2-132 pgpool: 2016-07-15 09:15:07: pid 7154: LOG: read from socket failed, remote end closed the connection Jul 15 09:15:08 ip-10-0-2-132 pgpool[7154]: [43-1] 2016-07-15 09:15:08: pid 7154: LOG: received degenerate backend request for node_id: 1 from pid [7154] Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7154: LOG: received degenerate backend request for node_id: 1 from pid [7154] Jul 15 09:15:08 ip-10-0-2-132 pgpool[7150]: [44-1] 2016-07-15 09:15:08: pid 7150: LOG: starting degeneration. shutdown host vir-stag-pg-01(5432) Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7150: LOG: starting degeneration. shutdown host vir-stag-pg-01(5432) Jul 15 09:15:08 ip-10-0-2-132 pgpool[7154]: [44-1] 2016-07-15 09:15:08: pid 7154: LOG: sending watchdog response Jul 15 09:15:08 ip-10-0-2-132 pgpool[7154]: [44-2] 2016-07-15 09:15:08: pid 7154: DETAIL: failover request from other pgpool is canceled because of switching Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7154: LOG: sending watchdog response Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7154: DETAIL: failover request from other pgpool is canceled because of switching Jul 15 09:15:08 ip-10-0-2-132 pgpool[7150]: [45-1] 2016-07-15 09:15:08: pid 7150: LOG: Restart all children Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7150: LOG: Restart all children Jul 15 09:15:08 ip-10-0-2-132 pgpool[7156]: [5-1] 2016-07-15 09:15:08: pid 7156: LOG: child process received shutdown request signal 3 Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7156: LOG: child process received shutdown request signal 3 [ cut a lot of the same lines ] Jul 15 09:15:08 ip-10-0-2-132 pgpool[7154]: [45-1] 2016-07-15 09:15:08: pid 7154: LOG: new IPC connection received Jul 15 09:15:08 ip-10-0-2-132 pgpool[7154]: [46-1] 2016-07-15 09:15:08: pid 7154: LOG: processing sync request from IPC socket Jul 15 09:15:08 ip-10-0-2-132 pgpool[7154]: [47-1] 2016-07-15 09:15:08: pid 7154: LOG: sync request from IPC socket is forwarded to master watchdog node "Linux_ip-10-0-1-129_5433" Jul 15 09:15:08 ip-10-0-2-132 pgpool[7154]: [47-2] 2016-07-15 09:15:08: pid 7154: DETAIL: waiting for the reply from master node... Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7303: LOG: child process received shutdown request signal 3 [ cut a lot of the same lines ] Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7154: LOG: new IPC connection received Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7154: LOG: processing sync request from IPC socket Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7154: LOG: sync request from IPC socket is forwarded to master watchdog node "Linux_ip-10-0-1-129_5433" Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7154: DETAIL: waiting for the reply from master node... Jul 15 09:15:08 ip-10-0-2-132 pgpool[7150]: [46-1] 2016-07-15 09:15:08: pid 7150: LOG: execute command: /etc/pgpool2/failover_stream.sh 1 vir-stag-pg-00 0 Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7150: LOG: execute command: /etc/pgpool2/failover_stream.sh 1 vir-stag-pg-00 0 Jul 15 09:15:08 ip-10-0-2-132 pgpool: Fri Jul 15 09:15:08 UTC 2016 Jul 15 09:15:08 ip-10-0-2-132 pgpool: Failed node: 1 Jul 15 09:15:08 ip-10-0-2-132 pgpool: + [ 1 != 0 ] Jul 15 09:15:08 ip-10-0-2-132 pgpool: + echo [INFO] Slave node is down. New master not promoted! Jul 15 09:15:08 ip-10-0-2-132 pgpool: [INFO] Slave node is down. New master not promoted! Jul 15 09:15:08 ip-10-0-2-132 pgpool: + exit 0 Jul 15 09:15:08 ip-10-0-2-132 pgpool[7154]: [48-1] 2016-07-15 09:15:08: pid 7154: LOG: new IPC connection received Jul 15 09:15:08 ip-10-0-2-132 pgpool[7154]: [49-1] 2016-07-15 09:15:08: pid 7154: LOG: processing sync request from IPC socket Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7154: LOG: new IPC connection received Jul 15 09:15:08 ip-10-0-2-132 pgpool[7154]: [50-1] 2016-07-15 09:15:08: pid 7154: LOG: sync request from IPC socket is forwarded to master watchdog node "Linux_ip-10-0-1-129_5433" Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7154: LOG: processing sync request from IPC socket Jul 15 09:15:08 ip-10-0-2-132 pgpool[7154]: [50-2] 2016-07-15 09:15:08: pid 7154: DETAIL: waiting for the reply from master node... Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7154: LOG: sync request from IPC socket is forwarded to master watchdog node "Linux_ip-10-0-1-129_5433" Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7154: DETAIL: waiting for the reply from master node... Jul 15 09:15:08 ip-10-0-2-132 pgpool[7150]: [47-1] 2016-07-15 09:15:08: pid 7150: LOG: failover: set new primary node: 0 Jul 15 09:15:08 ip-10-0-2-132 pgpool[7150]: [48-1] 2016-07-15 09:15:08: pid 7150: LOG: failover: set new master node: 0 Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7150: LOG: failover: set new primary node: 0 Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7150: LOG: failover: set new master node: 0 Jul 15 09:15:08 ip-10-0-2-132 pgpool[7361]: [5-1] 2016-07-15 09:15:08: pid 7361: LOG: worker process received restart request Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7361: LOG: worker process received restart request Jul 15 09:15:08 ip-10-0-2-132 pgpool[7154]: [51-1] 2016-07-15 09:15:08: pid 7154: LOG: new IPC connection received Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7154: LOG: new IPC connection received Jul 15 09:15:08 ip-10-0-2-132 pgpool[7154]: [52-1] 2016-07-15 09:15:08: pid 7154: LOG: processing sync request from IPC socket Jul 15 09:15:08 ip-10-0-2-132 pgpool[7154]: [53-1] 2016-07-15 09:15:08: pid 7154: LOG: sync request from IPC socket is forwarded to master watchdog node "Linux_ip-10-0-1-129_5433" Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7154: LOG: processing sync request from IPC socket Jul 15 09:15:08 ip-10-0-2-132 pgpool[7154]: [53-2] 2016-07-15 09:15:08: pid 7154: DETAIL: waiting for the reply from master node... Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7154: LOG: sync request from IPC socket is forwarded to master watchdog node "Linux_ip-10-0-1-129_5433" Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7154: DETAIL: waiting for the reply from master node... Jul 15 09:15:08 ip-10-0-2-132 pgpool[7150]: [49-1] 2016-07-15 09:15:08: pid 7150: LOG: failover done. shutdown host vir-stag-pg-01(5432) Jul 15 09:15:08 ip-10-0-2-132 pgpool: failover done. shutdown host vir-stag-pg-01(5432)2016-07-15 09:15:08: pid 7150: LOG: failover done. shutdown host vir-stag-pg-01(5432) Jul 15 09:15:09 ip-10-0-2-132 pgpool[7360]: [1334-1] 2016-07-15 09:15:09: pid 7360: LOG: restart request received in pcp child process Jul 15 09:15:09 ip-10-0-2-132 pgpool: 2016-07-15 09:15:09: pid 7360: LOG: restart request received in pcp child process Jul 15 09:15:09 ip-10-0-2-132 pgpool[7150]: [50-1] 2016-07-15 09:15:09: pid 7150: LOG: PCP child 7360 exits with status 0 in failover() Jul 15 09:15:09 ip-10-0-2-132 pgpool: 2016-07-15 09:15:09: pid 7150: LOG: PCP child 7360 exits with status 0 in failover() Jul 15 09:15:09 ip-10-0-2-132 pgpool[7150]: [51-1] 2016-07-15 09:15:09: pid 7150: LOG: fork a new PCP child pid 28732 in failover() Jul 15 09:15:09 ip-10-0-2-132 pgpool: 2016-07-15 09:15:09: pid 7150: LOG: fork a new PCP child pid 28732 in failover() Jul 15 09:15:09 ip-10-0-2-132 pgpool[7150]: [52-1] 2016-07-15 09:15:09: pid 7150: LOG: child process with pid: 7156 exits with status 0 Jul 15 09:15:09 ip-10-0-2-132 pgpool[7150]: [53-1] 2016-07-15 09:15:09: pid 7150: LOG: child process with pid: 7156 exited with success and will not be restarted Jul 15 09:15:09 ip-10-0-2-132 pgpool: 2016-07-15 09:15:09: pid 7150: LOG: child process with pid: 7156 exits with status 0 Jul 15 09:15:09 ip-10-0-2-132 pgpool[7150]: [54-1] 2016-07-15 09:15:09: pid 7150: LOG: child process with pid: 7157 exits with status 0 Jul 15 09:15:09 ip-10-0-2-132 pgpool: 2016-07-15 09:15:09: pid 7150: LOG: child process with pid: 7156 exited with success and will not be restarted Jul 15 09:15:09 ip-10-0-2-132 pgpool[7150]: [55-1] 2016-07-15 09:15:09: pid 7150: LOG: child process with pid: 7157 exited with success and will not be restarted Jul 15 09:15:09 ip-10-0-2-132 pgpool: 2016-07-15 09:15:09: pid 7150: LOG: child process with pid: 7157 exits with status 0 [ cut a lot of the same lines ] ### Host1 started Jul 15 09:15:41 ip-10-0-2-132 pgpool[7155]: [8-1] 2016-07-15 09:15:41: pid 7155: LOG: informing the node status change to watchdog Jul 15 09:15:41 ip-10-0-2-132 pgpool[7155]: [8-2] 2016-07-15 09:15:41: pid 7155: DETAIL: node id :2 status = "NODE DEAD" message:"No heartbeat signal from node" Jul 15 09:15:41 ip-10-0-2-132 pgpool: 2016-07-15 09:15:41: pid 7155: LOG: informing the node status change to watchdog Jul 15 09:15:41 ip-10-0-2-132 pgpool: 2016-07-15 09:15:41: pid 7155: DETAIL: node id :2 status = "NODE DEAD" message:"No heartbeat signal from node" Jul 15 09:15:41 ip-10-0-2-132 pgpool[7154]: [54-1] 2016-07-15 09:15:41: pid 7154: LOG: new IPC connection received Jul 15 09:15:41 ip-10-0-2-132 pgpool[7154]: [55-1] 2016-07-15 09:15:41: pid 7154: LOG: received node status change ipc message Jul 15 09:15:41 ip-10-0-2-132 pgpool: 2016-07-15 09:15:41: pid 7154: LOG: new IPC connection received Jul 15 09:15:41 ip-10-0-2-132 pgpool[7154]: [55-2] 2016-07-15 09:15:41: pid 7154: DETAIL: No heartbeat signal from node Jul 15 09:15:41 ip-10-0-2-132 pgpool[7154]: [56-1] 2016-07-15 09:15:41: pid 7154: LOG: remote node "Linux_ip-10-0-112-190_5433" is shutting down Jul 15 09:15:41 ip-10-0-2-132 pgpool: 2016-07-15 09:15:41: pid 7154: LOG: received node status change ipc message Jul 15 09:15:41 ip-10-0-2-132 pgpool: 2016-07-15 09:15:41: pid 7154: DETAIL: No heartbeat signal from node Jul 15 09:15:41 ip-10-0-2-132 pgpool: 2016-07-15 09:15:41: pid 7154: LOG: remote node "Linux_ip-10-0-112-190_5433" is shutting down Jul 15 09:42:21 ip-10-0-2-132 pgpool[7154]: [57-1] 2016-07-15 09:42:21: pid 7154: LOG: new watchdog node connection is received from "10.0.112.190:33746" Jul 15 09:42:21 ip-10-0-2-132 pgpool: 2016-07-15 09:42:21: pid 7154: LOG: new watchdog node connection is received from "10.0.112.190:33746" Jul 15 09:42:21 ip-10-0-2-132 pgpool[7154]: [58-1] 2016-07-15 09:42:21: pid 7154: LOG: new outbond connection to vir-stag-pg-01:9000 Jul 15 09:42:21 ip-10-0-2-132 pgpool: 2016-07-15 09:42:21: pid 7154: LOG: new outbond connection to vir-stag-pg-01:9000 Jul 15 09:42:31 ip-10-0-2-132 pgpool[7155]: [9-1] 2016-07-15 09:42:31: pid 7155: LOG: informing the node status change to watchdog Jul 15 09:42:31 ip-10-0-2-132 pgpool[7155]: [9-2] 2016-07-15 09:42:31: pid 7155: DETAIL: node id :2 status = "NODE ALIVE" message:"Heartbeat signal found" Jul 15 09:42:31 ip-10-0-2-132 pgpool: 2016-07-15 09:42:31: pid 7155: LOG: informing the node status change to watchdog Jul 15 09:42:31 ip-10-0-2-132 pgpool: 2016-07-15 09:42:31: pid 7155: DETAIL: node id :2 status = "NODE ALIVE" message:"Heartbeat signal found" Jul 15 09:42:31 ip-10-0-2-132 pgpool[7154]: [59-1] 2016-07-15 09:42:31: pid 7154: LOG: new IPC connection received Jul 15 09:42:31 ip-10-0-2-132 pgpool[7154]: [60-1] 2016-07-15 09:42:31: pid 7154: LOG: received node status change ipc message Jul 15 09:42:31 ip-10-0-2-132 pgpool: 2016-07-15 09:42:31: pid 7154: LOG: new IPC connection received Jul 15 09:42:31 ip-10-0-2-132 pgpool[7154]: [60-2] 2016-07-15 09:42:31: pid 7154: DETAIL: Heartbeat signal found Jul 15 09:42:31 ip-10-0-2-132 pgpool: 2016-07-15 09:42:31: pid 7154: LOG: received node status change ipc message Jul 15 09:42:31 ip-10-0-2-132 pgpool: 2016-07-15 09:42:31: pid 7154: DETAIL: Heartbeat signal found | ||||
| Tags | No tags attached. | ||||