View Issue Details
| ID | Project | Category | View Status | Date Submitted | Last Update |
|---|---|---|---|---|---|
| 0000268 | Pgpool-II | Bug | public | 2016-12-07 23:42 | 2020-12-10 15:47 |
| Reporter | chjischj | Assigned To | Muhammad Usama | ||
| Priority | high | Severity | major | Reproducibility | always |
| Status | closed | Resolution | open | ||
| Platform | Linux | OS | CentOS | OS Version | 7.0 |
| Product Version | 3.5.4 | ||||
| Summary | 0000268: pgpool blocks in issue_command_to_watchdog() forever when the primary's network is cut off | ||||
| Description | pgpool: 3.5.4, postgresql: 9.5.2<br><br>Setup: one primary and two standbys, with streaming replication and watchdog enabled. pgpool runs on each node together with postgres.<br><br>One of the nodes (node1) is the PostgreSQL primary and the pgpool master. When the network of this node is cut off, failover_command is never called. The rest of the nodes (node2 and node3) elected a new pgpool master, but the new master is blocked in issue_command_to_watchdog().<br><br>[root@node3 ~]# pcp_watchdog_info -w -v<br>Watchdog Cluster Information<br>Total Nodes : 3<br>Remote Nodes : 2<br>Quorum state : QUORUM EXIST<br>Alive Remote Nodes : 2<br>VIP up on local node : YES<br>Master Node Name : Linux_node3_9999<br>Master Host Name : node3<br><br>Watchdog Node Information<br>Node Name : Linux_node3_9999<br>Host Name : node3<br>Delegate IP : 192.168.0.220<br>Pgpool port : 9999<br>Watchdog port : 9000<br>Node priority : 1<br>Status : 4<br>Status Name : MASTER<br><br>Node Name : Linux_node1_9999<br>Host Name : node1<br>Delegate IP : 192.168.0.220<br>Pgpool port : 9999<br>Watchdog port : 9000<br>Node priority : 1<br>Status : 8<br>Status Name : LOST<br><br>Node Name : Linux_node2_9999<br>Host Name : node2<br>Delegate IP : 192.168.0.220<br>Pgpool port : 9999<br>Watchdog port : 9000<br>Node priority : 1<br>Status : 7<br>Status Name : STANDBY<br><br>pgpool log<br>--------------------------<br>Nov 15 23:12:37 node3 pgpool: 2016-11-15 23:12:37: pid 4088: ERROR: Failed to check replication time lag<br>Nov 15 23:12:37 node3 pgpool: 2016-11-15 23:12:37: pid 4088: DETAIL: No persistent db connection for the node 0<br>Nov 15 23:12:37 node3 pgpool: 2016-11-15 23:12:37: pid 4088: HINT: check sr_check_user and sr_check_password<br>Nov 15 23:12:37 node3 pgpool: 2016-11-15 23:12:37: pid 4088: CONTEXT: while checking replication time lag<br>Nov 15 23:12:39 node3 pgpool: 2016-11-15 23:12:39: pid 4088: LOG: failed to connect to PostgreSQL server on "node1:5433", getsockopt() detected error "No route to host"<br>Nov 15 23:12:39 node3 pgpool: 2016-11-15 23:12:39: pid 4088: ERROR: failed to make persistent db connection<br>Nov 15 23:12:39 node3 pgpool: 2016-11-15 23:12:39: pid 4088: DETAIL: connection to host:"node1:5433" failed<br><br>stack of pgpool<br>--------------------<br>[root@node3 ~]# ps -ef\|grep pgpool.conf<br>root 4048 1 0 Nov15 ? 00:00:00 /usr/bin/pgpool -f /etc/pgpool-II/pgpool.conf -n<br>root 5301 4832 0 00:10 pts/3 00:00:00 grep --color=auto pgpool.conf<br>[root@node3 ~]# pstack 4048<br>#0 0x00007f73647e98d3 in __select_nocancel () from /lib64/libc.so.6<br>#1 0x0000000000493d2e in issue_command_to_watchdog ()<br>#2 0x0000000000494ac3 in wd_degenerate_backend_set ()<br>#3 0x000000000040bcf3 in degenerate_backend_set_ex ()<br>#4 0x000000000040e1c4 in PgpoolMain ()<br>#5 0x0000000000406ec2 in main () | ||||
| Steps To Reproduce | 1. Set up one primary and two standbys.<br>2. Enable streaming replication and watchdog.<br>3. Run pgpool on every node.<br>4. Cut off the network of the pgpool master (ip addr down ...). | ||||
| Tags | streaming replication, watchdog | ||||
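
The stack trace in the Description shows the pgpool main process waiting in select() inside issue_command_to_watchdog(). As an illustration only, the Python sketch below shows the general technique of bounding such a wait with a timeout, so the caller regains control when the peer never answers. It is not pgpool's code, and the report does not say how the call is implemented or how the problem was eventually handled. `wait_for_reply`, the socket pair, and the 0.2 second timeout are hypothetical names and values chosen for the example.

```python
import select
import socket


def wait_for_reply(sock, timeout_sec):
    """Return the reply bytes, or None if nothing arrives within timeout_sec.

    Hypothetical helper: passing a timeout to select() bounds the wait,
    whereas omitting it would block until the socket becomes readable.
    """
    readable, _, _ = select.select([sock], [], [], timeout_sec)
    if not readable:
        return None  # peer stayed silent; the caller can retry or give up
    return sock.recv(4096)


# Hypothetical stand-ins for the two ends of a local command channel.
client, server = socket.socketpair()

# Case 1: the peer never answers, so the bounded wait returns None.
silent = wait_for_reply(client, 0.2)

# Case 2: the peer answers, so the reply is returned once it is readable.
server.sendall(b"OK")
answered = wait_for_reply(client, 0.2)

client.close()
server.close()
```

With a bounded wait the caller can log the missing reply and decide what to do next, rather than remaining in select() indefinitely as in frame #0 of the trace.
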
| Date Modified | Username | Field | Change |
|---|---|---|---|
| 2016-12-07 23:42 | chjischj | New Issue | |
| 2016-12-07 23:42 | chjischj | Tag Attached: streaming replication | |
| 2016-12-07 23:42 | chjischj | Tag Attached: watchdog | |
| 2016-12-20 09:28 | t-ishii | Assigned To | => Muhammad Usama |
| 2016-12-20 09:28 | t-ishii | Status | new => assigned |
| 2017-08-29 09:41 | pengbo | Status | assigned => closed |