View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0000222 | Pgpool-II | Bug | public | 2016-07-26 09:31 | 2016-08-12 11:07 |
Reporter | uehara | Assigned To | Muhammad Usama | ||
Priority | normal | Severity | major | Reproducibility | sometimes |
Status | resolved | Resolution | open | ||
Platform | VirtualBox | OS | CentOS | OS Version | 6.5 |
Product Version | 3.5.3 | ||||
Target Version | 3.5.4 | Fixed in Version | |||
Summary | 0000222: Sometimes Failover command isn't executed. | ||||
Description | Pgpool-II 3.5.3; health check: enabled; PostgreSQL 9.5.2. Outline: I am checking Pgpool-II's behavior when the LAN connecting the DB (master) node to the Pgpool-II nodes is stopped. The cluster has two DB nodes configured in streaming replication mode and three Pgpool-II nodes. Sometimes Pgpool-II does not execute the failover command when I stop the LAN on the DB (master) server. This occurs when a node other than the COORDINATOR node is the first to detect the DB failure through its health check. The Pgpool-II (master) node logged these messages when it occurred. ------------------ 2016-07-25 19:52:18: pid 7487: LOG: watchdog node "Linux_pgpool_01_9999" is requesting to check lock for failover command start 2016-07-25 19:52:18: pid 7487: LOG: check lock for failover command start request is denied to node "Linux_pgpool_01_9999" 2016-07-25 19:52:18: pid 7487: DETAIL: node "Linux_pgpool_02_9999" is holding the lock ------------------ I found suspicious code at wd_commands.c:764, where NODE_FAILBACK_CMD is always sent and the cmdType argument is ignored. watchdog/wd_commands.c --------- 758 static WDFailoverCMDResults wd_issue_failover_lock_command(WDFailoverCMDTypes cmdType, char* syncReqType) 759 { 760 WDFailoverCMDResults res; 761 int x; 762 for (x=0; x < MAX_SEC_WAIT_FOR_CLUSTER_TRANSATION; x++) 763 { 764 res = wd_send_failover_sync_command(NODE_FAILBACK_CMD,syncReqType); 765 if (res != FAILOVER_RES_TRANSITION) 766 break; 767 sleep(1); 768 } 769 return res; 770 } --------------- I think line 764 should be modified as follows. -------- res = wd_send_failover_sync_command(cmdType, syncReqType); -------- Please tell me what you think. Regards, | ||||
Steps To Reproduce | First, start PostgreSQL (master and slave). Second, start the Pgpool-II nodes in the following order: pgpool_01, pgpool_02, pgpool_03. Then check the status. $ pcp_watchdog_info -h localhost -U postgres 3 YES Linux_pgpool_01_9999 192.168.2.3 Linux_pgpool_01_9999 192.168.2.3 9999 9000 4 MASTER Linux_pgpool_02_9999 192.168.2.4 9999 9000 7 STANDBY Linux_pgpool_03_9999 192.168.2.5 9999 9000 7 STANDBY $ psql -h 192.168.3.3 -p 9999 postgres -c "show pool_nodes" node_id | hostname | port | status | lb_weight | role | select_cnt ---------+-------------+------+--------+-----------+---------+------------ 0 | 192.168.1.1 | 5432 | 2 | 0.000000 | primary | 0 1 | 192.168.1.2 | 5432 | 2 | 1.000000 | standby | 0 (2 rows) Third, stop the network of the PostgreSQL (master) server. # ip addr del 192.168.1.1/24 dev eth1 | ||||
Additional Information | I attached the Pgpool-II log files and conf file. When a node other than the COORDINATOR detected the DB failure first: - original code : 01_Before_NG (NW stop : Mon Jul 25 19:51:36 JST 2016) - modified code : 02_After_OK (NW stop : not measured) When the COORDINATOR node detected the DB failure first: - original code : 03_Before_OK (NW stop : Mon Jul 25 19:57:17 JST 2016) Pgpool-II node01's conf file - Pgpool_01_conf I added some debug logging. Please ignore the "LOG: debug-log" lines. | ||||
Tags | No tags attached. | ||||
|
Pgpool-II_log.zip (28,363 bytes) |
|
Uehara-san, thank you for the report! I have assigned Usama, who is responsible for watchdog. |
|
It seems he committed the fix. Please try. https://git.postgresql.org/gitweb/?p=pgpool2.git;a=commit;h=af8af96365c6ddf6e5741b2a9b917aa0276c5c1d |
|
Thank you for the fix. I confirmed that no errors occur. |
|
Thanks for confirmation. Issue resolved. |
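For readers who want to see the reported defect in isolation, the following is a minimal sketch and not the Pgpool-II source. It stubs `wd_send_failover_sync_command` so that it only counts which command type it receives, then compares the loop as quoted in the description with the one-line change the reporter proposed. The enumerators `NODE_FAILED_CMD`, `MAX_FAILOVER_CMD` and `FAILOVER_RES_PROCEED`, the retry limit of 10, the `const` qualifier on `syncReqType`, and both `issue_lock_command_*` function names are assumptions made for illustration. The thread does not show the committed change, so the second function reflects the reporter's proposal only.

```c
#include <assert.h>

/* Stand-in types. Only NODE_FAILBACK_CMD and FAILOVER_RES_TRANSITION appear
 * in the report; the other enumerators are hypothetical. */
typedef enum { NODE_FAILED_CMD, NODE_FAILBACK_CMD, MAX_FAILOVER_CMD } WDFailoverCMDTypes;
typedef enum { FAILOVER_RES_TRANSITION, FAILOVER_RES_PROCEED } WDFailoverCMDResults;

#define MAX_SEC_WAIT_FOR_CLUSTER_TRANSATION 10 /* hypothetical value */

/* Number of requests the stub has received, per command type. */
static int sent_count[MAX_FAILOVER_CMD];

/* Stub: records the command type instead of talking to a watchdog. */
static WDFailoverCMDResults
wd_send_failover_sync_command(WDFailoverCMDTypes cmdType, const char *syncReqType)
{
    assert(cmdType < MAX_FAILOVER_CMD);
    (void) syncReqType;
    sent_count[cmdType]++;
    return FAILOVER_RES_PROCEED; /* never "in transition", so callers stop at once */
}

/* Loop as quoted in the report: cmdType is ignored. */
WDFailoverCMDResults
issue_lock_command_reported(WDFailoverCMDTypes cmdType, const char *syncReqType)
{
    WDFailoverCMDResults res = FAILOVER_RES_TRANSITION;
    int x;

    (void) cmdType;
    for (x = 0; x < MAX_SEC_WAIT_FOR_CLUSTER_TRANSATION; x++)
    {
        res = wd_send_failover_sync_command(NODE_FAILBACK_CMD, syncReqType);
        if (res != FAILOVER_RES_TRANSITION)
            break;
        /* the quoted code sleeps one second here before retrying */
    }
    return res;
}

/* Loop with the reporter's proposed change: cmdType is forwarded. */
WDFailoverCMDResults
issue_lock_command_proposed(WDFailoverCMDTypes cmdType, const char *syncReqType)
{
    WDFailoverCMDResults res = FAILOVER_RES_TRANSITION;
    int x;

    for (x = 0; x < MAX_SEC_WAIT_FOR_CLUSTER_TRANSATION; x++)
    {
        res = wd_send_failover_sync_command(cmdType, syncReqType);
        if (res != FAILOVER_RES_TRANSITION)
            break;
    }
    return res;
}
```

Calling `issue_lock_command_reported(NODE_FAILED_CMD, "START_COMMAND")` (the request string is also hypothetical) increments only the `NODE_FAILBACK_CMD` counter, while the proposed variant increments the counter for the command type that was passed in. In this sketch the request for a node failure is therefore made under the failback command type, which matches the mismatch the reporter pointed out at line 764.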
Date Modified | Username | Field | Change |
---|---|---|---|
2016-07-26 09:31 | uehara | New Issue | |
2016-07-26 09:31 | uehara | File Added: Pgpool-II_log.zip | |
2016-07-26 09:58 | t-ishii | Assigned To | => Muhammad Usama |
2016-07-26 09:58 | t-ishii | Status | new => assigned |
2016-07-26 09:59 | t-ishii | Note Added: 0000930 | |
2016-07-31 08:47 | t-ishii | Product Version | => 3.5.3 |
2016-07-31 08:47 | t-ishii | Target Version | => 3.5.4 |
2016-08-02 11:50 | t-ishii | Note Added: 0000956 | |
2016-08-02 11:51 | t-ishii | Status | assigned => feedback |
2016-08-12 10:05 | uehara | Note Added: 0000981 | |
2016-08-12 10:05 | uehara | Status | feedback => assigned |
2016-08-12 11:07 | t-ishii | Note Added: 0000983 | |
2016-08-12 11:07 | t-ishii | Status | assigned => resolved |