View Issue Details
| ID | Project | Category | View Status | Date Submitted | Last Update |
|---|---|---|---|---|---|
| 0000134 | pgpool-HA | Bug | public | 2015-04-26 14:05 | 2015-06-11 13:47 |
| Reporter | Qipan | Assigned To | Muhammad Usama | ||
| Priority | normal | Severity | minor | Reproducibility | have not tried |
| Status | resolved | Resolution | open | ||
| Platform | Linux | OS | Ubuntu 12.04.5 LTS | ||
| Summary | 0000134: down pgpool wd node failback failed | ||||
| Description | Hi, I use the pgpool-HA watchdog (wd). Switching to a new master works well. The problem appears when I try to put the downed pgpool node back into production; it is reproducible in my test environment. pgpool --version: pgpool-II version 3.3.4 (tokakiboshi) | ||||
| Steps To Reproduce | vip: 192.168.0.201
pgpool1 192.168.0.105 (active): service pgpool2 stop
The pgpool processes are still visible afterwards: ps aux|grep pgpool
postgres 39918 0.0 0.0 186584 15156 pts/0 S 21:17 0:00 /usr/sbin/pgpool -n
postgres 39919 0.0 0.0 7152 716 pts/0 S 21:17 0:00 logger -t pgpool -p local0.info
postgres 39985 0.0 0.0 252208 2028 pts/0 Sl 21:17 0:00 pgpool: lifecheck
root 40257 0.0 0.0 9380 924 pts/0 S+ 21:23 0:00 grep --color pgpool
Apr 26 00:23:23 pgpool[40171]: child received shutdown request signal 2
Apr 26 00:23:23 pgpool[40001]: child received shutdown request signal 2
Apr 26 00:23:23 pgpool[39996]: child received shutdown request signal 2
Apr 26 00:23:23 pgpool[39995]: child received shutdown request signal 2
Apr 26 00:23:23 pgpool[39990]: child received shutdown request signal 2
Apr 26 00:23:23 pgpool[39985]: exec_ifconfig: 'pg_ifconfig eth1:0 $_IP_$ 255.255.255.0 down' succeeded
Apr 26 00:23:25 ntpd[13090]: Deleting interface #16 eth1:0, 192.168.0.201#123, interface stats: received=0, sent=0, dropped=0, active_time=337 secs
Apr 26 00:23:25 ntpd[13090]: peers refreshed
Apr 26 00:23:26 pgpool[39985]: exec_ping: failed to ping 192.168.0.201: exit code 1
Apr 26 00:23:26 pgpool[39985]: wd_IP_down: ifconfig down succeeded
Apr 26 00:24:18 pgpool[39918]: received fast shutdown request
Apr 26 00:24:18 pgpool[39918]: pgpool main: close listen socket
pgpool2 192.168.0.106 (standby->active) works well.
Next I wanted to bring pgpool1 back as a standby for the new active node. I force-killed the processes that had not shut down.
Then I ran the start command, but pgpool did not start normally: ps aux|grep pgpool
postgres 40922 0.0 0.0 186584 15148 pts/0 Sl 22:01 0:00 /usr/sbin/pgpool -n
postgres 40923 0.0 0.0 7152 716 pts/0 S 22:01 0:00 logger -t pgpool -p local0.info
root 40944 0.0 0.0 9380 924 pts/0 S+ 22:01 0:00 grep --color pgpool
tailf /var/log/syslog :
Apr 26 01:01:50 pgpool: 2015-04-25 22:01:50 DEBUG: pid 40922: key: check_temp_table
Apr 26 01:01:50 pgpool: 2015-04-25 22:01:50 DEBUG: pid 40922: value: on kind: 1
Apr 26 01:01:50 pgpool: 2015-04-25 22:01:50 DEBUG: pid 40922: key: check_unlogged_table
Apr 26 01:01:50 pgpool[40922]: num_backends: 2 total_weight: 2.000000
Apr 26 01:01:50 pgpool: 2015-04-25 22:01:50 DEBUG: pid 40922: value: on kind: 1
Apr 26 01:01:50 pgpool: 2015-04-25 22:01:50 DEBUG: pid 40922: key: memory_cache_enabled
Apr 26 01:01:50 pgpool[40922]: backend 0 weight: 1073741823.500000
Apr 26 01:01:50 pgpool: 2015-04-25 22:01:50 DEBUG: pid 40922: value: off kind: 1
Apr 26 01:01:50 pgpool[40922]: backend 0 flag: 0000
Apr 26 01:01:50 pgpool[40922]: backend 1 weight: 1073741823.500000
Apr 26 01:01:50 pgpool[40922]: backend 1 flag: 0000
Apr 26 01:01:50 pgpool[40922]: loading "/etc/pgpool2/pool_hba.conf" for client authentication configuration file
Apr 26 01:01:50 pgpool[40922]: wd_chk_setuid: ifup[/var/lib/postgresql/bin/pg_ifconfig] doesn't have setuid bit
Apr 26 01:01:52 pgpool[40922]: exec_ping: succeed to ping 192.168.0.105
Apr 26 01:01:52 pgpool[40922]: get_result: ping data: PING 192.168.0.105 (192.168.0.105) 56(84) bytes of data.#012#012--- 192.168.0.105 ping statistics ---#0123 packets transmitted, 3 received, 0% packet loss, time 1998ms#012rtt min/avg/max/mdev = 0.008/0.010/0.016/0.005 ms
Apr 26 01:01:52 pgpool[40922]: exec_ping: succeed to ping 192.168.0.106
Apr 26 01:01:52 pgpool[40922]: get_result: ping data: PING 192.168.0.106 (192.168.0.106) 56(84) bytes of data.#012#012--- 192.168.0.106 ping statistics ---#0123 packets transmitted, 3 received, 0% packet loss, time 1998ms#012rtt min/avg/max/mdev = 0.101/0.132/0.156/0.026 ms
(hangs here...) | ||||
| Tags | No tags attached. | ||||

What is the correct procedure for bringing a previously downed node back as a standby? Thanks a lot! Qipan

It works only if I restart both pgpool nodes. Is this a bug in the pgpool watchdog? Is there a proper way to bring the downed wd node back without restarting the active node?

I tested 3.3.6 and it solves this problem. From the release notes:

- Fix to use void * type for receiving return value of thread function (Yugo Nagata). Previously int type was used, which could cause a stack buffer overflow. This caused an infinite loop of ping errors when bringing the VIP up or down.
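
The fix quoted in the last note concerns how a thread's return value is received. As a rough illustration only, the sketch below shows the general pattern with POSIX threads. It is not pgpool-II source code: `ping_worker` and `run_ping_worker` are hypothetical names, and the "exit code" is simulated rather than taken from a real ping. It assumes a 64-bit Linux build, where a pointer is 8 bytes and an `int` is 4.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical worker standing in for a watchdog ping thread.
 * It reports its result through the thread's void * return value. */
static void *ping_worker(void *arg)
{
    intptr_t exit_code = (intptr_t) arg;  /* simulated exit code, not a real ping */
    return (void *) exit_code;
}

/* Start the worker, join it, and return its status; -1 if a pthread call fails.
 *
 * pthread_join() stores a full pointer into *retval. Receiving it in an int,
 * for example
 *     int status;
 *     pthread_join(tid, (void **) &status);
 * writes 8 bytes into a 4-byte object on LP64 systems and can overwrite
 * neighbouring stack variables. */
int run_ping_worker(int simulated_exit_code)
{
    pthread_t tid;
    void *retval = NULL;  /* pointer-sized receiver: the safe pattern */

    if (pthread_create(&tid, NULL, ping_worker,
                       (void *) (intptr_t) simulated_exit_code) != 0)
        return -1;
    if (pthread_join(tid, &retval) != 0)
        return -1;

    /* Narrow to int only after the join has completed safely. */
    return (int) (intptr_t) retval;
}
```

Calling `run_ping_worker(1)` from a small `main` and compiling with `-pthread` should return the simulated code unchanged. The point of the design is that the receiving variable is as wide as what `pthread_join` writes, so nothing next to it on the stack can be clobbered.
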
| Date Modified | Username | Field | Change |
|---|---|---|---|
| 2015-04-26 14:05 | Qipan | New Issue | |
| 2015-04-26 14:08 | Qipan | Note Added: 0000531 | |
| 2015-04-26 14:38 | Qipan | Note Added: 0000532 | |
| 2015-05-21 08:22 | t-ishii | Assigned To | => Muhammad Usama |
| 2015-05-21 08:22 | t-ishii | Status | new => assigned |
| 2015-05-21 14:09 | Qipan | Note Added: 0000537 | |
| 2015-06-11 13:47 | t-ishii | Status | assigned => resolved |