View Issue Details

ID: 0000084
Project: Pgpool-II
Category: Bug
View Status: public
Last Update: 2013-12-04 01:17
Reporter: mmontgomery
Assigned To: nagata
Priority: normal
Severity: major
Reproducibility: always
Status: assigned
Resolution: open
Platform: Linux
OS: Ubuntu
OS Version: 12.04 LTS
Product Version:
Target Version:
Fixed in Version:
Summary: 0000084: Watchdog race condition - other members never leave 'not started' state

Description:

Here's our environment.

3x pgpool servers
1x master postgres (9.2)
1x slave postgres (9.2)

root@pgpool01:/etc/pgpool2# dpkg -l | grep pgpool
ii libpgpool0 3.3.1-1.pgdg12.4+1 pgpool control protocol library
ii pgpool2 3.3.1-1.pgdg12.4+1 connection pool server and replication proxy for PostgreSQL

If you restart (or start) all three pgpool servers at the same time, each watchdog sets itself to 'active' and marks all of the other servers as 'Not Started', and they never leave that state, even though the debug logs show healthy heartbeats coming from those servers.

config:
listen_addresses = '*'
port = 5432
socket_dir = '/var/run/postgresql'
pcp_port = 9898
pcp_socket_dir = '/var/run/postgresql'
backend_hostname0 = 'prdplannerpostdb01.clsvr.theknot.com'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/var/lib/postgresql/9.2/main'
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_hostname1 = 'prdplannerpostdb02.clsvr.theknot.com'
backend_port1 = 5432
backend_weight1 = 1000
backend_data_directory1 = '/var/lib/postgresql/9.2/main'
backend_flag1 = 'ALLOW_TO_FAILOVER'
enable_pool_hba = on
pool_passwd = 'pool_passwd'
authentication_timeout = 60
ssl = off
num_init_children = 64
max_pool = 4
child_life_time = 0
child_max_connections = 0
connection_life_time = 0
client_idle_limit = 0
log_destination = 'syslog'
print_timestamp = on
log_connections = off
log_hostname = off
log_statement = off
log_per_node_statement = off
log_standby_delay = 'none'
syslog_facility = 'LOCAL0'
syslog_ident = 'pgpool'
debug_level = 1
pid_file_name = '/var/run/postgresql/pgpool.pid'
logdir = '/var/log/postgresql'
connection_cache = off
reset_query_list = 'ABORT; DISCARD ALL'
replication_mode = off
replicate_select = off
insert_lock = on
lobj_lock_table = ''
replication_stop_on_mismatch = off
failover_if_affected_tuples_mismatch = off
load_balance_mode = on
ignore_leading_white_space = on
white_function_list = ''
black_function_list = 'currval,lastval,nextval,setval,sp_.*'
master_slave_mode = on
master_slave_sub_mode = 'stream'
sr_check_period = 0
sr_check_user = 'planner_user'
sr_check_password = 'kU95eSpaGEyU4Avu'
delay_threshold = 0
follow_master_command = ''
parallel_mode = off
pgpool2_hostname = ''
system_db_hostname = 'localhost'
system_db_port = 5432
system_db_dbname = 'pgpool'
system_db_schema = 'pgpool_catalog'
system_db_user = 'pgpool'
system_db_password = ''
health_check_period = 30
health_check_timeout = 20
health_check_user = 'pgpool'
health_check_password = 'Q9kupenA'
health_check_max_retries = 10
health_check_retry_delay = 10
failover_command = ''
failback_command = ''
fail_over_on_backend_error = off
search_primary_node_timeout = 10
recovery_user = 'nobody'
recovery_password = ''
recovery_1st_stage_command = ''
recovery_2nd_stage_command = ''
recovery_timeout = 90
client_idle_limit_in_recovery = 0
use_watchdog = on
trusted_servers = 'prdplannerpgpool01.clsvr.theknot.com,prdplannerpgpool02.clsvr.theknot.com,prdplannerpgpool03.clsvr.theknot.com'
ping_path = '/bin'
wd_hostname = 'prdplannerpgpool02.clsvr.theknot.com'
wd_port = 9000
wd_authkey = 'asdf'
delegate_IP = ''
ifconfig_path = '/sbin'
if_up_cmd = 'ifconfig eth0:0 inet $_IP_$ netmask 255.255.255.0'
if_down_cmd = 'ifconfig eth0:0 down'
arping_cmd = 'arping -U $_IP_$ -w 1'
clear_memqcache_on_escalation = on
wd_escalation_command = '/usr/local/scripts/reconfigure_heroku.py'
wd_lifecheck_method = 'heartbeat'
wd_interval = 10
wd_heartbeat_port = 9694
wd_heartbeat_keepalive = 10
wd_heartbeat_deadtime = 30
heartbeat_destination0 = 'prdplannerpgpool01.clsvr.theknot.com'
heartbeat_destination_port0 = 9694
heartbeat_device0 = ''
heartbeat_destination1 = 'prdplannerpgpool02.clsvr.theknot.com'
heartbeat_destination_port1 = 9694
heartbeat_device1 = ''
heartbeat_destination2 = 'prdplannerpgpool03.clsvr.theknot.com'
heartbeat_destination_port2 = 9694
heartbeat_device2 = ''
wd_life_point = 3
wd_lifecheck_query = 'SELECT 1'
wd_lifecheck_dbname = 'template1'
wd_lifecheck_user = 'planner_user'
wd_lifecheck_password = 'kU95eSpaGEyU4Avu'
other_pgpool_hostname0 = 'prdplannerpgpool01.clsvr.theknot.com'
other_pgpool_port0 = 5432
other_wd_port0 = 9000
other_pgpool_hostname1 = 'prdplannerpgpool03.clsvr.theknot.com'
other_pgpool_port1 = 5432
other_wd_port1 = 9000
relcache_expire = 0
relcache_size = 256
check_temp_table = on
memory_cache_enabled = off
memqcache_method = 'shmem'
memqcache_memcached_host = 'localhost'
memqcache_memcached_port = 11211
memqcache_total_size = 67108864
memqcache_max_num_cache = 1000000
memqcache_expire = 0
memqcache_auto_cache_invalidation = on
memqcache_maxcache = 409600
memqcache_cache_block_size = 1048576
memqcache_oiddir = '/var/log/pgpool/oiddir'
white_memqcache_table_list = ''
black_memqcache_table_list = ''
Steps To Reproduce:

Restart all pgpool software at the same time.

Check the watchdog status via pgpoolAdmin.
Tags: No tags attached.

Activities

nagata

2013-11-26 11:35

developer   ~0000371

What happens if you start the pgpools at intervals?
They may take some time to find the other pgpools.

If that doesn't help, check the network and firewall conditions.
Do the servers respond to ping from each other? If so, is communication
being blocked by security mechanisms such as iptables or SELinux?

If ping works and iptables/SELinux is disabled or configured correctly,
there is another cause. Could you please send the pgpool.log from all
pgpools for analysis?

mmontgomery

2013-11-26 23:43

reporter   ~0000372

Starting them at intervals causes no issue. The problem is that I can't guarantee a staggered start over time. Why don't the members of the watchdog set talk to each other after startup (which, judging from the logs, they clearly are doing), decide that only one should be primary, and demote the others? According to both the reported state and pgpoolAdmin, they NEVER even see that the other members have started the software.

Systems are in EC2

Allowed within PgPool Security group:
tcp 9000
tcp 9694
tcp 5432
udp 9694
ping

Will upload sanitized logs soon...

mmontgomery

2013-11-27 00:05

reporter  

pgpool.log.bz2 (8,759 bytes)

mmontgomery

2013-11-27 00:07

reporter   ~0000373

They never leave the below states:

Nov 26 14:47:59 prdapppgpool01 pgpool[2310]: check_pgpool_status_by_hb: checking pgpool 0 (prdapppgpool01.randomdomain.com:5432)
Nov 26 14:47:59 prdapppgpool01 pgpool[2310]: check_pgpool_status_by_hb: OK; status 3
Nov 26 14:47:59 prdapppgpool01 pgpool[2310]: check_pgpool_status_by_hb: checking pgpool 1 (prdapppgpool02.randomdomain.com:5432)
Nov 26 14:47:59 prdapppgpool01 pgpool[2310]: wd_check_heartbeat: the latest heartbeat from prdapppgpool02.randomdomain.com:5432 received 4 seconds ago
Nov 26 14:47:59 prdapppgpool01 pgpool[2310]: check_pgpool_status_by_hb: OK; status 1
Nov 26 14:47:59 prdapppgpool01 pgpool[2310]: check_pgpool_status_by_hb: checking pgpool 2 (prdapppgpool03.randomdomain.com:5432)
Nov 26 14:47:59 prdapppgpool01 pgpool[2310]: wd_check_heartbeat: the latest heartbeat from prdapppgpool03.randomdomain.com:5432 received 3 seconds ago
Nov 26 14:47:59 prdapppgpool01 pgpool[2310]: check_pgpool_status_by_hb: OK; status 1
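
For reference, 'status 1' and 'status 3' here appear to correspond to the WD_STATUS enum in watchdog.h. Below is a sketch of that enum as I read the 3.3.1 source; treat the exact members and ordering as an assumption and double-check against the tree:

/* Sketch of WD_STATUS as I read watchdog.h in 3.3.1 -- the exact
 * member list and ordering are an assumption, not a quote. */
typedef enum {
	WD_END = 0,	/* end-of-list marker in WD_List */
	WD_INIT,	/* 1: shows as "Not Started" -- where the other members are stuck */
	WD_NORMAL,	/* 2: running as standby */
	WD_MASTER,	/* 3: active/master -- what each node reports for itself */
	WD_DOWN		/* 4: considered dead */
} WD_STATUS;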

mmontgomery

2013-11-27 00:09

reporter   ~0000374

The others show almost exactly the same thing, just in a different order:

Nov 26 15:08:25 prdapppgpool02 pgpool[29808]: check_pgpool_status_by_hb: checking pgpool 0 (prdapppgpool02.randomdomain.com:5432)
Nov 26 15:08:25 prdapppgpool02 pgpool[29808]: check_pgpool_status_by_hb: OK; status 3
Nov 26 15:08:25 prdapppgpool02 pgpool[29808]: check_pgpool_status_by_hb: checking pgpool 1 (prdapppgpool01.randomdomain.com:5432)
Nov 26 15:08:25 prdapppgpool02 pgpool[29808]: wd_check_heartbeat: the latest heartbeat from prdapppgpool01.randomdomain.com:5432 received 9 seconds ago
Nov 26 15:08:25 prdapppgpool02 pgpool[29808]: check_pgpool_status_by_hb: OK; status 1
Nov 26 15:08:25 prdapppgpool02 pgpool[29808]: check_pgpool_status_by_hb: checking pgpool 2 (prdapppgpool03.randomdomain.com:5432)
Nov 26 15:08:25 prdapppgpool02 pgpool[29808]: wd_check_heartbeat: the latest heartbeat from prdapppgpool03.randomdomain.com:5432 received 9 seconds ago
Nov 26 15:08:25 prdapppgpool02 pgpool[29808]: check_pgpool_status_by_hb: OK; status 1

nagata

2013-11-29 17:51

developer   ~0000375

Heartbeat signals are not used to resolve this inconsistency.

For now, you need to start the pgpool-II instances at intervals.
This is necessary so that each pgpool is ready to receive the other pgpools' connections.

mmontgomery

2013-12-03 06:19

reporter   ~0000376

I've been digging through your code for most of the day. I haven't really looked at C in many, many years, but here's what I see:

1. You only send the 'WD_ADD_REQ' to the other members on startup, which seems odd. What's the issue with re-sending it at some defined interval to the members you don't see in a healthy state?
2. In the wd_hb_receiver function, around line 408, you are clearly receiving heartbeat signals from members that you still hold in state '1' [WD_INIT] (I can see this in my debug logs), yet you never take that opportunity to change the state of those members... Are you waiting for a WD_ADD_REQ packet before changing their state? Judging from the wd_startup function, that packet will never be sent again after startup (rough sketch of what I mean just below).
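
To make point 2 concrete, here is a rough sketch, not actual pgpool-II code, of the kind of change I have in mind: when a live heartbeat arrives from a node we still hold in WD_INIT, re-announce ourselves rather than waiting for a WD_ADD_REQ that will never come again. WdInfo, WD_List, WD_INIT, pool_log() and wd_startup() are names taken from the 3.3.1 source referenced in this thread; the helper name and its call site are only my assumptions.

/* Hypothetical helper, not actual pgpool-II code: resend our add request
 * when a node we still consider "Not Started" is demonstrably alive.
 * Assumes the usual watchdog headers are already included in this file.
 * Intended to be called from the heartbeat/lifecheck path (for example
 * after check_pgpool_status_by_hb decides a node is OK) -- the exact
 * call site is a guess. */
static void
wd_reannounce_if_stuck(WdInfo *p)
{
	/* p is the other pgpool whose heartbeat was just validated */
	if (p != NULL && p->status == WD_INIT)
	{
		pool_log("wd_reannounce_if_stuck: live heartbeat from a WD_INIT node, resending add request");
		wd_startup();	/* wd_startup() is what sends WD_ADD_REQ at boot */
	}
}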

I'm confused as to why this isn't considered a relevant bug. A set of pgpool systems gets stuck in a bad state and never recovers, requiring a restart. Situations other than starting pgpool on multiple systems at the same time (an AWS hardware failure or a network outage, for example) could trigger the same behavior and leave a set of watchdog systems in an unhealthy state from which they will never recover.

mmontgomery

2013-12-04 01:03

reporter   ~0000377

OK, so I made the changes referenced below, and I can now start all pgpool instances at the same time; they realize something isn't right and recover (ending up with only one master, and each shows the others' software running in standby). You certainly know more about this code and can likely tune what I have, but I seem to have fixed the issue. Let me know if you have any questions/concerns/etc.

Thanks for the great software...

diff wd_heartbeat.c ../../tmppgpool/pgpool-II-3.3.1/watchdog/wd_heartbeat.c
489,530d488
< 	/* Send Add Request to other Heartbeat Members */
< 	wd_startup();
< 	WdInfo * p = WD_List;
< 	WD_STATUS prev_status = p->status;
< 	int num_masters = 0;
<
< 	while (p->status != WD_END)
< 	{
< 		/* find master pgpool in the other servers */
< 		if (p->status == WD_MASTER)
< 		{
< 			/* master found */
< 			num_masters++;
< 		}
< 		p++;
< 	}
<
< 	if (num_masters > 1)
< 	{
< 		pool_log("wd_heartbeat: Multiple masters found, so forcing a re-election");
< 		if (wd_am_I_oldest() == WD_OK)
< 		{
< 			pool_log("wd_heartbeat: I'm oldest so standing for master");
< 			/* stand for master */
< 			int rtn = wd_stand_for_master();
< 			if (rtn == WD_OK)
< 			{
< 				/* win */
< 				wd_escalation();
< 			}
< 			else
< 			{
< 				/* rejected by others */
< 				p->status = prev_status;
< 			}
< 		}
< 		else
< 		{
< 			pool_log("wd_heartbeat: I'm not oldest, so standing down from master");
< 			wd_set_myself(NULL, WD_NORMAL);
< 		}
< 	}

mmontgomery

2013-12-04 01:17

reporter   ~0000378

Well.... after further testing, this isn't working exactly as expected. I'm seeing up/down behavior constantly. I'll continue to look at the issue.

Issue History

Date Modified Username Field Change
2013-11-26 03:20 mmontgomery New Issue
2013-11-26 11:35 nagata Note Added: 0000371
2013-11-26 11:35 nagata Assigned To => nagata
2013-11-26 11:35 nagata Status new => feedback
2013-11-26 23:43 mmontgomery Note Added: 0000372
2013-11-26 23:43 mmontgomery Status feedback => assigned
2013-11-27 00:05 mmontgomery File Added: pgpool.log.bz2
2013-11-27 00:07 mmontgomery Note Added: 0000373
2013-11-27 00:09 mmontgomery Note Added: 0000374
2013-11-29 17:51 nagata Note Added: 0000375
2013-12-03 06:19 mmontgomery Note Added: 0000376
2013-12-04 01:03 mmontgomery Note Added: 0000377
2013-12-04 01:17 mmontgomery Note Added: 0000378