View Issue Details

ID: 0000218 | Project: Pgpool-II | Category: Bug | View Status: public | Last Update: 2016-08-02 23:52
Reporter: stivi21 | Assigned To: Muhammad Usama
Priority: high | Severity: major | Reproducibility: always
Status: resolved | Resolution: fixed
Platform: | OS: Ubuntu | OS Version: 14.04
Product Version:
Target Version: | Fixed in Version:
Summary: 0000218: Inconsistent status of PostgreSQL nodes in pgpool instances after restart. Watchdog not syncing status.
Description

Pgpool: 3.5.3
PostgreSQL: 9.4
Pgpool configuration: master/slave mode with streaming replication (per the attached configuration), using watchdog but without a VIP; haproxy is used instead.
Environment: Amazon Web Services, EC2

Host0 (vir-stag-pg-00): pgpool and postgres instance (master)
Host1 (vir-stag-pg-01): pgpool and postgres instance (slave)
Host2 (vir-stag-pg-02): pgpool instance (for quorum)
Steps To Reproduce

1) All hosts are working, with the following statuses (the commands presumably used to collect these outputs are sketched after step 3):
# Status of the PostgreSQL backends (identical on all pgpool instances):
 node_id | hostname | port | status | lb_weight | role | select_cnt
---------+----------------+------+--------+-----------+---------+------------
 0 | vir-stag-pg-00 | 5432 | 2 | 0.500000 | primary | 1023309
 1 | vir-stag-pg-01 | 5432 | 3 | 0.500000 | standby | 0

# Watchdog status (from Host0):
3 YES Linux_ip-10-0-1-129_5433 vir-stag-pg-00

Linux_ip-10-0-1-129_5433 vir-stag-pg-00 5433 9000 4 MASTER
Linux_ip-10-0-112-190_5433 vir-stag-pg-01 5433 9000 7 STANDBY
Linux_ip-10-0-2-132_5433 vir-stag-pg-02 5433 9000 7 STANDBY

2) Host1 is stopped for a few minutes, then started again.

3) After this, Host0 and Host2 report the same status, where the PostgreSQL backend on Host1 (node 1) has status 3:
 node_id | hostname | port | status | lb_weight | role | select_cnt
---------+----------------+------+--------+-----------+---------+------------
 0 | vir-stag-pg-00 | 5432 | 2 | 0.500000 | primary | 1026367
 1 | vir-stag-pg-01 | 5432 | 3 | 0.500000 | standby | 0

On Host1, both PostgreSQL nodes have status 2:
 node_id | hostname | port | status | lb_weight | role | select_cnt
---------+----------------+------+--------+-----------+---------+------------
 0 | vir-stag-pg-00 | 5432 | 2 | 0.500000 | primary | 5
 1 | vir-stag-pg-01 | 5432 | 2 | 0.500000 | standby | 0

The watchdog status is OK on all pgpool instances:
3 YES Linux_ip-10-0-1-129_5433 vir-stag-pg-00

Linux_ip-10-0-1-129_5433 vir-stag-pg-00 5433 9000 4 MASTER
Linux_ip-10-0-112-190_5433 vir-stag-pg-01 5433 9000 7 STANDBY
Linux_ip-10-0-2-132_5433 vir-stag-pg-02 5433 9000 7 STANDBY
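
The tables above look like "show pool_nodes" output and the watchdog listing like pcp_watchdog_info output; they were presumably collected with commands along the following lines (the exact invocation is an assumption; the ports and the pgpool user are taken from the configuration attached in the first note below). Backend status 2 means the node is attached, status 3 means it has been detached.

# Backend node status as seen by one pgpool instance:
psql -h vir-stag-pg-00 -p 5433 -U pgpool -c "show pool_nodes" postgres

# Watchdog status (PCP port 9898):
pcp_watchdog_info -h vir-stag-pg-00 -p 9898 -U pgpool
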
Additional Information

The same thing happens when I restart Host0, except that after the restart the pgpool on that host sees two PostgreSQL masters (because the slave was promoted to master), both shown as active, while the other pgpool instances show the first node as detached.



Logs from pgpool on Host2 (node "Linux_ip-10-0-112-190_5433" is the pgpool on the restarted Host1); a sketch of the failover script traced in this log follows after the log:

### Host1 stopped
Jul 15 09:15:07 ip-10-0-2-132 pgpool[7154]: [40-1] 2016-07-15 09:15:07: pid 7154: LOG: read from socket failed, remote end closed the connection
Jul 15 09:15:07 ip-10-0-2-132 pgpool: 2016-07-15 09:15:07: pid 7154: LOG: read from socket failed, remote end closed the connection
Jul 15 09:15:07 ip-10-0-2-132 pgpool[7154]: [41-1] 2016-07-15 09:15:07: pid 7154: LOG: remote node "Linux_ip-10-0-112-190_5433" is shutting down
Jul 15 09:15:07 ip-10-0-2-132 pgpool: 2016-07-15 09:15:07: pid 7154: LOG: remote node "Linux_ip-10-0-112-190_5433" is shutting down
Jul 15 09:15:07 ip-10-0-2-132 pgpool[7154]: [42-1] 2016-07-15 09:15:07: pid 7154: LOG: read from socket failed, remote end closed the connection
Jul 15 09:15:07 ip-10-0-2-132 pgpool: 2016-07-15 09:15:07: pid 7154: LOG: read from socket failed, remote end closed the connection
Jul 15 09:15:08 ip-10-0-2-132 pgpool[7154]: [43-1] 2016-07-15 09:15:08: pid 7154: LOG: received degenerate backend request for node_id: 1 from pid [7154]
Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7154: LOG: received degenerate backend request for node_id: 1 from pid [7154]
Jul 15 09:15:08 ip-10-0-2-132 pgpool[7150]: [44-1] 2016-07-15 09:15:08: pid 7150: LOG: starting degeneration. shutdown host vir-stag-pg-01(5432)
Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7150: LOG: starting degeneration. shutdown host vir-stag-pg-01(5432)
Jul 15 09:15:08 ip-10-0-2-132 pgpool[7154]: [44-1] 2016-07-15 09:15:08: pid 7154: LOG: sending watchdog response
Jul 15 09:15:08 ip-10-0-2-132 pgpool[7154]: [44-2] 2016-07-15 09:15:08: pid 7154: DETAIL: failover request from other pgpool is canceled because of switching
Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7154: LOG: sending watchdog response
Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7154: DETAIL: failover request from other pgpool is canceled because of switching
Jul 15 09:15:08 ip-10-0-2-132 pgpool[7150]: [45-1] 2016-07-15 09:15:08: pid 7150: LOG: Restart all children
Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7150: LOG: Restart all children
Jul 15 09:15:08 ip-10-0-2-132 pgpool[7156]: [5-1] 2016-07-15 09:15:08: pid 7156: LOG: child process received shutdown request signal 3
Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7156: LOG: child process received shutdown request signal 3
[ cut a lot of the same lines ]
Jul 15 09:15:08 ip-10-0-2-132 pgpool[7154]: [45-1] 2016-07-15 09:15:08: pid 7154: LOG: new IPC connection received
Jul 15 09:15:08 ip-10-0-2-132 pgpool[7154]: [46-1] 2016-07-15 09:15:08: pid 7154: LOG: processing sync request from IPC socket
Jul 15 09:15:08 ip-10-0-2-132 pgpool[7154]: [47-1] 2016-07-15 09:15:08: pid 7154: LOG: sync request from IPC socket is forwarded to master watchdog node "Linux_ip-10-0-1-129_5433"
Jul 15 09:15:08 ip-10-0-2-132 pgpool[7154]: [47-2] 2016-07-15 09:15:08: pid 7154: DETAIL: waiting for the reply from master node...
Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7303: LOG: child process received shutdown request signal 3
[ cut a lot of the same lines ]
Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7154: LOG: new IPC connection received
Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7154: LOG: processing sync request from IPC socket
Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7154: LOG: sync request from IPC socket is forwarded to master watchdog node "Linux_ip-10-0-1-129_5433"
Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7154: DETAIL: waiting for the reply from master node...
Jul 15 09:15:08 ip-10-0-2-132 pgpool[7150]: [46-1] 2016-07-15 09:15:08: pid 7150: LOG: execute command: /etc/pgpool2/failover_stream.sh 1 vir-stag-pg-00 0
Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7150: LOG: execute command: /etc/pgpool2/failover_stream.sh 1 vir-stag-pg-00 0
Jul 15 09:15:08 ip-10-0-2-132 pgpool: Fri Jul 15 09:15:08 UTC 2016
Jul 15 09:15:08 ip-10-0-2-132 pgpool: Failed node: 1
Jul 15 09:15:08 ip-10-0-2-132 pgpool: + [ 1 != 0 ]
Jul 15 09:15:08 ip-10-0-2-132 pgpool: + echo [INFO] Slave node is down. New master not promoted!
Jul 15 09:15:08 ip-10-0-2-132 pgpool: [INFO] Slave node is down. New master not promoted!
Jul 15 09:15:08 ip-10-0-2-132 pgpool: + exit 0
Jul 15 09:15:08 ip-10-0-2-132 pgpool[7154]: [48-1] 2016-07-15 09:15:08: pid 7154: LOG: new IPC connection received
Jul 15 09:15:08 ip-10-0-2-132 pgpool[7154]: [49-1] 2016-07-15 09:15:08: pid 7154: LOG: processing sync request from IPC socket
Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7154: LOG: new IPC connection received
Jul 15 09:15:08 ip-10-0-2-132 pgpool[7154]: [50-1] 2016-07-15 09:15:08: pid 7154: LOG: sync request from IPC socket is forwarded to master watchdog node "Linux_ip-10-0-1-129_5433"
Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7154: LOG: processing sync request from IPC socket
Jul 15 09:15:08 ip-10-0-2-132 pgpool[7154]: [50-2] 2016-07-15 09:15:08: pid 7154: DETAIL: waiting for the reply from master node...
Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7154: LOG: sync request from IPC socket is forwarded to master watchdog node "Linux_ip-10-0-1-129_5433"
Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7154: DETAIL: waiting for the reply from master node...
Jul 15 09:15:08 ip-10-0-2-132 pgpool[7150]: [47-1] 2016-07-15 09:15:08: pid 7150: LOG: failover: set new primary node: 0
Jul 15 09:15:08 ip-10-0-2-132 pgpool[7150]: [48-1] 2016-07-15 09:15:08: pid 7150: LOG: failover: set new master node: 0
Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7150: LOG: failover: set new primary node: 0
Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7150: LOG: failover: set new master node: 0
Jul 15 09:15:08 ip-10-0-2-132 pgpool[7361]: [5-1] 2016-07-15 09:15:08: pid 7361: LOG: worker process received restart request
Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7361: LOG: worker process received restart request
Jul 15 09:15:08 ip-10-0-2-132 pgpool[7154]: [51-1] 2016-07-15 09:15:08: pid 7154: LOG: new IPC connection received
Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7154: LOG: new IPC connection received
Jul 15 09:15:08 ip-10-0-2-132 pgpool[7154]: [52-1] 2016-07-15 09:15:08: pid 7154: LOG: processing sync request from IPC socket
Jul 15 09:15:08 ip-10-0-2-132 pgpool[7154]: [53-1] 2016-07-15 09:15:08: pid 7154: LOG: sync request from IPC socket is forwarded to master watchdog node "Linux_ip-10-0-1-129_5433"
Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7154: LOG: processing sync request from IPC socket
Jul 15 09:15:08 ip-10-0-2-132 pgpool[7154]: [53-2] 2016-07-15 09:15:08: pid 7154: DETAIL: waiting for the reply from master node...
Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7154: LOG: sync request from IPC socket is forwarded to master watchdog node "Linux_ip-10-0-1-129_5433"
Jul 15 09:15:08 ip-10-0-2-132 pgpool: 2016-07-15 09:15:08: pid 7154: DETAIL: waiting for the reply from master node...
Jul 15 09:15:08 ip-10-0-2-132 pgpool[7150]: [49-1] 2016-07-15 09:15:08: pid 7150: LOG: failover done. shutdown host vir-stag-pg-01(5432)
Jul 15 09:15:08 ip-10-0-2-132 pgpool: failover done. shutdown host vir-stag-pg-01(5432)2016-07-15 09:15:08: pid 7150: LOG: failover done. shutdown host vir-stag-pg-01(5432)
Jul 15 09:15:09 ip-10-0-2-132 pgpool[7360]: [1334-1] 2016-07-15 09:15:09: pid 7360: LOG: restart request received in pcp child process
Jul 15 09:15:09 ip-10-0-2-132 pgpool: 2016-07-15 09:15:09: pid 7360: LOG: restart request received in pcp child process
Jul 15 09:15:09 ip-10-0-2-132 pgpool[7150]: [50-1] 2016-07-15 09:15:09: pid 7150: LOG: PCP child 7360 exits with status 0 in failover()
Jul 15 09:15:09 ip-10-0-2-132 pgpool: 2016-07-15 09:15:09: pid 7150: LOG: PCP child 7360 exits with status 0 in failover()
Jul 15 09:15:09 ip-10-0-2-132 pgpool[7150]: [51-1] 2016-07-15 09:15:09: pid 7150: LOG: fork a new PCP child pid 28732 in failover()
Jul 15 09:15:09 ip-10-0-2-132 pgpool: 2016-07-15 09:15:09: pid 7150: LOG: fork a new PCP child pid 28732 in failover()
Jul 15 09:15:09 ip-10-0-2-132 pgpool[7150]: [52-1] 2016-07-15 09:15:09: pid 7150: LOG: child process with pid: 7156 exits with status 0
Jul 15 09:15:09 ip-10-0-2-132 pgpool[7150]: [53-1] 2016-07-15 09:15:09: pid 7150: LOG: child process with pid: 7156 exited with success and will not be restarted
Jul 15 09:15:09 ip-10-0-2-132 pgpool: 2016-07-15 09:15:09: pid 7150: LOG: child process with pid: 7156 exits with status 0
Jul 15 09:15:09 ip-10-0-2-132 pgpool[7150]: [54-1] 2016-07-15 09:15:09: pid 7150: LOG: child process with pid: 7157 exits with status 0
Jul 15 09:15:09 ip-10-0-2-132 pgpool: 2016-07-15 09:15:09: pid 7150: LOG: child process with pid: 7156 exited with success and will not be restarted
Jul 15 09:15:09 ip-10-0-2-132 pgpool[7150]: [55-1] 2016-07-15 09:15:09: pid 7150: LOG: child process with pid: 7157 exited with success and will not be restarted
Jul 15 09:15:09 ip-10-0-2-132 pgpool: 2016-07-15 09:15:09: pid 7150: LOG: child process with pid: 7157 exits with status 0
[ cut a lot of the same lines ]



### Host1 started
Jul 15 09:15:41 ip-10-0-2-132 pgpool[7155]: [8-1] 2016-07-15 09:15:41: pid 7155: LOG: informing the node status change to watchdog
Jul 15 09:15:41 ip-10-0-2-132 pgpool[7155]: [8-2] 2016-07-15 09:15:41: pid 7155: DETAIL: node id :2 status = "NODE DEAD" message:"No heartbeat signal from node"
Jul 15 09:15:41 ip-10-0-2-132 pgpool: 2016-07-15 09:15:41: pid 7155: LOG: informing the node status change to watchdog
Jul 15 09:15:41 ip-10-0-2-132 pgpool: 2016-07-15 09:15:41: pid 7155: DETAIL: node id :2 status = "NODE DEAD" message:"No heartbeat signal from node"
Jul 15 09:15:41 ip-10-0-2-132 pgpool[7154]: [54-1] 2016-07-15 09:15:41: pid 7154: LOG: new IPC connection received
Jul 15 09:15:41 ip-10-0-2-132 pgpool[7154]: [55-1] 2016-07-15 09:15:41: pid 7154: LOG: received node status change ipc message
Jul 15 09:15:41 ip-10-0-2-132 pgpool: 2016-07-15 09:15:41: pid 7154: LOG: new IPC connection received
Jul 15 09:15:41 ip-10-0-2-132 pgpool[7154]: [55-2] 2016-07-15 09:15:41: pid 7154: DETAIL: No heartbeat signal from node
Jul 15 09:15:41 ip-10-0-2-132 pgpool[7154]: [56-1] 2016-07-15 09:15:41: pid 7154: LOG: remote node "Linux_ip-10-0-112-190_5433" is shutting down
Jul 15 09:15:41 ip-10-0-2-132 pgpool: 2016-07-15 09:15:41: pid 7154: LOG: received node status change ipc message
Jul 15 09:15:41 ip-10-0-2-132 pgpool: 2016-07-15 09:15:41: pid 7154: DETAIL: No heartbeat signal from node
Jul 15 09:15:41 ip-10-0-2-132 pgpool: 2016-07-15 09:15:41: pid 7154: LOG: remote node "Linux_ip-10-0-112-190_5433" is shutting down
Jul 15 09:42:21 ip-10-0-2-132 pgpool[7154]: [57-1] 2016-07-15 09:42:21: pid 7154: LOG: new watchdog node connection is received from "10.0.112.190:33746"
Jul 15 09:42:21 ip-10-0-2-132 pgpool: 2016-07-15 09:42:21: pid 7154: LOG: new watchdog node connection is received from "10.0.112.190:33746"
Jul 15 09:42:21 ip-10-0-2-132 pgpool[7154]: [58-1] 2016-07-15 09:42:21: pid 7154: LOG: new outbond connection to vir-stag-pg-01:9000
Jul 15 09:42:21 ip-10-0-2-132 pgpool: 2016-07-15 09:42:21: pid 7154: LOG: new outbond connection to vir-stag-pg-01:9000
Jul 15 09:42:31 ip-10-0-2-132 pgpool[7155]: [9-1] 2016-07-15 09:42:31: pid 7155: LOG: informing the node status change to watchdog
Jul 15 09:42:31 ip-10-0-2-132 pgpool[7155]: [9-2] 2016-07-15 09:42:31: pid 7155: DETAIL: node id :2 status = "NODE ALIVE" message:"Heartbeat signal found"
Jul 15 09:42:31 ip-10-0-2-132 pgpool: 2016-07-15 09:42:31: pid 7155: LOG: informing the node status change to watchdog
Jul 15 09:42:31 ip-10-0-2-132 pgpool: 2016-07-15 09:42:31: pid 7155: DETAIL: node id :2 status = "NODE ALIVE" message:"Heartbeat signal found"
Jul 15 09:42:31 ip-10-0-2-132 pgpool[7154]: [59-1] 2016-07-15 09:42:31: pid 7154: LOG: new IPC connection received
Jul 15 09:42:31 ip-10-0-2-132 pgpool[7154]: [60-1] 2016-07-15 09:42:31: pid 7154: LOG: received node status change ipc message
Jul 15 09:42:31 ip-10-0-2-132 pgpool: 2016-07-15 09:42:31: pid 7154: LOG: new IPC connection received
Jul 15 09:42:31 ip-10-0-2-132 pgpool[7154]: [60-2] 2016-07-15 09:42:31: pid 7154: DETAIL: Heartbeat signal found
Jul 15 09:42:31 ip-10-0-2-132 pgpool: 2016-07-15 09:42:31: pid 7154: LOG: received node status change ipc message
Jul 15 09:42:31 ip-10-0-2-132 pgpool: 2016-07-15 09:42:31: pid 7154: DETAIL: Heartbeat signal found
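
The sh -x trace embedded in the log above ("Failed node: 1" ... "+ exit 0") suggests that /etc/pgpool2/failover_stream.sh looks roughly like the sketch below. This is a reconstruction for readability only; the actual script is not attached to this report, and the promotion branch is an assumption.

#!/bin/sh
# Sketch of /etc/pgpool2/failover_stream.sh, invoked by pgpool as
#   failover_command = '/etc/pgpool2/failover_stream.sh %d %H %P'
# i.e. $1 = failed node id, $2 = new master host, $3 = old primary node id.
date
echo "Failed node: $1"
set -x
# Do nothing unless the failed node was the primary (node 0 / $3 here).
if [ $1 != $3 ]; then
    echo "[INFO] Slave node is down. New master not promoted!"
    exit 0
fi
# Promotion of the new master on $2 would go here (not visible in the trace).
exit 0
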
Tags: No tags attached.

Activities

stivi21

2016-07-15 19:44

reporter   ~0000919

# Configuration on Host2
listen_addresses = '*'
port = 5433
socket_dir = '/var/run/postgresql'
listen_backlog_multiplier = 2
serialize_accept = off
pcp_listen_addresses = '*'
pcp_port = 9898
pcp_socket_dir = '/var/run/postgresql'
backend_hostname0 = 'vir-stag-pg-00'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/var/lib/postgresql/9.4/main'
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_hostname1 = 'vir-stag-pg-01'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/var/lib/postgresql/9.4/main'
backend_flag1 = 'ALLOW_TO_FAILOVER'
enable_pool_hba = on
pool_passwd = 'pool_passwd'
authentication_timeout = 60
ssl = off
num_init_children = 200
max_pool = 4
child_life_time = 300
child_max_connections = 0
connection_life_time = 0
client_idle_limit = 0
log_destination = 'syslog'
log_line_prefix = '%t: pid %p: ' # printf-style string to output at beginning of each log line.
log_connections = off
log_hostname = off
log_statement = off
log_per_node_statement = off
log_standby_delay = 'none'
syslog_facility = 'daemon.info'
syslog_ident = 'pgpool'
debug_level = 0
pid_file_name = '/var/run/postgresql/pgpool.pid'
logdir = '/var/log/postgresql'
connection_cache = on
reset_query_list = 'ABORT; DISCARD ALL'
replication_mode = off
replicate_select = off
insert_lock = on
lobj_lock_table = ''
replication_stop_on_mismatch = off
failover_if_affected_tuples_mismatch = off
load_balance_mode = off
ignore_leading_white_space = on
white_function_list = ''
black_function_list = 'nextval,setval,nextval,setval'
database_redirect_preference_list = ''
app_name_redirect_preference_list = ''
allow_sql_comments = off
master_slave_mode = on
master_slave_sub_mode = 'stream'
sr_check_period = 10
sr_check_user = 'pgpool'
sr_check_password = '***'
sr_check_database = 'postgres'
delay_threshold = 0
follow_master_command = 'date >> /tmp/pgpool_follow_master'
health_check_period = 10
health_check_timeout = 20
health_check_user = 'pgpool'
health_check_password = '***'
health_check_database = 'postgres'
health_check_max_retries = 2
health_check_retry_delay = 10
connect_timeout = 10000
failover_command = '/etc/pgpool2/failover_stream.sh %d %H %P'
failback_command = 'date >> /tmp/pgpool_failback'
fail_over_on_backend_error = off
search_primary_node_timeout = 10
recovery_user = 'pgpool'
recovery_password = '***'
recovery_1st_stage_command = 'basebackup.sh'
recovery_2nd_stage_command = ''
recovery_timeout = 90
client_idle_limit_in_recovery = 0
use_watchdog = on
trusted_servers = ''
ping_path = '/bin'
wd_hostname = 'vir-stag-pg-02'
wd_port = 9000
wd_priority = 1
wd_authkey = ''
wd_ipc_socket_dir = '/var/run/postgresql'
delegate_IP = ''
if_cmd_path = '/sbin'
if_up_cmd = 'ip addr add $_IP_$/24 dev eth0 label eth0:0'
if_down_cmd = 'ip addr del $_IP_$/24 dev eth0'
arping_path = '/usr/sbin'
arping_cmd = 'arping -U $_IP_$ -w 1'
clear_memqcache_on_escalation = on
wd_escalation_command = 'date >> /tmp/pgpool_escalation'
wd_de_escalation_command = 'date >> /tmp/pgpool_de_escalation'
wd_monitoring_interfaces_list = 'eth0' # Comma separated list of interfaces names to monitor.
wd_lifecheck_method = 'heartbeat'
wd_interval = 10
wd_heartbeat_port = 9694
wd_heartbeat_keepalive = 2
wd_heartbeat_deadtime = 30
heartbeat_destination0 = 'vir-stag-pg-00'
heartbeat_destination_port0 = 9694
heartbeat_device0 = 'eth0'
heartbeat_destination1 = 'vir-stag-pg-01'
heartbeat_destination_port1 = 9694
heartbeat_device1 = 'eth0'
wd_life_point = 3
wd_lifecheck_query = 'SELECT 1'
wd_lifecheck_dbname = 'template1'
wd_lifecheck_user = 'nobody'
wd_lifecheck_password = ''
other_pgpool_hostname0 = 'vir-stag-pg-00'
other_pgpool_port0 = 5433
other_wd_port0 = 9000
other_pgpool_hostname1 = 'vir-stag-pg-01'
other_pgpool_port1 = 5433
other_wd_port1 = 9000
relcache_expire = 0
relcache_size = 256
check_temp_table = on
check_unlogged_table = on
memory_cache_enabled = off
memqcache_method = 'shmem'
memqcache_memcached_host = 'localhost'
memqcache_memcached_port = 11211
memqcache_total_size = 67108864
memqcache_max_num_cache = 1000000
memqcache_expire = 0
memqcache_auto_cache_invalidation = on
memqcache_maxcache = 409600
memqcache_cache_block_size = 1048576
memqcache_oiddir = '/var/log/pgpool/oiddir'
white_memqcache_table_list = ''
black_memqcache_table_list = ''
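
The other two pgpool instances are described as running the same configuration; presumably only the host-specific watchdog entries differ. On Host0 (vir-stag-pg-00), for example, they would read roughly as follows (an illustrative assumption, not taken from the report):

wd_hostname = 'vir-stag-pg-00'
heartbeat_destination0 = 'vir-stag-pg-01'
heartbeat_destination1 = 'vir-stag-pg-02'
other_pgpool_hostname0 = 'vir-stag-pg-01'
other_pgpool_hostname1 = 'vir-stag-pg-02'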

stivi21

2016-07-26 01:45

reporter   ~0000928

I tested one more time in a VirtualBox environment. The same configuration, but with changed hostnames and IPs:

pg-00 - 192.168.6.20
pg-01 - 192.168.6.21
pg-02 - 192.168.6.22


Below are the logs from pg-01 after it starts, where the PostgreSQL backend on this node has status 3 (detached) on the other pgpool instances. These logs are in debug mode.

Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1412: DEBUG: loading hba configuration
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1412: DETAIL: loading file :"/etc/pgpool2/pool_hba.conf" for client authentication configuration file
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1412: DEBUG: pool_coninfo_size: num_init_children (2) * max_pool (4) * MAX_NUM_BACKENDS (128) * sizeof(ConnectionInfo) (136) = 139264 bytes requested for shared memory
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1412: DEBUG: ProcessInfo: num_init_children (2) * sizeof(ProcessInfo) (32) = 64 bytes requested for shared memory
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1412: DEBUG: Request info are: sizeof(POOL_REQUEST_INFO) 5224 bytes requested for shared memory
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1412: DEBUG: Recovery management area: sizeof(int) 4 bytes requested for shared memory
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1412: DEBUG: WD_Node_List: sizeof(unsigned char) (1) * MAX_NUM_BACKENDS (128) = 128 bytes requested for shared memory
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1412: LOG: waiting for watchdog to initialize
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: LOG: setting the local watchdog node name to "Linux_pg-01_5433"
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: LOG: watchdog cluster configured with 2 remote nodes
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: LOG: watchdog remote node:0 on pg-00:9000
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: LOG: watchdog remote node:1 on pg-02:9000
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: DEBUG: adding monitoring interface [0] name eth0 index 2
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: LOG: IPC socket path: "/var/run/postgresql/.s.PGPOOLWD_CMD.9000"
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: DEBUG: network interface lo having flags 65609
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: DEBUG: network interface eth0 having flags 69699
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: DEBUG: network interface "eth0" link is active
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: DEBUG: network interface "eth0" link is up
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: DEBUG: network interface eth1 having flags 69699
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: DEBUG: network interface lo having flags 65609
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: DEBUG: network interface eth0 having flags 69699
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: DEBUG: network interface "eth0" link is active
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: DEBUG: network interface "eth0" link is up
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: DEBUG: network interface eth1 having flags 69699
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: DEBUG: network interface lo having flags 65609
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: DEBUG: network interface eth0 having flags 69699
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: DEBUG: network interface "eth0" link is active
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: DEBUG: network interface "eth0" link is up
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: DEBUG: network interface eth1 having flags 69699
Jul 25 18:33:02 pg-01 pgpool: message repeated 2 times: [ 2016-07-25 18:33:02: pid 1416: DEBUG: network interface eth1 having flags 69699]
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: DEBUG: network interface "eth0" is up and we can continue
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: DEBUG: STATE MACHINE INVOKED WITH EVENT = STATE CHANGED Current State = LOADING
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: LOG: new outbond connection to pg-00:9000
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: DEBUG: STATE MACHINE INVOKED WITH EVENT = NEW OUTBOUND_CONNECTION Current State = LOADING
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: DEBUG: sending watchdog packet Socket:7, Type:[ADD NODE], Command_ID:3, data Length:145
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: LOG: new outbond connection to pg-02:9000
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: DEBUG: STATE MACHINE INVOKED WITH EVENT = NEW OUTBOUND_CONNECTION Current State = LOADING
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: DEBUG: sending watchdog packet Socket:8, Type:[ADD NODE], Command_ID:4, data Length:145
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: DEBUG: received packet tyep I while need packet type
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: DEBUG: STATE MACHINE INVOKED WITH EVENT = PACKET RECEIVED Current State = LOADING
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: LOG: watchdog node state changed from [LOADING] to [INITIALIZING]
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: DEBUG: STATE MACHINE INVOKED WITH EVENT = STATE CHANGED Current State = INITIALIZING
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: DEBUG: sending watchdog packet Socket:7, Type:[NODE INFO], Command_ID:5, data Length:145
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: DEBUG: sending watchdog packet Socket:8, Type:[NODE INFO], Command_ID:5, data Length:145
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: LOG: new watchdog node connection is received from "192.168.6.20:18122"
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: LOG: new watchdog node connection is received from "192.168.6.22:7858"
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: DEBUG: received packet tyep A while need packet type A
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: DEBUG: ADD NODE MESSAGE from Hostname:"pg-00" PORT:9000 pgpool_port:5433
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: NOTICE: New node joined the cluster Hostname:"pg-00" PORT:9000 pgpool_port:5433
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: DEBUG: STATE MACHINE INVOKED WITH EVENT = PACKET RECEIVED Current State = INITIALIZING
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: DEBUG: sending watchdog packet Socket:7, Type:[NODE INFO], Command_ID:80, data Length:145
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: DEBUG: received packet tyep A while need packet type A
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: DEBUG: ADD NODE MESSAGE from Hostname:"pg-02" PORT:9000 pgpool_port:5433
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: NOTICE: New node joined the cluster Hostname:"pg-02" PORT:9000 pgpool_port:5433
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: DEBUG: STATE MACHINE INVOKED WITH EVENT = PACKET RECEIVED Current State = INITIALIZING
Jul 25 18:33:02 pg-01 pgpool: 2016-07-25 18:33:02: pid 1416: DEBUG: sending watchdog packet Socket:8, Type:[NODE INFO], Command_ID:31, data Length:145
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1416: DEBUG: STATE MACHINE INVOKED WITH EVENT = TIMEOUT Current State = INITIALIZING
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1416: LOG: watchdog node state changed from [INITIALIZING] to [STANDBY]
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1416: DEBUG: STATE MACHINE INVOKED WITH EVENT = STATE CHANGED Current State = STANDBY
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1416: DEBUG: sending watchdog packet Socket:7, Type:[JOIN COORDINATOR], Command_ID:6, data Length:0
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1416: DEBUG: sending watchdog packet Socket:7, Type:[NODE INFO], Command_ID:7, data Length:145
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1416: DEBUG: sending watchdog packet Socket:8, Type:[NODE INFO], Command_ID:7, data Length:145
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1416: DEBUG: received packet tyep G while need packet type
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1416: DEBUG: STATE MACHINE INVOKED WITH EVENT = PACKET RECEIVED Current State = STANDBY
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1416: DEBUG: Watchdog node "Linux_pg-00_5433" has replied for command id 6
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1416: DEBUG: STATE MACHINE INVOKED WITH EVENT = COMMAND FINISHED Current State = STANDBY
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1416: DEBUG: sending watchdog packet Socket:7, Type:[ASK FOR POOL CONFIG], Command_ID:8, data Length:0
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: LOG: watchdog process is initialized
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1416: LOG: successfully joined the watchdog cluster as standby node
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1416: DETAIL: our join coordinator request is accepted by cluster leader node "Linux_pg-00_5433"
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1416: DEBUG: received packet tyep Z while need packet type
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1416: DEBUG: STATE MACHINE INVOKED WITH EVENT = PACKET RECEIVED Current State = STANDBY
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1417: DEBUG: I am watchdog lifecheck child with pid:1417
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: LOG: Setting up socket for 0.0.0.0:5433
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: LOG: Setting up socket for :::5433
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1416: LOG: new IPC connection received
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1417: LOG: 3 watchdog nodes are configured for lifecheck
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1417: LOG: watchdog nodes ID:0 Name:"Linux_pg-01_5433"
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1417: DETAIL: Host:"pg-01" WD Port:9000 pgpool-II port:5433
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1417: LOG: watchdog nodes ID:1 Name:"Linux_pg-00_5433"
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1417: DETAIL: Host:"pg-00" WD Port:9000 pgpool-II port:5433
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1417: LOG: watchdog nodes ID:2 Name:"Linux_pg-02_5433"
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1417: DETAIL: Host:"pg-02" WD Port:9000 pgpool-II port:5433
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: LOG: pgpool-II successfully started. version 3.5.3 (ekieboshi)
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: LOG: find_primary_node: checking backend no 0
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1417: DEBUG: watchdog checking life check is ready
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1417: DETAIL: pgpool:1 at "pg-00:5433" has not send the heartbeat signal yet
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1417: DEBUG: watchdog checking life check is ready
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1417: DETAIL: pgpool:2 at "pg-02:5433" has not send the heartbeat signal yet
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1418: DEBUG: initializing backend status
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: I am 1424
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: initializing backend status
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1422: DEBUG: I am PCP child with pid:1422
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1420: DEBUG: initializing backend status
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: pool_read: read 13 bytes from backend 0
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: authenticate kind = 5
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: pool_write: to backend: 0 kind:p
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: pool_read: read 319 bytes from backend 0
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: authenticate kind = 0
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: authenticate backend: key data received
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: authenticate backend: transaction state: I
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: do_query: extended:0 query:"SELECT pg_is_in_recovery()"
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: pool_write: to backend: 0 kind:Q
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: pool_read: read 13 bytes from backend 0
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: authenticate kind = 5
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: pool_write: to backend: 0 kind:p
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: pool_read: read 75 bytes from backend 0
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: do_query: kind: 'T'
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: do_query: received ROW DESCRIPTION ('T')
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: do_query: row description: num_fileds: 1
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: do_query: kind: 'D'
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: do_query: received DATA ROW ('D')
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: do_query: kind: 'C'
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: do_query: received COMMAND COMPLETE ('C')
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: pool_read: read 319 bytes from backend 0
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: authenticate kind = 0
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: authenticate backend: key data received
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: authenticate backend: transaction state: I
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: do_query: kind: 'Z'
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: do_query: received READY FOR QUERY ('Z')
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: pool_write: to backend: 0 kind:X
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: LOG: find_primary_node: primary node id is 0
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: pool_read: read 13 bytes from backend 0
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: authenticate kind = 5
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: pool_write: to backend: 0 kind:p
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: pool_read: read 319 bytes from backend 0
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: authenticate kind = 0
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: authenticate backend: key data received
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: authenticate backend: transaction state: I
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: do_query: extended:0 query:"SELECT pg_current_xlog_location()"
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: CONTEXT: while checking replication time lag
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: pool_write: to backend: 0 kind:Q
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: CONTEXT: while checking replication time lag
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: pool_read: read 91 bytes from backend 0
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: CONTEXT: while checking replication time lag
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: do_query: kind: 'T'
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: CONTEXT: while checking replication time lag
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: do_query: received ROW DESCRIPTION ('T')
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: CONTEXT: while checking replication time lag
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: do_query: row description: num_fileds: 1
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: CONTEXT: while checking replication time lag
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: do_query: kind: 'D'
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: CONTEXT: while checking replication time lag
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: do_query: received DATA ROW ('D')
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: CONTEXT: while checking replication time lag
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: do_query: kind: 'C'
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: CONTEXT: while checking replication time lag
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: do_query: received COMMAND COMPLETE ('C')
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: CONTEXT: while checking replication time lag
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: do_query: kind: 'Z'
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: CONTEXT: while checking replication time lag
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: do_query: received READY FOR QUERY ('Z')
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: CONTEXT: while checking replication time lag
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: do_query: extended:0 query:"SELECT pg_last_xlog_replay_location()"
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: CONTEXT: while checking replication time lag
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: pool_write: to backend: 0 kind:Q
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: CONTEXT: while checking replication time lag
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: pool_read: read 95 bytes from backend 0
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: CONTEXT: while checking replication time lag
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: do_query: kind: 'T'
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: CONTEXT: while checking replication time lag
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: do_query: received ROW DESCRIPTION ('T')
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: CONTEXT: while checking replication time lag
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: do_query: row description: num_fileds: 1
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: CONTEXT: while checking replication time lag
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: do_query: kind: 'D'
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: CONTEXT: while checking replication time lag
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: do_query: received DATA ROW ('D')
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: CONTEXT: while checking replication time lag
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: do_query: kind: 'C'
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: CONTEXT: while checking replication time lag
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: do_query: received COMMAND COMPLETE ('C')
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: CONTEXT: while checking replication time lag
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: do_query: kind: 'Z'
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: CONTEXT: while checking replication time lag
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: do_query: received READY FOR QUERY ('Z')
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: CONTEXT: while checking replication time lag
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: pool_write: to backend: 0 kind:X
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1424: DEBUG: pool_write: to backend: 0 kind:X
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: starting health check
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: health check: clearing alarm
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: doing health check against database:postgres user:pgpool
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: Backend DB node 0 status is 2
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: Trying to make persistent DB connection to backend node 0 having status 2
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: pool_read: read 13 bytes from backend 0
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: authenticate kind = 5
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: pool_write: to backend: 0 kind:p
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: pool_read: read 319 bytes from backend 0
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: authenticate kind = 0
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: authenticate backend: key data received
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: authenticate backend: transaction state: I
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: persistent DB connection to backend node 0 having status 2 is successful
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: pool_write: to backend: 0 kind:X
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: Backend DB node 1 status is 2
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: Trying to make persistent DB connection to backend node 1 having status 2
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: pool_read: read 13 bytes from backend 0
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: authenticate kind = 5
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: pool_write: to backend: 0 kind:p
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: pool_read: read 319 bytes from backend 0
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: authenticate kind = 0
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: authenticate backend: key data received
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: authenticate backend: transaction state: I
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: persistent DB connection to backend node 1 having status 2 is successful
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: pool_write: to backend: 0 kind:X
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: health check: clearing alarm
Jul 25 18:33:03 pg-01 pgpool: 2016-07-25 18:33:03: pid 1412: DEBUG: health check: clearing alarm
Jul 25 18:33:04 pg-01 pgpool: 2016-07-25 18:33:04: pid 1421: LOG: creating socket for sending heartbeat
Jul 25 18:33:04 pg-01 pgpool: 2016-07-25 18:33:04: pid 1421: DETAIL: setsockopt(SO_BINDTODEVICE) requires root privilege
Jul 25 18:33:04 pg-01 pgpool: 2016-07-25 18:33:04: pid 1421: LOG: set SO_REUSEPORT option to the socket
Jul 25 18:33:04 pg-01 pgpool: 2016-07-25 18:33:04: pid 1421: LOG: creating socket for sending heartbeat
Jul 25 18:33:04 pg-01 pgpool: 2016-07-25 18:33:04: pid 1421: DETAIL: set SO_REUSEPORT
Jul 25 18:33:04 pg-01 pgpool: 2016-07-25 18:33:04: pid 1421: DEBUG: watchdog heartbeat: send heartbeat signal to pg-00:9694
Jul 25 18:33:04 pg-01 pgpool: 2016-07-25 18:33:04: pid 1419: LOG: failed to create watchdog heartbeat receive socket.
Jul 25 18:33:04 pg-01 pgpool: 2016-07-25 18:33:04: pid 1419: DETAIL: setsockopt(SO_BINDTODEVICE) requies root privilege
Jul 25 18:33:04 pg-01 pgpool: 2016-07-25 18:33:04: pid 1419: LOG: set SO_REUSEPORT option to the socket
Jul 25 18:33:04 pg-01 pgpool: 2016-07-25 18:33:04: pid 1419: LOG: creating watchdog heartbeat receive socket.
Jul 25 18:33:04 pg-01 pgpool: 2016-07-25 18:33:04: pid 1419: DETAIL: set SO_REUSEPORT
Jul 25 18:33:04 pg-01 pgpool: 2016-07-25 18:33:04: pid 1423: LOG: failed to create watchdog heartbeat receive socket.
Jul 25 18:33:04 pg-01 pgpool: 2016-07-25 18:33:04: pid 1423: DETAIL: setsockopt(SO_BINDTODEVICE) requies root privilege
Jul 25 18:33:04 pg-01 pgpool: 2016-07-25 18:33:04: pid 1423: LOG: set SO_REUSEPORT option to the socket
Jul 25 18:33:04 pg-01 pgpool: 2016-07-25 18:33:04: pid 1423: LOG: creating watchdog heartbeat receive socket.
Jul 25 18:33:04 pg-01 pgpool: 2016-07-25 18:33:04: pid 1423: DETAIL: set SO_REUSEPORT
Jul 25 18:33:04 pg-01 pgpool: 2016-07-25 18:33:04: pid 1425: LOG: creating socket for sending heartbeat
Jul 25 18:33:04 pg-01 pgpool: 2016-07-25 18:33:04: pid 1425: DETAIL: setsockopt(SO_BINDTODEVICE) requires root privilege
Jul 25 18:33:04 pg-01 pgpool: 2016-07-25 18:33:04: pid 1425: LOG: set SO_REUSEPORT option to the socket
Jul 25 18:33:04 pg-01 pgpool: 2016-07-25 18:33:04: pid 1425: LOG: creating socket for sending heartbeat
Jul 25 18:33:04 pg-01 pgpool: 2016-07-25 18:33:04: pid 1425: DETAIL: set SO_REUSEPORT
Jul 25 18:33:04 pg-01 pgpool: 2016-07-25 18:33:04: pid 1425: DEBUG: watchdog heartbeat: send heartbeat signal to pg-02:9694
Jul 25 18:33:04 pg-01 pgpool: 2016-07-25 18:33:04: pid 1419: DEBUG: received heartbeat signal from "pg-02:5433"
Jul 25 18:33:05 pg-01 pgpool: 2016-07-25 18:33:05: pid 1423: DEBUG: received heartbeat signal from "pg-00:5433"
Jul 25 18:33:06 pg-01 pgpool: 2016-07-25 18:33:06: pid 1421: DEBUG: watchdog heartbeat: send heartbeat signal to pg-00:9694
Jul 25 18:33:06 pg-01 pgpool: 2016-07-25 18:33:06: pid 1425: DEBUG: watchdog heartbeat: send heartbeat signal to pg-02:9694
Jul 25 18:33:06 pg-01 pgpool: 2016-07-25 18:33:06: pid 1419: DEBUG: received heartbeat signal from "pg-02:5433"
Jul 25 18:33:07 pg-01 pgpool: 2016-07-25 18:33:07: pid 1416: DEBUG: received packet tyep M while need packet type
Jul 25 18:33:07 pg-01 pgpool: 2016-07-25 18:33:07: pid 1416: DEBUG: STATE MACHINE INVOKED WITH EVENT = PACKET RECEIVED Current State = STANDBY
Jul 25 18:33:07 pg-01 pgpool: 2016-07-25 18:33:07: pid 1416: DEBUG: sending watchdog packet Socket:7, Type:[NODE INFO], Command_ID:81, data Length:145
Jul 25 18:33:07 pg-01 pgpool: 2016-07-25 18:33:07: pid 1423: DEBUG: received heartbeat signal from "pg-00:5433"

stivi21

2016-07-26 20:08

reporter   ~0000931

I prepared a ready-made test environment on VirtualBox:
https://s3-ap-northeast-1.amazonaws.com/pgpool/pgcluster.ova

There are 3 servers (the same configuration as in the last test).
Login for instances: krzysiek
Password: 1234

sudo -i works without a password.
Only a change in /etc/hosts is required for it to work properly.
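
Presumably the required /etc/hosts entries are the hostname/IP pairs from the previous note (an assumption; adjust to the actual addresses of the imported VMs):

192.168.6.20  pg-00
192.168.6.21  pg-01
192.168.6.22  pg-02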

Muhammad Usama

2016-07-27 05:20

developer   ~0000941

Syncing of the backend node status across watchdog nodes was not implemented in the watchdog. Can you try the attached patch, which implements it, and check whether it solves the issue you are facing? After this patch, a standby pgpool will load the backend node status from the master/coordinator watchdog node.
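
A minimal sketch of applying and testing the patch, assuming a pgpool-II 3.5.3 source tree built with the default steps (the paths are placeholders):

tar xzf pgpool-II-3.5.3.tar.gz
cd pgpool-II-3.5.3
patch -p1 < /path/to/backend_wd_status.diff
./configure
make && sudo make install
# Restart pgpool on all three hosts, then repeat the Host1 stop/start test and
# compare "show pool_nodes" output on every instance.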

Muhammad Usama

2016-07-27 05:21

developer  

backend_wd_status.diff (27,366 bytes)
diff --git a/src/include/watchdog/wd_ipc_commands.h b/src/include/watchdog/wd_ipc_commands.h
index 2ef5908..1c4175c 100644
--- a/src/include/watchdog/wd_ipc_commands.h
+++ b/src/include/watchdog/wd_ipc_commands.h
@@ -27,6 +27,7 @@
 #define WD_IPC_COMMANDS_H
 
 #include "watchdog/wd_ipc_defines.h"
+#include "watchdog/wd_json_data.h"
 
 typedef enum WdCommandResult
 {
@@ -64,6 +65,7 @@ extern WdCommandResult wd_degenerate_backend_set(int *node_id_set, int count);
 extern WdCommandResult wd_promote_backend(int node_id);
 extern WDFailoverCMDResults wd_send_failover_sync_command(WDFailoverCMDTypes cmdType, char* syncReqType);
 
+extern WDPGBackendStatus* get_pg_backend_status_from_master_wd_node(void);
 
 extern char* wd_get_watchdog_nodes(int nodeID);
 
diff --git a/src/include/watchdog/wd_ipc_defines.h b/src/include/watchdog/wd_ipc_defines.h
index 8ef3929..cd8adf5 100644
--- a/src/include/watchdog/wd_ipc_defines.h
+++ b/src/include/watchdog/wd_ipc_defines.h
@@ -63,7 +63,7 @@ typedef enum WDFailoverCMDResults
 
 #define WD_FUNCTION_COMMAND					'f'
 #define WD_FAILOVER_CMD_SYNC_REQUEST		's'
-
+#define WD_GET_MASTER_DATA_REQUEST			'd'
 
 #define WD_FUNCTION_START_RECOVERY		"START_RECOVERY"
 #define WD_FUNCTION_END_RECOVERY		"END_RECOVERY"
@@ -71,6 +71,8 @@ typedef enum WDFailoverCMDResults
 #define WD_FUNCTION_DEGENERATE_REQUEST	"DEGENERATE_BACKEND_REQUEST"
 #define WD_FUNCTION_PROMOTE_REQUEST		"PROMOTE_BACKEND_REQUEST"
 
+#define WD_DATE_REQ_PG_BACKEND_DATA		"BackendStatus"
+
 #define WD_IPC_AUTH_KEY			"IPCAuthKey"	/* JSON data key for authentication.
 												 * watchdog IPC server use the value for this key
 												 * to authenticate the external IPC clients
diff --git a/src/include/watchdog/wd_json_data.h b/src/include/watchdog/wd_json_data.h
index b3b09b8..e8f662d 100644
--- a/src/include/watchdog/wd_json_data.h
+++ b/src/include/watchdog/wd_json_data.h
@@ -41,6 +41,18 @@ typedef struct WDNodeInfo
 	int	id;
 }WDNodeInfo;
 
+/*
+ * The structure to hold the parsed PG backend node status data fetched
+ * from the master watchdog node
+ */
+typedef struct WDPGBackendStatus
+{
+	int primary_node_id;
+	int node_count;
+	BACKEND_STATUS backend_status[MAX_NUM_BACKENDS];
+	char nodeName[WD_MAX_HOST_NAMELEN];		/* name of the watchdog node that sent the data */
+}WDPGBackendStatus;
+
 extern WatchdogNode* get_watchdog_node_from_json(char* json_data, int data_len, char** authkey);
 extern char* get_watchdog_node_info_json(WatchdogNode* wdNode, char* authkey);
 extern POOL_CONFIG* get_pool_config_from_json(char* json_data, int data_len);
@@ -53,4 +65,11 @@ extern WDNodeInfo* get_WDNodeInfo_from_wd_node_json(json_value* source);
 extern char* get_wd_node_function_json(char* func_name, int *node_id_set, int count, unsigned int sharedKey, char* authKey);
 extern bool parse_wd_node_function_json(char* json_data, int data_len, char** func_name, int **node_id_set, int *count);
 extern char* get_wd_simple_error_message_json(char* message);
+
+extern WDPGBackendStatus* get_pg_backend_node_status_from_json(char* json_data, int data_len);
+extern char* get_backend_node_status_json(WatchdogNode* wdNode);
+
+extern bool parse_data_request_json(char* json_data, int data_len, char** request_type);
+extern char* get_data_request_json(char* request_type, unsigned int sharedKey, char* authKey);
+
 #endif
diff --git a/src/main/pgpool_main.c b/src/main/pgpool_main.c
index 039d4bc..becf5bc 100644
--- a/src/main/pgpool_main.c
+++ b/src/main/pgpool_main.c
@@ -131,6 +131,7 @@ static int find_primary_node_repeatedly(void);
 static void terminate_all_childrens();
 static void system_will_go_down(int code, Datum arg);
 static char* process_name_from_pid(pid_t pid);
+static void initialize_backend_status_from_watchdog(void);
 
 static struct sockaddr_un un_addr;		/* unix domain socket path */
 static struct sockaddr_un pcp_un_addr;  /* unix domain socket path for PCP */
@@ -228,7 +229,6 @@ int PgpoolMain(bool discard_status, bool clear_memcache_oidmaps)
 				(errmsg("failed to allocate memory in startup process")));
 
 	initialize_shared_mem_objects(clear_memcache_oidmaps);
-
 	if (pool_config->use_watchdog)
 	{
 		sigset_t mask;
@@ -275,6 +275,9 @@ int PgpoolMain(bool discard_status, bool clear_memcache_oidmaps)
 		 * initialize the lifecheck process
 		 */
 		wd_lifecheck_pid = initialize_watchdog_lifecheck();
+
+		/* load the backend node status from watchdog cluster */
+		initialize_backend_status_from_watchdog();
 	}
 
 	fds[0] = create_unix_domain_socket(un_addr);
@@ -371,8 +374,14 @@ int PgpoolMain(bool discard_status, bool clear_memcache_oidmaps)
 	ereport(LOG,
 			(errmsg("%s successfully started. version %s (%s)", PACKAGE, VERSION, PGPOOLVERSION)));
 
-	/* Save primary node id */
-	Req_info->primary_node_id = find_primary_node();
+	/*
+	 * if the primary node id is not loaded by watchdog, search for it
+	 */
+	if (Req_info->primary_node_id < 0)
+	{
+		/* Save primary node id */
+		Req_info->primary_node_id = find_primary_node();
+	}
 
 	if (sigsetjmp(local_sigjmp_buf, 1) != 0)
 	{
@@ -2929,7 +2938,6 @@ static void initialize_shared_mem_objects(bool clear_memcache_oidmaps)
 	ereport(DEBUG1,
 			(errmsg("Request info are: sizeof(POOL_REQUEST_INFO) %zu bytes requested for shared memory",
 					sizeof(POOL_REQUEST_INFO))));
-
 	/*
 	 * Initialize backend status area.
 	 * From now on, VALID_BACKEND macro can be used.
@@ -2946,6 +2954,7 @@ static void initialize_shared_mem_objects(bool clear_memcache_oidmaps)
 	Req_info->conn_counter = 0;
 	Req_info->switching = false;
 	Req_info->request_queue_head = Req_info->request_queue_tail = -1;
+	Req_info->primary_node_id = -2;
 	InRecovery = pool_shared_memory_create(sizeof(int));
 	*InRecovery = RECOVERY_INIT;
 
@@ -3338,3 +3347,73 @@ int pool_frontend_exists(void)
 		return pg_frontend_exists();
 	return -1;
 }
+
+static void initialize_backend_status_from_watchdog(void)
+{
+	if (pool_config->use_watchdog)
+	{
+		WDPGBackendStatus* backendStatus = get_pg_backend_status_from_master_wd_node();
+		if (backendStatus)
+		{
+			if (backendStatus->node_count <= 0)
+			{
+				/*
+				 * -ve node count is returned by watchdog when the node itself is a master
+				 * and in that case we need to use the loacl backend node status
+				 */
+				ereport(LOG,
+						(errmsg("I am the master watchdog node"),
+						 errdetail("using the local backend node status")));
+			}
+			else
+			{
+				int i;
+				bool reload_maste_node_id = false;
+				ereport(LOG,
+						(errmsg("master watchdog node \"%s\" returned status for %d backend nodes",backendStatus->nodeName,backendStatus->node_count)));
+				ereport(LOG,
+						(errmsg("primary node on master watchdog node \"%s\" is %d",backendStatus->nodeName,backendStatus->primary_node_id)));
+
+				Req_info->primary_node_id = backendStatus->primary_node_id;
+
+				for (i = 0; i < backendStatus->node_count; i++)
+				{
+					if (backendStatus->backend_status[i] == CON_DOWN)
+					{
+						if (BACKEND_INFO(i).backend_status != CON_DOWN)
+						{
+
+							BACKEND_INFO(i).backend_status = CON_DOWN;
+							my_backend_status[i] = &(BACKEND_INFO(i).backend_status);
+							reload_maste_node_id = true;
+							ereport(LOG,
+									(errmsg("backend status from \"%s\" backend:%d is set to down status",backendStatus->nodeName, i)));
+						}
+					}
+					else if (backendStatus->backend_status[i] == CON_CONNECT_WAIT ||
+								backendStatus->backend_status[i] == CON_UP)
+					{
+						if (BACKEND_INFO(i).backend_status != CON_CONNECT_WAIT)
+						{
+							BACKEND_INFO(i).backend_status = CON_CONNECT_WAIT;
+							my_backend_status[i] = &(BACKEND_INFO(i).backend_status);
+							reload_maste_node_id = true;
+						}
+					}
+				}
+
+				if (reload_maste_node_id)
+				{
+					Req_info->master_node_id = get_next_master_node();
+				}
+			}
+			pfree(backendStatus);
+		}
+		else
+		{
+			ereport(WARNING,
+				(errmsg("failed to get the backend status from the master watchdog node"),
+					 errdetail("using the local backend node status")));
+		}
+	}
+}
diff --git a/src/watchdog/watchdog.c b/src/watchdog/watchdog.c
index fab8630..14f61d7 100644
--- a/src/watchdog/watchdog.c
+++ b/src/watchdog/watchdog.c
@@ -127,6 +127,8 @@ packet_types all_packet_types[] = {
 	{WD_INFORM_I_AM_GOING_DOWN, "INFORM I AM GOING DOWN"},
 	{WD_ASK_FOR_POOL_CONFIG, "ASK FOR POOL CONFIG"},
 	{WD_POOL_CONFIG_DATA, "CONFIG DATA"},
+	{WD_GET_MASTER_DATA_REQUEST, "DATA REQUEST"},
+
 	{WD_NO_MESSAGE,""}
 };
 
@@ -391,6 +393,7 @@ static IPC_CMD_PREOCESS_RES process_IPC_nodeStatusChange_command(WDIPCCommandDat
 static IPC_CMD_PREOCESS_RES process_IPC_nodeList_command(WDIPCCommandData* IPCCommand);
 static IPC_CMD_PREOCESS_RES process_IPC_replicate_variable(WDIPCCommandData* IPCCommand);
 static IPC_CMD_PREOCESS_RES process_IPC_failover_cmd_synchronise(WDIPCCommandData *IPCCommand);
+static IPC_CMD_PREOCESS_RES process_IPC_data_request_from_master(WDIPCCommandData *IPCCommand);
 static IPC_CMD_PREOCESS_RES execute_replicate_command(WDIPCCommandData* ipcCommand);
 static bool write_ipc_command_with_result_data(WDIPCCommandData* IPCCommand, char type, char* data, int len);
 
@@ -424,6 +427,7 @@ static bool check_and_report_IPC_authentication(WDIPCCommandData* ipcCommand);
 static void print_received_packet_info(WDPacketData* pkt,WatchdogNode* wdNode);
 static void update_interface_status(void);
 static bool any_interface_available(void);
+static WDPacketData* process_data_request(WatchdogNode* wdNode, WDPacketData* pkt);
 
 /* global variables */
 wd_cluster g_cluster;
@@ -688,7 +692,7 @@ wd_create_recv_socket(int port)
 		}
 		close(sock);
 		ereport(ERROR,
-				(errmsg("failed to create watchdog receive socket"),
+			(errmsg("failed to create watchdog receive socket"),
 				 errdetail("bind on \"%s:%s\" failed with reason: \"%s\"", host, serv, strerror(errno))));
 	}
 	
@@ -1567,6 +1571,7 @@ static IPC_CMD_PREOCESS_RES process_IPC_command(WDIPCCommandData* ipcCommand)
 
 	switch(ipcCommand->type)
 	{
+
 		case WD_NODE_STATUS_CHANGE_COMMAND:
 			return process_IPC_nodeStatusChange_command(ipcCommand);
 			break;
@@ -1589,6 +1594,9 @@ static IPC_CMD_PREOCESS_RES process_IPC_command(WDIPCCommandData* ipcCommand)
 		case WD_FAILOVER_CMD_SYNC_REQUEST:
 			return process_IPC_failover_cmd_synchronise(ipcCommand);
 
+		case WD_GET_MASTER_DATA_REQUEST:
+			return process_IPC_data_request_from_master(ipcCommand);
+
 		default:
 		{
 			char* error_json;
@@ -1775,6 +1783,69 @@ static IPC_CMD_PREOCESS_RES process_IPC_replicate_variable(WDIPCCommandData* IPC
 	return IPC_CMD_ERROR;
 }
 
+static IPC_CMD_PREOCESS_RES process_IPC_data_request_from_master(WDIPCCommandData *IPCCommand)
+{
+	char res_type = WD_IPC_CMD_RESULT_BAD;
+	/*
+	 * If the cluster or this node is not in a stable state,
+	 * just reply that the cluster is in transition
+	 */
+	ereport(LOG,
+			(errmsg("processing master node data request from IPC socket")));
+
+	IPCCommand->type = WD_GET_MASTER_DATA_REQUEST;
+	if (get_local_node_state() == WD_STANDBY)
+	{
+		/* I am a standby node; just forward the request to the coordinator */
+
+		WDPacketData wdPacket;
+		init_wd_packet(&wdPacket);
+		set_message_type(&wdPacket, WD_GET_MASTER_DATA_REQUEST);
+		set_next_commandID_in_message(&wdPacket);
+		set_message_data(&wdPacket, IPCCommand->data_buf , IPCCommand->data_len);
+		/* save the command ID */
+		IPCCommand->internal_command_id = wdPacket.command_id;
+		if (send_message(g_cluster.masterNode, &wdPacket) <= 0)
+		{
+			ereport(LOG,
+				(errmsg("failed to process master node data request from IPC socket"),
+					 errdetail("failed to forward the request to master watchdog node \"%s\"",g_cluster.masterNode->nodeName)));
+			/* we failed to forward the request to the master node, return an error */
+			res_type = WD_IPC_CMD_RESULT_BAD;
+		}
+		else
+		{
+			/*
+			 * we need to wait for the result
+			 */
+			ereport(LOG,
+				(errmsg("data request from IPC socket is forwarded to master watchdog node \"%s\"",g_cluster.masterNode->nodeName),
+					 errdetail("waiting for the reply from master node...")));
+
+			return IPC_CMD_PROCESSING;
+		}
+	}
+	else if (get_local_node_state() == WD_COORDINATOR)
+	{
+		/* This node is itself the master node, so send an empty result with the OK tag */
+		res_type = WD_IPC_CMD_RESULT_OK;
+	}
+	else /* we are not in any stable state at the moment */
+	{
+		res_type = WD_IPC_CMD_CLUSTER_IN_TRAN;
+	}
+
+	if (write_ipc_command_with_result_data(IPCCommand, res_type, NULL, 0))
+	{
+		/*
+		 * This is the complete lifecycle of the command;
+		 * we are done with it
+		 */
+		return IPC_CMD_COMPLETE;
+	}
+	return IPC_CMD_ERROR;
+
+}
 
 static IPC_CMD_PREOCESS_RES process_IPC_failover_cmd_synchronise(WDIPCCommandData *IPCCommand)
 {
@@ -2792,24 +2863,79 @@ static void cleanUpIPCCommand(WDIPCCommandData* ipcCommand)
 	MemoryContextDelete(ipcCommand->memoryContext);
 }
 
+static WDPacketData* process_data_request(WatchdogNode* wdNode, WDPacketData* pkt)
+{
+	char* request_type;
+	char* data = NULL;
+	WDPacketData* replyPkt = NULL;
+
+	if (pkt->data == NULL || pkt->len <= 0)
+	{
+		ereport(WARNING,
+			(errmsg("invalid data request packet from watchdog node \"%s\"",wdNode->nodeName),
+				 errdetail("no data found in the packet")));
+
+		replyPkt = get_minimum_message(WD_ERROR_MESSAGE,pkt);
+		return replyPkt;
+	}
+
+	if (!parse_data_request_json(pkt->data,pkt->len, &request_type))
+	{
+		ereport(WARNING,
+			(errmsg("invalid data request packet from watchdog node \"%s\"",wdNode->nodeName),
+				 errdetail("unable to parse the json data in the packet")));
+
+		replyPkt = get_minimum_message(WD_ERROR_MESSAGE,pkt);
+		return replyPkt;
+	}
+
+	if (strcasecmp(request_type, WD_DATE_REQ_PG_BACKEND_DATA) == 0)
+	{
+		data = get_backend_node_status_json(g_cluster.localNode);
+	}
+
+	if (data)
+	{
+		replyPkt = get_empty_packet();
+		set_message_type(replyPkt, WD_DATA_MESSAGE);
+		set_message_commandID(replyPkt, pkt->command_id);
+		set_message_data(replyPkt, data , strlen(data));
+	}
+	else
+	{
+		replyPkt = get_minimum_message(WD_ERROR_MESSAGE,pkt);
+	}
+
+	return replyPkt;
+}
+
 static int standard_packet_processor(WatchdogNode* wdNode, WDPacketData* pkt)
 {
 	WDPacketData* replyPkt = NULL;
 	switch (pkt->type)
 	{
+		case WD_GET_MASTER_DATA_REQUEST:
+		{
+			replyPkt = process_data_request(wdNode, pkt);
+		}
+			break;
+
 		case WD_ASK_FOR_POOL_CONFIG:
 		{
 			char* config_data = get_pool_config_json();
 			
-			if (config_data == NULL)
-				reply_with_minimal_message(wdNode, WD_ERROR_MESSAGE, pkt);
-			else
+			if (config_data)
 			{
 				replyPkt = get_empty_packet();
 				set_message_type(replyPkt, WD_POOL_CONFIG_DATA);
 				set_message_commandID(replyPkt, pkt->command_id);
 				set_message_data(replyPkt, config_data , strlen(config_data));
 			}
+			else
+			{
+				replyPkt = get_minimum_message(WD_ERROR_MESSAGE,pkt);
+
+			}
 		}
 			break;
 			
@@ -2900,9 +3026,13 @@ static int standard_packet_processor(WatchdogNode* wdNode, WDPacketData* pkt)
 			 * otherwise reject
 			 */
 			if (g_cluster.localNode == g_cluster.masterNode)
-				reply_with_minimal_message(wdNode, WD_ACCEPT_MESSAGE, pkt);
+			{
+				replyPkt = get_minimum_message(WD_ACCEPT_MESSAGE,pkt);
+			}
 			else
-				reply_with_minimal_message(wdNode, WD_REJECT_MESSAGE, pkt);
+			{
+				replyPkt = get_minimum_message(WD_REJECT_MESSAGE,pkt);
+			}
 		}
 			break;
 			
@@ -2916,10 +3046,12 @@ static int standard_packet_processor(WatchdogNode* wdNode, WDPacketData* pkt)
 			{
 				ereport(NOTICE,
 						(errmsg("cluster is in split brain")));
-				reply_with_minimal_message(wdNode, WD_ERROR_MESSAGE, pkt);
+				replyPkt = get_minimum_message(WD_ERROR_MESSAGE,pkt);
 			}
 			else
+			{
 				replyPkt = get_mynode_info_message(pkt);
+			}
 		}
 			break;
 
@@ -2931,8 +3063,7 @@ static int standard_packet_processor(WatchdogNode* wdNode, WDPacketData* pkt)
 		if (send_message_to_node(wdNode,replyPkt) == false)
 			ereport(LOG,
 				(errmsg("sending packet to node \"%s\" failed", wdNode->nodeName)));
-
-		pfree(replyPkt);
+		free_packet(replyPkt);
 	}
 	return 1;
 }
@@ -3583,14 +3714,19 @@ static int watchdog_state_machine(WD_EVENTS event, WatchdogNode* wdNode, WDPacke
 		gettimeofday(&wdNode->last_rcv_time, NULL);
 
 		if (pkt->type == WD_INFO_MESSAGE)
+		{
 			standard_packet_processor(wdNode, pkt);
+		}
+
 		if (pkt->type == WD_INFORM_I_AM_GOING_DOWN)		/* TODO do it better way */
 		{
 			wdNode->state = WD_SHUTDOWN;
 			return watchdog_state_machine(WD_EVENT_REMOTE_NODE_LOST, wdNode, NULL);
 		}
 		if (watchdog_internal_command_packet_processor(wdNode,pkt) == true)
+		{
 			return 0;
+		}
 	}
 	else if (event == WD_EVENT_NEW_OUTBOUND_CONNECTION)
 	{
@@ -5422,6 +5558,7 @@ static bool check_and_report_IPC_authentication(WDIPCCommandData* ipcCommand)
 
 		case WD_FUNCTION_COMMAND:
 		case WD_FAILOVER_CMD_SYNC_REQUEST:
+		case WD_GET_MASTER_DATA_REQUEST:
 			/* only allowed internaly.*/
 			internal_client_only = true;
 			break;
diff --git a/src/watchdog/wd_commands.c b/src/watchdog/wd_commands.c
index f017aff..c1ec5a6 100644
--- a/src/watchdog/wd_commands.c
+++ b/src/watchdog/wd_commands.c
@@ -59,6 +59,7 @@
 #define PROMOTE_REQUEST_NODE_MASK		0x04
 
 static void sleep_in_waiting(void);
+static void FreeCmdResult(WDIPCCmdResult* res);
 static WDFailoverCMDResults wd_issue_failover_lock_command(WDFailoverCMDTypes cmdType, char* syncReqType);
 
 
@@ -263,6 +264,70 @@ issue_command_to_watchdog(char type, int timeout_sec, char* data, int data_len,
 	return result;
 }
 
+/*
+ * Get the PG backend status of all attached nodes from
+ * the master watchdog node.
+ */
+WDPGBackendStatus* get_pg_backend_status_from_master_wd_node(void)
+{
+	unsigned int *shared_key = get_ipc_shared_key();
+	char *data = get_data_request_json(WD_DATE_REQ_PG_BACKEND_DATA,
+									   shared_key?*shared_key:0,pool_config->wd_authkey);
+
+	WDIPCCmdResult *result = issue_command_to_watchdog(WD_GET_MASTER_DATA_REQUEST,
+													   WD_DEFAULT_IPC_COMMAND_TIMEOUT,
+													   data, strlen(data), true);
+	pfree(data);
+
+	if (result == NULL)
+	{
+		ereport(WARNING,
+			(errmsg("get backend node status from master watchdog failed"),
+				 errdetail("issue command to watchdog returned NULL")));
+		return NULL;
+	}
+	if (result->type == WD_IPC_CMD_CLUSTER_IN_TRAN)
+	{
+		ereport(WARNING,
+			(errmsg("get backend node status from master watchdog failed"),
+				 errdetail("watchdog cluster is not in stable state"),
+					errhint("try again when the cluster is fully initialized")));
+		FreeCmdResult(result);
+		return NULL;
+	}
+	else if (result->type == WD_IPC_CMD_TIMEOUT)
+	{
+		ereport(WARNING,
+				(errmsg("get backend node status from master watchdog failed"),
+				 errdetail("ipc command timeout")));
+		FreeCmdResult(result);
+		return NULL;
+	}
+	else if (result->type == WD_IPC_CMD_RESULT_OK)
+	{
+		WDPGBackendStatus* backendStatus = NULL;
+		/*
+		 * The watchdog returns zero-length data when this node itself is the
+		 * master watchdog node
+		 */
+		if (result->length <= 0)
+		{
+			backendStatus = palloc0(sizeof(WDPGBackendStatus));
+			backendStatus->node_count = -1;
+		}
+		else
+		{
+			backendStatus =  get_pg_backend_node_status_from_json(result->data, result->length);
+		}
+		FreeCmdResult(result);
+		return backendStatus;
+	}
+
+	ereport(WARNING,
+		(errmsg("get backend node status from master watchdog failed")));
+	FreeCmdResult(result);
+	return NULL;
+}
 
 WdCommandResult
 wd_start_recovery(void)
@@ -287,7 +352,7 @@ wd_start_recovery(void)
 	}
 
 	type = result->type;
-	pfree(result);
+	FreeCmdResult(result);
 	if (type == WD_IPC_CMD_CLUSTER_IN_TRAN)
 	{
 		ereport(WARNING,
@@ -334,7 +399,8 @@ wd_end_recovery(void)
 	}
 	
 	type = result->type;
-	pfree(result);
+	FreeCmdResult(result);
+
 	if (type == WD_IPC_CMD_CLUSTER_IN_TRAN)
 	{
 		ereport(WARNING,
@@ -387,7 +453,7 @@ wd_send_failback_request(int node_id)
 	}
 	
 	type = result->type;
-	pfree(result);
+	FreeCmdResult(result);
 	if (type == WD_IPC_CMD_CLUSTER_IN_TRAN)
 	{
 		ereport(WARNING,
@@ -457,6 +523,7 @@ wd_send_failover_sync_command(WDFailoverCMDTypes cmdType, char* syncReqType)
 		ereport(WARNING,
 			(errmsg("watchdog failed to send failover command"),
 				 errdetail("ipc command timeout")));
+		FreeCmdResult(result);
 		return FAILOVER_RES_ERROR;
 	}
 	if (result->length <= 0)
@@ -464,6 +531,7 @@ wd_send_failover_sync_command(WDFailoverCMDTypes cmdType, char* syncReqType)
 		ereport(WARNING,
 			(errmsg("watchdog failed to send failover command"),
 				 errdetail("issue command to watchdog returned no data")));
+		FreeCmdResult(result);
 		return FAILOVER_RES_ERROR;
 	}
 
@@ -473,21 +541,24 @@ wd_send_failover_sync_command(WDFailoverCMDTypes cmdType, char* syncReqType)
 	{
 		ereport(NOTICE,
 				(errmsg("unable to parse json data from replicate command")));
+		FreeCmdResult(result);
 		return FAILOVER_RES_ERROR;
 	}
 	
 	if (json_get_int_value_for_key(root, "FailoverCMDType", &failoverResCmdType))
 	{
 		json_value_free(root);
+		FreeCmdResult(result);
 		return FAILOVER_RES_ERROR;
 	}
 	if (root && json_get_int_value_for_key(root, "InterlockingResult", &interlockingResult))
 	{
 		json_value_free(root);
+		FreeCmdResult(result);
 		return FAILOVER_RES_ERROR;
 	}
 	json_value_free(root);
-	pfree(result);
+	FreeCmdResult(result);
 	
 	if (failoverResCmdType != cmdType)
 		return FAILOVER_RES_ERROR;
@@ -526,7 +597,7 @@ wd_degenerate_backend_set(int *node_id_set, int count)
 	}
 	
 	type = result->type;
-	pfree(result);
+	FreeCmdResult(result);
 	if (type == WD_IPC_CMD_CLUSTER_IN_TRAN)
 	{
 		ereport(WARNING,
@@ -579,7 +650,7 @@ wd_promote_backend(int node_id)
 	}
 	
 	type = result->type;
-	pfree(result);
+	FreeCmdResult(result);
 	if (type == WD_IPC_CMD_CLUSTER_IN_TRAN)
 	{
 		ereport(WARNING,
@@ -644,7 +715,7 @@ char* wd_get_watchdog_nodes(int nodeID)
 			(errmsg("get watchdog nodes command failed"),
 				 errdetail("watchdog cluster is not in stable state"),
 					errhint("try again when the cluster is fully initialized")));
-		pfree(result);
+		FreeCmdResult(result);
 		return NULL;
 	}
 	else if (result->type == WD_IPC_CMD_TIMEOUT)
@@ -652,16 +723,17 @@ char* wd_get_watchdog_nodes(int nodeID)
 		ereport(WARNING,
 			(errmsg("get watchdog nodes command failed"),
 				 errdetail("ipc command timeout")));
-		pfree(result);
+		FreeCmdResult(result);
 		return NULL;
 	}
 	else if (result->type == WD_IPC_CMD_RESULT_OK)
 	{
 		char* data = result->data;
+		/* do not free result->data here; ownership of the data passes to the caller */
 		pfree(result);
 		return data;
 	}
-	pfree(result);
+	FreeCmdResult(result);
 	return NULL;
 }
 
@@ -761,7 +833,7 @@ static WDFailoverCMDResults wd_issue_failover_lock_command(WDFailoverCMDTypes cm
 	int x;
 	for (x=0; x < MAX_SEC_WAIT_FOR_CLUSTER_TRANSATION; x++)
 	{
-		res = wd_send_failover_sync_command(NODE_FAILBACK_CMD, syncReqType);
+		res = wd_send_failover_sync_command(cmdType, syncReqType);
 		if (res != FAILOVER_RES_TRANSITION)
 			break;
 		sleep(1);
@@ -847,3 +919,9 @@ wd_chk_node_mask_for_promote_req(int *node_id_set, int count)
 	return wd_chk_node_mask (PROMOTE_REQUEST_NODE_MASK, node_id_set, count);
 }
 
+static void FreeCmdResult(WDIPCCmdResult* res)
+{
+	if (res->data)
+		pfree(res->data);
+	pfree (res);
+}
diff --git a/src/watchdog/wd_json_data.c b/src/watchdog/wd_json_data.c
index a0710f8..36b563d 100644
--- a/src/watchdog/wd_json_data.c
+++ b/src/watchdog/wd_json_data.c
@@ -30,6 +30,9 @@
 #include "watchdog/watchdog.h"
 #include "watchdog/wd_json_data.h"
 #include "watchdog/wd_ipc_defines.h"
+#include "pool.h"
+
+#define WD_JSON_KEY_DATA_REQ_TYPE	"DataRequestType"
 
 
 POOL_CONFIG* get_pool_config_from_json(char* json_data, int data_len)
@@ -224,6 +227,133 @@ char* get_pool_config_json(void)
 	return json_str;
 }
 
+char* get_data_request_json(char* request_type, unsigned int sharedKey, char* authKey)
+{
+	char* json_str;
+
+	JsonNode* jNode = jw_create_with_object(true);
+
+	jw_put_int(jNode, WD_IPC_SHARED_KEY, sharedKey); /* put the shared key*/
+
+	if (authKey != NULL && strlen(authKey) > 0)
+		jw_put_string(jNode, WD_IPC_AUTH_KEY, authKey); /*  put the auth key*/
+
+	jw_put_string(jNode, WD_JSON_KEY_DATA_REQ_TYPE, request_type);
+	jw_finish_document(jNode);
+	json_str = pstrdup(jw_get_json_string(jNode));
+	jw_destroy(jNode);
+	return json_str;
+}
+
+bool parse_data_request_json(char* json_data, int data_len, char** request_type)
+{
+	json_value *root, *value;
+	char* ptr;
+
+	*request_type = NULL;
+
+	root = json_parse(json_data,data_len);
+
+	/* The root node must be object */
+	if (root == NULL || root->type != json_object)
+	{
+		json_value_free(root);
+		ereport(LOG,
+			(errmsg("watchdog is unable to parse data request json"),
+				 errdetail("invalid json data \"%s\"",json_data)));
+		return false;
+	}
+	ptr = json_get_string_value_for_key(root, WD_JSON_KEY_DATA_REQ_TYPE);
+	if (ptr == NULL)
+	{
+		json_value_free(root);
+		ereport(LOG,
+			(errmsg("watchdog is unable to parse data request json"),
+				 errdetail("request name node not found in json data \"%s\"",json_data)));
+		return false;
+	}
+	*request_type = pstrdup(ptr);
+	json_value_free(root);
+	return true;
+}
+
+
+/* The function reads the backend node status from shared memory
+ * and creates a json packet from it
+ */
+char* get_backend_node_status_json(WatchdogNode* wdNode)
+{
+	int i;
+	char* json_str;
+	JsonNode* jNode = jw_create_with_object(true);
+
+	jw_start_array(jNode, "BackendNodeStatusList");
+
+	for (i=0;i< pool_config->backend_desc->num_backends;i++)
+	{
+		BACKEND_STATUS backend_status = pool_config->backend_desc->backend_info[i].backend_status;
+		jw_put_int_value(jNode, backend_status);
+	}
+	jw_end_element(jNode);
+	/* put the primary node id */
+	jw_put_int(jNode, "PrimaryNodeId", Req_info->primary_node_id);
+	jw_put_string(jNode, "NodeName",wdNode->nodeName);
+
+	jw_finish_document(jNode);
+	json_str = pstrdup(jw_get_json_string(jNode));
+	jw_destroy(jNode);
+	return json_str;
+}
+
+WDPGBackendStatus* get_pg_backend_node_status_from_json(char* json_data, int data_len)
+{
+	json_value *root = NULL;
+	json_value *value = NULL;
+	char *ptr;
+	int i;
+	WDPGBackendStatus* backendStatus = NULL;
+
+	root = json_parse(json_data,data_len);
+	/* The root node must be object */
+	if (root == NULL || root->type != json_object)
+		return NULL;
+
+	/* backend status array */
+	value = json_get_value_for_key(root,"BackendNodeStatusList");
+	if (value == NULL || value->type != json_array)
+		return NULL;
+
+	if (value->u.array.length <=0 || value->u.array.length > MAX_NUM_BACKENDS )
+		return NULL;
+
+	backendStatus = palloc0(sizeof(WDPGBackendStatus));
+	backendStatus->node_count = value->u.array.length;
+
+	for (i = 0; i < backendStatus->node_count; i++)
+	{
+		backendStatus->backend_status[i] = value->u.array.values[i]->u.integer;
+	}
+
+	if (json_get_int_value_for_key(root, "PrimaryNodeId", &backendStatus->primary_node_id))
+	{
+		ereport(ERROR,
+			(errmsg("invalid json data"),
+				 errdetail("unable to find the PrimaryNodeId value")));
+	}
+
+	ptr = json_get_string_value_for_key(root, "NodeName");
+	if (ptr)
+	{
+		strncpy(backendStatus->nodeName, ptr, sizeof(backendStatus->nodeName) -1);
+	}
+	else
+	{
+		backendStatus->nodeName[0] = 0;
+	}
+
+	return backendStatus;
+}
+
 
 char* get_watchdog_node_info_json(WatchdogNode* wdNode, char* authkey)
 {
backend_wd_status.diff (27,366 bytes)
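
For readers skimming the diff: the fix adds a new watchdog message type, WD_GET_MASTER_DATA_REQUEST. At startup, initialize_backend_status_from_watchdog() asks the master watchdog node for its view of the PostgreSQL backends, and the master replies with a small JSON document built by get_backend_node_status_json() and parsed back by get_pg_backend_node_status_from_json(). The minimal standalone C sketch below only illustrates the shape of that reply; the key names are taken from the diff, while the status values and the node name are made-up examples, not output from a real cluster.

/*
 * Illustrative sketch only (not part of the patch): prints a JSON document
 * shaped like the reply to WD_GET_MASTER_DATA_REQUEST. The key names
 * ("BackendNodeStatusList", "PrimaryNodeId", "NodeName") match
 * get_backend_node_status_json() above; the values are hypothetical.
 */
#include <stdio.h>

int main(void)
{
	const char *example_reply =
		"{\n"
		"  \"BackendNodeStatusList\": [2, 3],\n"	/* one BACKEND_STATUS code per backend node */
		"  \"PrimaryNodeId\": 0,\n"			/* Req_info->primary_node_id on the master */
		"  \"NodeName\": \"Linux_pgpool_master_5433\"\n"	/* hypothetical watchdog node name */
		"}\n";

	fputs(example_reply, stdout);
	return 0;
}

On the requesting node, initialize_backend_status_from_watchdog() then marks every backend the master reports as CON_DOWN as down locally and recomputes the master backend node id, which is what prevents a restarted pgpool instance from treating a node as attached while the rest of the cluster already has it detached.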

stivi21

2016-07-27 23:01

reporter   ~0000946

Thank you for the answer. I think the watchdog should sync the status of the nodes. It's very important for me, because my configuration has no virtual IP and I use haproxy instead, so if a pgpool instance sees 2 masters (the new master, and the old one after it failed), that pgpool sends queries to both postgresql nodes.
I was thinking about writing a patch myself, but it may not be so easy for me. If I manage to do it, I will send a patch.

stivi21

2016-07-27 23:06

reporter   ~0000947

I'm sorry, I didn't see your patch. I will test it and report back.

stivi21

2016-07-28 18:21

reporter   ~0000950

The patch works as I expected :)
Thank you very much, it resolves my problems.

Muhammad Usama

2016-08-02 23:52

developer   ~0000958

Hi

Thanks for verifying the fix. I have committed it to the 3.5 and master branches:

http://git.postgresql.org/gitweb?p=pgpool2.git;a=commitdiff;h=72573ca6d45ca2fc985a06936c0c52808fe78ef1

Issue History

Date Modified Username Field Change
2016-07-15 19:41 stivi21 New Issue
2016-07-15 19:44 stivi21 Note Added: 0000919
2016-07-26 01:45 stivi21 Note Added: 0000928
2016-07-26 07:50 t-ishii Assigned To => Muhammad Usama
2016-07-26 07:50 t-ishii Status new => assigned
2016-07-26 20:08 stivi21 Note Added: 0000931
2016-07-27 05:20 Muhammad Usama Status assigned => feedback
2016-07-27 05:20 Muhammad Usama Note Added: 0000941
2016-07-27 05:21 Muhammad Usama File Added: backend_wd_status.diff
2016-07-27 23:01 stivi21 Note Added: 0000946
2016-07-27 23:01 stivi21 Status feedback => assigned
2016-07-27 23:06 stivi21 Note Added: 0000947
2016-07-28 18:21 stivi21 Note Added: 0000950
2016-08-02 23:52 Muhammad Usama Status assigned => resolved
2016-08-02 23:52 Muhammad Usama Resolution open => fixed
2016-08-02 23:52 Muhammad Usama Note Added: 0000958