[pgpool-general: 4783] Re: show pool_nodes returns different result when run on master and on slave

Krzysztof Mościcki stivi at kity.pl
Wed Jul 13 06:47:36 JST 2016


Hi,
I have the same problem. I tested it on Ubuntu 14.04 with pgpool 3.4.3
and then upgraded pgpool to the newest version, 3.5.3. I have 3 servers (in
AWS): host1 and host2 run both pgpool and Postgres instances, host3 runs
pgpool only. Postgres uses native streaming replication
(master_slave_mode = on, master_slave_sub_mode = 'stream').
After I restart host2, pgpool on host1 and host3 shows:

$ psql -h host1 -p 5433 -U admin repmgr -c "show pool_nodes;"
 node_id |    hostname    | port | status | lb_weight |  role   | select_cnt
---------+----------------+------+--------+-----------+---------+------------
 0       | host1          | 5432 | 2      | 0.500000  | primary | 8491
 1       | host2          | 5432 | 3      | 0.500000  | standby | 0

$ psql -h host3 -p 5433 -U admin repmgr -c "show pool_nodes;"
 node_id |    hostname    | port | status | lb_weight |  role   | select_cnt
---------+----------------+------+--------+-----------+---------+------------
 0       | host1          | 5432 | 2      | 0.500000  | primary | 4
 1       | host2          | 5432 | 3      | 0.500000  | standby | 0

but on host2 the status is:

$ psql -h host2 -p 5433 -U admin repmgr -c "show pool_nodes;"
 node_id |    hostname    | port | status | lb_weight |  role   | select_cnt
---------+----------------+------+--------+-----------+---------+------------
 0       | host1          | 5432 | 2      | 0.500000  | primary | 7
 1       | host2          | 5432 | 2      | 0.500000  | standby | 0
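
As I understand it, in SHOW pool_nodes a status of 2 means the backend is
attached and up, while 3 means pgpool has detached it, so host1 and host3
still consider node 1 detached after the restart. If that is expected, the
node could presumably be re-attached by hand, e.g. (a minimal sketch,
assuming the default pcp port 9898 and a pcp user called pcp_user, neither
of which is shown in my configuration):

$ pcp_attach_node -h host1 -p 9898 -U pcp_user -n 1
$ pcp_attach_node -h host3 -p 9898 -U pcp_user -n 1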

My configuration of watchdog (from host1):
use_watchdog = on
trusted_servers = ''
wd_hostname = 'host1'
wd_port = 9000
wd_priority = 3
wd_authkey = ''
wd_ipc_socket_dir = '/var/run/postgresql'
delegate_IP = ''
wd_monitoring_interfaces_list = 'eth0'
wd_lifecheck_method = 'heartbeat'
wd_interval = 10
wd_heartbeat_port = 9694
wd_heartbeat_keepalive = 2
wd_heartbeat_deadtime = 30
heartbeat_destination0 = 'host2'
heartbeat_destination_port0 = 9694
heartbeat_device0 = 'eth0'
heartbeat_destination1 = 'host3'
heartbeat_destination_port1 = 9694
heartbeat_device1 = 'eth0'

other_pgpool_hostname0 = 'host2'
other_pgpool_port0 = 5433
other_wd_port0 = 9000
other_pgpool_hostname1 = 'host3'
other_pgpool_port1 = 5433
other_wd_port1 = 9000
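
The watchdog membership can also be checked directly with pcp_watchdog_info;
a minimal sketch, again assuming the default pcp port 9898 and a pcp user
called pcp_user (neither appears in the config above):

$ pcp_watchdog_info -h host2 -p 9898 -U pcp_user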

All instances see each other, but host2 reports a different status for node 1.
Postgres replication works fine after the restart.
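
The replication state can be confirmed directly on the backends, bypassing
pgpool; a minimal check, assuming direct access with the postgres user on
port 5432:

$ psql -h host1 -p 5432 -U postgres -c "SELECT * FROM pg_stat_replication;"
$ psql -h host2 -p 5432 -U postgres -c "SELECT pg_is_in_recovery();"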

Logs from host3 when host2 comes back (10.0.112.190 = host2):
Jul 11 14:35:50 ip-10-0-2-132 pgpool[1057]: [56-1] 2016-07-11 14:35:50:
pid 1057: LOG:  new watchdog node connection is received from
"10.0.112.190:39849"
Jul 11 14:35:50 ip-10-0-2-132 pgpool: 2016-07-11 14:35:50: pid 1057:
LOG:  new watchdog node connection is received from "10.0.112.190:39849"
Jul 11 14:35:50 ip-10-0-2-132 pgpool[1057]: [57-1] 2016-07-11 14:35:50:
pid 1057: WARNING:  we have not received a beacon message from master
node "Linux_ip-10-0-1-129_5433"
Jul 11 14:35:50 ip-10-0-2-132 pgpool[1057]: [57-2] 2016-07-11 14:35:50:
pid 1057: DETAIL:  requesting info message from master node
Jul 11 14:35:50 ip-10-0-2-132 pgpool: 2016-07-11 14:35:50: pid 1057:
WARNING:  we have not received a beacon message from master node
"Linux_ip-10-0-1-129_5433"
Jul 11 14:35:50 ip-10-0-2-132 pgpool: 2016-07-11 14:35:50: pid 1057:
DETAIL:  requesting info message from master node
Jul 11 14:35:50 ip-10-0-2-132 pgpool[1057]: [58-1] 2016-07-11 14:35:50:
pid 1057: LOG:  new outbond connection to host2:9000
Jul 11 14:35:50 ip-10-0-2-132 pgpool: 2016-07-11 14:35:50: pid 1057:
LOG:  new outbond connection to host2:9000
Jul 11 14:36:00 ip-10-0-2-132 pgpool[1147]: [9-1] 2016-07-11 14:36:00:
pid 1147: LOG:  informing the node status change to watchdog
Jul 11 14:36:00 ip-10-0-2-132 pgpool[1147]: [9-2] 2016-07-11 14:36:00:
pid 1147: DETAIL:  node id :2 status = "NODE ALIVE" message:"Heartbeat
signal found"
Jul 11 14:36:00 ip-10-0-2-132 pgpool: 2016-07-11 14:36:00: pid 1147:
LOG:  informing the node status change to watchdog
Jul 11 14:36:00 ip-10-0-2-132 pgpool: 2016-07-11 14:36:00: pid 1147:
DETAIL:  node id :2 status = "NODE ALIVE" message:"Heartbeat signal found"
Jul 11 14:36:00 ip-10-0-2-132 pgpool[1057]: [59-1] 2016-07-11 14:36:00:
pid 1057: LOG:  new IPC connection received
Jul 11 14:36:00 ip-10-0-2-132 pgpool: 2016-07-11 14:36:00: pid 1057:
LOG:  new IPC connection received
Jul 11 14:36:00 ip-10-0-2-132 pgpool[1057]: [60-1] 2016-07-11 14:36:00:
pid 1057: LOG:  received node status change ipc message
Jul 11 14:36:00 ip-10-0-2-132 pgpool[1057]: [60-2] 2016-07-11 14:36:00:
pid 1057: DETAIL:  Heartbeat signal found
Jul 11 14:36:00 ip-10-0-2-132 pgpool: 2016-07-11 14:36:00: pid 1057:
LOG:  received node status change ipc message
Jul 11 14:36:00 ip-10-0-2-132 pgpool: 2016-07-11 14:36:00: pid 1057:
DETAIL:  Heartbeat signal found
Jul 11 14:36:00 ip-10-0-2-132 pgpool[1057]: [61-1] 2016-07-11 14:36:00:
pid 1057: WARNING:  we have not received a beacon message from master
node "Linux_ip-10-0-1-129_5433"
Jul 11 14:36:00 ip-10-0-2-132 pgpool[1057]: [61-2] 2016-07-11 14:36:00:
pid 1057: DETAIL:  requesting info message from master node
Jul 11 14:36:00 ip-10-0-2-132 pgpool: 2016-07-11 14:36:00: pid 1057:
WARNING:  we have not received a beacon message from master node
"Linux_ip-10-0-1-129_5433"
Jul 11 14:36:00 ip-10-0-2-132 pgpool: 2016-07-11 14:36:00: pid 1057:
DETAIL:  requesting info message from master node

Logs from host2:
Jul 11 14:35:50 ip-10-0-112-190 pgpool[1217]: [11-1] 2016-07-11
14:35:50: pid 1217: LOG:  watchdog node state changed from
[INITIALIZING] to [STANDBY]
Jul 11 14:35:50 ip-10-0-112-190 pgpool: 2016-07-11 14:35:50: pid 1217:
LOG:  watchdog node state changed from [INITIALIZING] to [STANDBY]
Jul 11 14:35:50 ip-10-0-112-190 pgpool[1217]: [12-1] 2016-07-11
14:35:50: pid 1217: LOG:  successfully joined the watchdog cluster as
standby node
Jul 11 14:35:50 ip-10-0-112-190 pgpool[1217]: [12-2] 2016-07-11
14:35:50: pid 1217: DETAIL:  our join coordinator request is accepted by
cluster leader node "Linux_ip-10-0-1-129_5433"
Jul 11 14:35:50 ip-10-0-112-190 pgpool: 2016-07-11 14:35:50: pid 1217:
LOG:  successfully joined the watchdog cluster as standby node
Jul 11 14:35:50 ip-10-0-112-190 pgpool: 2016-07-11 14:35:50: pid 1217:
DETAIL:  our join coordinator request is accepted by cluster leader node
"Linux_ip-10-0-1-129_5433"
Jul 11 14:35:50 ip-10-0-112-190 pgpool[1190]: [2-1] 2016-07-11 14:35:50:
pid 1190: LOG:  watchdog process is initialized
Jul 11 14:35:50 ip-10-0-112-190 pgpool: 2016-07-11 14:35:50: pid 1190:
LOG:  watchdog process is initialized
Jul 11 14:35:50 ip-10-0-112-190 pgpool[1217]: [13-1] 2016-07-11
14:35:50: pid 1217: LOG:  new IPC connection received
Jul 11 14:35:50 ip-10-0-112-190 pgpool: 2016-07-11 14:35:50: pid 1217:
LOG:  new IPC connection received
Jul 11 14:35:50 ip-10-0-112-190 pgpool[1190]: [3-1] 2016-07-11 14:35:50:
pid 1190: LOG:  Setting up socket for 0.0.0.0:5433
Jul 11 14:35:50 ip-10-0-112-190 pgpool: 2016-07-11 14:35:50: pid 1190:
LOG:  Setting up socket for 0.0.0.0:5433
Jul 11 14:35:50 ip-10-0-112-190 pgpool[1190]: [4-1] 2016-07-11 14:35:50:
pid 1190: LOG:  Setting up socket for :::5433
Jul 11 14:35:50 ip-10-0-112-190 pgpool: 2016-07-11 14:35:50: pid 1190:
LOG:  Setting up socket for :::5433
Jul 11 14:35:50 ip-10-0-112-190 pgpool[1353]: [3-1] 2016-07-11 14:35:50:
pid 1353: LOG:  3 watchdog nodes are configured for lifecheck
Jul 11 14:35:50 ip-10-0-112-190 pgpool: 2016-07-11 14:35:50: pid 1353:
LOG:  3 watchdog nodes are configured for lifecheck
Jul 11 14:35:50 ip-10-0-112-190 pgpool[1353]: [4-1] 2016-07-11 14:35:50:
pid 1353: LOG:  watchdog nodes ID:0 Name:"Linux_ip-10-0-112-190_5433"
Jul 11 14:35:50 ip-10-0-112-190 pgpool[1353]: [4-2] 2016-07-11 14:35:50:
pid 1353: DETAIL:  Host:"host2" WD Port:9000 pgpool-II port:5433
Jul 11 14:35:50 ip-10-0-112-190 pgpool: 2016-07-11 14:35:50: pid 1353:
LOG:  watchdog nodes ID:0 Name:"Linux_ip-10-0-112-190_5433"
Jul 11 14:35:50 ip-10-0-112-190 pgpool: 2016-07-11 14:35:50: pid 1353:
DETAIL:  Host:"host2" WD Port:9000 pgpool-II port:5433
Jul 11 14:35:50 ip-10-0-112-190 pgpool[1353]: [5-1] 2016-07-11 14:35:50:
pid 1353: LOG:  watchdog nodes ID:1 Name:"Linux_ip-10-0-1-129_5433"
Jul 11 14:35:50 ip-10-0-112-190 pgpool[1353]: [5-2] 2016-07-11 14:35:50:
pid 1353: DETAIL:  Host:"host1" WD Port:9000 pgpool-II port:5433
Jul 11 14:35:50 ip-10-0-112-190 pgpool: 2016-07-11 14:35:50: pid 1353:
LOG:  watchdog nodes ID:1 Name:"Linux_ip-10-0-1-129_5433"
Jul 11 14:35:50 ip-10-0-112-190 pgpool: 2016-07-11 14:35:50: pid 1353:
DETAIL:  Host:"host1" WD Port:9000 pgpool-II port:5433
Jul 11 14:35:50 ip-10-0-112-190 pgpool[1353]: [6-1] 2016-07-11 14:35:50:
pid 1353: LOG:  watchdog nodes ID:2 Name:"Linux_ip-10-0-2-132_5433"
Jul 11 14:35:50 ip-10-0-112-190 pgpool[1353]: [6-2] 2016-07-11 14:35:50:
pid 1353: DETAIL:  Host:"host3" WD Port:9000 pgpool-II port:5433
Jul 11 14:35:50 ip-10-0-112-190 pgpool: 2016-07-11 14:35:50: pid 1353:
LOG:  watchdog nodes ID:2 Name:"Linux_ip-10-0-2-132_5433"
Jul 11 14:35:50 ip-10-0-112-190 pgpool: 2016-07-11 14:35:50: pid 1353:
DETAIL:  Host:"host3" WD Port:9000 pgpool-II port:5433
Jul 11 14:35:50 ip-10-0-112-190 pgpool[1190]: [5-1] 2016-07-11 14:35:50:
pid 1190: LOG:  pgpool-II successfully started. version 3.5.3 (ekieboshi)
Jul 11 14:35:50 ip-10-0-112-190 pgpool: 2016-07-11 14:35:50: pid 1190:
LOG:  pgpool-II successfully started. version 3.5.3 (ekieboshi)
Jul 11 14:35:50 ip-10-0-112-190 pgpool[1190]: [6-1] 2016-07-11 14:35:50:
pid 1190: LOG:  find_primary_node: checking backend no 0
Jul 11 14:35:50 ip-10-0-112-190 pgpool: 2016-07-11 14:35:50: pid 1190:
LOG:  find_primary_node: checking backend no 0
Jul 11 14:35:50 ip-10-0-112-190 pgpool[1190]: [7-1] 2016-07-11 14:35:50:
pid 1190: LOG:  find_primary_node: primary node id is 0
Jul 11 14:35:50 ip-10-0-112-190 pgpool: 2016-07-11 14:35:50: pid 1190:
LOG:  find_primary_node: primary node id is 0




And another test. The configuration is the same, but this time host2 was the PostgreSQL master. Host2 was stopped, so PostgreSQL on host1 became the new master. When host2 was started again, the status on each pgpool node (SHOW pool_nodes) was:

Host1:
 node_id |    hostname    | port | status | lb_weight |  role   | select_cnt 
---------+----------------+------+--------+-----------+---------+------------
 0       | host1          | 5432 | 2      | 0.500000  | primary | 2798
 1       | host2          | 5432 | 3      | 0.500000  | standby | 878

Host2:
 node_id |    hostname    | port | status | lb_weight |  role   | select_cnt 
---------+----------------+------+--------+-----------+---------+------------
 0       | host1          | 5432 | 2      | 0.500000  | primary | 2
 1       | host2          | 5432 | 2      | 0.500000  | standby | 0

Host3:
 node_id |    hostname    | port | status | lb_weight |  role   | select_cnt 
---------+----------------+------+--------+-----------+---------+------------
 0       | host1          | 5432 | 2      | 0.500000  | primary | 5
 1       | host2          | 5432 | 3      | 0.500000  | standby | 7



I didn't create the pgpool_adm extension. Do I need it for this to work properly?



Best regards,
Kris

> Usama,
>
> This might be a watchdog issue since this configuration uses watchdog.
> Can you please take a look at this?
>
> Best regards,
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese:http://www.sraoss.co.jp
>
>> Hi Experts,
>>
>> I have one setup with one master and one slave.  For some reason the command psql -h 172.18.255.42 -U postgres   -p 9999 -t -c "show pool_nodes;" returns different results when it is run on the master or on the slave.
>> The 172.18.255.41 server shows status 2 when the query is run locally, but when it is run on its partner it shows status 3.
>> What might have caused it?  How can I know which value is correct?
>>
>>
>> psql -h 172.18.255.42 -U postgres   -p 9999 -t -c "show pool_nodes;"
>>
>> 0       | 172.18.255.41 | 5432 | 3      | 0.500000  | standby | 0
>> 1       | 172.18.255.42 | 5432 | 2      | 0.500000  | primary | 0
>>
>>
>> psql -h 172.18.255.41 -U postgres   -p 9999 -t -c "show pool_nodes;"
>> 0       | 172.18.255.41 | 5432 | 2      | 0.500000  | standby | 0
>> 1       | 172.18.255.42 | 5432 | 2      | 0.500000  | primary | 0
>>
>> Thanks!
>> Avi
> _______________________________________________
> pgpool-general mailing list
> pgpool-general at pgpool.net
> http://www.pgpool.net/mailman/listinfo/pgpool-general


