[pgpool-general: 8315] Degenerate cluster

Jon SCHEWE jon.schewe at raytheon.com
Fri Jul 8 03:34:19 JST 2022


Is it possible to keep pgpool running with a single node in the cluster?
I have 3 nodes in my cluster and I want to upgrade PostgreSQL on them with minimal downtime.
I figured I would shut down PostgreSQL and pgpool on 2 of them and leave the 3rd one up until the others are upgraded and ready.
However, when I stopped pgpool on 2 nodes, the 3rd node (psql-dev-03) gave up the virtual IP and I lost the database connection.
Is this expected?

pgpool logs from psql-dev-03 are below.

2022-07-07 13:10:08.210: watchdog pid 338248: LOG:  remote node "psql-dev-01:9898 Linux psql-dev-01" is shutting down
2022-07-07 13:10:08.211: watchdog pid 338248: LOG:  watchdog cluster has lost the coordinator node
2022-07-07 13:10:08.211: watchdog pid 338248: LOG:  removing the remote node "psql-dev-01:9898 Linux psql-dev-01" from watchdog cluster leader
2022-07-07 13:10:08.211: watchdog pid 338248: LOG:  We have lost the cluster leader node "psql-dev-01:9898 Linux psql-dev-01"
2022-07-07 13:10:08.211: watchdog pid 338248: LOG:  watchdog node state changed from [STANDBY] to [JOINING]
2022-07-07 13:10:08.211: watchdog pid 338248: LOG:  watchdog node state changed from [JOINING] to [INITIALIZING]
2022-07-07 13:10:09.213: watchdog pid 338248: LOG:  watchdog node state changed from [INITIALIZING] to [STANDING FOR LEADER]
2022-07-07 13:10:09.213: watchdog pid 338248: LOG:  watchdog node state changed from [STANDING FOR LEADER] to [LEADER]
2022-07-07 13:10:09.213: watchdog pid 338248: LOG:  I am announcing my self as leader/coordinator watchdog node
2022-07-07 13:10:09.214: watchdog pid 338248: LOG:  I am the cluster leader node
2022-07-07 13:10:09.214: watchdog pid 338248: DETAIL:  our declare coordinator message is accepted by all nodes
2022-07-07 13:10:09.214: watchdog pid 338248: LOG:  setting the local node "psql-dev-03:9898 Linux psql-dev-03" as watchdog cluster leader
2022-07-07 13:10:09.214: watchdog pid 338248: LOG:  signal_user1_to_parent_with_reason(1)
2022-07-07 13:10:09.214: watchdog pid 338248: LOG:  I am the cluster leader node but we do not have enough nodes in cluster
2022-07-07 13:10:09.214: watchdog pid 338248: DETAIL:  waiting for the quorum to start escalation process
2022-07-07 13:10:09.214: main pid 338225: LOG:  Pgpool-II parent process received SIGUSR1
2022-07-07 13:10:09.214: main pid 338225: LOG:  Pgpool-II parent process received watchdog state change signal from watchdog
2022-07-07 13:10:09.214: watchdog pid 338248: LOG:  new IPC connection received
2022-07-07 13:10:10.216: watchdog pid 338248: LOG:  adding watchdog node "psql-dev-02:9898 Linux psql-dev-02" to the standby list
2022-07-07 13:10:10.216: watchdog pid 338248: LOG:  quorum found
2022-07-07 13:10:10.216: watchdog pid 338248: DETAIL:  starting escalation process
2022-07-07 13:10:10.217: watchdog pid 338248: LOG:  escalation process started with PID:937636
2022-07-07 13:10:10.217: watchdog pid 338248: LOG:  signal_user1_to_parent_with_reason(3)
2022-07-07 13:10:10.217: main pid 338225: LOG:  Pgpool-II parent process received SIGUSR1
2022-07-07 13:10:10.217: watchdog_utility pid 937636: LOG:  watchdog: escalation started
2022-07-07 13:10:10.217: main pid 338225: LOG:  Pgpool-II parent process received watchdog quorum change signal from watchdog
2022-07-07 13:10:10.218: watchdog pid 338248: LOG:  new IPC connection received
2022-07-07 13:10:10.218: main pid 338225: LOG:  watchdog cluster now holds the quorum
2022-07-07 13:10:10.218: main pid 338225: DETAIL:  updating the state of quarantine backend nodes
2022-07-07 13:10:10.218: watchdog pid 338248: LOG:  new IPC connection received
+ PGPOOLS=(psql-dev-01 psql-dev-02 psql-dev-03)
+ VIP=XXX.XXX.XXX.60
+ DEVICE=ens192
+ for pgpool in "${PGPOOLS[@]}"
+ '[' psql-dev-03 = psql-dev-01 ']'
+ ssh -T -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null postgres@psql-dev-01 -i /var/lib/pgsql/.ssh/id_rsa_pgpool '
        /usr/bin/sudo /sbin/ip addr del XXX.XXX.XXX.60/24 dev ens192
    '
Warning: Permanently added 'psql-dev-01,XXX.XXX.XXX.61' (ECDSA) to the list of known hosts.
RTNETLINK answers: Cannot assign requested address
+ for pgpool in "${PGPOOLS[@]}"
+ '[' psql-dev-03 = psql-dev-02 ']'
+ ssh -T -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null postgres@psql-dev-02 -i /var/lib/pgsql/.ssh/id_rsa_pgpool '
        /usr/bin/sudo /sbin/ip addr del XXX.XXX.XXX.60/24 dev ens192
    '
Warning: Permanently added 'psql-dev-02,XXX.XXX.XXX.62' (ECDSA) to the list of known hosts.
RTNETLINK answers: Cannot assign requested address
+ for pgpool in "${PGPOOLS[@]}"
+ '[' psql-dev-03 = psql-dev-03 ']'
+ ssh -T -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null postgres@psql-dev-03 -i /var/lib/pgsql/.ssh/id_rsa_pgpool '
        /usr/bin/sudo /sbin/ip addr del XXX.XXX.XXX.60/24 dev ens192
    '
Warning: Permanently added 'psql-dev-03,XXX.XXX.XXX.63' (ECDSA) to the list of known hosts.
RTNETLINK answers: Cannot assign requested address
+ exit 0
2022-07-07 13:10:11.325: watchdog_utility pid 937636: LOG:  watchdog escalation successful
2022-07-07 13:10:13.961: watchdog pid 338248: LOG:  new IPC connection received
2022-07-07 13:10:15.636: watchdog_utility pid 937636: LOG:  successfully acquired the delegate IP:"XXX.XXX.XXX.60"
2022-07-07 13:10:15.636: watchdog_utility pid 937636: DETAIL:  'if_up_cmd' returned with success
2022-07-07 13:10:15.638: watchdog pid 338248: LOG:  watchdog escalation process with pid: 937636 exit with SUCCESS.
2022-07-07 13:10:23.989: watchdog pid 338248: LOG:  new IPC connection received
2022-07-07 13:10:31.733: psql pid 937339: LOG:  pool_reuse_block: blockid: 0
2022-07-07 13:10:31.733: psql pid 937339: CONTEXT:  while searching system catalog, When relcache is missed
2022-07-07 13:10:34.017: watchdog pid 338248: LOG:  new IPC connection received
2022-07-07 13:10:40.399: life_check pid 338253: LOG:  informing the node status change to watchdog
2022-07-07 13:10:40.399: life_check pid 338253: DETAIL:  node id :0 status = "NODE DEAD" message:"No heartbeat signal from node"
2022-07-07 13:10:40.399: watchdog pid 338248: LOG:  new IPC connection received
2022-07-07 13:10:40.399: watchdog pid 338248: LOG:  received node status change ipc message
2022-07-07 13:10:40.399: watchdog pid 338248: DETAIL:  No heartbeat signal from node
2022-07-07 13:10:40.399: watchdog pid 338248: LOG:  remote node "psql-dev-01:9898 Linux psql-dev-01" is shutting down
2022-07-07 13:10:44.045: watchdog pid 338248: LOG:  new IPC connection received
2022-07-07 13:10:54.073: watchdog pid 338248: LOG:  new IPC connection received
2022-07-07 13:11:04.102: watchdog pid 338248: LOG:  new IPC connection received
2022-07-07 13:11:07.206: watchdog pid 338248: LOG:  remote node "psql-dev-02:9898 Linux psql-dev-02" is shutting down
2022-07-07 13:11:07.206: watchdog pid 338248: LOG:  removing watchdog node "psql-dev-02:9898 Linux psql-dev-02" from the standby list
2022-07-07 13:11:07.206: watchdog pid 338248: LOG:  We have lost the quorum
2022-07-07 13:11:07.207: watchdog pid 338248: LOG:  signal_user1_to_parent_with_reason(3)
2022-07-07 13:11:07.207: main pid 338225: LOG:  Pgpool-II parent process received SIGUSR1
2022-07-07 13:11:07.207: main pid 338225: LOG:  Pgpool-II parent process received watchdog quorum change signal from watchdog
2022-07-07 13:11:07.207: watchdog_utility pid 937919: LOG:  watchdog: de-escalation started
2022-07-07 13:11:07.207: watchdog pid 338248: LOG:  new IPC connection received
2022-07-07 13:11:07.408: watchdog_utility pid 937919: LOG:  successfully released the delegate IP:"XXX.XXX.XXX.60"
2022-07-07 13:11:07.408: watchdog_utility pid 937919: DETAIL:  'if_down_cmd' returned with success
2022-07-07 13:11:07.411: watchdog pid 338248: LOG:  watchdog de-escalation process with pid: 937919 exit with SUCCESS.
2022-07-07 13:11:14.130: watchdog pid 338248: LOG:  new IPC connection received
2022-07-07 13:11:24.158: watchdog pid 338248: LOG:  new IPC connection received
2022-07-07 13:11:34.186: watchdog pid 338248: LOG:  new IPC connection received
2022-07-07 13:11:40.400: life_check pid 338253: LOG:  informing the node status change to watchdog
2022-07-07 13:11:40.400: life_check pid 338253: DETAIL:  node id :1 status = "NODE DEAD" message:"No heartbeat signal from node"
2022-07-07 13:11:40.400: watchdog pid 338248: LOG:  new IPC connection received
2022-07-07 13:11:40.400: watchdog pid 338248: LOG:  received node status change ipc message
2022-07-07 13:11:40.400: watchdog pid 338248: DETAIL:  No heartbeat signal from node
2022-07-07 13:11:40.400: watchdog pid 338248: LOG:  remote node "psql-dev-02:9898 Linux psql-dev-02" is shutting down
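
Reading the log above, the delegate IP seems to follow the watchdog quorum: it was acquired right after "quorum found" (psql-dev-02 joined psql-dev-03 as standby, 2 of 3 nodes) and released right after "We have lost the quorum" (only psql-dev-03 left, 1 of 3 nodes). Below is a minimal Python sketch of that arithmetic, assuming a strict-majority rule. The function name has_quorum and the timeline list are hypothetical illustrations built from the timestamps in this log, not part of Pgpool-II.

```python
def has_quorum(alive_nodes: int, total_nodes: int) -> bool:
    """Assumed rule: strictly more than half of the configured nodes are alive."""
    return alive_nodes * 2 > total_nodes


TOTAL_NODES = 3  # psql-dev-01, psql-dev-02, psql-dev-03

# (timestamp from the log, watchdog nodes alive as seen by psql-dev-03)
timeline = [
    ("13:10:09.214", 1),  # psql-dev-03 becomes leader; psql-dev-02 not yet joined
    ("13:10:10.216", 2),  # psql-dev-02 added to the standby list
    ("13:11:07.206", 1),  # psql-dev-02 shuts down
]

for timestamp, alive in timeline:
    state = "quorum" if has_quorum(alive, TOTAL_NODES) else "no quorum"
    print(f"{timestamp}: {alive}/{TOTAL_NODES} alive -> {state}")
```

Under this assumption a single surviving node out of three can never hold the quorum, which matches the de-escalation and VIP release at 13:11:07.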


Jon

