[pgpool-general: 7681] Re: Failover question

Wolf Schwurack wolf at uen.org
Thu Sep 2 01:54:25 JST 2021


Hello

This shows pcp_watchdog_info after node1 was added back in:
postgres@pgtest-02:~$ pcp_watchdog_info -h localhost -U wolf
Password: 
3 YES pgtest-02:9999 Linux pgtest-02 pgtest-02

pgtest-02:9999 Linux pgtest-02 pgtest-02 9999 9000 4 LEADER
pgtest-01:9999 Linux pgtest-01 pgtest-01 9999 9000 7 STANDBY
pgtest-03:9999 Linux pgtest-03 pgtest-03 9999 9000 7 STANDBY

Here's the pgpool.log from node3 after shutdown of pgpool on node2
2021-09-01 10:43:38: pid 417478: LOG:  adding watchdog node "pgtest-01:9999 Linux pgtest-01" to the standby list
2021-09-01 10:43:38: pid 417478: LOG:  quorum found
2021-09-01 10:43:38: pid 417478: DETAIL:  starting escalation process
2021-09-01 10:43:38: pid 417478: LOG:  escalation process started with PID:554759
2021-09-01 10:43:38: pid 417478: LOG:  signal_user1_to_parent_with_reason(3)
2021-09-01 10:43:38: pid 417474: LOG:  Pgpool-II parent process received SIGUSR1
2021-09-01 10:43:38: pid 417474: LOG:  Pgpool-II parent process received watchdog quorum change signal from watchdog
2021-09-01 10:43:38: pid 417478: LOG:  new IPC connection received
2021-09-01 10:43:38: pid 417474: LOG:  watchdog cluster now holds the quorum
2021-09-01 10:43:38: pid 417474: DETAIL:  updating the state of quarantine backend nodes
2021-09-01 10:43:38: pid 417478: LOG:  new IPC connection received
2021-09-01 10:43:38: pid 554759: LOG:  watchdog: escalation started
RTNETLINK answers: Operation not permitted
2021-09-01 10:43:38: pid 554759: LOG:  failed to acquire the delegate IP address
2021-09-01 10:43:38: pid 554759: DETAIL:  'if_up_cmd' failed
2021-09-01 10:43:38: pid 554759: WARNING:  watchdog escalation failed to acquire delegate IP

Here's pcp_watchdog_info on node3 after shutdown of pgpool on node2
postgres@pgtest-03:~$ pcp_watchdog_info -h localhost -U wolf
Password: 
3 YES pgtest-03:9999 Linux pgtest-03 pgtest-03

pgtest-03:9999 Linux pgtest-03 pgtest-03 9999 9000 4 LEADER
pgtest-01:9999 Linux pgtest-01 pgtest-01 9999 9000 7 STANDBY
pgtest-02:9999 Linux pgtest-02 pgtest-02 9999 9000 10 SHUTDOWN

Here's pcp_watchdog_info on node3 after start of pgpool on node2
postgres@pgtest-03:~$ pcp_watchdog_info -h localhost -U wolf
Password: 
3 YES pgtest-03:9999 Linux pgtest-03 pgtest-03

pgtest-03:9999 Linux pgtest-03 pgtest-03 9999 9000 4 LEADER
pgtest-01:9999 Linux pgtest-01 pgtest-01 9999 9000 7 STANDBY
pgtest-02:9999 Linux pgtest-02 pgtest-02 9999 9000 7 STANDBY

The watchdog IP is still not enabled. This line in the node3 log seems to be the issue; maybe it is a permission problem?
RTNETLINK answers: Operation not permitted
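
For reference, here is a sketch of the pgpool.conf lines involved, assuming the cause is that the ip command in if_up_cmd runs without root privileges. The interface name (eth0), the /24 prefix, and the command paths are assumptions for illustration, not values taken from this thread. The user running pgpool would also need passwordless sudo for these commands.

```
# pgpool.conf (sketch; eth0, /24 and the command paths are assumed)
# $_IP_$ is replaced by the value of delegate_IP.
if_up_cmd = '/usr/bin/sudo /sbin/ip addr add $_IP_$/24 dev eth0 label eth0:0'
if_down_cmd = '/usr/bin/sudo /sbin/ip addr del $_IP_$/24 dev eth0'
arping_cmd = '/usr/bin/sudo /usr/sbin/arping -U $_IP_$ -w 1 -I eth0'
```

Because each command starts with "/", it is treated as a full path and if_cmd_path / arping_path are ignored.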

Wolf

On 8/29/21, 9:18 PM, "Bo Peng" <pengbo at sraoss.co.jp> wrote:

    Hello,

    > Sorry, but you missed the part where node 1 was added back as a standby after the failover to node 2. At the point when I turn off pgpool on node 2, node 1 and node 3 are the standby nodes, so node 3 should take over the watchdog.

    I have tested Pgpool-II 4.2.4, but I could not reproduce this issue.
    Could you share the following information?

    - result of "pcp_watchdog_info" after adding back node1 as a standby
    - pgpool logs of node 1 and node 3 after turning off pgpool on node2.


    > Wolf 
    > 
    > On 8/27/21, 10:07 AM, "Bo Peng" <pengbo at sraoss.co.jp> wrote:
    > 
    >     Hello,
    > 
    >     > My question is why watchdog doesn’t come up on node 3. Pgpool.conf is set the same on all 3 nodes.
    > 
    >     If you shut down pgpool on node1 and node2, only one pgpool node is alive,
    >     so the quorum does not exist.
    > 
    >     If you want to enable watchdog even if the quorum does not exist,
    >     you need to enable the parameter "enable_consensus_with_half_votes".
    > 
    >     See more detail about "enable_consensus_with_half_votes":
    >     https://www.pgpool.net/docs/latest/en/html/runtime-watchdog-config.html#GUC-ENABLE-CONSENSUS-WITH-HALF-VOTES
    > 
    >     > I have a 3-node setup for pgpool/postgresql using watchdog. When testing the failover of pgpool, I turn off pgpool on node 1, which fails the watchdog over to node 2. Then I turn on pgpool on node 1, which sets node 1 as a standby node. Next I turn off pgpool on node 2; the watchdog tries to fail over to node 3, but the watchdog IP never comes up on node 3 or on any of the nodes. So I turn off pgpool on node 3, and the watchdog fails over to node 1.
    >     > My question is why watchdog doesn’t come up on node 3. Pgpool.conf is set the same on all 3 nodes.
    >     > 
    >     > Here’s my output of show pool_nodes
    >     > 
    >     >  node_id | hostname  | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
    >     > ---------+-----------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
    >     >  0       | pgtest-01 | 5432 | up     | 0.500000  | primary | 2003       | true              | 0                 |                   |                        | 2021-08-24 14:06:20
    >     >  1       | pgtest-02 | 5432 | up     | 0.500000  | standby | 667        | false             | 0                 | streaming         | async                  | 2021-08-24 14:06:20
    >     >  2       | pgtest-03 | 5432 | up     | 0.000000  | standby | 0          | false             | 0                 | streaming         | async                  | 2021-08-24 14:06:20
    >     > 
    >     > Not sure if this is an issue, but lb_weight shows node 1 (pgtest-01) and node 2 (pgtest-02) as 0.500000 and node 3 (pgtest-03) as 0.000000.
    >     > 
    >     > In pgpool.conf I have backend_weight for each node set to 0.3
    >     > 
    >     > Hosts = Ubuntu 20.04
    >     > Pgpool = 4.2.4
    >     > PostgreSQL = 12.8
    >     > 
    >     > 
    >     > -- Wolf
    >     > 
    >     > 
    > 
    > 
    >     -- 
    >     Bo Peng <pengbo at sraoss.co.jp>
    >     SRA OSS, Inc. Japan
    >     http://www.sraoss.co.jp/
    > 


    -- 
    Bo Peng <pengbo at sraoss.co.jp>
    SRA OSS, Inc. Japan
    http://www.sraoss.co.jp/


