[pgpool-general: 8172] Re: Problems taking node offline

Bo Peng pengbo at sraoss.co.jp
Tue May 24 21:54:17 JST 2022


Hello,

> I was able to do some experimentation this today.
> 
> To take a frontend offline I believe I just need to stop the pgpool process on that system and then let pgpool figure out the new primary and grab the virtual IP address. Correct?

Yes.
 
> To take a backend offline I believe I use pcp_detach_node. Correct?

Yes.

If failover_command and follow_primary_command are specified, they are executed after you run pcp_detach_node.

If you run pcp_detach_node to detach the primary node,
failover_command promotes a standby to primary and 
follow_primary_command enables the other alive nodes to follow the new primary.

It seems that the failover_command and follow_primary_command didn't executed in your logs.
Can you successfully detach a standby PostgreSQL node?

If you shutdown the primary PostgreSQL node, will failover be performed correctly?

I found that the setting parameters of 4.1 and 4.2 are mixed in pgpool.conf.
I am not sure if it affects the behavior of watchdog.

> As far as testing with "use_watchdog = off", I just tried that. I changed the parameter on all 3 of my pgpool hosts. I then restarted the pgpool process on all 3 hosts. I noticed that no host picked up the virtual IP, I'm assuming that is because the watchdog is off, correct?
> I manually assigned the virtual IP to one of the hosts.
> Before I execute any commands:
> postgres=# show pool_nodes;
>  node_id |       hostname       | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change  
> ---------+----------------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
>  0       | psql-01.mgmt.bbn.com | 5432 | up     | 0.333333  | primary | 282959     | true              | 0                 |                   |                        | 2022-05-13 12:46:54
>  1       | psql-02.mgmt.bbn.com | 5432 | up     | 0.333333  | standby | 230916     | false             | 0                 | streaming         | potential              | 2022-05-13 12:46:54
>  2       | psql-03.mgmt.bbn.com | 5432 | up     | 0.333333  | standby | 370021     | false             | 0                 | streaming         | sync                   | 2022-05-13 12:46:54
> (3 rows)
> 
> I then executed:
> $ pcp_detach_node -h psql.mgmt.bbn.com -p 9897 -U pgpool -g -n 1
> Password: 
> pcp_detach_node -- Command Successful
> 
> This took a long time (60 seconds) to finish.
> 
> The node does not appear to be offline:
> postgres=# show pool_nodes;
>  node_id |       hostname       | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change  
> ---------+----------------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
>  0       | psql-01.mgmt.bbn.com | 5432 | up     | 0.333333  | primary | 311652     | true              | 0                 |                   |                        | 2022-05-13 12:46:54
>  1       | psql-02.mgmt.bbn.com | 5432 | up     | 0.333333  | standby | 248369     | false             | 0                 | streaming         | potential              | 2022-05-13 12:46:54
>  2       | psql-03.mgmt.bbn.com | 5432 | up     | 0.333333  | standby | 410891     | false             | 0                 | streaming         | sync                   | 2022-05-13 12:46:54
> (3 rows)
> 
> The system log for pgpool during this time is attached.
> 
>  What am I doing wrong here?
> 
> 
> 
> 
> 
> 
> 
> 
> Jon Schewe
> Principal Software Systems Technologist
> C: +1 612.263.2718
> O: +1 952.545.5720
> jon.schewe at raytheon.com
> 
> Raytheon BBN
> Raytheon Intelligence & Space
> 5775 Wayzata Blvd. Suite 630
> St. Louis Park, MN 55416
> 
> 
> 
> 
> From: pgpool-general <pgpool-general-bounces at pgpool.net> on behalf of Jon SCHEWE <jon.schewe at raytheon.com>
> Sent: Thursday, May 12, 2022 16:14
> To: Bo Peng <pengbo at sraoss.co.jp>
> Cc: pgpool-general at pgpool.net <pgpool-general at pgpool.net>
> Subject: [External] [pgpool-general: 8155] Re: Problems taking node offline 
>  
> I have not been able to test with the watchdog off, however I am wondering about the proper commands to switch backends and frontends.
> 
> I see pcp_detatch_node removes a pgpool frontend. 
> What command can I use to tell pgpool to switch to a different primary backend? Do I just stop the postgresql process?
> 
> Jon Schewe
> Principal Software Systems Technologist
> 
> C: +1 612.263.2718
> O: +1 952.545.5720
> jon.schewe at raytheon.com
> 
> Raytheon BBN
> Raytheon Intelligence & Space
> 5775 Wayzata Blvd. Suite 630
> St. Louis Park, MN 55416 
> 
> RTX.com | LinkedIn | Twitter | Instagram 
> 
> 
> From: pgpool-general <pgpool-general-bounces at pgpool.net> on behalf of Jon SCHEWE <jon.schewe at raytheon.com>
> Sent: Wednesday, April 27, 2022 13:07
> To: Bo Peng <pengbo at sraoss.co.jp>
> Cc: pgpool-general at pgpool.net <pgpool-general at pgpool.net>
> Subject: [External] [pgpool-general: 8108] Re: Problems taking node offline 
>  
> > On Tue, 26 Apr 2022 15:01:15 +0000
> > Jon SCHEWE <jon.schewe at raytheon.com> wrote:
> > 
> > > >> I want to take a backend node offline and having some trouble with it.
> > > >>
> > > >> I check the status of my notes:
> > > >> template1=> show pool_nodes;
> > > >>  node_id |       hostname       | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
> > > >> ---------+----------------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
> > > >>  0       | psql-01.mgmt.bbn.com | 5432 | up     | 0.333333  | standby | 646198     | false             | 0                 | streaming         | sync                   | 2022-04-25 14:19:57
> > > >>  1       | psql-02.mgmt.bbn.com | 5432 | up     | 0.333333  | primary | 2115353    | true              | 0                 |                   |                        | 2022-04-25 14:16:24
> > > >>  2       | psql-03.mgmt.bbn.com | 5432 | up     | 0.333333  | standby | 2913       | false             | 0                 | streaming         | potential              | 2022-04-25 14:24:25
> > > >> (3 rows)
> > > >>
> > > >> I want to take psql-02 offline.
> > > >>
> > > >> pcp_detach_node -h psql.mgmt.bbn.com -p 9897 -U pgpool -g -n 1
> > > >> Password:
> > > >> pcp_detach_node -- Command Successful
> > > >>
> > > >>
> > > >> I check the status again:
> > > >> template1=> show pool_nodes;
> > > >>  node_id |       hostname       | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
> > > >> ---------+----------------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
> > > >>  0       | psql-01.mgmt.bbn.com | 5432 | up     | 0.333333  | standby | 718555     | true              | 0                 | streaming         | sync                   | 2022-04-25 14:19:57
> > > >>  1       | psql-02.mgmt.bbn.com | 5432 | up     | 0.333333  | primary | 2373454    | false             | 0                 |                   |                        | 2022-04-25 14:16:24
> > > >>  2       | psql-03.mgmt.bbn.com | 5432 | up     | 0.333333  | standby | 3310       | false             | 0                 | streaming         | potential              | 2022-04-25 14:24:25
> > > >> (3 rows)
> > > >>
> > > >>
> > > >> I still see psql-02 online. Why is that?
> > > >
> > > >Could you share pgpool.conf
> > >
> > > Yes, attached.
> > >
> > > > and full log after running pcp_detach_node?
> > >
> > > The only log messages are what I sent originally.
> > >
> > > >Which version of Pgpool-II are you using?
> > >
> > > 4.1.4
> > 
> > Thank you.
> > 
> > I think watchdog may not be working properly.
> > If you run pcp_detach_node, failover_command and follow_master_command should be executed.
> > But I could not see the related logs.
> > 
> > Could you check the watchdog status using "pcp_watchdog_info" command?
> 
> 
> [jschewe-adm at psql-01 ~]$ pcp_watchdog_info -h psql.mgmt.bbn.com -p 9897 -U pgpool 
> Password: 
> 3 YES psql-02.mgmt.bbn.com:9898 Linux psql-02 psql-02.mgmt.bbn.com
> 
> psql-02.mgmt.bbn.com:9898 Linux psql-02 psql-02.mgmt.bbn.com 9898 9000 4 MASTER
> psql-01.mgmt.bbn.com:9898 Linux psql-01 psql-01.mgmt.bbn.com 9898 9000 7 STANDBY
> Not_Set psql-03.mgmt.bbn.com 9898 9000 0 DEAD
> [jschewe-adm at psql-01 ~]$ psql -h psql.mgmt.bbn.com -p 9898 -U postgres
> Password for user postgres: 
> psql (13.6)
> SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
> Type "help" for help.
> 
> postgres=# show pool_nodes;
>  node_id |       hostname       | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change  
> ---------+----------------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
>  0       | psql-01.mgmt.bbn.com | 5432 | up     | 0.333333  | primary | 30419799   | false             | 0                 |                   |                        | 2022-04-26 00:19:21
>  1       | psql-02.mgmt.bbn.com | 5432 | up     | 0.333333  | standby | 20228026   | false             | 0                 | streaming         | potential              | 2022-04-26 10:57:15
>  2       | psql-03.mgmt.bbn.com | 5432 | up     | 0.333333  | standby | 2974278    | true              | 0                 | streaming         | sync                   | 2022-04-26 11:04:41
> (3 rows)
> 
> postgres=# 
> 
> 
> > Does this issue occur if you disable watchdog "use_watchdog = off"?
> 
> I will give that a try when I have some downtime.
> _______________________________________________
> pgpool-general mailing list
> pgpool-general at pgpool.net
> http://www.pgpool.net/mailman/listinfo/pgpool-general


-- 
Bo Peng <pengbo at sraoss.co.jp>
SRA OSS, Inc. Japan
http://www.sraoss.co.jp/


More information about the pgpool-general mailing list