[pgpool-general: 9010] Re: can simply restart a streaming backup system.

Mon Jan 29 10:27:44 JST 2024

Hi,

> I used this https://www.pgpool.net/docs/45/en/html/example-cluster.html
> tutorial to setup my three servers.   It is working really well.  I am just
> having one issue.
> 
> If I shut down one of the standby servers, or simply shutdown pgpool and
> postgres, when I restart postgres it is not getting itself back up to
> date.

This is the expected behavior.
After restarting a PostgreSQL standby node, you need to confirm that
streaming replication between the primary and the standby is working,
and then attach the standby to Pgpool-II with "pcp_attach_node".

https://www.pgpool.net/docs/45/en/html/pcp-attach-node.html
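
The two steps above could be scripted roughly as follows. This is only a
sketch under assumptions: the PCP address, port, user and node id are copied
from the pcp_recovery_node command quoted below, the "postgres" database user
and the function names are hypothetical, and the check relies on the
"state" column of pg_stat_replication on the primary.

```shell
#!/bin/sh
# Sketch: attach a restarted standby only if the primary reports it as streaming.
# Addresses and the node id come from the thread; the "postgres" login is an assumption.

PRIMARY_HOST=10.30.0.40   # current primary (node 0 in the status output)
STANDBY_ADDR=10.30.0.42   # restarted standby (node 2)
PCP_HOST=10.30.0.45       # PCP address used in the original pcp_recovery_node call
PCP_PORT=9898
NODE_ID=2

# Print the replication state of the standby as seen from the primary.
# Prints nothing when the standby has no WAL sender entry at all.
standby_state() {
    psql -h "$PRIMARY_HOST" -p 5432 -U postgres -d postgres -At \
        -c "SELECT state FROM pg_stat_replication WHERE client_addr = '$STANDBY_ADDR'"
}

# Attach the node to Pgpool-II only when replication is in the streaming state.
attach_if_streaming() {
    state=$(standby_state)
    if [ "$state" = "streaming" ]; then
        pcp_attach_node -h "$PCP_HOST" -p "$PCP_PORT" -U pgpool -n "$NODE_ID" -W
    else
        echo "standby $STANDBY_ADDR is not streaming (state: ${state:-none}); not attaching" >&2
        return 1
    fi
}
```

Sourcing the file and running "attach_if_streaming" once the standby is back
up would attach node 2, or print a message and return non-zero if the primary
does not yet report it as streaming.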

Alternatively, you can enable "auto_failback" to automatically attach
a healthy standby to Pgpool-II.
"A healthy standby" here means a node whose Pgpool-II status is down
but whose streaming replication is working normally.

https://www.pgpool.net/docs/45/en/html/runtime-config-failover.html#GUC-AUTO-FAILBACK
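
As an illustration, the relevant part of pgpool.conf might look like the
fragment below. The values are examples rather than recommendations, and the
fragment assumes the streaming replication check is already configured as in
the example-cluster tutorial, since auto_failback depends on it.

```
# pgpool.conf (fragment, example values)

# Re-attach a standby whose Pgpool-II status is down while its
# streaming replication is working normally.
auto_failback = on

# Minimum interval between two automatic failbacks.
auto_failback_interval = 1min

# auto_failback relies on the streaming replication check,
# so sr_check_period must not be 0.
sr_check_period = 10
```

Pgpool-II needs to reload or restart its configuration before the change
takes effect.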

On Wed, 24 Jan 2024 10:54:48 -0700
Mark Gordon <mark at ordertech.com> wrote:

> I have a three node pgpool system running watchdog.
> pgpool-II version 4.5.0 (hotooriboshi)
> Postgres version 16
> the three servers are running Rocky 9
> I am using streaming backup for replication.
> 
> I used this https://www.pgpool.net/docs/45/en/html/example-cluster.html
> tutorial to setup my three servers.   It is working really well.  I am just
> having one issue.
> 
> If I shut down one of the standby servers, or simply shutdown pgpool and
> postgres, when I restart postgres it is not getting itself back up to
> date.
> 
>  node_id |  hostname  | port | status | pg_status | lb_weight |  role   | pg_role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
> ---------+------------+------+--------+-----------+-----------+---------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
>  0       | 10.30.0.40 | 5432 | up     | up        | 0.333333  | primary | primary | 93681      | false             | 0                 |                   |                        | 2024-01-23 12:24:03
>  1       | 10.30.0.41 | 5432 | up     | up        | 0.333333  | standby | standby | 35143      | true              | 0                 |                   |                        | 2024-01-23 12:24:03
>  2       | 10.30.0.42 | 5432 | down   | down      | 0.333333  | standby | unknown | 2005       | false             | 0                 |                   |                        | 2024-01-24 10:45:00
> (3 rows)
> 
> I then restart postgres and pgpool
> 
>  node_id |  hostname  | port | status | pg_status | lb_weight |  role   | pg_role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
> ---------+------------+------+--------+-----------+-----------+---------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
>  0       | 10.30.0.40 | 5432 | up     | up        | 0.333333  | primary | primary | 93709      | false             | 0                 |                   |                        | 2024-01-23 12:24:03
>  1       | 10.30.0.41 | 5432 | up     | up        | 0.333333  | standby | standby | 35143      | true              | 0                 |                   |                        | 2024-01-23 12:24:03
>  2       | 10.30.0.42 | 5432 | down   | up        | 0.333333  | standby | standby | 2005       | false             | 0                 |                   |                        | 2024-01-24 10:45:00
> (3 rows)
> 
> 
> Here is watchdog info after restarting
> 10.30.0.40:9999 Linux pg0 10.30.0.40 9999 9000 4 LEADER 0 MEMBER
> 10.30.0.41:9999 Linux pg1 10.30.0.41 9999 9000 7 STANDBY 0 MEMBER
> 10.30.0.42:9999 Linux pg2 10.30.0.42 9999 9000 7 STANDBY 0 MEMBER
> 
> It looks like it is restoring log files, but it never finishes, and
> nothing is really happening on the system.
> 
> 
> postgres   89859       1  0 10:49 ?        00:00:00 /usr/pgsql-16/bin/postgres -D /var/lib/pgsql/16/data
> postgres   89860   89859  0 10:49 ?        00:00:00 postgres: logger
> postgres   89861   89859  0 10:49 ?        00:00:00 postgres: checkpointer
> postgres   89862   89859  0 10:49 ?        00:00:00 postgres: background writer
> postgres   89863   89859  0 10:49 ?        00:00:00 postgres: startup recovering 0000000300000008000000A7
> 
> 
> To fix it, I have to shut off postgres and run:
> pcp_recovery_node -h 10.30.0.45 -p 9898 -U pgpool -n 2 -W
> 
> This gets everything going again but takes a long time of course because it
> is doing a full restore.   I have used streaming backup in the past and
> normally it just starts back up where it left off after restarting the
> standby server.
> 
> What am I doing wrong?
> 
> Thanks
> Mark
> 
> -- 
> *Mark Gordon*
> Phoenix, AZ
> mark at ordertech.com
> 480.285.1403

-- 
Bo Peng <pengbo at sraoss.co.jp>
SRA OSS LLC
TEL: 03-5979-2701 FAX: 03-5979-2702
URL: https://www.sraoss.co.jp/