[pgpool-general: 7501] Re: Problems with pg_rewind

Bo Peng pengbo at sraoss.co.jp
Tue Apr 13 08:56:18 JST 2021


Hi,

On Mon, 12 Apr 2021 09:48:26 +0200
Emond Papegaaij <emond.papegaaij at gmail.com> wrote:

> Hi all,
> 
> In our test setup we see frequent failures where a database does not
> start properly after a forced failover. In this test we start with a
> cluster of 3 pgpool (4.1.4)+pg (12) nodes (every node runs both pgpool
> and pg). Node 0 is master+primary and needs to be restarted so we
> trigger a forced failover. First we stop and restart pgpool to force
> another node in the cluster to become cluster master. Next, we detach
> the backend at node 0 and this is where the trouble starts. These log
> lines from the server show the steps being executed (I've removed some
> to keep it concise):
> 
> Apr 11 05:31:51 tkh-server tkh-force-failover: Node 0 status:
> 172.29.30.1 5432 1 0.333333 waiting primary 0   2021-04-11 07:31:49
> Apr 11 05:31:51 tkh-server tkh-force-failover: Detaching node 0 from
> the pgpool cluster
> Apr 11 05:31:51 tkh-server tkh-force-failover: pcp_detach_node --
> Command Successful
> Apr 11 05:31:51 tkh-server tkh-force-failover: Waiting for failover to complete
> Apr 11 05:31:55 tkh-server tkh-force-failover: Node 0 status:
> 172.29.30.1 5432 3 0.333333 down standby 0   2021-04-11 07:31:52
> Apr 11 05:31:55 tkh-server webhook: pg_rewind: source and target
> cluster are on the same timeline
> Apr 11 05:31:55 tkh-server webhook: pg_rewind: no rewind required
> Apr 11 05:31:56 tkh-server webhook: Creating tkh-db ... done
> Apr 11 05:31:57 tkh-server tkh-force-failover: Node 0 status:
> 172.29.30.1 5432 1 0.333333 waiting standby 160 streaming async
> 2021-04-11 07:31:56
> Apr 11 05:31:57 tkh-server tkh-force-failover: Failover complete
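> 
> (For context, the detach-and-wait step above is roughly equivalent to
> something like the following; the hostname, PCP port and PCP user are
> placeholders, not our real values, and our actual script is more
> involved:)
> 
>     # detach backend node 0 from the pgpool cluster via PCP
>     pcp_detach_node -h 172.29.30.2 -p 9898 -U pcpadmin -w -n 0
> 
>     # poll pgpool until node 0 is re-attached and streaming again
>     until pcp_node_info -h 172.29.30.2 -p 9898 -U pcpadmin -w -n 0 \
>             | grep -q streaming; do
>         sleep 1
>     done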
> 
> As you can see, pg_rewind decides that a rewind is not needed.
> However, when the database starts, it starts at a replication delay of
> 160. This is the first indication of something being wrong. After
> this, the database is stopped again, some docker packages are
> modified, and the database is started again. Now the database no
> longer starts and reports:
> 
> {"log":"2021-04-11 07:32:34.145 CEST [1] LOG:  starting PostgreSQL
> 12.6 (Debian 12.6-1.pgdg100+1) on x86_64-pc-linux-gnu, compiled by gcc
> (Debian 8.3.0-6) 8.3.0,
> 64-bit\n","stream":"stderr","time":"2021-04-11T05:32:34.146994187Z"}
> {"log":"2021-04-11 07:32:34.145 CEST [1] LOG:  listening on IPv4
> address \"0.0.0.0\", port
> 5432\n","stream":"stderr","time":"2021-04-11T05:32:34.147013641Z"}
> {"log":"2021-04-11 07:32:34.145 CEST [1] LOG:  listening on IPv6
> address \"::\", port
> 5432\n","stream":"stderr","time":"2021-04-11T05:32:34.147017054Z"}
> {"log":"2021-04-11 07:32:34.146 CEST [1] LOG:  listening on Unix
> socket \"/var/run/postgresql/.s.PGSQL.5432\"\n","stream":"stderr","time":"2021-04-11T05:32:34.147019855Z"}
> {"log":"2021-04-11 07:32:34.162 CEST [28] LOG:  database system was
> shut down in recovery at 2021-04-11 07:32:07
> CEST\n","stream":"stderr","time":"2021-04-11T05:32:34.165175019Z"}
> {"log":"2021-04-11 07:32:34.163 CEST [28] LOG:  entering standby
> mode\n","stream":"stderr","time":"2021-04-11T05:32:34.165196461Z"}
> {"log":"2021-04-11 07:32:34.163 CEST [28] FATAL:  requested timeline 2
> is not a child of this server's
> history\n","stream":"stderr","time":"2021-04-11T05:32:34.165209297Z"}
> {"log":"2021-04-11 07:32:34.163 CEST [28] DETAIL:  Latest checkpoint
> is at 0/50354F8 on timeline 1, but in the history of the requested
> timeline, the server forked off from that timeline at
> 0/50354F8.\n","stream":"stderr","time":"2021-04-11T05:32:34.165212383Z"}
> {"log":"2021-04-11 07:32:34.164 CEST [1] LOG:  startup process (PID
> 28) exited with exit code
> 1\n","stream":"stderr","time":"2021-04-11T05:32:34.168416599Z"}
> {"log":"2021-04-11 07:32:34.164 CEST [1] LOG:  aborting startup due to
> startup process
> failure\n","stream":"stderr","time":"2021-04-11T05:32:34.168429202Z"}
> {"log":"2021-04-11 07:32:34.166 CEST [1] LOG:  database system is shut
> down\n","stream":"stderr","time":"2021-04-11T05:32:34.168432379Z"}
> 
> The database (node 1) that is now the primary reports the following in its logs:
> 
> {"log":"2021-04-11 07:30:34.303 CEST [1] LOG:  database system is
> ready to accept read only
> connections\n","stream":"stderr","time":"2021-04-11T05:30:34.305997834Z"}
> {"log":"2021-04-11 07:30:34.320 CEST [32] LOG:  started streaming WAL
> from primary at 0/5000000 on timeline
> 1\n","stream":"stderr","time":"2021-04-11T05:30:34.321296398Z"}
> {"log":"2021-04-11 07:31:51.674 CEST [28] LOG:  received promote
> request\n","stream":"stderr","time":"2021-04-11T05:31:51.675934327Z"}
> {"log":"2021-04-11 07:31:51.674 CEST [32] FATAL:  terminating
> walreceiver process due to administrator
> command\n","stream":"stderr","time":"2021-04-11T05:31:51.675968345Z"}
> {"log":"2021-04-11 07:31:51.676 CEST [28] LOG:  redo done at
> 0/50354C0\n","stream":"stderr","time":"2021-04-11T05:31:51.677261588Z"}
> {"log":"2021-04-11 07:31:51.676 CEST [28] LOG:  last completed
> transaction was at log time 2021-04-11
> 07:21:00.991231+02\n","stream":"stderr","time":"2021-04-11T05:31:51.677276189Z"}
> {"log":"2021-04-11 07:31:51.681 CEST [28] LOG:  selected new timeline
> ID: 2\n","stream":"stderr","time":"2021-04-11T05:31:51.682159124Z"}
> {"log":"2021-04-11 07:31:51.798 CEST [28] LOG:  archive recovery
> complete\n","stream":"stderr","time":"2021-04-11T05:31:51.798821424Z"}
> {"log":"2021-04-11 07:31:51.809 CEST [1] LOG:  database system is
> ready to accept
> connections\n","stream":"stderr","time":"2021-04-11T05:31:51.810598738Z"}
> 
> From what I can see, the database on node 0 thinks it is still on
> timeline 1, where node 1 is at timeline 2. Why does pg_rewind decide
> that a rewind is not required? Could this be a timing issue? Maybe the
> rewind is started before node 1 actually started the new timeline? 

I think it is not a timing issue, because follow_master_command is executed after failover_command.
It may be caused by the pg_rewind command.

I can't reproduce the pg_rewind issue reported above.
If you could share your failover and follow_master scripts,
it would help to identify the cause.
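
As a quick diagnostic sketch (this is not part of the sample scripts;
the binary and data directory paths below are assumptions for a Debian
PostgreSQL 12 install), you can compare the timeline each data
directory reports with pg_controldata on node 0 and node 1 right after
the failover:

    /usr/lib/postgresql/12/bin/pg_controldata /var/lib/postgresql/12/main \
        | grep -E "TimeLineID|Latest checkpoint location"

That would at least confirm whether the two data directories disagree
about the timeline at the moment pg_rewind runs.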

> In other tests we do see a rewind happening and in those cases the
> database does start properly. The scripts we used are based on the
> examples but modified for our setup. So pgpool_follow_master.sh first
> checks if the database is up, then tries a pg_rewind and, if this
> fails, a pg_basebackup. Finally it starts and attaches the node.

After executing pg_rewind, you should try to start PostgreSQL.
If it fails to start, then run pg_basebackup.

See the latest sample script in the 4.1 branch:

https://git.postgresql.org/gitweb/?p=pgpool2.git;a=blob_plain;f=src/sample/scripts/follow_master.sh.sample;hb=refs/heads/V4_1_STABLE
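
Roughly, the flow should look like this (only a simplified sketch;
paths, host names and the replication user are placeholders, and the
sample script above is the authoritative reference):

    # rewind the old primary's data directory against the new primary
    su - postgres -c "pg_rewind --target-pgdata=${PGDATA} \
        --source-server=\"host=${NEW_PRIMARY_HOST} port=5432 user=postgres dbname=postgres\""

    # the node must start as a standby: create standby.signal and make sure
    # primary_conninfo in postgresql.auto.conf points at the new primary
    su - postgres -c "touch ${PGDATA}/standby.signal"

    # try to start the rewound node; fall back to a base backup if it fails
    if ! su - postgres -c "pg_ctl -D ${PGDATA} -w start"; then
        su - postgres -c "rm -rf ${PGDATA:?}/*"
        su - postgres -c "pg_basebackup -h ${NEW_PRIMARY_HOST} -U repl \
            -D ${PGDATA} -X stream -R"
        su - postgres -c "pg_ctl -D ${PGDATA} -w start"
    fi

    # finally re-attach the node to pgpool
    pcp_attach_node -h ${PGPOOL_HOST} -p 9898 -U pcpadmin -w -n ${NODE_ID}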

> Best regards,
> Emond
> _______________________________________________
> pgpool-general mailing list
> pgpool-general at pgpool.net
> http://www.pgpool.net/mailman/listinfo/pgpool-general


-- 
Bo Peng <pengbo at sraoss.co.jp>
SRA OSS, Inc. Japan

