[pgpool-general: 1402] Re: pgPool Online Recovery Streaming Replication

Sun Feb 17 18:29:03 JST 2013

Hi Tatsuo,
Thank you so much for your reply.
Actually, in my case I was using the pcp_recovery_node command and executing
it on the current primary server.
However, if the database on the remote node (the node being recovered) is
stopped, I get the following messages in the pgpool log on the primary server:

Jan 31 16:58:10 server0 pgpool[2723]: starting recovery command: "SELECT
pgpool_recovery('basebackup.sh', 'server1', '/opt/postgres/9.2/data')"
Jan 31 16:58:11 server0 pgpool[2723]: 1st stage is done
Jan 31 16:58:11 server0 pgpool[2723]: check_postmaster_started: try to
connect to postmaster on hostname:server1 database:postgres user:postgres
(retry 0 times)
Jan 31 16:58:11 server0 pgpool[2723]: check_postmaster_started: failed to
connect to postmaster on hostname:server1 database:postgres user:postgres
Jan 31 16:58:13 server0 pgpool[2719]: connection received:
host=server0.local port=58446
Jan 31 16:58:14 server0 pgpool[2723]: check_postmaster_started: try to
connect to postmaster on hostname:server1 database:postgres user:postgres
(retry 1 times)
Jan 31 16:58:14 server0 pgpool[2723]: check_postmaster_started: failed to
connect to postmaster on hostname:server1 database:postgres user:postgres
Jan 31 16:58:14 server0 pgpool[2719]: connection received:
host=server1.local port=39928
Jan 31 16:58:17 server0 pgpool[2723]: check_postmaster_started: try to
connect to postmaster on hostname:server1 database:postgres user:postgres
(retry 2 times)
Jan 31 16:58:17 server0 pgpool[2723]: check_postmaster_started: failed to
connect to postmaster on hostname:server1 database:postgres user:postgres
Jan 31 16:58:20 server0 pgpool[2723]: check_postmaster_started: try to
connect to postmaster on hostname:server1 database:postgres user:postgres
(retry 3 times)
Jan 31 16:58:20 server0 pgpool[2723]: check_postmaster_started: failed to
connect to postmaster on hostname:server1 database:postgres user:postgres
Jan 31 16:58:23 server0 pgpool[2719]: connection received:
host=server0.local port=58464
Jan 31 16:58:23 server0 pgpool[2723]: check_postmaster_started: try to
connect to postmaster on hostname:server1 database:template1 user:postgres
(retry 0 times)
Jan 31 16:58:23 server0 pgpool[2723]: check_postmaster_started: failed to
connect to postmaster on hostname:server1 database:template1 user:postgres
Jan 31 16:58:26 server0 pgpool[2723]: check_postmaster_started: try to
connect to postmaster on hostname:server1 database:template1 user:postgres
(retry 1 times)
Jan 31 16:58:26 server0 pgpool[2723]: check_postmaster_started: failed to
connect to postmaster on hostname:server1 database:template1 user:postgres
Jan 31 16:58:26 server0 pgpool[2719]: connection received:
host=server1.local port=39946
Jan 31 16:58:29 server0 pgpool[2723]: check_postmaster_started: try to
connect to postmaster on hostname:server1 database:template1 user:postgres
(retry 2 times)
Jan 31 16:58:29 server0 pgpool[2723]: check_postmaster_started: failed to
connect to postmaster on hostname:server1 database:template1 user:postgres
Jan 31 16:58:32 server0 pgpool[2723]: check_postmaster_started: try to
connect to postmaster on hostname:server1 database:template1 user:postgres
(retry 3 times)
Jan 31 16:58:32 server0 pgpool[2723]: check_postmaster_started: failed to
connect to postmaster on hostname:server1 database:template1 user:postgres
Jan 31 16:58:33 server0 pgpool[2719]: connection received:
host=server0.local port=58483
Jan 31 16:58:35 server0 pgpool[2723]: check_postmaster_started: try to
connect to postmaster on hostname:server1 database:template1 user:postgres
(retry 4 times)
Jan 31 16:58:35 server0 pgpool[2723]: check_postmaster_started: failed to
connect to postmaster on hostname:server1 database:template1 user:postgres
Jan 31 16:58:38 server0 pgpool[2723]: check_postmaster_started: try to
connect to postmaster on hostname:server1 database:template1 user:postgres
(retry 5 times)
Jan 31 16:58:38 server0 pgpool[2723]: check_postmaster_started: failed to
connect to postmaster on hostname:server1 database:template1 user:postgres
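
To make the pattern in the log above easier to see, here is a minimal Python sketch that re-joins the wrapped lines and counts the failed check_postmaster_started attempts per host and database. The helper names (join_wrapped, count_failures) are hypothetical, and the regular expressions only assume the message format shown in this excerpt; other pgpool versions may log differently.

```python
import re
from collections import Counter

# A syslog-style record starts with a timestamp such as "Jan 31 16:58:11".
RECORD_START = re.compile(r"^[A-Z][a-z]{2} +\d+ \d{2}:\d{2}:\d{2} ")

# Message format assumed from the excerpt above, not from pgpool's source.
FAILED = re.compile(
    r"check_postmaster_started: failed to connect to postmaster on "
    r"hostname:(\S+) database:(\S+) user:(\S+)"
)


def join_wrapped(lines):
    """Re-join records that the mail client wrapped across several lines."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if RECORD_START.match(line) or not records:
            records.append(line)
        else:
            # Continuation of the previous record.
            records[-1] += " " + line
    return records


def count_failures(lines):
    """Count failed connection attempts per (hostname, database)."""
    counts = Counter()
    for record in join_wrapped(lines):
        match = FAILED.search(record)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


sample = """\
Jan 31 16:58:11 server0 pgpool[2723]: check_postmaster_started: failed to
connect to postmaster on hostname:server1 database:postgres user:postgres
Jan 31 16:58:23 server0 pgpool[2723]: check_postmaster_started: failed to
connect to postmaster on hostname:server1 database:template1 user:postgres
""".splitlines()

for (host, database), n in sorted(count_failures(sample).items()):
    print(f"{host} {database}: {n} failed attempt(s)")
```

Run against the full excerpt, this should show pgpool first probing the postgres database on server1 and then falling back to template1, with every attempt failing.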

Here is the exact command I execute on server0 to recover server1:
/usr/local/bin/pcp_recovery_node 10 localhost 9898 pgpool cisco 1
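
For readers unfamiliar with the positional form of this command, the following Python sketch labels each argument and rebuilds the same command line. It assumes the positional PCP syntax of the pgpool-II 3.x era (timeout, hostname, port, username, password, node ID); the names PcpRecoveryArgs and build_command are hypothetical and only serve as illustration.

```python
from typing import List, NamedTuple


class PcpRecoveryArgs(NamedTuple):
    """Positional arguments of pcp_recovery_node, in order (assumed 3.x syntax)."""
    timeout: int    # seconds to wait for pgpool-II to respond
    host: str       # host where the pgpool-II PCP service listens
    port: int       # PCP port
    user: str       # PCP user name
    password: str   # PCP password
    node_id: int    # index of the backend node to recover


def build_command(args: PcpRecoveryArgs,
                  binary: str = "/usr/local/bin/pcp_recovery_node") -> List[str]:
    """Return the argument vector, suitable for passing to subprocess.run()."""
    return [binary] + [str(field) for field in args]


# The values below are the ones from the command quoted above.
cmd = build_command(PcpRecoveryArgs(10, "localhost", 9898, "pgpool", "cisco", 1))
print(" ".join(cmd))
```

Here node ID 1 refers to server1, the backend being recovered, while localhost 9898 is the PCP endpoint of the pgpool instance on server0.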

Do you have any idea why?

Just FYI, we cannot use pgpoolAdmin in our environment.

On Sun, Feb 17, 2013 at 12:13 AM, Tatsuo Ishii <ishii at postgresql.org> wrote:

> > Hi all,
> > I have the following questions regarding the recovery of a failed Primary
> > Database Server.
> >
> > Question 1: in the documentation, under Streaming Replication Online
> > Recovery section.
> >
> > http://www.pgpool.net/docs/latest/pgpool-en.html#stream
> >
> > in step 6:
> >
> >    1. After completing online recovery, pgpool-II will start PostgreSQL on
> >    the standby node. Install the script for this purpose on each DB
> >    nodes. Sample script
> >    <http://www.pgpool.net/docs/latest/pgpool_remote_start> is
> >    included in "sample" directory of the source code. This script uses ssh.
> >    You need to allow recovery_user to login from the primary node to the
> >    standby node without being asked password.
> >
> > As I understand it, PostgreSQL does not need to be online on the failed
> > node for the recovery process, right? Later on, the documentation mentions
> > that pgpool_remote_start will start up the DB on the failed node.
>
> Actually, the standby PostgreSQL node should not be started.
>
> > Question 2: in my configuration, I have two pgpool servers with two
> > backends. Will it work for online recovery?
>
> Yes, but the online recovery process should be initiated by only one of the
> pgpool servers, not both. If you enable pgpool-II 3.2's watchdog, it will
> take care of the necessary interlocking.
>
> > Question 3: when the failed node comes back online, should I use
> > pcp_recovery from the DB primary, or should I use pcp_attach on the failed
> > node to recover the failed system? Actually, in my case, neither method
> > recovers my system every time.
>
> I'm confused. Didn't you start the online recovery process by using
> pcp_recovery_node? (Of course, you could do it via pgpoolAdmin.)
>
> Anyway, pcp_recovery_node automatically attaches the recovered node, so you
> don't need to execute pcp_attach_node.
>
> I suggest you read the tutorial:
> http://www.pgpool.net/pgpool-web/contrib_docs/simple_sr_setting2/index.html
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese: http://www.sraoss.co.jp
>