[pgpool-general: 1403] Re: pgPool Online Recovery Streaming Replication

Tatsuo Ishii ishii at postgresql.org
Sun Feb 17 22:05:45 JST 2013


Hi,

It seems the standby was unable to start up. Can you show the standby
PostgreSQL's log? That may help us find the cause of the problem.
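
While collecting that log, a quick way to confirm from the primary whether anything is listening on the standby is a plain TCP probe. This is only a rough sketch: it is not what pgpool itself runs, the function name is made up for illustration, and port 5432 is an assumed default that may differ in your setup.

```python
import socket


def postmaster_reachable(host: str, port: int = 5432, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    The port default (5432) is an assumption; use the port your standby
    PostgreSQL is actually configured to listen on.
    """
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False
```

For example, postmaster_reachable("server1") returning False right after the recovery script finishes would be consistent with the standby never having started, and the standby's own log should then say why.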
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

> Hi Tatsuo,
> Thank you so much for your reply.
> Actually, in my case I ran the pcp_recovery_node command on the current
> primary server.
> However, when the database on the remote node (the node being recovered)
> is off, I get the following messages in the pgpool log on the primary
> server:
> 
> Jan 31 16:58:10 server0 pgpool[2723]: starting recovery command: "SELECT
> pgpool_recovery('basebackup.sh', 'server1', '/opt/postgres/9.2/data')"
> Jan 31 16:58:11 server0 pgpool[2723]: 1st stage is done
> Jan 31 16:58:11 server0 pgpool[2723]: check_postmaster_started: try to
> connect to postmaster on hostname:server1 database:postgres user:postgres
> (retry 0 times)
> Jan 31 16:58:11 server0 pgpool[2723]: check_postmaster_started: failed to
> connect to postmaster on hostname:server1 database:postgres user:postgres
> Jan 31 16:58:13 server0 pgpool[2719]: connection received:
> host=server0.local port=58446
> Jan 31 16:58:14 server0 pgpool[2723]: check_postmaster_started: try to
> connect to postmaster on hostname:server1 database:postgres user:postgres
> (retry 1 times)
> Jan 31 16:58:14 server0 pgpool[2723]: check_postmaster_started: failed to
> connect to postmaster on hostname:server1 database:postgres user:postgres
> Jan 31 16:58:14 server0 pgpool[2719]: connection received:
> host=server1.local port=39928
> Jan 31 16:58:17 server0 pgpool[2723]: check_postmaster_started: try to
> connect to postmaster on hostname:server1 database:postgres user:postgres
> (retry 2 times)
> Jan 31 16:58:17 server0 pgpool[2723]: check_postmaster_started: failed to
> connect to postmaster on hostname:server1 database:postgres user:postgres
> Jan 31 16:58:20 server0 pgpool[2723]: check_postmaster_started: try to
> connect to postmaster on hostname:server1 database:postgres user:postgres
> (retry 3 times)
> Jan 31 16:58:20 server0 pgpool[2723]: check_postmaster_started: failed to
> connect to postmaster on hostname:server1 database:postgres user:postgres
> Jan 31 16:58:23 server0 pgpool[2719]: connection received:
> host=server0.local port=58464
> Jan 31 16:58:23 server0 pgpool[2723]: check_postmaster_started: try to
> connect to postmaster on hostname:server1 database:template1 user:postgres
> (retry 0 times)
> Jan 31 16:58:23 server0 pgpool[2723]: check_postmaster_started: failed to
> connect to postmaster on hostname:server1 database:template1 user:postgres
> Jan 31 16:58:26 server0 pgpool[2723]: check_postmaster_started: try to
> connect to postmaster on hostname:server1 database:template1 user:postgres
> (retry 1 times)
> Jan 31 16:58:26 server0 pgpool[2723]: check_postmaster_started: failed to
> connect to postmaster on hostname:server1 database:template1 user:postgres
> Jan 31 16:58:26 server0 pgpool[2719]: connection received:
> host=server1.local port=39946
> Jan 31 16:58:29 server0 pgpool[2723]: check_postmaster_started: try to
> connect to postmaster on hostname:server1 database:template1 user:postgres
> (retry 2 times)
> Jan 31 16:58:29 server0 pgpool[2723]: check_postmaster_started: failed to
> connect to postmaster on hostname:server1 database:template1 user:postgres
> Jan 31 16:58:32 server0 pgpool[2723]: check_postmaster_started: try to
> connect to postmaster on hostname:server1 database:template1 user:postgres
> (retry 3 times)
> Jan 31 16:58:32 server0 pgpool[2723]: check_postmaster_started: failed to
> connect to postmaster on hostname:server1 database:template1 user:postgres
> Jan 31 16:58:33 server0 pgpool[2719]: connection received:
> host=server0.local port=58483
> Jan 31 16:58:35 server0 pgpool[2723]: check_postmaster_started: try to
> connect to postmaster on hostname:server1 database:template1 user:postgres
> (retry 4 times)
> Jan 31 16:58:35 server0 pgpool[2723]: check_postmaster_started: failed to
> connect to postmaster on hostname:server1 database:template1 user:postgres
> Jan 31 16:58:38 server0 pgpool[2723]: check_postmaster_started: try to
> connect to postmaster on hostname:server1 database:template1 user:postgres
> (retry 5 times)
> Jan 31 16:58:38 server0 pgpool[2723]: check_postmaster_started: failed to
> connect to postmaster on hostname:server1 database:template1 user:postgres
> 
> Here is the exact command I executed on server0 to recover server1:
> /usr/local/bin/pcp_recovery_node 10 localhost 9898 pgpool password 1
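
To make the positional arguments in that command explicit, here is a small sketch that splits it into named fields. It assumes the positional pcp_recovery_node syntax of the pgpool-II 3.x era (timeout, host, port, user, password, node ID); the class and function names are invented for illustration and are not part of pgpool.

```python
import shlex
from typing import NamedTuple


class PcpRecoveryArgs(NamedTuple):
    timeout: int   # seconds to wait for pgpool-II to respond on the PCP connection
    host: str      # host where pgpool-II is running
    port: int      # PCP port
    user: str      # PCP user name
    password: str  # PCP password
    node_id: int   # ID of the backend node to recover


def parse_pcp_recovery_node(command: str) -> PcpRecoveryArgs:
    """Split a positional pcp_recovery_node command line into named fields."""
    parts = shlex.split(command)
    if len(parts) != 7:
        raise ValueError(
            "expected: pcp_recovery_node timeout host port user password node_id"
        )
    _, timeout, host, port, user, password, node_id = parts
    return PcpRecoveryArgs(int(timeout), host, int(port), user, password, int(node_id))
```

Parsed this way, the command above asks the pgpool on localhost (PCP port 9898) to recover node 1, which is presumably the backend configured for server1 in pgpool.conf.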
> 
> Do you have any idea why?
> 
> Just FYI, we cannot use pgpoolAdmin in our environment.
> 
> 
> On Sun, Feb 17, 2013 at 12:13 AM, Tatsuo Ishii <ishii at postgresql.org> wrote:
> 
>> > Hi all,
>> > I have the following questions regarding the recovery of a failed primary
>> > database server.
>> >
>> > Question 1: In the documentation, under the Streaming Replication Online
>> > Recovery section:
>> >
>> > http://www.pgpool.net/docs/latest/pgpool-en.html#stream
>> >
>> > in step 6:
>> >
>> >    1. After completing online recovery, pgpool-II will start PostgreSQL on
>> >    the standby node. Install the script for this purpose on each DB
>> >    nodes. Sample script
>> >    <http://www.pgpool.net/docs/latest/pgpool_remote_start> is
>> >    included in "sample" directory of the source code. This script uses ssh.
>> >    You need to allow recovery_user to login from the primary node to the
>> >    standby node without being asked password.
>> >
>> > To my understanding, PostgreSQL does not need to be online for the
>> > recovery process, right? Later on, the documentation mentions that
>> > pgpool_remote_start will start up the DB on the failed node.
>>
>> Actually, the standby PostgreSQL node should not be started.
>>
>> > Question 2: In my configuration, I have 2 pgpool servers with two backends.
>> > Will online recovery work?
>>
>> Yes, but the online recovery process should be initiated by one of the
>> pgpool instances, not both. If you enable pgpool-II 3.2's watchdog, it
>> will take care of the necessary interlocking.
>>
>> > Question 3: When the failed node comes back online, should I use
>> > pcp_recovery_node from the DB primary, or should I use pcp_attach_node on
>> > the failed node to recover the failed system? Actually, in my case neither
>> > method recovers my system every time.
>>
>> I'm confused. Didn't you start the online recovery process by using
>> pcp_recovery_node? (Of course, you could also do it via pgpoolAdmin.)
>>
>> Anyway, pcp_recovery_node automatically attaches the recovered node, so
>> you don't need to execute pcp_attach_node.
>>
>> I suggest you read the tutorial:
>> http://www.pgpool.net/pgpool-web/contrib_docs/simple_sr_setting2/index.html
>> --
>> Tatsuo Ishii
>> SRA OSS, Inc. Japan
>> English: http://www.sraoss.co.jp/index_en.php
>> Japanese: http://www.sraoss.co.jp
>>

