[Pgpool-general] Not able to recover a node using pgpool recovery

DM dm.aeqa at gmail.com
Tue Mar 17 18:37:52 UTC 2009


Hi Gerd,

(To answer your question: I executed all the commands in the base-backup
script manually, without any errors.)

Here is what I did from the beginning to recover a database. (I copied the
scripts base-backup.sh, pgpool_remote_start.sh and pgpool-recovery.sh to the
data folders of both machines. I also made the configuration changes to
postgres and pgpool.)

Step 1:
I executed the command below:
pcp_recovery_node -d 10.80.16.16 9898 pgpuser <password> 1

I got an error at pgpool_recovery:
2009-03-13 18:07:26 LOG:   pid 17559: starting recovery command: "SELECT pgpool_recovery('base-backup', 'fdbr-res0002', '/mnt/data/pns/pgsql8.3.6')"
2009-03-13 18:07:26 ERROR: pid 17559: exec_recovery: base-backup command failed at 1st stage
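
One thing worth checking here: the recovery command above passes the name
'base-backup', while the file copied to the data folder is base-backup.sh.
Assuming pgpool_recovery looks for the script under exactly the name it is
given, inside the data directory, a mismatch like this would be enough to make
the 1st stage fail. The sketch below is only an illustration of that check;
check_recovery_script is a hypothetical helper, and the demo uses a scratch
directory rather than the real /mnt/data/pns/pgsql8.3.6.

```shell
#!/bin/sh
# Hypothetical helper: verify that a recovery script exists under the exact
# name handed to pgpool_recovery and that it is executable.
# Assumption: the script is looked up as <data_dir>/<script_name>.
check_recovery_script() {
    data_dir=$1
    script_name=$2
    path="$data_dir/$script_name"
    if [ ! -f "$path" ]; then
        echo "missing: $path"
        return 1
    fi
    if [ ! -x "$path" ]; then
        echo "not executable: $path"
        return 1
    fi
    echo "ok: $path"
}

# Demo in a scratch directory standing in for the data directory.
scratch=$(mktemp -d)
printf '#!/bin/sh\nexit 0\n' > "$scratch/base-backup.sh"
chmod +x "$scratch/base-backup.sh"

check_recovery_script "$scratch" base-backup.sh
check_recovery_script "$scratch" base-backup || true
```

On the real machine the two arguments would be the data directory and the
script name exactly as it appears in the recovery command.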

I started debugging the issue.

Step 2:
I logged into the database on the first machine, fdbr-res0001:

psql -p 2345 -U postgres -d template1

(template1, because I installed the pgpool_recovery function in template1 as
per the documentation.)

I executed the SQL on db1 (fdbr-res0001):

template1=# SELECT pgpool_recovery('base-backup.sh', 'fdbr-res0002',
'/mnt/data/pns/pgsql8.3.6');
 pgpool_recovery
-----------------
 t
(1 row)


I checked the postgres log file and saw the following:

postgres|template1|localhost.localdomain(62302)|14229|2009-03-16
15:11:58.887 PDT|idle|49becea6.3795|3|2009-03-16 15:11:50 PDT|0LOG:
statement: SELECT pgpool_recovery('base-backup.sh', 'fdbr-res0002',
'/mnt/data/pns/pgsql8.3.6');
[unknown]|[unknown]||14234|2009-03-16 15:11:58.891
PDT|/usr/postgres/8.3.6/bin/postgres|49beceae.379a|1|2009-03-16 15:11:58
PDT|0LOG:  connection received: host=[local]
 database=postgres9-03-16 15:11:58.891
PDT|authentication|49beceae.379a|2|2009-03-16 15:11:58 PDT|0LOG:  connection
authorized: user=postgres
" does not exist009-03-16 15:11:58.892
PDT|startup|49beceae.379a|3|2009-03-16 15:11:58 PDT|0FATAL:  database
"postgres
" does not existtabase "postgres
: command not found8.3.6/base-backup.sh: line 2:
: command not found8.3.6/base-backup.sh: line 4:
tar: Removing leading `/' from member names
tar: /mnt/data/pns/pgsql8.3.6\r: Cannot stat: No such file or directory
tar: Error exit delayed from previous errors
: command not found8.3.6/base-backup.sh: line 6:
[unknown]|[unknown]||14241|2009-03-16 15:11:58.898
PDT|/usr/postgres/8.3.6/bin/postgres|49beceae.37a1|1|2009-03-16 15:11:58
PDT|0LOG:  connection received: host=[local]
 database=postgres9-03-16 15:11:58.898
PDT|authentication|49beceae.37a1|2|2009-03-16 15:11:58 PDT|0LOG:  connection
authorized: user=postgres
" does not exist009-03-16 15:11:58.898
PDT|startup|49beceae.37a1|3|2009-03-16 15:11:58 PDT|0FATAL:  database
"postgres
" does not existtabase "postgres
: command not found8.3.6/base-backup.sh: line 8:
postgres|template1|localhost.localdomain(62302)|14229|2009-03-16
15:11:59.061 PDT|SELECT|49becea6.3795|4|2009-03-16 15:11:50 PDT|0LOG:
duration: 174.224 ms
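
A note on reading this log: the tar line shows the path as
/mnt/data/pns/pgsql8.3.6\r, and the ": command not found" messages are
reported for lines 2, 4, 6 and 8, which are the blank lines of the
base-backup.sh quoted further down in this thread. Both are what one would
expect if the script was saved with Windows (CRLF) line endings: every line
then ends in a carriage return, which also pushes the cursor back to column 0
and makes the log text overwrite itself. That would also explain why typing
the same commands by hand works. This is an inference from the log, not
something confirmed in the thread. A minimal sketch for checking and cleaning
a script, where has_cr and strip_cr are hypothetical helper names and the demo
runs on a throwaway file:

```shell
#!/bin/sh
# Hypothetical helper: report whether a file contains carriage returns.
has_cr() {
    # Command substitution strips trailing newlines only, so cr holds "\r".
    cr=$(printf '\r')
    if grep -q "$cr" "$1"; then
        echo "CRLF"
    else
        echo "LF"
    fi
}

# Hypothetical helper: write a copy of a file with every CR removed.
strip_cr() {
    tr -d '\r' < "$1" > "$2"
}

# Demo on a throwaway file, not on the real base-backup.sh.
demo=$(mktemp)
fixed=$(mktemp)
printf 'echo one\r\n\r\necho two\r\n' > "$demo"

has_cr "$demo"
strip_cr "$demo" "$fixed"
has_cr "$fixed"
```

If the real script turns out to contain carriage returns, the cleaned copy
would need to replace the original in the data folder and keep its execute
permission.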


Step 3:

I logged into the database on the second machine (fdbr-res0002) and checked
the db; no recovery had been done.

I am not able to figure out where I am going wrong.

Thanks
Deepak Murthy





2009/3/17 Gerd König <koenig at transporeon.com>

> Hello Deepak,
>
> I don't think you have to call "pgpool_recovery" if you want to perform the
> recovery manually. The log entries are completely mixed up and I've no idea
> what's going wrong there...
> Try to call the commands from your base-backup script manually and check if
> it's working, or at which step the error occurs.
>
> please keep me up-to-date, regards.....GERD.....
>
>
>
> DM wrote:
> > Hi Gerd,
> >
> > Thanks for looking into this issue.
> >
> > # I would favor /tmp/pgpool-recovery instead of /data,
> > # as it contains live data
> >
> > Please ignore the comment; I copied the script and modified it
> > according to my system configuration.
> >
> > "Are you able to perform a scp command without typing a password (->ssh
> > key-exchange) ?"
> > Yes, I was able to scp without typing a password.
> >
> > I replaced RECOVERY_TARGET with the IP address in the script; here is
> > what I got.
> >
> > Here is my base-backup.sh script:
> > #########
> > /usr/postgres/8.3.6/bin/psql -p 2345 -c "select pg_start_backup('pgpool-recovery')" -U postgres
> >
> > echo "restore_command = 'scp $HOSTNAME:/mnt/data/pns/pgsql8.3.6/archive_log/%f %p'" > /mnt/data/pns/pgsql8.3.6/recovery.conf
> >
> > tar -zcf /mnt/data/pns/pgsql8.3.6.tar.gz /mnt/data/pns/pgsql8.3.6
> >
> > /usr/postgres/8.3.6/bin/psql -p 2345 -c 'select pg_stop_backup()' -U postgres
> >
> > scp /mnt/data/pns/pgsql8.3.6.tar.gz 10.80.16.17:/mnt/data/pns/
> >
> > #########
> >
> > Step 1:
> > I logged into Postgres
> >
> > /usr/postgres/8.3.6/bin/psql -p 2345 -d template1
> >
> > Step 2: Executed the below command
> > template1=# SELECT pgpool_recovery('base-backup.sh', 'fdbr-res0002',
> > '/mnt/data/pns/pgsql8.3.6');
> >  pgpool_recovery
> > -----------------
> >  t
> > (1 row)
> >
> > But the postgres log file showed the error messages below.
> >
> > Here is the postgres log:
> >
> > postgres|template1|localhost.localdomain(62302)|14229|2009-03-16
> > 15:11:58.887 PDT|idle|49becea6.3795|3|2009-03-16 15:11:50 PDT|0LOG:
> > statement: SELECT pgpool_recovery('base-backup.sh', 'fdbr-res0002',
> > '/mnt/data/pns/pgsql8.3.6');
> > [unknown]|[unknown]||14234|2009-03-16 15:11:58.891
> > PDT|/usr/postgres/8.3.6/bin/postgres|49beceae.379a|1|2009-03-16 15:11:58
> > PDT|0LOG:  connection received: host=[local]
> >  database=postgres9-03-16 15:11:58.891
> > PDT|authentication|49beceae.379a|2|2009-03-16 15:11:58 PDT|0LOG:
> > connection authorized: user=postgres
> > " does not exist009-03-16 15:11:58.892
> > PDT|startup|49beceae.379a|3|2009-03-16 15:11:58 PDT|0FATAL:  database
> > "postgres
> > " does not existtabase "postgres
> > : command not found8.3.6/base-backup.sh: line 2:
> > : command not found8.3.6/base-backup.sh: line 4:
> > tar: Removing leading `/' from member names
> > tar: /mnt/data/pns/pgsql8.3.6\r: Cannot stat: No such file or directory
> > tar: Error exit delayed from previous errors
> > : command not found8.3.6/base-backup.sh: line 6:
> > [unknown]|[unknown]||14241|2009-03-16 15:11:58.898
> > PDT|/usr/postgres/8.3.6/bin/postgres|49beceae.37a1|1|2009-03-16 15:11:58
> > PDT|0LOG:  connection received: host=[local]
> >  database=postgres9-03-16 15:11:58.898
> > PDT|authentication|49beceae.37a1|2|2009-03-16 15:11:58 PDT|0LOG:
> > connection authorized: user=postgres
> > " does not exist009-03-16 15:11:58.898
> > PDT|startup|49beceae.37a1|3|2009-03-16 15:11:58 PDT|0FATAL:  database
> > "postgres
> > " does not existtabase "postgres
> > : command not found8.3.6/base-backup.sh: line 8:
> > postgres|template1|localhost.localdomain(62302)|14229|2009-03-16
> > 15:11:59.061 PDT|SELECT|49becea6.3795|4|2009-03-16 15:11:50 PDT|0LOG:
> > duration: 174.224 ms
> >
> >
> > I do not understand how to solve this issue. Please guide me if I am
> > doing something wrong here.
> >
> > Thanks
> > Deepak Murthy
> >
> > On Sat, Mar 14, 2009 at 12:27 PM, Gerd Koenig <koenig at transporeon.com>
> > wrote:
> >
> >     Hello Deepak,
> >
> >     DM wrote:
> >
> >         Also here is the base-backup script.
> >
> >     # I would favor /tmp/pgpool-recovery instead of /data,
> >     # as it contains live data
> >
> >     I don't understand your comment. You have to sync your $PGDATA
> >     directory, the directory which includes several directories like
> >     "base", "global", "pg_clog" and files like pg_hba.conf, postgresql.conf
> >
> >
> >         Do I need to set these environment variables:
> >         RECOVERY_TARGET, RECOVERY_DATA?
> >
> >
> >     I don't think so, because e.g. RECOVERY_DATA doesn't appear in the
> >     script (except for setting it). Try setting real values for
> >     $HOSTNAME, $RECOVERY_TARGET for testing.
> >     Since the logs don't tell much, how about calling the individual
> >     commands manually to locate the error?
> >     Are you able to perform a scp command without typing a password
> >     (->ssh key-exchange) ?
> >
> >     Are there some entries in postgres.log on the node you want to
> >     create the checkpoint ?
> >
> >     regards....GERD....
> >
> >
> >
> >         On Fri, Mar 13, 2009 at 7:04 PM, DM <dm.aeqa at gmail.com> wrote:
> >
> >            Hi All,
> >
> >            Not able to recover node using pgpool recovery
> >
> >            # I did all the configuration changes - Gerd Koenig's document
> >
> >            pcp_recovery_node -d 10.80.16.16 9898 pgpuser xyz 1
> >
> >            2009-03-13 18:07:26 LOG:   pid 17559: starting recovering node 1
> >            2009-03-13 18:07:26 LOG:   pid 17559: CHECKPOINT in the 1st stage done
> >            2009-03-13 18:07:26 LOG:   pid 17559: starting recovery command: "SELECT pgpool_recovery('base-backup', 'fdbr-res0002', '/mnt/data/pns/pgsql8.3.6')"
> >            2009-03-13 18:07:26 ERROR: pid 17559: exec_recovery: base-backup command failed at 1st stage
> >            2009-03-13 18:10:01 LOG:   pid 17559: starting recovering node 1
> >
> >
> >            #####################################################
> >
> >            I logged to psql on fdbr-res0001
> >
> >            #===========
> >            [postgres at fdbr-res0001 scripts]$ /usr/postgres/8.3.6/bin/psql -p9999
> >            Welcome to psql 8.3.6, the PostgreSQL interactive terminal.
> >
> >            Type:  \copyright for distribution terms
> >                   \h for help with SQL commands
> >                   \? for help with psql commands
> >                   \g or terminate with semicolon to execute query
> >                   \q to quit
> >
> >            postgres=# \c template1
> >            You are now connected to database "template1".
> >            template1=# SELECT pgpool_recovery('base-backup', 'fdbr-res0002',
> >            '/mnt/data/pns/pgsql8.3.6');
> >            ERROR:  pgpool_recovery failed
> >            template1=#
> >            #===========
> >
> >            Not able to figure out how to solve this issue
> >            ERROR:  pgpool_recovery failed
> >
> >            Help me with this error
> >
> >            Thanks
> >            Deepak Murthy
> >
> >
> >
> >
> >         ------------------------------------------------------------------------
> >
> >         _______________________________________________
> >         Pgpool-general mailing list
> >         Pgpool-general at pgfoundry.org
> >         http://pgfoundry.org/mailman/listinfo/pgpool-general
> >
> >
> >
>
> --
> /===============================\
> | Gerd König
> | - Infrastructure -
> |
> | TRANSPOREON GmbH
> | Pfarrer-Weiss-Weg 12
> | DE - 89077 Ulm
> |
> |
> | Tel: +49 [0]731 16906 16
> | Fax: +49 [0]731 16906 99
> | Web: www.transporeon.com
> |
> \===============================/
>
>
>
>
> TRANSPOREON GmbH, Ulm District Court (Amtsgericht Ulm), HRB 722056
> Managing directors: Axel Busch, Peter Förster, Roland Hötzl, Marc-Oliver Simon
>