[Pgpool-general] Error during online recovery with pgpool!!!

Lazaro Rubén García Martinez lgarciam at vnz.uci.cu
Thu May 26 21:26:00 UTC 2011


Hello everyone on the list. When I run an online recovery with pgpool, the recovery itself often completes successfully, but the PostgreSQL log file on the server being recovered contains many error lines. If the node is recovered correctly, why does the PostgreSQL log file contain these error lines?


The base-backup script used in the first stage of recovery is this:

#!/bin/sh 

DATA=$1
RECOVERY_TARGET=$2
RECOVERY_DATA=$3
SSH=/usr/bin/ssh
RSYNC="/usr/bin/rsync -arz --rsh=$SSH --delete"
SCP=/usr/bin/scp

PGENGINE=/usr/bin
PGCTL=$PGENGINE/pg_ctl
SED=/bin/sed
PSQL=/usr/bin/psql
PGCONF="postgresql.conf"
PGCONFTMP=/media/pgsql

PGSQLPATH=/media/pgsql
mkdir $PGSQLPATH/prueba
$SED -r -e "s/\s*archive_command\s*=.*/archive_command = '\/bin\/cp %p \/media\/pgsql\/pg_xlog_archive\/%f'/" $RECOVERY_DATA/$PGCONF > $PGSQLPATH/prueba/$PGCONF
chmod 644 $PGSQLPATH/prueba/$PGCONF
mv --force $PGSQLPATH/prueba/$PGCONF $RECOVERY_DATA/

rm -Rf $PGSQLPATH/prueba
$PGCTL reload -D $RECOVERY_DATA -s > /dev/null        

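# Mark the start of the base backup on the master.
psql -c "select pg_start_backup('pgpool-recovery')" postgres
# Write the recovery.conf that the recovered node uses to fetch archived WAL files.
echo "restore_command = '$RSYNC 10.13.4.181:/media/pgsql/pg_xlog_archive/%f %p'" > /media/pgsql/data/recovery.conf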

$RSYNC $DATA/base/ $RECOVERY_TARGET:$RECOVERY_DATA/base/

$RSYNC $DATA/global/ $RECOVERY_TARGET:$RECOVERY_DATA/global/ 

$RSYNC $DATA/pg_clog/ $RECOVERY_TARGET:$RECOVERY_DATA/pg_clog/ 

$RSYNC $DATA/pg_multixact/ $RECOVERY_TARGET:$RECOVERY_DATA/pg_multixact/ 

$RSYNC $DATA/pg_subtrans/ $RECOVERY_TARGET:$RECOVERY_DATA/pg_subtrans/ 

$RSYNC $DATA/pg_tblspc/ $RECOVERY_TARGET:$RECOVERY_DATA/pg_tblspc/ 

$RSYNC $DATA/pg_twophase/ $RECOVERY_TARGET:$RECOVERY_DATA/pg_twophase/ 
 
$RSYNC $DATA/pg_xlog/ $RECOVERY_TARGET:$RECOVERY_DATA/pg_xlog/ 

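# Mark the end of the base backup on the master.
psql -c "select pg_stop_backup()" postgres

# Ship recovery.conf to the node being recovered, then remove the local copy.
$RSYNC $DATA/recovery.conf $RECOVERY_TARGET:$RECOVERY_DATA/
rm $DATA/recovery.conf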

exit 0
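
The per-directory rsync calls in the script above all follow one pattern. As a minimal sketch (not part of the original script), the hypothetical function print_sync_commands below only prints the commands that a loop over the same subdirectories would run, so the list can be checked without copying any data. The function name and the example arguments are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) one rsync command per subdirectory
# that the base-backup script copies. The arguments mirror the script's
# $1 (source data directory), $2 (target host) and $3 (target data directory).
print_sync_commands() {
    data=$1
    target=$2
    target_data=$3
    for dir in base global pg_clog pg_multixact pg_subtrans pg_tblspc pg_twophase pg_xlog; do
        echo "/usr/bin/rsync -arz --rsh=/usr/bin/ssh --delete $data/$dir/ $target:$target_data/$dir/"
    done
}
```

For example, print_sync_commands /media/pgsql/data node2 /media/pgsql/data prints eight commands, one per directory; node2 is a placeholder host name, not one taken from this setup.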

The pgpool_recovery_pitr script used in the second stage of recovery is this:

#! /bin/sh
# Online recovery 2nd stage script
#
datadir=$1		# master database cluster
DEST=$2			# hostname of the DB node to be recovered
DESTDIR=$3		# database cluster of the DB node to be recovered
#port=5432		# PostgreSQL port number
archdir=/media/pgsql/pg_xlog_archive	# archive log directory

# Force to flush current value of sequences to xlog 
psql -U postgres -t -c 'SELECT datname FROM pg_database WHERE NOT datistemplate AND datallowconn' template1|
while read i
do
  if [ "$i" != "" ];then
    psql -U postgres -c "SELECT setval(oid, nextval(oid)) FROM pg_class WHERE relkind = 'S'" $i
  fi
done

psql -U postgres -c "SELECT pgpool_switch_xlog('$archdir')" template1
PGENGINE=/usr/bin
PGCTL=$PGENGINE/pg_ctl
SED=/bin/sed
PSQL=/usr/bin/psql
PGCONF="postgresql.conf"
PGCONFTMP=/media/pgsql
PGSQLPATH=/media/pgsql
mkdir $PGSQLPATH/prueba

$SED -r -e "s/\s*archive_command\s*=.*/archive_command = 'exit 0'/" $DESTDIR/$PGCONF > $PGSQLPATH/prueba/$PGCONF
chmod 644 $PGSQLPATH/prueba/$PGCONF
mv --force $PGSQLPATH/prueba/$PGCONF $DESTDIR/

rm -Rf $PGSQLPATH/prueba

$PGCTL reload -D $DESTDIR -s > /dev/null
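
Both scripts rewrite archive_command in postgresql.conf with the same kind of sed substitution. The sketch below runs the second-stage substitution on two sample lines so its effect can be seen in isolation. It is an illustration only: the function name disable_archiving and the sample input are assumptions, and [[:space:]] with -E is used in place of the script's \s with -r so that the example also runs with a non-GNU sed.

```shell
#!/bin/sh
# Hypothetical helper: read postgresql.conf text on stdin and rewrite every
# archive_command setting to 'exit 0', as the second-stage script does.
# [[:space:]] stands in for the GNU-only \s used in the original script.
disable_archiving() {
    sed -E "s/[[:space:]]*archive_command[[:space:]]*=.*/archive_command = 'exit 0'/"
}

# Sample input: one active setting and one commented-out setting.
printf '%s\n' \
    "archive_command = '/bin/cp %p /media/pgsql/pg_xlog_archive/%f'" \
    "#archive_command = ''" | disable_archiving
```

Because the pattern is not anchored to the start of the line, a commented-out archive_command line is rewritten as well, but it keeps its leading # and therefore stays inactive.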

The PostgreSQL log file is this:

LOG:  database system was interrupted; last known up at 2011-05-25 00:02:39 VET
LOG:  starting archive recovery
LOG:  restore_command = '/usr/bin/rsync -arz --rsh=/usr/bin/ssh --delete 10.13.4.181:/media/pgsql/pg_xlog_archive/%f %p'
rsync: link_stat "/media/pgsql/pg_xlog_archive/00000001.history" failed: No such file or directory (2)
rsync error: some files could not be transferred (code 23) at main.c(1298) [receiver=2.6.8]
FATAL:  the database system is starting up
LOG:  restored log file "000000010000000000000007" from archive
LOG:  automatic recovery in progress
FATAL:  the database system is starting up
LOG:  redo starts at 0/7000060
FATAL:  the database system is starting up
LOG:  restored log file "000000010000000000000008" from archive
FATAL:  the database system is starting up
rsync: link_stat "/media/pgsql/pg_xlog_archive/000000010000000000000009" failed: No such file or directory (2)
rsync error: some files could not be transferred (code 23) at main.c(1298) [receiver=2.6.8]
LOG:  could not open file "pg_xlog/000000010000000000000009" (log file 0, segment 9): No existe el fichero o el directorio
LOG:  redo done at 0/8000060
FATAL:  the database system is starting up
LOG:  restored log file "000000010000000000000008" from archive
rsync: link_stat "/media/pgsql/pg_xlog_archive/00000002.history" failed: No such file or directory (2)
rsync error: some files could not be transferred (code 23) at main.c(1298) [receiver=2.6.8]
LOG:  selected new timeline ID: 2
rsync: link_stat "/media/pgsql/pg_xlog_archive/00000001.history" failed: No such file or directory (2)
rsync error: some files could not be transferred (code 23) at main.c(1298) [receiver=2.6.8]
FATAL:  the database system is starting up
FATAL:  the database system is starting up
LOG:  archive recovery complete
FATAL:  the database system is starting up
LOG:  database system is ready to accept connections
LOG:  autovacuum launcher started
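
The error lines in this log fall into two visible groups: rsync "link_stat ... failed" messages printed by the restore_command, and "FATAL:  the database system is starting up" messages. As a rough sketch for counting each group in a log like this one, the hypothetical function summarize_recovery_log below reads the log on stdin; the function name and the output labels are assumptions, not something provided by pgpool or PostgreSQL.

```shell
#!/bin/sh
# Hypothetical helper: read a PostgreSQL log on stdin and count the lines
# belonging to each group of messages seen during the recovery.
summarize_recovery_log() {
    awk '
        /^rsync: link_stat .* failed/                   { missing++ }
        /^FATAL: +the database system is starting up/   { starting++ }
        /^LOG: +restored log file/                      { restored++ }
        END {
            printf "rsync_missing_file=%d\n", missing
            printf "fatal_starting_up=%d\n", starting
            printf "restored_segments=%d\n", restored
        }
    '
}
```

It can be run as summarize_recovery_log < postgresql.log, where postgresql.log is a placeholder for the actual log file name.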

I look forward to your answer.

Regards and thank you very much for your time.



