[Pgpool-general] Online recovery by PITR questions

Sun May 4 06:24:49 UTC 2008

Hi,

From: Nico -telmich- Schottelius <nico-pgpool at schottelius.org>
Subject: [Pgpool-general] Online recovery by PITR questions
Date: Fri, 2 May 2008 16:03:30 +0200

> I am currently trying to integrate online recovery with pgpool and I am
> a bit confused by your script:

Here is a revised version. The recovery script takes three arguments.

--------------------------------------------------------------------------------
#! /bin/sh

# PGDATA path for master node
DATA=$1

# Recovery host name
RECOVERY_TARGET=$2

# PGDATA path for recovery node
RECOVERY_DATA=$3

# ok, this needs to be issued on some running backend, so I
# personally use pcp_node_count and pcp_node_info to get the
# ip address of a master (script follows, as soon as the
# cluster is running well)
psql -c "select pg_start_backup('pgpool-recovery')" postgres

# I would favor /tmp/pgpool-recovery instead of /data,
# as it contains live data
echo "restore_command = 'scp $HOSTNAME:/data/archive_log/%f %p'" > /data/recovery.conf

# I guess pgsql is the datadir, which for me is
# /var/lib/postgresql/8.3/main under debian?
tar -C /data -zcf pgsql.tar.gz pgsql

psql -c 'select pg_stop_backup()' postgres
scp pgsql.tar.gz $RECOVERY_TARGET:/data
--------------------------------------------------------------------------------

> The whole script seems to imply that the master is running on the pgpool2 server,
> which may not be the fact.
> 
> How will pgpool2 call copy-base-backup?

The pgpool_recovery function calls copy-base-backup. Its source code is
in the pgpool-II/sql/pgpool-recovery directory.

pgpool calls it in the following steps.

1. Pgpool connects to the master node.

2. Pgpool executes "CHECKPOINT" on the master node.

3. Pgpool executes "SELECT pgpool_recovery('recovery-script',
   'target', 'PGDATA')" on the master node.

4. The pgpool_recovery function executes the 1st stage script by system(3).
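
As a rough illustration of the steps above, here is a small sketch that
only prints the statements pgpool would send to the master node. The
function name recovery_stage_sql and the host name in the example call
are made up for illustration; the values are not escaped, so it assumes
they contain no single quotes.

```sh
#! /bin/sh
# Sketch only: print the SQL that is sent to the master node for one
# recovery stage. Nothing is executed against a database here.

# $1 = recovery script name (e.g. copy-base-backup)
# $2 = recovery target host name
# $3 = PGDATA path on the recovery node
recovery_stage_sql() {
    printf "CHECKPOINT;\n"
    printf "SELECT pgpool_recovery('%s', '%s', '%s');\n" "$1" "$2" "$3"
}

# Example call with placeholder values (hypothetical host and path).
recovery_stage_sql copy-base-backup node2.example.org /data/pgsql
```

The output could be piped into psql on the master node by hand to see
what the 1st stage script does, before letting pgpool drive it.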

> And has it to be deployed on all database backends?

Yes.

> Where is the pgpool_recovery_pitr script executed?

On the master node.

> And where is the pgsql.tar.gz used that was created in the first
> stage?

In the pgpool documentation, the pgpool_remote_start script expands the
gzip file. If you use rsync instead of tar and scp, you don't need the
gzip file.

----
#! /bin/sh
DEST=$1
DESTDIR=$2
PGCTL=/usr/local/pgsql/bin/pg_ctl

# Expand a base backup
ssh -T $DEST 'cd /data/; tar zxf pgsql.tar.gz' 2>/dev/null 1>/dev/null < /dev/null
# Startup PostgreSQL server
ssh -T $DEST $PGCTL -w -D $DESTDIR start 2>/dev/null 1>/dev/null < /dev/null &
----
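
For the rsync alternative mentioned above, a minimal sketch could look
like the following. It only prints the rsync command so that it can be
reviewed first. The function name rsync_base_backup_cmd, the host name
and the choice of excluding only postmaster.pid are my assumptions, not
something taken from the pgpool sample scripts. The copy would still
have to run between pg_start_backup() and pg_stop_backup().

```sh
#! /bin/sh
# Sketch only: build an rsync command that copies the master's PGDATA
# directly to the recovery node, instead of using tar and scp.

# $1 = PGDATA path on the master node
# $2 = recovery target host name
# $3 = PGDATA path on the recovery node
rsync_base_backup_cmd() {
    # -a keeps permissions and timestamps, --delete removes stale files
    # on the target, and postmaster.pid must not be copied over.
    printf 'rsync -a --delete -e ssh --exclude postmaster.pid %s/ %s:%s/\n' \
        "$1" "$2" "$3"
}

# Example call with placeholder values (hypothetical host and paths).
rsync_base_backup_cmd /data/pgsql node2.example.org /data/pgsql
```

With this approach pgpool_remote_start only needs to start the server
on the recovery node, because there is no archive to expand.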

> When and from where will pgpool_remote_start be called?

It is called on the master node after the second stage.

> From the manual, seen in the example with pgpool_recovery,
>  it seems that pgpool_remote_start
> scripts get two parameters, but I am not sure where it gets called
> 
> And from the sample directory it seems the
> recovery_1st_stage_command and recovery_2nd_stage_command
> get three parameters, correct?

Yes.
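
Since both stage scripts receive three arguments, it may help to check
them at the top of each script. This is only a sketch; the function name
check_recovery_args and the wording of the usage message are made up.

```sh
#! /bin/sh
# Sketch only: verify that a recovery stage script got its three
# arguments (master PGDATA, recovery host, recovery node PGDATA).

check_recovery_args() {
    if [ "$#" -ne 3 ]; then
        echo "usage: recovery-script MASTER_PGDATA RECOVERY_HOST RECOVERY_PGDATA" >&2
        return 1
    fi
    return 0
}
```

In a real script you would call it as check_recovery_args "$@" || exit 1
before assigning DATA, RECOVERY_TARGET and RECOVERY_DATA.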

> And is it no problem to do rsync recovery from a running database
> server?

It is no problem. PITR does not require stopping the database server.

> And when and where is online-recovery triggered?

You need to execute the pcp_recovery_node command to start online recovery.
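
As a sketch, the command line could be built like this. I am assuming
the positional argument order of the current pcp tools (timeout, host,
port, user, password, node ID), so please check the usage message of
your pcp_recovery_node before relying on it. The function name
pcp_recovery_cmd and all values in the example call are placeholders.

```sh
#! /bin/sh
# Sketch only: print a pcp_recovery_node command line.
# Assumed argument order: timeout host port user password node-id.

pcp_recovery_cmd() {
    printf 'pcp_recovery_node %s %s %s %s %s %s\n' \
        "$1" "$2" "$3" "$4" "$5" "$6"
}

# Example call: recover node 1 through the pcp port of pgpool
# (hypothetical host, user and password).
pcp_recovery_cmd 20 pgpool.example.org 9898 pcpuser secret 1
```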

> And how does it relate to the failover and fallback command?

You don't need to set these parameters.

> Is attaching automatically done?

Yes.

> Sorry for all the questions, I am a bit confused.

No problem!

Ishii-san and I will attend PGCon 2008.

http://lists.pgfoundry.org/pipermail/pgpool-general/2008-April/000996.html

If you or anyone else has any questions, let's talk at PGCon.
(I am studying English conversation.)

Regards,
--
Yoshiyuki Asaba
y-asaba at sraoss.co.jp