[pgpool-general: 3990] Re: Archive Cleanup

Thu Aug 20 06:02:46 JST 2015

I could not respond earlier.

I have my test environments setup with a crontab to delete the WALs after
a day; but the production environment, I kick off basebackups daily,
archive that and the daily wals to the same off-site location minimally.

Postgres Streaming Replication (ŒSR¹) and Hot Standby are outgrowths of
PITR (point in time recovery), an evolution of functionality and it¹s a
bit messy, but the documentation is excellent.  In a case of SR and hot
standby where you don¹t care about PITR backups, you are effectively
orphaning the WALs created from the archive_command, but it¹s OK.  The SR
WALs are created by 'wal_level = hot_standby¹ which also gives you
Œarchive¹ whether you want it or not.  With wal_level set to archive or
higher (hot_standby), all the WALs are collecting in pg_xlog on the
primary, but you are not supposed to touch files in there or you will
corrupt your DB, so you have to use the archive_command to purge pg_xlog.
You don¹t seem to have a need for PITR so you can just nuke your archived
WALs as long as the standby is keeping up with SR.   It¹s not ideal for SR
/ hot standby to have all this complexity of PITR if you don¹t want PITR,
but it¹s how the functionality organically grew.

Pgpool has no direct interaction with WALs and basebackup.  Pgpool is
acting as a semi-automation tool to initiate postgresql SR and hot standby
to the standby; the functionality of the SR and hot standby is pure
postgresql.  Pgpool does not touch the data (basebackup and WALs).  Pgpool
uses a bunch of Œglue¹ to Œrecover¹ a SR/hot standby, but it doesn¹t try
to mangle postgres¹s data.  It uses other glue to cause the standby to
Œtouch¹ the recovery trigger on the standby node to cause it to become
primary, when things don¹t look so good on the old primary.

What happens to the first basebackup that pgpool helped be created (pgpool
ran some clever postgresql extension and then some shell, eventually
running pg_basebackup) to get me to SR/hot standby, if I also want to
create daily basebackups for PITR?  I re-read the postgresql docs many
time before the meaning of 'backup history file¹ sunk in.  The answer is
nothing.  The original basebackup has a history file with WALs that age
back to the history file.  The next basebackup creates a backup history
file for WALs to correspond back to it.

"To recover successfully using continuous archiving (also called "online
backup" by many database vendors), you need a continuous sequence of
archived WAL files that extends back at least as far as the start time of
your backup." 

http://www.postgresql.org/docs/9.4/static/continuous-archiving.html

-dkw 

From:  Jonathan Eastgate <jonathan.eastgate at simpro.co>
Date:  Thursday, August 13, 2015 at 8:51 PM
To:  Darryl Wisneski <darryl.wisneski at finalsite.com>
Cc:  pgpool-general <pgpool-general at pgpool.net>
Subject:  Re: [pgpool-general: 3963] Re: Archive Cleanup

Hi guys.

Thanks for those replies.

Not quite what I'm questioning though so let me try and explain better.

When the server is acting as a primary - running streaming replication to
a standby - does PGpool require any of the older files in the Wal archive
directory prior to the execution of the basebackup.sh script during a
resync of the standby?

I ask this because during resync of a standby it appears the Master does
calls for base_backup - which would indicate the the slave node won't
require any Wal archives prior to that point?

Hope that makes it a little clearer?

Thanks. 

Jonathan J. Eastgate

Chief Technology Officer | simPRO Software Group
Ph: 1300 139 467    +61
 7 3147 8777

Keep up to date with simPRO at: simpro.co/blog <http://simpro.co/blog>
The contents of this email are subject to our email disclaimer
<http://simpro.com.au/legal/email-confidentiality-notice/>.

On Fri, Aug 14, 2015 at 2:03 AM, Darryl Wisneski
<darryl.wisneski at finalsite.com> wrote:

gzip in the archive_command helps with disk usage.

A daily basebackup out of cron would enable you to nuke WALs older than
the most recent basebackup¹s 'backup history file.'

Adjust your WAL and basebackup archive retention based on business
requirements.

-dkw

From: <pgpool-general-bounces at pgpool.net> on behalf of Jonathan Eastgate
<jonathan.eastgate at simpro.co>
Date: Wednesday, August 12, 2015 at 10:44 PM
To: pgpool-general <pgpool-general at pgpool.net>
Subject: [pgpool-general: 3958] Archive Cleanup

Hi everyone.

We have a successful PGPoolII installation running nicely using streaming
replication.

However we're trying to work out what we need to keep in our WAL archives
- as these are filling up quickly.

Do we need to run a manual base_backup regularly and clean up anything
prior to that in the Wal archive folder? Or based on the fact that PGPool
runs a base_backup during a resync when required - that in fact we don't
need any of the wal archives?

Would be interested to know how people are handling this?

Thanks

Jonathan

-- 

_______________________________________________
pgpool-general mailing list
pgpool-general at pgpool.net
http://www.pgpool.net/mailman/listinfo/pgpool-general

--