[pgpool-general: 2056] Re: trouble with recovery of a downed node

Tatsuo Ishii ishii at postgresql.org
Wed Aug 21 11:44:07 JST 2013


> Tatsuo, thanks very much for your response.  I've tried online
> recovery with the database clients disconnected, but it didn't have
> any effect.  The database recovers correctly, works for a while and
> then when a certain update takes place one of the nodes blows up. It's
> not always the same statement but one of them is:
> 
> 2013-08-20 17:04:02 ERROR: pid 30688: pgpool detected difference of
> the number of inserted, updated or deleted tuples. Possible last query
> was: "UPDATE ws_cached_searches SET search_data = $1, cache_time =
> current_timestamp WHERE cached_search_id = $2"
> 2013-08-20 17:04:02 LOG: pid 30688: CommandComplete: Number of
> affected tuples are: 1 0 1
> 
> 
> I've posted my configuration to
> https://github.com/clixtec/redundant-pgpool-config - pgpool and
> PostgreSQL config files in the root, and the four script files in a
> subdirectory.  Again it is modelled on the approach described at
> http://zetetic.net/blog/2012/3/9/point-in-time-recovery-from-backup-using-postgresql-continuo.html. Am
> I doing anything fundamentally wrong?

I think there's no instance way to find out the cause of the
problem. I recommend following steps:

1) right after recovery, make sure that the contents of the DBs are
   identical. For this you can use pg_dump or some existing tool to
   compare database (e.g. export as text file and compare).

2) enable log_per_node_statement and wait until the error happens.

3) examine the log of SQL command issued to each DB node to look for
   cause of the problem.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


More information about the pgpool-general mailing list