[pgpool-general: 2062] Re: trouble with recovery of a downed node

Tatsuo Ishii ishii at postgresql.org
Thu Aug 22 07:53:41 JST 2013


> On 13-08-21 12:14 AM, Tatsuo Ishii wrote:
>>> Tatsuo, thanks very much for your response.  I've tried online
>>> recovery with the database clients disconnected, but it didn't have
>>> any effect.  The database recovers correctly, works for a while and
>>> then when a certain update takes place one of the nodes blows up. It's
>>> not always the same statement but one of them is:
>>>
>>> 2013-08-20 17:04:02 ERROR: pid 30688: pgpool detected difference of
>>> the number of inserted, updated or deleted tuples. Possible last query
>>> was: "UPDATE ws_cached_searches SET search_data = $1, cache_time =
>>> current_timestamp WHERE cached_search_id = $2"
>>> 2013-08-20 17:04:02 LOG: pid 30688: CommandComplete: Number of
>>> affected tuples are: 1 0 1
>> I think there's no instance way to find out the cause of the
>> problem. I recommend following steps:
>>
>> 1) right after recovery, make sure that the contents of the DBs are
>>     identical. For this you can use pg_dump or some existing tool to
>>     compare database (e.g. export as text file and compare).
>>
>> 2) enable log_per_node_statement and wait until the error happens.
>>
>> 3) examine the log of SQL command issued to each DB node to look for
>>     cause of the problem.
>> --
>> Tatsuo Ishii
>> SRA OSS, Inc. Japan
>> English: http://www.sraoss.co.jp/index_en.php
>> Japanese: http://www.sraoss.co.jp
> 
> 
> Ah, I believe I understand the problem now: the sequence values are
> different (off by one) between the original and recovered databases.
> The sample pgpool_sample_pitr script has:
> 
>     SELECT setval(oid, nextval(oid)) FROM pg_class WHERE relkind = 'S'
> 
> When I change that to
> 
>     SELECT setval(oid, 1*currval(oid)) FROM pg_class WHERE relkind = 'S'
> 
> I get identical databases and the problem has not reoccurred!  (The
> "1*" is to defeat optimization; not sure if it is needed.)
> 
> Thanks for your excellent advice.

Thanks for the report but your changes seems a little bit strange to
me. According to the PostgreSQL manual, calling currval() without
prior calling nextval() will raise an error. Don't you get any error?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


More information about the pgpool-general mailing list