[pgpool-general: 1652] Re: inserts sent to standby

Sun Apr 21 12:24:32 JST 2013

Seems like a famouse "canceling statement due to conflict with
recovery".  This is caused by VACUUM on primary being removing old
rows before your query on standby tries to read them. I saw similar
issues in the thread and the poster found that hot_standby_feedback
helped.

[pgpool-general: 1589] Re: pgpool 3.2.3 node gets down after some hours of work.
http://www.pgpool.net/pipermail/pgpool-general/2013-April/001616.html
> I've set hot_standby_feedback = on on slave - it seems help.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

> Tatsuo,
> 
> I did find an error in the standby log:
> 
> 2013-04-20 08:01:11 UTC [29975]: [171-1] user=xxxx,db=rad ERROR:  canceling statement due to conflict with recovery
> 2013-04-20 08:01:11 UTC [29975]: [172-1] user=xxxx,db=rad DETAIL:  User query might have needed to see row versions that must be removed.
> 2013-04-20 08:01:11 UTC [29975]: [173-1] user=xxxx,db=rad STATEMENT:  select distinct(ar.id), ar.name as nm, ar.name         from artist ar         join album al on ar.id=al.artist_id
> 2013-04-20 08:01:11 UTC [29975]: [174-1] user=xxxx,db=rad LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp29975.14", size 107708416
> 2013-04-20 08:01:11 UTC [29975]: [175-1] user=xxxx,db=rad FATAL:  terminating connection due to conflict with recovery
> 2013-04-20 08:01:11 UTC [29975]: [176-1] user=xxxx,db=rad DETAIL:  User query might have needed to see row versions that must be removed.
> 2013-04-20 08:01:11 UTC [29975]: [177-1] user=xxxx,db=rad HINT:  In a moment you should be able to reconnect to the database and repeat your command.
> 2013-04-20 08:01:11 UTC [29975]: [178-1] user=xxxx,db=rad LOG:  disconnection: session time: 0:04:21.267 user=xxxx database=rad host=10.140.76.210 port=44700
> 
> Thanks,
> 
> Mike
> 
> 
> On Apr 20, 2013, at 6:29 PM, Tatsuo Ishii wrote:
> 
>>> Tatsuo,
>>> 
>>> I did this but don't have enough information as rsyslogd is dropping messages due to rate limiting.  I will try and reproduce this problem in a test environment.
>>> 
>>> Meanwhile, I turned delay_threshold off and debug off and restarted pgpool. I no longer see inserts being sent to the standby, however, I encountered another problem which is that pgpool becomes unresponsive and clients can't connect.  The cpu usage spikes to 100%.  Right before this happened, I see this in the log file:
>>> 
>>> Apr 20 08:01:07 ip-10-140-76-210 pgpool[2133]: pool_process_query: discard E packet from backend 1
>> 
>> This means pgpool receives certain error message from backend 1. Can
>> you find an error around this time in PostgreSQL log?
>> 
>>> Apr 20 08:01:07 ip-10-140-76-210 pgpool[2133]: pool_read: EOF encountered with backend
>>> Apr 20 08:01:07 ip-10-140-76-210 pgpool[2133]: pool_read: EOF encountered with backend
>>> Apr 20 08:01:07 ip-10-140-76-210 pgpool[2133]: detect_stop_postmaster_error: detect_error error
>>> Apr 20 08:01:07 ip-10-140-76-210 pgpool[2133]: detect_postmaster_down_error returns error on backend 1. Going to close this session.
>>> 
>>> To me it looks like there was a communication error with the standby. Is this what this means? Show pool_nodes shows both nodes with status of 2 after this. Shouldn't show pool_nodes show the standby with a status of 3 after this? And does pgpool recover from this or do I need to restart?
>>> 
>>> Thanks again for your great product and assistance,
>>> 
>>> Mike
>>> 
>>> 
>>> On Apr 18, 2013, at 6:49 PM, Tatsuo Ishii wrote:
>>> 
>>>> I had hard times to reproduce your problem. Can you change pgpool.conf
>>>> (debug = 1) and do pgpool reload, then send an insert to pgpool? That
>>>> should produce detailed debug information.
>>>> --
>>>> Tatsuo Ishii
>>>> SRA OSS, Inc. Japan
>>>> English: http://www.sraoss.co.jp/index_en.php
>>>> Japanese: http://www.sraoss.co.jp
>>>> 
>>>>> Tatsuo,
>>>>> 
>>>>> Thank you for confirming this.  I have experienced the same behavior again.  Is there something I can do to help provide information about this problem?
>>>>> 
>>>>> Thanks,
>>>>> 
>>>>> Mike
>>>>> 
>>>>> On Apr 16, 2013, at 7:41 PM, Tatsuo Ishii wrote:
>>>>> 
>>>>>>> I have been running pgpool 3.2.3 in production and everything has been working great.  However, I recently had an incident where pgpool started sending inserts to the the read-only standby and not the master.  I'm using master/slave mode with streaming replication and both the master and slave were up and running and when I checked the pgpool status, everything seemed normal.  I'm also using delay_threshold = 100000.  When this incident occurred, it was after a large data update in which the lag exceed this threshold.  I believe that once the threshold was no longer exceeded, it started sending queries back to the standby and also started sending inserts there too.
>>>>>>> 
>>>>>>> Am I understanding this feature correctly?  If the lag exceeds the delay_threshold, it stops sending queries to the standby and when it drops below the threshold, it resumes.  Any idea as to why it would send inserts to the standby?  I did not have debug on, so I don't have much information to go on, except I do know that the delay_threshold was exceeded and shortly after it was no longer exceeded, this started happening.  Restarting pgpool cleared it up.
>>>>>> 
>>>>>> Your understanding of delay_threshold behavior is correct and sounds
>>>>>> like the incident you observed is a bug. I will try to reproduce the
>>>>>> problem to find out what is wrong.
>>>>>> --
>>>>>> Tatsuo Ishii
>>>>>> SRA OSS, Inc. Japan
>>>>>> English: http://www.sraoss.co.jp/index_en.php
>>>>>> Japanese: http://www.sraoss.co.jp
>>>>> 
>>> 
>