[pgpool-general: 4355] Re: Pgpool - connection hangs in DISCARD ALL

Tatsuo Ishii ishii at postgresql.org
Sat Jan 23 10:31:14 JST 2016


Usama,

I still cannot reproduce the problem, so I cannot confirm whether your
patch fixes it. But one thing I noticed while playing with bug #147
(a similar case, with some test script): the pgpool log reported
something like:

2016-01-23 09:49:17: pid 17950: ERROR:  unable to read data from frontend
2016-01-23 09:49:17: pid 17950: DETAIL:  EOF encountered with frontend

2016-01-23 10:01:07: pid 19988: ERROR:  unable to read data from frontend
2016-01-23 10:01:07: pid 19988: DETAIL:  socket read failed with an error "Connection reset by peer

2016-01-23 09:56:18: pid 19679: ERROR:  unable to flush data to frontend

The last one is rare (roughly 1 in 50), but it seems your patch does
not cover it. Attached is a patch to deal with that case.
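
To see how often each of the three cases above shows up in a longer log,
something like the following could be used. This is only a rough sketch,
not part of the attached patch: it assumes the log lines have exactly the
"timestamp: pid N: LEVEL:  message" shape shown above, and the name
classify_frontend_errors is made up for illustration.

```python
import re
from collections import Counter

# Matches lines shaped like the excerpts above, e.g.
#   2016-01-23 09:49:17: pid 17950: ERROR:  unable to read data from frontend
# (assumption: other log_line_prefix settings would need a different pattern)
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}): "
    r"pid (?P<pid>\d+): (?P<level>[A-Z]+):\s+(?P<msg>.*)$"
)


def classify_frontend_errors(lines):
    """Count frontend errors, keyed by (ERROR message, DETAIL message or None)."""
    counts = Counter()
    pending = {}  # pid -> ERROR message still waiting for a possible DETAIL line
    for raw in lines:
        m = LINE_RE.match(raw.strip())
        if not m:
            continue  # blank or unrecognized line
        pid, level, msg = m["pid"], m["level"], m["msg"]
        if level == "ERROR":
            # A new ERROR from the same pid means the previous one had no DETAIL.
            if pid in pending:
                counts[(pending.pop(pid), None)] += 1
            if "frontend" in msg:
                pending[pid] = msg
        elif level == "DETAIL" and pid in pending:
            counts[(pending.pop(pid), msg)] += 1
    # ERROR lines that never got a DETAIL (like the flush case above).
    for msg in pending.values():
        counts[(msg, None)] += 1
    return counts
```

Calling classify_frontend_errors(open("pgpool.log")) (the file name is
just a placeholder) returns a Counter, so the "unable to flush data to
frontend" entries can be compared against the two read-side cases.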

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

> Hi
> 
> I am looking into this issue and, unfortunately, like Ishii-San I am also
> not able to reproduce it. But I found one issue in 3.4 that might cause the
> problem. Can you please try the attached patch and see if it solves the
> problem? Also, if the problem still persists, it would be really helpful
> if you could share the pgpool-II log.
> 
> Thanks
> Best regards
> Muhammad Usama
> 
> 
> 
> On Fri, Jan 8, 2016 at 12:00 PM, Gerhard Wiesinger <lists at wiesinger.com>
> wrote:
> 
>> On 08.01.2016 07:32, Tatsuo Ishii wrote:
>>
>>> On 07.01.2016 22:32, Tatsuo Ishii wrote:
>>>>
>>>>> I heard similar reports but problem for developers is, nobody knows
>>>>> how to reproduce the hang (a stack trace after the hang is not very
>>>>> helpful).
>>>>>
>>>>>
>>>>> Yes, there were already 2? on the list.
>>>>
>>>> I can reproduce it randomly in about one day. I think stack traces
>>>> should be sufficient when e.g. debug code has been added.
>>>>
>>> No. That's just a consequence of many problems including this one.
>>>
>>> How to track the bug down?
>>>> Debug code?
>>>> Enhanced logging?
>>>>
>>> Debug logging would be helpful (start pgpool with the -d option), but it
>>> will consume huge disk space if you want to save all of it. Is it
>>> possible to keep only the last 30 minutes of debug logs before the problem
>>> occurs? I guess even 10 minutes is enough to know the cause of the
>>> problem.
>>>
>>
>> Can you try to reproduce it, too?
>>
>> My use case is very simple:
>> 1.) One persistent connection with username1/database1 doing an insert
>> every minute
>> 2.) Nagios bombs out about 10 simple SELECT queries every minute with
>> username2/database1 nearly at the same time (so it looks like sometimes
>> same backend connection can be used)
>> So it should be easy to test. I close the unused backend connections after
>> 2s, so if you bomb e.g. every 10s it should be fast to reproduce.
>>
>> Happens in my case in 0.5-24h.
>>
>>
>> Ciao,
>> Gerhard
>> _______________________________________________
>> pgpool-general mailing list
>> pgpool-general at pgpool.net
>> http://www.pgpool.net/mailman/listinfo/pgpool-general
>>
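
Regarding the suggestion quoted above to keep only the last 30 minutes
of debug logs: one possible way is to feed the debug output through a
small filter that remembers only recent lines. The following is a
minimal sketch under that assumption. The class name RollingLogWindow
and its methods are made up for illustration and are not part of pgpool.
It uses the arrival time of each line rather than parsing the timestamp
in the log.

```python
import time
from collections import deque


class RollingLogWindow:
    """Keep only the log lines received during the last max_age_sec seconds."""

    def __init__(self, max_age_sec=30 * 60, clock=time.monotonic):
        self.max_age_sec = max_age_sec
        self.clock = clock      # injectable so the window can be tested
        self.lines = deque()    # (arrival_time, line), oldest first

    def add(self, line):
        """Record one line and drop everything older than the window."""
        now = self.clock()
        self.lines.append((now, line))
        while self.lines and now - self.lines[0][0] > self.max_age_sec:
            self.lines.popleft()

    def snapshot(self):
        """Return the retained lines, oldest first."""
        return [line for _, line in self.lines]
```

A wrapper script could call add() for every line read from pgpool's
debug output and write snapshot() to a file once the hang is noticed.
How the hang is detected is left open here. Note that the window lives
in memory, so with very verbose debug output a shorter window (10
minutes, as mentioned above) may be the safer choice.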
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pool_stream.diff
Type: text/x-patch
Size: 881 bytes
Desc: not available
URL: <http://www.sraoss.jp/pipermail/pgpool-general/attachments/20160123/fbc2f319/attachment.bin>

