[pgpool-general: 4362] Re: Pgpool - connection hangs in DISCARD ALL

Wed Jan 27 15:38:54 JST 2016

Usama,

I have not received a response from you, but I have committed and pushed
the patch to the master and 3.4 stable branches. Please reply to me if this
is not appropriate.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

From: Tatsuo Ishii <ishii at postgresql.org>
Subject: [pgpool-general: 4355] Re: Pgpool - connection hangs in DISCARD ALL
Date: Sat, 23 Jan 2016 10:31:14 +0900 (JST)
Message-ID: <20160123.103114.1108765446319109145.t-ishii at sraoss.co.jp>

> Usama,
> 
> I still cannot reproduce the problem, so I cannot confirm whether your
> patch fixes it. However, while experimenting with bug #147 (a similar
> case that comes with a test script), I noticed that the pgpool log
> reported something like:
> 
> 2016-01-23 09:49:17: pid 17950: ERROR:  unable to read data from frontend
> 2016-01-23 09:49:17: pid 17950: DETAIL:  EOF encountered with frontend
> 
> 2016-01-23 10:01:07: pid 19988: ERROR:  unable to read data from frontend
> 2016-01-23 10:01:07: pid 19988: DETAIL:  socket read failed with an error "Connection reset by peer"
> 
> 2016-01-23 09:56:18: pid 19679: ERROR:  unable to flush data to frontend
> 
> The last one is rare (about a 1/50 probability), but it seems your patch
> does not cover that case. Attached is a patch to deal with it.
> 
> Best regards,
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese:http://www.sraoss.co.jp
> 
>> Hi
>> 
>> I am looking into this issue, and unfortunately, like Ishii-San, I am also
>> not able to reproduce it. But I found one issue in 3.4 that might cause the
>> problem. Can you please try the attached patch and check whether it solves
>> the problem? Also, if the problem still persists, it would be really
>> helpful if you could share the pgpool-II log.
>> 
>> Thanks
>> Best regards
>> Muhammad Usama
>> 
>> 
>> 
>> On Fri, Jan 8, 2016 at 12:00 PM, Gerhard Wiesinger <lists at wiesinger.com>
>> wrote:
>> 
>>> On 08.01.2016 07:32, Tatsuo Ishii wrote:
>>>
>>>> On 07.01.2016 22:32, Tatsuo Ishii wrote:
>>>>>
>>>>>> I have heard similar reports, but the problem for developers is that
>>>>>> nobody knows how to reproduce the hang (a stack trace after the hang
>>>>>> is not very helpful).
>>>>>>
>>>>>>
>>>>>> Yes, there were already 2? on the list.
>>>>>
>>>>> I can reproduce it randomly in about one day. I think stack traces
>>>>> should be sufficient when e.g. debug code has been added.
>>>>>
>>>> No. That's just a consequence of many problems including this one.
>>>>
>>>> How to track the bug down?
>>>>> Debug code?
>>>>> Enhanced logging?
>>>>>
>>>> Debug logging would be helpful (start pgpool with the -d option), but it
>>>> will consume a huge amount of disk space if you want to save all of it.
>>>> Is it possible to keep only the last 30 minutes of debug logs before the
>>>> problem occurs? I guess even 10 minutes is enough to find the cause of
>>>> the problem.
>>>>
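One way to keep only the most recent debug output is to pipe the log through a small filter that holds a sliding time window in memory. The following is a minimal sketch, not part of pgpool-II: the class name `RollingLog`, the function `main`, and the output file name are assumptions made for illustration, and lines are timestamped by arrival time rather than by parsing the pgpool log prefix.

```python
import collections
import sys
import time


class RollingLog:
    """Keep only the log lines received during the last `window` seconds."""

    def __init__(self, window=30 * 60):
        self.window = window
        # (arrival_time, line) pairs, oldest first
        self.lines = collections.deque()

    def add(self, line, now=None):
        now = time.time() if now is None else now
        self.lines.append((now, line))
        # Drop entries that have aged out of the window.
        while self.lines and now - self.lines[0][0] > self.window:
            self.lines.popleft()

    def snapshot(self):
        return [line for _, line in self.lines]


def main(out_path="pgpool-debug-tail.log"):  # hypothetical output file name
    """Read log lines from stdin; write the retained window when input ends."""
    log = RollingLog()
    try:
        for line in sys.stdin:
            log.add(line)
    except KeyboardInterrupt:
        pass  # Ctrl-C: fall through and save what we have
    with open(out_path, "w") as f:
        f.writelines(log.snapshot())
```

With `main()` called at the bottom of the file, this could be used as something like `pgpool -n -d 2>&1 | python3 rolling_log.py`, assuming pgpool is run in the foreground and writes its log to stderr. The window is held entirely in memory, so 30 minutes of debug output may still be large; a shorter window such as `RollingLog(window=600)` reduces that.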
>>>
>>> Can you try to reproduce it, too?
>>>
>>> My use case is very simple:
>>> 1.) One persistent connection with username1/database1 doing an insert
>>> every minute
>>> 2.) Nagios fires off about 10 simple SELECT queries every minute with
>>> username2/database1 at nearly the same time (so it looks like the same
>>> backend connection can sometimes be used)
>>> So it should be easy to test. I close the unused backend connections
>>> after 2s, so if you send a burst e.g. every 10s it should be fast to
>>> reproduce.
>>>
>>> In my case it happens within 0.5-24h.
>>>
>>>
>>> Ciao,
>>> Gerhard
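
The workload Gerhard describes (one persistent connection doing inserts, plus bursts of about 10 short-lived SELECT connections from a second user) can be sketched as a small load generator. This is only an illustration under stated assumptions, not a confirmed reproducer: the function names `run_burst` and `run_workload`, the table `repro_log`, and the 30-second timeout are hypothetical, and the `connect_*` arguments are expected to be callables returning DB-API style connections (for example built from `psycopg2.connect`).

```python
import threading
import time


def run_burst(connect, n_queries=10, sql="SELECT 1", timeout=30.0):
    """Open n short-lived connections at nearly the same time, one query each.

    Returns (errors, hung): exceptions raised by workers, and the number of
    workers still running after `timeout` seconds (a possible hang).
    """
    errors = []

    def worker():
        try:
            conn = connect()
            try:
                cur = conn.cursor()
                cur.execute(sql)
                cur.fetchall()
            finally:
                conn.close()
        except Exception as exc:  # record the failure, keep the burst going
            errors.append(exc)

    # Daemon threads so a hung worker does not block interpreter exit.
    threads = [threading.Thread(target=worker, daemon=True)
               for _ in range(n_queries)]
    for t in threads:
        t.start()
    deadline = time.monotonic() + timeout
    for t in threads:
        t.join(max(0.0, deadline - time.monotonic()))
    hung = sum(t.is_alive() for t in threads)
    return errors, hung


def run_workload(connect_writer, connect_monitor, rounds, period=10.0,
                 insert_sql="INSERT INTO repro_log DEFAULT VALUES",
                 sleep=time.sleep):
    """One persistent writer connection plus a SELECT burst every `period` s."""
    writer = connect_writer()  # the single persistent connection
    all_errors, total_hung = [], 0
    try:
        for _ in range(rounds):
            cur = writer.cursor()
            cur.execute(insert_sql)  # hypothetical table, create it first
            writer.commit()
            errors, hung = run_burst(connect_monitor)
            all_errors.extend(errors)
            total_hung += hung
            sleep(period)
    finally:
        writer.close()
    return all_errors, total_hung
```

With psycopg2 installed, the two callables could be something like `functools.partial(psycopg2.connect, host="localhost", port=9999, user="username1", dbname="database1")` and the same with `user="username2"`, where the host and port are assumptions about where pgpool listens. A non-zero `hung` count only indicates that a worker did not finish within the timeout; the pgpool log would still be needed to tell whether it is the hang discussed here.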