[pgpool-general: 4363] Re: Pgpool - connection hangs in DISCARD ALL

Wed Jan 27 17:12:46 JST 2016

On Wed, Jan 27, 2016 at 11:38 AM, Tatsuo Ishii <ishii at postgresql.org> wrote:

> Usama,
>
> I have not received a response from you, but I have committed and pushed
> the patch to the master and 3.4 stable branches. Please reply to me if
> this is not appropriate.
>

Sorry, I forgot to reply to the thread. I was waiting to hear from the
reporter whether the patch solves the problem at his end, since we were
not able to reproduce the issue locally.
I think the patch and the enhancements you made are exactly what was needed
to solve the problem. Many thanks for pushing the commit; I hope we will be
able to close all the stuck connection issues with this commit.

Thanks
Best regards
Muhammad Usama

>
> Best regards,
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese:http://www.sraoss.co.jp
>
> From: Tatsuo Ishii <ishii at postgresql.org>
> Subject: [pgpool-general: 4355] Re: Pgpool - connection hangs in DISCARD
> ALL
> Date: Sat, 23 Jan 2016 10:31:14 +0900 (JST)
> Message-ID: <20160123.103114.1108765446319109145.t-ishii at sraoss.co.jp>
>
> > Usama,
> >
> > I still cannot reproduce the problem and cannot confirm whether your
> > patch fixes it. But one thing I noticed while playing with bug #147
> > (a similar case that comes with a test script) is that the pgpool log
> > reported something like:
> >
> > 2016-01-23 09:49:17: pid 17950: ERROR:  unable to read data from frontend
> > 2016-01-23 09:49:17: pid 17950: DETAIL:  EOF encountered with frontend
> >
> > 2016-01-23 10:01:07: pid 19988: ERROR:  unable to read data from frontend
> > 2016-01-23 10:01:07: pid 19988: DETAIL:  socket read failed with an error "Connection reset by peer"
> >
> > 2016-01-23 09:56:18: pid 19679: ERROR:  unable to flush data to frontend
> >
> > The last one is rare (about 1 in 50), but it seems your patch
> > does not cover that case. Attached is a patch to deal with it.
> >
> > Best regards,
> > --
> > Tatsuo Ishii
> > SRA OSS, Inc. Japan
> > English: http://www.sraoss.co.jp/index_en.php
> > Japanese:http://www.sraoss.co.jp
> >
> >> Hi
> >>
> >> I am looking into this issue and unfortunately, like Ishii-San, I am
> >> also not able to reproduce it. But I found one issue in 3.4 that might
> >> cause the problem. Can you please try the attached patch and see if it
> >> solves the problem? Also, if the problem still persists, it would be
> >> really helpful if you could share the pgpool-II log.
> >>
> >> Thanks
> >> Best regards
> >> Muhammad Usama
> >>
> >>
> >>
> >> On Fri, Jan 8, 2016 at 12:00 PM, Gerhard Wiesinger <lists at wiesinger.com> wrote:
> >>
> >>> On 08.01.2016 07:32, Tatsuo Ishii wrote:
> >>>
> >>>> On 07.01.2016 22:32, Tatsuo Ishii wrote:
> >>>>>
> >>>>>> I have heard similar reports, but the problem for developers is
> >>>>>> that nobody knows how to reproduce the hang (a stack trace taken
> >>>>>> after the hang is not very helpful).
> >>>>>>
> >>>>>>
> >>>>>> Yes, there were already 2? on the list.
> >>>>>
> >>>>> I can reproduce it randomly in about one day. I think stack traces
> >>>>> should be sufficient when e.g. debug code has been added.
> >>>>>
> >>>> No. That's just a consequence of many problems including this one.
> >>>>
> >>>> How to track the bug down?
> >>>>> Debug code?
> >>>>> Enhanced logging?
> >>>>>
> >>>> Debug logging would be helpful (start pgpool with the -d option), but
> >>>> it will consume a huge amount of disk space if you want to save all of
> >>>> it. Is it possible to keep only the last 30 minutes of debug logs
> >>>> before the problem occurs? I guess even 10 minutes is enough to find
> >>>> the cause of the problem.
> >>>>
> >>>
> >>> Can you try to reproduce it, too?
> >>>
> >>> My use case is very simple:
> >>> 1.) One persistent connection with username1/database1 doing an insert
> >>> every minute
> >>> 2.) Nagios fires about 10 simple SELECT queries every minute with
> >>> username2/database1 at nearly the same time (so it looks like the
> >>> same backend connection can sometimes be used)
> >>> So it should be easy to test. I close the unused backend connections
> >>> after 2s, so if you fire e.g. every 10s it should be fast to reproduce.
> >>>
> >>> Happens in my case in 0.5-24h.
> >>>
> >>>
> >>> Ciao,
> >>> Gerhard
> >>> _______________________________________________
> >>> pgpool-general mailing list
> >>> pgpool-general at pgpool.net
> >>> http://www.pgpool.net/mailman/listinfo/pgpool-general
> >>>
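
The use case Gerhard describes above (one persistent connection doing inserts, plus a burst of short-lived SELECT connections under a second user) could be scripted roughly as follows. This is only a sketch and not a confirmed reproducer: the table name hang_test, the function names, and the simplification of doing one INSERT per burst are assumptions made for illustration. The connect argument is any callable that takes (user, dbname) and returns a Python DB-API connection pointed at pgpool.

```python
import threading
import time


def one_select(connect):
    """Open a short-lived connection, run one SELECT, then disconnect."""
    conn = connect("username2", "database1")
    try:
        cur = conn.cursor()
        cur.execute("SELECT 1")
        cur.fetchall()
    finally:
        conn.close()


def run_workload(connect, rounds, interval=10.0, selects=10, timeout=60.0):
    """Return how many SELECT threads were still running after `timeout` seconds."""
    stuck = 0
    # Persistent connection that stays open for the whole run.
    writer = connect("username1", "database1")
    try:
        for i in range(rounds):
            cur = writer.cursor()
            # hang_test is a hypothetical table; any cheap INSERT will do.
            cur.execute("INSERT INTO hang_test (ts) VALUES (now())")
            writer.commit()
            # Fire the SELECTs at nearly the same time, each on its own connection.
            threads = [
                threading.Thread(target=one_select, args=(connect,), daemon=True)
                for _ in range(selects)
            ]
            for t in threads:
                t.start()
            for t in threads:
                t.join(timeout)
                if t.is_alive():
                    stuck += 1  # did not finish in time: candidate hang
            if i + 1 < rounds:
                time.sleep(interval)
    finally:
        writer.close()
    return stuck
```

With psycopg2 installed, connect could be something like lambda user, db: psycopg2.connect(host="localhost", port=9999, user=user, dbname=db), where the host and port are assumptions about the local pgpool setup. A non-zero return value only says that some client did not finish in time; the pgpool log is still needed to see why.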
>
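
The suggestion quoted above, keeping only the last 30 minutes of debug output, could be approximated with a small filter like the one below. It is a minimal sketch, not part of pgpool: it assumes every log entry starts with a "YYYY-MM-DD HH:MM:SS" timestamp as in the log excerpts in this thread, that timestamps only move forward, and the function names are made up for illustration.

```python
from collections import deque
from datetime import datetime, timedelta


def parse_timestamp(line):
    """Return the leading 'YYYY-MM-DD HH:MM:SS' timestamp of a line, or None."""
    try:
        return datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")
    except ValueError:
        return None


def keep_recent(lines, window=timedelta(minutes=30)):
    """Return the lines whose timestamp is within `window` of the newest line."""
    buf = deque()  # (timestamp, line) pairs, oldest first
    last_ts = None
    for line in lines:
        # Lines without a timestamp inherit the timestamp of the previous line.
        ts = parse_timestamp(line) or last_ts
        if ts is None:
            continue  # nothing to attach this line to yet
        last_ts = ts
        buf.append((ts, line))
        # Drop everything that has fallen out of the window.
        while ts - buf[0][0] > window:
            buf.popleft()
    return [line for _, line in buf]
```

One way to use it is to pipe the debug output of pgpool into a script that calls sys.stdout.writelines(keep_recent(sys.stdin)). Only the recent lines are held in memory, and they are written out once the input ends, that is, after pgpool has been stopped following a hang.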