[pgpool-general: 3001] Re: Connections stuck in CLOSE_WAIT, again

Paul Wood pauwood at gumtree.com
Wed Jul 2 17:31:04 JST 2014


On Tue, Jun 24, 2014 at 12:59:30PM +0200, Juan Jose Perez wrote:
> 
>     The backend processes (there are two backends) are both
> connected in idle state.
>     Network connections between pgpool and postgres backends are in
> ESTABLISHED state. There are four connections with each backend
> since we are using four databases.
>     The network connection between the pgpool child and the client
> remains in CLOSE_WAIT state for a while.
> 
>     Regards,

I think this is the same as, or related to, the problem I am seeing here:
http://www.sraoss.jp/pipermail/pgpool-general/2014-May/002935.html

I've since seen the issue with connection caching turned off as well. In that
case, the backend connection shows as ESTABLISHED, but netstat shows unread
data in the Recv-Q.
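
For reference, this is roughly how I check a stuck child (the pid and the
"DISCARD" match are placeholders based on what ps shows here; adjust for
your setup):

    # Find the stuck pgpool children (they show up in DISCARD state in ps)
    ps -ef | grep pgpool | grep -i discard

    # Map a child to its sockets; a non-zero Recv-Q on an ESTABLISHED
    # backend connection means pgpool has not read data the backend sent
    netstat -anp | grep <pgpool_child_pid>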

> El 24/06/14 11:03, Tatsuo Ishii escribió:
> >Looks like pgpool-II is waiting for data coming from the socket connected
> >to the backend.  I would like to know whether the corresponding backend
> >is gone or not, and the network state (CLOSE_WAIT etc.).
> >
> >You can find the backend pid (if it exists) by using the "show pool_pools"
> >command. Then you can get the network state by using netstat.
> >
> >Could you please do that?
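
(For anyone else hitting this: a minimal sketch of those two steps. The
port, user and placeholders below are just assumptions about a typical
pgpool setup, not taken from the original report.)

    # Ask pgpool which backend pid serves each pool slot
    psql -h <pgpool_host> -p 9999 -U postgres -c 'SHOW pool_pools;'

    # Then check the network state of that backend pid's sockets
    netstat -anp | grep <backend_pid>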
> >
> >Best regards,
> >--
> >Tatsuo Ishii
> >SRA OSS, Inc. Japan
> >English: http://www.sraoss.co.jp/index_en.php
> >Japanese:http://www.sraoss.co.jp
> >
> >>Thanks. I'll look into this.
> >>
> >>Best regards,
> >>--
> >>Tatsuo Ishii
> >>SRA OSS, Inc. Japan
> >>English: http://www.sraoss.co.jp/index_en.php
> >>Japanese:http://www.sraoss.co.jp
> >>
> >>>     This is the full output of gdb attached to one of the pgpool children
> >>>     in DISCARD state:
> >>>
> >>>#0  0x00007fa8da59fbe3 in select () from /lib/x86_64-linux-gnu/libc.so.6
> >>>#1  0x0000000000416621 in pool_check_fd (cp=cp@entry=0x1cd86c0) at pool_process_query.c:951
> >>>#2  0x0000000000416826 in pool_check_fd (cp=cp@entry=0x1cd86c0) at pool_process_query.c:971
> >>>#3  0x000000000041d99c in pool_read (cp=0x1cd86c0, buf=buf@entry=0x7fffb797e053, len=len@entry=1) at pool_stream.c:139
> >>>#4  0x000000000041c4ce in read_kind_from_backend (frontend=frontend@entry=0x1ccb7c0, backend=backend@entry=0x1cb5c28, decided_kind=decided_kind@entry=0x7fffb797e45f "") at pool_process_query.c:3771
> >>>#5  0x000000000044e291 in ProcessBackendResponse (frontend=frontend@entry=0x1ccb7c0, backend=backend@entry=0x1cb5c28, state=state@entry=0x7fffb797e4c8, num_fields=num_fields@entry=0x7fffb797e4c6) at pool_proto_modules.c:2742
> >>>#6  0x0000000000419d85 in pool_process_query (frontend=frontend@entry=0x1ccb7c0, backend=backend@entry=0x1cb5c28, reset_request=reset_request@entry=1) at pool_process_query.c:289
> >>>#7  0x000000000040c68f in do_child (unix_fd=unix_fd@entry=4, inet_fd=inet_fd@entry=5) at child.c:403
> >>>#8  0x00000000004077ff in fork_a_child (unix_fd=4, inet_fd=5, id=42) at main.c:1238
> >>>#9  0x0000000000407e63 in reaper () at main.c:2457
> >>>#10 reaper () at main.c:2369
> >>>#11 0x00000000004087ed in pool_sleep (second=<optimized out>) at main.c:2654
> >>>#12 0x00000000004064c9 in main (argc=<optimized out>, argv=<optimized out>) at main.c:836
> >>>
> >>>     Many thanks,
> >>>
> >>>-- 
> >>>Juanjo Pérez
> >>>
> >>>www.oteara.com
> >>>
> >>>El 20/06/14 01:19, Tatsuo Ishii escribió:
> >>>>Can you please post a stack trace with symbols?  I need to identify the
> >>>>code path. There are too many similar code paths like this.
> >>>>
> >>>>Best regards,
> >>>>--
> >>>>Tatsuo Ishii
> >>>>SRA OSS, Inc. Japan
> >>>>English: http://www.sraoss.co.jp/index_en.php
> >>>>Japanese:http://www.sraoss.co.jp
> >>>>
> >>>>>I have the same problem with a PuppetDB frontend.
> >>>>>When I stop the PuppetDB service, all connections from pgpool are in
> >>>>>CLOSE_WAIT and the children are in DISCARD state, even with the
> >>>>>connection timeout set to 1 minute.
> >>>>>Here is the summary from gdb:
> >>>>>
> >>>>>The hung function is:
> >>>>>__select_no_cancel() from /lib64/libc.so.6.
> >>>>>
> >>>>>stacktrace output:
> >>>>>
> >>>>>0 __select_no_cancel() from /lib64/libc.so.6.
> >>>>>1 pool_check_fd
> >>>>>2 pool_read
> >>>>>3 read_kind_from_backend
> >>>>>4 ProcessBackendResponse
> >>>>>5 pool_process_query
> >>>>>6 fork_a_child()
> >>>>>7 reaper
> >>>>>8 pool_sleep
> >>>>>9 main
> >>>>>
> >>>>>
> >>>>>
> >>>>>On Thu, Jun 19, 2014 at 2:29 AM, Tatsuo Ishii <ishii at postgresql.org>
> >>>>>wrote:
> >>>>>
> >>>>>>You said your developer found a pgpool child process in DISCARD state.
> >>>>>>
> >>>>>>Please attach gdb to the process in DISCARD state and take a
> >>>>>>backtrace. Also, I need the actual netstat -anp output.
> >>>>>>
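(A minimal gdb one-liner for that, assuming gdb and the pgpool debug symbols
are installed; the pid placeholder is whatever ps or "show pool_pools"
reports for the stuck child.)

    # Attach to the stuck child, dump a full backtrace, then detach
    gdb -p <pgpool_child_pid> -batch -ex 'bt full' -ex detach
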
> >>>>>>Best regards,
> >>>>>>--
> >>>>>>Tatsuo Ishii
> >>>>>>SRA OSS, Inc. Japan
> >>>>>>English: http://www.sraoss.co.jp/index_en.php
> >>>>>>Japanese:http://www.sraoss.co.jp
> >>>>>>
> >>>>>>>      Hi Tatsuo,
> >>>>>>>
> >>>>>>>      We are facing the same problem, and as far as I can see it remains
> >>>>>>>      unsolved (or perhaps they didn't report the solution).
> >>>>>>>      We are using pgpool 3.3.3 with two postgres 9.1 servers in streaming
> >>>>>>>      replication. The OS is Debian, kernel 3.2.54-2 (64-bit).
> >>>>>>>
> >>>>>>>      The log shows messages like this:
> >>>>>>>
> >>>>>>>          ProcessFrontendResponse: failed to read kind from frontend. frontend abnormally exited
> >>>>>>>
> >>>>>>>      I'm not sure, but it looks like there is one for each connection left unclosed.
> >>>>>>>
> >>>>>>>      Also, ps -ef shows a lot of pgpool children in DISCARD state, and
> >>>>>>>      netstat -anp shows the corresponding TCP connections in CLOSE_WAIT
> >>>>>>>      state.
> >>>>>>>
> >>>>>>>      The developer has checked that the client process is closing the
> >>>>>>>      connections, and we tried connecting directly to postgres, which
> >>>>>>>      worked fine.
> >>>>>>>
> >>>>>>>      When you say "attach a debugger", do you mean changing log_statement =
> >>>>>>>      true or debug_level to a value other than 0, or some other procedure?
> >>>>>>>
> >>>>>>>      Many thanks,
> >>>>>>>
> >>>>>>>--
> >>>>>>>Juanjo Pérez
> >>>>>>>
> >>>>>>>www.oteara.com
> >>>>>>>
> >>>>>>>El 02/04/14 01:05, Tatsuo Ishii escribió:
> >>>>>>>>>We have a client on 3.3 experiencing the problem noted here:
> >>>>>>>>>
> >>>>>>>>>http://www.sraoss.jp/pipermail/pgpool-general/2012-December/001283.html
> >>>>>>>>>
> >>>>>>>>>strace is showing the child processes at:
> >>>>>>>>>
> >>>>>>>>>      clone(child_stack=0,
> >>>>>>>>>      flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
> >>>>>>>>>      child_tidptr=0x7fcf659f3a10) = 6258
> >>>>>>>>>
> >>>>>>>>>I didn't see any resolution of that issue; is there data we can gather
> >>>>>>>>>to assist?
> >>>>>>>>I followed up on the old 2012 posting with:
> >>>>>>>>
> >>>>>>>>>That means pgpool does not close the socket connected to by your
> >>>>>>>>>applications. Is it possible to attach a debugger to such a pgpool
> >>>>>>>>>process to see what it is doing?
> >>>>>>>>But I have received no response until now. Can you please do this?
> >>>>>>>>
> >>>>>>>>Best regards,

