[pgpool-general: 3023] Re: Hanging sockets and pool shutdown failure with connection caching

Yugo Nagata nagata at sraoss.co.jp
Mon Jul 14 16:34:46 JST 2014


Hi,

I'm sorry for the late reply. Unfortunately, I can't reproduce this problem.

If possible, could you please provide a backtrace of the hung process
using gdb? In addition, pgpool.log and pgpool.conf would also be helpful.
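
For example, something along these lines should collect the backtraces.
This is only a sketch: the function names close_wait_pids and
dump_backtraces and the output file names are made up for illustration,
and it assumes gdb is installed and is run as root or as the user that
owns the pgpool processes.

```shell
# Read `netstat -ap` output on stdin and print the PID of every pgpool
# process that owns a socket in CLOSE_WAIT (state is field 6, PID/program
# name is field 7, as in the output quoted below).
close_wait_pids() {
    awk '$6 == "CLOSE_WAIT" && $7 ~ /pgpool/ { split($7, a, "/"); print a[1] }'
}

# Attach gdb to each PID given as an argument, print its backtrace in
# batch mode, and detach. One output file per process (names are arbitrary).
dump_backtraces() {
    for pid in "$@"; do
        gdb -p "$pid" -batch -ex "bt" > "pgpool-bt-$pid.txt" 2>&1
    done
}
```

With both functions defined, running
dump_backtraces $(netstat -ap 2>/dev/null | close_wait_pids)
on the pgpool host should leave one pgpool-bt-<pid>.txt file per hung
process.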

As a workaround, setting client_idle_limit may help.
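
A minimal pgpool.conf fragment for this might look like the following.
The value of 300 seconds is only an assumed example, not a
recommendation; the default of 0 disables the limit.

```
# pgpool.conf
# Disconnect a client that has been idle for this many seconds since
# its last query completed. 0 (the default) means never disconnect.
client_idle_limit = 300
```

pgpool needs to reload its configuration for the change to take effect.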

On Fri, 30 May 2014 15:46:24 +0100
Paul Wood <pauwood at gumtree.com> wrote:

> Hi,
> 
> I have a pgpool with four backends in master/slave streaming mode with
> connection caching, running 3.3.3 and postgres 9.3. I'm seeing a problem
> where if I restart the app, the existing pools don't shut down properly,
> leaving connections hanging in CLOSE_WAIT. stracing the processes shows
> that they're waiting on a select() on a connection to a backend.  The
> strange thing is, it's always the same backend; I've tried swapping
> backends around and the problem always stays on the 2nd backend (node 1,
> or node 2 if node 1 is omitted). In every other regard, the backend
> behaves normally and I can see the disconnections in the postgres logs
> on the backend in question. If I set connection_cache = off, the problem
> goes away and the pools shut down properly when the app restarts.
> 
> A search of the archives threw up this:
> http://www.sraoss.jp/pipermail/pgpool-general/2012-December/001283.html
> 
> Here are the connections in CLOSE_WAIT:
> 
> # netstat -ap | grep app001.*CLO
> tcp 0 0 pgpool001:postgresql app001:58410 CLOSE_WAIT  14149/pgpool:
> tcp 0 0 pgpool001:postgresql app001:58046 CLOSE_WAIT  13684/pgpool:
> tcp 0 0 pgpool001:postgresql app001:58442 CLOSE_WAIT  13835/pgpool:
> 
> They all seem to be waiting on the same FD (12 in this case):
> 
> # strace -p 14149
> Process 14149 attached - interrupt to quit
> select(13, [12], NULL, [12], NULL^C <unfinished ...>
> 
> These sockets are all connected to the same backend:
> 
> # readlink /proc/14149/fd/12
> socket:[2247671]
> 
> # for s in $(
>     for p in $(
>         netstat -ap | grep app001.*CLO |
>         sed 's/.*CLOSE_WAIT *\([0-9]*\).pgpool.*/\1/')
>     do
>         readlink /proc/$p/fd/12;
>     done |
>         sed 's/.*socket:\[\([0-9]*\)\]/\1/'
> )
> do
>     netstat -apeev 2>/dev/null | grep $s
> done
> tcp 0 0 pgpool001:39096 pgslave001:postgresql ESTABLISHED 109 2247671 14149/pgpool:
> tcp 0 0 pgpool001:39001 pgslave001:postgresql ESTABLISHED 109 2249926 13684/pgpool:
> tcp 0 0 pgpool001:39110 pgslave001:postgresql ESTABLISHED 109 2250306 13835/pgpool:
> 
> # pcp_node_info 1 localhost 9898 postgres XXX 1
> pgslave001 postgresql 2 0.312500
> 
> -- 
> Paul Wood
> _______________________________________________
> pgpool-general mailing list
> pgpool-general at pgpool.net
> http://www.pgpool.net/mailman/listinfo/pgpool-general


-- 
Yugo Nagata <nagata at sraoss.co.jp>

