[pgpool-general: 3045] Re: Hanging sockets and pool shutdown failure with connection caching

Tatsuo Ishii ishii at postgresql.org
Fri Jul 18 17:21:23 JST 2014


This looks similar to the problem reported at:

http://www.pgpool.net/mantisbt/view.php?id=107

I have committed a fix that may solve your problem.

http://git.postgresql.org/gitweb/?p=pgpool2.git;a=commit;h=52a3a8c6ab67be3e09db9a7bdfd8e74d81ae3687

Please try it if you like.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

> Hi,
> 
> I'm sorry for the late reply. However, I can't reproduce this problem.
> 
> If possible, could you please provide a backtrace of the hung process
> using gdb? In addition, pgpool.log and pgpool.conf would also be helpful.
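One way to capture such backtraces non-interactively is sketched below. The function name collect_backtraces, the bt.<pid>.txt output files, and the GDB_BIN override are illustrative assumptions, not something from this thread; only the gdb options -p, -batch and -ex are standard.

```shell
# Sketch: write a backtrace of each given PID to bt.<pid>.txt.
# GDB_BIN is a hypothetical override so the loop can be dry-run without gdb.
collect_backtraces() {
    gdb_bin=${GDB_BIN:-gdb}
    for pid in "$@"; do
        # -p attaches to the process, -batch exits after the -ex commands,
        # and "bt" prints the current call stack.
        "$gdb_bin" -p "$pid" -batch -ex "bt" > "bt.$pid.txt" 2>&1
    done
}
```

For the processes listed later in this thread it would be called (as root) as: collect_backtraces 14149 13684 13835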
> 
> As a workaround, you could set client_idle_limit.
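For reference, that setting lives in pgpool.conf and would look roughly like the fragment below. The value of 300 seconds is an assumption chosen for illustration, not a recommendation made in this thread.

```
# pgpool.conf (fragment)
# Disconnect a client that has been idle for this many seconds
# since its last query completed; 0 (the default) disables the check.
# 300 is an assumed example value.
client_idle_limit = 300
```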
> 
> On Fri, 30 May 2014 15:46:24 +0100
> Paul Wood <pauwood at gumtree.com> wrote:
> 
>> Hi,
>> 
>> I have a pgpool with four backends in master/slave streaming mode with
>> connection caching, running 3.3.3 and postgres 9.3. I'm seeing a problem
>> where if I restart the app, the existing pools don't shut down properly,
>> leaving connections hanging in CLOSE_WAIT. stracing the processes shows
>> that they're waiting on a select() on a connection to a backend.  The
>> strange thing is, it's always the same backend; I've tried swapping
>> backends around and the problem always stays on the 2nd backend (node 1,
>> or node 2 if node 1 is omitted). In every other regard, the backend
>> behaves normally and I can see the disconnections in the postgres logs
>> on the backend in question. If I set connection_cache = off, the problem
>> goes away and the pools shut down properly when the app restarts.
>> 
>> A search of the archives threw up this:
>> http://www.sraoss.jp/pipermail/pgpool-general/2012-December/001283.html
>> 
>> Here are the connections in CLOSE_WAIT:
>> 
>> # netstat -ap | grep app001.*CLO
>> tcp 0 0 pgpool001:postgresql app001:58410 CLOSE_WAIT  14149/pgpool:
>> tcp 0 0 pgpool001:postgresql app001:58046 CLOSE_WAIT  13684/pgpool:
>> tcp 0 0 pgpool001:postgresql app001:58442 CLOSE_WAIT  13835/pgpool:
>> 
>> They all seem to be waiting on the same FD (12 in this case):
>> 
>> # strace -p 14149
>> Process 14149 attached - interrupt to quit
>> select(13, [12], NULL, [12], NULL^C <unfinished ...>
>> 
>> These sockets are all connected to the same backend:
>> 
>> # readlink /proc/14149/fd/12
>> socket:[2247671]
>> 
>> # for s in $(
>>     for p in $(
>>         netstat -ap | grep app001.*CLO |
>>         sed 's/.*CLOSE_WAIT *\([0-9]*\).pgpool.*/\1/')
>>     do
>>         readlink /proc/$p/fd/12;
>>     done |
>>         sed 's/.*socket:\[\([0-9]*\)\]/\1/'
>> )
>> do
>>     netstat -apeev 2>/dev/null | grep $s
>> done
>> tcp 0 0 pgpool001:39096 pgslave001:postgresql ESTABLISHED 109 2247671 14149/pgpool:
>> tcp 0 0 pgpool001:39001 pgslave001:postgresql ESTABLISHED 109 2249926 13684/pgpool:
>> tcp 0 0 pgpool001:39110 pgslave001:postgresql ESTABLISHED 109 2250306 13835/pgpool:
>> 
>> # pcp_node_info 1 localhost 9898 postgres XXX 1
>> pgslave001 postgresql 2 0.312500
>> 
>> -- 
>> Paul Wood
>> _______________________________________________
>> pgpool-general mailing list
>> pgpool-general at pgpool.net
>> http://www.pgpool.net/mailman/listinfo/pgpool-general
> 
> 
> -- 
> Yugo Nagata <nagata at sraoss.co.jp>
