[pgpool-general: 2897] Hanging sockets and pool shutdown failure with connection caching

Paul Wood pauwood at gumtree.com
Fri May 30 23:46:24 JST 2014


Hi,

I have a pgpool with four backends in master/slave streaming replication
mode with connection caching, running pgpool-II 3.3.3 and PostgreSQL 9.3.
I'm seeing a problem where, if I restart the app, the existing pools don't
shut down properly, leaving connections hanging in CLOSE_WAIT. Running
strace on the processes shows that they're blocked in select() on a
connection to a backend. The
strange thing is, it's always the same backend; I've tried swapping
backends around and the problem always stays on the 2nd backend (node 1,
or node 2 if node 1 is omitted). In every other regard, the backend
behaves normally and I can see the disconnections in the postgres logs
on the backend in question. If I set connection_cache = off, the problem
goes away and the pools shut down properly when the app restarts.

A search of the archives threw up this:
http://www.sraoss.jp/pipermail/pgpool-general/2012-December/001283.html

Here are the connections in CLOSE_WAIT:

# netstat -ap | grep app001.*CLO
tcp 0 0 pgpool001:postgresql app001:58410 CLOSE_WAIT  14149/pgpool:
tcp 0 0 pgpool001:postgresql app001:58046 CLOSE_WAIT  13684/pgpool:
tcp 0 0 pgpool001:postgresql app001:58442 CLOSE_WAIT  13835/pgpool:

They all seem to be waiting on the same FD (12 in this case):

# strace -p 14149
Process 14149 attached - interrupt to quit
select(13, [12], NULL, [12], NULL^C <unfinished ...>

These sockets are all connected to the same backend:

# readlink /proc/14149/fd/12
socket:[2247671]

# for s in $(
    for p in $(
        netstat -ap | grep app001.*CLO |
        sed 's/.*CLOSE_WAIT *\([0-9]*\).pgpool.*/\1/')
    do
        readlink /proc/$p/fd/12;
    done |
        sed 's/.*socket:\[\([0-9]*\)\]/\1/'
)
do
    netstat -apeev 2>/dev/null | grep $s
done
tcp 0 0 pgpool001:39096 pgslave001:postgresql ESTABLISHED 109 2247671 14149/pgpool:
tcp 0 0 pgpool001:39001 pgslave001:postgresql ESTABLISHED 109 2249926 13684/pgpool:
tcp 0 0 pgpool001:39110 pgslave001:postgresql ESTABLISHED 109 2250306 13835/pgpool:
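
The loop above hardcodes FD 12, which happens to be the backend socket in
this case. A rough sketch of the same lookup that walks every descriptor
of a process instead, assuming a Linux /proc layout and the netstat output
format shown above (the function names are made up for illustration and
are not part of pgpool):

```shell
# Print the PID from a "netstat -ap" line ending in "PID/pgpool:".
pid_from_netstat() {
    sed -n 's/.*CLOSE_WAIT *\([0-9][0-9]*\)\/pgpool.*/\1/p'
}

# Print the inode from readlink output such as "socket:[2247671]".
# Non-socket descriptors (files, pipes) produce no output.
inode_from_link() {
    sed -n 's/^socket:\[\([0-9][0-9]*\)]$/\1/p'
}

# Print the inode of every socket held open by the given PID,
# rather than assuming the backend connection is always FD 12.
socket_inodes_for_pid() {
    for fd in /proc/"$1"/fd/*; do
        readlink "$fd" 2>/dev/null | inode_from_link
    done
}
```

Each inode printed by socket_inodes_for_pid can then be grepped out of
"netstat -apeev" as in the loop above. This lists the client socket as
well as the backend ones, so it also shows whether a stuck child still
holds connections to the other backends.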

pgpool itself reports node 1 as up with connections pooled (status 2):

# pcp_node_info 1 localhost 9898 postgres XXX 1
pgslave001 postgresql 2 0.312500

-- 
Paul Wood

