[pgpool-general: 2759] All child processes hang at read_packets_and_process

Junegunn Choi junegunn.c at gmail.com
Mon Apr 14 12:42:33 JST 2014



Hi,

We're having an occasional problem where all child processes hang at
`read_packets_and_process`, waiting for packets from the backend even though
the backend connection is actually idle and not executing any query. When
this happens, our only option is to restart pgpool.

I'm aware of the `child_idle_limit` parameter, but unfortunately we can't set
it to a small value because some of our queries take a few minutes, so it
doesn't help much in reducing the downtime.

We have no idea when or why this happens. Our pgpool process currently
handles around 500 connections/second, and the hang occurs maybe once every
couple of days.

Any idea what could have gone wrong? Any help is greatly appreciated.

# Callstack of each child process (all identical):

    (gdb) where
    #0  0x0000003d222cdaf3 in __select_nocancel () from /lib64/libc.so.6
    #1  0x000000000041addf in read_packets_and_process (frontend=0xac6cb60, backend=0xac5da40, reset_request=0, state=0x7fffb871aad0, num_fields=0x7fffb871aad6, cont=0x7fffb871aacc "\001") at pool_process_query.c:4859
    #2  0x000000000041b741 in pool_process_query (frontend=0xac6cb60, backend=0xac5da40, reset_request=0) at pool_process_query.c:260
    #3  0x000000000040ad7a in do_child (unix_fd=5, inet_fd=6) at child.c:355
    #4  0x000000000040455f in fork_a_child (unix_fd=5, inet_fd=6, id=91) at main.c:1258
    #5  0x0000000000404887 in reaper () at main.c:2482
    #6  0x0000000000404c15 in pool_sleep (second=<value optimized out>) at main.c:2679
    #7  0x00000000004079fa in main (argc=<value optimized out>, argv=<value optimized out>) at main.c:856
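
Frame #0 in the backtrace is a `select()` call. As a minimal sketch, not pgpool's actual code, the following C helper shows the difference between a `select()` that waits with no timeout and one that gives up after a bounded time. The name `wait_readable` and its `timeout_sec` parameter are hypothetical and exist only for this illustration; whether pgpool passes a timeout at this call site is not something the backtrace shows.

```c
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical helper for illustration only (not part of pgpool).
 * Waits until fd becomes readable.
 *   timeout_sec <  0: pass a NULL timeout, so select() blocks until
 *                     data arrives, however long that takes.
 *   timeout_sec >= 0: give up after that many seconds.
 * Returns 1 if fd is readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_sec)
{
    fd_set readfds;
    struct timeval tv;

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    if (timeout_sec < 0) {
        /* Unbounded wait: a silent peer leaves the caller parked here. */
        return select(fd + 1, &readfds, NULL, NULL, NULL);
    }

    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;

    /* Bounded wait: returns 0 if nothing shows up before the deadline. */
    return select(fd + 1, &readfds, NULL, NULL, &tv);
}
```

Called on the read end of an empty pipe with `timeout_sec` set to 1, the helper returns 0 after about a second. With a negative `timeout_sec` the same call would block until data arrives, which is what a process stuck in `select()` looks like from the outside.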

# `ps` output when it happens:

    pgpool   12894  0.0  0.0  22452  1800 ?        S    Mar26   0:00 \_ /db/pgpool/bin/pgpool -n -D -f /db/pgpool/conf/pg8/pgpool.conf
    pgpool   12918  0.0  0.0  22536  1356 ?        S    Mar26   0:23      |  \_ pgpool: PCP: wait for connection request
    pgpool   12919  0.0  0.0  22452  1004 ?        S    Mar26   0:00      |  \_ pgpool: worker process
    pgpool   26236  0.0  0.0  27240  2616 ?        S    07:54   0:10      |  \_ pgpool: cloud cloud 10.27.18.56(17208) idle
    pgpool   26936  0.0  0.0  27068  2468 ?        S    07:59   0:06      |  \_ pgpool: cloud cloud 10.30.25.96(24663) idle
    pgpool   27422  0.1  0.0  27308  2684 ?        S    08:02   0:12      |  \_ pgpool: cloud cloud 10.27.18.56(49400) idle
    pgpool   29357  0.0  0.0  27252  2644 ?        S    08:13   0:11      |  \_ pgpool: cloud cloud 10.27.33.88(17186) idle
    pgpool    2363  0.1  0.0  27244  2620 ?        S    08:45   0:10      |  \_ pgpool: cloud cloud 10.30.25.96(35038) idle
    pgpool    3672  0.1  0.0  27216  2612 ?        S    08:54   0:10      |  \_ pgpool: cloud cloud 10.27.18.56(49399) idle
    pgpool    4969  0.1  0.0  27168  2568 ?        S    09:00   0:09      |  \_ pgpool: cloud cloud 10.27.33.88(16792) idle
    pgpool    5981  0.1  0.0  27144  2540 ?        S    09:06   0:08      |  \_ pgpool: cloud cloud 10.27.33.88(16821) idle
    pgpool   10072  0.1  0.0  27308  2704 ?        S    09:30   0:11      |  \_ pgpool: cloud cloud 10.30.25.96(47224) idle
    pgpool   10825  0.0  0.0  27032  2404 ?        S    09:35   0:05      |  \_ pgpool: cloud cloud 10.27.33.88(22482) idle
    pgpool   12839  0.0  0.0  26960  2344 ?        S    09:46   0:04      |  \_ pgpool: cloud cloud 10.27.33.88(22429) idle
    pgpool   13630  0.0  0.0  26924  2304 ?        S    09:51   0:03      |  \_ pgpool: cloud cloud 10.27.18.56(45126) idle
    pgpool   13650  0.0  0.0  26932  2320 ?        S    09:51   0:03      |  \_ pgpool: cloud cloud 10.30.25.96(35105) idle
    pgpool   13869  0.0  0.0  26924  2300 ?        S    09:52   0:03      |  \_ pgpool: cloud cloud 10.30.25.96(34504) idle
    pgpool   15023  0.0  0.0  26916  2284 ?        S    09:58   0:03      |  \_ pgpool: cloud cloud 10.27.18.56(44532) idle
    pgpool   18577  0.2  0.0  27200  2580 ?        S    10:16   0:07      |  \_ pgpool: cloud cloud 10.27.33.88(29036) idle
    pgpool   27679  0.2  0.0  27080  2464 ?        S    10:37   0:05      |  \_ pgpool: cloud cloud 10.27.33.88(36875) idle
    pgpool   29064  0.1  0.0  26988  2364 ?        S    10:39   0:03      |  \_ pgpool: cloud cloud 10.30.25.96(46962) idle
    pgpool   30220  0.1  0.0  26936  2336 ?        S    10:40   0:03      |  \_ pgpool: cloud cloud 10.27.18.56(56669) idle
    pgpool   10215  0.0  0.0  26752  2148 ?        S    10:55   0:00      |  \_ pgpool: cloud cloud 10.30.25.96(46863) idle
    pgpool   26719  0.0  0.0  26752  1616 ?        S    11:16   0:00      |  \_ pgpool: cloud cloud 10.27.18.56(58985) idle


Thanks,
Junegunn Choi