[Pgpool-general] seemingly hung pgpool process consuming 100% CPU

Lonni J Friedman netllama at gmail.com
Tue Nov 22 18:11:27 UTC 2011


This hadn't reproduced in a long time, but we ugpraded to pgpool-3.1 a
week ago, and this morning I found a pgpool process that was consuming
100% CPU, and had been running for a week (although wasn't consuming
100% CPU the entire time).  Something else weird is that it showed an
active, idle connection from a client system which had only been up
for the past 21 hours.  Anyway, here's the backtrace from the process
(gdb hung at the very bottom):

[root ~]# gdb pgpool 31293
GNU gdb (GDB) Red Hat Enterprise Linux (7.1-29.el6)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/sbin/pgpool...Reading symbols from
/usr/lib/debug/usr/sbin/pgpool.debug...done.
done.
Attaching to program: /usr/sbin/pgpool, process 31293
Reading symbols from /usr/lib64/libpq.so.5...(no debugging symbols
found)...done.
Loaded symbols for /usr/lib64/libpq.so.5
Reading symbols from /usr/lib64/libpcp.so.0...Reading symbols from
/usr/lib/debug/usr/lib64/libpcp.so.0.0.0.debug...done.
done.
Loaded symbols for /usr/lib64/libpcp.so.0
Reading symbols from /lib64/libpam.so.0...(no debugging symbols found)...done.
Loaded symbols for /lib64/libpam.so.0
Reading symbols from /usr/lib64/libssl.so.10...(no debugging symbols
found)...done.
Loaded symbols for /usr/lib64/libssl.so.10
Reading symbols from /usr/lib64/libcrypto.so.10...(no debugging
symbols found)...done.
Loaded symbols for /usr/lib64/libcrypto.so.10
Reading symbols from /lib64/libcrypt.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libcrypt.so.1
Reading symbols from /lib64/libresolv.so.2...(no debugging symbols
found)...done.
Loaded symbols for /lib64/libresolv.so.2
Reading symbols from /lib64/libnsl.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libnsl.so.1
Reading symbols from /lib64/libm.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libm.so.6
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/libgssapi_krb5.so.2...(no debugging
symbols found)...done.
Loaded symbols for /lib64/libgssapi_krb5.so.2
Reading symbols from /usr/lib64/libldap_r-2.4.so.2...(no debugging
symbols found)...done.
Loaded symbols for /usr/lib64/libldap_r-2.4.so.2
Reading symbols from /lib64/libpthread.so.0...(no debugging symbols
found)...done.
[Thread debugging using libthread_db enabled]
Loaded symbols for /lib64/libpthread.so.0
Reading symbols from /lib64/libaudit.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libaudit.so.1
Reading symbols from /lib64/libdl.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/libkrb5.so.3...(no debugging symbols found)...done.
Loaded symbols for /lib64/libkrb5.so.3
Reading symbols from /lib64/libcom_err.so.2...(no debugging symbols
found)...done.
Loaded symbols for /lib64/libcom_err.so.2
Reading symbols from /lib64/libk5crypto.so.3...(no debugging symbols
found)...done.
Loaded symbols for /lib64/libk5crypto.so.3
Reading symbols from /lib64/libz.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libz.so.1
Reading symbols from /usr/lib64/libfreebl3.so...(no debugging symbols
found)...done.
Loaded symbols for /usr/lib64/libfreebl3.so
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging
symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /lib64/libkrb5support.so.0...(no debugging
symbols found)...done.
Loaded symbols for /lib64/libkrb5support.so.0
Reading symbols from /lib64/libkeyutils.so.1...(no debugging symbols
found)...done.
Loaded symbols for /lib64/libkeyutils.so.1
Reading symbols from /usr/lib64/liblber-2.4.so.2...(no debugging
symbols found)...done.
Loaded symbols for /usr/lib64/liblber-2.4.so.2
Reading symbols from /usr/lib64/libsasl2.so.2...(no debugging symbols
found)...done.
Loaded symbols for /usr/lib64/libsasl2.so.2
Reading symbols from /lib64/libselinux.so.1...(no debugging symbols
found)...done.
Loaded symbols for /lib64/libselinux.so.1
Reading symbols from /lib64/libnss_files.so.2...(no debugging symbols
found)...done.
Loaded symbols for /lib64/libnss_files.so.2
0x00000000004192c0 in pool_process_query (frontend=0x2dd8fd0,
backend=0x187d540,
    reset_request=<value optimized out>) at pool_process_query.c:379
379				if (!pool_read_buffer_is_empty(MASTER(backend)) ||
pool_is_query_in_progress())
Missing separate debuginfos, use: debuginfo-install
audit-libs-2.0.4-1.el6.x86_64 cyrus-sasl-lib-2.1.23-8.el6.x86_64
glibc-2.12-1.7.el6.x86_64 keyutils-libs-1.4-1.el6.x86_64
krb5-libs-1.8.2-3.el6.x86_64 libcom_err-1.41.12-3.el6.x86_64
libselinux-2.0.94-2.el6.x86_64
nss-softokn-freebl-3.12.7-1.1.el6.x86_64 openldap-2.4.19-15.el6.x86_64
openssl-1.0.0-4.el6.x86_64 pam-1.1.1-4.el6.x86_64
postgresql-libs-8.4.4-2.el6.x86_64 zlib-1.2.3-25.el6.x86_64
(gdb) bt
#0  0x00000000004192c0 in pool_process_query (frontend=0x2dd8fd0,
backend=0x187d540,
    reset_request=<value optimized out>) at pool_process_query.c:379
#1  0x000000000040ae42 in do_child (unix_fd=3, inet_fd=<value
optimized out>) at child.c:354
#2  0x00000000004054c5 in fork_a_child (unix_fd=3, inet_fd=4, id=152)
at main.c:1072
#3  0x0000000000407b1c in main (argc=<value optimized out>,
argv=<value optimized out>)
    at main.c:549
(gdb) cont
Continuing.

On Tue, Sep 20, 2011 at 9:25 AM, Lonni J Friedman <netllama at gmail.com> wrote:
> Nevermind, I figured out what I was doing wrong.  Now I just need for
> this to hang again.
>
> On Tue, Sep 20, 2011 at 9:08 AM, Lonni J Friedman <netllama at gmail.com> wrote:
>> I tried to do that, but pgpool refuses to start reporting:
>> -bash: /usr/lib/debug/usr/sbin/pgpool.debug: bad ELF interpreter: No
>> such file or directory
>>
>> I'm puzzled why it fails, as it was built on the same server where I
>> built the (working) release build of pgpool.
>>
>> $ file /usr/lib/debug/usr/sbin/pgpool.debug
>> /usr/lib/debug/usr/sbin/pgpool.debug: ELF 64-bit LSB executable,
>> x86-64, version 1 (GNU/Linux), dynamically linked (uses shared libs),
>> for GNU/Linux 2.6.18, not stripped
>> $ file /usr/sbin/pgpool
>> /usr/sbin/pgpool: ELF 64-bit LSB executable, x86-64, version 1 (SYSV),
>> dynamically linked (uses shared libs), for GNU/Linux 2.6.18, stripped
>>
>>
>>
>>
>> On Mon, Sep 19, 2011 at 7:10 PM, Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
>>> It's really hard to find the cause of the problem from a stack trace
>>> without symbol tables... Is it possible to reinstalll pgpool binary
>>> with debug symbols?
>>> --
>>> Tatsuo Ishii
>>> SRA OSS, Inc. Japan
>>> English: http://www.sraoss.co.jp/index_en.php
>>> Japanese: http://www.sraoss.co.jp
>>>
>>>> This happened again.  I ran the gdb command that you requested,
>>>> however it occurred to me that the output may not be all that useful
>>>> since I'm not running a debug build of pgpool:
>>>> ###############
>>>> # gdb pgpool 2343
>>>> GNU gdb (GDB) Red Hat Enterprise Linux (7.1-29.el6)
>>>> Copyright (C) 2010 Free Software Foundation, Inc.
>>>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>>>> This is free software: you are free to change and redistribute it.
>>>> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
>>>> and "show warranty" for details.
>>>> This GDB was configured as "x86_64-redhat-linux-gnu".
>>>> For bug reporting instructions, please see:
>>>> <http://www.gnu.org/software/gdb/bugs/>...
>>>> Reading symbols from /usr/sbin/pgpool...(no debugging symbols found)...done.
>>>> Attaching to program: /usr/sbin/pgpool, process 2343
>>>> Reading symbols from /usr/lib64/libpq.so.5...(no debugging symbols
>>>> found)...done.
>>>> Loaded symbols for /usr/lib64/libpq.so.5
>>>> Reading symbols from /usr/lib64/libpcp.so.0...(no debugging symbols
>>>> found)...done.
>>>> Loaded symbols for /usr/lib64/libpcp.so.0
>>>> Reading symbols from /lib64/libpam.so.0...(no debugging symbols found)...done.
>>>> Loaded symbols for /lib64/libpam.so.0
>>>> Reading symbols from /usr/lib64/libssl.so.10...(no debugging symbols
>>>> found)...done.
>>>> Loaded symbols for /usr/lib64/libssl.so.10
>>>> Reading symbols from /usr/lib64/libcrypto.so.10...(no debugging
>>>> symbols found)...done.
>>>> Loaded symbols for /usr/lib64/libcrypto.so.10
>>>> Reading symbols from /lib64/libcrypt.so.1...(no debugging symbols found)...done.
>>>> Loaded symbols for /lib64/libcrypt.so.1
>>>> Reading symbols from /lib64/libresolv.so.2...(no debugging symbols
>>>> found)...done.
>>>> Loaded symbols for /lib64/libresolv.so.2
>>>> Reading symbols from /lib64/libnsl.so.1...(no debugging symbols found)...done.
>>>> Loaded symbols for /lib64/libnsl.so.1
>>>> Reading symbols from /lib64/libm.so.6...(no debugging symbols found)...done.
>>>> Loaded symbols for /lib64/libm.so.6
>>>> Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
>>>> Loaded symbols for /lib64/libc.so.6
>>>> Reading symbols from /lib64/libgssapi_krb5.so.2...(no debugging
>>>> symbols found)...done.
>>>> Loaded symbols for /lib64/libgssapi_krb5.so.2
>>>> Reading symbols from /usr/lib64/libldap_r-2.4.so.2...(no debugging
>>>> symbols found)...done.
>>>> Loaded symbols for /usr/lib64/libldap_r-2.4.so.2
>>>> Reading symbols from /lib64/libpthread.so.0...(no debugging symbols
>>>> found)...done.
>>>> [Thread debugging using libthread_db enabled]
>>>> Loaded symbols for /lib64/libpthread.so.0
>>>> Reading symbols from /lib64/libaudit.so.1...(no debugging symbols found)...done.
>>>> Loaded symbols for /lib64/libaudit.so.1
>>>> Reading symbols from /lib64/libdl.so.2...(no debugging symbols found)...done.
>>>> Loaded symbols for /lib64/libdl.so.2
>>>> Reading symbols from /lib64/libkrb5.so.3...(no debugging symbols found)...done.
>>>> Loaded symbols for /lib64/libkrb5.so.3
>>>> Reading symbols from /lib64/libcom_err.so.2...(no debugging symbols
>>>> found)...done.
>>>> Loaded symbols for /lib64/libcom_err.so.2
>>>> Reading symbols from /lib64/libk5crypto.so.3...(no debugging symbols
>>>> found)...done.
>>>> Loaded symbols for /lib64/libk5crypto.so.3
>>>> Reading symbols from /lib64/libz.so.1...(no debugging symbols found)...done.
>>>> Loaded symbols for /lib64/libz.so.1
>>>> Reading symbols from /usr/lib64/libfreebl3.so...(no debugging symbols
>>>> found)...done.
>>>> Loaded symbols for /usr/lib64/libfreebl3.so
>>>> Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging
>>>> symbols found)...done.
>>>> Loaded symbols for /lib64/ld-linux-x86-64.so.2
>>>> Reading symbols from /lib64/libkrb5support.so.0...(no debugging
>>>> symbols found)...done.
>>>> Loaded symbols for /lib64/libkrb5support.so.0
>>>> Reading symbols from /lib64/libkeyutils.so.1...(no debugging symbols
>>>> found)...done.
>>>> Loaded symbols for /lib64/libkeyutils.so.1
>>>> Reading symbols from /usr/lib64/liblber-2.4.so.2...(no debugging
>>>> symbols found)...done.
>>>> Loaded symbols for /usr/lib64/liblber-2.4.so.2
>>>> Reading symbols from /usr/lib64/libsasl2.so.2...(no debugging symbols
>>>> found)...done.
>>>> Loaded symbols for /usr/lib64/libsasl2.so.2
>>>> Reading symbols from /lib64/libselinux.so.1...(no debugging symbols
>>>> found)...done.
>>>> Loaded symbols for /lib64/libselinux.so.1
>>>> Reading symbols from /lib64/libnss_files.so.2...(no debugging symbols
>>>> found)...done.
>>>> Loaded symbols for /lib64/libnss_files.so.2
>>>> Reading symbols from /lib64/libnss_dns.so.2...(no debugging symbols
>>>> found)...done.
>>>> Loaded symbols for /lib64/libnss_dns.so.2
>>>> 0x000000000044790a in ?? ()
>>>> Missing separate debuginfos, use: debuginfo-install pgpool-II-3.0.4-1.el6.x86_64
>>>> (gdb) bt
>>>> #0  0x000000000044790a in ?? ()
>>>> #1  0x0000000000414547 in ?? ()
>>>> #2  0x000000000041762e in ?? ()
>>>> #3  0x000000000040a4cd in ?? ()
>>>> #4  0x0000000000405345 in ?? ()
>>>> #5  0x00000000004068dc in ?? ()
>>>> #6  0x00000000004076dc in ?? ()
>>>> #7  0x00000031ae41ec5d in __libc_start_main () from /lib64/libc.so.6
>>>> #8  0x0000000000403bf9 in ?? ()
>>>> #9  0x00007fff0663cfc8 in ?? ()
>>>> #10 0x000000000000001c in ?? ()
>>>> #11 0x0000000000000004 in ?? ()
>>>> #12 0x00007fff0663d90b in ?? ()
>>>> #13 0x00007fff0663dfe6 in ?? ()
>>>> #14 0x00007fff0663dfe6 in ?? ()
>>>> #15 0x00007fff0663dfe6 in ?? ()
>>>> #16 0x0000000000000000 in ?? ()
>>>> (gdb) cont
>>>> Continuing.
>>>>
>>>> ###############
>>>>
>>>> The entire session completely hung at the end there.
>>>>
>>>>
>>>> On Wed, Sep 14, 2011 at 3:57 PM, Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
>>>>> Please use gdb. For example,
>>>>>
>>>>> become postgres user (or root user)
>>>>> gdb pgpool 29191
>>>>> bt
>>>>> cont
>>>>> bt
>>>>> cont
>>>>> :
>>>>> :
>>>>> :
>>>>>
>>>>> This will give us an idea where it's looping.
>>>>> --
>>>>> Tatsuo Ishii
>>>>> SRA OSS, Inc. Japan
>>>>> English: http://www.sraoss.co.jp/index_en.php
>>>>> Japanese: http://www.sraoss.co.jp
>>>>>
>>>>>> This problem has returned yet again:
>>>>>>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>>>>>> 29191 postgres  20   0 80192  14m 1544 R 89.8  0.2  51:15.91 pgpool
>>>>>>
>>>>>> postgres 29191  3.4  0.1  80192 14728 ?        R    Sep13  51:40
>>>>>> pgpool: lfriedman nightly 10.31.96.84(61698) idle
>>>>>>
>>>>>>
>>>>>> I'd really appreciate some input on how to debug this.
>>>>>>
>>>>>>
>>>>>> On Fri, Sep 9, 2011 at 8:11 AM, Lonni J Friedman <netllama at gmail.com> wrote:
>>>>>>> No one else has experienced this or has suggestions how to debug it?
>>>>>>>
>>>>>>> On Wed, Sep 7, 2011 at 12:49 PM, Lonni J Friedman <netllama at gmail.com> wrote:
>>>>>>>> Greetings,
>>>>>>>> I'm running pgpool-3.0.4 on a Linux-x86_64 server serving as a load
>>>>>>>> balancer for a three server postgresql-9.0.4 cluster (1 master, 2
>>>>>>>> standby).  I'm seeing strange behavior where a single pgpool process
>>>>>>>> seems to hang after some period of time, and then consume 100% of the
>>>>>>>> CPU.  I've seen this behavior happen twice since last Friday (when
>>>>>>>> pgpool was brought online in my production environment).  At the
>>>>>>>> moment the current hung process looks like this in 'ps auxww' output:
>>>>>>>>
>>>>>>>> postgres 19838 98.7  0.0  68856  2904 ?        R    Sep06 1027:36
>>>>>>>> pgpool: lfriedman nightly 10.31.45.20(58277) idle
>>>>>>>>
>>>>>>>>
>>>>>>>> In top, I see:
>>>>>>>>  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>>>>>>>> 19838 postgres  20   0 68856 2904 1072 R 100.0  0.0   1027:29 pgpool
>>>>>>>>
>>>>>>>>
>>>>>>>> When to connect to the process with strace, there is no output, so I'm
>>>>>>>> guessing the process is stuck spinning somewhere:
>>>>>>>> # strace -p 19838
>>>>>>>> Process 19838 attached - interrupt to quit
>>>>>>>> ...
>>>>>>>> ^CProcess 19838 detached
>>>>>>>>
>>>>>>>> One thing that i'm certain of is that the client IP (10.31.45.20)
>>>>>>>> associated with the hung process has rebooted at least once since that
>>>>>>>> process was spawned.  So pgpool seems to be in some confused state, as
>>>>>>>> the client definitely severed the connection already.  I checked the
>>>>>>>> pgpool log and there are no explicit references to PID 19838.  I'm at
>>>>>>>> a loss how to debug this further, but clearly something is wrong
>>>>>>>> somewhere, and this isn't normal/expected behavior.
>


More information about the Pgpool-general mailing list