[Pgpool-general] seemingly hung pgpool process consuming 100% CPU

Lonni J Friedman netllama at gmail.com
Tue Sep 20 16:25:19 UTC 2011


Nevermind, I figured out what I was doing wrong.  Now I just need for
this to hang again.

On Tue, Sep 20, 2011 at 9:08 AM, Lonni J Friedman <netllama at gmail.com> wrote:
> I tried to do that, but pgpool refuses to start reporting:
> -bash: /usr/lib/debug/usr/sbin/pgpool.debug: bad ELF interpreter: No
> such file or directory
>
> I'm puzzled why it fails, as it was built on the same server where I
> built the (working) release build of pgpool.
>
> $ file /usr/lib/debug/usr/sbin/pgpool.debug
> /usr/lib/debug/usr/sbin/pgpool.debug: ELF 64-bit LSB executable,
> x86-64, version 1 (GNU/Linux), dynamically linked (uses shared libs),
> for GNU/Linux 2.6.18, not stripped
> $ file /usr/sbin/pgpool
> /usr/sbin/pgpool: ELF 64-bit LSB executable, x86-64, version 1 (SYSV),
> dynamically linked (uses shared libs), for GNU/Linux 2.6.18, stripped
>
>
>
>
> On Mon, Sep 19, 2011 at 7:10 PM, Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
>> It's really hard to find the cause of the problem from a stack trace
>> without symbol tables... Is it possible to reinstalll pgpool binary
>> with debug symbols?
>> --
>> Tatsuo Ishii
>> SRA OSS, Inc. Japan
>> English: http://www.sraoss.co.jp/index_en.php
>> Japanese: http://www.sraoss.co.jp
>>
>>> This happened again.  I ran the gdb command that you requested,
>>> however it occurred to me that the output may not be all that useful
>>> since I'm not running a debug build of pgpool:
>>> ###############
>>> # gdb pgpool 2343
>>> GNU gdb (GDB) Red Hat Enterprise Linux (7.1-29.el6)
>>> Copyright (C) 2010 Free Software Foundation, Inc.
>>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>>> This is free software: you are free to change and redistribute it.
>>> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
>>> and "show warranty" for details.
>>> This GDB was configured as "x86_64-redhat-linux-gnu".
>>> For bug reporting instructions, please see:
>>> <http://www.gnu.org/software/gdb/bugs/>...
>>> Reading symbols from /usr/sbin/pgpool...(no debugging symbols found)...done.
>>> Attaching to program: /usr/sbin/pgpool, process 2343
>>> Reading symbols from /usr/lib64/libpq.so.5...(no debugging symbols
>>> found)...done.
>>> Loaded symbols for /usr/lib64/libpq.so.5
>>> Reading symbols from /usr/lib64/libpcp.so.0...(no debugging symbols
>>> found)...done.
>>> Loaded symbols for /usr/lib64/libpcp.so.0
>>> Reading symbols from /lib64/libpam.so.0...(no debugging symbols found)...done.
>>> Loaded symbols for /lib64/libpam.so.0
>>> Reading symbols from /usr/lib64/libssl.so.10...(no debugging symbols
>>> found)...done.
>>> Loaded symbols for /usr/lib64/libssl.so.10
>>> Reading symbols from /usr/lib64/libcrypto.so.10...(no debugging
>>> symbols found)...done.
>>> Loaded symbols for /usr/lib64/libcrypto.so.10
>>> Reading symbols from /lib64/libcrypt.so.1...(no debugging symbols found)...done.
>>> Loaded symbols for /lib64/libcrypt.so.1
>>> Reading symbols from /lib64/libresolv.so.2...(no debugging symbols
>>> found)...done.
>>> Loaded symbols for /lib64/libresolv.so.2
>>> Reading symbols from /lib64/libnsl.so.1...(no debugging symbols found)...done.
>>> Loaded symbols for /lib64/libnsl.so.1
>>> Reading symbols from /lib64/libm.so.6...(no debugging symbols found)...done.
>>> Loaded symbols for /lib64/libm.so.6
>>> Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
>>> Loaded symbols for /lib64/libc.so.6
>>> Reading symbols from /lib64/libgssapi_krb5.so.2...(no debugging
>>> symbols found)...done.
>>> Loaded symbols for /lib64/libgssapi_krb5.so.2
>>> Reading symbols from /usr/lib64/libldap_r-2.4.so.2...(no debugging
>>> symbols found)...done.
>>> Loaded symbols for /usr/lib64/libldap_r-2.4.so.2
>>> Reading symbols from /lib64/libpthread.so.0...(no debugging symbols
>>> found)...done.
>>> [Thread debugging using libthread_db enabled]
>>> Loaded symbols for /lib64/libpthread.so.0
>>> Reading symbols from /lib64/libaudit.so.1...(no debugging symbols found)...done.
>>> Loaded symbols for /lib64/libaudit.so.1
>>> Reading symbols from /lib64/libdl.so.2...(no debugging symbols found)...done.
>>> Loaded symbols for /lib64/libdl.so.2
>>> Reading symbols from /lib64/libkrb5.so.3...(no debugging symbols found)...done.
>>> Loaded symbols for /lib64/libkrb5.so.3
>>> Reading symbols from /lib64/libcom_err.so.2...(no debugging symbols
>>> found)...done.
>>> Loaded symbols for /lib64/libcom_err.so.2
>>> Reading symbols from /lib64/libk5crypto.so.3...(no debugging symbols
>>> found)...done.
>>> Loaded symbols for /lib64/libk5crypto.so.3
>>> Reading symbols from /lib64/libz.so.1...(no debugging symbols found)...done.
>>> Loaded symbols for /lib64/libz.so.1
>>> Reading symbols from /usr/lib64/libfreebl3.so...(no debugging symbols
>>> found)...done.
>>> Loaded symbols for /usr/lib64/libfreebl3.so
>>> Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging
>>> symbols found)...done.
>>> Loaded symbols for /lib64/ld-linux-x86-64.so.2
>>> Reading symbols from /lib64/libkrb5support.so.0...(no debugging
>>> symbols found)...done.
>>> Loaded symbols for /lib64/libkrb5support.so.0
>>> Reading symbols from /lib64/libkeyutils.so.1...(no debugging symbols
>>> found)...done.
>>> Loaded symbols for /lib64/libkeyutils.so.1
>>> Reading symbols from /usr/lib64/liblber-2.4.so.2...(no debugging
>>> symbols found)...done.
>>> Loaded symbols for /usr/lib64/liblber-2.4.so.2
>>> Reading symbols from /usr/lib64/libsasl2.so.2...(no debugging symbols
>>> found)...done.
>>> Loaded symbols for /usr/lib64/libsasl2.so.2
>>> Reading symbols from /lib64/libselinux.so.1...(no debugging symbols
>>> found)...done.
>>> Loaded symbols for /lib64/libselinux.so.1
>>> Reading symbols from /lib64/libnss_files.so.2...(no debugging symbols
>>> found)...done.
>>> Loaded symbols for /lib64/libnss_files.so.2
>>> Reading symbols from /lib64/libnss_dns.so.2...(no debugging symbols
>>> found)...done.
>>> Loaded symbols for /lib64/libnss_dns.so.2
>>> 0x000000000044790a in ?? ()
>>> Missing separate debuginfos, use: debuginfo-install pgpool-II-3.0.4-1.el6.x86_64
>>> (gdb) bt
>>> #0  0x000000000044790a in ?? ()
>>> #1  0x0000000000414547 in ?? ()
>>> #2  0x000000000041762e in ?? ()
>>> #3  0x000000000040a4cd in ?? ()
>>> #4  0x0000000000405345 in ?? ()
>>> #5  0x00000000004068dc in ?? ()
>>> #6  0x00000000004076dc in ?? ()
>>> #7  0x00000031ae41ec5d in __libc_start_main () from /lib64/libc.so.6
>>> #8  0x0000000000403bf9 in ?? ()
>>> #9  0x00007fff0663cfc8 in ?? ()
>>> #10 0x000000000000001c in ?? ()
>>> #11 0x0000000000000004 in ?? ()
>>> #12 0x00007fff0663d90b in ?? ()
>>> #13 0x00007fff0663dfe6 in ?? ()
>>> #14 0x00007fff0663dfe6 in ?? ()
>>> #15 0x00007fff0663dfe6 in ?? ()
>>> #16 0x0000000000000000 in ?? ()
>>> (gdb) cont
>>> Continuing.
>>>
>>> ###############
>>>
>>> The entire session completely hung at the end there.
>>>
>>>
>>> On Wed, Sep 14, 2011 at 3:57 PM, Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
>>>> Please use gdb. For example,
>>>>
>>>> become postgres user (or root user)
>>>> gdb pgpool 29191
>>>> bt
>>>> cont
>>>> bt
>>>> cont
>>>> :
>>>> :
>>>> :
>>>>
>>>> This will give us an idea where it's looping.
>>>> --
>>>> Tatsuo Ishii
>>>> SRA OSS, Inc. Japan
>>>> English: http://www.sraoss.co.jp/index_en.php
>>>> Japanese: http://www.sraoss.co.jp
>>>>
>>>>> This problem has returned yet again:
>>>>>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>>>>> 29191 postgres  20   0 80192  14m 1544 R 89.8  0.2  51:15.91 pgpool
>>>>>
>>>>> postgres 29191  3.4  0.1  80192 14728 ?        R    Sep13  51:40
>>>>> pgpool: lfriedman nightly 10.31.96.84(61698) idle
>>>>>
>>>>>
>>>>> I'd really appreciate some input on how to debug this.
>>>>>
>>>>>
>>>>> On Fri, Sep 9, 2011 at 8:11 AM, Lonni J Friedman <netllama at gmail.com> wrote:
>>>>>> No one else has experienced this or has suggestions how to debug it?
>>>>>>
>>>>>> On Wed, Sep 7, 2011 at 12:49 PM, Lonni J Friedman <netllama at gmail.com> wrote:
>>>>>>> Greetings,
>>>>>>> I'm running pgpool-3.0.4 on a Linux-x86_64 server serving as a load
>>>>>>> balancer for a three server postgresql-9.0.4 cluster (1 master, 2
>>>>>>> standby).  I'm seeing strange behavior where a single pgpool process
>>>>>>> seems to hang after some period of time, and then consume 100% of the
>>>>>>> CPU.  I've seen this behavior happen twice since last Friday (when
>>>>>>> pgpool was brought online in my production environment).  At the
>>>>>>> moment the current hung process looks like this in 'ps auxww' output:
>>>>>>>
>>>>>>> postgres 19838 98.7  0.0  68856  2904 ?        R    Sep06 1027:36
>>>>>>> pgpool: lfriedman nightly 10.31.45.20(58277) idle
>>>>>>>
>>>>>>>
>>>>>>> In top, I see:
>>>>>>>  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>>>>>>> 19838 postgres  20   0 68856 2904 1072 R 100.0  0.0   1027:29 pgpool
>>>>>>>
>>>>>>>
>>>>>>> When to connect to the process with strace, there is no output, so I'm
>>>>>>> guessing the process is stuck spinning somewhere:
>>>>>>> # strace -p 19838
>>>>>>> Process 19838 attached - interrupt to quit
>>>>>>> ...
>>>>>>> ^CProcess 19838 detached
>>>>>>>
>>>>>>> One thing that i'm certain of is that the client IP (10.31.45.20)
>>>>>>> associated with the hung process has rebooted at least once since that
>>>>>>> process was spawned.  So pgpool seems to be in some confused state, as
>>>>>>> the client definitely severed the connection already.  I checked the
>>>>>>> pgpool log and there are no explicit references to PID 19838.  I'm at
>>>>>>> a loss how to debug this further, but clearly something is wrong
>>>>>>> somewhere, and this isn't normal/expected behavior.


More information about the Pgpool-general mailing list