[Pgpool-general] seemingly hung pgpool process consuming 100% CPU

Lonni J Friedman netllama at gmail.com
Wed Sep 14 23:06:10 UTC 2011


Thanks for your reply.  I'll do this the next time this happens (which
will likely be within a few days based on history).

On Wed, Sep 14, 2011 at 3:57 PM, Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
> Please use gdb. For example,
>
> become postgres user (or root user)
> gdb pgpool 29191
> bt
> cont
> bt
> cont
> :
> :
> :
>
> This will give us an idea where it's looping.
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese: http://www.sraoss.co.jp
>
>> This problem has returned yet again:
>>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>> 29191 postgres  20   0 80192  14m 1544 R 89.8  0.2  51:15.91 pgpool
>>
>> postgres 29191  3.4  0.1  80192 14728 ?        R    Sep13  51:40 pgpool: lfriedman nightly 10.31.96.84(61698) idle
>>
>>
>> I'd really appreciate some input on how to debug this.
>>
>>
>> On Fri, Sep 9, 2011 at 8:11 AM, Lonni J Friedman <netllama at gmail.com> wrote:
>>> No one else has experienced this or has suggestions how to debug it?
>>>
>>> On Wed, Sep 7, 2011 at 12:49 PM, Lonni J Friedman <netllama at gmail.com> wrote:
>>>> Greetings,
>>>> I'm running pgpool-3.0.4 on a Linux-x86_64 server serving as a load
>>>> balancer for a three server postgresql-9.0.4 cluster (1 master, 2
>>>> standby).  I'm seeing strange behavior where a single pgpool process
>>>> seems to hang after some period of time, and then consume 100% of the
>>>> CPU.  I've seen this behavior happen twice since last Friday (when
>>>> pgpool was brought online in my production environment).  At the
>>>> moment the current hung process looks like this in 'ps auxww' output:
>>>>
>>>> postgres 19838 98.7  0.0  68856  2904 ?        R    Sep06 1027:36 pgpool: lfriedman nightly 10.31.45.20(58277) idle
>>>>
>>>>
>>>> In top, I see:
>>>>  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>>>> 19838 postgres  20   0 68856 2904 1072 R 100.0  0.0   1027:29 pgpool
>>>>
>>>>
>>>> When I connect to the process with strace, there is no output, so I'm
>>>> guessing the process is stuck spinning somewhere:
>>>> # strace -p 19838
>>>> Process 19838 attached - interrupt to quit
>>>> ...
>>>> ^CProcess 19838 detached
>>>>
>>>> One thing that I'm certain of is that the client IP (10.31.45.20)
>>>> associated with the hung process has rebooted at least once since that
>>>> process was spawned.  So pgpool seems to be in some confused state, as
>>>> the client definitely severed the connection already.  I checked the
>>>> pgpool log and there are no explicit references to PID 19838.  I'm at
>>>> a loss how to debug this further, but clearly something is wrong
>>>> somewhere, and this isn't normal/expected behavior.
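
The gdb session suggested in the quoted reply (attach, bt, cont, repeat) could
be scripted roughly as follows. This is only a sketch: the function name
sample_backtraces, its arguments, and the GDB override variable are invented
here for illustration and are not part of pgpool or gdb. It relies on gdb's
-p, -batch, and -ex options, and each attach/detach cycle stands in for one
bt/cont pair of the interactive session.

```shell
#!/bin/sh
# Hypothetical helper: print several backtraces of a running process so that
# frames recurring across samples hint at where it is looping.
#   $1 = PID to inspect
#   $2 = number of samples (default 5)
#   $3 = seconds to wait between samples (default 1)
# GDB is an assumed override for the debugger binary (defaults to gdb).
sample_backtraces() {
    pid=$1
    count=${2:-5}
    delay=${3:-1}
    i=1
    while [ "$i" -le "$count" ]; do
        echo "=== sample $i of $count (pid $pid) ==="
        # -batch runs the given commands and exits; on exit gdb detaches
        # from the attached process, which then keeps running.
        "${GDB:-gdb}" -batch -p "$pid" -ex bt
        i=$((i + 1))
        sleep "$delay"
    done
}
```

Run it as the postgres user or root, for example "sample_backtraces 29191 5 2".
Functions that show up in every sample are the likely location of the loop.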

