[Pgpool-general] seemingly hung pgpool process consuming 100% CPU
Lonni J Friedman
netllama at gmail.com
Wed Sep 14 17:24:52 UTC 2011
This problem has returned yet again:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
29191 postgres 20 0 80192 14m 1544 R 89.8 0.2 51:15.91 pgpool
postgres 29191 3.4 0.1 80192 14728 ? R Sep13 51:40
pgpool: lfriedman nightly 10.31.96.84(61698) idle
I'd really appreciate some input on how to debug this.
On Fri, Sep 9, 2011 at 8:11 AM, Lonni J Friedman <netllama at gmail.com> wrote:
> Has no one else experienced this, or have suggestions on how to debug it?
>
> On Wed, Sep 7, 2011 at 12:49 PM, Lonni J Friedman <netllama at gmail.com> wrote:
>> Greetings,
>> I'm running pgpool-3.0.4 on a Linux-x86_64 server serving as a load
>> balancer for a three server postgresql-9.0.4 cluster (1 master, 2
>> standby). I'm seeing strange behavior where a single pgpool process
>> seems to hang after some period of time, and then consume 100% of the
>> CPU. I've seen this behavior happen twice since last Friday (when
>> pgpool was brought online in my production environment). At the
>> moment the current hung process looks like this in 'ps auxww' output:
>>
>> postgres 19838 98.7 0.0 68856 2904 ? R Sep06 1027:36
>> pgpool: lfriedman nightly 10.31.45.20(58277) idle
>>
>>
>> In top, I see:
>> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>> 19838 postgres 20 0 68856 2904 1072 R 100.0 0.0 1027:29 pgpool
>>
>>
>> When I connect to the process with strace, there is no output, so I'm
>> guessing the process is stuck spinning somewhere:
>> # strace -p 19838
>> Process 19838 attached - interrupt to quit
>> ...
>> ^CProcess 19838 detached
>>
>> One thing that I'm certain of is that the client IP (10.31.45.20)
>> associated with the hung process has rebooted at least once since that
>> process was spawned. So pgpool seems to be in some confused state, as
>> the client definitely severed the connection already. I checked the
>> pgpool log and there are no explicit references to PID 19838. I'm at
>> a loss as to how to debug this further, but clearly something is wrong
>> somewhere, and this isn't normal/expected behavior.
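
One way to check the "stuck spinning" guess from the empty strace output is to compare how much of the process's CPU time is accrued in user mode versus kernel mode. What follows is only a minimal sketch, not anything shipped with pgpool: it assumes a Linux /proc filesystem, and the helper names (parse_cpu_ticks, classify_spin, sample) are made up for illustration. It reads the utime and stime counters (fields 14 and 15 of /proc/<pid>/stat) twice and reports which one grew.

```python
import time


def parse_cpu_ticks(stat_line):
    """Return (utime, stime) in clock ticks from one /proc/<pid>/stat line."""
    # Field 2 (comm) is wrapped in parentheses and may itself contain
    # spaces, so split on the last ')' instead of on whitespace alone.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 14 (utime) is rest[11]
    # and field 15 (stime) is rest[12].
    return int(rest[11]), int(rest[12])


def classify_spin(before, after):
    """Compare two (utime, stime) samples and say where the time went."""
    user_delta = after[0] - before[0]
    sys_delta = after[1] - before[1]
    if user_delta + sys_delta == 0:
        return "idle"
    # Mostly user time with no syscalls visible in strace points to a
    # busy loop inside the process rather than time spent in the kernel.
    return "user-space" if user_delta > sys_delta else "kernel"


def sample(pid, interval=5.0):
    """Hypothetical helper: sample a live PID twice, `interval` seconds apart."""
    def read():
        with open(f"/proc/{pid}/stat") as f:
            return parse_cpu_ticks(f.read())

    before = read()
    time.sleep(interval)
    return classify_spin(before, read())
```

Calling sample(19838) as the postgres user or root while the process is at 100% CPU would be expected to return "user-space" if it really is looping without making system calls. If so, attaching with `gdb -p 19838` and running `bt` a few times is the usual next step to see which function the loop is in.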