[pgpool-general: 125] Re: severe memory leak in 3.1.1

Sat Dec 31 06:52:52 JST 2011

On Wed, Dec 21, 2011 at 3:11 PM, Lonni J Friedman <netllama at gmail.com> wrote:
> On Wed, Dec 21, 2011 at 1:46 PM, Tatsuo Ishii <ishii at postgresql.org> wrote:
>>> On Wed, Dec 21, 2011 at 7:39 AM, Tatsuo Ishii <ishii at postgresql.org> wrote:
>>>>> Greetings,
>>>>> I'm running pgpool-3.1.1 on a Linux-x86_64 system with 8GB RAM and
>>>>> 2.5GB swap.  Ever since we upgraded from pgpool-3.0.3 to 3.0.4, we've
>>>>> seen a severe memory leak which consistently consumes all the RAM+swap
>>>>> on the system every 4-7 days.  The leak is so severe that sometimes
>>>>> the OOM killer cannot respond fast enough, and the entire server had
>>>>> locked up one time (with OOM killer spew on the local console).  We're
>>>>> confident that the leak is in pgpool, as if we stop and (re)start
>>>>> pgpool, all the memory in use is freed up immediately.
>>>>>
>>>>> We were hoping that upgrading to 3.1.1 would eliminate the problem,
>>>>> however we upgraded to 3.1.1 last Thursday, and as of this morning it's
>>>>> obvious that the leak is still present.  Please advise what kind of
>>>>> information we can provide to debug this problem.
>>>>
>>>> A self-contained test case is the best way to tackle the problem. Can
>>>> you please provide one?
>>>
>>> I'm afraid that I don't know what is causing this.  I'm going to need
>>> your assistance here.
>>
>> Please provide pgpool.conf. Also please let me know how you started
>> pgpool (pgpool options) and which queries most likely cause the leak. I
>> need to reproduce your problem.
>
> I've attached pgpool.conf.  pgpool is invoked with:
> /usr/bin/pgpool -f /etc/pgpool-II/pgpool.conf -n
>
> I honestly have no idea which queries would likely cause a leak.
> Other than setting log_per_node_statement=on and log_statement=on and
> then tracking memory usage over time and correlating to the content in
> the log, is there any other way to obtain this information?
>
>>
>> In the meantime, child_life_time or child_max_connections might help
>> you, because they force pgpool child processes to exit periodically and
>> thus free the memory allocated by pgpool. If you still see a leak, the
>> problem must be in the pgpool parent process, not a child. In that case
>> you will see the pgpool parent process growing.
>
> Based on top output, it doesn't look like the parent PID is the
> problem, as the memory usage of it is relatively low.  However, I do
> see several other pgpool processes that are consuming what looks like
> a lot of memory (30+ MB each).  I know that they don't start off using
> anywhere near that much, but I'm not sure how normal it is for their
> memory usage to grow over time, or by how much.  I've attached top
> output.  PID 15045 is the parent.
>
> I've reduced child_life_time from 300 to 100, and increased
> child_max_connections from 0 to 250.  I'll see if this has any
> positive impact.

I've confirmed that making the above changes has worked around the
memory leak.  However, I would still like to track this down and get
it fixed.  I need some guidance on how to determine which queries are
contributing to the leak.

thanks
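
One way to do the correlation mentioned above (memory usage over time
against the statement log) might look like the sketch below.  It is
only a sketch: it assumes a Linux procps "ps" that accepts
"-eo pid,rss,args", and the function names (parse_ps, sample, growth,
watch) are made up for illustration, not part of pgpool.  It prints a
timestamped RSS delta per pgpool PID, which can then be lined up with
the timestamps in the log_statement / log_per_node_statement output.

```python
import subprocess
import time


def parse_ps(output, name="pgpool"):
    """Return {pid: rss_kib} for `ps -eo pid,rss,args` lines mentioning `name`."""
    usage = {}
    for line in output.splitlines():
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        pid, rss, args = parts
        # The header line ("PID RSS COMMAND") fails the isdigit() checks.
        if pid.isdigit() and rss.isdigit() and name in args:
            usage[int(pid)] = int(rss)
    return usage


def sample(name="pgpool"):
    """Take one snapshot of resident memory for matching processes."""
    result = subprocess.run(
        ["ps", "-eo", "pid,rss,args"],
        capture_output=True, text=True, check=True,
    )
    return parse_ps(result.stdout, name)


def growth(before, after):
    """RSS change in KiB for PIDs present in both snapshots."""
    return {pid: after[pid] - before[pid] for pid in after if pid in before}


def watch(interval=60, rounds=10, name="pgpool"):
    """Print timestamped RSS deltas every `interval` seconds, `rounds` times."""
    previous = sample(name)
    for _ in range(rounds):
        time.sleep(interval)
        current = sample(name)
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        for pid, delta in sorted(growth(previous, current).items()):
            print(f"{stamp} pid={pid} rss={current[pid]}KiB delta={delta:+d}KiB")
        previous = current
```

Calling watch() from a terminal and leaving it running would show which
child PIDs grow between samples.  With child_life_time and
child_max_connections set, children exit and are replaced, so a PID
that disappears between two samples is simply skipped.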