[pgpool-general: 2956] Re: Connections stuck in CLOSE_WAIT, again

Juan Jose Perez jperez at oteara.com
Tue Jun 24 01:46:15 JST 2014


     This is the full output of gdb attached to one of the pgpool 
children in DISCARD state:

#0  0x00007fa8da59fbe3 in select () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x0000000000416621 in pool_check_fd (cp=cp@entry=0x1cd86c0) at pool_process_query.c:951
#2  0x0000000000416826 in pool_check_fd (cp=cp@entry=0x1cd86c0) at pool_process_query.c:971
#3  0x000000000041d99c in pool_read (cp=0x1cd86c0, buf=buf@entry=0x7fffb797e053, len=len@entry=1) at pool_stream.c:139
#4  0x000000000041c4ce in read_kind_from_backend (frontend=frontend@entry=0x1ccb7c0, backend=backend@entry=0x1cb5c28,
     decided_kind=decided_kind@entry=0x7fffb797e45f "") at pool_process_query.c:3771
#5  0x000000000044e291 in ProcessBackendResponse (frontend=frontend@entry=0x1ccb7c0, backend=backend@entry=0x1cb5c28,
     state=state@entry=0x7fffb797e4c8, num_fields=num_fields@entry=0x7fffb797e4c6) at pool_proto_modules.c:2742
#6  0x0000000000419d85 in pool_process_query (frontend=frontend@entry=0x1ccb7c0, backend=backend@entry=0x1cb5c28,
     reset_request=reset_request@entry=1) at pool_process_query.c:289
#7  0x000000000040c68f in do_child (unix_fd=unix_fd@entry=4, inet_fd=inet_fd@entry=5) at child.c:403
#8  0x00000000004077ff in fork_a_child (unix_fd=4, inet_fd=5, id=42) at main.c:1238
#9  0x0000000000407e63 in reaper () at main.c:2457
#10 reaper () at main.c:2369
#11 0x00000000004087ed in pool_sleep (second=<optimized out>) at main.c:2654
#12 0x00000000004064c9 in main (argc=<optimized out>, argv=<optimized out>) at main.c:836
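
For anyone reading the trace: frames #0 to #4 show the child sitting in select() underneath read_kind_from_backend, so it appears to be waiting for a byte from the backend. That would be consistent with the frontend socket staying in CLOSE_WAIT, because a process that only watches the backend never sees that the client has gone away. The Python sketch below only illustrates that situation. It is not pgpool code: the socket names and the wait_readable helper are hypothetical, and it uses local socket pairs in place of real TCP connections.

```python
import select
import socket


def wait_readable(socks, timeout):
    """Return the sockets from `socks` that become readable within `timeout` seconds."""
    readable, _, _ = select.select(socks, [], [], timeout)
    return readable


def demo():
    # Hypothetical stand-ins: (client, frontend) plays the application
    # connection, (backend, server) plays the connection to PostgreSQL.
    client, frontend = socket.socketpair()
    backend, server = socket.socketpair()
    try:
        client.close()  # the application goes away

        # Waiting only on the backend: nothing arrives, so the call can only
        # end by timing out. With no timeout it would block indefinitely.
        backend_only = wait_readable([backend], 0.2)

        # Waiting on both sockets: the frontend is readable immediately and
        # recv() returns b"" to signal that the peer has closed.
        both = wait_readable([backend, frontend], 0.2)
        frontend_eof = frontend in both and frontend.recv(1) == b""
        return backend_only, frontend_eof
    finally:
        frontend.close()
        backend.close()
        server.close()


if __name__ == "__main__":
    backend_only, frontend_eof = demo()
    print("backend-only wait returned:", backend_only)
    print("frontend EOF seen when watching both:", frontend_eof)
```

The 0.2 second timeout is there only so the sketch terminates. Whether the select() call in pool_check_fd has a timeout on this code path is something the trace alone does not tell us.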

     Many thanks,

-- 
Juanjo Pérez

www.oteara.com

On 20/06/14 01:19, Tatsuo Ishii wrote:
> Can you please post a stack trace with symbols?  I need to identify the
> code path. There are too many similar code paths like this.
>
> Best regards,
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese:http://www.sraoss.co.jp
>
>> I have the same problem with a PuppetDB frontend.
>> When I stop the PuppetDB service, all connections from pgpool are in
>> CLOSE_WAIT and in DISCARD state, even with the connection timeout set to 1
>> minute.
>> Here is the summary from gdb:
>>
>> The hung function is:
>> __select_no_cancel() from /lib64/libc.so.6.
>>
>> stacktrace output:
>>
>> 0 __select_no_cancel() from /lib64/libc.so.6.
>> 1 pool_check_fd
>> 2 pool_read
>> 3 read_kind_from_backend
>> 4 ProcessBackendResponse
>> 5 pool_process_query
>> 6 fork_a_child()
>> 7 reaper
>> 8 pool_sleep
>> 9 main
>>
>>
>>
>> On Thu, Jun 19, 2014 at 2:29 AM, Tatsuo Ishii <ishii at postgresql.org> wrote:
>>
>>> You said your developer found a pgpool child process in DISCARD state.
>>>
>>> Please attach gdb to the process in DISCARD state and take a
>>> backtrace. I also need the actual netstat -anp output.
>>>
>>> Best regards,
>>> --
>>> Tatsuo Ishii
>>> SRA OSS, Inc. Japan
>>> English: http://www.sraoss.co.jp/index_en.php
>>> Japanese:http://www.sraoss.co.jp
>>>
>>>>      Hi Tatsuo,
>>>>
>>>>      We are facing this same problem, and as far as I can see it remains
>>>>      unsolved (or perhaps they didn't report the solution).
>>>>      We are using pgpool 3.3.3, with two PostgreSQL 9.1 servers in streaming
>>>>      replication. The OS is Debian 3.2.54-2 (64-bit).
>>>>
>>>>      The log shows messages like this:
>>>>
>>>>          ProcessFrontendResponse: failed to read kind from frontend. frontend
>>>>          abnormally exited
>>>>
>>>>      I'm not sure, but there seems to be one such message for each connection left unclosed.
>>>>
>>>>      Also, ps -ef shows a lot of pgpool children in DISCARD state, and
>>>>      netstat -anp shows the corresponding TCP connections in CLOSE_WAIT
>>>>      state.
>>>>
>>>>      The developer has checked that the client process is closing the
>>>>      connections, and we tried connecting directly to postgres and it
>>>>      worked fine.
>>>>
>>>>      When you say "to attach debugger", do you mean setting log_statement =
>>>>      true, setting debug_level to a value other than 0, or some other procedure?
>>>>
>>>>      Many thanks,
>>>>
>>>> --
>>>> Juanjo Pérez
>>>>
>>>> www.oteara.com
>>>>
>>>> On 02/04/14 01:05, Tatsuo Ishii wrote:
>>>>>> We have a client on 3.3 experiencing the problem noted here:
>>>>>>
>>>>>> http://www.sraoss.jp/pipermail/pgpool-general/2012-December/001283.html
>>>>>> strace is showing the child processes at:
>>>>>>
>>>>>>      clone(child_stack=0,
>>>>>>      flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
>>>>>>      child_tidptr=0x7fcf659f3a10) = 6258
>>>>>>
>>>>>> I didn't see any resolution of that issue; is there data we can gather
>>>>>> to assist?
>>>>> I followed up on the old 2012 posting with:
>>>>>
>>>>>> That means pgpool does not close the socket connected to by your
>>>>>> applications. Is it possible to attach a debugger to such a pgpool
>>>>>> process to see what pgpool is doing?
>>>>> But I have had no response so far. Can you please do this?
>>>>>
>>>>> Best regards,
>>>>> --
>>>>> Tatsuo Ishii
>>>>> SRA OSS, Inc. Japan
>>>>> English: http://www.sraoss.co.jp/index_en.php
>>>>> Japanese: http://www.sraoss.co.jp
>>>>> _______________________________________________
>>>>> pgpool-general mailing list
>>>>> pgpool-general at pgpool.net
>>>>> http://www.pgpool.net/mailman/listinfo/pgpool-general
>>>>>
>>>


