[pgpool-general: 4304] Re: Pgpool - connection hangs in DISCARD ALL

Tatsuo Ishii ishii at postgresql.org
Fri Jan 8 07:38:03 JST 2016


BTW, if you don't have the "connections hang in DISCARD ALL" problem and
are just hitting max_connections, then it is most likely a configuration
problem on your side. There is a formula for deciding how large
max_connections needs to be; see the docs for more details.
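
Roughly, the constraint is something like this (please check the docs for
the exact wording):

  max_pool * num_init_children <= (max_connections - superuser_reserved_connections)

with twice the left-hand side if query canceling is in use. So with, say,
num_init_children = 32 and max_pool = 4, each backend needs
max_connections of at least 128 (or 256 with query canceling).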

> I have heard similar reports, but the problem for the developers is that
> nobody knows how to reproduce the hang (a stack trace taken after the
> hang is not very helpful by itself).
> 
> Best regards,
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese: http://www.sraoss.co.jp
> 
> From: the Rat <therat at abv.bg>
> Subject: Re: [pgpool-general: 4295] Re: Pgpool - connection hangs in DISCARD ALL
> Date: Thu, 7 Jan 2016 22:11:23 +0200 (EET)
> Message-ID: <761041727.22616.1452197483031.JavaMail.apache at nm3.abv.bg>
> 
>> Hello,
>>   We've also experienced this problem with 3.4.3 and PostgreSQL 9.4. The connections pile up on the backend as well and eventually hit the max_connections limit, causing pgpool to fail over to another node.
>>   The only way we were able to work around it was to disable pgpool's connection pooling, but that loses the main advantage of pgpool:
>>
>>   connection_cache = off
>> 
>> On 04.01.2016 10:00, Tatsuo Ishii wrote:
>>
>>>>>>>>> On 31.12.2015 10:50, Tatsuo Ishii wrote:
>>>>>>>>>>> Yes, reconfigured. I'll keep you up to date. Is a fix for the logic possible?
>>>>>>>>>> Yes, I'm thinking about it now.
>>>>>>>>> Still hangs even with child_max_connections = 0, same stack trace.
>>>>>>>> Too bad. I'm going to look for other causes, if any.
>>>>>>> I remembered this:
>>>>>>>
>>>>>>> http://git.postgresql.org/gitweb/?p=pgpool2.git;a=commit;h=78d2fe3fd82b4d2ee90e1369be8dd583196fd36e
>>>>>>>
>>>>>>> If you still have the problem after disabling client_idle_limit, the
>>>>>>> fix is for you.
>>>>> Sorry, I wanted to say:
>>>>>
>>>>>> If you do not have the problem after disabling client_idle_limit,
>>>>>> the fix is for you.
>>>>>> Thanks, but it still happens. Should I try the master version or 3.3.7?
>>>>> That's an option. However, if you like, could you try the fix before
>>>>> going to 3.3? The patch may fix other code paths that trigger the
>>>>> problem. (Or try 3.4-stable head.)
>>>>>
>>>> I tried the fix from that link already, but it didn't help (unless you forgot to attach another fix?).
>>>>
>>>> Will try 3.4-stable head, then 3.3 if that doesn't help.
>>
>> 3.4-stable head also has the problem:
>> 
>> (gdb) ba
>> #0  0x00007f53d6f40d63 in __select_nocancel () from /lib64/libc.so.6
>> #1  0x000055a2f00669e1 in pool_check_fd (cp=cp@entry=0x55a2f05c0a20) at protocol/pool_process_query.c:970
>> #2  0x000055a2f0066c86 in pool_check_fd (cp=cp@entry=0x55a2f05c0a20) at protocol/pool_process_query.c:992
>> #3  0x000055a2f00a9f7b in pool_read (cp=0x55a2f05c0a20, buf=buf@entry=0x7ffc811e7137, len=len@entry=1) at utils/pool_stream.c:159
>> #4  0x000055a2f006f1dc in read_kind_from_backend (frontend=frontend@entry=0x55a2f05ba270, backend=backend@entry=0x55a2f05b9210, decided_kind=decided_kind@entry=0x7ffc811e7537 "E") at protocol/pool_process_query.c:3618
>> #5  0x000055a2f0076d79 in ProcessBackendResponse (frontend=frontend@entry=0x55a2f05ba270, backend=backend@entry=0x55a2f05b9210, state=state@entry=0x7ffc811e75cc, num_fields=num_fields@entry=0x7ffc811e75ca) at protocol/pool_proto_modules.c:2519
>> #6  0x000055a2f006b1b7 in pool_process_query (frontend=0x55a2f05ba270, backend=0x55a2f05b9210, reset_request=reset_request@entry=1) at protocol/pool_process_query.c:302
>> #7  0x000055a2f0062a9a in backend_cleanup (backend= , frontend_invalid=frontend_invalid@entry=0 '\000', frontend=0x55a2f037c4e0  ) at protocol/child.c:442
>> #8  0x000055a2f0065745 in do_child (fds=fds@entry=0x55a2f05b4440) at protocol/child.c:238
>> #9  0x000055a2f004263e in fork_a_child (fds=0x55a2f05b4440, id=3) at main/pgpool_main.c:678
>> #10 0x000055a2f0043735 in reaper () at main/pgpool_main.c:2148
>> #11 0x000055a2f004724b in PgpoolMain (discard_status= , clear_memcache_oidmaps= ) at main/pgpool_main.c:411
>> #12 0x000055a2f0040d4e in main (argc= , argv=0x7ffc811ecb28) at main/main.c:310
>>
>> Will try 3.3.7.
>>
>> BTW: I'm using the following script to get an overview of what's happening:
>>
>> cat watch_pg.sh
>> #!/usr/bin/env bash
>> watch -n 1 "ps wwwaux --sort=user,command,pid | grep -v grep | grep -E '^USER|pgpool|postgres:'"
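>>
>> (In case it is useful to anyone: a backtrace like the one above can be
>> grabbed from a stuck pgpool child with something along these lines,
>> assuming gdb and the pgpool debug symbols are installed; <pid> stands
>> for the PID of the hung child.)
>>
>> # attach to the hung child, dump its stack, then detach and exit
>> gdb -p <pid> -batch -ex bt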
>>
>> Ciao,
>> Gerhard
> _______________________________________________
> pgpool-general mailing list
> pgpool-general at pgpool.net
> http://www.pgpool.net/mailman/listinfo/pgpool-general

