[pgpool-general: 4447] Re: Pgpool - connection hangs in DISCARD ALL

Sat Feb 13 01:45:27 JST 2016

Hmm, it looks like it's fine now. Right now I only see these in the log:

2016-02-12 17:28:12: pid 8838: LOG:  child process with pid: 27299 exits with status 256
2016-02-12 17:28:12: pid 8838: LOG:  fork a new child process with pid: 6140
2016-02-12 17:30:42: pid 8838: LOG:  child process with pid: 5571 exits with status 512
2016-02-12 17:30:42: pid 8838: LOG:  fork a new child process with pid: 6720
2016-02-12 17:30:43: pid 8838: LOG:  child process with pid: 4444 exits with status 512
2016-02-12 17:30:43: pid 8838: LOG:  fork a new child process with pid: 6751
2016-02-12 17:35:42: pid 8838: LOG:  child process with pid: 6140 exits with status 512
2016-02-12 17:35:42: pid 8838: LOG:  fork a new child process with pid: 7868
2016-02-12 17:40:42: pid 8838: LOG:  child process with pid: 6751 exits with status 512
2016-02-12 17:40:42: pid 8838: LOG:  fork a new child process with pid: 9018
2016-02-12 17:40:42: pid 8838: LOG:  child process with pid: 6720 exits with status 512
2016-02-12 17:40:42: pid 8838: LOG:  fork a new child process with pid: 9019
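
For readers of the log above: if the number after "exits with status" is the raw status returned by waitpid() (an assumption, the thread does not confirm it), then 256 and 512 are not exit codes themselves. On common Unix systems the exit code sits in the high byte of the status, so 256 would decode to exit code 1 and 512 to exit code 2. A minimal sketch in C; the function name exit_code_from_status is hypothetical and not part of pgpool-II:

```c
#include <assert.h>
#include <sys/wait.h>

/*
 * Hypothetical helper (not pgpool-II code): decode a raw wait status.
 * Returns the exit code if the child exited normally, or -1 if it
 * ended some other way (for example, it was killed by a signal).
 */
int exit_code_from_status(int status)
{
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    return -1;
}
```

What exit codes 1 and 2 mean for a pgpool-II child is not stated in this thread, so the sketch only covers the decoding step.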

Thank you!

Best regards,
Paweł Ufnalewski
Infrastructure Architect at Foap.com

On 2016-02-09 at 14:02, Muhammad Usama wrote:
> Hi
>
> Many thanks for sharing the pgpool.log. The log you shared does
> contain some error messages, "ERROR: unable to flush data to
> frontend", that have the potential to cause the stuck connection.
> Can you please try the attached patch and see if it fixes the problem? I am
> attaching patches for both the 3_5 and 3_4 branches; please use the
> respective patch for your setup. Hopefully this will fix the
> stuck connection issue.
>
> Kind regards
> Muhammad Usama
>
>
>
> On Mon, Feb 8, 2016 at 8:49 PM, Paweł Ufnalewski <archon at foap.com> wrote:
>> Hi,
>>
>>      It looks like it hangs in these places (see attachment). The problem is that the
>> developer responsible for the app has changed something in the code, so connections
>> now close properly from the client side (before, I got a lot of these errors:
>>
>> 2016-02-08 09:33:39: pid 8472: ERROR:  unable to read data from frontend
>> 2016-02-08 09:33:39: pid 8472: DETAIL:  EOF encountered with frontend)
>> .
>>
>> Best regards,
>> Paweł Ufnalewski
>> Infrastructure Architect at Foap.com
>>
>> On 2016-02-08 at 09:00, Muhammad Usama wrote:
>>
>> Hi
>>
>> Thanks in advance for the help. If you could share the pgpool-II log
>> from when the stuck connection happens, that would help us in identifying and
>> rectifying the problem.
>>
>> Thanks
>> Best regards
>> Muhammad Usama
>>
>>
>> On Mon, Feb 8, 2016 at 11:36 AM, Paweł Ufnalewski <archon at foap.com> wrote:
>>
>> Hi,
>>
>>      just to let you know - I'm having the same problem with version 3.4.4
>> (the DISCARD ALL hang takes longer to appear than in 3.4.3, I think, but it still appears). How
>> can I help to fix this problem?
>>
>> Best regards,
>> Paweł Ufnalewski
>> Infrastructure Architect at Foap.com
>>
>> On 2016-02-01 at 08:44, Muhammad Usama wrote:
>>
>> Hi Gerhard
>>
>> Many thanks for testing and pointing this out. It's unfortunate that you are
>> still getting the stuck connection issue. If possible, can you please
>> share the pgpool-II log for the time when this stuck connection issue
>> happens? I am most interested in seeing exactly which error message
>> caused the child process to jump to the error handler, from where the child
>> process proceeded to send DISCARD ALL to the backend and eventually got
>> stuck. Since, after many tries, we are not able to reproduce this issue, the
>> log would be really helpful in understanding and fixing the problem.
>>
>> Best regards
>> Muhammad Usama
>>
>>
>> On Sun, Jan 31, 2016 at 9:33 PM, Gerhard Wiesinger <lists at wiesinger.com>
>> wrote:
>>
>> On 28.01.2016 01:10, Tatsuo Ishii wrote:
>>
>> On 21.01.2016 20:52, Muhammad Usama wrote:
>>
>> Hi
>>
>> I am looking into this issue and, unfortunately, like Ishii-San I am
>> also not able to reproduce it. But I found one issue in 3.4 that might
>> cause the problem. Can you please try the attached patch and see if it solves
>> the problem? Also, if the problem still persists, it would be really
>> helpful if you could share the pgpool-II log.
>>
>> I looked at the patch, but it includes only logging changes and no
>> functional changes. Therefore I didn't test it. Do you expect any
>> behavioral changes to fix it, and why?
>>
>> elog() is not only a logging function; it also plays a very
>> important role in pgpool-II, including exception handling and error
>> treatment. If you are familiar with PostgreSQL internals, you may
>> have noticed this (elog() was imported from the PostgreSQL source tree).
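
To illustrate the point about elog(): in PostgreSQL, reporting at ERROR level does not return to the caller. Control jumps to a previously registered handler (PostgreSQL uses sigsetjmp/siglongjmp for this). The following is a simplified sketch with plain setjmp/longjmp, not pgpool-II's actual code; report_error, run_session and error_handler are hypothetical names chosen for illustration:

```c
#include <assert.h>
#include <setjmp.h>
#include <stdio.h>

/* Hypothetical: the recovery point that errors unwind to. */
static jmp_buf error_handler;

/* Hypothetical stand-in for elog(ERROR, ...): log, then unwind. */
static void report_error(const char *msg)
{
    fprintf(stderr, "ERROR:  %s\n", msg);
    longjmp(error_handler, 1);      /* does not return to the caller */
}

/* Returns 0 if the work completed, 1 if an error aborted it. */
int run_session(int fail)
{
    if (setjmp(error_handler) != 0) {
        /* Reached via longjmp: this is where cleanup would run. */
        return 1;
    }
    if (fail)
        report_error("unable to read data from frontend");
    return 0;                       /* normal completion */
}
```

This is why a patch that only adds or changes an elog() call can change behavior: the call decides whether execution continues or jumps to the cleanup path.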
>>
>> Tried version 3.5.0 where the patch is included. Still not working. See
>> backtrace below.
>>
>> Reverting to 3.3.7 which works perfectly.
>>
>> Ciao,
>> Gerhard
>>
>> (gdb) back
>> #0  0x00007fd87fdb6d43 in __select_nocancel () from /lib64/libc.so.6
>> #1  0x0000564471af16a1 in pool_check_fd (cp=cp@entry=0x564473dfa610) at protocol/pool_process_query.c:635
>> #2  0x0000564471af1976 in pool_check_fd (cp=cp@entry=0x564473dfa610) at protocol/pool_process_query.c:657
>> #3  0x0000564471b1f67b in pool_read (cp=0x564473dfa610, buf=buf@entry=0x7ffc1d71bf97, len=len@entry=1) at utils/pool_stream.c:162
>> #4  0x0000564471af8e6e in read_kind_from_backend (frontend=frontend@entry=0x564473df3e60, backend=backend@entry=0x564473df2e00, decided_kind=decided_kind@entry=0x7ffc1d71c397 "E") at protocol/pool_process_query.c:3234
>> #5  0x0000564471affdc9 in ProcessBackendResponse (frontend=frontend@entry=0x564473df3e60, backend=backend@entry=0x564473df2e00, state=state@entry=0x7ffc1d71c41c, num_fields=num_fields@entry=0x7ffc1d71c41a) at protocol/pool_proto_modules.c:2356
>> #6  0x0000564471af5b15 in pool_process_query (frontend=0x564473df3e60, backend=0x564473df2e00, reset_request=reset_request@entry=1) at protocol/pool_process_query.c:302
>> #7  0x0000564471aed98c in backend_cleanup (backend=<optimized out>, frontend_invalid=frontend_invalid@entry=0 '\000', frontend=0x564471e09e40 <child_frontend>) at protocol/child.c:437
>> #8  0x0000564471af0637 in do_child (fds=fds@entry=0x564473dee030) at protocol/child.c:234
>> #9  0x0000564471ace107 in fork_a_child (fds=0x564473dee030, id=8) at main/pgpool_main.c:678
>> #10 0x0000564471aceb6d in reaper () at main/pgpool_main.c:2254
>> #11 0x0000564471ad322b in PgpoolMain (discard_status=<optimized out>, clear_memcache_oidmaps=<optimized out>) at main/pgpool_main.c:429
>> #12 0x0000564471acc7b1 in main (argc=<optimized out>, argv=0x7ffc1d7219e8) at main/main.c:310
>>
>> #1  0x0000564471af16a1 in pool_check_fd (cp=cp@entry=0x564473dfa610) at protocol/pool_process_query.c:635
>> 635                     fds = select(fd+1, &readmask, NULL, &exceptmask, timeoutp);
>>
>> (gdb) print fd
>> $1 = 8
>> (gdb) print readmask
>> $2 = {fds_bits = {256, 0 <repeats 15 times>}}
>> (gdb) print exceptmask
>> $3 = {fds_bits = {256, 0 <repeats 15 times>}}
>> (gdb) print timeoutp
>> $4 = (struct timeval *) 0x0
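
A note on reading the gdb output above: 256 is 1 << 8, so both masks have only bit 8 set, which matches fd = 8. A null timeoutp means select() waits with no time limit, so the call returns only once fd 8 becomes ready. The sketch below illustrates both points in C. It is not pgpool-II code and not a proposed fix; lowest_set_fd and wait_readable are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper: index of the lowest set bit in one mask word,
 * or -1 if the word is zero. For the value 256 this yields 8. */
int lowest_set_fd(unsigned long word)
{
    int bit;

    for (bit = 0; word != 0; bit++, word >>= 1)
        if (word & 1UL)
            return bit;
    return -1;
}

/* Hypothetical helper: wait until fd is readable.
 * timeout_ms < 0 passes a NULL timeout, so select() blocks indefinitely;
 * otherwise select() returns 0 once the timeout expires with no data. */
int wait_readable(int fd, long timeout_ms)
{
    fd_set readmask;
    struct timeval tv;
    struct timeval *timeoutp = NULL;

    FD_ZERO(&readmask);
    FD_SET(fd, &readmask);
    if (timeout_ms >= 0) {
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        timeoutp = &tv;
    }
    return select(fd + 1, &readmask, NULL, NULL, timeoutp);
}
```

With a negative timeout_ms, wait_readable behaves like the call in frame #1: if the backend never sends anything on the descriptor, the caller stays blocked in select().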
>>
>>
>>
>>
>> _______________________________________________
>> pgpool-general mailing list
>> pgpool-general at pgpool.net
>> http://www.pgpool.net/mailman/listinfo/pgpool-general
>>
>>
>>