[pgpool-general: 4475] Re: Pgpool - connection hangs in DISCARD ALL

Gerhard Wiesinger lists at wiesinger.com
Sat Feb 20 03:52:01 JST 2016


Hello,

I can confirm that this patch worked for me (I tested the 3.5 version of
the patch): nearly 2 days without any problem. Could you please add it to
the git repo and make a new release (3.4, 3.5)?

Thanks.

Ciao,
Gerhard


On 16.02.2016 11:44, Muhammad Usama wrote:
> Hi
>
> Many thanks for the reply, and it is good news that you are no longer
> getting the stuck connection issue after the patch.
>
> Thanks
> Best regards
> Muhammad Usama
>
>
> On Fri, Feb 12, 2016 at 9:45 PM, Paweł Ufnalewski <archon at foap.com> wrote:
>> Hmm, it looks like it's fine now. Right now I only see these in the log:
>>
>> 2016-02-12 17:28:12: pid 8838: LOG:  child process with pid: 27299 exits with status 256
>> 2016-02-12 17:28:12: pid 8838: LOG:  fork a new child process with pid: 6140
>> 2016-02-12 17:30:42: pid 8838: LOG:  child process with pid: 5571 exits with status 512
>> 2016-02-12 17:30:42: pid 8838: LOG:  fork a new child process with pid: 6720
>> 2016-02-12 17:30:43: pid 8838: LOG:  child process with pid: 4444 exits with status 512
>> 2016-02-12 17:30:43: pid 8838: LOG:  fork a new child process with pid: 6751
>> 2016-02-12 17:35:42: pid 8838: LOG:  child process with pid: 6140 exits with status 512
>> 2016-02-12 17:35:42: pid 8838: LOG:  fork a new child process with pid: 7868
>> 2016-02-12 17:40:42: pid 8838: LOG:  child process with pid: 6751 exits with status 512
>> 2016-02-12 17:40:42: pid 8838: LOG:  fork a new child process with pid: 9018
>> 2016-02-12 17:40:42: pid 8838: LOG:  child process with pid: 6720 exits with status 512
>> 2016-02-12 17:40:42: pid 8838: LOG:  fork a new child process with pid: 9019
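
The 256 and 512 values in the log excerpt look like raw wait statuses rather than exit codes. Assuming pgpool-II logs the status exactly as returned by waitpid() (an assumption, not confirmed in this thread), on Linux they decode to exit codes 1 and 2. A minimal sketch in C; exit_code_from_status is a hypothetical helper name, not a pgpool-II function:

```c
#include <sys/wait.h>

/*
 * Hypothetical helper (not part of pgpool-II): decode a raw wait status.
 * Returns the exit code if the child exited normally, or -1 if it did not
 * (for example, if it was terminated by a signal).
 */
static int exit_code_from_status(int status)
{
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    return -1;
}
```

Under that assumption, exit_code_from_status(256) gives 1 and exit_code_from_status(512) gives 2, so these entries would be children exiting normally and being replaced, not crashes.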
>>
>> Thank you!
>>
>> Best regards,
>> Paweł Ufnalewski
>> Infrastructure Architect at Foap.com
>>
>> On 2016-02-09 at 14:02, Muhammad Usama wrote:
>>
>>> Hi
>>>
>>> Many thanks for sharing the pgpool.log. The log you shared does
>>> contain some error messages, "ERROR: unable to flush data to
>>> frontend", that have the potential to cause the stuck connection.
>>> Can you please try the attached patch to see whether it fixes the
>>> problem? I am attaching patches for both the 3_5 and 3_4 branches;
>>> please use the respective patch for your setup. Hopefully this will
>>> fix the stuck connection issue.
>>>
>>> Kind regards
>>> Muhammad Usama
>>>
>>>
>>>
>>> On Mon, Feb 8, 2016 at 8:49 PM, Paweł Ufnalewski <archon at foap.com> wrote:
>>>> Hi,
>>>>
>>>>       It looks like it hangs in these places (see attachment). The problem is
>>>> that the developer responsible for the app has changed something in the code,
>>>> so connections now close properly from the client side. Before that, I got
>>>> a lot of these errors:
>>>>
>>>> 2016-02-08 09:33:39: pid 8472: ERROR:  unable to read data from frontend
>>>> 2016-02-08 09:33:39: pid 8472: DETAIL:  EOF encountered with frontend
>>>>
>>>> Best regards,
>>>> Paweł Ufnalewski
>>>> Infrastructure Architect at Foap.com
>>>>
>>>> On 2016-02-08 at 09:00, Muhammad Usama wrote:
>>>>
>>>> Hi
>>>>
>>>> Thanks in advance for the help. If you could share the pgpool-II log
>>>> from when the stuck connection happens, that would help us in identifying
>>>> and rectifying the problem.
>>>>
>>>> Thanks
>>>> Best regards
>>>> Muhammad Usama
>>>>
>>>>
>>>> On Mon, Feb 8, 2016 at 11:36 AM, Paweł Ufnalewski <archon at foap.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>>       just to let you know - I'm having the same problem with version 3.4.4
>>>> (the DISCARD ALL hang takes longer to appear than in 3.4.3, I think, but it
>>>> still appears). How can I help to fix this problem?
>>>>
>>>> Best regards,
>>>> Paweł Ufnalewski
>>>> Infrastructure Architect at Foap.com
>>>>
>>>> On 2016-02-01 at 08:44, Muhammad Usama wrote:
>>>>
>>>> Hi Gerhard
>>>>
>>>> Many thanks for testing and pointing this out. It's unfortunate that you are
>>>> still getting the stuck connection issue. If possible, can you please
>>>> share the pgpool-II log for the time when this stuck connection issue
>>>> happens? I am most interested in seeing exactly which error message
>>>> caused the child process to jump to the error handler, from where the child
>>>> process proceeded to send DISCARD ALL to the backend and eventually got
>>>> stuck. Since we have not been able to reproduce this issue after many tries,
>>>> the log would be really helpful in understanding and fixing the problem.
>>>>
>>>> Best regards
>>>> Muhammad Usama
>>>>
>>>>
>>>> On Sun, Jan 31, 2016 at 9:33 PM, Gerhard Wiesinger <lists at wiesinger.com> wrote:
>>>>
>>>> On 28.01.2016 01:10, Tatsuo Ishii wrote:
>>>>
>>>> On 21.01.2016 20:52, Muhammad Usama wrote:
>>>>
>>>> Hi
>>>>
>>>> I am looking into this issue, and unfortunately, like Ishii-San, I am
>>>> also not able to reproduce it. But I found one issue in 3.4 that might
>>>> cause the problem. Can you please try the attached patch to see if it
>>>> solves the problem? Also, if the problem still persists, it would be
>>>> really helpful if you could share the pgpool-II log.
>>>>
>>>> I looked at the patch but it includes only logging changes and no
>>>> functional changes. Therefore I didn't test it. Do you expect any
>>>> behavioral changes to fix it, and why?
>>>>
>>>> elog() is not only a logging function; it also plays a very
>>>> important role in pgpool-II, including exception handling and error
>>>> treatment. If you are familiar with PostgreSQL internals, you may
>>>> notice this (elog() was imported from the PostgreSQL source tree).
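
To illustrate why a change that looks like "only logging" can alter control flow: in PostgreSQL, reporting at ERROR level does not return to the caller but jumps (via siglongjmp) to a previously registered handler. The sketch below imitates that pattern with plain setjmp/longjmp. All names in it (report_error, process_query, run_session) are hypothetical; it is not the pgpool-II or PostgreSQL API, only a minimal model of the idea.

```c
#include <setjmp.h>
#include <stdio.h>

/* Hypothetical names throughout; this only models the control flow. */
static jmp_buf error_handler;

/* Logs the message, then jumps to the handler instead of returning. */
static void report_error(const char *msg)
{
    fprintf(stderr, "ERROR:  %s\n", msg);
    longjmp(error_handler, 1);
}

/* Stands in for the query-processing path; fails on request. */
static void process_query(int fail)
{
    if (fail)
        report_error("unable to read data from frontend");
}

/* Returns 0 if the query path completed, 1 if the error handler ran. */
static int run_session(int fail)
{
    if (setjmp(error_handler) != 0)
    {
        /* Error handler: session cleanup would happen here. */
        return 1;
    }
    process_query(fail);
    return 0;
}
```

Here run_session(0) returns 0, while run_session(1) logs the error and returns 1 from the handler branch, so adding or changing an error-level report changes which code runs next.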
>>>>
>>>> Tried version 3.5.0 where the patch is included. Still not working. See
>>>> backtrace below.
>>>>
>>>> Reverting to 3.3.7 which works perfectly.
>>>>
>>>> Ciao,
>>>> Gerhard
>>>>
>>>> (gdb) back
>>>> #0  0x00007fd87fdb6d43 in __select_nocancel () from /lib64/libc.so.6
>>>> #1  0x0000564471af16a1 in pool_check_fd (cp=cp@entry=0x564473dfa610) at protocol/pool_process_query.c:635
>>>> #2  0x0000564471af1976 in pool_check_fd (cp=cp@entry=0x564473dfa610) at protocol/pool_process_query.c:657
>>>> #3  0x0000564471b1f67b in pool_read (cp=0x564473dfa610, buf=buf@entry=0x7ffc1d71bf97, len=len@entry=1) at utils/pool_stream.c:162
>>>> #4  0x0000564471af8e6e in read_kind_from_backend (frontend=frontend@entry=0x564473df3e60, backend=backend@entry=0x564473df2e00, decided_kind=decided_kind@entry=0x7ffc1d71c397 "E") at protocol/pool_process_query.c:3234
>>>> #5  0x0000564471affdc9 in ProcessBackendResponse (frontend=frontend@entry=0x564473df3e60, backend=backend@entry=0x564473df2e00, state=state@entry=0x7ffc1d71c41c, num_fields=num_fields@entry=0x7ffc1d71c41a) at protocol/pool_proto_modules.c:2356
>>>> #6  0x0000564471af5b15 in pool_process_query (frontend=0x564473df3e60, backend=0x564473df2e00, reset_request=reset_request@entry=1) at protocol/pool_process_query.c:302
>>>> #7  0x0000564471aed98c in backend_cleanup (backend=<optimized out>, frontend_invalid=frontend_invalid@entry=0 '\000', frontend=0x564471e09e40 <child_frontend>) at protocol/child.c:437
>>>> #8  0x0000564471af0637 in do_child (fds=fds@entry=0x564473dee030) at protocol/child.c:234
>>>> #9  0x0000564471ace107 in fork_a_child (fds=0x564473dee030, id=8) at main/pgpool_main.c:678
>>>> #10 0x0000564471aceb6d in reaper () at main/pgpool_main.c:2254
>>>> #11 0x0000564471ad322b in PgpoolMain (discard_status=<optimized out>, clear_memcache_oidmaps=<optimized out>) at main/pgpool_main.c:429
>>>> #12 0x0000564471acc7b1 in main (argc=<optimized out>, argv=0x7ffc1d7219e8) at main/main.c:310
>>>>
>>>> #1  0x0000564471af16a1 in pool_check_fd (cp=cp@entry=0x564473dfa610) at protocol/pool_process_query.c:635
>>>> 635                     fds = select(fd+1, &readmask, NULL, &exceptmask, timeoutp);
>>>>
>>>> (gdb) print fd
>>>> $1 = 8
>>>> (gdb) print readmask
>>>> $2 = {fds_bits = {256, 0 <repeats 15 times>}}
>>>> (gdb) print exceptmask
>>>> $3 = {fds_bits = {256, 0 <repeats 15 times>}}
>>>> (gdb) print timeoutp
>>>> $4 = (struct timeval *) 0x0
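
Reading the gdb values: fds_bits[0] = 256 is 1 << 8, so only fd 8 is in both sets, and timeoutp = 0x0 means select() was given a NULL timeout, which blocks until the descriptor becomes ready or a signal arrives. The sketch below contrasts that with a bounded wait. It is an illustration only: wait_readable and fd_bit are hypothetical helpers, not pgpool-II code, and the bit layout assumes glibc's fd_set on 64-bit Linux.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/*
 * Hypothetical helper (not pgpool-II code): wait until fd is readable.
 * timeout_sec < 0 passes a NULL timeout, as in the frame shown by gdb,
 * so select() blocks indefinitely.
 * Returns 1 if readable, 0 on timeout, -1 on error.
 */
static int wait_readable(int fd, int timeout_sec)
{
    fd_set readmask;
    struct timeval tv;
    struct timeval *timeoutp = NULL;
    int fds;

    FD_ZERO(&readmask);
    FD_SET(fd, &readmask);

    if (timeout_sec >= 0)
    {
        tv.tv_sec = timeout_sec;
        tv.tv_usec = 0;
        timeoutp = &tv;
    }

    fds = select(fd + 1, &readmask, NULL, NULL, timeoutp);
    if (fds < 0)
        return -1;
    return (fds > 0 && FD_ISSET(fd, &readmask)) ? 1 : 0;
}

/* Bit that fd occupies inside its fds_bits word (assumes glibc layout). */
static unsigned long fd_bit(int fd)
{
    return 1UL << ((unsigned) fd % (8 * sizeof(unsigned long)));
}
```

On a descriptor with no pending data, wait_readable(fd, 0) returns 0 immediately, whereas wait_readable(fd, -1) would not return until data arrives. That is consistent with a child that waits forever for the backend's reply after the reset query, although the thread does not confirm the root cause.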
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> pgpool-general mailing list
>>>> pgpool-general at pgpool.net
>>>> http://www.pgpool.net/mailman/listinfo/pgpool-general
>>>>
>>>>
>>>>