[pgpool-general: 4420] Re: Pgpool - connection hangs in DISCARD ALL

Muhammad Usama m.usama at gmail.com
Tue Feb 9 22:02:09 JST 2016


Hi

Many thanks for sharing the pgpool.log. The log you shared does
contain some error messages, "ERROR: unable to to flush data to
frontend", that have the potential to cause the stuck connection.
Can you please try the attached patch and see if it fixes the problem?
I am attaching patches for both the 3_5 and 3_4 branches; please use
the one that matches your setup. Hopefully this should fix the
stuck issue.
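
As a rough illustration of why the error level alone can matter, here is
a toy model in C. It is not pgpool-II code. It assumes that the
ereport()-style call jumps to a handler in the child process, and that an
error attributed to the frontend lets the cleanup step skip the reset
query (the DISCARD ALL path seen in the backtrace, where
frontend_invalid is 0 and reset_request is 1). All sketch_* names are
hypothetical.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

/* Hypothetical levels, named after the two levels that appear in the patch. */
enum sketch_level { SKETCH_ERROR, SKETCH_FRONTEND_ERROR };

static jmp_buf sketch_handler;        /* where a raised error lands */
static bool sketch_frontend_invalid;  /* assumed flag: client is unusable */

/* Toy stand-in for ereport(): note whether the client is gone, then jump. */
static void sketch_ereport(enum sketch_level level)
{
    if (level == SKETCH_FRONTEND_ERROR)
        sketch_frontend_invalid = true;
    longjmp(sketch_handler, 1);
}

/* Toy stand-in for the cleanup step: true means "send the reset query
 * and wait for the backend's reply", which is where a hang could occur. */
static bool sketch_cleanup_sends_reset(void)
{
    return !sketch_frontend_invalid;
}

/* Simulates one session whose flush to the client fails with the given
 * level, and reports whether cleanup would go on to reset the backend. */
bool sketch_session_resets_backend(enum sketch_level level)
{
    sketch_frontend_invalid = false;
    if (setjmp(sketch_handler) == 0)
        sketch_ereport(level);        /* does not return normally */
    return sketch_cleanup_sends_reset();
}
```

In this model, sketch_session_resets_backend(SKETCH_ERROR) reports that
the reset query would still be sent, while the same failure raised as
SKETCH_FRONTEND_ERROR skips it.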

Kind regards
Muhammad Usama



On Mon, Feb 8, 2016 at 8:49 PM, Paweł Ufnalewski <archon at foap.com> wrote:
> Hi,
>
>     It looks like it hangs in these places (see attachment). The problem is
> that the developer responsible for the app has changed something in the code,
> so connections now close properly from the client side (before, I got a lot of
> these errors:
>
> 2016-02-08 09:33:39: pid 8472: ERROR:  unable to read data from frontend
> 2016-02-08 09:33:39: pid 8472: DETAIL:  EOF encountered with frontend)
> .
>
> Best regards,
> Paweł Ufnalewski
> Infrastructure Architect at Foap.com
>
> On 2016-02-08 at 09:00, Muhammad Usama wrote:
>
> Hi
>
> Thanks in advance for the help. If you could share the pgpool-II log
> from when the stuck connection happens, that would help us in identifying
> and rectifying the problem.
>
> Thanks
> Best regards
> Muhammad Usama
>
>
> On Mon, Feb 8, 2016 at 11:36 AM, Paweł Ufnalewski <archon at foap.com> wrote:
>
> Hi,
>
>     just to let you know - I'm having the same problem with version 3.4.4
> (I think DISCARD ALL shows up more slowly than in 3.4.3, but it still does).
> How can I help to fix this problem?
>
> Best regards,
> Paweł Ufnalewski
> Infrastructure Architect at Foap.com
>
> On 2016-02-01 at 08:44, Muhammad Usama wrote:
>
> Hi Gerhard
>
> Many thanks for testing and pointing this out. It's unfortunate that you are
> still getting the stuck connection issue. If possible, can you please
> share the pgpool-II log for the time when this stuck connection issue
> happens? I am most interested in seeing exactly which error message
> caused the child process to jump to the error handler, from where the child
> process proceeded to send DISCARD ALL to the backend and eventually got
> stuck. Since after many tries we are not able to reproduce this issue, the
> log would be really helpful in understanding and fixing the problem.
>
> Best regards
> Muhammad Usama
>
>
> On Sun, Jan 31, 2016 at 9:33 PM, Gerhard Wiesinger <lists at wiesinger.com>
> wrote:
>
> On 28.01.2016 01:10, Tatsuo Ishii wrote:
>
> On 21.01.2016 20:52, Muhammad Usama wrote:
>
> Hi
>
> I am looking into this issue, and unfortunately, like Ishii-San, I am
> also not able to reproduce it. But I found one issue in 3.4 that might
> cause the problem. Can you please try the attached patch to see if it
> solves the problem? Also, if the problem still persists, it would be really
> helpful if you could share the pgpool-II log.
>
> I looked at the patch but it includes only logging changes and no
> functional changes. Therefore I didn't test it. Do you expect any
> behavioral changes to fix it, and why?
>
> elog() is not only a logging function; it also plays a very
> important role in pgpool-II, including exception handling and error
> treatment. If you are familiar with PostgreSQL internals, you may
> notice this (elog() was imported from the PostgreSQL source tree).
>
> Tried version 3.5.0 where the patch is included. Still not working. See
> backtrace below.
>
> Reverting to 3.3.7 which works perfectly.
>
> Ciao,
> Gerhard
>
> (gdb) back
> #0  0x00007fd87fdb6d43 in __select_nocancel () from /lib64/libc.so.6
> #1  0x0000564471af16a1 in pool_check_fd (cp=cp@entry=0x564473dfa610) at protocol/pool_process_query.c:635
> #2  0x0000564471af1976 in pool_check_fd (cp=cp@entry=0x564473dfa610) at protocol/pool_process_query.c:657
> #3  0x0000564471b1f67b in pool_read (cp=0x564473dfa610, buf=buf@entry=0x7ffc1d71bf97, len=len@entry=1) at utils/pool_stream.c:162
> #4  0x0000564471af8e6e in read_kind_from_backend (frontend=frontend@entry=0x564473df3e60, backend=backend@entry=0x564473df2e00,
>     decided_kind=decided_kind@entry=0x7ffc1d71c397 "E") at protocol/pool_process_query.c:3234
> #5  0x0000564471affdc9 in ProcessBackendResponse (frontend=frontend@entry=0x564473df3e60, backend=backend@entry=0x564473df2e00, state=state@entry=0x7ffc1d71c41c,
>     num_fields=num_fields@entry=0x7ffc1d71c41a) at protocol/pool_proto_modules.c:2356
> #6  0x0000564471af5b15 in pool_process_query (frontend=0x564473df3e60, backend=0x564473df2e00, reset_request=reset_request@entry=1) at protocol/pool_process_query.c:302
> #7  0x0000564471aed98c in backend_cleanup (backend=<optimized out>, frontend_invalid=frontend_invalid@entry=0 '\000', frontend=0x564471e09e40 <child_frontend>)
>     at protocol/child.c:437
> #8  0x0000564471af0637 in do_child (fds=fds@entry=0x564473dee030) at protocol/child.c:234
> #9  0x0000564471ace107 in fork_a_child (fds=0x564473dee030, id=8) at main/pgpool_main.c:678
> #10 0x0000564471aceb6d in reaper () at main/pgpool_main.c:2254
> #11 0x0000564471ad322b in PgpoolMain (discard_status=<optimized out>, clear_memcache_oidmaps=<optimized out>) at main/pgpool_main.c:429
> #12 0x0000564471acc7b1 in main (argc=<optimized out>, argv=0x7ffc1d7219e8) at main/main.c:310
>
> #1  0x0000564471af16a1 in pool_check_fd (cp=cp@entry=0x564473dfa610) at protocol/pool_process_query.c:635
> 635                     fds = select(fd+1, &readmask, NULL, &exceptmask, timeoutp);
>
> (gdb) print fd
> $1 = 8
> (gdb) print readmask
> $2 = {fds_bits = {256, 0 <repeats 15 times>}}
> (gdb) print exceptmask
> $3 = {fds_bits = {256, 0 <repeats 15 times>}}
> (gdb) print timeoutp
> $4 = (struct timeval *) 0x0
>
>
>
>
> _______________________________________________
> pgpool-general mailing list
> pgpool-general at pgpool.net
> http://www.pgpool.net/mailman/listinfo/pgpool-general
>
>
>
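
For readers following the gdb output quoted above: fds_bits[0] = 256 is
1 << 8, so only fd 8 is in both sets, and timeoutp = 0x0 means select()
has no timeout and blocks until that descriptor becomes readable or
reports an exceptional condition. The sketch below is a standalone
illustration of that difference, not pgpool-II code; the sketch_* names
are hypothetical and a pipe stands in for the backend socket.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Waits until fd is readable. Returns 1 if readable, 0 on timeout,
 * -1 on error. A negative timeout_ms passes a NULL timeout to select(),
 * i.e. wait indefinitely, as in the quoted backtrace. */
int sketch_wait_readable(int fd, long timeout_ms)
{
    fd_set readmask;
    struct timeval tv;
    struct timeval *timeoutp = NULL;   /* NULL: block with no time limit */
    int n;

    FD_ZERO(&readmask);
    FD_SET(fd, &readmask);

    if (timeout_ms >= 0) {
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        timeoutp = &tv;
    }

    n = select(fd + 1, &readmask, NULL, NULL, timeoutp);
    if (n < 0)
        return -1;
    return n > 0 ? 1 : 0;
}

/* Silent peer: nothing is ever written, so a bounded wait times out. */
int sketch_demo_idle_pipe(void)
{
    int p[2], r;

    if (pipe(p) != 0)
        return -1;
    r = sketch_wait_readable(p[0], 50);
    close(p[0]);
    close(p[1]);
    return r;
}

/* Responsive peer: one byte is pending, so the wait returns at once. */
int sketch_demo_ready_pipe(void)
{
    int p[2], r;

    if (pipe(p) != 0)
        return -1;
    if (write(p[1], "x", 1) != 1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    r = sketch_wait_readable(p[0], 50);
    close(p[0]);
    close(p[1]);
    return r;
}
```

With a silent peer, the bounded wait returns 0 after about 50 ms, whereas
calling sketch_wait_readable() with a negative timeout on the same idle
descriptor would not return, which matches the stuck process in frame #0.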
-------------- next part --------------
diff --git a/src/protocol/pool_process_query.c b/src/protocol/pool_process_query.c
index 271f19f..2d043c5 100644
--- a/src/protocol/pool_process_query.c
+++ b/src/protocol/pool_process_query.c
@@ -844,7 +844,7 @@ POOL_STATUS wait_for_query_response(POOL_CONNECTION *frontend, POOL_CONNECTION *
 				pool_write(frontend, DUMMY_VALUE, sizeof(DUMMY_VALUE));
 				if (pool_flush_it(frontend) < 0)
 				{
-                    ereport(ERROR,
+                    ereport(FRONTEND_ERROR,
 						(errmsg("unable to to flush data to frontend"),
                              errdetail("frontend error occured while waiting for backend reply")));
 				}
@@ -864,7 +864,7 @@ POOL_STATUS wait_for_query_response(POOL_CONNECTION *frontend, POOL_CONNECTION *
 				pool_write(frontend, notice_message, strlen(notice_message)+1);
 				if (pool_flush_it(frontend) < 0)
 				{
-                    ereport(ERROR,
+                    ereport(FRONTEND_ERROR,
                         (errmsg("unable to to flush data to frontend"),
                              errdetail("frontend error occured while waiting for backend reply")));
 
-------------- next part --------------
diff --git a/src/protocol/pool_process_query.c b/src/protocol/pool_process_query.c
index e5c8322..5caf283 100644
--- a/src/protocol/pool_process_query.c
+++ b/src/protocol/pool_process_query.c
@@ -504,7 +504,7 @@ POOL_STATUS wait_for_query_response(POOL_CONNECTION *frontend, POOL_CONNECTION *
 				pool_write(frontend, DUMMY_VALUE, sizeof(DUMMY_VALUE));
 				if (pool_flush_it(frontend) < 0)
 				{
-                    ereport(ERROR,
+                    ereport(FRONTEND_ERROR,
 						(errmsg("unable to to flush data to frontend"),
                              errdetail("frontend error occured while waiting for backend reply")));
 				}
@@ -524,7 +524,7 @@ POOL_STATUS wait_for_query_response(POOL_CONNECTION *frontend, POOL_CONNECTION *
 				pool_write(frontend, notice_message, strlen(notice_message)+1);
 				if (pool_flush_it(frontend) < 0)
 				{
-                    ereport(ERROR,
+                    ereport(FRONTEND_ERROR,
                         (errmsg("unable to to flush data to frontend"),
                              errdetail("frontend error occured while waiting for backend reply")));
 

