View Issue Details
| ID | Project | Category | View Status | Date Submitted | Last Update |
|---|---|---|---|---|---|
| 0000265 | Pgpool-II | Bug | public | 2016-12-05 17:39 | 2017-01-12 09:46 |
| Reporter | a.dushechkin | Assigned To | t-ishii | ||
| Priority | normal | Severity | minor | Reproducibility | random |
| Status | closed | Resolution | open | ||
| Platform | x86_64 | OS | Ubuntu | OS Version | 14.04 |
| Product Version | 3.4.6 | ||||
| Summary | 0000265: pgpool-II sometimes hangs on restart | ||||
| Description | On rare occasion when pgpool-II is restarted by Ansible deploy script, restart hangs indefinitely. This was spotted on initial deploy only (no application connections), just custom monitoring shell script asking for pgpool-II status running mostly "pcp_command node_info" stuff. GDB reports following stacktrace: (gdb) bt #0 __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95 0000001 0x00007f216943c684 in _L_lock_533 () from /lib/x86_64-linux-gnu/libc.so.6 0000002 0x00007f216943c0e2 in __GI___vsyslog_chk (pri=<optimized out>, flag=1, fmt=0x4ba720 "[%lu-%d] %s", ap=ap@entry=0x7fff250d9c68) at ../misc/syslog.c:258 0000003 0x00007f216943c532 in __syslog_chk (pri=pri@entry=6, flag=flag@entry=1, fmt=fmt@entry=0x4ba720 "[%lu-%d] %s") at ../misc/syslog.c:129 0000004 0x000000000046caeb in syslog (__fmt=0x4ba720 "[%lu-%d] %s", __pri=6) at /usr/include/x86_64-linux-gnu/bits/syslog.h:31 0000005 write_syslog (level=6, line=<optimized out>) at ../../src/utils/error/elog.c:1554 0000006 0x000000000046f863 in send_message_to_server_log (edata=0x74a2c0 <errordata+384>) at ../../src/utils/error/elog.c:2166 0000007 EmitErrorReport () at ../../src/utils/error/elog.c:1125 0000008 0x000000000046d9ce in errfinish (dummy=<optimized out>) at ../../src/utils/error/elog.c:430 0000009 0x0000000000425ad7 in die (sig=15) at protocol/child.c:923 0000010 <signal handler called> 0000011 __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:93 0000012 0x00007f216943c684 in _L_lock_533 () from /lib/x86_64-linux-gnu/libc.so.6 0000013 0x00007f216943c0e2 in __GI___vsyslog_chk (pri=<optimized out>, flag=1, fmt=0x4ba720 "[%lu-%d] %s", ap=ap@entry=0x7fff250da8e8) at ../misc/syslog.c:258 0000014 0x00007f216943c532 in __syslog_chk (pri=pri@entry=6, flag=flag@entry=1, fmt=fmt@entry=0x4ba720 "[%lu-%d] %s") at ../misc/syslog.c:129 0000015 0x000000000046caeb in syslog (__fmt=0x4ba720 "[%lu-%d] %s", __pri=6) at /usr/include/x86_64-linux-gnu/bits/syslog.h:31 0000016 write_syslog (level=6, line=<optimized out>) at ../../src/utils/error/elog.c:1554 0000017 0x000000000046f863 in send_message_to_server_log (edata=0x74a200 <errordata+192>) at ../../src/utils/error/elog.c:2166 0000018 EmitErrorReport () at ../../src/utils/error/elog.c:1125 0000019 0x000000000046d9ce in errfinish (dummy=<optimized out>) at ../../src/utils/error/elog.c:430 0000020 0x0000000000425ad7 in die (sig=3) at protocol/child.c:923 0000021 <signal handler called> 0000022 0x00007f21694433fd in __libc_send (fd=4, buf=0x1248d70, n=119, flags=-1, flags@entry=16384) at ../sysdeps/unix/sysv/linux/x86_64/send.c:27 0000023 0x00007f216943c271 in __GI___vsyslog_chk (pri=<optimized out>, flag=1, fmt=0x4ba720 "[%lu-%d] %s", ap=ap@entry=0x7fff250db4e8) at ../misc/syslog.c:279 0000024 0x00007f216943c532 in __syslog_chk (pri=pri@entry=3, flag=flag@entry=1, fmt=fmt@entry=0x4ba720 "[%lu-%d] %s") at ../misc/syslog.c:129 0000025 0x000000000046caeb in syslog (__fmt=0x4ba720 "[%lu-%d] %s", __pri=3) at /usr/include/x86_64-linux-gnu/bits/syslog.h:31 0000026 write_syslog (level=3, line=<optimized out>) at ../../src/utils/error/elog.c:1554 0000027 0x000000000046f863 in send_message_to_server_log (edata=0x74a140 <errordata>) at ../../src/utils/error/elog.c:2166 0000028 EmitErrorReport () at ../../src/utils/error/elog.c:1125 0000029 0x000000000046d9ce in errfinish (dummy=<optimized out>) at ../../src/utils/error/elog.c:430 0000030 0x0000000000432c89 in new_connection (p=p@entry=0x1240c20) at protocol/pool_connection_pool.c:855 0000031 0x0000000000432d15 in pool_create_cp () at protocol/pool_connection_pool.c:251 0000032 0x0000000000424f2a in connect_backend (sp=sp@entry=0x1246b40, frontend=frontend@entry=0x1242b20) at protocol/child.c:864 #33 0x00000000004273ee in get_backend_connection (frontend=0x1242b20) at protocol/child.c:2326 0000034 do_child (fds=fds@entry=0x12401b0) at protocol/child.c:334 0000035 0x0000000000406c4c in fork_a_child (fds=0x12401b0, id=23) at main/pgpool_main.c:678 0000036 0x000000000040aec5 in PgpoolMain (discard_status=discard_status@entry=0 '\000', clear_memcache_oidmaps=clear_memcache_oidmaps@entry=0 '\000') at main/pgpool_main.c:278 0000037 0x000000000040548f in main (argc=<optimized out>, argv=<optimized out>) at main/main.c:310 | ||||
| Tags | No tags attached. | ||||
|
|
|
|
|
It seems the signal handler is interrupted by another signal and it causes a dealock in kernel. Attached patch fixes the problem by blocking interruption inside the signal handler. Can you please try it out? |
|
|
Hello. Thank you for quick response, but this issue is really rare. I'll try to statistically/artifically reproduce it this weekend. |
|
|
I have committed fixes to all supported branches. The real issue was malloc() was interrupted by signal, and the signal handler calls elog/ereport which in turn calls malloc(). The fix was removing elog/ereport calls from signal handlers. The fix for 3.4 branch will appear in the next minor version: 3.4.10. (the schedule for the release is not decided yet though). I'm going to close the issue for now. |
| Date Modified | Username | Field | Change |
|---|---|---|---|
| 2016-12-05 17:39 | a.dushechkin | New Issue | |
| 2016-12-05 18:08 | t-ishii | Assigned To | => Muhammad Usama |
| 2016-12-05 18:08 | t-ishii | Status | new => assigned |
| 2016-12-06 13:04 | t-ishii | File Added: child_v3_4.diff | |
| 2016-12-06 13:06 | t-ishii | Note Added: 0001204 | |
| 2016-12-06 13:06 | t-ishii | Assigned To | Muhammad Usama => t-ishii |
| 2016-12-07 07:38 | t-ishii | Status | assigned => feedback |
| 2016-12-07 11:38 | a.dushechkin | Note Added: 0001207 | |
| 2016-12-07 11:38 | a.dushechkin | Status | feedback => assigned |
| 2016-12-07 14:21 | t-ishii | Status | assigned => feedback |
| 2017-01-12 09:45 | t-ishii | Note Added: 0001290 | |
| 2017-01-12 09:46 | t-ishii | Status | feedback => closed |