View Issue Details
| ID | Project | Category | View Status | Date Submitted | Last Update |
|---|---|---|---|---|---|
| 0000370 | Pgpool-II | Bug | public | 2017-12-10 22:13 | 2018-02-16 06:04 |
| Reporter | avi | Assigned To | t-ishii | ||
| Priority | high | Severity | block | Reproducibility | always |
| Status | resolved | Resolution | open | ||
| Product Version | 3.6.7 | ||||
| Target Version | 3.6.8 | Fixed in Version | 3.6.8 | ||
| Summary | 0000370: Pgpool get stuck with db exception followed by select | ||||
| Description | We tested pgpool version 3.6.7 with pgproto and found that a db exception followed by select cause pgpool to get stuck [root@h2-nms trunk]# ./run_tests.sh iteration 0 FE=> Query(query="SET statement_timeout = '4s'") <= BE CommandComplete(SET) <= BE ReadyForQuery(I) FE=> Parse(stmt="S1", query="SELECT 1/0") FE=> Bind(stmt="S1", portal="") FE=> Execute(portal="") FE=> Close(stmt="S1") FE=> Parse(stmt="S2", query="SELECT * from pg_tables") FE=> Bind(stmt="S2", portal="") FE=> Execute(portal="") FE=> Close(stmt="S2") FE=> Sync =================================== | ||||
| Steps To Reproduce | To reproduce the problem please: create a sh script with the following loop for ((n=0; n<50000; n++)) do ( echo "iteration $n " pgproto -f bug.data -p 9999 -h your_IP -u postgres -d postgres; ) done ========== This is the bug data file to be used with pgproto =============== 'Q' "SET statement_timeout = '4s'" # Receive response from backend 'Y' 'P' "S1" "SELECT 1/0" 0 'B' "" "S1" 0 0 0 'E' "" 0 'C' 'S' "S1" 'P' "S2" "SELECT * from pg_tables" 0 'B' "" "S2" 0 0 0 'E' "" 0 'C' 'S' "S2" # Issue Sync message 'S' # Receive response from backend 'Y' # Send terminate message 'X' | ||||
| Tags | No tags attached. | ||||
|
|
The problem is reproduced here. It only happens when the erroneous query was sent to primary and the next query requires a catalogue cache look up. At the point when the erroneous was sent, backend is already in error state. The next query requires a catalogue cache look up and issues a SELECT. But since backend will ignore everything until a sync message received, backend will not respond, and Pgpool-II gets stuck. Will think how to deal with this. |
|
|
|
|
|
Ok, here is a patch trying to fix this issue. |
|
|
Hi I applied the patch and tested it. This issue seems to be resolved. Thanks you! Could you please let us know when the next minor release will be out. This problem is critical and is causing the whole system to freeze. In addition, when I applied the patch I applied it on sources from pgpool-II-3.6.7.tar.gz and "patch -p1 < patch.file" command was successful. However the command returned that the line numbers were adjusted. Should the patch be applied on pgpool-II-3.6.7.tar.gz or on more current sources from your svn? Thanks |
|
|
For the release dates, we have not decided yet. I will discuss with other developers. The patch was created for master branch head. As long as the patch applies to 3.6 tree, it should be ok. BTW, we are using git, rather than svn:-) |
|
|
I found another bug with Pgpool-II 3.6.7. It can be reproduced by following pgproto data. This is basically just another SELECT added to your example. The fix is already in the git repository. You can download a snapshot tar ball from the repository. (this is close to the next 3.6.8) Since I have discovered two critical bugs, I think we need to make minor releases as soon as possible. https://git.postgresql.org/gitweb/?p=pgpool2.git;a=snapshot;h=911d6e78429598b399577b9473d5ab3148ecd795;sf=tgz 'Q' "SET statement_timeout = '4s'" # Receive response from backend 'Y' 'P' "S1" "SELECT 1/0" 0 'B' "" "S1" 0 0 0 'E' "" 0 'C' 'S' "S1" 'P' "S2" "SELECT * from pg_tables" 0 'B' "" "S2" 0 0 0 'E' "" 0 'C' 'S' "S2" # Issue Sync message 'S' # Receive response from backend 'Y' 'P' "" "SELECT * from pg_tables" 0 'B' "" "" 0 0 0 'E' "" 0 'S' 'Y' # Send terminate message 'X' |
|
|
It seems the bug starts from 3.6.6. |
|
|
Thanks. The bug I reported two days ago started at 3.6.7. I tested it on 3.6.6 and it worked well. I'm happy that a new 3.6.8 release will be out ASAP since we cannot use 3.6.6 because of a fail-over problem and not 3.6.7 because of the freeze problem. |
|
|
May I close this issue? |
|
|
Hi I would like to test it on my setup before we close this issue. I took the 3.6.7 rpms and replace the sources tar.gz file with the one you sent me. I ran the rpmbuild command and got the following. Can you help please... bash-4.1$ rpmbuild -ba pgpool.spec --define="pgpool_version 3.6.7" --define="pg_version 96" --define="pghome /usr/pgsql-9.6" --define="dist rhel6" Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.F6r2o2 + umask 022 + cd /var/lib/pgsql/rpmbuild/BUILD + LANG=C + export LANG + unset DISPLAY + cd /var/lib/pgsql/rpmbuild/BUILD + rm -rf pgpool-II-3.6.7 + /usr/bin/gzip -dc /var/lib/pgsql/rpmbuild/SOURCES/pgpool-II-3.6.7.tar.gz + /bin/tar -xf - + STATUS=0 + '[' 0 -ne 0 ']' + cd pgpool-II-3.6.7 + /bin/chmod -Rf a+rX,u+w,g-w,o-w . + echo 'Patch 0000001 (pgpool-II-head.patch):' Patch 0000001 (pgpool-II-head.patch): + /bin/cat /var/lib/pgsql/rpmbuild/SOURCES/pgpool-II-head.patch + /usr/bin/patch -p1 --fuzz=0 can't find file to patch at input line 4 Perhaps you used the wrong -p or --strip option? The text leading up to this was: -------------------------- |diff -crN pgpool-II-3.6.7/doc/src/sgml/html/LEGALNOTICE.html head/pgpool-II-3.6.7/doc/src/sgml/html/LEGALNOTICE.html |*** pgpool-II-3.6.7/doc/src/sgml/html/LEGALNOTICE.html Wed Nov 1 01:02:24 2017 |--- head/pgpool-II-3.6.7/doc/src/sgml/html/LEGALNOTICE.html Wed Nov 1 03:02:23 2017 -------------------------- |
|
|
You are asking the same question on the mailing list. So I assume you do not need help on Mantis regarding RPM. |
|
|
May I close this issue? |
|
|
Yes, I tested it with 3.6.8 and 3.6.9 and it was fixed. Thank you! |
|
|
Thank you for the feedbback!. I mark this issue as "resolved". |
| Date Modified | Username | Field | Change |
|---|---|---|---|
| 2017-12-10 22:13 | avi | New Issue | |
| 2017-12-11 07:36 | t-ishii | Note Added: 0001854 | |
| 2017-12-11 07:37 | t-ishii | Assigned To | => t-ishii |
| 2017-12-11 07:37 | t-ishii | Status | new => assigned |
| 2017-12-11 07:37 | t-ishii | Steps to Reproduce Updated | |
| 2017-12-11 13:38 | t-ishii | File Added: bug370.patch | |
| 2017-12-11 13:38 | t-ishii | Note Added: 0001855 | |
| 2017-12-11 13:38 | t-ishii | Status | assigned => feedback |
| 2017-12-11 22:50 | avi | Note Added: 0001856 | |
| 2017-12-11 22:50 | avi | Status | feedback => assigned |
| 2017-12-11 23:42 | t-ishii | Note Added: 0001857 | |
| 2017-12-11 23:42 | t-ishii | Target Version | => 3.6.8 |
| 2017-12-12 16:38 | t-ishii | Note Added: 0001859 | |
| 2017-12-12 16:49 | t-ishii | Note Added: 0001860 | |
| 2017-12-12 18:13 | avi | Note Added: 0001861 | |
| 2017-12-13 15:18 | t-ishii | Note Added: 0001862 | |
| 2017-12-13 15:18 | t-ishii | Status | assigned => feedback |
| 2017-12-14 18:25 | avi | Note Added: 0001868 | |
| 2017-12-14 18:25 | avi | Status | feedback => assigned |
| 2017-12-22 16:34 | t-ishii | Note Added: 0001878 | |
| 2017-12-22 17:40 | t-ishii | Status | assigned => feedback |
| 2018-01-12 14:37 | t-ishii | Note Added: 0001891 | |
| 2018-02-16 00:20 | avi | Note Added: 0001929 | |
| 2018-02-16 00:20 | avi | Status | feedback => assigned |
| 2018-02-16 06:03 | t-ishii | Note Added: 0001930 | |
| 2018-02-16 06:04 | t-ishii | Status | assigned => resolved |
| 2018-02-16 06:04 | t-ishii | Fixed in Version | => 3.6.8 |