View Issue Details

IDProjectCategoryView StatusLast Update
0000018Pgpool-IIBugpublic2012-12-18 16:39
ReporterkvcAssigned Tonagata 
PrioritynormalSeveritymajorReproducibilityalways
Status resolvedResolutionfixed 
Platformx86-64OSRHELOS Version6.1
Product Version 
Target VersionFixed in Version 
Summary0000018: pgpool-II deadlock on extended queries
Descriptionversion is pgpool-II 3.1.3

Using epgsql (Erlang client implementation of PG's wire protocol) issuing an extended query of the
form: "insert into spool (username, xml) values ($1, $2)" I'm finding that the query hangs and
then times out.

I captured a network trace of the same query done directly and via pgpool-II to compare protocol
states and can clearly see that pgpool ends up issuing 2 flush statements ('H') in a row to the backend
and then hangs on a read of the 2nd flush since the backend has already responded to the 1st flush
and has nothing more to return.

Results of tshark decoding the two scenarios along with pgpool log is provided below under "Additional Information"
Steps To ReproduceIf you have Erlang/OTP

  (1) download and install epgsql from https://github.com/wg/epgsql
  (2) with EBIN set to wherever you installed epgsql BEAM files:

$ erl -pa $EBIN
Erlang R14B04 (erts-5.8.5) [source] [64-bit] [smp:2:2] [rq:2] [async-threads:0] [kernel-poll:false]

Eshell V5.8.5 (abort with ^G)
1> {ok, C} = pgsql:connect("localhost", "ejabberd", "password", [{database, "ejabberd"}, {port, 15432}]).
{ok,<0.34.0>}
2> pgsql:squery(C, "delete from spool").
{ok,0}
3> pgsql:equery(C, "insert into spool (username, xml) values ($1, $2)", ["somebody", "foobar"]).

=ERROR REPORT==== 7-Jul-2012::19:39:00 ===
** State machine <0.34.0> terminating
** Last event in was timeout
** When State == timeout
** Data == {state,undefined,<0.35.0>,5000,
                        [{<<"application_name">>,<<>>},
                         {<<"client_encoding">>,<<"UTF8">>},
                         {<<"DateStyle">>,<<"ISO, MDY">>},
                         {<<"integer_datetimes">>,<<"on">>},
                         {<<"IntervalStyle">>,<<"postgres">>},
                         {<<"is_superuser">>,<<"off">>},
                         {<<"server_encoding">>,<<"UTF8">>},
                         {<<"server_version">>,<<"9.1.2">>},
                         {<<"session_authorization">>,<<"ejabberd">>},
                         {<<"standard_conforming_strings">>,<<"off">>},
                         {<<"TimeZone">>,<<"UTC">>}],
                        undefined,
                        {<0.32.0>,#Ref<0.0.0.42>},
                        undefined,
                        {2342,822611433},
                        {statement,[],[],[text,text]},
                        73}
** Reason for termination =
** timeout

=ERROR REPORT==== 7-Jul-2012::19:39:00 ===
** Generic server <0.35.0> terminating
** Last message in was {'EXIT',<0.34.0>,timeout}
** When Server state == {state,<0.34.0>,gen_tcp,#Port<0.603>,<<>>}
** Reason for termination ==
** timeout
** exception exit: timeout
4> q().
ok
Additional Informationtshark trace:
--------------------------------------
NO pgpool (works):
 11 16.969056 127.0.0.1 -> 127.0.0.1 PGSQL >P ***** EPGSQL REQUESTS A PARSE AND DESCRIBE *****
 12 16.969121 127.0.0.1 -> 127.0.0.1 PGSQL >D
 13 16.969140 127.0.0.1 -> 127.0.0.1 PGSQL >H
 14 16.969212 127.0.0.1 -> 127.0.0.1 TCP postgresql > 36045 [ACK] Seq=334 Ack=153 Win=32768 Len=0 TSV=1472324032 TSER=1472324032
 15 16.973402 127.0.0.1 -> 127.0.0.1 PGSQL <1/t/n
 16 16.973438 127.0.0.1 -> 127.0.0.1 TCP 36045 > postgresql [ACK] Seq=153 Ack=359 Win=33920 Len=0 TSV=1472324037 TSER=1472324037
 17 16.974735 127.0.0.1 -> 127.0.0.1 PGSQL >B ***** EPGSQL FOLLOWS UP WITH BIND/EXECUTE/COMPLETE/SYNC
 18 16.974769 127.0.0.1 -> 127.0.0.1 PGSQL >E
 19 16.974784 127.0.0.1 -> 127.0.0.1 PGSQL >C
 20 16.974797 127.0.0.1 -> 127.0.0.1 PGSQL >S
 21 16.974828 127.0.0.1 -> 127.0.0.1 TCP postgresql > 36045 [ACK] Seq=359 Ack=213 Win=32768 Len=0 TSV=1472324038 TSER=1472324037
 22 16.978919 127.0.0.1 -> 127.0.0.1 PGSQL <2/C/3/Z

WITH pgpool (hangs):
 67 77.791216 127.0.0.1 -> 127.0.0.1 PGSQL >P ***** EPGSQL AS ABOVE, BUT THIS TIME SENDS TO PGPOOL *****
 68 77.791255 127.0.0.1 -> 127.0.0.1 PGSQL >D
 69 77.791277 127.0.0.1 -> 127.0.0.1 PGSQL >H
 70 77.791370 127.0.0.1 -> 127.0.0.1 TCP 15432 > 53695 [ACK] Seq=334 Ack=153 Win=32768 Len=0 TSV=1472384854 TSER=1472384854
 71 77.792786 127.0.0.1 -> 127.0.0.1 PGSQL >P/H ***** PGPOOL INSERTS A FLUSH *****
 72 77.794615 127.0.0.1 -> 127.0.0.1 PGSQL <1
 73 77.794662 127.0.0.1 -> 127.0.0.1 TCP 35315 > postgresql [ACK] Seq=146 Ack=339 Win=33920 Len=0 TSV=1472384858 TSER=1472384858
 74 77.794835 127.0.0.1 -> 127.0.0.1 PGSQL <1
 75 77.794962 127.0.0.1 -> 127.0.0.1 PGSQL >D/H ***** PGPOOL INSERTS A FLUSH *****
 76 77.795037 127.0.0.1 -> 127.0.0.1 PGSQL <t/n
 77 77.795162 127.0.0.1 -> 127.0.0.1 PGSQL <t
 78 77.795200 127.0.0.1 -> 127.0.0.1 PGSQL <n
 79 77.795244 127.0.0.1 -> 127.0.0.1 PGSQL >H ***** HANGS RIGHT HERE BECAUSE BACKEND NOW HAS NOTHING TO FLUSH *****

 80 77.795325 127.0.0.1 -> 127.0.0.1 TCP 53695 > 15432 [ACK] Seq=153 Ack=359 Win=32832 Len=0 TSV=1472384859 TSER=1472384858
 81 77.795402 127.0.0.1 -> 127.0.0.1 PGSQL >B
 82 77.795418 127.0.0.1 -> 127.0.0.1 PGSQL >E
 83 77.795432 127.0.0.1 -> 127.0.0.1 PGSQL >C
 84 77.795446 127.0.0.1 -> 127.0.0.1 PGSQL >S


pgpool.log:
--------------------------------------
2012-07-07 05:55:53 DEBUG: pid 24178: ProcessFrontendResponse: kind from frontend P(50)
2012-07-07 05:55:53 DEBUG: pid 24178: pool_unset_doing_extended_query_message: done
2012-07-07 05:55:53 DEBUG: pid 24178: pool_set_doing_extended_query_message: done
2012-07-07 05:55:53 DEBUG: pid 24178: pool_set_query_in_progress: done
2012-07-07 05:55:53 DEBUG: pid 24178: Parse: statement name <>
2012-07-07 05:55:53 DEBUG: pid 24178: pool_set_query_in_progress: done
2012-07-07 05:55:53 LOG: pid 24178: DB node id: 0 backend pid: 24231 statement: Parse: insert into spool (username, xml) values ($1, $2)
2012-07-07 05:55:53 DEBUG: pid 24178: wait_for_query_response: waiting for backend 0 completing the query
2012-07-07 05:55:53 DEBUG: pid 24178: read_kind_from_backend: read kind from 0 th backend 1 NUM_BACKENDS: 1
2012-07-07 05:55:53 DEBUG: pid 24178: ProcessBackendResponse: kind from backend: 1
2012-07-07 05:55:53 DEBUG: pid 24178: pool_set_command_success: done
2012-07-07 05:55:53 DEBUG: pid 24178: pool_unset_query_in_progress: done
2012-07-07 05:55:53 DEBUG: pid 24178: ProcessFrontendResponse: kind from frontend D(44)
2012-07-07 05:55:53 DEBUG: pid 24178: pool_unset_doing_extended_query_message: done
2012-07-07 05:55:53 DEBUG: pid 24178: pool_set_doing_extended_query_message: done
2012-07-07 05:55:53 DEBUG: pid 24178: pool_set_query_in_progress: done
2012-07-07 05:55:53 DEBUG: pid 24178: Describe: waiting for master completing the query
2012-07-07 05:55:53 LOG: pid 24178: DB node id: 0 backend pid: 24231 statement: D message
2012-07-07 05:55:53 DEBUG: pid 24178: wait_for_query_response: waiting for backend 0 completing the query
2012-07-07 05:55:53 DEBUG: pid 24178: read_kind_from_backend: read kind from 0 th backend t NUM_BACKENDS: 1
2012-07-07 05:55:53 DEBUG: pid 24178: ProcessBackendResponse: kind from backend: t
2012-07-07 05:55:53 DEBUG: pid 24178: read_kind_from_backend: read kind from 0 th backend n NUM_BACKENDS: 1
2012-07-07 05:55:53 DEBUG: pid 24178: ProcessBackendResponse: kind from backend: n
2012-07-07 05:55:53 DEBUG: pid 24178: pool_unset_query_in_progress: done
2012-07-07 05:55:53 DEBUG: pid 24178: ProcessFrontendResponse: kind from frontend H(48)
2012-07-07 05:55:53 DEBUG: pid 24178: pool_unset_doing_extended_query_message: done
2012-07-07 05:55:58 DEBUG: pid 23924: I am 23924 accept fd 6
2012-07-07 05:55:58 LOG: pid 23924: connection received: host=127.0.0.1 port=36089
2012-07-07 05:55:58 DEBUG: pid 23924: Protocol Major: 1234 Minor: 5678 database: user:
2012-07-07 05:55:58 DEBUG: pid 23924: Cancel request received
TagsNo tags attached.

Activities

nagata

2012-12-18 16:38

developer   ~0000198

This hang is fixed in V3.1.4 by flowing commit.

http://git.postgresql.org/gitweb/?p=pgpool2.git;a=commit;h=8ffb7f63664ff9c7d4682ac58842abf187f0892d

This fixes the bug that pgpool waits for primary node which does not
have pending data.

Issue History

Date Modified Username Field Change
2012-07-11 04:35 kvc New Issue
2012-11-26 15:30 anzai Assigned To => nagata
2012-11-26 15:30 anzai Status new => assigned
2012-12-18 16:38 nagata Note Added: 0000198
2012-12-18 16:39 nagata Status assigned => resolved
2012-12-18 16:39 nagata Resolution open => fixed