View Issue Details
| ID | Project | Category | View Status | Date Submitted | Last Update |
|---|---|---|---|---|---|
| 0000313 | Pgpool-II | Bug | public | 2017-06-06 20:23 | 2017-06-09 11:04 |
| Reporter | nagata | Assigned To | nagata | ||
| Priority | normal | Severity | major | Reproducibility | always |
| Status | resolved | Resolution | fixed | ||
| Product Version | 3.5.8 | ||||
| Target Version | 3.5.9 | ||||
| Summary | 0000313: hang with extended protocol | ||||
| Description | A hang occurs with streaming-replication mode and extended protocol in a certain condition. The following is an example. - pgpool.conf is configured so that all read queries are sent to the standby. - First, issue a writing query in a transaction block - After commiting the transaction, issue a select query. - When processing the query, send Describe (statement) message just after Parse. In practice, such Describe message is sent by JDBC when executing PreparedStatement with a parameter of timestamp type more than prepare_threshold times. In addition, queris can be issued outside of a transaction block when autocommit is enabled. For details, see Steps To Reproduce. [Analysis] When a wiring query is issued in a transaction block, writing_transaction flag is set, but after commiting the transaction this flag is not unset. The priblem is that parse_before_bind() is always called when writing_transaction is set. Alhough Parse of the select and the next Describe are sent to the standby due to loadbalance, another Parse and the next Bind, Execute are sent to the primary due to parse_before_bind. (In the following log, node 0 is primary, 1 is standby.) LOG: DB node id: 0 backend pid: 7079 statement: Parse: BEGIN LOG: DB node id: 1 backend pid: 7080 statement: Parse: BEGIN LOG: DB node id: 0 backend pid: 7079 statement: B message LOG: DB node id: 1 backend pid: 7080 statement: B message LOG: DB node id: 0 backend pid: 7079 statement: Execute: BEGIN LOG: DB node id: 1 backend pid: 7080 statement: Execute: BEGIN LOG: DB node id: 0 backend pid: 7079 statement: Parse: INSERT INTO test_tbl VALUES(1) LOG: DB node id: 0 backend pid: 7079 statement: B message LOG: DB node id: 0 backend pid: 7079 statement: Execute: INSERT INTO test_tbl VALUES(1) LOG: DB node id: 0 backend pid: 7079 statement: Parse: COMMIT LOG: DB node id: 1 backend pid: 7080 statement: Parse: COMMIT LOG: DB node id: 0 backend pid: 7079 statement: B message LOG: DB node id: 1 backend pid: 7080 statement: B message LOG: DB node id: 0 backend pid: 7079 statement: Execute: COMMIT LOG: DB node id: 1 backend pid: 7080 statement: Execute: COMMIT LOG: DB node id: 1 backend pid: 7080 statement: Parse: SELECT * FROM test_tbl LOG: DB node id: 1 backend pid: 7080 statement: D message LOG: DB node id: 0 backend pid: 7079 statement: C message LOG: DB node id: 0 backend pid: 7079 statement: Parse: SELECT * FROM test_tbl LOG: DB node id: 0 backend pid: 7079 statement: B message In this situation, Pgpool-II waits for the responces of Describe from the standby forever because any messages after Describe is not sent to the standby and this backend desn't respond anyting. (When I send Flush after Describe, the standby returns the response and this hang doesn't occuer.) (gdb) bt #0 0x00007f4bd6b93c33 in __select_nocancel () at ../sysdeps/unix/syscall-template.S:81 0000001 0x00000000004329e9 in pool_check_fd (cp=0x1b27f70) at protocol/pool_process_query.c:647 0000002 0x0000000000472381 in pool_read (cp=0x1b27f70, buf=0x7ffc3e42632f, len=1) at utils/pool_stream.c:162 0000003 0x0000000000438e6b in read_kind_from_one_backend (frontend=0x1b1d970, backend=0x1b1c910, kind=0x7ffc3e42668f "1", node=1) at protocol/pool_process_query.c:3074 0000004 0x00000000004394be in read_kind_from_backend (frontend=0x1b1d970, backend=0x1b1c910, decided_kind=0x7ffc3e4268cf "") at protocol/pool_process_query.c:3209 0000005 0x00000000004476f3 in ProcessBackendResponse (frontend=0x1b1d970, backend=0x1b1c910, state=0x7ffc3e42695c, num_fields=0x7ffc3e42696e) at protocol/pool_proto_modules.c:2538 0000006 0x0000000000431a8e in pool_process_query (frontend=0x1b1d970, backend=0x1b1c910, reset_request=0) at protocol/pool_process_query.c:212 0000007 0x000000000042c208 in do_child (fds=0x1b17ae0) at protocol/child.c:377 0000008 0x0000000000406541 in fork_a_child (fds=0x1b17ae0, id=3) at main/pgpool_main.c:589 0000009 0x0000000000405c44 in PgpoolMain (discard_status=1 '\001', clear_memcache_oidmaps=0 '\000') at main/pgpool_main.c:357 0000010 0x0000000000404695 in main (argc=9, argv=0x7ffc3e42b3f8) at main/main.c:300 [patch] The attached is a simple patch to fix it. In this fix, parse_before_bind() is call only if we are in a transaction block. Due to this, we can send Bind, Execute to the right backend. | ||||
| Steps To Reproduce | Use pgproto -------------------------------------------- 'Q' "DROP TABLE IF EXISTS test_tbl" 'Q' "CREATE TABLE test_tbl(i int)" 'Q' "INSERT INTO test_tbl VALUES(1)" 'Y' 'Y' 'Y' 'P' "" "BEGIN" 0 'B' "" "" 0 0 0 'E' "" 0 'S' 'Y' 'P' "" "INSERT INTO hoge VALUES(1)" 0 'B' "" "" 0 0 0 'E' "" 0 'S' 'Y' 'P' "" "COMMIT" 0 'B' "" "" 0 0 0 'E' "" 0 'S' 'Y' 'P' "S_1" "SELECT * FROM hoge" 0 'D' 'S' "S_1" 'B' "C_1" "S_1" 0 0 0 'E' "C_1" 0 'S' 'Y' 'X' --------------------------------------------- | ||||
| Tags | No tags attached. | ||||
| Date Modified | Username | Field | Change |
|---|---|---|---|
| 2017-06-06 20:23 | nagata | New Issue | |
| 2017-06-06 20:27 | nagata | File Added: fix_query_hang.patch | |
| 2017-06-07 14:23 | t-ishii | Note Added: 0001535 | |
| 2017-06-07 14:23 | t-ishii | Assigned To | => nagata |
| 2017-06-07 14:23 | t-ishii | Status | new => assigned |
| 2017-06-07 14:23 | t-ishii | Target Version | => 3.5.9 |
| 2017-06-09 11:04 | nagata | Status | assigned => resolved |
| 2017-06-09 11:04 | nagata | Resolution | open => fixed |
| 2017-06-09 11:04 | nagata | Note Added: 0001538 |