[pgpool-hackers: 4428] Re: Guard against ill mannered frontend

Tatsuo Ishii ishii at sraoss.co.jp
Sun Feb 11 12:16:41 JST 2024


> Recently a user complained that pgpool hangs up if JDBC driver is
> used with certain option (autosave=always).
> 
> [pgpool-general: 8990] autosave=always jdbc option & it only sends query SAVEPOINT PGJDBC_AUTOSAVE and hangs
> https://www.pgpool.net/pipermail/pgpool-general/2023-December/009051.html
> 
> I found a reliable reproducer of the problem:
> [pgpool-general: 9007]
> https://www.pgpool.net/pipermail/pgpool-general/2024-January/009068.html
> 
>> I have taken a look at the log. I tried to reproduce the problem using
>> pgproto (a protocol level speaking test tool coming with Pgpool-II)
>> and succeeded. The essential condition seems:
>> 
>> 1) set backend_weight for the standby node 0 so that everything goes
>>   to primary PostgreSQL.
>> 
>> 2) after sequence of extended queries, send simple query like
>>   "SAVEPOINT PGJDBC_AUTOSAVE" (I think this is generated by the JDBC
>>   driver).

This is a follow-up to the previous email. For the record, I would like
to explain why pgpool hangs up in the first place.

First of all, condition #1 only applies to a two-node cluster (primary
and one standby). If a cluster contains two or more standbys, pgpool
always hangs up no matter what the weight settings are.

So let me explain where the hang happens. Here is the stack trace at
that point.

#0  0x00007f6cb2058392 in __libc_read (fd=9, 
    buf=buf@entry=0x55d5e5e86ca0 <readbuf>, nbytes=nbytes@entry=1024)
    at ../sysdeps/unix/sysv/linux/read.c:26
#1  0x000055d5e5ca2d40 in read (__nbytes=1024, __buf=0x55d5e5e86ca0 <readbuf>, 
    __fd=<optimized out>) at /usr/include/x86_64-linux-gnu/bits/unistd.h:44
#2  pool_read (cp=0x7f6cb16f1578, buf=buf@entry=0x7ffecd8776f3, 
    len=len@entry=1) at utils/pool_stream.c:196
#3  0x000055d5e5c6d6ae in read_kind_from_backend (
    frontend=frontend@entry=0x55d5e67d35b8, 
    backend=backend@entry=0x7f6cb16ec928, 
    decided_kind=decided_kind@entry=0x7ffecd877b1e "")
    at protocol/pool_process_query.c:3410
#4  0x000055d5e5c7bf94 in ProcessBackendResponse (
    frontend=frontend@entry=0x55d5e67d35b8, 
    backend=backend@entry=0x7f6cb16ec928, state=state@entry=0x7ffecd877dbc, 
    num_fields=num_fields@entry=0x7ffecd877dba)
    at protocol/pool_proto_modules.c:2978
#5  0x000055d5e5c6c089 in read_packets_and_process (
    frontend=frontend@entry=0x55d5e67d35b8, 
    backend=backend@entry=0x7f6cb16ec928, reset_request=reset_request@entry=0, 
    state=state@entry=0x7ffecd877dbc, 
    num_fields=num_fields@entry=0x7ffecd877dba, 
    cont=cont@entry=0x7ffecd877dc4 "\001U")
    at protocol/pool_process_query.c:5130
#6  0x000055d5e5c6ca73 in pool_process_query (frontend=0x55d5e67d35b8, 
    backend=0x7f6cb16ec928, reset_request=reset_request@entry=0)
    at protocol/pool_process_query.c:299
#7  0x000055d5e5c64d3f in do_child (fds=fds@entry=0x55d5e680fd10)
    at protocol/child.c:467
#8  0x000055d5e5c38926 in fork_a_child (fds=0x55d5e680fd10, id=5)
    at main/pgpool_main.c:863
#9  0x000055d5e5c40d82 in PgpoolMain (discard_status=<optimized out>, 
    clear_memcache_oidmaps=<optimized out>) at main/pgpool_main.c:561
#10 0x000055d5e5c36846 in main (argc=<optimized out>, argv=<optimized out>)
    at main/main.c:365

From frame #3, the particular line is 3410 in read_kind_from_backend():

				kind = 0;
--->				if (pool_read(CONNECTION(backend, i), &kind, 1))
				{
					ereport(FATAL,
							(return_code(2),
							 errmsg("failed to read kind from backend %d", i),
							 errdetail("pool_read returns error")));
				}

Here, "i" represents the PostgreSQL node id; i = 1, which is the
standby node. Why does pgpool want to read from node 1?
"session_context" gives the answer.

(gdb) p *session_context
$1 = {process_context = 0x55d5e5e85fa0 <process_context_d>, 
  frontend = 0x55d5e67d35b8, backend = 0x7f6cb16ec928, in_progress = 0 '\000', 
  doing_extended_query_message = 0 '\000', 
  need_to_restore_where_to_send = 0 '\000', 
  where_to_send_save = '\000' <repeats 127 times>, command_success = 1 '\001', 
  writing_transaction = 0 '\000', failed_transaction = 0 '\000', 
  skip_reading_from_backends = 0 '\000', ignore_till_sync = 0 '\000', 
  transaction_isolation = POOL_UNKNOWN, query_context = 0x55d5e67ddbe8, 
  memory_context = 0x55d5e67f2a70, uncompleted_message = 0x0, message_list = {
    capacity = 8, size = 2, sent_messages = 0x55d5e67d95d8}, 
  load_balance_node_id = 0, mismatch_ntuples = 0 '\000', ntuples = {
    0 <repeats 128 times>}, reset_context = 0 '\000', query_cache_array = 0x0, 
  num_selects = 0, pending_messages = 0x55d5e67d96f0, 
  previous_message_exists = 0 '\000', previous_message = {type = POOL_PARSE, 
    contents = 0x0, contents_len = 0, query = '\000' <repeats 1023 times>, 
    statement = '\000' <repeats 127 times>, 
    portal = '\000' <repeats 127 times>, is_rows_returned = 0 '\000', 
    not_forward_to_frontend = 0 '\000', node_ids = '\000' <repeats 127 times>, 
    query_context = 0x0, flush_pending = 0 '\000', 
    is_tx_started_by_multi_statement = 0 '\000'}, major = 3, minor = 0, 
  suspend_reading_from_frontend = 0 '\000', temp_tables = 0x0, 
  is_in_transaction = 0 '\000', transaction_temp_write_list = 0x0, 
  si_state = SI_NO_SNAPSHOT, transaction_read_only = SI_NO_SNAPSHOT, 
  flush_pending = 0 '\000', is_tx_started_by_multi_statement = 0 '\000'}

Here, "in_progress" = 0, which means no query is in progress. Thus
pgpool tries to read from all backend nodes, including node 1. But
since load_balance_node_id = 0, the simple query (in this case
"SAVEPOINT") was sent only to node 0. Of course the read from node 1
blocks.

Why does an ordinary series of simple query messages not cause the
hang? SimpleQuery() does not reset the in_progress flag (sending and
receiving extended query protocol messages does reset it). It also
saves the query_context (which includes the information about which
backends to read from) into session_context->query_context. So
VALID_BACKEND consults the query_context and indicates that only the
node the query was sent to needs to be read.

Another question is: why does pgpool not hang if backend_weight0 is 0?
In this case the load balance node is 1, and pgpool sends the query to
both node 0 and node 1, because SAVEPOINT needs to be sent not only to
the primary but also to the load balance node. Fortunately the
pool_read on line 3410 then reads from node 0 and node 1, and there is
no hang.

If there are two or more standbys, we are not so lucky. Even if the
load balance node is 1, pgpool will read not only from the primary but
also from node 2, which never received the query and thus will never
generate a response message - a hang!

Note that a series of extended query protocol messages saves each query
context into the "pending message" stack. Upon receiving a response
from the backend, pgpool pops a query context to determine where to
read the reply message from. People may wonder why this mechanism is
not applied to simple query protocol messages as well. The answer is
performance: creating and handling the stack is not cheap. I think it's
not worth the trouble, and it would be better to refuse to accept a
simple query if the previous extended query protocol sequence has not
ended, as the proposal suggests.

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

