[pgpool-hackers: 4351] Proposal: mitigating session disconnection issue while failover

Tatsuo Ishii ishii at sraoss.co.jp
Sat Jul 8 10:34:06 JST 2023


Background:

Currently Pgpool-II disconnects client sessions on failover or backend
error. This is fine when the client is using the affected PostgreSQL
backend, because the session cannot continue without it anyway. But
the client session is disconnected on failover even when the client
does not use that particular backend. This is not good. Suppose we
have a PostgreSQL streaming replication cluster with 3 nodes and the
client uses the primary (node 0) and standby 1 (node 1), but does not
use standby 2 (node 2). In this case, ideally, shutting down node 2
should not disconnect the session. However, the session is
disconnected if it sends a query to Pgpool-II during failover. The
disconnection is currently necessary because there are a bunch of
places in the source code that look something like this:

for (i = 0; i < NUM_BACKENDS; i++)
{
	if (!VALID_BACKEND(i))
		continue;
	:
	:

Here, NUM_BACKENDS represents the number of PostgreSQL backends (in
the case above it's 3). VALID_BACKEND returns true if the backend is
not in down status. If this code is executed during failover, it may
access a backend socket which is no longer available, causing trouble
up to and including a segfault. So inside VALID_BACKEND we check
whether failover is in progress, and if so, the pgpool child process
exits and the session is disconnected.
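
To make the current behavior concrete, here is a minimal,
self-contained sketch of the loop pattern described above. It is not
the actual Pgpool-II code: cluster_state, check_backend and
visit_backends are hypothetical names used only for illustration, and
the "child process exits" step is modeled as a return value of -1.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_BACKENDS 3	/* three nodes, as in the example above */

/* Hypothetical, simplified stand-ins for Pgpool-II's shared state. */
typedef enum { NODE_UP, NODE_DOWN } node_status;

typedef struct
{
	node_status status[NUM_BACKENDS];
	bool		failover_in_progress;
} cluster_state;

typedef enum { CHECK_VALID, CHECK_SKIP, CHECK_MUST_EXIT } check_result;

/* Models the check described for VALID_BACKEND: any failover in
 * progress forces the child to give up, no matter which node fails. */
static check_result
check_backend(const cluster_state *cs, int i)
{
	assert(i >= 0 && i < NUM_BACKENDS);
	if (cs->failover_in_progress)
		return CHECK_MUST_EXIT;
	return cs->status[i] == NODE_UP ? CHECK_VALID : CHECK_SKIP;
}

/* Returns the number of backends visited, or -1 when the session
 * would have to be disconnected. */
static int
visit_backends(const cluster_state *cs)
{
	int			visited = 0;

	for (int i = 0; i < NUM_BACKENDS; i++)
	{
		check_result r = check_backend(cs, i);

		if (r == CHECK_MUST_EXIT)
			return -1;
		if (r == CHECK_SKIP)
			continue;
		visited++;		/* real code would touch the backend socket here */
	}
	return visited;
}
```

In this model the loop gives up as soon as the failover flag is set,
even when the session only ever talks to nodes 0 and 1.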

Proposal:

In this proposal I would like to mitigate the issue above in certain
cases. It does not resolve all cases; some session disconnection
cases will remain. In all cases the precondition is that the client
does not use the backend which is the target of failover. To make the
proposal easier to understand, I suppose the session uses only node 0
and/or node 1, and the failover target is node 2. Usually, to make
this possible, we need to set backend_weight2 = 0 or
load_balance_mode = off.

case 1:

Failover on node 2 occurs while the session keeps on sending queries
to node 0 and/or 1. Change VALID_BACKEND so that it waits for
completion of failover. For this purpose a new function,
wait_for_failover_to_finish(), is added. It waits for the completion
of failover for up to MAX_FAILOVER_WAIT seconds (for now it's 30). It
may be better to make the wait time configurable. Thoughts?
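
For readers who want to picture the bounded wait, here is a rough
sketch under my own assumptions. It is not the function from the
patch: wait_for_failover_sketch, failover_probe, countdown_probe and
no_sleep are hypothetical names, and I assume a one-second polling
interval. The probe and the sleep function are passed in as
parameters so the logic can be exercised without a running Pgpool-II.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_FAILOVER_WAIT 30	/* seconds, the value mentioned above */

/* Hypothetical probe: returns true while failover is still running. */
typedef bool (*failover_probe) (void *arg);

/* Polls the probe once per "second" until failover finishes or
 * max_wait_sec polls have been spent. Returns true if failover
 * finished in time, false on timeout. */
static bool
wait_for_failover_sketch(failover_probe in_progress, void *arg,
						 int max_wait_sec,
						 unsigned (*sleep_fn) (unsigned))
{
	assert(in_progress != NULL && sleep_fn != NULL && max_wait_sec >= 0);

	for (int waited = 0; in_progress(arg); waited++)
	{
		if (waited >= max_wait_sec)
			return false;	/* timed out: caller must still give up */
		sleep_fn(1);
	}
	return true;
}

/* Demo probe: reports "in progress" for the next *remaining calls. */
static bool
countdown_probe(void *arg)
{
	int		   *remaining = arg;

	if (*remaining <= 0)
		return false;
	(*remaining)--;
	return true;
}

/* Demo sleep that returns immediately, so the sketch runs fast. */
static unsigned
no_sleep(unsigned sec)
{
	(void) sec;
	return 0;
}
```

In real use one would pass sleep() from <unistd.h> as sleep_fn. Taking
the limit as a parameter rather than hard-coding MAX_FAILOVER_WAIT is
one way the wait time could later be made configurable.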

case 2:

Failover on node 2 occurs while not only does the session keep on
sending queries to node 0 and/or 1, but new sessions are also
created. This is much harder than case 1. There are multiple places
where session disconnection could occur.

- accepting new connection from client. In wait_for_new_connections,
  call wait_for_failover_to_finish to wait for completion of
  failover.

- creating new connection to backend. After accepting connection
  request from client and before creating connection to backend, call
  wait_for_failover_to_finish to wait for completion of failover.

- fixing broken socket. pool_get_cp checks whether an existing backend
  connection is broken. If it's broken, it sleeps 1 second in the
  expectation that failover will start, then calls
  wait_for_failover_to_finish().

- processing an application name. If an application name is included
  in a startup message, pgpool sends a query like "SET application_name
  TO foo" to all backend nodes including node 2, which could cause a
  write error. To prevent the error, I modified
  connect_using_existing_connection, which sends the SET command
  using do_command, so that the do_command call is wrapped in a
  TRY/CATCH block and does not raise an ERROR.
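
The TRY/CATCH idea in the last item can be illustrated with plain
setjmp/longjmp. This is only a sketch of the general technique, not
the macros or functions used in the patch: error_handler, raise_error,
send_command and send_to_all_nodes are hypothetical names, and
send_command merely stands in for a call such as do_command that may
raise an error when the node is gone.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_BACKENDS 3

/* Innermost active error handler, or NULL when none is installed. */
static jmp_buf *error_handler = NULL;

/* Hypothetical stand-in for raising an ERROR: jump to the handler. */
static void
raise_error(void)
{
	assert(error_handler != NULL);
	longjmp(*error_handler, 1);
}

/* Stand-in for a command sender that fails on a dead node. */
static void
send_command(const bool node_alive[], int node)
{
	if (!node_alive[node])
		raise_error();
}

/* Sends the command to every node, catching the error per node so
 * that one dead backend does not abort the whole session. Returns
 * the number of nodes for which the command failed. */
static int
send_to_all_nodes(const bool node_alive[])
{
	volatile int failures = 0;

	for (volatile int i = 0; i < NUM_BACKENDS; i++)
	{
		jmp_buf		local;
		jmp_buf    *saved = error_handler;

		error_handler = &local;		/* "TRY" */
		if (setjmp(local) == 0)
			send_command(node_alive, i);
		else
			failures++;				/* "CATCH": swallow the error */
		error_handler = saved;		/* restore the outer handler */
	}
	return failures;
}
```

The point of the design is that the error is handled at the call
site, so the session continues with the nodes that are still alive.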

Note that even with all the fixes above, I was not able to fix some
cases where pool_write raises an error. pool_write is used to write to
the backend socket and there are too many call sites to fix all of
them. For now we need to run "pcp_detach_node 2" before shutting node
2 down. pcp_detach_node will tell all pgpool child processes that node
2 is going down. Even if a child process does not notice it and writes
to the backend 2 socket, there will be no error because node 2 is
still alive.

Tests: I have created a new test, 079.failover_session, for this
feature. The test uses pgbench, which can generate load either with a
continuous session (without the -C option) or with repeated
connection/disconnection (with the -C option). There are 4 cases in
the test:

"=== test1: backend_weight2 = 0 and pgbench without -C option"
"=== test2: backend_weight2 = 0 and pgbench with -C option"
"=== test3: load_balance_mode = off and pgbench without -C option"
"=== test4: load_balance_mode = off and pgbench with -C option"

test2 and test4 require pcp_detach_node before shutting down node 2.

Patch:

Attached is the v1 patch for this feature. Comments and suggestions
are welcome.

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: v1-0001-Avoid-session-disconnection.patch
Type: text/x-patch
Size: 11072 bytes
Desc: not available
URL: <http://www.pgpool.net/pipermail/pgpool-hackers/attachments/20230708/d4942b12/attachment.bin>

