[pgpool-committers: 2230] pgpool: The commit tries to fix the failover deadlock problem

Fri Oct 24 04:01:35 JST 2014

The commit tries to fix the failover deadlock problem
"[pgpool-II 0000105]: Failover dead lock".
Although the issue is not readily reproducible, Yugo's analysis reveals
the following scenario, which can cause the deadlock.

Scenario:

--quote--

When the watchdog is used and a backend goes down, the pgpool instances can hang.
This is related to the semaphore locking used in failover() and
degenerate_backend_set().
I analyzed the problem and hypothesize the following mechanism.

1. A backend goes down while many child processes are accepting connections.

2. A child process (P_c0) which detects that the backend is down executes
degenerate_backend_set(). This function then:

	- acquires a semaphore lock (pool_semaphore_lock(REQUEST_INFO_SEM))
	- notifies the other pgpools (wd_degenerate_backend_set())
	- sends a signal to trigger failover (kill(parent, SIGUSR1))
	- releases the lock (pool_semaphore_unlock(REQUEST_INFO_SEM))

3. Then another child process (P_c1), which also detects that the backend is
down, acquires the lock in the same way as P_c0.

4. The watchdog process (P_w), which receives the notification from P_c0 in the
other pgpool, also executes degenerate_backend_set() in wd_send_response() and
tries to acquire the lock. However, P_w cannot acquire the lock and starts
waiting, since P_c1 is already holding it.

5. The parent process (P_p) executes failover() and tries to acquire the lock.
However, P_p cannot get the lock and starts waiting for the same reason.

6. P_c1 tries to notify the other pgpools (wd_degenerate_backend_set()).
However, P_w in the other pgpools is also waiting for the lock and cannot respond.
So P_c1 waits for the response forever.

7. As a result, both pgpools hang in the following state:
        - P_c1: holds the lock and waits for a response from P_w in the other pgpool
        - P_w: waits for the lock
        - P_p: waits for the lock

--unquote--
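
The hang in step 7 can be checked as a cycle in a wait-for graph. The sketch
below is only an illustration of that reasoning, not code from pgpool-II: it
assumes two pgpool instances (called A and B here) that are in the same state,
and all names in it (waits_for, in_wait_cycle, is_stuck) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Processes from step 7, for two assumed pgpool instances A and B. */
enum { A_PC1, A_PW, A_PP, B_PC1, B_PW, B_PP, NPROC };
#define NOT_WAITING (-1)

/* waits_for[p] is the process that p is blocked on. */
static const int waits_for[NPROC] = {
    [A_PC1] = B_PW,   /* holds A's lock, waits for a response from B's watchdog */
    [A_PW]  = A_PC1,  /* waits for the lock held by A's P_c1 */
    [A_PP]  = A_PC1,  /* waits for the same lock */
    [B_PC1] = A_PW,
    [B_PW]  = B_PC1,
    [B_PP]  = B_PC1,
};

/* True if following the wait edges from 'start' leads back to 'start'. */
bool in_wait_cycle(const int *edges, int n, int start)
{
    int cur = edges[start];
    for (int steps = 0; steps < n && cur != NOT_WAITING; steps++) {
        if (cur == start)
            return true;
        cur = edges[cur];
    }
    return false;
}

/* True if 'start' can never proceed: after n hops without reaching a
 * runnable process, the chain must have entered a cycle. */
bool is_stuck(const int *edges, int n, int start)
{
    int cur = start;
    for (int steps = 0; steps < n; steps++) {
        cur = edges[cur];
        if (cur == NOT_WAITING)
            return false;
    }
    return true;
}
```

In this model P_c1 and P_w of both instances form the cycle, while each P_p is
not part of the cycle but is stuck behind it.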

Fix:

The commit fixes the scenario described above by introducing a queue of request
info entries in place of the single request info structure for failover requests,
so that the process reporting a failover can quickly regain control after
enqueueing the request and continue with its normal duties.
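
As a rough illustration of the idea, a request queue of this kind could look
like the sketch below. This is not the code from the commit: the type and
function names (req_queue, enqueue_request, dequeue_request), the queue size,
and the boolean flag standing in for the REQUEST_INFO_SEM semaphore are all
assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_QUEUED 8            /* hypothetical queue capacity */

typedef enum { REQ_NODE_DOWN, REQ_NODE_UP } req_kind;

typedef struct {
    req_kind kind;
    int      node_id;
} failover_req;

typedef struct {
    failover_req slots[MAX_QUEUED];
    int  head;      /* number of requests dequeued so far */
    int  tail;      /* number of requests enqueued so far */
    bool locked;    /* stand-in for the REQUEST_INFO_SEM semaphore */
} req_queue;

static void q_lock(req_queue *q)   { assert(!q->locked); q->locked = true; }
static void q_unlock(req_queue *q) { assert(q->locked);  q->locked = false; }

/* Reporter side: the lock is held only while one slot is filled in.
 * Returns false if the queue is full. */
bool enqueue_request(req_queue *q, req_kind kind, int node_id)
{
    bool ok = false;

    q_lock(q);
    if (q->tail - q->head < MAX_QUEUED) {
        failover_req *slot = &q->slots[q->tail % MAX_QUEUED];
        slot->kind = kind;
        slot->node_id = node_id;
        q->tail++;
        ok = true;
    }
    q_unlock(q);
    return ok;   /* the caller gets control back right away */
}

/* Consumer side: takes the oldest request, if any. */
bool dequeue_request(req_queue *q, failover_req *out)
{
    bool ok = false;

    q_lock(q);
    if (q->head < q->tail) {
        *out = q->slots[q->head % MAX_QUEUED];
        q->head++;
        ok = true;
    }
    q_unlock(q);
    return ok;
}
```

The point of the sketch is that a second reporter no longer overwrites or waits
on a single shared request slot; it appends its own entry and returns.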

A few minor behavioural changes result from the commit, mainly in the area
where the failover() function notifies the PCP child to wake up after
processing the failover.

	1-) The failover() function now sends the WAKEUP call to the PCP child only
	    after executing all the queued failover/failback requests.
	2-) The restart signal is sent to the PCP child only once, after all queued
	    requests have been processed.
	3-) The failover() function sets the switching flag to true when it starts
	    processing the failover queue, and the flag stays set until the whole
	    queue has been processed.
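
The three points above can be sketched as a single drain loop. The code below
is a simplified model, not the failover() implementation: the structure, the
counters that stand in for the signals sent to the PCP child, and the order in
which the flag is cleared relative to the notifications are assumptions.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_QUEUED 8            /* hypothetical queue capacity */

typedef struct {
    int  node_ids[MAX_QUEUED];  /* queued failover/failback requests */
    int  head, tail;            /* dequeued / enqueued counts */
    bool switching;             /* models the switching flag */
    int  processed;             /* requests handled so far */
    int  restarts_sent;         /* restart signals sent to the PCP child */
    int  wakeups_sent;          /* WAKEUP calls sent to the PCP child */
} failover_state;

/* Placeholder for the real per-request work. */
static void process_one(failover_state *s, int node_id)
{
    (void) node_id;
    assert(s->switching);       /* the flag is set for every request */
    s->processed++;
}

void run_failover(failover_state *s)
{
    s->switching = true;        /* set once, before the first request */

    while (s->head < s->tail) {
        int node_id = s->node_ids[s->head % MAX_QUEUED];
        s->head++;
        process_one(s, node_id);
    }

    /* Only after the queue is empty: one restart, one wakeup. */
    s->restarts_sent++;
    s->wakeups_sent++;

    s->switching = false;       /* assumed: cleared after the whole queue */
}
```

With three queued requests, this model handles all three in one pass and
notifies the PCP child once, instead of once per request.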

Branch
------
V3_3_STABLE

Details
-------
http://git.postgresql.org/gitweb?p=pgpool2.git;a=commitdiff;h=de56c504b100e485a3ddf6bd105e8ede93b641ad

Modified Files
--------------
main.c                  |  757 +++++++++++++++++++++++++----------------------
pcp_child.c             |   23 +-
pool.h                  |   15 +-
pool_proto_modules.c    |    4 +-
recovery.c              |    2 -
watchdog/wd_ext.h       |    2 +-
watchdog/wd_interlock.c |    7 +-
7 files changed, 427 insertions(+), 383 deletions(-)