[pgpool-committers: 7702] pgpool: Fix race condition between detach_false_primary and follow_prim

Tatsuo Ishii ishii at sraoss.co.jp
Tue May 11 22:02:42 JST 2021


Fix race condition between detach_false_primary and follow_primary command.

It was reported that if detach_false_primary and the follow_primary
command run concurrently, many problems occur:

https://www.pgpool.net/pipermail/pgpool-general/2021-April/007583.html

A typical problem is that no primary node is found at the end.

I confirmed that this can be easily reproduced:

https://www.pgpool.net/pipermail/pgpool-hackers/2021-May/003893.html

This commit introduces two new functions,
pool_acquire_follow_primary_lock(bool block) and
pool_release_follow_primary_lock(void), which acquire and release the
lock, respectively. They are used in 3 places:

1) find_primary_node

This function is called upon startup and failover in the main pgpool
process to find the new primary node.

2) failover

This function is called in the follow_primary_command subprocess
forked off by the pgpool main process to execute the
follow_primary_command script. The lock should be held until all
follow_primary_command executions are completed.

3) streaming replication check

Before starting verify_backend_node, which is the workhorse of
detach_false_primary, the lock must be acquired. If the lock cannot be
acquired, the streaming replication check cycle is simply skipped.
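
The locking pattern in the three places above can be illustrated with
a minimal single-process sketch. This is not the code from the commit:
the sketch_* names, the atomic_flag lock state and the counter are
assumptions made for illustration only, and the real functions in
pgpool_main.c have to coordinate separate processes, so their
implementation will differ.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the lock state; not pgpool's data structure. */
static atomic_flag sketch_lock = ATOMIC_FLAG_INIT;

/*
 * Try to take the lock. With block == true, retry until the lock is
 * free; with block == false, return false at once if it is held.
 * (A real implementation would sleep between retries, not spin.)
 */
static bool sketch_acquire_follow_primary_lock(bool block)
{
    while (atomic_flag_test_and_set(&sketch_lock))
    {
        if (!block)
            return false;
    }
    return true;
}

static void sketch_release_follow_primary_lock(void)
{
    atomic_flag_clear(&sketch_lock);
}

/* Counts the check cycles that actually ran (illustration only). */
static int sketch_checks_run = 0;

/*
 * One streaming replication check cycle: if the lock is busy, a
 * follow_primary command is assumed to be in progress and the cycle
 * is skipped.
 */
static bool sketch_sr_check_cycle(void)
{
    if (!sketch_acquire_follow_primary_lock(false))
        return false;           /* skip this cycle */

    sketch_checks_run++;        /* verify_backend_node would run here */
    sketch_release_follow_primary_lock();
    return true;
}
```

In this sketch the caller running the follow_primary command would use
block == true and release the lock only after every command has
finished, while the check cycle uses block == false so that it never
waits.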

The commit also deals with the case when watchdog is enabled.

https://www.pgpool.net/pipermail/pgpool-hackers/2021-May/003894.html

With watchdog enabled, multiple pgpool nodes perform
detach_false_primary concurrently, and this is the cause of the
problem. To fix this, detach_false_primary is performed only on the
leader node. Also, if the quorum is absent, detach_false_primary is
not performed.
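
The rule for the watchdog case can be written as a small predicate.
This is only a sketch: the SketchWatchdogState struct and its field
names are hypothetical and do not correspond to pgpool's actual
watchdog API, and the behavior without watchdog (always allowed) is an
assumption based on the description above.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the local node's watchdog status. */
typedef struct
{
    bool watchdog_enabled;   /* is watchdog in use at all? */
    bool is_leader;          /* is this pgpool node the leader? */
    bool quorum_exists;      /* does the cluster hold a quorum? */
} SketchWatchdogState;

/*
 * Decide whether this node may perform detach_false_primary.
 * With watchdog enabled, only the leader may do so, and only while
 * the quorum exists.
 */
static bool sketch_should_detach_false_primary(const SketchWatchdogState *s)
{
    if (!s->watchdog_enabled)
        return true;         /* assumption: no restriction without watchdog */

    return s->is_leader && s->quorum_exists;
}
```

Restricting the check to the leader means at most one pgpool node acts
on a false primary at a time.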

Branch
------
master

Details
-------
https://git.postgresql.org/gitweb?p=pgpool2.git;a=commitdiff;h=455f00dd5f5b7b94bd91aa0b6b40aab21dceabb9

Modified Files
--------------
src/include/pool.h                                 |  10 +-
src/main/pgpool_main.c                             |  79 ++++++++++++++
src/streaming_replication/pool_worker_child.c      | 120 +++++++++++++++------
.../regression/tests/018.detach_primary/test.sh    |  61 +++++++++++
4 files changed, 237 insertions(+), 33 deletions(-)


