[pgpool-committers: 8577] pgpool: Fix for [pgpool-general: 7896] Possible race condition..

Muhammad Usama m.usama at gmail.com
Mon May 2 07:03:17 JST 2022


Fix for [pgpool-general: 7896] Possible race condition..

The watchdog does not allow remote nodes that life-check reported as lost to
rejoin the cluster until the life-check process confirms that the previously
lost nodes are alive again. This works well except when a node lost by
life-check tries to rejoin the cluster after it was restarted
(Pgpool-II service restarted).
In that case the cluster keeps rejecting the restarted node because the
cluster's life-check has not yet confirmed it, while the restarted node's
life-check waits to be added to the cluster before it starts sending
heartbeats, so neither side can make progress.

The fix is to allow a previously lost remote node to become part of the
cluster again after a restart, regardless of the lost reason.
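
The change in behaviour can be sketched roughly as follows. This is only an
illustration of the decision described above, not the code in watchdog.c:
the type and function names (LostReason, RemoteNode, accept_join_before_fix,
accept_join_after_fix) are hypothetical, and how the watchdog learns that a
node was restarted is assumed to be known to the caller and passed in as
node_restarted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names for illustration; not the identifiers in watchdog.c. */
typedef enum {
    LOST_NONE,          /* node is currently a cluster member */
    LOST_BY_LIFECHECK,  /* life-check reported the node as dead */
    LOST_BY_SHUTDOWN    /* node left the cluster on its own */
} LostReason;

typedef struct {
    LostReason lost_reason;
    bool       lifecheck_alive; /* has life-check seen the node alive again? */
} RemoteNode;

/*
 * Behaviour before the fix, as described in the commit message: a node
 * lost by life-check is rejected until life-check confirms it, even if
 * the join request comes from a freshly restarted node.
 */
bool accept_join_before_fix(const RemoteNode *node, bool node_restarted)
{
    assert(node != NULL);
    (void) node_restarted;  /* a restart made no difference */

    if (node->lost_reason == LOST_BY_LIFECHECK && !node->lifecheck_alive)
        return false;
    return true;
}

/*
 * Behaviour after the fix: a restarted node is accepted no matter why
 * it was lost, which breaks the mutual wait between the two life-checks.
 */
bool accept_join_after_fix(const RemoteNode *node, bool node_restarted)
{
    assert(node != NULL);

    if (node_restarted)
        return true;        /* lost reason is ignored after a restart */

    if (node->lost_reason == LOST_BY_LIFECHECK && !node->lifecheck_alive)
        return false;
    return true;
}
```

With a node marked LOST_BY_LIFECHECK and not yet confirmed alive, the first
function rejects a join request from the restarted node while the second
accepts it. A node that was not restarted is still held back until
life-check confirms it.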

Issue report:
https://www.pgpool.net/pipermail/pgpool-general/2021-November/007954.html

Branch
------
V4_2_STABLE

Details
-------
https://git.postgresql.org/gitweb?p=pgpool2.git;a=commitdiff;h=a7895a55ac60d25cef386a4e7e9f9fcc7ce11731

Modified Files
--------------
src/watchdog/watchdog.c | 21 ++++++---------------
1 file changed, 6 insertions(+), 15 deletions(-)


