[pgpool-committers: 4278] pgpool: New Feature: Quorum and Consensus for backend failover
m.usama at gmail.com
Tue Sep 19 01:20:14 JST 2017
New Feature: Quorum and Consensus for backend failover
Add the ability for Pgpool-II to consider the existence of quorum and to seek
the consensus of the majority of nodes (the Pgpool-II nodes that are part of the
watchdog cluster) in order to validate a backend node failover request.
This feature also changes how the failover commands (failover, failback,
promote-node) are executed.
Now only the Pgpool-II of the master watchdog node executes the failover
The mechanism of synchronised failover across the cluster has given way to
failover on the master. Now only the Pgpool-II of the master watchdog node
performs the failover, and the rest of the Pgpool-II nodes (standby nodes in the
watchdog cluster) re-sync the backend statuses with the master after it has
finished executing the failover.
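The flow described above can be sketched as a small simulation. This is only an illustrative model, not code from the commit: the `PgpoolNode` class, the `handle_failover` function, and the status strings are hypothetical names introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class PgpoolNode:
    """Hypothetical model of one Pgpool-II node in the watchdog cluster."""
    name: str
    is_master: bool
    backend_status: dict = field(default_factory=dict)
    commands_run: list = field(default_factory=list)


def handle_failover(cluster, backend_id):
    """Run the failover on the master only, then let standbys re-sync."""
    master = next(node for node in cluster if node.is_master)
    # Only the master watchdog node executes the failover command.
    master.commands_run.append(f"failover_command backend={backend_id}")
    master.backend_status[backend_id] = "down"
    # Standby nodes copy the backend statuses from the master afterwards.
    for node in cluster:
        if not node.is_master:
            node.backend_status = dict(master.backend_status)


cluster = [
    PgpoolNode("pgpool0", is_master=True),
    PgpoolNode("pgpool1", is_master=False),
    PgpoolNode("pgpool2", is_master=False),
]
handle_failover(cluster, backend_id=1)
```

In this model no standby ever runs the failover command, which is why no cross-node locking is needed.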
No More Failover locks
With this new failover mechanism we no longer need any synchronisation or
guards against the execution of failover commands by multiple Pgpool-II nodes,
so the commit removes all the distributed locks from the failover function.
This makes the failover simpler and faster.
The commit adds a new kind of backend node operation, NODE_QUARANTINE, which is
effectively the same as NODE_DOWN, except that with NODE_QUARANTINE the
configured failover_command is not triggered.
A NODE_DOWN_REQUEST is also automatically converted to a
NODE_QUARANTINE_REQUEST when failover is requested on a backend node but
is rejected by the watchdog cluster because quorum or consensus is missing.
This means that in the absence of quorum or consensus the failed backend nodes
are quarantined, and when quorum and consensus become available again
Pgpool-II performs the failback operation on all the quarantined nodes.
Likewise, when failback is performed on a quarantined backend node, the
failover function does not trigger the failback_command.
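The quarantine rules above can be sketched as follows. This is a minimal model under stated assumptions: `NODE_DOWN_REQUEST` and `NODE_QUARANTINE_REQUEST` are the names used in this message, while the `Request` enum, both function names, and the status strings are hypothetical and do not reflect the actual C implementation.

```python
from enum import Enum


class Request(Enum):
    NODE_DOWN_REQUEST = 1
    NODE_QUARANTINE_REQUEST = 2


def resolve_down_request(accepted_by_watchdog):
    """Return (effective request, whether failover_command is triggered)."""
    if accepted_by_watchdog:
        return Request.NODE_DOWN_REQUEST, True
    # Rejected for missing quorum or consensus: quarantine the node instead,
    # without triggering failover_command.
    return Request.NODE_QUARANTINE_REQUEST, False


def failback_quarantined(statuses):
    """Fail back every quarantined backend once quorum/consensus returns.

    Returns the new statuses and the list of commands triggered, which is
    always empty because failback_command is not run for quarantined nodes.
    """
    restored = {
        node_id: ("up" if status == "quarantine" else status)
        for node_id, status in statuses.items()
    }
    return restored, []
```

Note that a backend that is genuinely down keeps its status in this sketch; only quarantined backends are brought back.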
Controlling the Failover behaviour.
The commit adds three new configuration parameters that let the user configure
the failover behaviour.
failover_when_quorum_exists
When enabled, the failover command is executed only when the watchdog cluster
holds the quorum. When the quorum is absent and failover_when_quorum_exists
is enabled, the failed backend nodes are quarantined until the quorum becomes
available again. Disabling it restores the old behaviour of the failover
commands.
failover_require_consensus
This new configuration parameter can be used to make sure we get the majority
vote before performing the failover on a node. When failover_require_consensus
is enabled, the failover is performed only after receiving the failover
request from the majority (from at least 50% of the nodes) of Pgpool-II nodes.
For example, in a three-node cluster the failover is not performed until at
least two nodes ask to perform the failover on the particular backend node.
Note that failover_require_consensus only works when
failover_when_quorum_exists is enabled.
enable_multiple_failover_requests_from_node
This parameter works in connection with failover_require_consensus. When
enabled, a single Pgpool-II node can vote for failover multiple times.
For example, in a three-node cluster, if one Pgpool-II node sends the failover
request for a particular node twice, that is counted as two votes in favour of
failover, and the failover is performed even if no vote arrives from the
other two nodes.
When enable_multiple_failover_requests_from_node is disabled, only the first
vote from each Pgpool-II node is accepted, and all subsequent votes are
marked as duplicates and rejected.
In that case Pgpool-II requires a majority of votes from distinct nodes to
execute the failover.
Again, enable_multiple_failover_requests_from_node only becomes effective
when both failover_when_quorum_exists and failover_require_consensus are enabled.
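How the three parameters interact can be sketched as a single decision function. This is a simplified model, not the watchdog code: the function name, its arguments, and the representation of votes as a list of node names are hypothetical. The threshold follows the "at least 50% of the nodes" wording above; how the real implementation treats even-sized clusters is an assumption here.

```python
def should_failover(votes, total_nodes, quorum_exists,
                    failover_when_quorum_exists=True,
                    failover_require_consensus=True,
                    enable_multiple_failover_requests_from_node=False):
    """Decide whether a backend failover request may be executed.

    votes is the list of Pgpool-II node names that requested the failover,
    in arrival order; a node name may appear more than once.
    """
    if not failover_when_quorum_exists:
        return True   # old behaviour: no quorum or consensus check
    if not quorum_exists:
        return False  # the backend is quarantined instead
    if not failover_require_consensus:
        return True   # quorum alone is sufficient
    if enable_multiple_failover_requests_from_node:
        count = len(votes)       # repeated votes from one node all count
    else:
        count = len(set(votes))  # duplicates from the same node are rejected
    # "At least 50% of the nodes", e.g. two votes in a three-node cluster.
    return count * 2 >= total_nodes
```

For instance, with votes `["pgpool0", "pgpool0"]` in a three-node cluster, the function returns `True` only when `enable_multiple_failover_requests_from_node` is enabled, matching the example in the text.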
Controlling the failover: The Coding perspective.
Although the failover functions are now quorum and consensus aware, there is
still a way to bypass the quorum conditions and the requirement of consensus.
For this, the commit adds new request_details flags to control the failover
behaviour.
Below are the newly added flag values.
Setting this flag while issuing the failover command will not send the failover
request to the watchdog. This flag is unlikely to be useful anywhere other
than where it is already used.
Its main use is to stop a failover command that already originated from the
watchdog from being sent to the watchdog again. Otherwise we may end up in
an infinite loop.
Setting this flag will bypass the failover_require_consensus configuration and
immediately perform the failover if quorum is present. This flag can be used to
issue a failover request that originated from a PCP command.
This flag is used for the command that fails back the quarantined nodes.
Setting this flag will not trigger the failback_command.
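The effect of the three flags described above can be sketched with a bit-flag type. The flag names below (`SKIP_WATCHDOG`, `BYPASS_CONSENSUS`, `QUARANTINE_FAILBACK`), the `ReqDetail` class, and `plan_request` are hypothetical placeholders chosen for illustration; they are not the identifiers used in the commit, and the numeric values are arbitrary.

```python
from enum import IntFlag


class ReqDetail(IntFlag):
    """Hypothetical stand-ins for the request_details flags."""
    NONE = 0
    SKIP_WATCHDOG = 1        # do not forward the request to the watchdog
    BYPASS_CONSENSUS = 2     # ignore failover_require_consensus
    QUARANTINE_FAILBACK = 4  # failing back a quarantined node


def plan_request(flags):
    """Translate request flags into the three behaviours described above."""
    return {
        # A request that came from the watchdog must not be sent back to it,
        # otherwise it could loop forever.
        "send_to_watchdog": not (flags & ReqDetail.SKIP_WATCHDOG),
        # Quorum is still required; only the consensus step is skipped.
        "require_consensus": not (flags & ReqDetail.BYPASS_CONSENSUS),
        # Failback of a quarantined node does not run failback_command.
        "trigger_failback_command": not (flags & ReqDetail.QUARANTINE_FAILBACK),
    }
```

Flags can be combined with `|`, for example `plan_request(ReqDetail.SKIP_WATCHDOG | ReqDetail.QUARANTINE_FAILBACK)`.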
src/config/pool_config.c | 176 ++---
src/config/pool_config.l | 11 +-
src/config/pool_config_variables.c | 28 +-
src/include/pcp/libpcp_ext.h | 4 +-
src/include/pool.h | 23 +-
src/include/pool_config.h | 7 +-
src/include/watchdog/wd_ipc_commands.h | 14 +-
src/include/watchdog/wd_ipc_defines.h | 25 +-
src/include/watchdog/wd_json_data.h | 4 +-
src/main/health_check.c | 2 +-
src/main/pgpool_main.c | 348 ++++++---
src/pcp_con/pcp_worker.c | 12 +-
src/pcp_con/recovery.c | 2 +-
src/protocol/pool_connection_pool.c | 2 +-
src/protocol/pool_process_query.c | 4 +-
src/protocol/pool_proto_modules.c | 4 +-
src/sample/pgpool.conf.sample | 14 +
src/sample/pgpool.conf.sample-logical | 13 +
src/sample/pgpool.conf.sample-master-slave | 15 +
src/sample/pgpool.conf.sample-replication | 17 +
src/sample/pgpool.conf.sample-stream | 15 +
src/tools/pcp/pcp_frontend_client.c | 17 +-
src/tools/pgmd5/pool_config.c | 176 ++---
src/utils/pool_process_reporting.c | 4 +-
src/utils/pool_ssl.c | 4 +-
src/utils/pool_stream.c | 10 +-
src/watchdog/watchdog.c | 1105 +++++++++-------------------
src/watchdog/wd_commands.c | 204 ++---
src/watchdog/wd_json_data.c | 18 +-
29 files changed, 1008 insertions(+), 1270 deletions(-)