[pgpool-general: 7470] Recovery after a backend has failed

Emond Papegaaij emond.papegaaij at gmail.com
Tue Mar 30 23:33:08 JST 2021


Hi,

We are working on a configuration to get a cluster that requires
minimal effort to keep running and is mostly resilient to failures. We
use streaming replication on PG 12/Pgpool 4.1.4 with the following
settings:

master_slave_mode = on
master_slave_sub_mode = 'stream'
sr_check_period = 5
sr_check_database = 'postgres'
delay_threshold = 0

Health checks are configured as:

health_check_period = 5
health_check_timeout = 20
health_check_database = ''
health_check_max_retries = 0
health_check_retry_delay = 1
connect_timeout = 10000

For failover/failback and consensus we use:

failover_on_backend_error = on
detach_false_primary = off
search_primary_node_timeout = 0
auto_failback = on
auto_failback_interval = 10

failover_when_quorum_exists = on
failover_require_consensus = on
allow_multiple_failover_requests_from_node = off
enable_consensus_with_half_votes = off
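
As an aside on the last setting: the majority rule it selects can be sketched in a few lines of Python. This is only an illustration of the rule as the Pgpool-II documentation describes it (half of the votes is enough when the setting is on, one more than half is needed when it is off). It is not Pgpool-II code, and the function and parameter names are made up for the example.

```python
def has_consensus(votes: int, total_nodes: int, half_votes_ok: bool = False) -> bool:
    """Return True if `votes` is enough for a failover decision.

    half_votes_ok mirrors enable_consensus_with_half_votes (hypothetical name):
    - off: strictly more than half of the watchdog nodes must agree
    - on:  exactly half of the nodes is already sufficient
    """
    if half_votes_ok:
        return 2 * votes >= total_nodes
    return 2 * votes > total_nodes


if __name__ == "__main__":
    # With the setting off (as in our configuration), a 2-node split cannot decide.
    print(has_consensus(1, 2))        # one of two votes, setting off
    print(has_consensus(1, 2, True))  # one of two votes, setting on
    print(has_consensus(2, 3))        # two of three votes, setting off
```

For an odd number of pgpool nodes both variants give the same answer; the setting only matters for even-sized clusters.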

With this setup we get a fairly reliable failover when a backend node
is lost. However, when connectivity to that node is restored, it
sometimes does not rejoin the cluster. When we use pcp_node_info to get
the status of the node, the output is inconsistent:

Hostname : 172.29.30.1
Port : 5432
Status : 3
Weight : 0.250000
Status Name : down
Role : standby
Replication Delay : 0
Replication State : streaming
Replication Sync State : async
Last Status Change : 2021-03-29 16:02:09

So pgpool detects that the node is streaming/async with 0 delay, but
still reports it as down. I would expect this node to be re-attached
automatically because auto_failback = on. In this situation, we
tried restarting the pgpool nodes one by one, but the node kept its
status = down in the cluster. Only after we stopped all pgpool nodes
simultaneously were they able to recover. Is this behavior expected,
or do we need to change something in our configuration for pgpool to
recover in a situation like this?
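
To make the inconsistency concrete, here is a minimal Python sketch that parses output in the "Key : Value" format shown above and flags a standby that is marked down while replication is streaming. It is not part of our setup. The function names are made up for illustration, and it assumes the output has already been captured as text (for example from a verbose pcp_node_info call).

```python
SAMPLE = """\
Hostname : 172.29.30.1
Port : 5432
Status : 3
Weight : 0.250000
Status Name : down
Role : standby
Replication Delay : 0
Replication State : streaming
Replication Sync State : async
Last Status Change : 2021-03-29 16:02:09
"""


def parse_node_info(text: str) -> dict:
    """Parse 'Key : Value' lines into a dict (hypothetical helper)."""
    info = {}
    for line in text.splitlines():
        # Split on the first ' : ' only, so timestamps keep their colons.
        key, sep, value = line.partition(" : ")
        if sep:
            info[key.strip()] = value.strip()
    return info


def is_stuck_down(info: dict) -> bool:
    """True if a standby is reported down although it is streaming."""
    return (
        info.get("Status Name") == "down"
        and info.get("Role") == "standby"
        and info.get("Replication State") == "streaming"
    )


if __name__ == "__main__":
    node = parse_node_info(SAMPLE)
    if is_stuck_down(node):
        print(f"{node['Hostname']}: down but streaming since {node['Last Status Change']}")
```

A check like this only reports the mismatch; it does not explain why auto_failback leaves the node detached.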

Best regards,
Emond Papegaaij

