[pgpool-general: 5321] URGENT: Split-brain scenario...how to prevent?

David Sisk -X (dsisk - TEKSYSTEMS INC at Cisco) dsisk at cisco.com
Fri Feb 10 04:37:44 JST 2017

Hi folks... I'm running PGPool 3.5.4 with Postgres 9.3 in streaming replication mode, with a primary and one standby.  I found a scenario that caused a split-brain (luckily, it's in a lab environment rather than a prod environment).

1) Auto-failover occurred; the original standby is now primary.

2) The prior primary (the intended standby) is at status 3 and has NOT been reset/sync'd in any way yet (it is not replicating and not in standby mode).

3) pcp_attach_node 0 not only attaches the faulty standby, it actually promotes the faulty standby back to primary! :-0

What configuration parameters will prevent this from happening?  I'd prefer to get an error from pcp_attach_node and/or have the node stay at status 3.
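Until there is a configuration-level answer, one possible stopgap is a wrapper that refuses to attach a backend that is not in recovery. The following is only a sketch, not pgpool behavior: the helper names (parse_in_recovery, backend_in_recovery, guarded_attach) are hypothetical, it assumes psql is on the PATH and can connect to the backend directly without a password prompt, and the attach command is passed in by the caller rather than assumed. Note that a node being in recovery does not by itself prove it is replicating from the new primary.

```python
import subprocess


def parse_in_recovery(psql_output: str) -> bool:
    """Interpret the output of: psql -At -c "SELECT pg_is_in_recovery()"."""
    value = psql_output.strip()
    if value == "t":
        return True
    if value == "f":
        return False
    raise ValueError(f"unexpected pg_is_in_recovery() output: {value!r}")


def backend_in_recovery(host: str, port: int, user: str,
                        dbname: str = "postgres") -> bool:
    """Ask one backend directly (not through pgpool) whether it is a standby."""
    result = subprocess.run(
        ["psql", "-h", host, "-p", str(port), "-U", user, "-d", dbname,
         "-At", "-c", "SELECT pg_is_in_recovery()"],
        capture_output=True, text=True, check=True, timeout=20,
    )
    return parse_in_recovery(result.stdout)


def guarded_attach(in_recovery: bool, attach_cmd: list,
                   run=subprocess.run) -> bool:
    """Run attach_cmd only if the node is in recovery.

    Returns True if the command was run, False if the attach was refused.
    attach_cmd is whatever the operator would normally type, for example
    ["pcp_attach_node", "0"] (assumption: adjust to the local PCP setup).
    """
    if not in_recovery:
        # Node is acting as a primary; attaching it risks a split-brain.
        return False
    run(attach_cmd, check=True)
    return True
```

A caller would do something like guarded_attach(backend_in_recovery(host, port, user), attach_cmd) and treat a False return as "leave the node at status 3 and resync it first".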

Here are the health checks I have defined:
health_check_period = 10  # NON-DEFAULT
health_check_timeout = 20
health_check_user = 'postgres'  # NON-DEFAULT
health_check_password = 'postgres'  # NON-DEFAULT
health_check_database = ''
health_check_max_retries = 2  # NON-DEFAULT
health_check_retry_delay = 10  # NON-DEFAULT

And the streaming replication checks. I'm not sure if these are used for load-balancing only, or also used to manage node status:
sr_check_period = 10
sr_check_user = 'postgres'  # NON-DEFAULT
sr_check_password = 'postgres'  # NON-DEFAULT
sr_check_database = 'postgres'
delay_threshold = 100000  # NON-DEFAULT


David Sisk
Engineer - Software
dsisk at cisco.com

Cisco Systems, Inc.
7025-6 Kit Creek Road PO Box 14987
United States

