[pgpool-general: 4847] Infinite cycle with 3 nodes and disallow_to_failover

Tema Zelikin gunslinger at nightflame.info
Tue Aug 2 20:15:39 JST 2016


Hello.

I'm building a multi-node cluster, and some backend nodes are marked
with flag='DISALLOW_TO_FAILOVER'.
Since I need to use failover_command, I've set
fail_over_on_backend_error = on and enabled health checks.
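
For illustration, a setup like the one described could look like the
following pgpool.conf sketch. The parameter names are standard pgpool-II
settings, but the node index, the script path, the health check user and
the numeric values are assumptions chosen for the example, not the exact
configuration in use here.

```
# pgpool.conf sketch; all values are illustrative assumptions

# Backend that must not be failed over (address as seen in the log)
backend_hostname1 = '10.20.30.189'
backend_port1 = 5432
backend_flag1 = 'DISALLOW_TO_FAILOVER'

# Trigger failover when a backend connection error is detected
fail_over_on_backend_error = on
# Hypothetical script path; %d = failed node id, %h = failed node host
failover_command = '/path/to/failover.sh %d %h'

# Health checking (a period of 0 would disable it)
health_check_period = 10
health_check_user = 'pgpool'      # hypothetical user name
health_check_max_retries = 3
health_check_retry_delay = 1
```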
When I detach or kill a backend node that has the DISALLOW_TO_FAILOVER flag,
pgpool stops answering queries and returns this error:
FATAL:  failed to create a backend connection
DETAIL:  executing failover on backend

Meanwhile, pgpool.log shows the following:
Aug  2 14:05:26 localhost pgpool: 2016-08-02 14:05:26: pid 9992: LOG: 
failed to connect to PostgreSQL server on "10.20.30.189:5432",
getsockopt() detected error "Connection refused"
Aug  2 14:05:26 localhost pgpool: 2016-08-02 14:05:26: pid 9992: ERROR: 
failed to make persistent db connection
Aug  2 14:05:26 localhost pgpool: 2016-08-02 14:05:26: pid 9992:
DETAIL:  connection to host:"10.20.30.189:5432" failed
Aug  2 14:05:28 localhost pgpool: 2016-08-02 14:05:28: pid 4463: LOG: 
failed to connect to PostgreSQL server on "10.20.30.189:5432",
getsockopt() detected error "Connection refused"
Aug  2 14:05:28 localhost pgpool: 2016-08-02 14:05:28: pid 4463: ERROR: 
failed to make persistent db connection
Aug  2 14:05:28 localhost pgpool: 2016-08-02 14:05:28: pid 4463:
DETAIL:  connection to host:"10.20.30.189:5432" failed
Aug  2 14:05:28 localhost pgpool: 2016-08-02 14:05:28: pid 4463: LOG: 
failed to connect to PostgreSQL server on "10.20.30.189:5432",
getsockopt() detected error "Connection refused"
Aug  2 14:05:28 localhost pgpool: 2016-08-02 14:05:28: pid 4463: ERROR: 
failed to make persistent db connection
Aug  2 14:05:28 localhost pgpool: 2016-08-02 14:05:28: pid 4463:
DETAIL:  connection to host:"10.20.30.189:5432" failed
Aug  2 14:05:29 localhost pgpool: 2016-08-02 14:05:29: pid 9992: ERROR: 
Failed to check replication time lag
Aug  2 14:05:29 localhost pgpool: 2016-08-02 14:05:29: pid 9992:
DETAIL:  No persistent db connection for the node 1
Aug  2 14:05:29 localhost pgpool: 2016-08-02 14:05:29: pid 9992: HINT: 
check sr_check_user and sr_check_password
Aug  2 14:05:29 localhost pgpool: 2016-08-02 14:05:29: pid 9992:
CONTEXT:  while checking replication time lag
Aug  2 14:05:29 localhost pgpool: 2016-08-02 14:05:29: pid 9992: LOG: 
failed to connect to PostgreSQL server on "10.20.30.189:5432",
getsockopt() detected error "Connection refused"
Aug  2 14:05:29 localhost pgpool: 2016-08-02 14:05:29: pid 9992: ERROR: 
failed to make persistent db connection
Aug  2 14:05:29 localhost pgpool: 2016-08-02 14:05:29: pid 9992:
DETAIL:  connection to host:"10.20.30.189:5432" failed
Aug  2 14:05:31 localhost pgpool: 2016-08-02 14:05:31: pid 4463: LOG: 
health checking retry count 1
Aug  2 14:05:31 localhost pgpool: 2016-08-02 14:05:31: pid 4463: LOG: 
failed to connect to PostgreSQL server on "10.20.30.189:5432",
getsockopt() detected error "Connection refused"
Aug  2 14:05:31 localhost pgpool: 2016-08-02 14:05:31: pid 4463: ERROR: 
failed to make persistent db connection
Aug  2 14:05:31 localhost pgpool: 2016-08-02 14:05:31: pid 4463:
DETAIL:  connection to host:"10.20.30.189:5432" failed
Aug  2 14:05:32 localhost pgpool: 2016-08-02 14:05:32: pid 9992: ERROR: 
Failed to check replication time lag
Aug  2 14:05:32 localhost pgpool: 2016-08-02 14:05:32: pid 9992:
DETAIL:  No persistent db connection for the node 1
Aug  2 14:05:32 localhost pgpool: 2016-08-02 14:05:32: pid 9992: HINT: 
check sr_check_user and sr_check_password
Aug  2 14:05:32 localhost pgpool: 2016-08-02 14:05:32: pid 9992:
CONTEXT:  while checking replication time lag

This continues until I either bring the backend node back up or fully
restart pgpool. What is the point of that flag, then? Or is this a bug?

