[pgpool-general: 7518] Recovery after failure on a single backend

Fri Apr 23 19:05:35 JST 2021

Hi all,

In our application we support both a clustered setup and a setup with
a single node. In the situation with a single node, we still put
pgpool between the application and the database for consistency. This
means pgpool runs as a standalone node, with no watchdog and only a
single backend. This works fine most of the time. However, if the
database (or the connection to it) fails for some reason, pgpool marks
the backend as down and never recovers. As a result, even a brief
network hiccup causes a permanent outage of the service until pgpool
is restarted. Is there any way to make pgpool reconnect to the
database in this setup so that it recovers from this situation?
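
For reference, here is a minimal sketch of the kind of single-backend
configuration described above. Only the host name and port are taken
from the log below; the other values are assumed defaults for
illustration, not a copy of our actual pgpool.conf.

    # Single backend, node 0 (host and port as they appear in the log)
    backend_hostname0 = 'tkh-db'
    backend_port0 = 5432
    # Assumed values, shown for illustration only
    backend_weight0 = 1
    backend_flag0 = 'ALLOW_TO_FAILOVER'
    # Standalone pgpool node, no watchdog
    use_watchdog = off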

Below is the pgpool log from when I manually stop the database. From
that point on, pgpool never recovers, even after the database is
started again. It keeps reporting 'all backend nodes are down, pgpool
requires at least one valid node'.

2021-04-23 11:55:55: pid 14: LOG:  reading and processing packets
2021-04-23 11:55:55: pid 14: DETAIL:  postmaster on DB node 0 was
shutdown by administrative command
2021-04-23 11:55:55: pid 14: LOG:  received degenerate backend request
for node_id: 0 from pid [14]
2021-04-23 11:55:55: pid 1: LOG:  Pgpool-II parent process has
received failover request
2021-04-23 11:55:55: pid 1: LOG:  starting degeneration. shutdown host
tkh-db(5432)
2021-04-23 11:55:55: pid 1: WARNING:  All the DB nodes are in down
status and skip writing status file.
2021-04-23 11:55:55: pid 1: LOG:  failover: no valid backend node found
2021-04-23 11:55:55: pid 1: LOG:  Restart all children
2021-04-23 11:55:55: pid 1: LOG:  find_primary_node_repeatedly: all of
the backends are down. Giving up finding primary node
2021-04-23 11:55:55: pid 1: LOG:  failover: no follow backends are degenerated
2021-04-23 11:55:55: pid 1: LOG:  failover: set new primary node: -1
failover done. shutdown host tkh-db(5432)
2021-04-23 11:55:55: pid 1: LOG:  failover done. shutdown host tkh-db(5432)
2021-04-23 11:55:55: pid 46: LOG:  worker process received restart request
2021-04-23 11:55:56: pid 45: LOG:  restart request received in pcp child process
2021-04-23 11:55:56: pid 1: LOG:  PCP child 45 exits with status 0 in failover()
2021-04-23 11:55:56: pid 1: LOG:  fork a new PCP child pid 80 in failover()
2021-04-23 11:55:56: pid 1: LOG:  child process with pid: 13 exits
with status 256
....
2021-04-23 11:55:56: pid 1: LOG:  child process with pid: 44 exits with status 0
2021-04-23 11:55:56: pid 1: LOG:  worker child process with pid: 46
exits with status 256
2021-04-23 11:55:56: pid 1: LOG:  fork a new worker child process with pid: 81
2021-04-23 11:55:56: pid 81: LOG:  process started
2021-04-23 11:56:00: pid 79: FATAL:  pgpool is not accepting any new connections
2021-04-23 11:56:00: pid 79: DETAIL:  all backend nodes are down,
pgpool requires at least one valid node
2021-04-23 11:56:00: pid 79: HINT:  repair the backend nodes and restart pgpool
2021-04-23 11:56:00: pid 1: LOG:  child process with pid: 79 exits
with status 256
2021-04-23 11:56:00: pid 1: LOG:  fork a new child process with pid: 82
2021-04-23 11:57:00: pid 54: FATAL:  pgpool is not accepting any new connections
2021-04-23 11:57:00: pid 54: DETAIL:  all backend nodes are down,
pgpool requires at least one valid node
2021-04-23 11:57:00: pid 54: HINT:  repair the backend nodes and restart pgpool
2021-04-23 11:57:00: pid 1: LOG:  child process with pid: 54 exits
with status 256
2021-04-23 11:57:00: pid 1: LOG:  fork a new child process with pid: 83
2021-04-23 11:57:00: pid 83: FATAL:  pgpool is not accepting any new connections

Best regards,
Emond