[Pgpool-general] Recovery from network outage

Russ Neufeld russellneufeld at gmail.com
Tue May 25 18:58:00 UTC 2010


Hi all,

	How do I set up pgpool to recover from the occasional network outage?  This morning we briefly lost network connectivity between our web machine and our db machine, and this showed up in /var/log/messages:

May 25 06:35:05 web pgpool: 2010-05-25 06:35:05 ERROR: pid 17169: connect_inet_domain_socket: connect() failed: No route to host
May 25 06:35:05 web pgpool: 2010-05-25 06:35:05 ERROR: pid 17169: connection to 10.177.77.115(5432) failed
May 25 06:35:05 web pgpool: 2010-05-25 06:35:05 ERROR: pid 17169: new_connection: create_cp() failed
May 25 06:35:05 web pgpool: 2010-05-25 06:35:05 LOG:   pid 17169: notice_backend_error: 0 fail over request from pid 17169
May 25 06:35:05 web pgpool: 2010-05-25 06:35:05 LOG:   pid 16889: starting degeneration. shutdown host 10.177.77.115(5432)
May 25 06:35:05 web pgpool: 2010-05-25 06:35:05 ERROR: pid 16889: failover_handler: no valid DB node found
May 25 06:35:05 web pgpool: 2010-05-25 06:35:05 LOG:   pid 16889: failover_handler: set new master node: 1
May 25 06:35:05 web pgpool: 2010-05-25 06:35:05 LOG:   pid 16889: failover done. shutdown host 10.177.77.115(5432)

	I needed to restart pgpool manually for it to recover.  Here's what our pgpool.conf looks like:

listen_addresses = 'localhost'
port = 5432
enable_pool_hba = true
replication_mode = false
load_balance_mode = false
master_slave_mode = false
backend_hostname0 = '10.177.77.115'
backend_port0 = 5432
health_check_period = 0
fail_over_on_backend_error = false
connection_cache = true
num_init_children = 20
max_pool = 2
child_life_time = 300
connection_life_time = 0
child_max_connections = 0
child_idle_limit = 0
authentication_timeout = 30

	Do I need to play with health_check_period and/or health_check_timeout to get this right?  Is there a way to make pgpool resilient to network blips, or is this always a manual recovery?
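
	For reference, the step that fails in the log above is a plain TCP connect() to the backend. Below is a minimal sketch, in Python and not part of pgpool, of a probe that could be run from the web machine to tell whether the db machine is reachable again after a blip. The function name backend_reachable and the 5-second default timeout are made up for illustration; it only tests reachability and does not change pgpool's state.

```python
import socket


def backend_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the address and performs the connect().
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "No route to host", "Connection refused", and timeouts.
        return False


# Example call, using the backend address and port from the pgpool.conf above:
#   backend_reachable("10.177.77.115", 5432)
```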

	Thanks,

		Russ

