[Pgpool-general] Recovery from network outage

Tue May 25 19:23:04 UTC 2010

Russ,

pgpool-II can run a script every time a failover event occurs (see
'failover_command' in the manual). You can write a script that calls
pcp_node_attach to reattach the failed node once connectivity to it is
restored. For example, keep running "SELECT 1;" against the remote
database until it succeeds, which means the connection is back. Then
issue pcp_node_attach for that node, and that's it.
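
As a rough sketch of the idea above (not a tested recipe), something
like the following Python could do the polling and the reattach. The
helper names, the 5-second interval and the PCP credentials are all
made up for illustration. The positional pcp_node_attach arguments
(timeout, host, port, user, password, node ID) are an assumption about
your pgpool-II version, so check the manual for the one you have
installed.

```python
import subprocess
import time


def backend_answers(host, port, user, dbname):
    """Return True if 'SELECT 1;' succeeds against the backend via psql."""
    cmd = ["psql", "-h", host, "-p", str(port), "-U", user,
           "-d", dbname, "-t", "-A", "-c", "SELECT 1;"]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=10)
    except (OSError, subprocess.TimeoutExpired):
        # psql missing, or the connection attempt hung: treat as "still down".
        return False
    return result.returncode == 0 and result.stdout.strip() == "1"


def wait_until(probe, interval=5.0, max_attempts=None, sleep=time.sleep):
    """Call probe() until it returns True; give up after max_attempts if set."""
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        if probe():
            return True
        sleep(interval)
    return False


def attach_command(pcp_host, pcp_port, pcp_user, pcp_password, node_id, timeout=10):
    """Build the pcp_node_attach argument list.

    ASSUMPTION: positional form (timeout host port user password node_id).
    Other pgpool-II versions may expect different arguments.
    """
    return ["pcp_node_attach", str(timeout), pcp_host, str(pcp_port),
            pcp_user, pcp_password, str(node_id)]


def reattach_when_reachable(probe, command, run=subprocess.call, **wait_args):
    """Wait for the backend to answer, then run the attach command.

    Returns the command's exit status, or None if the wait gave up.
    """
    if not wait_until(probe, **wait_args):
        return None
    return run(command)


if __name__ == "__main__":
    # Hypothetical values: backend from the pgpool.conf in this thread,
    # PCP port, user and password invented for the example.
    probe = lambda: backend_answers("10.177.77.115", 5432, "postgres", "postgres")
    cmd = attach_command("localhost", 9898, "pcpuser", "pcppass", node_id=0)
    reattach_when_reachable(probe, cmd, interval=5.0)
```

A script set as failover_command could start this in the background,
so the node is reattached automatically after a short outage.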

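BTW, looking at your pgpool.conf, you only have one backend node
configured, so pgpool has nothing to fail over to once that node is
detached (hence the "no valid DB node found" error in your log). I'm
assuming the single-node setup is intentional.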

Daniel

> -----Original Message-----
> From: pgpool-general-bounces at pgfoundry.org [mailto:pgpool-general-
> bounces at pgfoundry.org] On Behalf Of Russ Neufeld
> Sent: Tuesday, May 25, 2010 2:58 PM
> To: pgpool-general at pgfoundry.org
> Subject: [Pgpool-general] Recovery from network outage
> 
> Hi all,
> 
> 	How do I set up pgpool to recover from the occasional network
> outage?  This morning we briefly lost network connectivity between our
> web machine and our db machine, and this showed up in
> /var/log/messages:
> 
> May 25 06:35:05 web pgpool: 2010-05-25 06:35:05 ERROR: pid 17169:
> connect_inet_domain_socket: connect() failed: No route to host
> May 25 06:35:05 web pgpool: 2010-05-25 06:35:05 ERROR: pid 17169:
> connection to 10.177.77.115(5432) failed
> May 25 06:35:05 web pgpool: 2010-05-25 06:35:05 ERROR: pid 17169:
> new_connection: create_cp() failed
> May 25 06:35:05 web pgpool: 2010-05-25 06:35:05 LOG:   pid 17169:
> notice_backend_error: 0 fail over request from pid 17169
> May 25 06:35:05 web pgpool: 2010-05-25 06:35:05 LOG:   pid 16889:
> starting degeneration. shutdown host 10.177.77.115(5432)
> May 25 06:35:05 web pgpool: 2010-05-25 06:35:05 ERROR: pid 16889:
> failover_handler: no valid DB node found
> May 25 06:35:05 web pgpool: 2010-05-25 06:35:05 LOG:   pid 16889:
> failover_handler: set new master node: 1
> May 25 06:35:05 web pgpool: 2010-05-25 06:35:05 LOG:   pid 16889:
> failover done. shutdown host 10.177.77.115(5432)
> 
> 	I needed to restart pgpool manually for it to recover.  Here's
> what our pgpool.conf looks like:
> 
> listen_addresses = 'localhost'
> port = 5432
> enable_pool_hba = true
> replication_mode = false
> load_balance_mode = false
> master_slave_mode = false
> backend_hostname0 = '10.177.77.115'
> backend_port0 = 5432
> health_check_period = 0
> fail_over_on_backend_error = false
> connection_cache = true
> num_init_children = 20
> max_pool = 2
> child_life_time = 300
> connection_life_time = 0
> child_max_connections = 0
> child_idle_limit = 0
> authentication_timeout = 30
> 
> 	Do I need to play with health_check_period and/or
> health_check_timeout to get this right?  Is there a way to make pgpool
> resilient to network blips, or is this always a manual recovery?
> 
> 	Thanks,
> 
> 		Russ