[Pgpool-general] problem with healthcheck

Thu Feb 3 22:24:02 UTC 2011

On 03/02/2011 16:52, Michal Slocinski wrote:
> Hi!
> 
> I'm a pgpool newbie trying to set up a cluster of two PostgreSQL servers.
> 
> I configured pgpool.conf as follows:
> 
> # $Header: /cvsroot/pgpool/pgpool-II/pgpool.conf.sample-replication,v 1.11 2010/09/01 04:58:47 kitagawa Exp $
> listen_addresses = '*'
> port = 9999
> pcp_port = 9898
> socket_dir = '/tmp'
> pcp_socket_dir = '/tmp'
> backend_socket_dir = '/tmp'
> pcp_timeout = 10
> num_init_children = 32
> max_pool = 4
> child_life_time = 300
> connection_life_time = 0
> child_max_connections = 0
> client_idle_limit = 0
> authentication_timeout = 60
> logdir = '/tmp'
> pid_file_name = '/var/run/pgpool/pgpool.pid'
> replication_mode = true
> load_balance_mode = false
> replication_stop_on_mismatch = false
> failover_if_affected_tuples_mismatch = false
> replicate_select = false
> reset_query_list = 'ABORT; DISCARD ALL'
> white_function_list = ''
> black_function_list = 'nextval,setval'
> print_timestamp = true
> master_slave_mode = false
> master_slave_sub_mode = 'slony'
> delay_threshold = 0
> log_standby_delay = 'none'
> connection_cache = true
> health_check_timeout = 20
> health_check_period = 1
> health_check_user = 'nobody'
> failover_command = ''
> failback_command = ''
> fail_over_on_backend_error = true
> insert_lock = true
> ignore_leading_white_space = true
> log_statement = false
> log_per_node_statement = false
> log_connections = false
> log_hostname = false
> parallel_mode = false
> enable_query_cache = false
> pgpool2_hostname = ''
> backend_hostname0 = '172.16.2.72'
> backend_port0 = 5432
> backend_weight0 = 1
> backend_data_directory0 = '/var/lib/pgsql/data'
> backend_hostname1 = '172.16.2.73'
> backend_port1 = 5432
> backend_weight1 = 1
> backend_data_directory1 = '/var/lib/pgsql/data'
> enable_pool_hba = true
> recovery_user = 'nobody'
> recovery_password = ''
> recovery_1st_stage_command = ''
> recovery_2nd_stage_command = ''
> recovery_timeout = 90
> client_idle_limit_in_recovery = 0
> lobj_lock_table = ''
> ssl = false
> debug_level = 1
> 
> 
> The problem I have is that pgpool seems to have trouble checking the
> status of the PostgreSQL instances.
> 
> For example, I start both PostgreSQL servers and start pgpool. pgpool
> sees both of them as active, and I can connect to pgpool and execute
> a query. Everything is fine.
> 
> 2011-02-03 16:50:56 DEBUG: pid 22649: starting health checking
> 2011-02-03 16:50:56 DEBUG: pid 22649: health_check: 0 th DB node status: 1
> 2011-02-03 16:50:56 DEBUG: pid 22649: health_check: 1 th DB node status: 1
> 
> Now I shut down one of the servers, and pgpool correctly discovers
> that it is down.
> 
> 2011-02-03 16:50:56 DEBUG: pid 22649: starting health checking
> 2011-02-03 16:50:56 DEBUG: pid 22649: health_check: 0 th DB node status: 1
> 2011-02-03 16:50:56 DEBUG: pid 22649: health_check: 1 th DB node status: 3
> 
> However, when I bring it up again, pgpool does not seem to be able to
> detect that the node is back.
> 
> 2011-02-03 16:50:56 DEBUG: pid 22649: starting health checking
> 2011-02-03 16:50:56 DEBUG: pid 22649: health_check: 0 th DB node status: 1
> 2011-02-03 16:50:56 DEBUG: pid 22649: health_check: 1 th DB node status: 3
> 

You are in replication mode. When you stopped node 1, node 0 kept
working, but node 1 could not be kept current while it was down. So when
you start node 1 again, it can't be used for replication, because its
data may no longer match node 0's.
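
The status values in the debug log quoted above fit this picture: as far
as I know, 1 and 2 both mean the node is up (without and with pooled
connections) and 3 means it is down. A minimal sketch for pulling the
last reported status of each node out of such a log; the function names
are made up for the illustration, and the line format is simply the one
shown in the quoted log:

```python
import re

# Meaning of pgpool-II backend status codes (as reported by the health check).
STATUS_MEANING = {
    0: "unused (initialization only)",
    1: "up, no connections yet",
    2: "up, connections pooled",
    3: "down",
}

# Matches lines such as:
#   ... DEBUG: pid 22649: health_check: 1 th DB node status: 3
HEALTH_LINE = re.compile(r"health_check: (\d+) th DB node status: (\d+)")


def latest_node_status(log_lines):
    """Return {node_id: status}, keeping the last status seen for each node."""
    status = {}
    for line in log_lines:
        match = HEALTH_LINE.search(line)
        if match:
            status[int(match.group(1))] = int(match.group(2))
    return status


def down_nodes(log_lines):
    """Return the sorted ids of nodes whose last reported status is 3 (down)."""
    return sorted(
        node for node, code in latest_node_status(log_lines).items() if code == 3
    )
```

With the last three log lines from the quoted message, down_nodes would
report node 1, which is what pgpool keeps showing until the node is
reattached by hand.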

So, before restarting node 1, you need to rebuild it from node 0's data.
I'm not very familiar with the steps involved, but I'm sure it's
normal for node 1 not to be reattached automatically.
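
Once node 1 has been rebuilt, it has to be reattached explicitly. As far
as I know this is done with the pcp_attach_node tool, which in this
generation of pgpool-II takes positional arguments (timeout, host, PCP
port, PCP user, password, node id). A minimal sketch that only builds
the command line and does not run it; the user name and password are
placeholders (the real ones would come from your pcp.conf), 9898 is the
pcp_port from your configuration, and build_attach_command is a name
made up for the illustration:

```python
def build_attach_command(node_id, host="localhost", pcp_port=9898,
                         user="pcp_user", password="pcp_password",
                         timeout=10):
    """Build the argument list for reattaching a backend node.

    Assumes the positional form
        pcp_attach_node timeout hostname port username password nodeID
    The user and password defaults are hypothetical placeholders.
    """
    return [
        "pcp_attach_node",
        str(timeout),
        host,
        str(pcp_port),
        user,
        password,
        str(node_id),
    ]


if __name__ == "__main__":
    # Print the command for node 1 so it can be reviewed before running it.
    print(" ".join(build_attach_command(1)))
```

Please check the usage message of pcp_attach_node on your installation
before running anything, since the argument order is an assumption on my
part.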

-- 
Guillaume
 http://www.postgresql.fr
 http://dalibo.com