0000380: When a node is down, the first query to the cluster fails - Pgpool-II Bug Tracker

ID	Project	Category	View Status	Date Submitted	Last Update

0000380	Pgpool-II	General	public	2018-02-14 18:03	2018-02-17 00:12

Reporter	chg1995	Assigned To	t-ishii
Priority	normal	Severity	major	Reproducibility	always
Status	closed	Resolution	open
Product Version	3.6.7

Summary	0000380: When a node is down, the first query to the cluster fails
Description	I have the following environment: a first machine with Pgpool-II and Postgres, and a second machine with Postgres. The Postgres servers are Pgpool-II's nodes:

 node_id | hostname    | port | status | lb_weight | role   | select_cnt | load_balance_node | replication_delay
---------+-------------+------+--------+-----------+--------+------------+-------------------+-------------------
 0       | 192.168.0.6 | 5432 | up     | 0.333333  | master | 0          | false             | 0
 1       | 192.168.0.7 | 5432 | up     | 0.666667  | slave  | 0          | true              | 0

The problem is that when I stop the service on either node, the first query I make after the stop fails with the following error:

psql: FATAL: failed to create a backend connection
DETAIL: executing failover on backend

After this error, I can execute any query successfully. I hope you can help me. Thank you.
Steps To Reproduce	Pgpool-II is configured with the pgpool.conf file attached to this ticket.
Both nodes are up.
We kill one of the nodes (master or slave, it doesn't matter which).
We run a query, for example checking which nodes are active:

sudo -u postgres psql -h localhost -p 5433 -c "show pool_nodes;"

We get the following error:

psql: FATAL: failed to create a backend connection
DETAIL: executing failover on backend

We run the query again:

sudo -u postgres psql -h localhost -p 5433 -c "show pool_nodes;"

And it runs successfully:

 node_id | hostname    | port | status | lb_weight | role   | select_cnt | load_balance_node | replication_delay
---------+-------------+------+--------+-----------+--------+------------+-------------------+-------------------
 0       | 192.168.0.6 | 5432 | down   | 0.333333  | slave  | 0          | false             | 0
 1       | 192.168.0.7 | 5432 | up     | 0.666667  | master | 0          | true              | 0
(2 rows)
Additional Information	I attach the pgpool.conf file
Tags	No tags attached.

chg1995 2018-02-14 18:03 reporter	pgpool.conf (39,006 bytes)

t-ishii 2018-02-16 14:10 developer ~0001936	That's expected behavior given your pgpool.conf:

health_check_max_retries = 3000
health_check_retry_delay = 1

This means it will take 3000 seconds, nearly 1 hour, before Pgpool-II does a failover. Until then, Pgpool-II will do a failover when a client connects to Pgpool-II, because:

fail_over_on_backend_error = on
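The arithmetic in the note above can be sketched as follows. This is a rough illustration only: the helper name `retry_window_seconds` is hypothetical, and the calculation counts only the delay between retries, ignoring `health_check_period` and the time each individual check attempt takes, so the real figure would be somewhat larger.

```python
# Values quoted in the developer's note (taken from the reporter's pgpool.conf).
health_check_max_retries = 3000
health_check_retry_delay = 1  # seconds to wait between retries


def retry_window_seconds(max_retries: int, retry_delay: int) -> int:
    """Approximate time spent retrying before the health check gives up.

    Hypothetical helper, not part of Pgpool-II. It only multiplies the
    retry count by the delay between retries.
    """
    return max_retries * retry_delay


window = retry_window_seconds(health_check_max_retries, health_check_retry_delay)
print(f"{window} s = {window / 60:.0f} min")  # 3000 s = 50 min

# With no retries, the retry window disappears entirely.
print(retry_window_seconds(0, health_check_retry_delay))  # 0
```

During that window the health check has not yet marked the node as down, so the first client connection is what triggers the failover and receives the error. Reducing `health_check_max_retries` shortens the window, assuming health checking is actually enabled in the configuration.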

chg1995 2018-02-16 21:44 reporter ~0001937	Thank you very much for the answer. The problem is now solved by setting the option health_check_max_retries to 0.

t-ishii 2018-02-17 00:12 developer ~0001938	Thank you for the feedback. I will close this issue.

Date Modified	Username	Field	Change
2018-02-14 18:03	chg1995	New Issue
2018-02-14 18:03	chg1995	File Added: pgpool.conf
2018-02-16 11:04	t-ishii	Assigned To	=> t-ishii
2018-02-16 11:04	t-ishii	Status	new => assigned
2018-02-16 14:10	t-ishii	Note Added: 0001936
2018-02-16 14:41	t-ishii	Status	assigned => feedback
2018-02-16 21:44	chg1995	Note Added: 0001937
2018-02-16 21:44	chg1995	Status	feedback => assigned
2018-02-17 00:12	t-ishii	Note Added: 0001938
2018-02-17 00:12	t-ishii	Status	assigned => closed