0000279: Optionally dismiss 57P01 as a clean disconnection when failover_on_backend_error is off and health checks are enabled - Pgpool-II Bug Tracker

ID	Project	Category	View Status	Date Submitted	Last Update

0000279	Pgpool-II	Enhancement	public	2017-01-10 03:52	2017-02-08 13:39

Reporter	z0rb1n0	Assigned To	t-ishii
Priority	high	Severity	feature	Reproducibility	always
Status	resolved	Resolution	open
Platform	PC	OS	Any compatible unix	OS Version	Any
Product Version	3.6.1

Summary	0000279: Optionally dismiss 57P01 as a clean disconnection when failover_on_backend_error is off and health checks are enabled
Description	Upstream session events such as TCP resets and administrative shutdowns (and, unfortunately, pg_terminate_backend(), because it returns the same SQLSTATE) are the only way Pgpool-II can infer backend availability and trigger a failover when periodic health checks are disabled. For shutdowns, this happens regardless of how failover_on_backend_error is set. When health checks are enabled, this behavior seems redundant and possibly harmful: the ability to terminate sessions, or even to quickly restart a master backend before the health check retries are exhausted and a failover is triggered, is a valuable asset for safer maintenance. Additionally, many Postgres admins have no control over which SQL functions their users call, which creates availability concerns. Is there any other limitation to such a change that I'm overlooking, or is it just legacy behavior that could be changed relatively easily? TIA F
Steps To Reproduce	Just do anything causing 57P01 to be returned and the whole PG server will be written off as dead
Tags	backend, error, failover, settings

t-ishii 2017-01-10 13:50 developer ~0001287	Pgpool-II 3.6 or later deals with the "admin shutdown problem". http://www.pgpool.net/docs/latest/en/html/release-3-6.html In some cases pg_terminate_backend() now does not trigger a fail-over. (Muhammad Usama) Because PostgreSQL returns exactly the same error code in the postmaster-down case and in the pg_terminate_backend() case, using pg_terminate_backend() raises a failover which the user might not want. To fix this, Pgpool-II now finds the pid of the backend which is the target of pg_terminate_backend() and does not trigger a failover for it. This function is limited to the case where the simple protocol is used and the pid is given to pg_terminate_backend() as a constant. So if you call pg_terminate_backend() via the extended protocol (e.g. Java), pg_terminate_backend() still triggers a failover.
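
The constant-pid limitation described in the note above can be illustrated with a rough Python sketch. This is not Pgpool-II's actual implementation (which is written in C and is not shown in this thread); the regular expression and the function name terminated_pid are hypothetical, chosen only to show why a literal pid can be recognized in the query text while a pid computed at run time cannot.

```python
import re

# Hypothetical pattern: matches pg_terminate_backend(<integer literal>) only.
# A column reference or expression as the argument will not match.
_CONST_PID_CALL = re.compile(
    r"pg_terminate_backend\s*\(\s*(\d+)\s*\)", re.IGNORECASE
)


def terminated_pid(query: str):
    """Return the literal pid passed to pg_terminate_backend(), or None.

    None means the target pid cannot be read from the query text, so a
    proxy relying on this check could not tell the resulting 57P01 apart
    from a real shutdown.
    """
    match = _CONST_PID_CALL.search(query)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    # Constant pid: the target is visible in the query text.
    print(terminated_pid("SELECT pg_terminate_backend(12345)"))
    # Pid taken from a column: the target is only known to the server.
    print(terminated_pid(
        "SELECT pg_terminate_backend(pid) FROM pg_stat_activity"
    ))
```

Under these assumptions the first query yields a pid that can be matched against the session that later reports 57P01, while the second yields nothing to match against.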

z0rb1n0 2017-01-10 19:07 reporter ~0001288	That's very clear; however, pg_terminate_backend() is just one case (and it could be called from, say, a query/function/trigger/rule, which would evade this detection. E.g., we just accidentally caused a failover by calling SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE [...] ). My point is mostly about ignoring any kind of backend session error/termination/reset and initiating a failover ONLY when health checks fail and run out of retries. After all, health checks are all that a lot of users count on.
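
The policy requested in this note could be sketched roughly as follows. This is a minimal illustration in Python, not Pgpool-II code: the class FailoverPolicy, its fields, and its method names are all hypothetical, and max_retries merely stands in for whatever retry limit the health check is configured with. The only fact taken as given is that 57P01 is the SQLSTATE PostgreSQL reports for an administrative shutdown.

```python
from dataclasses import dataclass

ADMIN_SHUTDOWN = "57P01"  # SQLSTATE for admin_shutdown


@dataclass
class FailoverPolicy:
    """Hypothetical decision logic for the behavior proposed in this issue."""

    health_checks_enabled: bool
    max_retries: int  # assumed retry limit for failed health checks
    consecutive_failures: int = 0

    def on_session_error(self, sqlstate: str) -> bool:
        """Return True if this session error alone should trigger a failover."""
        if self.health_checks_enabled:
            # Proposed behavior: leave the decision to the health checks.
            return False
        # Without health checks, session events are the only signal left.
        return sqlstate == ADMIN_SHUTDOWN

    def on_health_check(self, ok: bool) -> bool:
        """Record one health check result; True once retries are exhausted."""
        if ok:
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        return self.consecutive_failures > self.max_retries
```

With health checks enabled, a 57P01 caused by pg_terminate_backend() or by a quick restart would be ignored, and a failover would follow only after the initial failed check plus max_retries further failures in a row. A backend that comes back before then resets the counter.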

t-ishii 2017-01-10 22:16 developer ~0001289	Ok. Can you please continue the discussion at the mailing list? This is not a forum for new feature discussions.

Date Modified	Username	Field	Change
2017-01-10 03:52	z0rb1n0	New Issue
2017-01-10 03:54	z0rb1n0	Tag Attached: backend
2017-01-10 03:54	z0rb1n0	Tag Attached: error
2017-01-10 03:54	z0rb1n0	Tag Attached: failover
2017-01-10 03:54	z0rb1n0	Tag Attached: settings
2017-01-10 13:50	t-ishii	Note Added: 0001287
2017-01-10 19:07	z0rb1n0	Note Added: 0001288
2017-01-10 22:16	t-ishii	Note Added: 0001289
2017-02-08 13:39	t-ishii	Assigned To	=> t-ishii
2017-02-08 13:39	t-ishii	Status	new => resolved