0000542: pgpool service will not come up after failover

ID	Project	Category	View Status	Date Submitted	Last Update

0000542	Pgpool-II	Bug	public	2019-09-02 16:53	2020-04-14 15:48

Reporter	avi	Assigned To	hoshiai
Priority	high	Severity	major	Reproducibility	sometimes
Status	feedback	Resolution	open
Product Version	3.6.9

Summary	0000542: pgpool service will not come up after failover
Description	We have a 2-node Postgres 9.6.7 cluster with streaming replication. Pgpool is managed by our cluster as a managed resource and therefore has only one instance running. We have seen cases where, when the machine hosting the Postgres master and pgpool is shut down, Postgres on the other machine assumes the role of master. The cluster also attempts to start pgpool, but the pgpool service fails to start. In such a case the Postgres database can be accessed on port 5432, but pgpool does not come up and we cannot access the database on port 9999. I do not know why, but at some point the cluster manages to bring up pgpool. Attached please find the log messages generated when I ran pgpool in debug mode.
172.18.255.42 is the down server.
172.18.255.41 is the active server, on which pgpool does not come up.
2019-09-01 15:59 is the time the cluster managed to bring up pgpool.
Your help is most needed.
Tags	No tags attached.

avi 2019-09-02 16:53 reporter	pgpool.conf (34,738 bytes) pgpool_messages.zip (2,020,083 bytes)

hoshiai 2019-09-27 16:39 developer ~0002890	Sorry for the late reply. Looking at your log file, I noticed that pgpool was stopped and started many times using the pgpool command. Why do you run the pgpool stop and start commands so many times? If you don't run the pgpool stop command, is the problem resolved?

avi 2019-10-03 18:54 reporter ~0002908	Thanks for the reply. We have a shell script that checks Postgres and pgpool health every minute. If the script succeeds in accessing the database on port 5432 but fails on port 9999 (and the VIP), it attempts to restart pgpool. As you can see, it attempted this many times, but it kept failing to access the database on port 9999, so it kept trying. At some point, which I assume is when some timeout expires, pgpool is restarted and everything works OK.
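
For readers following the thread, the check described in this note can be sketched roughly as below. This is only an illustration, not the reporter's actual shell script: it probes the TCP ports instead of running a real query, and the names `port_open` and `needs_pgpool_restart` are hypothetical. The port numbers 5432 and 9999 are the ones mentioned in the report.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds.

    Assumption: a plain TCP connect stands in for the real database
    access that the reporter's script performs.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def needs_pgpool_restart(db_ok: bool, pool_ok: bool) -> bool:
    """Restart is requested only when Postgres answers but pgpool does not."""
    return db_ok and not pool_ok


def check_once(host: str) -> bool:
    """One pass of the health check (run every minute in the report)."""
    db_ok = port_open(host, 5432)    # direct Postgres port
    pool_ok = port_open(host, 9999)  # pgpool port
    return needs_pgpool_restart(db_ok, pool_ok)
```

Note that a check of this shape requests a restart on every pass for as long as port 9999 stays unreachable, which matches the repeated stop and start entries seen in the attached log.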

hoshiai 2019-10-29 14:02 developer ~0002945	I see. The cause is the repeated restarting of pgpool by your script. When the primary PostgreSQL node (172.18.255.42) is down, pgpool needs to promote the standby node, but the promotion never finishes because pgpool is restarted many times.
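
If, as this note suggests, the restart loop is what keeps pgpool from finishing its failover handling, one possible mitigation is to make the monitoring script wait for a grace period after each restart before trying again. The sketch below shows only that decision logic. The names `WatchdogState` and `decide_restart` are hypothetical, and the 300-second default is an arbitrary assumption, not a value taken from this thread or from the pgpool documentation. Whether this resolves the reported problem has not been confirmed here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WatchdogState:
    # Time (in seconds) of the most recent restart request, or None if
    # the watchdog has not restarted pgpool yet.
    last_restart: Optional[float] = None


def decide_restart(now: float, db_ok: bool, pool_ok: bool,
                   state: WatchdogState,
                   grace_seconds: float = 300.0) -> bool:
    """Decide whether to restart pgpool on this check and record it.

    A restart is requested only when Postgres is reachable, pgpool is
    not, and at least grace_seconds have passed since the last restart.
    grace_seconds is an assumed value chosen for illustration.
    """
    if not db_ok or pool_ok:
        return False  # nothing to do, or Postgres itself is down
    if (state.last_restart is not None
            and now - state.last_restart < grace_seconds):
        return False  # still inside the grace period, leave pgpool alone
    state.last_restart = now
    return True
```

With a one-minute check interval and the assumed 300-second grace period, a failing pgpool would be restarted at most once every five checks instead of on every check.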

hoshiai 2020-04-14 15:48 developer ~0003324	Do you have any questions? If you don't have any problems, I would like to close this issue.

Date Modified	Username	Field	Change
2019-09-02 16:53	avi	New Issue
2019-09-02 16:53	avi	File Added: pgpool.conf
2019-09-02 16:53	avi	File Added: pgpool_messages.zip
2019-09-07 06:48	t-ishii	Assigned To	=> hoshiai
2019-09-07 06:48	t-ishii	Status	new => assigned
2019-09-07 06:48	t-ishii	Description Updated
2019-09-27 16:39	hoshiai	Status	assigned => feedback
2019-09-27 16:39	hoshiai	Note Added: 0002890
2019-10-03 18:54	avi	Note Added: 0002908
2019-10-03 18:54	avi	Status	feedback => assigned
2019-10-29 14:02	hoshiai	Status	assigned => feedback
2019-10-29 14:02	hoshiai	Note Added: 0002945
2020-04-14 15:48	hoshiai	Note Added: 0003324