View Issue Details

ID: 0000542
Project: Pgpool-II
Category: Bug
View Status: public
Last Update: 2020-04-14 15:48
Reporter: avi
Assigned To: hoshiai
Priority: high
Severity: major
Reproducibility: sometimes
Status: feedback
Resolution: open
Product Version: 3.6.9
Target Version:
Fixed in Version:
Summary: 0000542: pgpool service will not come up after failover
Description: We have a two-node Postgres 9.6.7 cluster with streaming replication. Pgpool is run by our cluster as a managed resource, so only one instance of it is running at any time.

We have seen cases where the machine hosting both the Postgres master and pgpool is shut down, and Postgres on the other machine takes over the master role. The cluster also attempts to start pgpool there, but the pgpool service fails to start.

In that case the Postgres database can be reached on port 5432, but pgpool does not come up and we cannot reach the database on port 9999. I do not know why, but at some point the cluster does manage to bring up pgpool. Attached are the log messages generated when I ran pgpool in debug mode.

172.18.255.42 is the down server
172.18.255.41 is the active server, on which pgpool does not come up

2019-09-01 15:59 is the time the cluster managed to bring up pgpool

Your help is most needed.
Tags: No tags attached.

Activities

avi

2019-09-02 16:53

reporter  

pgpool_messages.zip (2,020,083 bytes)
pgpool.conf (34,738 bytes)

hoshiai

2019-09-27 16:39

developer   ~0002890

Sorry for the late reply.

Looking at your log file, I am concerned that pgpool was stopped and started many times using the pgpool command.
Why do you run the pgpool stop and start commands so many times?

If you do not run the pgpool stop command, is this problem resolved?

avi

2019-10-03 18:54

reporter   ~0002908

Thanks for the reply. We have a shell script that checks Postgres and pgpool health every minute. If the script can access the database on port 5432 but fails on port 9999 (through the VIP), it attempts to restart pgpool.

As you can see, it tried many times, but each time it still could not access the database on port 9999, so it kept on trying. At some point, which I assume is when some timeout expires, pgpool is restarted and everything works OK.
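
For readers following along, the check described above can be sketched roughly as follows. This is not the actual shell script, which is not attached to this issue. It is a minimal Python illustration that assumes a plain TCP connection test stands in for "accessing the database"; the function names `port_is_open`, `should_restart_pgpool` and `check_once` are hypothetical.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Assumption: a successful TCP connect is used as a stand-in for a real
    database login, which the original script may perform instead.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def should_restart_pgpool(postgres_ok: bool, pgpool_ok: bool) -> bool:
    """Restart only when Postgres answers directly but pgpool does not."""
    return postgres_ok and not pgpool_ok


def check_once(db_host: str, vip: str) -> bool:
    """Run one health check; return True if a pgpool restart would be attempted."""
    postgres_ok = port_is_open(db_host, 5432)  # direct Postgres port
    pgpool_ok = port_is_open(vip, 9999)        # pgpool port, reached via the VIP
    return should_restart_pgpool(postgres_ok, pgpool_ok)
```

Run once a minute, `check_once` returns True on every pass for as long as pgpool is not yet listening on port 9999, which matches the repeated restart attempts visible in the log.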

hoshiai

2019-10-29 14:02

developer   ~0002945

I see. The cause is the repeated restarting of pgpool by your script. When the primary PostgreSQL node (172.18.255.42) goes down, pgpool needs to promote the standby node, but the promotion never finishes because pgpool is restarted many times.
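
One possible way to keep a monitoring script from interrupting the promotion is to limit how often it may restart pgpool. The sketch below is only an illustration under assumptions: the class name `RestartThrottle` is hypothetical, and the 300-second interval in the usage note is a placeholder, since this thread does not say how long the promotion takes.

```python
from typing import Optional


class RestartThrottle:
    """Allow at most one restart per min_interval seconds (hypothetical helper)."""

    def __init__(self, min_interval: float) -> None:
        self.min_interval = min_interval
        self.last_restart: Optional[float] = None

    def allow(self, now: float) -> bool:
        """Return True and record the restart if enough time has passed."""
        if (self.last_restart is not None
                and now - self.last_restart < self.min_interval):
            # Too soon after the previous restart: leave pgpool alone so that
            # any failover in progress has a chance to finish.
            return False
        self.last_restart = now
        return True
```

A script could create `RestartThrottle(300.0)` once and call `allow(time.monotonic())` before each restart attempt, skipping the restart whenever it returns False.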

hoshiai

2020-04-14 15:48

developer   ~0003324

Do you have any questions?
If you don't have any problems, I would like to close this issue.

Issue History

Date Modified Username Field Change
2019-09-02 16:53 avi New Issue
2019-09-02 16:53 avi File Added: pgpool.conf
2019-09-02 16:53 avi File Added: pgpool_messages.zip
2019-09-07 06:48 t-ishii Assigned To => hoshiai
2019-09-07 06:48 t-ishii Status new => assigned
2019-09-07 06:48 t-ishii Description Updated
2019-09-27 16:39 hoshiai Status assigned => feedback
2019-09-27 16:39 hoshiai Note Added: 0002890
2019-10-03 18:54 avi Note Added: 0002908
2019-10-03 18:54 avi Status feedback => assigned
2019-10-29 14:02 hoshiai Status assigned => feedback
2019-10-29 14:02 hoshiai Note Added: 0002945
2020-04-14 15:48 hoshiai Note Added: 0003324