[pgpool-general: 4336] pgpool issue on failure of primary.
piotr.gbyliczek at reconnix.com
Tue Jan 19 00:57:29 JST 2016
First, hello to all.
I have a specific issue with pgpool and I hope you can point me in a
direction that will allow me to understand and fix it.
We have a postgres system based on streaming replication, with a primary server
and two standbys. In front of that are two servers running pgpool-II 3.3.7, doing
connection pooling and load balancing, with watchdog enabled.
The active pgpool server holds the VIP that clients connect to. The three postgres
servers use Enterprise Failover Manager (EnterpriseDB) to promote one of the
standby servers in case of primary failure (as well as notification about any
Now, the problem seems to be related to a primary failure. We have observed
the following case:
Before failure: multiple clients are connecting to the active pgpool,
connections are passed to the DB backends fine, all databases are seen as
active, and one is seen as primary.
After failure of the primary: both remaining database servers are seen as down
by pgpool, while EFM reports a successful failover and a direct connection to
the new primary database confirms that it is accepting writes.
The pgpool logs show some errors, for example:
Cannot accept() new connection. all backends are down
but I'm not sure I can follow what happens there.
I have tried re-attaching the backend nodes using pcp or pgpooladmin, but
nothing seems to work.
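
For reference, the state pgpool holds for each backend can be read from the
output of SHOW pool_nodes. The sketch below is only an illustration of how that
output can be interpreted: the hostnames and rows are hypothetical, the column
names are assumed to match what pgpool-II 3.3 prints, and the status codes are
the ones described in the pgpool-II documentation (3 means the node is down).

```python
# Status codes as described in the pgpool-II documentation:
# 0 = only used during initialization, 1 = up (no connections yet),
# 2 = up (connections pooled), 3 = down.
DOWN = "3"

def down_nodes(rows):
    """Return (node_id, hostname) for every backend pgpool marks as down."""
    return [(r["node_id"], r["hostname"]) for r in rows if str(r["status"]) == DOWN]

# Hypothetical rows, shaped like the output of SHOW pool_nodes after the
# failure described above; db1..db3 are made-up hostnames.
sample = [
    {"node_id": 0, "hostname": "db1", "port": 5432, "status": "3"},
    {"node_id": 1, "hostname": "db2", "port": 5432, "status": "3"},
    {"node_id": 2, "hostname": "db3", "port": 5432, "status": "3"},
]

print(down_nodes(sample))
```

With rows like these, every node is reported as down, which matches the
"all backends are down" message in the log.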
Stopping pgpool on the active server does the trick, and the newly active
pgpool server detects the backend nodes correctly.
In addition, it seems that if the second pgpool server is master at the moment
the primary db server fails, we don't observe this issue.
Any thoughts on how to get to the bottom of this?