[pgpool-general: 3773] documentation ambiguities
Igor Gueths
igor.gueths at rackspace.com
Tue Jun 2 00:18:13 JST 2015
Hello,
For about a month now I have been deploying Pgpool in front of a
streaming-replication cluster of PostgreSQL 9.3 instances, and have
found a number of items in the documentation that likely require
clarification. I wanted to raise them here in hopes of getting them
fixed upstream.
Exactly which components of the heartbeat mechanism introduced in
Pgpool-3.3 require Pgpool to run as root in order to work properly? When
running as a non-root user, i.e. the postgres user, there are
warnings/errors in the logs about failing to create the heartbeat
receive socket, SO_BINDTODEVICE requiring root privilege, etc. However,
the heartbeat mechanism appears to work to the point where it can
recognize that the primary Pgpool instance has gone away and tell the
secondary instance to claim the virtual IP. One thing I did notice is
that when running as postgres, Watchdog periodically got into a state
where each of the two deployed Pgpool nodes believed the other was down,
resulting in a very interesting failover situation. This seems similar
to the behavior in another post describing a race condition that occurs
when multiple Pgpool nodes are started in quick succession.
Unfortunately I was unable to reproduce the split-brain behavior
consistently, so I am not sure what is causing it at present. However,
running Pgpool (and therefore Watchdog) as root and setting
wd_heartbeat_deadtime = 120 or somesuch has mitigated the issue.
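To check independently of Pgpool whether a given account may use SO_BINDTODEVICE on a host, something like the following Python sketch could be run as both root and postgres. It is not how Pgpool creates its heartbeat socket; the function name and the choice of the "lo" interface are assumptions made for illustration, and the outcome for a non-root user may well depend on the kernel version.

```python
import socket

# SO_BINDTODEVICE is Linux-specific; 25 is its value in the Linux headers and is
# used as a fallback only if this Python build does not export the constant.
SO_BINDTODEVICE = getattr(socket, "SO_BINDTODEVICE", 25)


def probe_bind_to_device(ifname="lo"):
    """Try to bind a UDP socket to `ifname`.

    Returns 'ok', 'permission-denied', or 'error: <reason>'.
    The interface name is a hypothetical default, not taken from pgpool.conf.
    """
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            # The option value is the NUL-terminated interface name.
            sock.setsockopt(socket.SOL_SOCKET, SO_BINDTODEVICE,
                            ifname.encode() + b"\0")
    except PermissionError:
        # EPERM/EACCES: the calling user lacks the required privilege.
        return "permission-denied"
    except OSError as exc:
        # Anything else, e.g. an unknown interface or an unsupported platform.
        return f"error: {exc.strerror}"
    return "ok"


if __name__ == "__main__":
    print(probe_bind_to_device("lo"))
```

If this reports permission-denied for postgres but ok for root, that would at least be consistent with the log warnings described above.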
Another item that doesn't seem to be elaborated on is that
configuration parameter names appear to be case-sensitive; this is
readily apparent when setting delegate_ip vs delegate_IP in pgpool.conf.
In the former case, at least under Pgpool-3.4.1, the parameter appears
to be ignored entirely, with no log message indicating that a parse
error was encountered in pgpool.conf.
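Until this is documented or logged, a small pre-flight check can catch the mistake. The following Python sketch scans the text of a pgpool.conf for parameter names that match a known name except for case. The list of known names is only the three parameters mentioned in this message, not Pgpool's full set, and the function name and the sample IP address are made up for illustration.

```python
import re

# Illustrative subset only: the parameter names mentioned in this message.
KNOWN_PARAMS = ["delegate_IP", "wd_heartbeat_deadtime", "failover_on_backend_error"]

# Matches simple "name = value" lines; comment lines starting with '#' do not match.
_ASSIGNMENT = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*=")


def find_case_mismatches(conf_text, known=KNOWN_PARAMS):
    """Return (line_number, found_name, expected_name) for names that
    differ from a known parameter name only in letter case."""
    by_lower = {name.lower(): name for name in known}
    mismatches = []
    for lineno, line in enumerate(conf_text.splitlines(), start=1):
        match = _ASSIGNMENT.match(line)
        if not match:
            continue  # blank line, comment, or not a simple assignment
        found = match.group(1)
        expected = by_lower.get(found.lower())
        if expected is not None and found != expected:
            mismatches.append((lineno, found, expected))
    return mismatches


if __name__ == "__main__":
    # Hypothetical config snippet; the address is a placeholder.
    sample = "delegate_ip = '10.0.0.50'\nwd_heartbeat_deadtime = 120\n"
    for lineno, found, expected in find_case_mismatches(sample):
        print(f"line {lineno}: '{found}' should probably be '{expected}'")
```

The comparison is deliberately limited to names that differ only in case, so unknown or misspelled parameters are not flagged by this sketch.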
Last but not least, it is not entirely clear what Pgpool defines as a
backend error. In our case, query errors from Postgres led Pgpool to
conclude, incorrectly, that a failover was required, which caused the
main client of the database to have issues interacting with it. Setting
failover_on_backend_error = off ended up resolving this one, although it
took several days to figure out that Pgpool's definition of a backend
error encompasses much more than just the death of Postmaster.
I would be happy to update relevant documentation items pertaining to
the above, presuming that someone isn't already working on it. Thanks
for reading!
--
Igor Gueths, Linux Systems Engineer Desk: +1.210.312.1952 Mobile:
+1.210.997.9397