[pgpool-general: 6917] Re: Inconsistency between DB nodes with native replication
ishii at sraoss.co.jp
Thu Mar 5 21:18:22 JST 2020
> Do you think there is a possibility that only one specific child
> process of pgpool thinks that one of the DB nodes is dead?
Yes. Pgpool-II maintains the backend status (up, down, quarantine, etc.) in
shared memory, and each pgpool child process also keeps a local copy of it
(a process usually consults its local copy rather than shared memory). So
if for some reason a local copy gets out of sync with the shared memory
status, we could see this phenomenon.
Did you see any failover event for node 1 followed by a failback during
the test? If process 29684 copied the down status of node 1 when the
failover happened but failed to update its local status to "up" when
node 1 failed back, that could be an explanation.
> Is there a way to find about it? I guess I could set the logging to
> some debug level, but I don't even want to imagine the log volume in
> that case, since it is already hundreds of megabytes in size.
As far as I know there's no way to detect this without turning on the
debug log. But yes, that would produce a huge amount of log output.
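For reference, in recent Pgpool-II releases (3.7 and later) log verbosity is controlled by log_min_messages in pgpool.conf, analogous to the PostgreSQL parameter of the same name; a debug run might use something like the following (the specific values are only an illustration):

```
log_min_messages = debug1    # debug1 .. debug5; debug5 is the most verbose
log_destination = 'stderr'
```

Even debug1 can multiply the log volume considerably on a busy pool, so it is worth limiting the duration of such a run.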
> On a side note, I also have the test running in another environment
> (in a customer's data center) and it has not failed so far (since the
> 25th of February), so the problem may come from some specific setting
> on those servers (pgpool.conf is the same apart from IP addresses and
> the number of child processes generated at startup).
If your test environment's network is less stable than the data
center's, failovers are more likely to occur. Of course this is just a
guess.
SRA OSS, Inc. Japan