[pgpool-general: 7487] Re: Consensus failover issues?

Tatsuo Ishii ishii at sraoss.co.jp
Thu Apr 8 14:58:42 JST 2021

> Hello.
> Got a bit of a weird one here, wondering if anyone can offer any
> insight in case I'm missing the blazingly obvious (which is quite
> possible)?
> I have three pgpool nodes - pgpool 4.0.4* running in watchdog mode -
> monitoring a set of PostgreSQL 10.5* databases in streaming
> replication mode.
> A colleague built these, cribbing off a pgpool I built quite some time
> ago that is running fine (and sailed through all testing including
> failover of the database, pgpool failover, and so forth, so I'm
> satisfied that one is OK)
> He shut down the primary database this morning for a test, expecting
> pgpool to fail over to the standby database.
> It didn't.
> It seems the primary pgpool node detected the failover, and is waiting
> on consensus from the two standby nodes.
> And it is waiting.
> And waiting.
> And waiting.
> It says it's slow in getting consensus.
> And keeps waiting ...
> I see no indication in the standby logs that they've decided that
> there needs to be a failover. They're reporting numerous failures to
> connect to the database, but other than that they appear quite
> unperturbed.
> I'm a bit puzzled.
> As far as I can see, the parameters for 'failover by consensus' are in
> place:
> failover_when_quorum_exists is on
> failover_require_consensus is on
> enable_multiple_failover_requests_from_node is off
> I'm going to try some tests this afternoon with some enhanced logging
> to see what I get, but I just thought I'd check here just in case ...
> Are there any known issues with this feature we might be running into?
> I know it's a long shot, and hopefully I should have some more info
> and logging later, but just thought I'd quickly ask in case somebody
> can go 'Oh yes, we had that too, you forgot to do [X]' or some such
> obvious things :)
> TIA.
> Regards,
> Martin.
> * I know, I know. My org isn't keen on keeping up to date on latest
> * versions as recommended by the community, more of the 'If it ain't
> * broke don't fix it' way of thinking :(
> -- 
> Martin Goodson

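Have you enabled health checking? Each of the other watchdog nodes
needs it to detect for itself that the primary was shut down. Without
it, those nodes never vote for the failover, so consensus is never
reached.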
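
For reference, a minimal sketch of the health check section of
pgpool.conf is shown below. The values are illustrative only, not tuned
recommendations, and the role name 'pgpool_check' is hypothetical. The
same settings would need to be present on all three pgpool nodes.

```
# pgpool.conf fragment (illustrative values; adjust for your environment)
health_check_period = 10             # seconds between checks; 0 (the default) disables health checking
health_check_timeout = 20            # seconds before a single check attempt is abandoned
health_check_user = 'pgpool_check'   # hypothetical role; must be able to connect to every backend
health_check_password = ''           # assumption: password supplied elsewhere or not required
health_check_database = 'postgres'   # database the health check connects to
health_check_max_retries = 3         # retries before the backend is reported as failed
health_check_retry_delay = 1         # seconds between retries
```

After reloading, running "PGPOOL SHOW health_check_period;" through
each pgpool node should confirm that the value in effect is non-zero.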

Best regards,
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php

