[pgpool-general: 7510] Re: Possible pgpool 4.1.4 failover auto_failback race condition

Takuma Hoshiai hoshiai.takuma at nttcom.co.jp
Fri Apr 16 13:18:50 JST 2021


Hi,

Thank you for your report.

Do you mean that failover_command is executed, but PostgreSQL
node 2 is failed back while follow_master_command is still running?
Or is follow_master_command not executed at all?
I will look into it. Could you share your pgpool.log?
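
If the full log is too large to post, a filtered excerpt would also help.
Below is a minimal sketch of one way to extract the relevant lines. The
keyword list and the helper name relevant_lines are assumptions made for
this thread, not an official list of pgpool-II log messages, so please
adjust them to what your log actually contains.

```python
import re
import sys

# Keywords are assumptions chosen for this thread, not an official list
# of pgpool-II log messages; adjust them to match your log.
KEYWORDS = ("failover", "failback", "follow_master")
PATTERN = re.compile("|".join(re.escape(k) for k in KEYWORDS), re.IGNORECASE)


def relevant_lines(lines):
    """Return only the lines that mention one of KEYWORDS, order preserved."""
    return [line.rstrip("\n") for line in lines if PATTERN.search(line)]


if __name__ == "__main__":
    # Usage (hypothetical script name): python filter_log.py /path/to/pgpool.log
    if len(sys.argv) > 1:
        with open(sys.argv[1], encoding="utf-8", errors="replace") as f:
            for line in relevant_lines(f):
                print(line)
```

Because "failback" also matches "auto_failback", the excerpt should keep
the ordering between the failover, the automatic failback, and the
follow_master run visible, as long as those words appear in the log.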

On 2021/04/14 14:54, Nathan Ward wrote:
> Hi,
> 
> I believe I’ve found a race condition with auto_failback.
> 
> In my test environment I have 3 servers each running both pgpool and postgres.
> 
> I simulate a network failure with iptables rules on one node.
> 
> I start the test with the following state:
> pgpool primary: node 2
> postgres primary: node 0
> 
> When I fail node 0, in order to trigger failover with follow_master (4.1.x still), I find that most of the time node 2 is reattached before follow_master gets a chance to run for that node. It is of course set to CON_DOWN when the failover is triggered, and I would expect it to stay in that state until follow_master reattaches it.
> 
> I believe, though I’m not 100% certain, that sometimes this comes from node 1.
> 
> Is this likely a configuration problem, or is this a bug of some kind? I had a quick look at the code, and don’t see any changes that would impact this since 4.1.4, but I am of course happy to be wrong about that!
> 
> We have the following set:
> auto_failback = on
> auto_failback_interval = 60
> 
> --
> Nathan Ward
> 
> 

Best Regards,

-- 
Takuma Hoshiai <hoshiai.takuma at nttcom.co.jp>


