[pgpool-general: 7906] Re: Possible race condition during startup causing node to enter network isolation

Bo Peng pengbo at sraoss.co.jp
Mon Nov 29 23:55:30 JST 2021


Hello,

On Mon, 29 Nov 2021 09:33:18 +0100
Emond Papegaaij <emond.papegaaij at gmail.com> wrote:

> Hi Bo Peng,
> 
> > > Unfortunately, the failure is not reproducible in a reliable way. It occurs
> > > about once every 10 to 20 runs of our tests. The test that fails is a full
> > > upgrade of our appliance. This upgrade includes a new docker container for
> > > both postgresql and Pgpool. Both services are restarted. Pgpool is upgraded
> > > from 4.2.4 to 4.2.6.
> >
> > I have checked your pgpool.conf.
> > I think it is not caused by the configuration.
> > Does this issue occur only during the 4.2.4 -> 4.2.6 upgrade?
> > Does this issue still occur after you have upgraded all nodes to 4.2.6?
> >
> 
> I remember seeing this issue in other scenarios as well, but I'm not 100%
> sure. I've moved our tests to the next version of our application, meaning
> the upgrade will now be from 4.2.6 to 4.2.6, just upgrading the docker
> container. If this issue is indeed caused by the upgrade, we will know in a
> few weeks. This issue occurs very rarely and the tests take several hours
> to complete, so we only have about 5 runs per day. We did hit the problem
> again this weekend and I've saved all logs from that run. So if there's
> anything you need from that run, let me know.

Thank you for testing.

We have fixed some watchdog bugs since 4.2.4, so it might be an upgrade issue.
If you can reproduce this issue on 4.2.6, could you share the pgpool logs from all nodes?

> Best regards,
> Emond


-- 
Bo Peng <pengbo at sraoss.co.jp>
SRA OSS, Inc. Japan
http://www.sraoss.co.jp/

