[pgpool-general: 8093] Re: Possible race condition during startup causing node to enter network isolation

Emond Papegaaij emond.papegaaij at gmail.com
Thu Apr 21 17:37:59 JST 2022


Hi,

Any thoughts on this issue? We are still experiencing intermittent test
failures because of it.

Best regards,
Emond

On Fri, Apr 1, 2022 at 9:03 AM Emond Papegaaij <emond.papegaaij at gmail.com>
wrote:

> Hi,
>
> Unfortunately, this issue still pops up every once in a while. We are now
> running 4.3.1. In our latest failure, the issue occurred during a simple restart
> of all services on node 1, with node 3 being the leader. Pgpool on node 1
> tries to rejoin the cluster, but gets rejected over and over again. Node 3
> reports that 'only life-check process can mark this node alive again'. I've
> attached the full logs of both node 1 and 3. The configuration hasn't
> changed since last time.
>
> Best regards,
> Emond
>
> On Mon, Nov 29, 2021 at 4:12 PM Emond Papegaaij <emond.papegaaij at gmail.com>
> wrote:
>
>> On Mon, Nov 29, 2021 at 3:55 PM Bo Peng <pengbo at sraoss.co.jp> wrote:
>>
>>> Thank you for your test.
>>>
>>> Because we have made some bug fixes to the watchdog since 4.2.4, it
>>> might be an upgrade issue.
>>> If you can reproduce this issue in 4.2.6, could you share the pgpool
>>> logs of all nodes?
>>>
>>
>> I'll continue to monitor the tests. If one fails again, I'll share the
>> logs. As I said, this could take some time, because the failure only occurs
>> about once a week. Thanks for your help so far.
>>
>> Best regards,
>> Emond
>>
>