[pgpool-hackers: 4025] Re: invalid degenerate backend request on slave failure

Thu Sep 23 21:57:11 JST 2021

Hello Tatsuo

Thanks for the explanation.

I am using Pgpool-II 4.2.2.

I am still investigating the issue and trying to reproduce it by changing different things.

It’s strange because it only occurs sporadically.

I will keep you updated if I find more info.

Thank you,

Anirudh
On 23 Sep 2021, 2:36 PM +0200, Tatsuo Ishii <ishii at sraoss.co.jp>, wrote:
> Hi,
>
> > Hello
> >
> > I have a setup with 3 postgres nodes running behind one pgpool docker container.
> >
> > The setup works fine when the primary node fails or is shut down. Failover goes fine in that case.
> >
> > However, if the standby goes down, the health check fails, but instead of performing a failover, pgpool logs this error message:
> >
> > 2021-09-23 08:37:54: pid 94: LOG: invalid degenerate backend request, node id : 2 status: [2] is not valid for failover
> >
> > What's a bit strange to me is that this only happens when I am running the pgpool container through Nomad.
> >
> > If I run it directly without Nomad, it still works as expected. Even though this is the only difference that I see between the working and non-working setup, I believe this isn't the root cause.
> >
> > If you can point me towards when a degenerate request is considered invalid, it might help.
>
> The message says that PostgreSQL backend node id 2 is up and
> running. The strange thing is that this is the normal prerequisite for
> triggering failover, because if the node is already down there is no
> point in triggering failover. Actually, before this error is raised,
> Pgpool-II checks a copy of the backend status in the process's private
> memory, and it seems the status in the private memory is different
> from the status (which is in the shared memory) referred to in the
> error message. This is hard to understand because the status in the
> private memory and the one in the shared memory should be the same,
> since it was copied when the health check process started.
>
> The only explanations I can think of are:
>
> - for some reason Nomad screwed up the private memory of pgpool
> (actually the process is health check process).
>
> - Pgpool-II has a new bug and it was revealed accidentally.
>
> BTW, what version of Pgpool-II are you using? I need more checking on
> the source code.
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese:http://www.sraoss.co.jp
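
The mismatch described in the quoted reply can be illustrated with a minimal sketch in C. This is not Pgpool-II's source code: the names `node_status`, `status_table`, `is_usable` and `degenerate_request_ok` are hypothetical, and the numeric status values other than 2 (which the reply identifies as "up and running") are assumptions. The sketch only shows how a request can be rejected based on a process-private copy of the status while the message prints the value from the shared copy.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical status codes. Only "2 means up" comes from the thread;
 * the other values are assumptions made for this illustration. */
enum node_status { NODE_UNUSED = 0, NODE_CONNECT_WAIT = 1, NODE_UP = 2, NODE_DOWN = 3 };

#define MAX_NODES 3

/* One table lives in shared memory, one is a per-process private copy. */
struct status_table {
    enum node_status status[MAX_NODES];
};

/* A node can be detached only if it is currently considered in use. */
static bool is_usable(enum node_status s)
{
    return s == NODE_UP || s == NODE_CONNECT_WAIT;
}

/* Validates a request to detach node_id.
 * The decision is made from the private copy, but the message reports
 * the shared status, so the two can disagree when the copies diverge. */
bool degenerate_request_ok(const struct status_table *private_copy,
                           const struct status_table *shared,
                           int node_id, char *msg, size_t msg_len)
{
    if (node_id < 0 || node_id >= MAX_NODES) {
        snprintf(msg, msg_len, "node id: %d is out of range", node_id);
        return false;
    }
    if (!is_usable(private_copy->status[node_id])) {
        snprintf(msg, msg_len,
                 "invalid degenerate backend request, node id : %d "
                 "status: [%d] is not valid for failover",
                 node_id, (int) shared->status[node_id]);
        return false;
    }
    if (msg_len > 0)
        msg[0] = '\0';
    return true;
}
```

With both tables reporting node 2 as up, the request is accepted. If only the private copy is changed to the "down" value, the request is rejected while the message still shows `status: [2]`, which is the combination seen in the log line above.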