[pgpool-general: 7988] Re: pcp_node_info does not return when host is lost on 4.3.0

Emond Papegaaij emond.papegaaij at gmail.com
Mon Jan 24 16:23:01 JST 2022


Hi,

Unfortunately, the patch doesn't help. The call to pcp_node_info still
hangs. I do however see a difference in the pgpool log. The pcp worker only
logs a single line:

2022-01-24 05:26:37: pid 81: LOG:  forked new pcp worker, pid=211 socket=7
2022-01-24 05:26:47: pid 211: LOG:  failed to connect to PostgreSQL server on "172.29.30.2:5432", timed out

After this, there is no further mention of pid 211: no log messages from
that pid, and none from pid 81 either (which I would expect to log that the
PCP worker process exited).

Best regards,
Emond

On Sat, Jan 22, 2022 at 2:15 PM Bo Peng <pengbo at sraoss.co.jp> wrote:

> Hello,
>
> Thank you for your reply.
>
> I think this issue is specific to 4.3.0.
> Another developer, Tatsuo Ishii, has created a patch that fixes this
> issue.
> Could you try the attached patch, if you are able to apply it?
>
> Best regards,
>
> On Fri, 21 Jan 2022 14:10:05 +0100
> Emond Papegaaij <emond.papegaaij at gmail.com> wrote:
>
> > >
> > >> > We are working on the upgrade from 4.2.6 to 4.3.0 and we are facing
> > >> > a test that is failing consistently. In one of our tests we power
> > >> > down 2 of the 3 hosts with a hard poweroff. Prior to the poweroff,
> > >> > we configure the cluster
> > >>
> > >> Thank you for reporting this issue.
> > >> I am going to look into it.
> > >> Does this issue only occur in 4.3.0?
> > >
> > >
> > > > Thanks for looking into this. As is often the case with these kinds
> > > > of errors, I cannot be absolutely sure, but I haven't seen this error
> > > > before with 4.2.6 or earlier. We skipped 4.2.7, as the release notes
> > > > state it was only for PG14 support, which we don't need at the moment.
> > >
> > To report back on this: we've run 11 consecutive builds with 4.3.0, all
> > failing on this issue. I've checked the past 40 or so builds with 4.2.6
> > and none of them failed. So this is definitely a regression in 4.3.0. Do
> > you already have an idea of the cause? If not, I can try to perform a
> > bisect on the diff between 4.2.6 and 4.3.0. This will however take me
> > some time, as every build takes about 2 hours. Git expects about 8
> > revisions to check, so that's 2 whole working days.
> >
> > Best regards,
> > Emond
>
>
> --
> Bo Peng <pengbo at sraoss.co.jp>
> SRA OSS, Inc. Japan
> http://www.sraoss.co.jp/
>