[pgpool-general: 9062] Re: Segmentation after switchover

Emond Papegaaij emond.papegaaij at gmail.com
Tue Apr 2 18:23:04 JST 2024


On Sun, 31 Mar 2024 at 07:52, Tatsuo Ishii <ishii at sraoss.co.jp> wrote:

> In the second case make_persistent_db_connection() uses longjmp()
> through PG_TRY/PG_CATCH. In this environment, any variable on the
> stack (and later used in PG_CATCH block) needs to be declared with
> "volatile" modifier so that it is not smashed out when an exception
> occurs. The function missed it. Attached is the patch to fix this.
>
> I have not found the cause of the first case yet, but I suspect it is
> somewhat related to the error above because in the first case:
> > #0  0x0000559e25313126 in get_query_result (slots=0x7fff0ebcff50,
>
> "slot" was previously allocated by make_persistent_db_connection().
>
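
For anyone following the thread, the rule described above can be shown in
plain C. This is only a minimal sketch, not the pgpool code: it uses
setjmp()/longjmp() directly as a stand-in for PG_TRY/PG_CATCH, and the names
error_env, failing_step and stage_seen_in_catch are made up for illustration.

```c
#include <setjmp.h>

/* Hypothetical stand-in for the exception stack used by PG_TRY/PG_CATCH. */
static jmp_buf error_env;

/* Hypothetical helper that always "throws" by jumping back to setjmp(). */
static void failing_step(void)
{
    longjmp(error_env, 1);
}

/* Returns the value of `stage` as seen in the "catch" branch. */
int stage_seen_in_catch(void)
{
    /*
     * volatile is required here: the variable is modified after setjmp()
     * and read after longjmp(). Without volatile, the C standard leaves
     * its value indeterminate at that point.
     */
    volatile int stage = 0;

    if (setjmp(error_env) == 0)
    {
        /* "PG_TRY" part */
        stage = 1;
        failing_step();     /* control resumes at setjmp(), which now returns 1 */
        stage = 2;          /* never reached */
    }
    else
    {
        /* "PG_CATCH" part: reads the value stored before the jump */
        return stage;
    }
    return stage;
}
```

With the volatile qualifier, stage_seen_in_catch() returns 1. If it is
dropped, an optimizing compiler may keep the variable in a register, and the
catch branch may see a stale value.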

I've rerun our tests with this patch applied, and both crashes are still
present. The second one does seem to be a bit less frequent, but that's
hard to say without running the tests many more times, which takes a very
long time. The first one still occurs frequently.

I can reproduce the failures very reliably, so I can rerun them with more
logging if that would help. Just give me the configuration options I need
to set. I can even build with custom parameters if needed.

I've also found the reason for the missing coredumps from node 1:
systemd-coredump does not work during shutdown, and these crashes are caused
by node 1 being rebooted. Unfortunately, there seems to be no way to fix
that. It's a known problem in systemd.

Best regards,
Emond