[pgpool-general: 7994] Re: pcp_node_info does not return when host is lost on 4.3.0

Tue Jan 25 22:15:08 JST 2022

> I'm not aware of anything special in our setup with this regard. It's a
> streaming replication with 3 nodes. Maybe it's because the servers are
> powered down hard and I do not give pgpool time to detect the failures
> before I start reading the node info.

It turned out that I did not see the errors because, in my development
environment, pgpool connects to PostgreSQL via Unix domain sockets.

After more thought, I think get_nodes() (the "guts" of pcp_node_info)
should not try to connect to PostgreSQL at all if it already knows the
PostgreSQL server is down (get_nodes() already calls PQpingParams(),
so it knows whether the PostgreSQL server is up or not). In this case
it can immediately return "unknown" for the pg_role field and thus
avoid the hang.
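
To illustrate the idea only (this is not the content of the attached
patch), here is a minimal, self-contained sketch of the short-circuit.
All names in it (node_status, role_query_fn, fake_role_query,
get_pg_role) are hypothetical stand-ins and not pgpool identifiers. The
status argument is assumed to have been derived beforehand from the
ping result, for example by treating anything other than PQPING_OK
from PQpingParams() as "down".

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical status flag, assumed to come from an earlier ping. */
typedef enum { NODE_UP, NODE_DOWN } node_status;

/* Hypothetical callback standing in for "connect and ask for the role". */
typedef const char *(*role_query_fn)(int node_id);

/* Counts how often the (potentially blocking) query was attempted. */
static int query_calls = 0;

/* Fake role query for the sketch: node 0 is primary, others standby. */
static const char *fake_role_query(int node_id)
{
    query_calls++;
    return node_id == 0 ? "primary" : "standby";
}

/*
 * Return the role of a node. If the node is already known to be down,
 * return "unknown" immediately and never call the query function, so
 * no connection attempt can hang.
 */
const char *get_pg_role(int node_id, node_status status, role_query_fn query)
{
    if (status != NODE_UP)
        return "unknown";
    return query(node_id);
}
```

With this shape, get_pg_role(1, NODE_DOWN, fake_role_query) returns
"unknown" without touching the backend, while a node marked NODE_UP is
still queried as before.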

Attached is the patch. In the meantime, please disregard the previous patch.

> I'm doing some more troubleshooting with additional logging added to pgpool
> to see if I can find the point in the code where the execution gets stuck,
> loops or bails out in some unexpected way. I'll let you know the results
> when I have more information.

Thanks.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pcp_node_info_hang.patch
Type: text/x-patch
Size: 527 bytes
Desc: not available
URL: <http://www.pgpool.net/pipermail/pgpool-general/attachments/20220125/abfb4475/attachment.bin>