[pgpool-hackers: 4278] Bug with streaming replication worker

Tatsuo Ishii ishii at sraoss.co.jp
Sat Feb 4 20:51:05 JST 2023


Hi,

While taking care of this:
https://www.pgpool.net/pipermail/pgpool-general/2023-January/008633.html

I came across a bug in the streaming replication worker. According to the
log provided by the user:

2023-01-26 13:31:14.595: sr_check_worker pid 796880: FATAL: Backend throw an error message
2023-01-26 13:31:14.595: sr_check_worker pid 796880: DETAIL: Exiting current session because of an error from backend
2023-01-26 13:31:14.595: sr_check_worker pid 796880: HINT: BACKEND Error: "recovery is in progress"
2023-01-26 13:31:14.595: sr_check_worker pid 796880: CONTEXT: while checking replication time lag

The error message was logged when "SELECT pg_current_wal_lsn()" was
sent to a PostgreSQL standby. This is strange because the sr check
worker is supposed to send that query only to the primary node.

I noticed that the user had set ALWAYS_PRIMARY and that, for some
reason, the primary node was in down status.

Here is the relevant code in the sr check worker:

		if (PRIMARY_NODE_ID == i)
		{
			if (server_version[i] >= PG10_SERVER_VERSION)
				query = "SELECT pg_current_wal_lsn()";
			else
				query = "SELECT pg_current_xlog_location()";

So at first glance "SELECT pg_current_wal_lsn()" is sent only to the
primary node. However, the definition of PRIMARY_NODE_ID is:

#define PRIMARY_NODE_ID (Req_info->primary_node_id >=0 && VALID_BACKEND_RAW(Req_info->primary_node_id) ? \
						 Req_info->primary_node_id:REAL_MAIN_NODE_ID)

So if Req_info->primary_node_id >= 0 and the primary is down, the
macro returns REAL_MAIN_NODE_ID. In this case that was a standby node:
the primary (node 0) was down, the standby (node 1) was the only live
node, and so REAL_MAIN_NODE_ID was 1. As a result, when i == 1,
PRIMARY_NODE_ID == i was true and "SELECT pg_current_wal_lsn()" was
sent to the standby.
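
To make the fallback concrete, here is a small standalone model of the
macro's behavior. It is only a sketch: struct req_info, node_up and
real_main_node_id() are hypothetical stand-ins I made up for
illustration, not Pgpool-II's real structures, and I am assuming that
REAL_MAIN_NODE_ID resolves to the lowest-numbered live node.

```c
#include <stdbool.h>

#define NUM_NODES 2

/* Hypothetical stand-in for Pgpool-II's shared state (not the real struct). */
struct req_info
{
	int		primary_node_id;		/* -1 if no primary is known */
	bool	node_up[NUM_NODES];		/* true if the node is usable */
};

/* Assumed behavior of REAL_MAIN_NODE_ID: lowest-numbered live node, or -1. */
static int
real_main_node_id(const struct req_info *r)
{
	for (int i = 0; i < NUM_NODES; i++)
		if (r->node_up[i])
			return i;
	return -1;
}

/* Same shape as the PRIMARY_NODE_ID macro quoted above. */
static int
primary_node_id(const struct req_info *r)
{
	if (r->primary_node_id >= 0 &&
		r->primary_node_id < NUM_NODES &&
		r->node_up[r->primary_node_id])
		return r->primary_node_id;

	/* Primary unknown or down: fall back to a live node, even a standby. */
	return real_main_node_id(r);
}

/* The reported scenario: node 0 (primary) is down, node 1 (standby) is up. */
static int
reported_scenario(void)
{
	struct req_info r = {.primary_node_id = 0,.node_up = {false, true}};

	return primary_node_id(&r);
}
```

In this model reported_scenario() returns 1, the standby's node id, so a
test of the form "PRIMARY_NODE_ID == i" succeeds for the standby.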

The fix is to check whether the primary node is down before entering
the replication delay loop.
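
The idea can be sketched as below. This is not the attached patch, just
a minimal standalone illustration of the guard. struct cluster,
primary_is_usable() and check_replication_delay() are hypothetical
names, and the loop body only counts the nodes that would be queried
instead of actually sending anything.

```c
#include <stdbool.h>

#define NUM_NODES 2

/* Hypothetical model of the node table (not Pgpool-II's real data). */
struct cluster
{
	int		primary_node_id;		/* raw primary id, -1 if unknown */
	bool	node_up[NUM_NODES];		/* true if the node is usable */
};

/* True only if the recorded primary exists and is up. */
static bool
primary_is_usable(const struct cluster *c)
{
	int		p = c->primary_node_id;

	return p >= 0 && p < NUM_NODES && c->node_up[p];
}

/*
 * Returns the number of nodes that would be queried in one pass.
 * With the guard in place, nothing is sent while the primary is down.
 */
static int
check_replication_delay(const struct cluster *c)
{
	int		queried = 0;

	/* Guard: without a live primary there is no LSN to compare against. */
	if (!primary_is_usable(c))
		return 0;

	for (int i = 0; i < NUM_NODES; i++)
	{
		if (!c->node_up[i])
			continue;
		/* The real worker would choose and send the query for node i here. */
		queried++;
	}
	return queried;
}
```

With node 0 down and node 1 up, check_replication_delay() returns 0 in
this sketch, so the standby never receives the primary-only query.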

Attached is the proposed fix.

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sr.patch
Type: text/x-patch
Size: 912 bytes
Desc: not available
URL: <http://www.pgpool.net/pipermail/pgpool-hackers/attachments/20230204/2061f4d6/attachment.bin>