[pgpool-general: 8592] Re: PgPool thinks node 0 is in recovery.

Sat Feb 4 09:15:28 JST 2023

It seems pgpool thinks backend node 0 is down. To confirm this, can
you share the pool_status file and the output of "show pool_nodes"?

> Logs attached, with log_statement = 'all'.
> 
> I don't see any attempted connections to the primary server when
> pgpool is starting up.
> 
> On 2/3/23 03:25, Tatsuo Ishii wrote:
>> Can you share the PostgreSQL log of the primary with log_statement =
>> 'all'?  I would like to confirm that queries sent from the sr_check worker
>> are reaching the primary. If so, you should see something like:
>>
>> 1771450 2023-02-03 18:19:05.585 JST LOG: statement: SELECT
>> pg_is_in_recovery()
>> 1771463 2023-02-03 18:19:15.597 JST LOG: statement: SELECT
>> pg_current_wal_lsn()
>>
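One rough way to check for those entries is to scan the primary's PostgreSQL log for the two statements quoted above. The sketch below is only an illustration: the function name find_sr_check_statements is hypothetical, and it assumes each log entry sits on a single line (the example above is wrapped by the mail client).

```python
import re

# The two statements the sr_check worker is expected to send,
# as shown in the quoted example log.
SR_CHECK_PATTERN = re.compile(
    r"statement:\s*SELECT\s+(pg_is_in_recovery|pg_current_wal_lsn)\(\)",
    re.IGNORECASE,
)


def find_sr_check_statements(log_lines):
    """Return (line_number, function_name) for each matching log line.

    Hypothetical helper: assumes one log entry per line.
    """
    hits = []
    for number, line in enumerate(log_lines, start=1):
        match = SR_CHECK_PATTERN.search(line)
        if match:
            hits.append((number, match.group(1).lower()))
    return hits
```

An empty result for a log covering pgpool's startup would be consistent with the sr_check queries never reaching that server.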
>> Best regards,
>> --
>> Tatsuo Ishii
>> SRA OSS LLC
>> English: http://www.sraoss.co.jp/index_en/
>> Japanese:http://www.sraoss.co.jp
>>
>>> Attached are three log files (pgpool, the primary and replicated
>>> servers).
>>>
>>> The primary is definitely not in replication mode.
>>>
>>> On 2/1/23 00:04, Tatsuo Ishii wrote:
>>>>> There must have been a miscommunication; I thought I attached my
>>>>> pgpool.conf and the log file to a previous email, but maybe not.
>>>>>
>>>>> I fixed the backend_port0 problem last week.
>>>> Ok.
>>>>
>>>>> pgpool is already running with pgpool.conf log_min_messages=debug3. Is
>>>>> that sufficient?
>>>> Yes.
>>>>
>>>>> Attached is the error log from when I last started pgpool, and the
>>>>> pgpool.conf from that time.
>>>> I see some errors with streaming replication check process:
>>>>
>>>> 2023-01-26 13:31:04.594: sr_check_worker pid 796880: DEBUG: do_query:
>>>> extended:0 query:"SELECT pg_current_wal_lsn()"
>>>> 2023-01-26 13:31:04.594: sr_check_worker pid 796880: CONTEXT: while
>>>> checking replication time lag
>>>> 2023-01-26 13:31:09.594: health_check1 pid 796881: DEBUG: health
>>>> check: clearing alarm
>>>> 2023-01-26 13:31:09.603: health_check1 pid 796881: DEBUG: authenticate
>>>> kind = 10
>>>> 2023-01-26 13:31:09.612: health_check1 pid 796881: DEBUG: SCRAM
>>>> authentication successful for user:pool_health_check
>>>> 2023-01-26 13:31:09.612: health_check1 pid 796881: DEBUG: authenticate
>>>> backend: key data received
>>>> 2023-01-26 13:31:09.612: health_check1 pid 796881: DEBUG: authenticate
>>>> backend: transaction state: I
>>>> 2023-01-26 13:31:09.612: health_check1 pid 796881: DEBUG: health
>>>> check: clearing alarm
>>>> 2023-01-26 13:31:09.612: health_check1 pid 796881: DEBUG: health
>>>> check: clearing alarm
>>>> 2023-01-26 13:31:14.595: sr_check_worker pid 796880: FATAL: Backend
>>>> throw an error message
>>>> 2023-01-26 13:31:14.595: sr_check_worker pid 796880: DETAIL: Exiting
>>>> current session because of an error from backend
>>>> 2023-01-26 13:31:14.595: sr_check_worker pid 796880: HINT: BACKEND
>>>> Error: "recovery is in progress"
>>>> 2023-01-26 13:31:14.595: sr_check_worker pid 796880: CONTEXT: while
>>>> checking replication time lag
>>>>
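To pick failures like this out of a longer pgpool log, something along these lines may help. It is a minimal sketch, assuming the "timestamp: process pid N: LEVEL: message" layout seen in the excerpt above, with each entry on one line; the function name sr_check_backend_errors is hypothetical and not part of pgpool.

```python
import re

# Layout assumed from the excerpt:
#   <date> <time>: <process> pid <pid>: <LEVEL>: <message>
RECORD = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): "
    r"(?P<proc>\S+) pid (?P<pid>\d+): (?P<level>[A-Z]+): +(?P<msg>.*)$"
)

MARKER = "BACKEND Error:"


def sr_check_backend_errors(log_lines):
    """Return (timestamp, error_text) for backend errors seen by sr_check_worker.

    Hypothetical helper: only HINT lines carrying "BACKEND Error:" are used.
    """
    errors = []
    for line in log_lines:
        match = RECORD.match(line.strip())
        if match is None:
            continue
        if match.group("proc") != "sr_check_worker":
            continue
        message = match.group("msg")
        if match.group("level") == "HINT" and MARKER in message:
            # Keep only the text after the marker, without the quotes.
            text = message.split(MARKER, 1)[1].strip().strip('"')
            errors.append((match.group("ts"), text))
    return errors
```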
>>>> sr_check_process tried to determine the WAL LSN on backend0 by issuing
>>>> "SELECT pg_current_wal_lsn()" to backend0, but it failed with:
>>>>
>>>>> 2023-01-26 13:31:14.595: sr_check_worker pid 796880: HINT: BACKEND
>>>>> Error: "recovery is in progress"
>>>> This suggests that backend0 is running as a standby server. I guess
>>>> there's something wrong with the settings on backend0.  Maybe
>>>> standby.signal exists?  Can you share the PostgreSQL log of backend0 from
>>>> its startup?
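For the standby.signal question, a quick check of the data directory could look like the sketch below. It relies on PostgreSQL 12 and later starting in recovery when standby.signal or recovery.signal is present in the data directory, and on earlier releases doing so when recovery.conf is present. The function name standby_markers is hypothetical, and the data directory path has to be supplied by the caller.

```python
import os

# Files whose presence makes PostgreSQL start in recovery:
#   standby.signal / recovery.signal  (PostgreSQL 12 and later)
#   recovery.conf                     (PostgreSQL 11 and earlier)
CANDIDATES = ("standby.signal", "recovery.signal", "recovery.conf")


def standby_markers(data_dir):
    """Return the names of recovery marker files found in data_dir.

    Hypothetical helper: an empty list means none of the files exist.
    """
    return [
        name
        for name in CANDIDATES
        if os.path.isfile(os.path.join(data_dir, name))
    ]
```

Running it against backend0's data directory and getting back ["standby.signal"] would match the "recovery is in progress" error, since pg_current_wal_lsn() cannot be executed while a server is in recovery.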
>>>>
>>>> Best regards,
>>>> --
>>>> Tatsuo Ishii
>>>> SRA OSS LLC
>>>> English: http://www.sraoss.co.jp/index_en/
>>>> Japanese:http://www.sraoss.co.jp
>>> -- 
>>> Born in Arizona, moved to Babylonia.
> 
> -- 
> Born in Arizona, moved to Babylonia.