[pgpool-general: 8595] Re: PgPool thinks node 0 is in recovery.

Ron ronljohnsonjr at gmail.com
Sat Feb 4 14:44:28 JST 2023


It was that simple... :(

On 2/3/23 23:17, Tatsuo Ishii wrote:
> You can stop pgpool, remove the pgpool_status file, then start pgpool
> so that it recognizes backend 0. pgpool_status is recreated when
> pgpool starts up.
>
> pgpool_status should be located under "logdir" (in your case /tmp).
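>
> For example, assuming pgpool is run directly from the shell (not via a
> service manager) and logdir = '/tmp' as above, the sequence would look
> roughly like this:
>
>     pgpool -m fast stop        # stop pgpool
>     rm /tmp/pgpool_status      # discard the cached backend status
>     pgpool                     # start pgpool; pgpool_status is recreated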
>
>> Correct.  It's as if pgpool doesn't even see backend_hostname0.  (I
>> tried commenting out all of the backend_*1 config items, and pgpool
>> didn't see *anything*. "psql --port=9999" refused connection.)
>>
>> $ psql --port=9999 -c "\x" -c "show pool_nodes;"
>> Expanded display is on.
>> -[ RECORD 1 ]----------+--------------------
>> node_id                | 0
>> hostname               | FISPCCPGS405a
>> port                   | 5432
>> status                 | down <<<<<<<<<<<<<<<<<<<
>> pg_status              | up
>> lb_weight              | 0.666667
>> role                   | primary
>> pg_role                | primary
>> select_cnt             | 0
>> load_balance_node      | false
>> replication_delay      | 0
>> replication_state      |
>> replication_sync_state |
>> last_status_change     | 2023-02-03 23:07:59
>> -[ RECORD 2 ]----------+--------------------
>> node_id                | 1
>> hostname               | FISPCCPGS405b
>> port                   | 5432
>> status                 | up
>> pg_status              | up
>> lb_weight              | 0.333333
>> role                   | standby
>> pg_role                | standby
>> select_cnt             | 0
>> load_balance_node      | true
>> replication_delay      | 0
>> replication_state      |
>> replication_sync_state |
>> last_status_change     | 2023-02-03 23:07:59
>>
>>
>> On 2/3/23 18:15, Tatsuo Ishii wrote:
>>> It seems pgpool thinks backend node 0 is down. To confirm this, can
>>> you share the pgpool_status file and the result of show pool_nodes?
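>>>
>>> For instance (assuming logdir = '/tmp', per the pgpool.conf in this
>>> thread), a quick way to capture both would be:
>>>
>>>     cat /tmp/pgpool_status
>>>     psql --port=9999 -c "show pool_nodes;"
>>>
>>> pgpool_status should contain one status word per backend (e.g. "up"
>>> or "down"), in backend number order.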
>>>
>>>> Logs attached, with log_statement = 'all'.
>>>>
>>>> I don't see any attempted connections to the primary server when
>>>> pgpool is starting up.
>>>>
>>>> On 2/3/23 03:25, Tatsuo Ishii wrote:
>>>>> Can you share the PostgreSQL log of the primary with log_statement =
>>>>> 'all'?  I would like to confirm that queries sent from the sr_check
>>>>> worker are reaching the primary. If so, you should see something like:
>>>>>
>>>>> 1771450 2023-02-03 18:19:05.585 JST LOG: statement: SELECT
>>>>> pg_is_in_recovery()
>>>>> 1771463 2023-02-03 18:19:15.597 JST LOG: statement: SELECT
>>>>> pg_current_wal_lsn()
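>>>>>
>>>>> If log_statement is not already set, one way to turn it on without
>>>>> restarting the primary (a sketch; run as a superuser, and a reload is
>>>>> enough since log_statement does not require a restart):
>>>>>
>>>>>     psql -h FISPCCPGS405a -c "ALTER SYSTEM SET log_statement = 'all'"
>>>>>     psql -h FISPCCPGS405a -c "SELECT pg_reload_conf()"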
>>>>>
>>>>> Best regards,
>>>>> --
>>>>> Tatsuo Ishii
>>>>> SRA OSS LLC
>>>>> English: http://www.sraoss.co.jp/index_en/
>>>>> Japanese: http://www.sraoss.co.jp
>>>>>
>>>>>> Attached are three log files (pgpool, the primary, and the replica
>>>>>> server).
>>>>>>
>>>>>> The primary is definitely not in replication mode.
>>>>>>
>>>>>> On 2/1/23 00:04, Tatsuo Ishii wrote:
>>>>>>>> There must have been a miscommunication; I thought I attached my
>>>>>>>> pgpool.conf and the log file to a previous email, but maybe not.
>>>>>>>>
>>>>>>>> I fixed the backend_port0 problem last week.
>>>>>>> Ok.
>>>>>>>
>>>>>>>> pgpool is already running with log_min_messages = debug3 in
>>>>>>>> pgpool.conf. Is that sufficient?
>>>>>>> Yes.
>>>>>>>
>>>>>>>> Attached is the error log from when I last started pgpool, and the
>>>>>>>> pgpool.conf from that time.
>>>>>>> I see some errors with streaming replication check process:
>>>>>>>
>>>>>>> 2023-01-26 13:31:04.594: sr_check_worker pid 796880: DEBUG: do_query:
>>>>>>> extended:0 query:"SELECT pg_current_wal_lsn()"
>>>>>>> 2023-01-26 13:31:04.594: sr_check_worker pid 796880: CONTEXT: while
>>>>>>> checking replication time lag
>>>>>>> 2023-01-26 13:31:09.594: health_check1 pid 796881: DEBUG: health
>>>>>>> check: clearing alarm
>>>>>>> 2023-01-26 13:31:09.603: health_check1 pid 796881: DEBUG: authenticate
>>>>>>> kind = 10
>>>>>>> 2023-01-26 13:31:09.612: health_check1 pid 796881: DEBUG: SCRAM
>>>>>>> authentication successful for user:pool_health_check
>>>>>>> 2023-01-26 13:31:09.612: health_check1 pid 796881: DEBUG: authenticate
>>>>>>> backend: key data received
>>>>>>> 2023-01-26 13:31:09.612: health_check1 pid 796881: DEBUG: authenticate
>>>>>>> backend: transaction state: I
>>>>>>> 2023-01-26 13:31:09.612: health_check1 pid 796881: DEBUG: health
>>>>>>> check: clearing alarm
>>>>>>> 2023-01-26 13:31:09.612: health_check1 pid 796881: DEBUG: health
>>>>>>> check: clearing alarm
>>>>>>> 2023-01-26 13:31:14.595: sr_check_worker pid 796880: FATAL: Backend
>>>>>>> throw an error message
>>>>>>> 2023-01-26 13:31:14.595: sr_check_worker pid 796880: DETAIL: Exiting
>>>>>>> current session because of an error from backend
>>>>>>> 2023-01-26 13:31:14.595: sr_check_worker pid 796880: HINT: BACKEND
>>>>>>> Error: "recovery is in progress"
>>>>>>> 2023-01-26 13:31:14.595: sr_check_worker pid 796880: CONTEXT: while
>>>>>>> checking replication time lag
>>>>>>>
>>>>>>> The sr_check process tried to determine the WAL LSN on backend0 by
>>>>>>> issuing "SELECT pg_current_wal_lsn()", but failed with:
>>>>>>>
>>>>>>>> 2023-01-26 13:31:14.595: sr_check_worker pid 796880: HINT: BACKEND
>>>>>>>> Error: "recovery is in progress"
>>>>>>> This suggests that backend0 is running as a standby server. I guess
>>>>>>> there's something wrong with the settings on backend0.  Maybe
>>>>>>> standby.signal exists?  Can you share the PostgreSQL log of backend0
>>>>>>> at its startup?
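>>>>>>>
>>>>>>> Two quick checks (a sketch; assumes $PGDATA points at backend0's
>>>>>>> data directory):
>>>>>>>
>>>>>>>     psql -h FISPCCPGS405a -c "SELECT pg_is_in_recovery()"  # 't' means standby
>>>>>>>     ls "$PGDATA"/standby.signal  # if present, the server starts as a standby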
>>>>>>>
>>>>>>> Best regards,
>>>>>>> --
>>>>>>> Tatsuo Ishii
>>>>>>> SRA OSS LLC
>>>>>>> English: http://www.sraoss.co.jp/index_en/
>>>>>>> Japanese: http://www.sraoss.co.jp

-- 
Born in Arizona, moved to Babylonia.

