[pgpool-general: 7329] Re: Setup Pgpool2 with Postgresql Streaming Replication

Tatsuo Ishii ishii at sraoss.co.jp
Wed Oct 28 09:26:06 JST 2020


(Cc:ed to pgpool-general)

> Hi Tatsuo,
> 
> I have attached the additional files.  Yes, pcp_recovery_command hangs after it prompts me for the postgres password and I enter the password.  Then the cursor moves to the next line and it just hangs there.  It doesn't return the " pcp_recovery_node -- Command Successful" message as it should.

Ok, let's invest this first. Can you share the pgpool log file right
after you hit pcp_recovery_node command, along with the exact command
line you hit?

If normal, you see something like these lines in the log file (note
that the following lines are from Pgpool-II 4.2 beta1. Pgpool-II 4.0
will show a little bit different log)

2020-10-28 09:19:18: main pid 19853: LOG:  fork a new worker child process with pid: 19915
2020-10-28 09:19:24: pcp_main pid 19914: LOG:  forked new pcp worker, pid=19919 socket=6
2020-10-28 09:19:24: pcp_child pid 19919: LOG:  starting recovering node 1
2020-10-28 09:19:24: pcp_child pid 19919: LOG:  executing recovery
2020-10-28 09:19:24: pcp_child pid 19919: DETAIL:  starting recovery command: "SELECT pgpool_recovery('basebackup.sh', 'localhost', '/home/t-ishii/work/Pgpool-II/current/d/data1', '11002', 1, '11003')"
2020-10-28 09:19:24: pcp_child pid 19919: LOG:  executing recovery
2020-10-28 09:19:24: pcp_child pid 19919: DETAIL:  disabling statement_timeout
2020-10-28 09:19:26: pcp_child pid 19919: LOG:  node recovery, 1st stage is done
2020-10-28 09:19:26: pcp_child pid 19919: LOG:  checking if postmaster is started
2020-10-28 09:19:26: pcp_child pid 19919: DETAIL:  trying to connect to postmaster on hostname:/tmp database:postgres user:t-ishii (retry 0 times)
2020-10-28 09:19:26: pcp_child pid 19919: LOG:  checking if postmaster is started
2020-10-28 09:19:26: pcp_child pid 19919: DETAIL:  failed to connect to postmaster on hostname:/tmp database:postgres user:t-ishii
2020-10-28 09:19:29: pcp_child pid 19919: LOG:  checking if postmaster is started
2020-10-28 09:19:29: pcp_child pid 19919: DETAIL:  trying to connect to postmaster on hostname:/tmp database:postgres user:t-ishii (retry 1 times)
2020-10-28 09:19:29: pcp_child pid 19919: LOG:  node recovery, node: 1 restarted
2020-10-28 09:19:29: pcp_child pid 19919: LOG:  received failback request for node_id: 1 from pid [19919]
2020-10-28 09:19:29: main pid 19853: LOG:  Pgpool-II parent process has received failover request
2020-10-28 09:19:29: main pid 19853: LOG:  starting fail back. reconnect host /tmp(11003)
2020-10-28 09:19:29: main pid 19853: LOG:  Node 0 is not down (status: 1)
:
:
:

> Thanks,
> Alan
> 
> -----Original Message-----
> From: Tatsuo Ishii [mailto:ishii at sraoss.co.jp] 
> Sent: Saturday, October 24, 2020 2:21 AM
> To: Nguyen, Tu <tnguyen3 at honolulu.gov>
> Cc: ishii at sraoss.co.jp
> Subject: Re: Setup Pgpool2 with Postgresql Streaming Replication
> 
> Hi Alan,
> 
> I also need pg_hba.conf and recovery_1st_stage script.
> 
> Also please answer my queston.
> 
>>> but now I'm getting the new issue where it just hangs and does nothing.
>> 
>> You mean pcp_recovery_command hangs?
> 
>> The filename starts with "server1" which is actually the primary node "name: ltpgsql11".  I was just interchanging the names for documentation purposes.  I didn't include the standby node "name: ltpgsql12" configuration files because I think they should be the same except the host names are switched around in the files.  I'm wondering if it's my .pgpass file location too which is currently stored in /home/postgres user.  I think I saw somewhere in one of the scripts, the default path was set to a different location (/var/lib/...)?
>> 
>> Thanks,
>> Alan
>> 
>> 
>> -----Original Message-----
>> From: Tatsuo Ishii [mailto:ishii at sraoss.co.jp]
>> Sent: Friday, October 23, 2020 11:53 PM
>> To: Nguyen, Tu <tnguyen3 at honolulu.gov>
>> Cc: ishii at sraoss.co.jp; pgpool-general at pgpool.net
>> Subject: Re: Setup Pgpool2 with Postgresql Streaming Replication
>> 
>>> but now I'm getting the new issue where it just hangs and does nothing.
>> 
>> You mean pcp_recovery_command hangs?
>> 
>>> I suspect it's some settings in one of the configuration files (pgpool.conf, postgresql.conf, pg_hba.conf, pool_hba.conf, recovery_1st_stage, pgpool_remote_start, etc.) or system settings (authentication, etc.) that I applied when following the instructions in the documentation.  Probably need to correct something.
>> 
>> Can you share those files (including recovery.conf) so that I can examine them?
>> 
>>> Yeah, I was probably restarting both pgpool and postgresql service repeatedly.  Looks like after I used the scripts from your new link, it went away, but now I'm getting the new issue where it just hangs and does nothing.  I suspect it's some settings in one of the configuration files (pgpool.conf, postgresql.conf, pg_hba.conf, pool_hba.conf, recovery_1st_stage, pgpool_remote_start, etc.) or system settings (authentication, etc.) that I applied when following the instructions in the documentation.  Probably need to correct something.
>>> 
>>> -----Original Message-----
>>> From: Tatsuo Ishii [mailto:ishii at sraoss.co.jp]
>>> Sent: Friday, October 23, 2020 11:21 PM
>>> To: Nguyen, Tu <tnguyen3 at honolulu.gov>
>>> Cc: ishii at sraoss.co.jp; pgpool-general at pgpool.net
>>> Subject: Re: Setup Pgpool2 with Postgresql Streaming Replication
>>> 
>>> It seems the primary server was restarted again and again without any shutdown message. Pretty strange. Maybe you manually reset PostgreSQL or whole system repeatedly?
>>> 
>>>> 2020-10-21 01:25:12 HST [865]: [1-1]     0 00000: LOG:  starting PostgreSQL 12.3 (Ubuntu 12.3-1.pgdg16.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609, 64-bit
>>>> 2020-10-21 01:25:12 HST [865]: [2-1]     0 00000: LOG:  listening on IPv4 address "0.0.0.0", port 6432
>>>> 2020-10-21 01:25:12 HST [865]: [3-1]     0 00000: LOG:  listening on IPv6 address "::", port 6432
>>>> 2020-10-21 01:25:12 HST [865]: [4-1]     0 00000: LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.6432"
>>>> 2020-10-21 01:25:12 HST [865]: [5-1]     0 00000: LOG:  redirecting log output to logging collector process
>>>> 2020-10-21 01:25:12 HST [865]: [6-1]     0 00000: HINT:  Future log output will appear in directory "log".
>>>> 2020-10-21 01:27:52 HST [1505]: [1-1]     0 00000: LOG:  starting PostgreSQL 12.3 (Ubuntu 12.3-1.pgdg16.04+1) on x86_6 4-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609, 64-bit
>>>> 2020-10-21 01:27:52 HST [1505]: [2-1]     0 00000: LOG:  listening on IPv4 address "0.0.0.0", port 6432
>>>> 2020-10-21 01:27:52 HST [1505]: [3-1]     0 00000: LOG:  listening on IPv6 address "::", port 6432
>>>> 2020-10-21 01:27:52 HST [1505]: [4-1]     0 00000: LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.6432"
>>>> 2020-10-21 01:27:52 HST [1505]: [5-1]     0 00000: LOG:  redirecting log output to logging collector process
>>>> 2020-10-21 01:27:52 HST [1505]: [6-1]     0 00000: HINT:  Future log output will appear in directory "log".
>>>> 2020-10-21 01:32:25 HST [2726]: [1-1]     0 00000: LOG:  starting PostgreSQL 12.3 (Ubuntu 12.3-1.pgdg16.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609, 64-bit
>>>> 2020-10-21 01:32:25 HST [2726]: [2-1]     0 00000: LOG:  listening on IPv4 address "0.0.0.0", port 6432
>>>> 2020-10-21 01:32:25 HST [2726]: [3-1]     0 00000: LOG:  listening on IPv6 address "::", port 6432
>>> 
>>>> Looking back at our emails, looks like I sent it on Wednesday, Oct. 21.  Don't remember what time, maybe several hours before you responded.
>>>> 
>>>> -----Original Message-----
>>>> From: Tatsuo Ishii [mailto:ishii at sraoss.co.jp]
>>>> Sent: Friday, October 23, 2020 10:58 PM
>>>> To: Nguyen, Tu <tnguyen3 at honolulu.gov>
>>>> Cc: pgpool-general at pgpool.net
>>>> Subject: Re: Setup Pgpool2 with Postgresql Streaming Replication
>>>> 
>>>> The log started at 2020-10-18 22:42:46.989 HST and ended at 2020-10-23 06:46:54 HST.
>>>> When did you execute pcp_recovery_node exactly?
>>>> 
>>>>> Hi Tatsuo,
>>>>> 
>>>>> Here are the logs.
>>>>> 
>>>>> Thanks,
>>>>> Alan
>>>>> 
>>>>> -----Original Message-----
>>>>> From: Tatsuo Ishii [mailto:ishii at sraoss.co.jp]
>>>>> Sent: Friday, October 23, 2020 10:30 PM
>>>>> To: Nguyen, Tu <tnguyen3 at honolulu.gov>
>>>>> Cc: pgpool-general at pgpool.net
>>>>> Subject: Re: Setup Pgpool2 with Postgresql Streaming Replication
>>>>> 
>>>>>> Hi all,
>>>>>> 
>>>>>> Looks like I may have been able to run the pcp_recovery_node command, but after I entered the password, it goes to the next line and just stays there, no errors or anything.  Maybe it's stuck?
>>>>> 
>>>>> If you provide the log I requested, I can examine what's wrong.
>>>>> 
>>>>>> 1) I need the PostgreSQL log while you are getting:
>>>>>>>> Ran this command on the primary server: pcp_recovery_node -h
>>>>>>>> 192.168.80.90 -p 9898 -U postgres -n 1 And received this error:
>>>>>>>> ERROR:  recovery is checking if postmaster is started
>>>>>>>> DETAIL:  postmaster on hostname:"ltpgsql12" database:"template1"
>>>>>>>> user:"postgres" failed to start in 90 second
>>>>> 
>>>>>>I looked on the standby server node I'm trying to recover, with the "top" command and it says CPU usage is around 35% for the pg_basebackup command.  It's been almost 5 hours like this.  Does anyone know how long it takes?  I heard that the standby recovery takes a long time, but does it really take this long?  I have around 15 databases that totals about 300-500 MB of data that Pgpool2 is supposedly trying to recover right now.
>>>>>> 
>>>>>> Thanks,
>>>>>> Alan
>>>>>> 
>>>>>> -----Original Message-----
>>>>>> From: Tatsuo Ishii [mailto:ishii at sraoss.co.jp]
>>>>>> Sent: Thursday, October 22, 2020 1:43 AM
>>>>>> To: Nguyen, Tu <tnguyen3 at honolulu.gov>
>>>>>> Cc: pgpool-general at pgpool.net
>>>>>> Subject: Re: Setup Pgpool2 with Postgresql Streaming Replication
>>>>>> 
>>>>>>> 2020-10-19 10:06:44.941 HST [24277] LOG:  starting PostgreSQL 
>>>>>>> 12.3 (Ubuntu 12.3-1.pgdg16.04+1) on x86_64-pc-linux-gnu, compiled 
>>>>>>> by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609, 64-bit
>>>>>> 
>>>>>> The document is for CentOS, while you are using Ubuntu. I am not sure the documentation is useful for you or not. Anyway...
>>>>>> 
>>>>>> 1) I need the PostgreSQL log while you are getting:
>>>>>> 
>>>>>>>> I'm at step 7.3.8.1 Set up PostgreSQL standby server:
>>>>>>>> Ran this command on the primary server: pcp_recovery_node -h
>>>>>>>> 192.168.80.90 -p 9898 -U postgres -n 1 And received this error:
>>>>>>>> ERROR:  recovery is checking if postmaster is started
>>>>>>>> DETAIL:  postmaster on hostname:"ltpgsql12" database:"template1"
>>>>>>>> user:"postgres" failed to start in 90 second
>>>>>> 
>>>>>> 2) Do you follow the step "7.3.4. Before Starting"? I am asking because you got this error:
>>>>>> 
>>>>>> 2020-10-19 13:10:23.521 HST [6757] postgres at postgres DETAIL:  User "postgres" has no password assigned.
>>>>>> 	Connection matched pg_hba.conf line 8: "host    all             all             0.0.0.0/0               md5"
>>>>>> 
>>>>>> Have you executed below in the documentation?
>>>>>> 
>>>>>>   [server1]# psql -U postgres -p 5432
>>>>>>        postgres=# SET password_encryption = 'scram-sha-256';
>>>>>>        postgres=# CREATE ROLE pgpool WITH LOGIN;
>>>>>>        postgres=# CREATE ROLE repl WITH REPLICATION LOGIN;
>>>>>>        postgres=# \password pgpool
>>>>>>        postgres=# \password repl
>>>>>>        postgres=# \password postgres
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> From: "Nguyen, Tu" <tnguyen3 at honolulu.gov>
>>>>>> Subject: RE: Setup Pgpool2 with Postgresql Streaming Replication
>>>>>> Date: Thu, 22 Oct 2020 09:35:36 +0000
>>>>>> Message-ID: 
>>>>>> <F152F79048C4864E9825A767687750E404E82E97F0 at CchExMail2.cchnl.hnl>
>>>>>> 
>>>>>>> Hi Tatsuo,
>>>>>>> 
>>>>>>> Yes I'm using Pgpool-II version 4.0.
>>>>>>> 
>>>>>>> Ssh doesn't seem to be the issue, I was able to test ssh 
>>>>>>> successfully to both servers using this command from the new
>>>>>>> documentation:  ssh postgres at serverX -i ~/.ssh/id_rsa_pgpool
>>>>>>> 
>>>>>>> I have attached the postgresql-12-main.log file.
>>>>>>> 
>>>>>>> After we get the online recovery task fixed, are there recommended step by step instructions on how to best setup Postgresql Streaming Replication for Pgpool-II 4.0 clustering?  When I execute the "show pool_nodes" command, the replication_state and replication_sync_state values are blank and data is not being replicated.
>>>>>>> 
>>>>>>> On another topic, can the Pgpool2 Connection Pooling feature handle 3,000-4,000 simultaneous connections effectively?
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Alan
>>>>>>> 
>>>>>>> 
>>>>>>> -----Original Message-----
>>>>>>> From: Tatsuo Ishii [mailto:ishii at sraoss.co.jp]
>>>>>>> Sent: Wednesday, October 21, 2020 10:21 PM
>>>>>>> To: Nguyen, Tu <tnguyen3 at honolulu.gov>
>>>>>>> Cc: pgsql-general at postgresql.org
>>>>>>> Subject: Re: Setup Pgpool2 with Postgresql Streaming Replication
>>>>>>> 
>>>>>>> Hi Alan,
>>>>>>> 
>>>>>>> This is not the best forum to discuss Pgpool-II related topics. I advice you to go to the Pgpool-II dedicated forum:
>>>>>>> 
>>>>>>> https://www.pgpool.net/mailman/listinfo/pgpool-general
>>>>>>> 
>>>>>>>> Hi,
>>>>>>>> 
>>>>>>>> I'm following the steps from:
>>>>>>>> 
>>>>>>>> https://access.crunchydata.com/documentation/pgpool/4.0.0/exampl
>>>>>>>> e
>>>>>>>> -
>>>>>>>> c
>>>>>>>> l
>>>>>>>> u
>>>>>>>> s
>>>>>>>> ter.html
>>>>>>> 
>>>>>>> This documentation looks pretty old. The original and the latest documentation for Pgpool-II 4.0 is here (I assume you are using Pgpool-II 4.0):
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> https://www.pgpool.net/docs/40/en/html/example-cluster.html
>>>>>>> 
>>>>>>> I strongly suggest to look into this.
>>>>>>> 
>>>>>>>> I'm at step 7.3.8.1 Set up PostgreSQL standby server:
>>>>>>>> Ran this command on the primary server: pcp_recovery_node -h
>>>>>>>> 192.168.80.90 -p 9898 -U postgres -n 1 And received this error:
>>>>>>>> ERROR:  recovery is checking if postmaster is started
>>>>>>>> DETAIL:  postmaster on hostname:"ltpgsql12" database:"template1"
>>>>>>>> user:"postgres" failed to start in 90 second
>>>>>>>> 
>>>>>>>> How can I get this command to run successfully?
>>>>>>> 
>>>>>>> Probably you have a problem with ssh settings. The newer and original document describes far detailed steps to set up ssh settings. Please take a look at "7.3.2. Requirements" section in the newer document.
>>>>>>> 
>>>>>>> In the mean time to confirm that the problem is related to ssh, we need to look into the PostgreSQL log (not Pgpool-II log) on primary PostgreSQL node. Please share it (again, you'd better to post messages to the pgpool-general mailing list).
>>>>>>> 
>>>>>>>> Also, when I ran this command: psql -p 5433 -c "show pool_nodes"
>>>>>>>> It shows the following pgpool2 node status, but when I did a 
>>>>>>>> listing of databases on the primary and standby servers, I don't 
>>>>>>>> see the databases on the primary replicated to the standby.  How 
>>>>>>>> can I setup the Postgresql Streaming Replication and check if it's working?
>>>>>>> 
>>>>>>> Standby status is down because you failed to execute online recovery. You need to fix it.
>>>>>>> 
>>>>>>>> node_id | hostname  | port | status | lb_weight |  role   | select_cnt |
>>>>>>>> load_balance_node | replication_delay | replication_state | 
>>>>>>>> replication_sync_state | last_status_change
>>>>>>>> ---------+-----------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
>>>>>>>>  0       | ltpgsql11 | 6432 | up     | 0.500000  | primary | 0          |
>>>>>>>> true              | 0                 |                   |                       
>>>>>>>> | 2020-10-21 11:56:48
>>>>>>>>  1       | ltpgsql12 | 6432 | down   | 0.500000  | standby | 0          |
>>>>>>>> false             | 0                 |                   |                       
>>>>>>>> | 2020-10-21 11:56:48
>>>>>>>> (2 rows)
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Alan
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> Sent from: 
>>>>>>>> https://www.postgresql-archive.org/PostgreSQL-general-f1843780.h
>>>>>>>> t
>>>>>>>> m
>>>>>>>> l
>>>>>>>> 
>>>>>>>> 


More information about the pgpool-general mailing list