[pgpool-general: 8256] Re: pcp_recovery_node command fails

Todd Stein todd.stein at microfocus.com
Sat Jun 25 23:23:47 JST 2022


Still no luck...  Now I really have to quit working...


Regards,

Todd Stein




-----Original Message-----
From: Tatsuo Ishii <ishii at sraoss.co.jp> 
Sent: Saturday, June 25, 2022 9:32 AM
To: Todd Stein <todd.stein at microfocus.com>
Cc: pgpool-general at pgpool.net
Subject: Re: [pgpool-general: 8251] Re: pcp_recovery_node command fails

That is not the right version of the pgpool_recovery extension for Pgpool-II 4.3.x.  You need to install version 1.4.
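
A minimal sketch of how the version in template1 could be checked and brought up to 1.4. The helper names (installed_version, fix_statement) are hypothetical and not part of Pgpool-II. It assumes psql can connect as the postgres user, and that the 1.4 extension files (including the 1.3 to 1.4 update script) are installed on the PostgreSQL server. If the update script is not available, dropping and recreating the extension is the alternative.

```shell
#!/bin/sh
# Sketch only: compare the pgpool_recovery version installed in a database
# with the version Pgpool-II 4.3.x expects, and print the SQL that would
# fix it. Host, port and database are passed in by the caller.

REQUIRED_VERSION="1.4"

# Print the installed pgpool_recovery version in database $3 on host $1,
# port $2. Prints nothing if the extension is not installed there.
installed_version() {
    psql -h "$1" -p "$2" -U postgres -d "$3" -Atc \
        "SELECT extversion FROM pg_extension WHERE extname = 'pgpool_recovery'"
}

# Print the statement needed to get from version $1 to REQUIRED_VERSION.
# Prints nothing when the version already matches.
fix_statement() {
    if [ -z "$1" ]; then
        echo "CREATE EXTENSION pgpool_recovery;"
    elif [ "$1" != "$REQUIRED_VERSION" ]; then
        echo "ALTER EXTENSION pgpool_recovery UPDATE TO '$REQUIRED_VERSION';"
    fi
}

# Example, run against the primary node (host name is a placeholder):
#   fix_statement "$(installed_version primary.example.com 5432 template1)"
```

The printed statement would then be run with psql against template1 on the primary, since that is the database Pgpool-II connects to when it calls pgpool_recovery.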

> When I run the \dx query on the template1 db I see that version 1.3 is installed.
> 
> 
> Regards,
> 
> Todd Stein
> 
> 
> -----Original Message-----
> From: pgpool-general <pgpool-general-bounces at pgpool.net> On Behalf Of Tatsuo Ishii
> Sent: Saturday, June 25, 2022 9:10 AM
> To: Todd Stein <todd.stein at microfocus.com>
> Cc: pgpool-general at pgpool.net
> Subject: [pgpool-general: 8251] Re: pcp_recovery_node command fails
> 
> Do you have the extension in the template1 database as well? Pgpool-II connects to the template1 database when calling pgpool_recovery.
> 
>> I've gone back and run the \dx query on each of the nodes.  Same result.
>> 
>> 
>> -----Original Message-----
>> From: pgpool-general <pgpool-general-bounces at pgpool.net> On Behalf Of Tatsuo Ishii
>> Sent: Friday, June 24, 2022 7:02 PM
>> To: Todd Stein <todd.stein at microfocus.com>
>> Cc: pgpool-general at pgpool.net
>> Subject: [pgpool-general: 8249] Re: pcp_recovery_node command fails
>> 
>> Where did you run the \dx command? You need to run \dx on the PostgreSQL primary node (probably catvmdxcpg12b.ftc.hpeswlab.net?).
>> 
>>> Looks like I have the same extension.
>>> postgres=# \dx pgpool_recovery
>>>                           List of installed extensions
>>>       Name       | Version | Schema |                Description
>>> -----------------+---------+--------+-------------------------------------------
>>>  pgpool_recovery | 1.4     | public | recovery functions for pgpool-II for V4.3
>>> (1 row)
>>> 
>>> postgres=#
>> 
>> I also suggest the following:
>> 
>> - Share pgpool.conf so that people (including me) can confirm that
>>   your configuration and attempts are correct.
>> 
>> - Disable watchdog for now. At this point there are too many
>>   possible causes of the problem (the extension is not installed,
>>   pgpool.conf is not correct, and/or an unknown problem with
>>   watchdog). Once you confirm the system works, you can enable
>>   watchdog and continue testing. Let's proceed step by step.
>>   
>>> I want to run the pcp_recovery_node command:
>>> /usr/bin/pcp_recovery_node -d -U postgres -h 16.78.121.246 -p 9898 -n 0
>>> 
>>> AFAIK the first step (stage) in the pcp_recovery_node process is to run the following:
>>> recovery_1st_stage_command = '/var/lib/pgsql/12/data/recovery_1st_stage'
>>> then the pgpool_remote_start script is run.
>>> 
>>> When the pcp_recovery_node command is run, the recovery_1st_stage script receives the following list of arguments:
>>> 	PRIMARY_NODE_PGDATA=/var/lib/pgsql/12/data %R
>>> 	DEST_NODE_HOST=catvmdxcpg12a.ftc.hpeswlab.net %h
>>> 	DEST_NODE_PGDATA=/var/lib/pgsql/12/data %D
>>> 	PRIMARY_NODE_PORT=5432 %r
>>> 	DEST_NODE_ID=0 %d
>>> 	DEST_NODE_PORT=5432 %p
>>> 	PRIMARY_NODE_HOST=catvmdxcpg12b.ftc.hpeswlab.net %H
>>> 
>>> When the pgpool_remote_start script is run, it receives the following list of arguments:
>>> 	DEST_NODE_HOST=catvmdxcpg12a.ftc.hpeswlab.net %h
>>> 	DEST_NODE_PGDATA=/var/lib/pgsql/12/data %D
>>> 	
>>> When I run /usr/bin/pcp_recovery_node, the following error is sent to stdout.
>>> 	-bash-4.2$ /usr/bin/pcp_recovery_node -U postgres -h 16.78.121.246 -p 9898 -n 0
>>> 	Password:
>>> 	ERROR:  executing recovery, execution of command failed at "1st stage"
>>> 	DETAIL:  command:"recovery_1st_stage"
>>> 	
>>> However, if I run the two scripts manually with the arguments, the process works.
>>> 
>>> -bash-4.2$ $PGDATA/recovery_1st_stage /var/lib/pgsql/12/data catvmdxcpg12a.ftc.hpeswlab.net /var/lib/pgsql/12/data 5432 0 9898 catvmdxcpg12b.ftc.hpeswlab.net
>>> + MAIN_NODE_PGDATA=/var/lib/pgsql/12/data
>>> + DEST_NODE_HOST=catvmdxcpg12a.ftc.hpeswlab.net
>>> + DEST_NODE_PGDATA=/var/lib/pgsql/12/data
>>> + MAIN_NODE_PORT=5432
>>> + DEST_NODE_ID=0
>>> + DEST_NODE_PORT=9898
>>> + MAIN_NODE_HOST=catvmdxcpg12b.ftc.hpeswlab.net
>>> + PGHOME=/usr/pgsql-12
>>> + ARCHIVEDIR=/var/lib/pgsql/archivedir
>>> + REPLUSER=repl
>>> + MAX_DURATION=60
>>> + echo recovery_1st_stage: start: pg_basebackup for Standby node 0
>>> recovery_1st_stage: start: pg_basebackup for Standby node 0 ...
>>> ...
>>> recovery_1st_stage: end: recovery_1st_stage is completed successfully
>>> + exit 0
>>> 
>>> Next, manually run pgpool_remote_start:
>>> -bash-4.2$ $PGDATA/pgpool_remote_start catvmdxcpg12a.ftc.hpeswlab.net /var/lib/pgsql/12/data
>>> + DEST_NODE_HOST=catvmdxcpg12a.ftc.hpeswlab.net
>>> + DEST_NODE_PGDATA=/var/lib/pgsql/12/data
>>> + PGHOME=/usr/pgsql-12
>>> + echo pgpool_remote_start: start: remote start Standby node catvmdxcpg12a.ftc.hpeswlab.net
>>> pgpool_remote_start: start: remote start Standby node catvmdxcpg12a.ftc.hpeswlab.net
>>> + ssh -T -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null postgres@catvmdxcpg12a.ftc.hpeswlab.net -i /var/lib/pgsql/.ssh/id_rsa_pgpool ls /tmp
>>> Warning: Permanently added 'catvmdxcpg12a.ftc.hpeswlab.net,16.78.126.184' (ECDSA) to the list of known hosts.
>>> + '[' 0 -ne 0 ']'
>>> + ssh -T -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null postgres@catvmdxcpg12a.ftc.hpeswlab.net -i /var/lib/pgsql/.ssh/id_rsa_pgpool '
>>>     /usr/pgsql-12/bin/pg_ctl -l /dev/null -w -D /var/lib/pgsql/12/data start '
>>> Warning: Permanently added 'catvmdxcpg12a.ftc.hpeswlab.net,16.78.126.184' (ECDSA) to the list of known hosts.
>>> waiting for server to start.... done
>>> server started
>>> + '[' 0 -ne 0 ']'
>>> + echo pgpool_remote_start: end: PostgreSQL on catvmdxcpg12a.ftc.hpeswlab.net is started successfully.
>>> pgpool_remote_start: end: PostgreSQL on catvmdxcpg12a.ftc.hpeswlab.net is started successfully.
>>> + exit 0
>>> -bash-4.2$
>>> 
>>> The postgres server did not start! (That is only according to
>>> systemctl status postgresql-12; running pg_ctl status shows that it
>>> is running.)
>>> 
>>> Looking at the replication_delay for node 0 shows a value of 67109080.
>>> All of the files in $PGDATA have a very recent time stamp indicating that pg_basebackup had run.
>> 
>> You can check the standby PostgreSQL log to see what's going on.
>> 
>>> Regards,
>>> 
>>> Todd Stein
>>> 
>>> -----Original Message-----
>>> From: Tatsuo Ishii <ishii at sraoss.co.jp>
>>> Sent: Thursday, June 23, 2022 7:46 PM
>>> To: Todd Stein <todd.stein at microfocus.com>
>>> Cc: jon.schewe at raytheon.com; pgpool-general at pgpool.net
>>> Subject: Re: [pgpool-general: 8244] Re: pcp_recovery_node command fails
>>> 
>>>> Many responses recommended installing the pgpool_recovery extension; I had already done that as part of the install.  My install was done with RPMs.
>>>> 
>>>> ERROR:  extension "pgpool_recovery" already exists
>>>> 2022-06-23 16:29:25.782 EDT [21981] STATEMENT:  CREATE EXTENSION pgpool_recovery;
>>>> 
>>>> The recovery_1st_stage script came from a sample provided with the RPM version.  The only thing I should need to do with it is to adjust the path of $PGHOME.
>>> 
>>> It's apparent that the correct version of the pgpool_recovery
>>> extension was not installed, or that the extension was not installed
>>> at all. You can check with the following command, using psql on the
>>> primary PostgreSQL:
>>> 
>>> test=# \dx pgpool_recovery
>>>                           List of installed extensions
>>>       Name       | Version | Schema |                Description                
>>> -----------------+---------+--------+-------------------------------------------
>>>  pgpool_recovery | 1.4     | public | recovery functions for pgpool-II for V4.3
>>> (1 row)
>>> 
>>>> Regards,
>>>> 
>>>> Todd Stein
>>>> 
>>>> -----Original Message-----
>>>> From: Todd Stein
>>>> Sent: Thursday, June 23, 2022 4:08 PM
>>>> To: Jon SCHEWE <jon.schewe at raytheon.com>; pgpool-general at pgpool.net
>>>> Subject: RE: pcp_recovery_node command fails
>>>> 
>>>> This is the stdout:
>>>> ERROR:  executing recovery, execution of command failed at "1st stage"
>>>> DETAIL:  command:"recovery_1st_stage"
>>>> 
>>>> The pgpool logs don't have much useful info.  Even when I set them to debug, it's not very helpful.
>>>> 
>>>> This seems to be a pretty common issue; lots of people post about it, but I have not seen a resolution yet.
>>>> 
>>>> The postgres log is actually more useful:
>>>> ERROR:  function pgpool_recovery(unknown, unknown, unknown, unknown, integer, unknown, unknown) does not exist at character 8
>>>> 2022-06-23 16:03:53.740 EDT [25708] HINT:  No function matches the given name and argument types. You might need to add explicit type casts.
>>>> 2022-06-23 16:03:53.740 EDT [25708] STATEMENT:  SELECT pgpool_recovery('recovery_1st_stage', 'nodea', '/var/lib/pgsql/12/data', '5432', 0, '5432', 'nodeb')
>>>> 
>>>> 
>>>> Regards,
>>>> 
>>>> Todd Stein
>>>> 
>>>> -----Original Message-----
>>>> From: pgpool-general <pgpool-general-bounces at pgpool.net> On Behalf Of Jon SCHEWE
>>>> Sent: Thursday, June 23, 2022 3:34 PM
>>>> To: Todd Stein <todd.stein at microfocus.com>; pgpool-general at pgpool.net
>>>> Subject: [pgpool-general: 8242] Re: pcp_recovery_node command fails
>>>> 
>>>>> I'm trying to use pcp_recovery_node for online recovery in a pgpool/postgresql-12 cluster.
>>>>> 
>>>>> My cluster has PostgreSQL 12.8 and pgpool 4.3.2 running on CentOS 7.9 linux.
>>>>> 
>>>>> I've tried so many things, I'll not go into those details just yet.
>>>>> 
>>>>> To start with, here is the output of the pcp_recovery_node command:
>>>>> 
>>>>> pcp_recovery_node -U postgres -h <VIP> -p 9898 -n 0
>>>>> Password:
>>>>> ERROR:  executing recovery, execution of command failed at "1st stage"
>>>>> DETAIL:  command:"recovery_1st_stage"
>>>>> 
>>>> 
>>>> Do you see anything in your logs about the errors? Usually this is either on stdout from the service or in /var/log/pgpool...
>>>> I'm guessing that your recovery_1st_stage script either isn't defined or isn't doing what you expect.
>>>> _______________________________________________
>>>> pgpool-general mailing list
>>>> pgpool-general at pgpool.net
>>>> http://www.pgpool.net/mailman/listinfo/pgpool-general
