[Pgpool-general] failover done, now need help in online recovery - Please help!

Tue Apr 19 08:49:57 UTC 2011

After recovering the old primary (5432), can you connect to pgpool
without problems? If so, this is normal. Pgpool needs to restart all
child processes after recovery.
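
One way to confirm that the node came back is to ask pgpool over PCP.
A rough sketch, assuming the same PCP host, port, user and password as
in the pcp_recovery_node command quoted below, and the old-style
positional arguments of pcp_node_info (timeout, host, port, user,
password, node ID). The function names are made up for illustration.

```shell
# Map the status digit printed by pcp_node_info to a description.
describe_status() {
  case "$1" in
    0) echo "not initialized" ;;
    1) echo "up, no connections yet" ;;
    2) echo "up, connections pooled" ;;
    3) echo "down" ;;
    *) echo "unknown" ;;
  esac
}

# check_node NODE_ID: print the state of one backend node.
# Host, PCP port and credentials are assumptions copied from the thread.
check_node() {
  info=$(pcp_node_info 20 localhost 9898 pg pg "$1") || return 1
  # Output fields are: hostname, port, status, load balance weight.
  set -- $info
  echo "$1:$2 is $(describe_status "$3")"
}
```

Running "check_node 0" after the failback should report node 0 as up
(status 1 or 2) if the recovery went through.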
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

> Alright. Once the failover is done successfully and the standby is promoted to 
> new primary, I start the recovery of the old primary (5432).
> 
> The old primary is getting recovered, but I found that pgpool is getting 
> killed. Why? Here is the pgpool log:
> ......
> 2011-04-19 11:23:42 LOG:   pid 15894: send_failback_request: fail back 0 th node 
> request from pid 15894
> 2011-04-19 11:23:42 DEBUG: pid 21977: s_do_auth: auth kind: 0
> 2011-04-19 11:23:42 ERROR: pid 21977: s_do_auth: unknown response "E" before 
> processing BackendKeyData
> 2011-04-19 11:23:42 ERROR: pid 21977: s_do_auth: unknown response "^@" before 
> processing BackendKeyData
> 2011-04-19 11:23:42 ERROR: pid 21977: s_do_auth: unknown response "^@" before 
> processing BackendKeyData
> 2011-04-19 11:23:42 ERROR: pid 21977: s_do_auth: unknown response "^@" before 
> processing BackendKeyData
> 2011-04-19 11:23:42 ERROR: pid 21977: s_do_auth: unknown response "V" before 
> processing BackendKeyData
> 2011-04-19 11:23:42 DEBUG: pid 21977: s_do_auth: parameter status data received
> 2011-04-19 11:23:42 ERROR: pid 21977: pool_read2: failed to realloc
> 2011-04-19 11:23:42 DEBUG: pid 15861: failover_handler called 
> 2011-04-19 11:23:42 DEBUG: pid 15861: failover_handler: starting to select new 
> master node
> 2011-04-19 11:23:42 LOG:   pid 15861: starting fail back. reconnect host 
> localhost(5432)
> 2011-04-19 11:23:42 LOG:   pid 15861: execute command: touch 
> /home/sandeep/PostgreSQL9.0/inst/bin/../failback.log
> 2011-04-19 11:23:42 DEBUG: pid 20267: child received shutdown request signal 3
> 2011-04-19 11:23:42 DEBUG: pid 15861: failover_handler: kill 20267
> 2011-04-19 11:23:42 DEBUG: pid 15861: failover_handler: kill 20268
> 2011-04-19 11:23:42 DEBUG: pid 15861: failover_handler: kill 20269
> 2011-04-19 11:23:42 DEBUG: pid 20268: child received shutdown request signal 3
> 2011-04-19 11:23:42 DEBUG: pid 20270: child received shutdown request signal 3
> .....
> 2011-04-19 11:23:42 LOG:   pid 15861: failover_handler: set new master node: 0
> 2011-04-19 11:23:42 DEBUG: pid 22011: I am 22011
> 2011-04-19 11:23:42 LOG:   pid 15861: failback done. reconnect host 
> localhost(5432)
> 
> My command to start recovery is:
> pcp_recovery_node  -d 20 localhost 9898 pg pg 0
> 
> 
> The other thing I noticed is that the above command does not return; I have to 
> press Control-C to get the prompt back. I'm working on 64-bit CentOS.
> Thanks for your help.
> 
> 
> 
> ________________________________
> From: Sandeep Thakkar <sandeeptt at yahoo.com>
> To: Sandeep Thakkar <sandeeptt at yahoo.com>; pgpool-general at pgfoundry.org
> Sent: Thu, April 14, 2011 3:28:34 PM
> Subject: Re: [Pgpool-general] failover done, now need help in online recovery
> 
> 
> Can we bring the primary server up again? I found in the doc that "In 
> master/slave mode with streaming replication, online recovery can be performed. 
> Only a standby node can be recovered. You cannot recover the primary node. To 
> recover the primary node, you have to stop all DB nodes and pgpool-II, and then 
> restore it from a backup." 
> 
> 
> 
> 
> So, can't we restore the primary without taking the standby (new primary) down? 
> 
> 
> Thanks.
> 
> 
> 
> 
> ________________________________
> From: Sandeep Thakkar <sandeeptt at yahoo.com>
> To: pgpool-general at pgfoundry.org
> Sent: Wed, April 13, 2011 3:05:25 PM
> Subject: [Pgpool-general] failover done, now need help in online recovery
> 
> 
> Hi,
> 
> Failover:
> I have one Master (PG9.0), one Standby (PG9.0) and one instance of pgpool-II 
> (3.0.3) on the same box. I created a recovery.conf in the Standby and made all 
> the other required settings in pgpool.conf and postgresql.conf. To mimic the 
> failover scenario, I killed the Master server and found that the failover 
> process started successfully. The Standby leaves recovery mode and is promoted 
> to primary (read-write). I can even execute write queries on the Standby (new 
> Primary) now.
> 
> Online recovery:
> To take it forward, I want to bring my old primary up and, once up, it should 
> behave as a Standby (read only). One step I know of is to execute 
> basebackup.sh on the new Primary, which will copy its base directory to the old 
> primary's data directory. Then what? I do not have a recovery.conf on the old 
> primary yet. Do I need to put one there? What else do I need to do?
> 
> Thanks
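
On the recovery.conf question: for the old primary to come back as a
standby, it needs a recovery.conf in its data directory once the base
backup has been copied over. A minimal sketch of writing one, for
example at the end of basebackup.sh, using PostgreSQL 9.0 parameter
names. The connection user "pg", the trigger file path and the function
name are assumptions for illustration, not taken from the setup above.

```shell
# write_recovery_conf DATA_DIR PRIMARY_HOST PRIMARY_PORT
# Writes a minimal recovery.conf so that the server started on DATA_DIR
# comes up as a streaming-replication standby of the given primary.
write_recovery_conf() {
  data_dir=$1
  primary_host=$2
  primary_port=$3
  cat > "$data_dir/recovery.conf" <<EOF
standby_mode = 'on'
primary_conninfo = 'host=$primary_host port=$primary_port user=pg'
trigger_file = '$data_dir/trigger'
EOF
}
```

It would be called with the old primary's data directory and the host
and port of the new primary, whatever those are in your installation.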