[Pgpool-general] Cannot add node after failure

Fernando Morgenstern fernando at consultorpc.com
Tue Dec 15 19:27:38 UTC 2009


Hello,

Thanks for your info!

I was able to make some progress with node recovery by using
pgpool_recovery for both recovery commands.

Recovery works most of the time, but sometimes it fails with
the following error:

$ pcp_recovery_node  -d 90 localhost 9898 postgres ******* 2
DEBUG: send: tos="R", len=46
DEBUG: recv: tos="r", len=21, data=AuthenticationOK
DEBUG: send: tos="D", len=6
DEBUG: recv: tos="e", len=20, data=recovery failed
DEBUG: command failed. reason=recovery failed
BackendError
DEBUG: send: tos="X", len=4

pgpool log:

2009-12-15 20:10:56 DEBUG: pid 8747: pcp_child: received PCP packet type of service 'M'
2009-12-15 20:10:56 DEBUG: pid 8747: pcp_child: salt sent to the client
2009-12-15 20:10:56 DEBUG: pid 8747: pcp_child: received PCP packet type of service 'R'
2009-12-15 20:10:56 DEBUG: pid 8747: pcp_child: authentication OK
2009-12-15 20:10:56 DEBUG: pid 8747: pcp_child: received PCP packet type of service 'O'
2009-12-15 20:10:56 DEBUG: pid 8747: pcp_child: start online recovery
2009-12-15 20:10:56 LOG:   pid 8747: starting recovering node 2
2009-12-15 20:10:56 DEBUG: pid 8747: exec_checkpoint: start checkpoint
2009-12-15 20:10:56 DEBUG: pid 8747: exec_checkpoint: finish checkpoint
2009-12-15 20:10:56 LOG:   pid 8747: CHECKPOINT in the 1st stage done
2009-12-15 20:10:56 LOG:   pid 8747: starting recovery command: "SELECT pgpool_recovery('pgpool_recovery', 'im-pp3', '/usr/local/pgsql/data')"
2009-12-15 20:10:56 DEBUG: pid 8747: exec_recovery: start recovery
2009-12-15 20:10:56 ERROR: pid 8747: exec_recovery: pgpool_recovery command failed at 1st stage
2009-12-15 20:10:56 DEBUG: pid 8747: exec_recovery: finish recovery
2009-12-15 20:10:56 DEBUG: pid 8747: pcp_child: received PCP packet type of service 'X'
2009-12-15 20:10:56 DEBUG: pid 8747: pcp_child: client disconnecting. close connection
2009-12-15 20:11:22 DEBUG: pid 8446: starting health checking
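
To isolate the failing step in a log like the one above, a small filter
can help. The following is a minimal Python sketch, not a pgpool tool:
the function names parse_entries and errors_for_pid are hypothetical,
and the line format is assumed from the excerpt above (timestamp, level,
"pid N:", message, with long entries possibly wrapped onto a following
line).

```python
import re

# Start of a log entry as it appears in the excerpt, for example:
# "2009-12-15 20:10:56 ERROR: pid 8747: message"
# (assumed format; adjust the pattern if your log differs)
ENTRY = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (\w+):\s+pid (\d+): (.*)$"
)


def parse_entries(lines):
    """Join wrapped lines and return (timestamp, level, pid, message) tuples."""
    entries = []
    for raw in lines:
        line = raw.rstrip()
        match = ENTRY.match(line)
        if match:
            ts, level, pid, msg = match.groups()
            entries.append([ts, level, int(pid), msg])
        elif entries and line:
            # Not a new entry: treat it as the wrapped tail of the previous one.
            entries[-1][3] += " " + line.strip()
    return [tuple(entry) for entry in entries]


def errors_for_pid(entries, pid):
    """Keep only ERROR entries written by the given pgpool process."""
    return [e for e in entries if e[2] == pid and e[1] == "ERROR"]


# Sample input copied from the log excerpt, including one wrapped entry.
sample = """\
2009-12-15 20:10:56 DEBUG: pid 8747: exec_recovery: start recovery
2009-12-15 20:10:56 ERROR: pid 8747: exec_recovery: pgpool_recovery
command failed at 1st stage
2009-12-15 20:11:22 DEBUG: pid 8446: starting health checking
""".splitlines()

for ts, level, pid, msg in errors_for_pid(parse_entries(sample), 8747):
    print(ts, msg)
```

This only narrows down which pgpool step reported the error; it does not
reveal why the recovery command itself failed.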

Unfortunately, I am not sure what this error means. Did it fail at
"SELECT pgpool_recovery('pgpool_recovery', 'im-pp3', '/usr/local/pgsql/data')"?
How can I find the reason?

Best Regards,
---

Fernando Marcelo
www.consultorpc.com
fernando at consultorpc.com


On 15/12/2009, at 13:36, Jaume Sabater wrote:

> On Tue, Dec 15, 2009 at 4:20 PM, Fernando Morgenstern
> <fernando at consultorpc.com> wrote:
>
>> While reading the pgpool manual I found this:
>> Note that there is a restriction about online recovery. If pgpool-II works
>> on multiple hosts, online recovery does not work correctly, because
>> pgpool-II stops clients on the 2nd stage of online recovery. If there are
>> some pgpool hosts, pgpool-II excepted for receiving online recovery request
>> cannot block connections.
>
> That restriction refers to running two or more pgpool-II instances
> simultaneously. That won't be your case: with Heartbeat you'll configure
> pgpool-II as a resource, so it will only be active on one node at any
> given time.
>
> -- 
> Jaume Sabater
> http://linuxsilo.net/
>
> "Ubi sapientas ibi libertas"


