[Pgpool-general] Cannot add node after failure
Tatsuo Ishii
ishii at sraoss.co.jp
Wed Dec 16 08:47:22 UTC 2009
> Hello,
>
> Thanks for your info!
>
> I was able to make some progress with node recovery by using
> pgpool_recovery for both recovery commands.
>
> Recovery works most of the time, but sometimes it fails with
> the following error:
>
> $ pcp_recovery_node -d 90 localhost 9898 postgres ******* 2
> DEBUG: send: tos="R", len=46
> DEBUG: recv: tos="r", len=21, data=AuthenticationOK
> DEBUG: send: tos="D", len=6
> DEBUG: recv: tos="e", len=20, data=recovery failed
> DEBUG: command failed. reason=recovery failed
> BackendError
> DEBUG: send: tos="X", len=4
>
> pgpool log
>
> 2009-12-15 20:10:56 DEBUG: pid 8747: pcp_child: received PCP packet type of service 'M'
> 2009-12-15 20:10:56 DEBUG: pid 8747: pcp_child: salt sent to the client
> 2009-12-15 20:10:56 DEBUG: pid 8747: pcp_child: received PCP packet type of service 'R'
> 2009-12-15 20:10:56 DEBUG: pid 8747: pcp_child: authentication OK
> 2009-12-15 20:10:56 DEBUG: pid 8747: pcp_child: received PCP packet type of service 'O'
> 2009-12-15 20:10:56 DEBUG: pid 8747: pcp_child: start online recovery
> 2009-12-15 20:10:56 LOG: pid 8747: starting recovering node 2
> 2009-12-15 20:10:56 DEBUG: pid 8747: exec_checkpoint: start checkpoint
> 2009-12-15 20:10:56 DEBUG: pid 8747: exec_checkpoint: finish checkpoint
> 2009-12-15 20:10:56 LOG: pid 8747: CHECKPOINT in the 1st stage done
> 2009-12-15 20:10:56 LOG: pid 8747: starting recovery command: "SELECT pgpool_recovery('pgpool_recovery', 'im-pp3', '/usr/local/pgsql/data')"
> 2009-12-15 20:10:56 DEBUG: pid 8747: exec_recovery: start recovery
> 2009-12-15 20:10:56 ERROR: pid 8747: exec_recovery: pgpool_recovery command failed at 1st stage
> 2009-12-15 20:10:56 DEBUG: pid 8747: exec_recovery: finish recovery
> 2009-12-15 20:10:56 DEBUG: pid 8747: pcp_child: received PCP packet type of service 'X'
> 2009-12-15 20:10:56 DEBUG: pid 8747: pcp_child: client disconnecting. close connection
> 2009-12-15 20:11:22 DEBUG: pid 8446: starting health checking
>
> Unfortunately I am not sure what this error means. Did it fail at
> "SELECT pgpool_recovery('pgpool_recovery', 'im-pp3', '/usr/local/pgsql/data')"?
> How can I find the reason?
The recovery command "pgpool_recovery" failed for some reason. Check the
PostgreSQL log on the master node. If the cause is not clear from the log,
try adding -x to the shell line in your pgpool_recovery script, i.e.:
#! /bin/sh -x
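
As a rough sketch of that suggestion (the script body below is hypothetical,
not the actual pgpool_recovery script from this thread): with -x the shell
prints each command to stderr before running it, and that output typically
ends up in the PostgreSQL server log. Sending the trace to a separate file
can make it easier to find. The TRACE_LOG variable, the log path, the main
function, and the argument order (master data directory, target host,
target data directory) are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical recovery script skeleton with tracing enabled.
# The body runs in a subshell function so the stderr redirect
# stays local to it.
main() (
    # Assumed log location; override by setting TRACE_LOG.
    log="${TRACE_LOG:-/tmp/pgpool_recovery_trace.log}"

    # The -x trace is written to stderr, so redirect stderr to the
    # log file first, then turn tracing on (same effect as sh -x).
    exec 2>>"$log"
    set -x

    # Assumed argument order for the recovery script.
    master_data="$1"
    dest_host="$2"
    dest_data="$3"

    # Placeholder for the real recovery steps (base backup, copy, etc.).
    echo "would recover $dest_host:$dest_data from $master_data"
)

main "$@"
```

After a failed recovery, the last traced command in the log file (or in the
PostgreSQL log, if stderr is not redirected) should point at the step that
failed.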
--
Tatsuo Ishii
SRA OSS, Inc. Japan
> Best Regards,
> ---
>
> Fernando Marcelo
> www.consultorpc.com
> fernando at consultorpc.com
>
>
> On 15/12/2009, at 13:36, Jaume Sabater wrote:
>
> > On Tue, Dec 15, 2009 at 4:20 PM, Fernando Morgenstern
> > <fernando at consultorpc.com> wrote:
> >
> >> While reading the pgpool manual I found this:
> >> Note that there is a restriction about online recovery. If pgpool-II
> >> works on multiple hosts, online recovery does not work correctly,
> >> because pgpool-II stops clients on the 2nd stage of online recovery.
> >> If there are some pgpool hosts, pgpool-II excepted for receiving
> >> online recovery request cannot block connections.
> >
> > It means running two or more pgpool-II instances simultaneously, which
> > won't be your case since, with Heartbeat, you'll configure pgpool-II
> > as a resource, hence it will only be active in one node at a given
> > time.
> >
> > --
> > Jaume Sabater
> > http://linuxsilo.net/
> >
> > "Ubi sapientas ibi libertas"
>