[Pgpool-general] failback setup problem

Uwe Bartels uwe.bartels at gmail.com
Sat Apr 30 09:17:08 UTC 2011


On 30 April 2011 09:44, Tatsuo Ishii <ishii at sraoss.co.jp> wrote:

> > Hi Tatsuo,
> >
> > OK, now it's working fine. Thanks for your help.
> >
> > after getting through that initial setup for the first time, I'd like to
> > give you some feedback about the (at least for me) missing information in
> > the documentation.
> >
> > - please document the control flow during recovery, e.g.
> >     connect to master server with recovery_user/recovery_password
> >       (connection check)
> >     run recovery_1st_stage_command
> >     run checkpoint
> >     run recovery_2nd_stage_command
> >     ...
> >     run failback_command
> >
> > - please document that recovery_1st_stage_command and
> > recovery_2nd_stage_command are system calls made by the current postgres
> > master server in the PGDATA directory, and that failback_command is a
> > shell script command or system call made from the pgpool server. I had
> > to search for this in the source code.
>
> Sorry for inconvenience. I will add info to the docs as you suggested.
>

thanks.
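
For readers of the archive, the control flow and execution locations requested above can be sketched in a few lines of Python. This is only a model of what is described in this thread, not pgpool-II's actual implementation: the "..." step from the list is deliberately left out, the function name run_recovery is hypothetical, and stopping at the first failing step is an assumption.

```python
from typing import Callable, List, Optional, Tuple

# (step, where it runs), as described in this thread. The "..." step in
# the quoted list is intentionally not filled in.
RECOVERY_STEPS: List[Tuple[str, str]] = [
    ("connect with recovery_user/recovery_password", "pgpool, SQL connection to master"),
    ("recovery_1st_stage_command", "postgres master, in PGDATA"),
    ("CHECKPOINT", "pgpool, SQL connection to master"),
    ("recovery_2nd_stage_command", "postgres master, in PGDATA"),
    ("failback_command", "pgpool server"),
]


def run_recovery(run_step: Callable[[str], bool]) -> Tuple[List[str], Optional[str]]:
    """Run the steps in order; return (completed steps, first failed step).

    Assumption: the sequence is abandoned at the first failing step.
    """
    completed: List[str] = []
    for step, _where in RECOVERY_STEPS:
        if not run_step(step):
            return completed, step
        completed.append(step)
    return completed, None


if __name__ == "__main__":
    # Simulate a failing connection check, as in "could not connect master node".
    done, failed = run_recovery(lambda step: not step.startswith("connect"))
    print("completed:", done)
    print("failed at:", failed)
```

In this model a failed connection check means none of the recovery_* commands and no failback_command would run, which matches the symptom reported at the start of the thread.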



>
> > I have a different approach of recovering the postgres server. I'm
> > recovering from an existing backup. I do that because it is faster and I
> > don't put additional i/o load on the just activated server. I guess (or
> > hope) most people will have an existing backup.
> >
> > So my question is whether recovering the failed server via an SQL
> > command is the optimal approach. What if both servers have failed?
> > Would I then be unable to use the pcp tools or pgpoolAdmin for recovery?
>
> I'm not sure what you are trying to do here. If "backup" means it was
> created by pg_dumpall, I don't think your approach works. Streaming
> replication requires a base backup (binary backup), which is managed by
> pg_start_backup/pg_stop_backup.
>

Yes of course I have a backup created as described in
http://www.postgresql.org/docs/9.0/static/continuous-archiving.html#BACKUP-BASE-BACKUP
.
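
As a rough illustration of the base backup procedure referenced above, the sequence could be sketched as follows. This is a minimal sketch, not the script actually used here: the function name take_base_backup, the injected run callable, the rsync exclusions, and the use of psql with default connection settings are all assumptions. The label is interpolated into SQL unescaped, so it must be a trusted value.

```python
from typing import Callable, List


def take_base_backup(run: Callable[[List[str]], None],
                     label: str, pgdata: str, dest: str) -> None:
    """Issue pg_start_backup, copy PGDATA, and always issue pg_stop_backup.

    `run` executes one argv list, e.g. subprocess.check_call.
    """
    run(["psql", "-c", f"SELECT pg_start_backup('{label}');"])
    try:
        # Copy the data directory; WAL files and the pid file are skipped
        # (assumed exclusions, adjust for the actual layout).
        run(["rsync", "-a", "--exclude=pg_xlog/*", "--exclude=postmaster.pid",
             pgdata.rstrip("/") + "/", dest.rstrip("/") + "/"])
    finally:
        # Leave backup mode even if the copy failed.
        run(["psql", "-c", "SELECT pg_stop_backup();"])


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    take_base_backup(lambda argv: print(" ".join(argv)),
                     "nightly", "/var/lib/pgsql/data", "/backup/base")
```

Passing subprocess.check_call as run would execute the commands for real; the paths in the dry run are placeholders.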

My question is: what happens if both servers have failed somehow? Can I
still use pgpoolAdmin to recover a database server?

Best..
Uwe



>
> > I'm asking because I worked for several years as the person responsible
> > for IT production, and I learned a little about how administrators
> > think and work. They are happy if they have a (or better, ONE) defined
> > recovery procedure.
> > Where am I going with this? I'm asking whether it would make sense to
> > recode or reduce the recovery procedure code to one system call, e.g.
> > failback_command.
> > Most people have their backup and restore functionality coded and ready
> > for training and/or disaster. If they could simply use this very same
> > functionality within pgpoolAdmin, that would be great.
> >
> > It might be that I have overlooked something (as before) and this is
> > already possible. If so, please tell me how.
> >
> > Best Regards,
> > Uwe
> >
> >
> > On 26 April 2011 07:34, Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
> >
> >> > thanks, that message already helped. I tried to recover the postgres
> >> > server with the failback_command.
> >>
> >> You are welcome.
> >>
> >> > I wasn't aware of these recovery_* parameters yet.
> >> > So I use the recovery_* parameters for recovering the failed postgres
> >> > server.
> >>
> >> pgpool-II connects to the backend to issue some SQL commands, including
> >> CHECKPOINT. The recovery_* parameters define the user and password for
> >> that connection. Usually they are for the PostgreSQL superuser (postgres).
> >>
> >> > And the failback_command to attach the postgres server into pgpool
> >> > right?
> >>
> >> If you want to do something special, for example mailing to DBA, then
> >> you might want to specify it.  Otherwise you can leave it empty.
> >> --
> >> Tatsuo Ishii
> >> SRA OSS, Inc. Japan
> >> English: http://www.sraoss.co.jp/index_en.php
> >> Japanese: http://www.sraoss.co.jp
> >>
> >> > Best Regards,
> >> > Uwe
> >> >
> >> >
> >> > On 26 April 2011 01:06, Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
> >> >
> >> >> > I'm using pgpool-II 3.0.3 with streaming replication.
> >> >> > I coded the failback scenario/script for the slave server and the
> >> >> > script itself works fine.
> >> >> >
> >> >> > I have now configured the failback script in pgpool.conf, and during
> >> >> > testing an error message comes up:
> >> >> > 2011-04-09 06:42:18 LOG:   pid 16863: starting recovering node 1
> >> >> > 2011-04-09 06:42:18 ERROR: pid 16863: start_recover: could not connect master node.
> >> >> >
> >> >> > [root@adt-web01 pgpool-II-3.0.3]# pcp_node_info 10 adt-web01 9898 postgres postgres 0
> >> >> > adt-db01 5432 1 0.500000
> >> >> > [root@adt-web01 pgpool-II-3.0.3]# pcp_node_info 10 adt-web01 9898 postgres postgres 1
> >> >> > adt-db02 5432 3 0.500000
> >> >> >
> >> >> > pcp commands and pgpoolAdmin report that the master is up and
> >> >> > running, and I'm able to connect to the master directly and through
> >> >> > pgpool.
> >> >> > So what's wrong? So far everything else works fine.
> >> >>
> >> >> Assuming you have set recovery_user and recovery_password correctly,
> >> >> I'm not sure what's going on. IMO, the error message is very rare. It's
> >> >> so rare that there's a bug in the error path, which had not been found
> >> >> for a long time. Can you please try the attached patch? The patch adds a
> >> >> little bit of useful info to the error message above.
> >> >> --
> >> >> Tatsuo Ishii
> >> >> SRA OSS, Inc. Japan
> >> >> English: http://www.sraoss.co.jp/index_en.php
> >> >> Japanese: http://www.sraoss.co.jp
> >> >>
> >>
>
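
For anyone scripting around the pcp_node_info output quoted in this thread, here is a small parsing sketch. It assumes the four-field output format shown above (hostname, port, status, weight). The status meanings in the table are as I understand them from the pgpool-II documentation and should be checked against the installed version. The names NodeInfo and parse_node_info are hypothetical.

```python
from dataclasses import dataclass

# Status codes as I understand the pgpool-II docs (an assumption to verify).
STATUS_TEXT = {
    0: "initializing",
    1: "up, no connections yet",
    2: "up, connections pooled",
    3: "down",
}


@dataclass
class NodeInfo:
    host: str
    port: int
    status: int
    weight: float

    @property
    def is_up(self) -> bool:
        # Statuses 1 and 2 both mean the node is attached and usable.
        return self.status in (1, 2)


def parse_node_info(line: str) -> NodeInfo:
    """Parse one output line of the form 'host port status weight'."""
    host, port, status, weight = line.split()
    return NodeInfo(host, int(port), int(status), float(weight))


if __name__ == "__main__":
    for line in ("adt-db01 5432 1 0.500000", "adt-db02 5432 3 0.500000"):
        node = parse_node_info(line)
        print(node.host, STATUS_TEXT.get(node.status, "unknown"))
```

Read this way, the output in the thread says that adt-db01 is attached and adt-db02 is detached, which fits a master that is running while node 1 awaits recovery.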