[pgpool-general: 4512] Re: pgpool unable to attach a node during failback command

Tatsuo Ishii ishii at postgresql.org
Wed Mar 2 07:43:28 JST 2016


> I'm using online_recovery *only* for full copy purposes - if a node gets
> corrupted. pcp_recovery_node will execute the recovery_1st_stage command,
> which uses pg_basebackup (full copy).

Why do you use pg_basebackup here if rsync is faster?
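
A minimal sketch of what an rsync-based copy could look like if it were
run from the online recovery script, assuming a PostgreSQL 9.x server
where pg_start_backup() and pg_stop_backup() are available. It is written
in Python for illustration and only builds the commands as a dry-run
plan; the function name, paths, and host name are hypothetical and not
taken from any pgpool script.

```python
import re


def rsync_copy_plan(primary_pgdata, standby_host, standby_pgdata,
                    label="online_recovery"):
    """Return the ordered commands (as argv lists) for an rsync-based copy.

    Nothing is executed here; this is a dry-run plan only.
    All names and paths are illustrative assumptions.
    """
    # Keep the label simple so it can be embedded safely in the SQL string.
    if not re.fullmatch(r"[A-Za-z0-9_]+", label):
        raise ValueError("label must contain only letters, digits, underscore")

    start_sql = "SELECT pg_start_backup('%s', true)" % label
    stop_sql = "SELECT pg_stop_backup()"

    return [
        # 1. Put the primary into backup mode before copying any files.
        ["psql", "-d", "postgres", "-c", start_sql],
        # 2. Copy the data directory, skipping files specific to the
        #    running primary instance.
        ["rsync", "-a", "--delete",
         "--exclude=postmaster.pid", "--exclude=postmaster.opts",
         primary_pgdata.rstrip("/") + "/",
         "%s:%s/" % (standby_host, standby_pgdata.rstrip("/"))],
        # 3. Leave backup mode once the copy has finished.
        ["psql", "-d", "postgres", "-c", stop_sql],
    ]


if __name__ == "__main__":
    # Print the plan for inspection (not shell-quoted, display only).
    for argv in rsync_copy_plan("/var/lib/pgsql/data", "standby1",
                                "/var/lib/pgsql/data"):
        print(" ".join(argv))
```

Creating recovery.conf and starting the standby would follow as separate
steps after this plan.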

> but for switchover between the nodes (*none* of the nodes gets corrupted) -
> just for *switch roles:*
> *1. shutdown the primary.*
> *2. pgpool promotes the secondary.*
> *3. perform pcp_attach_node (old primary) which calls the failback.sh:*
>       the failback.sh does exactly what you describe:
>          a. pg_start_backup()
>          b. rsync (should be fast)
>          c. pg_stop_backup()
>          d. creates recovery.conf
>          e. start the node.

As I said before, this will not work because pcp_attach_node assumes
the target node (in your case the old primary) is already up and running
as a standby and in sync with the new primary. You should use online
recovery if you want to turn the old primary into the new
standby.
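
For illustration, a minimal sketch that builds a pcp_recovery_node
command line, assuming the pgpool-II 3.5 option syntax (-h, -p, -U, -w,
-n); older releases take positional arguments instead. The host
10.10.61.99, port 9898, and node id 0 are the values that appear in the
pcp_attach_node call quoted below; the function name and the user name
are hypothetical placeholders.

```python
def pcp_recovery_node_argv(node_id, pcp_host, pcp_port=9898,
                           pcp_user="pcpuser"):
    """Build the argv list for pcp_recovery_node (nothing is executed).

    Assumes the pgpool-II 3.5+ option style; pcp_user is a placeholder.
    """
    if node_id < 0:
        raise ValueError("node_id must be non-negative")
    return [
        "pcp_recovery_node",
        "-h", pcp_host,        # host where pgpool's PCP port listens
        "-p", str(pcp_port),   # PCP port
        "-U", pcp_user,        # PCP user name
        "-w",                  # do not prompt; password comes from .pcppass
        "-n", str(node_id),    # backend node to recover
    ]


if __name__ == "__main__":
    print(" ".join(pcp_recovery_node_argv(0, "10.10.61.99")))
```

Unlike pcp_attach_node, this asks pgpool to run the configured recovery
commands and bring the node up as a standby before attaching it.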

> any ideas or comments??
> 
> 
> Thanks,
> cohavisi
> 
> 
> On Tue, Mar 1, 2016 at 3:37 PM, Tatsuo Ishii <ishii at postgresql.org> wrote:
> 
>> > OK....Thanks!
>> >
>> > I'm trying to implement a failover/failback on the nodes:
>> > 1. primary node goes down.
>> > 2. pgpool promotes the secondary node - makes it primary.
>> > 3. by attaching the failed node (old primary) - the failback.sh is called
>> > and recovers the failed node (using rsync - much faster) and makes it an
>> > online secondary!
>>
>> I don't know what failback.sh is doing but if it just runs rsync, it's
>> not safe.  You should use pg_start_backup()/pg_stop_backup().
>>
>> BTW, if rsync is much faster for you, why don't you use it for online
>> recovery as well?
>>
>> > from what you are saying...
>> > just to make sure, I *can not* use the failback.sh script (which is called
>> > by pcp_attach_node) in order to "recover" the node and make it online (as
>> > secondary).
>>
>> Ok, but failback.sh is not supposed to do what you want.
>> I recommend you look into the follow master command.
>>
>> Best regards,
>> --
>> Tatsuo Ishii
>> SRA OSS, Inc. Japan
>> English: http://www.sraoss.co.jp/index_en.php
>> Japanese:http://www.sraoss.co.jp
>>
>> > Thanks,
>> > cohavisi
>> >
>> > On Tue, Mar 1, 2016 at 2:15 PM, Tatsuo Ishii <ishii at postgresql.org> wrote:
>> >
>> >> I'm not sure what you want to do (I'm especially confused by
>> >> "secondary": what does it mean?). Have you taken a look at the follow
>> >> master script?
>> >>
>> >> Anyway...
>> >>
>> >> pcp_attach_node should be used when the PostgreSQL server is already
>> >> online and ready to use, not for recovering a PostgreSQL server.
>> >>
>> >> Best regards,
>> >> --
>> >> Tatsuo Ishii
>> >> SRA OSS, Inc. Japan
>> >> English: http://www.sraoss.co.jp/index_en.php
>> >> Japanese:http://www.sraoss.co.jp
>> >>
>> >> > Hi,
>> >> > Thanks for your reply...
>> >> > I do use online recovery in case a full recovery is needed (using
>> >> > pg_basebackup - via pcp_recovery_node).
>> >> > but I added an ability to perform a switchover between the nodes using
>> >> > stop/detach primary - failover occurs and I reattach it as secondary
>> >> > (using the failback script).
>> >> > but when the failback finishes, pgpool does not attach it as secondary!!
>> >> >
>> >> >
>> >> > Can you please advise?
>> >> >
>> >> > cohavisi
>> >> >
>> >> >
>> >> > On Tue, Mar 1, 2016 at 10:41 AM, Tatsuo Ishii <ishii at postgresql.org> wrote:
>> >> >
>> >> >> You should use online recovery instead of pcp_attach_node.
>> >> >>
>> >> >> Best regards,
>> >> >> --
>> >> >> Tatsuo Ishii
>> >> >> SRA OSS, Inc. Japan
>> >> >> English: http://www.sraoss.co.jp/index_en.php
>> >> >> Japanese:http://www.sraoss.co.jp
>> >> >>
>> >> >> > Hi,
>> >> >> > I have a huge problem attaching a node (as secondary) to the pool
>> >> >> > after performing pcp_attach_node.
>> >> >> >
>> >> >> > after failover has completed successfully and a valid primary node is
>> >> >> > active, I'm performing a *pcp_attach (via sql)* on the faulty node in
>> >> >> > order to fail it back as secondary!
>> >> >> >
>> >> >> > *select pcp_attach_node (0,'10.10.61.99',1200,9898,'*****','*****') *
>> >> >> >
>> >> >> > during this command, a failback script is executed and performs the
>> >> >> > following:
>> >> >> > 1. rsync between the DB nodes.
>> >> >> > 2. create recovery.conf.
>> >> >> > 3. startup the node(as secondary).
>> >> >> >
>> >> >> > *the failback can take 20 min to finish.*
>> >> >> >
>> >> >> > after the failback finished *successfully* (exit status 0) and the node
>> >> >> > started as *secondary* (according to postgres) - streaming replication -
>> >> >> >
>> >> >> > *pgpool reports the node status going from 1 to 3 (instead of 2).*
>> >> >> >
>> >> >> > *** when the failback finishes early (in less than a few min) pgpool
>> >> >> > reports the node status as 2 - as expected.*
>> >> >> >
>> >> >> >
>> >> >> > please advise regarding this issue...
>> >> >> >
>> >> >> >
>> >> >> > *Thanks,*
>> >> >> > *cohavisi*
>> >> >>
>> >>
>>

