[Pgpool-general] inconsistency when using two pgpool instances in warm standby?

Marcelo Martins pglists at zeroaccess.org
Fri Dec 12 09:01:21 UTC 2008


Duco,

I have tested the following design, which may give you some ideas.

Basically I had the following:

1. Several appservers that talk to a pgpool VIP

2. Two pgpool servers, one as the master and one as a standby. The two
share a VIP that is controlled and monitored by ucarp.

3. If ucarp can no longer see the VIP alive on the master server, it
brings the VIP online on the standby node and starts pgpool there (see
the ucarp sketch after this list).

So if the network card carrying the VIP fails, or the cable on the
master node fails, ucarp moves the VIP and starts pgpool on the standby
server. The appservers don't care which pgpool they talk to; the only
thing that matters is whether they can reach the VIP.

4. pgpool is configured to talk to two backends. If one fails I get an
email and then perform an online recovery to bring the failed backend
back into the pool (see the pgpool.conf excerpt below). I could have
the online recovery done automatically, but I think it is safer to do
it manually, since you never know what really happened to the failed
node until you log in and investigate.
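
For concreteness, here is roughly what the ucarp wiring looks like.
This is only a minimal sketch: the interface, addresses, password and
script paths are made-up examples, not my production values.

   # master pgpool node; the standby runs the same command with its own
   # --srcip, a higher --advskew and no --preempt, so it only takes
   # over when the master's advertisements stop
   ucarp --interface=eth0 --srcip=192.168.1.11 --vhid=1 --pass=secret \
         --addr=192.168.1.10 --preempt --shutdown \
         --upscript=/usr/local/sbin/vip-up.sh \
         --downscript=/usr/local/sbin/vip-down.sh &

   vip-up.sh (ucarp passes the interface name as $1):
      #!/bin/sh
      /sbin/ip addr add 192.168.1.10/24 dev "$1"
      /usr/local/bin/pgpool -f /usr/local/etc/pgpool.conf

   vip-down.sh:
      #!/bin/sh
      /usr/local/bin/pgpool -f /usr/local/etc/pgpool.conf -m fast stop
      /sbin/ip addr del 192.168.1.10/24 dev "$1"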
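
And the two-backend side of it, as a rough pgpool.conf excerpt. The
hostnames, password and notification script are placeholders;
failover_command is the hook that gets me the email:

   backend_hostname0 = 'db-master'
   backend_port0     = 5432
   backend_weight0   = 1
   backend_hostname1 = 'db-standby'
   backend_port1     = 5432
   backend_weight1   = 1

   health_check_period = 10       # probe the backends every 10 seconds
   health_check_user   = 'pgpool'

   # %d = failed node id, %h = failed host; the script just mails me
   failover_command = '/usr/local/sbin/notify_failover.sh %d %h'

   # online recovery; base_backup.sh is whatever first-stage script
   # you installed on the master
   recovery_user              = 'postgres'
   recovery_password          = 'secret'
   recovery_1st_stage_command = 'base_backup.sh'

The manual recovery is then a single pcp call (positional syntax of the
2.x pcp tools; 20 is a timeout in seconds and the trailing 1 is the id
of the detached node):

   pcp_recovery_node 20 localhost 9898 postgres secret 1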

Hope that helps or gives you some ideas.

-
Marcelo

On Dec 11, 2008, at 8:33, Duco Fijma <duco at fijma.net> wrote:

> Hello,
>
> I'm designing a warm-standby Postgres/pgpool system accepting
> connections from a number of application servers.
>
> Of course, running a single instance of pgpool introduces a single
> point of failure. However, I'm having difficulties seeing how multiple
> instances of pgpool, in general, can have a consistent picture of the
> (failure) state of the "master" database.
>
> For example, consider two appservers "A" and "B", both running an
> instance of pgpool. Additionally, we have a master postgres database
> server and a standby postgres database server.
>
> In that situation, the network connection between "B" and the master
> database server goes down. The pgpool instance on B detects a failure
> and triggers a failover to the standby server, which therefore exits
> recovery mode and starts accepting database connections.
>
> Especially in a design without single points of failure, the network
> connection between appserver "A" and the master database is unlikely
> to fail at the same time. Therefore, from the perspective of
> appserver "A" the master database is not failing at all. Its pgpool
> instance therefore happily continues to proxy database connections to
> the master database.
>
> The net effect is that our system is broken into two independent
> halves. It just takes one database transaction to make these two
> halves into two inconsistent systems :)
>
> I find it difficult to think of a solution to this. Whatever scheme I
> use, the very same failure that caused the database failover in the
> first place can or will also hinder the (direct or indirect)
> communication between the two pgpool instances that would be needed
> to prevent inconsistent failover states.
>
> --
> Duco

