[pgpool-general: 5197] Re: Architecture Questions

Fri Dec 23 04:53:45 JST 2016

The core problem here is that PGPool wasn’t designed to do what you are trying to do. That said, because you specify the commands or scripts PGPool calls to execute a failover once it detects a failure of the primary, you can write a script that does NOT execute the promote if the cause is a network partition rather than a primary database failure (although PGPool would still fail over app connections to what it thinks is the promoted standby database).  But the question remains:  How would that script know the issue is a network partition rather than a failed primary?  And at that point, why introduce the overhead and complexity of PGPool at all?
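To make the "how would the script know" question concrete, here is a minimal sketch of one possible approach: the script invoked through failover_command asks a few witness hosts outside both regions whether they can still see the primary, and promotes only on a strict majority of "down" votes. The witness hosts, the should_promote function and the primary_seen_from probe are all assumptions for illustration; nothing like this ships with PGPool, and it does not stop PGPool from redirecting app connections.

```python
from typing import Callable, Optional, Sequence


def should_promote(
    witnesses: Sequence[str],
    primary_seen_from: Callable[[str], Optional[bool]],
) -> bool:
    """Decide whether the failover script should promote the standby.

    primary_seen_from(witness) is a hypothetical probe that returns
    True  -> that witness can still reach the primary,
    False -> that witness cannot reach the primary,
    None  -> the witness itself could not be contacted.
    """
    if not witnesses:
        # No outside view at all: assume a partition and do not promote.
        return False
    down_votes = sum(1 for w in witnesses if primary_seen_from(w) is False)
    # Require a strict majority of ALL witnesses; unreachable witnesses
    # count against promotion, since they suggest we are the isolated side.
    return down_votes * 2 > len(witnesses)
```

With three witnesses, two "cannot reach the primary" answers are enough to promote, while a node that cannot contact any witness stays put. The hard part, which this sketch does not solve, is placing the witnesses so that they stay reachable when the East/West link is down.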

Postgres-BDR + HAProxy would be your best bet, but if you just *want* to use PGPool for some reason, you can certainly circumvent everything it does in streaming replication mode to try to make it work for this use case.  Instead of streaming replication mode, you might want to look at PGPool's built-in replication mode (it allows you to write to multiple masters, so you could have one in each region).  I suspect the write overhead and latency this incurs (particularly over the WAN) will not yield very good performance, but it IS more likely to work in your scenario without creating a split-brain than trying to use PGPool with streaming replication.

Hope this helps…good luck!

David Sisk
Engineer - Software
dsisk at cisco.com

Cisco Systems, Inc.
7025-6 Kit Creek Road PO Box 14987
RESEARCH TRIANGLE PARK
27709-4987
United States
cisco.com

From: pgpool-general-bounces at pgpool.net [mailto:pgpool-general-bounces at pgpool.net] On Behalf Of Yates, James C. -ND
Sent: Wednesday, December 21, 2016 5:12 PM
To: Muhammad Usama <m.usama at gmail.com>
Cc: pgpool-general at pgpool.net
Subject: [pgpool-general: 5191] Re: Architecture Questions

I would prefer to be in read-only mode if the West/East link goes down. The app opens two connections, one for reads and one for writes, and it will trap any write errors and continue.  The East application server will connect to the first East database server it can reach and depend on PgPool to redirect any write traffic. Likewise, the West app servers will connect to the first West database server they can reach.  The idea is that we would not fail over across regions unless an entire region goes down.
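A rough sketch of that client-side behaviour, under stated assumptions: first_reachable and try_write are hypothetical helper names, the connect callable stands in for whatever driver the app uses, and connection failures are assumed to surface as OSError (a real PostgreSQL driver raises its own exception type, which would need to be caught instead).

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def first_reachable(hosts: Sequence[str], connect: Callable[[str], T]) -> Optional[T]:
    """Return a connection to the first host in the region-local list that accepts one."""
    for host in hosts:
        try:
            return connect(host)
        except OSError:
            # Assumed failure type; move on to the next server in this region.
            continue
    # Nothing in this region answered; the caller decides what to do next.
    return None


def try_write(write: Callable[[], None]) -> bool:
    """Run a write, trap any error, and report success so the app can continue."""
    try:
        write()
        return True
    except Exception:
        # Effectively read-only for now: the write is dropped, reads keep working.
        return False
```

The host list passed in would contain only same-region servers, so a broken inter-region link degrades the app to read-only rather than triggering a cross-region reconnect.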

On Dec 21, 2016, at 1:56 PM, Muhammad Usama <m.usama at gmail.com> wrote:

On Tue, Dec 20, 2016 at 10:24 PM, Yates, James C. -ND <James.C.Yates.-ND at disney.com> wrote:

What I’m concerned about is the corporate network link between the AWS regions going down and PgPool doing a failover.  Then the East and West regions could both think they have a master, each accepting updates/inserts.  When the link comes back up, my databases would be out of sync and I could lose data.

Yes, this is a valid point. Earlier I thought you were concerned about the watchdog going into split-brain, which is not likely to happen, especially in active-active watchdog configurations.
Unfortunately, I can't think of any perfect solution at the moment for backend failover when the network partitions between the East and West regions. Can you please explain the availability requirements of your application? For example, when the regions are isolated from each other, is it acceptable for the region that does not have the primary PostgreSQL to stay offline until the link is restored?

Regards
Muhammad Usama
