[pgpool-general: 2722] Re: Using one pgpool instance per app server?

Tue Apr 8 08:11:58 JST 2014

>> Yes. Also I recommend to enable watchdog. In this case watchdog acts
>> as just a coordinator for some operations which need interlocking:
>> failover script and online recovery script.
> 
> I guess I don't understand the need for watchdog. My hope was that each
> pgpool instance would independently be able to figure out that the
> master DB had failed and would start to use the slave DB instead. Is
> that not the case?
> 
> To clarify, the suggested architecture is like this:
> 
> 
>            +---- appserver 1 --- pgpool 1 ---+
>            |                                 +--- Master DB
>     LB ----+---- appserver 2 --- pgpool 2 ---+
>            |                                 +--- Slave DB
>            +---- appserver 3 --- pgpool 3 ---+
> 
> Where "appserver 1" is on the same machine as "pgpool 1", "appserver 2"
> on the same as "pgpool 2", and so on.
> 
> If the master DB fails, should not all pgpool instances independently
> from each other detect this and switch over to the slave DB? What would
> I need to use watchdog for?

Yes, if the master DB fails, all pgpool instances will eventually detect
it. The problem is that every pgpool instance then tries to run the
failover script, which may or may not work depending on how the script
is written. For example, pgpool1 promotes the Slave DB to master, then
pgpool2 tries to promote it as well and fails if the script does not
expect that the PostgreSQL it is promoting is no longer a standby.
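
To illustrate the kind of script logic that tolerates being run by
several pgpool instances, here is a minimal Python sketch. It is not
pgpool's own failover script. The function promote_if_standby and the
FakeStandby class are hypothetical names used only for this example. In
a real script, the check might be done with PostgreSQL's
"SELECT pg_is_in_recovery()" and the promotion with "pg_ctl promote";
here both are replaced by an in-memory stand-in so the example runs
without a database.

```python
from typing import Callable


def promote_if_standby(is_in_recovery: Callable[[], bool],
                       promote: Callable[[], None]) -> str:
    """Promote the node only if it is still a standby.

    Both callables are assumptions of this sketch: a real script could
    back them with `SELECT pg_is_in_recovery()` and `pg_ctl promote`.
    """
    if not is_in_recovery():
        # Another pgpool already promoted this node; do nothing.
        return "skipped"
    promote()
    return "promoted"


class FakeStandby:
    """In-memory stand-in for the Slave DB (hypothetical, for the demo)."""

    def __init__(self) -> None:
        self.in_recovery = True
        self.promotions = 0

    def is_in_recovery(self) -> bool:
        return self.in_recovery

    def promote(self) -> None:
        self.promotions += 1
        self.in_recovery = False


node = FakeStandby()
# pgpool1, pgpool2 and pgpool3 each run the failover logic in turn.
results = [promote_if_standby(node.is_in_recovery, node.promote)
           for _ in ("pgpool1", "pgpool2", "pgpool3")]
print(results)  # ['promoted', 'skipped', 'skipped']
```

Note that the check and the promotion are not atomic in this sketch, so
two pgpool instances running the script at exactly the same time could
still both attempt the promotion. That is one reason why interlocking
between the pgpool instances is useful.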

A more serious problem is that on a flaky network (switch), pgpool1 may
decide that the Master DB has gone down because of a connection problem
between pgpool1 and the Master DB (the Master DB is actually still up),
and promote the Slave DB to master. Meanwhile, the connection between
pgpool2 and the Master DB is healthy, so pgpool2 keeps sending update
requests to the Master DB. As a result, both the Master DB and the Slave
DB are now independent master DBs (we call this "split brain").

This kind of problem can be solved by watchdog: the watchdog processes on
pgpool[123] communicate with each other to avoid the problems described
above. In the scenario with the connection problem between pgpool1 and
the Master DB, pgpool1 initiates failover and then makes pgpool2 and
pgpool3 recognize that the Master DB is down.
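
The difference between the two behaviours can be shown with a toy model
in Python. This is only a sketch of the idea, not how watchdog is
implemented: the Pgpool class and the two functions are hypothetical
names, and "coordination" is reduced to one shared decision that every
instance follows.

```python
from dataclasses import dataclass


@dataclass
class Pgpool:
    name: str
    can_reach_master: bool       # this instance's own view of the Master DB
    write_target: str = "master"


def uncoordinated_failover(pools):
    """Each pgpool acts only on its own view of the Master DB."""
    for p in pools:
        if not p.can_reach_master:
            # This instance alone promotes the Slave DB and switches to it.
            p.write_target = "slave"
    return {p.write_target for p in pools}


def coordinated_failover(pools):
    """Once any pgpool starts failover, all instances are told to switch."""
    if any(not p.can_reach_master for p in pools):
        for p in pools:
            p.write_target = "slave"
    return {p.write_target for p in pools}


def make_pools():
    # Flaky network: only pgpool1 has lost its connection to the Master DB.
    return [Pgpool("pgpool1", False),
            Pgpool("pgpool2", True),
            Pgpool("pgpool3", True)]


split = uncoordinated_failover(make_pools())
agreed = coordinated_failover(make_pools())
print(sorted(split))   # ['master', 'slave']  -> writes go to two masters
print(sorted(agreed))  # ['slave']            -> one write target for all
```

In the uncoordinated case, updates end up on both databases, which is
the split brain described above. In the coordinated case, all instances
send updates to the same database.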

> Now imagine that replication between master and slave DB is handled by
> those pgpool instances as well. Would this still be possible?

What do you mean by "handled"? Do you mean pgpool's native replication
(without using PostgreSQL's streaming replication)?

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp