[pgpool-general: 7857] Re: Rejecting database mutations without a quorum

Emond Papegaaij emond.papegaaij at gmail.com
Sat Nov 6 01:13:11 JST 2021


Hi Luca,

I'm aware our setup is somewhat atypical. The reason for this setup is that
we ship our application as a virtual appliance that's supposed to be a
single deployment. Running in a cluster means deploying the same appliance
3 times. Things like a VIP work very nicely in most situations, but for us
it's really not an option, as the cluster setup is supposed to be an internal
detail from the perspective of the customer.

We are thinking about implementing a safeguard that monitors the local
pgpool watchdog and shuts down the database, pgpool and the application in
case it reports having lost quorum. However, this requires additional
monitoring and polling, and it would be much more reliable if pgpool could
initiate this itself.
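
For illustration, a minimal sketch of what we have in mind (the pcp port,
user and systemd unit names below are placeholders, and it assumes a
pgpool-II 4.x pcp_watchdog_info -v output containing a "Quorum state"
line):

  #!/usr/bin/env python3
  # Safeguard sketch: poll the local pgpool watchdog and stop the local
  # stack (application, pgpool, database) as soon as quorum is lost.
  # Assumptions: pcp reachable on localhost:9898 with a .pcppass entry
  # (hence -w), and hypothetical systemd unit names for the services.

  import subprocess
  import time

  PCP_CMD = ["pcp_watchdog_info", "-h", "localhost", "-p", "9898",
             "-U", "pgpool", "-w", "-v"]
  SERVICES = ["myapp", "pgpool2", "postgresql"]  # placeholder unit names
  POLL_SECONDS = 5

  def quorum_exists() -> bool:
      # A failing pcp call is treated as lost quorum, to err on the
      # safe side rather than risk serving a split brain.
      result = subprocess.run(PCP_CMD, capture_output=True, text=True)
      if result.returncode != 0:
          return False
      for line in result.stdout.splitlines():
          if line.startswith("Quorum state"):
              return "QUORUM EXIST" in line
      return False

  def shutdown_stack() -> None:
      # Stop the application first so no new writes arrive while
      # pgpool and the database are going down.
      for service in SERVICES:
          subprocess.run(["systemctl", "stop", service], check=False)

  if __name__ == "__main__":
      while True:
          if not quorum_exists():
              shutdown_stack()
              break
          time.sleep(POLL_SECONDS)

This still polls, of course: there remains a window between losing quorum
and the next poll, which is exactly why having pgpool initiate the shutdown
(or invoke a hook) itself would be more reliable.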

Best regards,
Emond

On Fri, Nov 5, 2021 at 4:53 PM Luca Maranzano <liuk001 at gmail.com> wrote:

> Hi,
> IMHO the application should not run on the same server as the database,
> but in any case the application should connect to the VIP managed by the
> pgpool cluster, not to the local pgpool.
> The case is interesting anyway, because there is no way for the primary
> Postgres to kill itself (à la STONITH) in case of a network partition.
>
> Regards,
> Luca
>
> On Fri, Nov 5, 2021 at 4:26 PM Emond Papegaaij <emond.papegaaij at gmail.com>
> wrote:
>
>> Hi,
>>
>> While running various cluster failover and recovery tests, we came across
>> an issue that I would like to discuss here. These tests were performed in a
>> setup with 3 nodes (for simplicity called node 1, 2 and 3). Each node runs
>> a database, pgpool and an application instance. The application connects to
>> the local pgpool, which in turn connects to all 3 databases, sending all
>> queries to the one that is currently primary. Suppose node 1 runs the
>> primary database. When this node is disconnected from the other 2 nodes via
>> a simulated network failure, the other nodes establish consensus to perform
>> a failover and either node 2 or 3 is selected as the new primary. However,
>> on node 1, the database remains writable and the application and pgpool keep
>> running. If the application on this node is still reachable from the load
>> balancer, it will continue to serve requests, resulting in a split brain
>> and ultimately database corruption.
>>
>> For many of our customers this is unwanted behavior. They would rather
>> see the service become unavailable than continue to operate in a split
>> brain. I went through the available options on pgpool, but could not find
>> an option that would help me here. I'm looking for a way to prevent pgpool
>> from accessing its backends when its watchdog is not part of a quorum. Is
>> this currently possible in pgpool? If not, is it worth considering adding a
>> feature for this?
>>
>> Best regards,
>> Emond Papegaaij
>> _______________________________________________
>> pgpool-general mailing list
>> pgpool-general at pgpool.net
>> http://www.pgpool.net/mailman/listinfo/pgpool-general
>>
>