[pgpool-general: 8552] Re: Issues taking a node out of a cluster

Fri Jan 20 22:37:52 JST 2023

>> Can you elaborate on the motivation behind this? It seems you just want
>> to stop PostgreSQL node 1 and then rename PostgreSQL node 2 to node
>> 1. I don't see any benefit in this for admins/users except avoiding a
>> node 1 "hole" in the configuration file.
>>
> 
> We use pgpool as an integral part of our solution to provide high availability
> to our customers. We recommend that our customers run 3 instances on 3 sites
> for the highest availability. Every instance of our application is a
> virtual machine, running the application, pgpool, postgresql and some other
> components. Sometimes, things break. For example, we've seen a case where
> connectivity to one of the sites was bad. This would cause intermittent
> failures. For this, we offer the option to temporarily disable one of the
> nodes in the cluster. This will take the node out of the cluster,
> preventing the other nodes from trying to communicate with it over the
> unreliable connection. When the issue is fixed, the node can be re-enabled
> and put back into the cluster.
> 
> In the above scenario, the changes to the topology of the cluster are made
> on a live environment and downtime should be reduced to a minimum. 2 of the
> 3 nodes are healthy and capable of handling requests. They will however
> need to be reconfigured to (temporarily) forget about the faulty node. We
> can perform reconfiguration on one node at a time, taking it out of the
> load balancer during this process, thus avoiding any downtime. If, however,
> we need to restart pgpool on all nodes simultaneously, rather than one at a
> time, that would interrupt service.
> 
> Initially, we implemented this feature keeping the indexes of the backends
> in place. So node 0 would only have a backend0 and a backend2, but that
> didn't work. I don't know exactly what the problem was with that setup, as
> this was quite some time ago (is such a configuration even allowed in
> pgpool?). Because that setup did not work, we switched to reindexing the
> backends, making sure we always start at 0 and do not skip any numbers.
> This, however, confuses pgpool during the reconfiguration phase.
> 
> I hope this makes our situation clear.

I still don't see why you can't leave backend1 in "down" status
instead of trying to remove it from the configuration. (If my
understanding is correct, pgpool node1 goes down at the same time,
because node1 and backend1 are on the same virtual machine.)

This way, node0 and node2 can access backend0 and backend2 without
being disturbed by backend1.

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese: http://www.sraoss.co.jp