[pgpool-general: 7071] Re: Rebooting the cluster?

Anssi Kanninen anssi at iki.fi
Fri Jun 5 03:44:34 JST 2020


On Sat, 30 May 2020, Tatsuo Ishii wrote:

>> Hi again,
>>
>> Thank you for a quick answer.
>>
>> Booting the services on one node is not a problem, but booting the
>> whole cluster is. The boot sequence for all the nodes, one at a time,
>> goes like this:
>>
>> 1st node starts up
>> ==================
>>
>> PgPool:
>> centos8-1v2-int 5432 9000 4 MASTER
>> centos8-2v2-int 5432 9000 0 DEAD
>> centos8-3v2-int 5432 9000 0 DEAD
>>
>> Database:
>>  node_id |    hostname     | port |   status   |  role   | replication_state
>> ---------+-----------------+------+------------+---------+-------------------
>>  0       | centos8-1v2-int | 5433 | up         | primary |
>>  1       | centos8-2v2-int | 5433 | quarantine | standby |
>>  2       | centos8-3v2-int | 5433 | quarantine | standby |
>>
>> => Delegate IP is not up yet because there is no quorum. This is ok.
>>
>> 2nd node starts up
>> ==================
>>
>> PgPool:
>> centos8-1v2-int 5432 9000 4 MASTER
>> centos8-2v2-int 5432 9000 7 STANDBY
>> centos8-3v2-int 5432 9000 0 DEAD
>>
>> At first we are waiting...
>>
>>  node_id |    hostname     | port | status  |  role   | replication_state
>> ---------+-----------------+------+---------+---------+-------------------
>>  0       | centos8-1v2-int | 5433 | up      | primary |
>>  1       | centos8-2v2-int | 5433 | up      | standby |
>>  2       | centos8-3v2-int | 5433 | waiting | standby |
>>
>> And then PgPool detaches the last node:
>>
>>  node_id |    hostname     | port | status |  role   | replication_state
>> ---------+-----------------+------+--------+---------+-------------------
>>  0       | centos8-1v2-int | 5433 | up     | primary |
>>  1       | centos8-2v2-int | 5433 | up     | standby | streaming
>>  2       | centos8-3v2-int | 5433 | down   | standby |
>>
>> => The PgPool service is up and delegate IP is also up.
>>
>> 3rd node starts up
>> ==================
>>
>> centos8-1v2-int 5432 9000 4 MASTER
>> centos8-2v2-int 5432 9000 7 STANDBY
>> centos8-3v2-int 5432 9000 7 STANDBY
>>
>>  node_id |    hostname     | port | status |  role   | replication_state
>> ---------+-----------------+------+--------+---------+-------------------
>>  0       | centos8-1v2-int | 5433 | up     | primary |
>>  1       | centos8-2v2-int | 5433 | up     | standby | streaming
>>  2       | centos8-3v2-int | 5433 | down   | standby |
>>
>> So the 3rd node stays down. I've configured failback.sh to recreate
>> the replication slot on the primary DB on a failback event, so running
>> pcp_attach_node is enough to bring the last node up:
>>
>>  node_id |    hostname     | port | status |  role   | replication_state
>> ---------+-----------------+------+--------+---------+-------------------
>>  0       | centos8-1v2-int | 5433 | up     | primary |
>>  1       | centos8-2v2-int | 5433 | up     | standby | streaming
>>  2       | centos8-3v2-int | 5433 | up     | standby | streaming
>>
>> Otherwise my configuration is pretty much the same as the
>> PgPool+Watchdog example on the PgPool web pages.
>>
>> So, would it be possible to configure this so that just starting up
>> all the nodes is enough to bring the whole cluster online? It might
>> work if I started them all at the same time, but in production that
>> is not always possible.
>
> There's a parameter called "auto_failback" which automatically
> attaches a PostgreSQL standby node if it's safe. auto_failback is
> available in 4.1 or later.

Thank you again for the quick answer!
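
For reference, this is roughly what I enabled in pgpool.conf (a minimal
sketch; it needs 4.1 or later, and the interval value here is only
illustrative, not a recommendation):

  auto_failback = on
  # minimum interval in seconds between automatic failback attempts;
  # pick a value that suits your environment
  auto_failback_interval = 60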

That works, but it requires some changes to the example failover.sh script.
The replication slot is dropped when a standby node goes down, so to get
auto_failback working, the replication slot must either be recreated on
failback or not be dropped at all.
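
In case it helps someone else, the slot handling in my failback.sh looks
roughly like the sketch below. The argument order and connection details
are assumptions; they depend on the %-placeholders you pass in
failback_command and on your own environment:

  #!/bin/bash
  # Sketch: recreate the physical replication slot for the re-attached
  # standby on the current primary, unless it already exists.
  # $1 = host of the node being attached, $2 = primary host (assumed order).
  FAILBACK_NODE_HOST="$1"
  PRIMARY_HOST="$2"
  # slot names may not contain '-' or '.', so map them to '_'
  REPL_SLOT_NAME=${FAILBACK_NODE_HOST//[-.]/_}

  psql -h "${PRIMARY_HOST}" -p 5433 -U postgres -d postgres -c \
    "SELECT pg_create_physical_replication_slot('${REPL_SLOT_NAME}')
       WHERE NOT EXISTS (SELECT 1 FROM pg_replication_slots
                          WHERE slot_name = '${REPL_SLOT_NAME}')"

And if auto_failback does not kick in for some reason, attaching the node
by hand still works, e.g. pcp_attach_node -h <pgpool host> -p 9898 -U
<pcp user> -n 2 (9898 being the default pcp port).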

Cheers,
   - Anssi

-- 
anssi at iki.fi

