[pgpool-general: 7588] Re: Promote specific backend to primary in streaming replication

Tatsuo Ishii ishii at sraoss.co.jp
Mon Jun 7 11:41:23 JST 2021


Hi Nathan,

I have revisited this and am leaning toward the idea of modifying an
existing pcp command so that we can specify which node should be
promoted.

My idea is to reuse the detach-node protocol of the pcp command.
Currently the protocol specifies the node to be detached. I would add
a new request detail flag bit "REQ_DETAIL_PROMOTE". If the flag is
set, the requested node id is treated as the node to be promoted
rather than the node to be detached. In this case the node to be
detached will be the current primary node. The node to be promoted is
passed to a failover script as the existing "new main node" argument
(normally the live node with the smallest node id, excluding the
current primary). Existing failover scripts usually treat the "new
main node" as the node to be promoted.
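To illustrate that convention, here is a minimal failover script
sketch, written as a shell function. pgpool expands failover_command
placeholders such as %d (failed node id), %h (failed node host), %m
(new main node id) and %H (new main node host) into positional
arguments; the argument order and the demo values below are
assumptions for illustration only, and a real script would run an
actual promotion instead of echoing:

```shell
# Sketch of a failover script; the argument order mirrors a
# hypothetical failover_command = 'failover.sh %d %h %m %H'.
failover_sketch() {
    failed_node_id="$1"      # %d: node being detached
    failed_host="$2"         # %h: its host
    new_main_node_id="$3"    # %m: "new main node" id
    new_main_host="$4"       # %H: its host
    echo "detached node: $failed_node_id ($failed_host)"
    echo "promoting node: $new_main_node_id ($new_main_host)"
    # A real script would promote here, e.g.:
    #   ssh postgres@"$new_main_host" pg_ctl -D "$PGDATA" promote
}

# With the PoC patch, "pcp_detach_node -g ... 2" detaches the current
# primary (node 0) and passes node 2 as the "new main node":
failover_sketch 0 /tmp 2 /tmp
```

Because the requested node arrives through the existing "new main
node" argument, a script like this needs no change to promote the
operator-chosen node.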

Attached is a PoC patch for this. After applying the patch, the -g
argument (detach gracefully) of pcp_detach_node sets the
"REQ_DETAIL_PROMOTE" flag, and the node id argument now represents the
node to be promoted.  For example:

# create 4-node cluster.
$ pgpool_setup -n 4

# primary node is 0.
$ psql -p 11000 -c "show pool_nodes" test
 node_id | hostname | port  | status | pg_status | lb_weight |  role   | pg_role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change  
---------+----------+-------+--------+-----------+-----------+---------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
 0       | /tmp     | 11002 | up     | up        | 0.250000  | primary | primary | 0          | false             | 0                 |                   |                        | 2021-06-07 11:02:25
 1       | /tmp     | 11003 | up     | up        | 0.250000  | standby | standby | 0          | false             | 0                 | streaming         | async                  | 2021-06-07 11:02:25
 2       | /tmp     | 11004 | up     | up        | 0.250000  | standby | standby | 0          | true              | 0                 | streaming         | async                  | 2021-06-07 11:02:25
 3       | /tmp     | 11005 | up     | up        | 0.250000  | standby | standby | 0          | false             | 0                 | streaming         | async                  | 2021-06-07 11:02:25
(4 rows)

# promote node 2.
$ pcp_detach_node -g -p 11001 -w 2
$ psql -p 11000 -c "show pool_nodes" test
 node_id | hostname | port  | status | pg_status | lb_weight |  role   | pg_role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change  
---------+----------+-------+--------+-----------+-----------+---------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
 0       | /tmp     | 11002 | up     | up        | 0.250000  | standby | standby | 0          | false             | 0                 | streaming         | async                  | 2021-06-07 11:03:17
 1       | /tmp     | 11003 | up     | up        | 0.250000  | standby | standby | 0          | false             | 0                 | streaming         | async                  | 2021-06-07 11:03:17
 2       | /tmp     | 11004 | up     | up        | 0.250000  | primary | primary | 0          | true              | 0                 |                   |                        | 2021-06-07 11:02:43
 3       | /tmp     | 11005 | up     | up        | 0.250000  | standby | standby | 0          | false             | 0                 | streaming         | async                  | 2021-06-07 11:03:27
(4 rows)

(Note that node 3 may go into down status after pcp_detach_node runs,
but that is due to a separate issue. See [pgpool-hackers: 3915]
"ERROR: failed to process PCP request at the moment" for more
details.)

One of the merits of this method is that existing failover scripts
need not be changed.  (Of course, we could instead modify the existing
pcp_promote_node so that it does the job.)

However, I see difficulties with modifying the existing pcp commands
(either pcp_detach_node or pcp_promote_node). The protocol used by the
pcp commands is very limited in that the only argument it can carry is
the node id. So how does pcp_detach_node implement the -g flag? It
uses two kinds of pcp protocol messages: one for invocations without
-g and another for invocations with -g. (I think we should eventually
overhaul the existing pcp protocol to overcome these limitations, but
that is another story.)
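As a toy model of that limitation: each command variant is a distinct
message type whose only payload is a node id, so any new option
effectively needs its own message type. The type letters below are
illustrative, not the actual protocol bytes:

```shell
# Toy dispatcher mimicking the shape of pcp request handling: the
# type byte selects the operation and the only payload is a node id.
# Adding an option such as "promote this node" therefore means adding
# yet another message type.  (Letters are illustrative only.)
handle_pcp_request() {
    msg_type="$1"
    node_id="$2"
    case "$msg_type" in
        D) echo "detach node $node_id" ;;
        d) echo "gracefully detach node $node_id" ;;
        *) echo "unknown request" ;;
    esac
}

handle_pcp_request d 2   # the only data the message carries is "2"
```

There is no spare field in which to say "and treat this id as the
promotion target", which is why the PoC overloads an existing flag.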

Probably we need to invent a new pcp protocol message and use it for
this purpose.

Comments and suggestions are welcome.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

> Hello!
> 
>> On 7/04/2021, at 12:57 AM, Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
>> 
>> Hi,
>> 
>>> Thanks for your time on this so far!
>>> 
>>> Would it be possible to allow these to be changed at runtime without a config change + reload? Perhaps a more generic pcp command, to allow any “reloadable” configuration item to be changed? I’m not sure how many of these there are, and if any of them require deeper changes. Maybe we need a 3rd “class” of configuration item, to allow it to be changed without changing config.
>> 
>> Not sure. PostgreSQL does not have such a 3rd "class".
> 
> That’s true, though you can modify a special configuration file without editing files directly with ALTER SYSTEM and then reload the configuration - i.e. you can make changes to the running server without editing configuration files directly.
> 
> Pgpool could somewhat mirror postgres functionality here, if there was an include directive, then a reload is required. If we enabled configuration to be added to such a file with pcp, that would be useful, so it could be done from a remote system perhaps?
> 
> Just some thoughts.. these seem like bigger changes, and I’m new here :-)
> 
>>> We (and I am sure many others) use config management tools to keep our config in sync. In particular, we use Puppet. I have a module which I’m hoping to be able to publish soon, in fact.
>>> 
>>> Usually, environments which use config management like this have policies to not modify configuration which is managed.
>>> 
>>> Switching to a different backend would mean we have to manually update that managed configuration (on the pgpool primary, presumably) and reload, and then run pcp_detach_node against the primary backend. In the time between updating the configuration, and detaching the node, our automation might run and undo our changes which would mean we may get an outcome we don’t expect.
>>> An alternative would be to update the configuration in our management system and deploy it - but in many environments making these changes can be quite involved. I don’t think it’s good to require operational changes (i.e. “move the primary function to this backend”) to require configuration updates.
>>> 
>>> A work-around to this would be to have an include directive in the configuration and include a local, non-managed file for these sorts of temporary changes, but pgpool doesn’t have this at the moment I believe?
>> 
>> No, pgpool does not have one. However, I think it would be a good
>> idea to have an "include" feature like PostgreSQL's.
> 
> I would certainly use that. My puppet module writes the configuration data twice:
> 1) In to the main pgpool.conf file.
> 2) In to one of two files, either a “reloadable” config, or a “restart required” config - so that I can use puppet notifications for changes to those files to trigger either a reload or a restart.
> It then does some checks to make sure the pgpool.conf is older than the reloadable/restart required config, etc. etc. - it’s quite messy!
> 
> An include directive would make that all a lot easier.. and of course would enable this use case.
> 
> --
> Nathan Ward
> 
> 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: promote_specified_node.diff
Type: text/x-patch
Size: 4586 bytes
Desc: not available
URL: <http://www.pgpool.net/pipermail/pgpool-general/attachments/20210607/df3705ac/attachment.bin>

