<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Apr 5, 2021 at 6:13 PM Tatsuo Ishii <<a href="mailto:ishii@sraoss.co.jp">ishii@sraoss.co.jp</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">> Ah - I had not thought of this option. However, doesn't failover_command<br>

> get told what the new primary will be, rather than being able to decide<br>

> this itself (%m %H %r %R)?<br>

<br>

No. Actually failover_command always decides what the new primary will<br>

be.<br>

<br>

"New main node" means the live node (not in down status) which has the<br>

youngest node id. Usually the script (including our sample script) just chooses the new<br>

main node as the candidate for the new primary because it's the easiest<br>

way (otherwise the script must check whether the new primary candidate is<br>

actually up or not).<br></blockquote><div><br></div><div>Oh! I see! So, does this mean that pgpool checks pg_is_in_recovery() to find out which node is primary after failover_command has run?</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">

>> With this change you can do:<br>

>><br>

>> 1) explicitly specify the new primary node<br>

>><br>

>> 2) run pcp_detach_node against the current primary and the current<br>

>>    primary node goes into down status<br>

>><br>

>> 3) failover_command will run to make the specified standby node the<br>

>>    new primary<br>

>><br>

>> 4) follow_primary_command will run on the rest of the standbys and the old<br>

>>    primary to follow the new primary<br>

> <br>

> <br>

> I think this would work yes - which is very close to what I was suggesting<br>

> with a flag on pcp_promote_node. If this requires 2 commands<br>

> (pcp_set_next_primary or something, then pcp_detach_node) I would wrap them<br>

> in a single script which runs both in order to keep things simple and fast<br>

> for our operations team.<br>

<br>

Wrapping pcp_set_next_primary (or something like it) and pcp_detach_node looks<br>

like a nice idea because it's flexible. You could even look for a node which<br>

has the least replication delay and let the node be the next primary.<br></blockquote><div><br></div><div>OK - this sounds great. I can have a stab at getting something like that written.</div><div><br></div><div>So, in terms of implementation, this sounds like:</div><div>1) A new pcp command to set the next primary backend ID. I'm not sure how this would be stored. Perhaps it could update the parsed configuration state (is this pool_config?), so that it could also be set in the configuration file? Or it could update a new flag on bkinfo?</div><div>2) Modify get_next_main_node() to check for the above, and to check that it is a valid backend and not the current (failed/detached) primary backend.</div><div><br></div><div>Is modifying get_next_main_node() appropriate here? My thinking is that doing it this way means we can continue to use the existing scripts, which use %m, %H and so on.</div><div>Perhaps this is best implemented as a backend "primary-ship" priority, i.e. set a priority for each backend (via config and pcp) to become the primary, stored as a uint8 on bkinfo; get_next_main_node() would then sort first by priority, then by node id.</div><div><br></div><div>--</div><div>Nathan Ward</div><div><br></div></div></div>
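<div><br></div><div>P.S. To make the priority idea concrete, here is a rough sketch of the selection logic I have in mind. It is Python rather than pgpool's C, and it is not pgpool's actual get_next_main_node(): the Backend class, its priority field, the "higher priority wins" rule and the -1 return value are all assumptions for illustration only.</div>
<pre>
```python
from dataclasses import dataclass


@dataclass
class Backend:
    """Hypothetical stand-in for a per-backend entry (not the real bkinfo)."""
    node_id: int
    up: bool            # False when the node is in down status
    priority: int = 0   # assumed "primary-ship" priority; higher wins


def get_next_main_node(backends, failed_primary_id):
    """Return the node id of the preferred live backend, or -1 if none.

    Candidates are live backends other than the failed/detached primary,
    ordered by priority (highest first), then by node id (lowest first).
    """
    candidates = [
        b for b in backends
        if b.up and b.node_id != failed_primary_id
    ]
    if not candidates:
        return -1
    best = min(candidates, key=lambda b: (-b.priority, b.node_id))
    return best.node_id
```
</pre>
<div>With every priority left at 0 this falls back to the lowest live node id, so existing scripts that rely on %m and %H should see the same new main node as before.</div>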