[pgpool-general: 7466] Re: Expected behaviour after pcp_detach_node ?

Anssi Kanninen anssi at iki.fi
Mon Mar 29 17:27:43 JST 2021


Please ignore the attachment in the previous post. That was my own script 
with all the typos.

Best regards,
   Anssi Kanninen

On Mon, 29 Mar 2021, Anssi Kanninen wrote:

> Thanks for answering! I found out I had a major typo in follow_master.sh. So 
> that problem is solved. Sorry for the false problem report.
>
> But I also found a typo in the follow_master.sh.sample distributed with Pgpool:
>
>    # If start Standby failed, drop replication slot "${FAILED_NODE_HOST}"
>    else
>        ssh -T -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null postgres@${NEW_MASTER_NODE_HOST} -i ~/.ssh/id_rsa_pgpool \
>        ${PGHOME}/bin/psql -p ${NEW_MASTER_NODE_PORT} -c "SELECT pg_drop_replication_slot('${FAILED_NODE_HOST}')"
>
>        logger -i -p local1.error follow_master.sh: end: follow master command failed
>        exit 1
>    fi
>
> The SSH command above won't work unless the quotes around the actual SQL 
> command are escaped with '\'. There is a working example later in the script. 
> :-)
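>
> For reference, here is a minimal sketch (not from the distributed sample, just an illustration) of how that block could look with the inner quotes escaped, following the working pattern used elsewhere in the same script:
>
>    # If start Standby failed, drop replication slot "${FAILED_NODE_HOST}"
>    else
>        # Wrap the whole remote command in double quotes and escape the inner
>        # quotes so they survive the local shell and reach the remote psql intact.
>        ssh -T -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null postgres@${NEW_MASTER_NODE_HOST} -i ~/.ssh/id_rsa_pgpool "
>            ${PGHOME}/bin/psql -p ${NEW_MASTER_NODE_PORT} -c \"SELECT pg_drop_replication_slot('${FAILED_NODE_HOST}')\"
>        "
>
>        logger -i -p local1.error follow_master.sh: end: follow master command failed
>        exit 1
>    fi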
>
> Cheers,
>
> Anssi Kanninen
>
>
>
>
> $ diff follow_master.sh.sample follow_master.sh
> 5c5
> < set -o xtrace
> ---
>> #set -o xtrace
> 34,35c34,35
> < PGHOME=/usr/pgsql-11
> < ARCHIVEDIR=/var/lib/pgsql/archivedir
> ---
>> PGHOME=/usr/pgsql-12
>> ARCHIVEDIR=/backup/pg-archivedir
> 40a41,42
>> REPL_SLOT_NAME=${FAILED_NODE_HOST//[-.]/_}
>> 
> 70c72
> <     # Create replication slot "${FAILED_NODE_HOST}"
> ---
>>     # Create replication slot "${REPL_SLOT_NAME}"
> 72c74
> <         ${PGHOME}/bin/psql -p ${NEW_MASTER_NODE_PORT} -c \"SELECT pg_create_physical_replication_slot('${FAILED_NODE_HOST}');\"
> ---
>>         ${PGHOME}/bin/psql -p ${NEW_MASTER_NODE_PORT} -c \"SELECT pg_create_physical_replication_slot('${REPL_SLOT_NAME}');\"
> 85c87
> < primary_slot_name = '${FAILED_NODE_HOST}'
> ---
>> primary_slot_name = '${REPL_SLOT_NAME}'
> 119c121
> < primary_slot_name = '${FAILED_NODE_HOST}'
> ---
>> primary_slot_name = '${REPL_SLOT_NAME}'
> 131,134c133
> <             ssh -T -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null postgres@${NEW_MASTER_NODE_HOST} -i ~/.ssh/id_rsa_pgpool "
> <                 ${PGHOME}/bin/psql -p ${NEW_MASTER_NODE_PORT} -c \"SELECT pg_drop_replication_slot('${FAILED_NODE_HOST}')\"
> <             "
> <
> ---
>>             ssh -T -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null po
> 155c154
> <     # If start Standby failed, drop replication slot "${FAILED_NODE_HOST}"
> ---
>>     # If start Standby failed, drop replication slot "${REPL_SLOT_NAME}"
> 159c158
> <         ${PGHOME}/bin/psql -p ${NEW_MASTER_NODE_PORT} -c "SELECT pg_drop_replication_slot('${FAILED_NODE_HOST}')"
> ---
>>         ${PGHOME}/bin/psql -p ${NEW_MASTER_NODE_PORT} -c "SELECT pg_drop_replication_slot('${REPL_SLOT_NAME}')"
> 166c165
> <     logger -i -p local1.info follow_master.sh: failed_nod_id=${FAILED_NODE_ID} is not running. skipping follow master command
> ---
>>     logger -i -p local1.info follow_master.sh: failed_node_id=${FAILED_NODE_ID} is not running. skipping follow master command
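>
> Side note on the REPL_SLOT_NAME substitution added above: PostgreSQL replication slot names may only contain lower-case letters, digits and underscores, so a hostname such as 10.0.0.1 can't be used as a slot name as-is. A tiny illustration of what the parameter expansion does, using one of the hostnames from the cluster below:
>
>     FAILED_NODE_HOST=10.0.0.1                     # example hostname
>     REPL_SLOT_NAME=${FAILED_NODE_HOST//[-.]/_}    # replace every '.' and '-' with '_'
>     echo "${REPL_SLOT_NAME}"                      # prints: 10_0_0_1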
>
>
>
>
>
> On Mon, 29 Mar 2021, Tatsuo Ishii wrote:
>
>> Hi,
>> 
>>> Hi!
>>> 
>>> What is the expected behaviour when detaching a node?
>>> I have a three-node test cluster with pgpool-II 4.1.3
>>> and PostgreSQL 12 like below.
>>> 
>>> 
>>> id | hostname | status | role | repl_delay | repl_state | repl_sync_st | last_status_change
>>>  0 | 10.0.0.1 | up | primary | 0 | | | 2021-03-25 12:18:27
>>>  1 | 10.0.0.2 | up | standby | 0 | streaming | async | 2021-03-25 12:18:56
>>>  2 | 10.0.0.3 | up | standby | 0 | streaming | async | 2021-03-25 12:18:56
>>> 
>>> *** Then I ran "pcp_detach_node 0". The second node was promoted as expected:
>>> 
>>> id | hostname | status | role | repl_delay | repl_state | repl_sync_st | last_status_change
>>>  0 | 10.0.0.1 | down | standby | 0 | | | 2021-03-25 12:25:27
>>>  1 | 10.0.0.2 | up | primary | 0 | | | 2021-03-25 12:25:27
>>>  2 | 10.0.0.3 | down | standby | 0 | | | 2021-03-25 12:25:27
>>> 
>>> *** But then, the first one is back as primary again?
>>> 
>>> id | hostname | status | role | repl_delay | repl_state | repl_sync_st | last_status_change
>>>  0 | 10.0.0.1 | up | primary | 0 | | | 2021-03-25 12:25:36
>>>  1 | 10.0.0.2 | up | standby | 0 | | | 2021-03-25 12:25:36
>>>  2 | 10.0.0.3 | down | standby | 0 | | | 2021-03-25 12:25:27
>> 
>> This behavior is probably caused by the follow_master_command. Can you
>> share follow_master.sh?
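>>
>> For context, follow_master_command is the pgpool.conf hook that pgpool runs against each remaining standby after a failover, typically to re-point it at the new primary and restart it. A typical wiring looks roughly like the line below; the path and placeholder list are illustrative, not taken from the poster's configuration:
>>
>>     follow_master_command = '/etc/pgpool-II/follow_master.sh %d %h %p %D %m %H %M %P %r %R'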
>> 
>>> *** After that, the first node comes back as a standby and the second node is primary:
>>> 
>>> id | hostname | status | role | repl_delay | repl_state | repl_sync_st | last_status_change
>>>  0 | 10.0.0.1 | up | standby | 0 | | | 2021-03-25 12:25:45
>>>  1 | 10.0.0.2 | up | primary | 0 | | | 2021-03-25 12:25:45
>>>  2 | 10.0.0.3 | up | standby | 0 | | | 2021-03-25 12:25:45
>>> 
>>> *** In the end, this is the final state:
>>> 
>>> id | hostname | status | role | repl_delay | repl_state | repl_sync_st | last_status_change
>>>  0 | 10.0.0.1 | up | standby | 33463880 | | | 2021-03-25 12:25:45
>>>  1 | 10.0.0.2 | up | primary | 0 | | | 2021-03-25 12:25:45
>>>  2 | 10.0.0.3 | up | standby | 0 | streaming | async | 2021-03-25 12:25:45
>>> 
>>> So finally, the second node was promoted as expected.
>>> 
>>> But why is the first node "up" again even though it's not replicating?
>>> 
>>> See pgpool.conf attached.
>>> 
>>> Best regards,
>>>   Anssi Kanninen
>>> 
>>> 
>>> 
>>> 
>>> --
>>> anssi at iki.fi
>> 
>> Best regards,
>> --
>> Tatsuo Ishii
>> SRA OSS, Inc. Japan
>> English: http://www.sraoss.co.jp/index_en.php
>> Japanese: http://www.sraoss.co.jp
>> 
>
> -- 
> anssi at iki.fi

-- 
anssi at iki.fi

