[pgpool-general: 6660] Re: Cluster with 3 nodes

Tatsuo Ishii ishii at sraoss.co.jp
Wed Jul 31 22:29:48 JST 2019


You are welcome. Glad to hear that your problem has been solved!
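
For the record, the failover/follow master scripts from the
example-cluster page run pg_ctl and pg_basebackup over ssh as
postgres@<node>, so the user that executes the scripts (root, judging
from the log below) needs passwordless ssh to the postgres account on
every node. A minimal sketch of setting that up (hostnames taken from
this thread; the key path is the ssh default, adjust as needed):

    # as the script-executing user on the node running pgpool
    ssh-keygen -t rsa -f ~/.ssh/id_rsa -N ''    # empty passphrase
    for host in master slave reserve; do
        ssh-copy-id -i ~/.ssh/id_rsa.pub postgres@$host
    done
    # verify: this must succeed without any password prompt
    ssh -o StrictHostKeyChecking=no postgres@master true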

> Yes, it was a problem with passwordless ssh. Thanks for the help!
> 
> Mon, 29 Jul 2019 at 17:56, Гиа Хурцилава <khurtsilava.g at gmail.com>:
> 
>> Sorry, here is the pgpool.conf from the master node.
>>
>> So I deleted ">/dev/null" from the script, and here is the result:
>>
>>  + FAILED_NODE_ID=0
>>  + FAILED_NODE_HOST=master
>>  + FAILED_NODE_PORT=5432
>>  + FAILED_NODE_PGDATA=/var/lib/pgsql/11/data
>>  + NEW_MASTER_NODE_ID=1
>>  + OLD_MASTER_NODE_ID=0
>>  + NEW_MASTER_NODE_HOST=slave
>>  + OLD_PRIMARY_NODE_ID=0
>>  + NEW_MASTER_NODE_PORT=5432
>>  + NEW_MASTER_NODE_PGDATA=/var/lib/pgsql/11/data
>>  + PGHOME=/usr/pgsql-11
>>  + ARCHIVEDIR=/var/lib/pgsql/archivedir
>>  + REPL_USER=repl
>>  + PCP_USER=pgpool
>>  + PGPOOL_PATH=/usr/bin
>>  + PCP_PORT=9898
>>  + logger -i -p local1.info follow_master.sh: start: pg_basebackup for 0
>>  + ssh -T -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null
>> postgres@master /usr/pgsql-11/bin/pg_ctl -w -D /var/lib/pgsql/11/data
>> status
>>  Warning: Permanently added 'master,192.168.56.110' (ECDSA) to the list of
>> known hosts.
>>  Permission denied, please try again.
>>  Permission denied, please try again.
>>  Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
>>  + [[ 255 -eq 0 ]]
>>  + logger -i -p local1.info follow_master.sh: failed_nod_id=0 is not
>> running. skipping follow master command.
>> follow_master.sh: failed_nod_id=0 is not running. skipping follow master
>> command.
>>  + exit 0
>>  [192-1] 2019-07-29 13:55:02: pid 2504: LOG:  execute command:
>> /etc/pgpool-II/follow_master.sh 2 reserve 5432 /var/lib/pgsql/11/data 1 0
>> slave 0 5432 /var/lib/pgsql/11/data
>>  follow_master.sh: start: pg_basebackup for 2
>>  + FAILED_NODE_ID=2
>>  + FAILED_NODE_HOST=reserve
>>  + FAILED_NODE_PORT=5432
>>  + FAILED_NODE_PGDATA=/var/lib/pgsql/11/data
>>  + NEW_MASTER_NODE_ID=1
>>  + OLD_MASTER_NODE_ID=0
>>  + NEW_MASTER_NODE_HOST=slave
>>  + OLD_PRIMARY_NODE_ID=0
>>  + NEW_MASTER_NODE_PORT=5432
>>  + NEW_MASTER_NODE_PGDATA=/var/lib/pgsql/11/data
>>  + PGHOME=/usr/pgsql-11
>>  + ARCHIVEDIR=/var/lib/pgsql/archivedir
>>  + REPL_USER=repl
>>  + PCP_USER=pgpool
>>  + PGPOOL_PATH=/usr/bin
>>  + PCP_PORT=9898
>>  + logger -i -p local1.info follow_master.sh: start: pg_basebackup for 2
>>  + ssh -T -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null
>> postgres@reserve /usr/pgsql-11/bin/pg_ctl -w -D /var/lib/pgsql/11/data
>> status
>>  Warning: Permanently added 'reserve,192.168.56.112' (ECDSA) to the list
>> of known hosts.
>>  Permission denied, please try again.
>>  Permission denied, please try again.
>>  Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
>>  + [[ 255 -eq 0 ]]
>>  + logger -i -p local1.info follow_master.sh: failed_nod_id=2 is not
>> running. skipping follow master command.
>>  slave root[2550]: follow_master.sh: failed_nod_id=2 is not running.
>> skipping follow master command.
>>  + exit 0
>>
>> I'm starting to think that there is some problem with the ssh
>> connection, but I'm not sure.
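>>
>> For what it's worth, ssh itself exits with status 255 when the
>> connection or authentication fails, and that is exactly what the
>> [[ 255 -eq 0 ]] test in the trace shows, so the script wrongly
>> concludes the node is not running. A quick manual check, run as the
>> same user that executes the script (root, per the log):
>>
>>     ssh -T -o StrictHostKeyChecking=no postgres@master \
>>         /usr/pgsql-11/bin/pg_ctl -w -D /var/lib/pgsql/11/data status
>>     echo $?   # 255 means ssh failed; pg_ctl exits 0 when the server is up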
>>
>> Sun, 28 Jul 2019 at 03:58, Tatsuo Ishii <ishii at sraoss.co.jp>:
>>
>>> I noticed the following in the log files:
>>>
>>> /home/t-ishii/slave log.txt:Jul 25 22:30:53 reserve root[2011]:
>>> follow_master.sh: failed_nod_id=1 is not running. skipping follow master
>>> command.
>>> /home/t-ishii/slave log.txt:Jul 25 22:30:53 reserve root[2019]:
>>> follow_master.sh: failed_nod_id=2 is not running. skipping follow master
>>> command.
>>>
>>> I don't know which nodes are 1 and 2 (because you didn't share
>>> pgpool.conf), but I don't think it is normal that two nodes were
>>> skipped by the follow master command, because you have only 3 nodes
>>> and just one of the 3 is already down.
>>>
>>> I suspect the following code in follow_master.sh did not succeed:
>>>
>>> ssh -T -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null \
>>>     postgres@${FAILED_NODE_HOST} ${PGHOME}/bin/pg_ctl -w -D \
>>>     ${FAILED_NODE_PGDATA} status >/dev/null 2>&1
>>>
>>> You may want to remove the ">/dev/null 2>&1" to see what is going on there.
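>>>
>>> Or, to keep the output instead of discarding it, something along
>>> these lines (a sketch, not the shipped script): capture the output
>>> and exit status and hand both to logger:
>>>
>>>     out=$(ssh -T -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null \
>>>         postgres@${FAILED_NODE_HOST} ${PGHOME}/bin/pg_ctl -w -D \
>>>         ${FAILED_NODE_PGDATA} status 2>&1)
>>>     rc=$?
>>>     logger -i -p local1.info "follow_master.sh: pg_ctl status rc=${rc}: ${out}"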
>>>
>>> Best regards,
>>> --
>>> Tatsuo Ishii
>>> SRA OSS, Inc. Japan
>>> English: http://www.sraoss.co.jp/index_en.php
>>> Japanese:http://www.sraoss.co.jp
>>>
>>> > "slave" -primary
>>> > "master" and "reserve"- standby
>>> > After I shut down "slave", "master" became primary, but "reserve" got
>>> > status down. Configs are same from the documentation (changed just
>>> > hostnames and ip's). Failover config is the same also
>>> >
>>> > Fri, 26 Jul 2019 at 12:54, Tatsuo Ishii <ishii at sraoss.co.jp>:
>>> >
>>> >> Hi,
>>> >>
>>> >> Yes, please provide log and config files.
>>> >>
>>> >> My intuition is that there's something wrong with the follow master
>>> >> command script or related settings (especially ssh), because the
>>> >> script shuts down the standby server to resync it with the new
>>> >> primary database server.
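>>> >>
>>> >> Roughly, the example follow_master.sh runs something like the
>>> >> following for each remaining standby, all over ssh (a simplified
>>> >> sketch, not the full script from the example-cluster page):
>>> >>
>>> >>     # skip the node if PostgreSQL is not running there
>>> >>     ssh postgres@${FAILED_NODE_HOST} ${PGHOME}/bin/pg_ctl -w \
>>> >>         -D ${FAILED_NODE_PGDATA} status || exit 0
>>> >>     # stop the standby, then resync it from the new primary
>>> >>     ssh postgres@${FAILED_NODE_HOST} "${PGHOME}/bin/pg_ctl -w \
>>> >>         -D ${FAILED_NODE_PGDATA} stop -m fast; \
>>> >>         rm -rf ${FAILED_NODE_PGDATA}; \
>>> >>         ${PGHOME}/bin/pg_basebackup -h ${NEW_MASTER_NODE_HOST} \
>>> >>         -U ${REPL_USER} -D ${FAILED_NODE_PGDATA} -X stream"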
>>> >>
>>> >> Best regards,
>>> >> --
>>> >> Tatsuo Ishii
>>> >> SRA OSS, Inc. Japan
>>> >> English: http://www.sraoss.co.jp/index_en.php
>>> >> Japanese:http://www.sraoss.co.jp
>>> >>
>>> >> > Гиа Хурцилава <khurtsilava.g at gmail.com>
>>> >> > Thu, 25 Jul, 13:56 (21 hours ago)
>>> >> > to: pgpool-general
>>> >> >
>>> >> > Hi there.
>>> >> >
>>> >> > I’ve got 3 machines with pgpool-4.0.5 and postgresql-11. I configured
>>> >> > pgpool following the official documentation
>>> >> > (http://www.pgpool.net/docs/latest/en/html/example-cluster.html) and
>>> >> > everything works fine, except for one thing. When I shut down the
>>> >> > master node, one of the slaves is correctly promoted, but the other
>>> >> > one goes down together with the master. Just like this:
>>> >> >
>>> >> >  node_id | hostname | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | last_status_change
>>> >> > ---------+----------+------+--------+-----------+---------+------------+-------------------+-------------------+---------------------
>>> >> >  0       | master   | 5432 | down   | 0.333333  | standby | 0          | false             | 0                 | 2019-07-25 13:49:22
>>> >> >  1       | slave    | 5432 | up     | 0.333333  | primary | 0          | true              | 0                 | 2019-07-25 13:49:22
>>> >> >  2       | reserve  | 5432 | down   | 0.333333  | standby | 0          | false             | 0                 | 2019-07-25 13:49:22
>>> >> >
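>>> >> > For reference, that status table is the output of pgpool's
>>> >> > "show pool_nodes" command issued through psql against the pgpool
>>> >> > port. Port 9999 and the postgres user below are the
>>> >> > example-cluster defaults; adjust if yours differ:
>>> >> >
>>> >> >     psql -h localhost -p 9999 -U postgres postgres -c "show pool_nodes"
>>> >> >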
>>> >> > What could be the reason for this behavior? How can I fix it?
>>> >> >
>>> >> > If you need logs or config files, let me know. Thanks.
>>> >>
>>>
>>

