View Issue Details

ID: 0000367
Project: Pgpool-II
Category: Bug
View Status: public
Last Update: 2017-11-23 09:19
Reporter: jplinux
Assigned To: t-ishii
Priority: normal
Severity: block
Reproducibility: always
Status: closed
Resolution: open
Platform: Linux
OS: Ubuntu
OS Version: 14.04
Product Version: 3.6.7
Summary: 0000367: Manual failover with Pgpool and repmgr
Description:

Hi guys,

After performing a manual failover, I recovered the repmgr replication between s1 (master, read/write) and s2 (standby, read-only):

repmgr cluster show
Role | Name | Upstream | Connection String
----------+------|----------|----------------------------------------------
* master | s1 | | host=192.168.0.1 dbname=repmgr user=repmgr
  standby | s2 | s1 | host=192.168.0.2 dbname=repmgr user=repmgr


So, the problem is that after swapping the active nodes using repmgr (1. stop postgres on the standby, 2. promote the master, 3. clone the standby), pgpool can't recognize the nodes correctly and shows the master node as down:

show pool_nodes;
 node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay
---------+----------------+------+--------+-----------+---------+------------+-------------------+-------------------
 0 | 192.168.0.1 | 5432 | down | 0.500000 | standby | 0 | false | 0
 1 | 192.168.0.2 | 5432 | up | 0.500000 | standby | 0 | true | 0

Replication is working fine, and repmgr shows that everything is correct:
repmgr cluster show
Role | Name | Upstream | Connection String
----------+------|----------|----------------------------------------------
* master | s1 | | host=192.168.0.1 dbname=repmgr user=repmgr
  standby | s2 | s1 | host=192.168.0.2 dbname=repmgr user=repmgr
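For reference, the swap described above roughly corresponds to the following repmgr 3.x sequence. This is a sketch only: the hostnames s1/s2 are taken from the cluster output above, but the config path /etc/repmgr.conf, the SSH access, and the exact order of steps the reporter used are assumptions.

```shell
# Sketch of a master/standby swap with repmgr 3.x (paths and SSH
# access are assumptions, not taken from this report).

# 1) Stop PostgreSQL on the node being demoted (here: s1).
ssh s1 'sudo service postgresql stop'

# 2) Promote the surviving standby (s2) to master.
ssh s2 'repmgr standby promote -f /etc/repmgr.conf'

# 3) Re-clone the old master as a standby of the new master and register it.
ssh s1 'repmgr -h s2 -U repmgr -d repmgr standby clone -f /etc/repmgr.conf --force'
ssh s1 'sudo service postgresql start'
ssh s1 'repmgr standby register -f /etc/repmgr.conf --force'
```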

So I tried to fix pgpool using the pcp commands, without success, and then restarted the pgpool service:

The detach command is not accepted:
pcp_detach_node 0 -h localhost -U postgres
ERROR: invalid degenerate backend request, node id : 0 status: [3] is not valid for failover

I can promote node 0 (down), but nothing happens:
pcp_promote_node 0 -U postgres -h localhost
pcp_promote_node -- Command Successful

show pool_nodes
 node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay
---------+----------------+------+--------+-----------+---------+------------+-------------------+-------------------
 0 | 192.168.0.1 | 5432 | down | 0.500000 | standby | 0 | false | 0
 1 | 192.168.0.2 | 5432 | up | 0.500000 | standby | 3 | true | 0
(2 rows)

And I can't recover node 1 (standby):
pcp_recovery_node 1 -U postgres -h localhost
ERROR: process recovery request failed
DETAIL: primary server cannot be recovered by online recovery.

Here is the main configuration in pgpool.conf:
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_flag1 = 'ALLOW_TO_FAILOVER'

load_balance_mode = on

master_slave_mode = on
master_slave_sub_mode = 'stream'

failover_command = ''
recovery_1st_stage_command = ''
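Note that failover_command and recovery_1st_stage_command are both empty in this configuration: on failover pgpool only detaches the node and runs no script, and online recovery (pcp_recovery_node) has no first-stage script to execute, which is consistent with the errors above. Below is a minimal, hedged sketch of what a failover script could look like; the path /etc/pgpool2/failover.sh, the postgres SSH user, and the repmgr config path are illustrative assumptions, while %d, %P and %H are standard pgpool-II failover_command placeholders (failed node id, old primary node id, new master host).

```shell
#!/bin/sh
# Hypothetical /etc/pgpool2/failover.sh -- a sketch, not the reporter's setup.
# Wired into pgpool.conf as:
#   failover_command = '/etc/pgpool2/failover.sh %d %P %H'
FAILED_NODE_ID="$1"   # %d: id of the node pgpool detached
OLD_PRIMARY_ID="$2"   # %P: id of the old primary node
NEW_MASTER_HOST="$3"  # %H: host name of the new master candidate

# Only act when the failed node was the primary; a failed standby
# needs no promotion.
if [ "$FAILED_NODE_ID" = "$OLD_PRIMARY_ID" ]; then
    ssh postgres@"$NEW_MASTER_HOST" 'repmgr standby promote -f /etc/repmgr.conf'
fi
exit 0
```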

Please help me. I don't know what I am doing wrong.

Additional Information: By the way, I can't report a bug with Firefox; it looks like the site is incompatible.
Tags: No tags attached.

Activities

jplinux

2017-11-16 18:56

reporter   ~0001833

I have fixed it, and pgpool has recognized the new replication configuration; the solution is below:

1) First, I stopped the pgpool service (Debian variant):
sudo service pgpool2 stop

2) Then I started the pgpool process manually with the --discard-status option ("-D, Discard pgpool_status file and do not restore previous status"):
pgpool -n -D &
Note: the -n (--dont-detach) option was used just to show the output on the tty.

3) Finally, I restarted pgpool as a service again:
service pgpool2 restart
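As an extra check (a sketch, using the same positional pcp syntax as the commands earlier in this report), pcp_node_info can confirm what pgpool now believes about each backend:

```shell
# Query pgpool's view of each backend node (node id is positional,
# as in the pcp_detach_node / pcp_promote_node calls above).
pcp_node_info 0 -h localhost -U postgres
pcp_node_info 1 -h localhost -U postgres
```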

Result:
show pool_nodes;
-[ RECORD 1 ]-----+---------------
node_id           | 0
hostname          | 192.168.0.1
port              | 5432
status            | up
lb_weight         | 0.500000
role              | primary
select_cnt        | 1
load_balance_node | false
replication_delay | 0
-[ RECORD 2 ]-----+---------------
node_id           | 1
hostname          | 192.168.0.2
port              | 5432
status            | up
lb_weight         | 0.500000
role              | standby
select_cnt        | 0
load_balance_node | true
replication_delay | 0

This post can be closed!

t-ishii

2017-11-23 09:19

developer   ~0001841

Thanks for the report.

This topic was also discussed at:
https://www.pgpool.net/pipermail/pgpool-hackers/2017-November/002596.html

Issue closed.

Issue History

Date Modified Username Field Change
2017-11-15 19:05 jplinux New Issue
2017-11-16 18:56 jplinux Note Added: 0001833
2017-11-23 09:15 t-ishii Assigned To => t-ishii
2017-11-23 09:15 t-ishii Status new => assigned
2017-11-23 09:19 t-ishii Note Added: 0001841
2017-11-23 09:19 t-ishii Status assigned => closed