View Issue Details

ID: 0000367
Project: Pgpool-II
Category: Bug
View Status: public
Last Update: 2017-11-23 09:19
Reporter: jplinux
Assigned To: t-ishii
Priority: normal
Severity: block
Reproducibility: always
Status: closed
Resolution: open
Platform: Linux
OS: Ubuntu
OS Version: 14.04
Product Version: 3.6.7
Summary: 0000367: Manual failover with Pgpool and repmgr
Description:

Hi guys,

After performing a manual failover, I recovered the repmgr replication between s1 (master, read/write) and s2 (standby, read-only):

repmgr cluster show
Role | Name | Upstream | Connection String
----------+------|----------|----------------------------------------------
* master | s1 | | host=192.168.0.1 dbname=repmgr user=repmgr
  standby | s2 | s1 | host=192.168.0.2 dbname=repmgr user=repmgr


So, the problem is that after swapping the active nodes using repmgr (1. stop postgres on the standby, 2. promote the master, 3. clone the standby), pgpool can't recognize the nodes correctly and shows the master node as down:

show pool_nodes;
 node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay
---------+----------------+------+--------+-----------+---------+------------+-------------------+-------------------
 0 | 192.168.0.1 | 5432 | down | 0.500000 | standby | 0 | false | 0
 1 | 192.168.0.2 | 5432 | up | 0.500000 | standby | 0 | true | 0

Replication is working fine, and repmgr shows that everything is correct:
repmgr cluster show
Role | Name | Upstream | Connection String
----------+------|----------|----------------------------------------------
* master | s1 | | host=192.168.0.1 dbname=repmgr user=repmgr
  standby | s2 | s1 | host=192.168.0.2 dbname=repmgr user=repmgr
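For reference, the swap described above roughly corresponds to the following repmgr 3.x sequence. This is a sketch only: the hostnames s1/s2 are taken from the cluster output above, but the config path /etc/repmgr.conf, the SSH access, and the exact order of steps the reporter used are assumptions.

```shell
# Sketch of a master/standby swap with repmgr 3.x (paths and SSH
# access are assumptions, not taken from this report).

# 1) Stop PostgreSQL on the node being demoted (here: s1).
ssh s1 'sudo service postgresql stop'

# 2) Promote the surviving standby (s2) to master.
ssh s2 'repmgr standby promote -f /etc/repmgr.conf'

# 3) Re-clone the old master as a standby of the new master and register it.
ssh s1 'repmgr -h s2 -U repmgr -d repmgr standby clone -f /etc/repmgr.conf --force'
ssh s1 'sudo service postgresql start'
ssh s1 'repmgr standby register -f /etc/repmgr.conf --force'
```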

So I tried to fix pgpool using the pcp commands, without success, and then restarted the pgpool service:

The detach command is not accepted:
pcp_detach_node 0 -h localhost -U postgres
ERROR: invalid degenerate backend request, node id : 0 status: [3] is not valid for failover

I can promote node 0 (down), but nothing happens:
pcp_promote_node 0 -U postgres -h localhost
pcp_promote_node -- Command Successful

show pool_nodes
 node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay
---------+----------------+------+--------+-----------+---------+------------+-------------------+-------------------
 0 | 192.168.0.1 | 5432 | down | 0.500000 | standby | 0 | false | 0
 1 | 192.168.0.2 | 5432 | up | 0.500000 | standby | 3 | true | 0
(2 rows)

And I can't recover node 1 (standby):
pcp_recovery_node 1 -U postgres -h localhost
ERROR: process recovery request failed
DETAIL: primary server cannot be recovered by online recovery.

Here is the main configuration in pgpool.conf:
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_flag1 = 'ALLOW_TO_FAILOVER'

load_balance_mode = on

master_slave_mode = on
master_slave_sub_mode = 'stream'

failover_command = ''
recovery_1st_stage_command = ''
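Note that failover_command and recovery_1st_stage_command are both empty in this configuration: on failover pgpool only detaches the node and runs no script, and online recovery (pcp_recovery_node) has no first-stage script to execute, which is consistent with the errors above. Below is a minimal, hedged sketch of what a failover script could look like; the path /etc/pgpool2/failover.sh, the postgres SSH user, and the repmgr config path are illustrative assumptions, while %d, %P and %H are standard pgpool-II failover_command placeholders (failed node id, old primary node id, new master host).

```shell
#!/bin/sh
# Hypothetical /etc/pgpool2/failover.sh -- a sketch, not the reporter's setup.
# Wired into pgpool.conf as:
#   failover_command = '/etc/pgpool2/failover.sh %d %P %H'
FAILED_NODE_ID="$1"   # %d: id of the node pgpool detached
OLD_PRIMARY_ID="$2"   # %P: id of the old primary node
NEW_MASTER_HOST="$3"  # %H: host name of the new master candidate

# Only act when the failed node was the primary; a failed standby
# needs no promotion.
if [ "$FAILED_NODE_ID" = "$OLD_PRIMARY_ID" ]; then
    ssh postgres@"$NEW_MASTER_HOST" 'repmgr standby promote -f /etc/repmgr.conf'
fi
exit 0
```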

Please help me. I don't know what I am doing wrong.

Additional Information: By the way, I can't report a bug with Firefox; it looks like the site is incompatible.
Tags: No tags attached.

Activities

jplinux

2017-11-16 18:56

reporter   ~0001833

I have fixed it, and pgpool has recognized the new replication configuration; the solution is below:

1) First, I stopped the pgpool service (Debian variant):
sudo service pgpool2 stop

2) Then I started the pgpool process manually with the --discard-status option ("-D, Discard pgpool_status file and do not restore previous status"):
pgpool -n -D &
Note: the -n (--dont-detach) option was used just to show the output on the tty.

3) Finally, I restarted pgpool as a service again:
service pgpool2 restart
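As an extra check (a sketch, using the same positional pcp syntax as the commands earlier in this report), pcp_node_info can confirm what pgpool now believes about each backend:

```shell
# Query pgpool's view of each backend node (node id is positional,
# as in the pcp_detach_node / pcp_promote_node calls above).
pcp_node_info 0 -h localhost -U postgres
pcp_node_info 1 -h localhost -U postgres
```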

Result:
show pool_nodes;
-[ RECORD 1 ]-----+---------------
node_id           | 0
hostname          | 192.168.0.1
port              | 5432
status            | up
lb_weight         | 0.500000
role              | primary
select_cnt        | 1
load_balance_node | false
replication_delay | 0
-[ RECORD 2 ]-----+---------------
node_id           | 1
hostname          | 192.168.0.2
port              | 5432
status            | up
lb_weight         | 0.500000
role              | standby
select_cnt        | 0
load_balance_node | true
replication_delay | 0

This post can be closed!

t-ishii

2017-11-23 09:19

developer   ~0001841

Thanks for the report.

This topic was also discussed at:
https://www.pgpool.net/pipermail/pgpool-hackers/2017-November/002596.html

Issue closed.

Issue History

Date Modified Username Field Change
2017-11-15 19:05 jplinux New Issue
2017-11-16 18:56 jplinux Note Added: 0001833
2017-11-23 09:15 t-ishii Assigned To => t-ishii
2017-11-23 09:15 t-ishii Status new => assigned
2017-11-23 09:19 t-ishii Note Added: 0001841
2017-11-23 09:19 t-ishii Status assigned => closed