[pgpool-general: 4119] Testing pgpool + failover on random servers restarts
Ioana Danes
ioanadanes at gmail.com
Sat Oct 17 02:15:58 JST 2015
Hello,
I have set up pgpool (version 3.4) in master/slave mode with 2 PostgreSQL
backends (no load balancing) using streaming replication:
backend_hostname0 = 'voldb1'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/data01/postgres'
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_hostname1 = 'voldb2'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/data01/postgres'
backend_flag1 = 'ALLOW_TO_FAILOVER'
connection_cache = on
load_balance_mode = off
master_slave_mode = on
master_slave_sub_mode = 'stream'
sr_check_period = 10
health_check_period = 40
health_check_timeout = 10
health_check_max_retries = 3
health_check_retry_delay = 1
connect_timeout = 10000
failover_command = '/usr/local/bin/pgpool_failover.sh %P %m %H'
failback_command = '/usr/local/bin/pgpool_failback.sh %d %P %m %H'
fail_over_on_backend_error = on
search_primary_node_timeout = 10
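For context, pgpool_failover.sh follows the usual trigger-file promotion
pattern for streaming replication. A simplified sketch (the real script has
more to it; the trigger file name and ssh user here are assumptions):

#!/bin/bash
# Arguments from failover_command: %P %m %H
old_primary_id=$1    # %P: node id of the old primary
new_master_id=$2     # %m: node id of the new master chosen by pgpool
new_master_host=$3   # %H: host name of the new master

# Create the trigger file named in the standby's recovery.conf so it
# promotes itself. On a node that is already primary this is a no-op,
# so it is safe to run unconditionally.
ssh postgres@"$new_master_host" "touch /data01/postgres/im_the_master"
exit 0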
I was testing the following scenario:
1. Initial setup (in the status column, 2 = up and connected, 3 = down):
node_id | hostname | port | status | lb_weight | role
---------+---------------+------+--------+-----------+---------
0 | voldb1.ls.cbn | 5432 | 2 | 0.500000 | primary
1 | voldb2.ls.cbn | 5432 | 2 | 0.500000 | standby
2. I stopped postgres on voldb1, and voldb2 became the new primary, so now I
have this:
node_id | hostname | port | status | lb_weight | role
---------+---------------+------+--------+-----------+---------
0 | voldb1.ls.cbn | 5432 | 3 | 0.500000 | standby
1 | voldb2.ls.cbn | 5432 | 2 | 0.500000 | primary
3. I then stopped postgres on voldb2 as well, leaving no active nodes
attached to pgpool:
pcp_node_info 10 localhost 9898 cbn_cluster t000r 0
voldb1.ls.cbn 5432 3 0.500000
pcp_node_info 10 localhost 9898 cbn_cluster t000r 1
voldb2.ls.cbn 5432 3 0.500000
4. I started postgres on both voldb1 and voldb2. Nothing changed, as pgpool
does not automatically reattach detached nodes.
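(I know the nodes can be reattached manually with pcp_attach_node, using the
same arguments as pcp_node_info above, e.g.:

pcp_attach_node 10 localhost 9898 cbn_cluster t000r 0
pcp_attach_node 10 localhost 9898 cbn_cluster t000r 1

but here I wanted to see what happens without manual intervention.)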
5. I killed pgpool with pkill -9 pgpool (to simulate a server crash/reboot)
and started pgpool again. This is the state I ended up with:
node_id | hostname | port | status | lb_weight | role
---------+---------------+------+--------+-----------+---------
0 | voldb1.ls.cbn | 5432 | 2 | 0.500000 | primary
1 | voldb2.ls.cbn | 5432 | 2 | 0.500000 | standby
So the old master became the primary again, which is a serious problem in my
case: I can't afford any kind of data loss!
How does pgpool determine which server was the primary last? Is there a way
I could overcome this issue? I thought I could update the pgpool status file
in /var/log/pgpool from my scripts once a failover occurs. Would that be the
way to go, or are there better ways to fix this?
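For reference, in 3.4 pgpool_status appears to be a plain-text file with one
status word per backend line, so after step 2 I would expect it to contain
something like:

down
up

(node 0 down, node 1 up), which is what my failover script would need to
keep in sync.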
Thanks a lot,
Ioana Danes