[pgpool-general: 3047] Failover did not occur during test
John Scalia
jayknowsunix at gmail.com
Sat Jul 19 05:04:06 JST 2014
Hi all,
I was trying to test my failover script inside pgpool. I had previously invoked it by hand, passing it the parameters I've seen pgpool use, so I know it works. I just
wanted to see whether pgpool would do everything correctly. Here is the relevant portion of the log after I disconnected the primary server from the network:
2014-07-18 14:45:32 LOG: pid 32822: check_pgpool_status_by_hb: lifecheck failed. pgpool 1 (10.10.1.128:9999) seems not to be working
2014-07-18 14:45:32 LOG: pid 32822: pgpool_down: 10.10.1.128:9999 is going down
2014-07-18 14:45:32 LOG: pid 32822: pgpool_down: I'm oldest so standing for master
2014-07-18 14:45:32 LOG: pid 32822: wd_escalation: escalating to master pgpool
WARNING: interface is ignored: Operation not permitted
2014-07-18 14:45:36 LOG: pid 32822: wd_IP_up: ifconfig up succeeded
2014-07-18 14:45:36 LOG: pid 32822: wd_escalation: escalated to master pgpool successfully
2014-07-18 14:45:39 LOG: pid 32822: check_pgpool_status_by_hb: pgpool 1 (10.10.1.128:9999) is in down status
.
.
2014-07-18 15:51:05 LOG: pid 32822: check_pgpool_status_by_hb: pgpool 1 (10.10.1.128:9999) is in down status
That last line simply repeated itself more than 100 times with no intervening entries. As you can see, it kept reporting the down status for more than an hour, which is far too long
for a production failure. I killed the test at this point. What I don't understand is that I've watched pgpool call the failover script in previous tests. That's how I knew what to pass
my script during a manual invocation once I had it completed. Something obviously malfunctioned and kept the failover from completing, but I've never seen a failure like that
before. Ishii-san or Nagata-san, have you ever seen a failover not run through to completion?
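
For anyone who wants to check the "more than an hour" figure against the log excerpt above, here is a minimal Python sketch. It only uses the first and last "is in down status" lines quoted in this message; the function name down_status_span is made up for illustration and is not part of pgpool.

```python
from datetime import datetime

# First and last "down status" entries, copied from the log excerpt above.
LOG_LINES = [
    "2014-07-18 14:45:39 LOG: pid 32822: check_pgpool_status_by_hb: pgpool 1 (10.10.1.128:9999) is in down status",
    "2014-07-18 15:51:05 LOG: pid 32822: check_pgpool_status_by_hb: pgpool 1 (10.10.1.128:9999) is in down status",
]


def down_status_span(lines):
    """Return the time between the first and last 'is in down status' entries.

    Hypothetical helper: assumes every line starts with a
    'YYYY-MM-DD HH:MM:SS' timestamp, as in the excerpt above.
    """
    stamps = [
        datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")
        for line in lines
        if "is in down status" in line
    ]
    if not stamps:
        return None  # no matching entries in the input
    return max(stamps) - min(stamps)


print(down_status_span(LOG_LINES))  # prints 1:05:26
```

The same function could be pointed at the full log file (for example, by passing it the lines of the file) to see how long the watchdog kept reporting the other pgpool node as down.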
--
Jay