[pgpool-general: 3047] Failover did not occur during test

John Scalia jayknowsunix at gmail.com
Sat Jul 19 05:04:06 JST 2014


Hi all,

I was trying to test my failover script inside pgpool. I had previously invoked it by hand, passing it the parameters I have seen pgpool use, so I know it works. I just
wanted to see whether pgpool would do everything correctly. Here is the relevant portion of the log after I disconnected the primary server from the network:

2014-07-18 14:45:32 LOG: pid 32822: check_pgpool_status_by_hb: lifecheck failed. pgpool 1 (10.10.1.128:9999) seems not to be working
2014-07-18 14:45:32 LOG: pid 32822: pgpool_down: 10.10.1.128:9999 is going down
2014-07-18 14:45:32 LOG: pid 32822: pgpool_down: I'm oldest so standing for master
2014-07-18 14:45:32 LOG: pid 32822: wd_escalation: escalating to master pgpool WARNING: interface is ignored: Operation not permitted
2014-07-18 14:45:36 LOG: pid 32822: wd_IP_up: ifconfig up succeeded
2014-07-18 14:45:36 LOG: pid 32822: wd_escalation: escalated to master pgpool successfully
2014-07-18 14:45:39 LOG: pid 32822: check_pgpool_status_by_hb: pgpool 1 (10.10.1.128:9999) is in down status
  .
  .
2014-07-18 15:51:05 LOG: pid 32822: check_pgpool_status_by_hb: pgpool 1 (10.10.1.128:9999) is in down status

That last line repeated more than 100 times with no intervening entries. As you can see, it kept complaining for more than an hour, which is far too long
for a production failure, so I killed the test at that point. What I don't understand is that I have watched pgpool call the failover script in previous tests; that is how I knew
what to pass my script when I invoked it manually after completing it. Something obviously malfunctioned and kept the failover from completing, but I have never seen a failure
like this before. Ishii-san or Nagata-san, have you ever seen failover not run through to completion?
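
For anyone who wants to check the "more than an hour" figure against the log, here is a minimal Python sketch that counts the repeated "is in down status" entries and measures the time between the first and the last one. It is not part of pgpool: the function name summarize_down_status and the regular expression are assumptions based only on the log format shown above, and the sample holds just the two such lines quoted in this message.

```python
import re
from datetime import datetime

# Assumed layout, taken from the excerpt above:
# "<date> <time> LOG: pid <n>: <function>: <message>"
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) LOG: pid (\d+): (\w+): (.*)$"
)


def summarize_down_status(lines):
    """Return (count, span) for check_pgpool_status_by_hb 'down status' entries."""
    stamps = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        timestamp, _pid, function, message = match.groups()
        if function == "check_pgpool_status_by_hb" and message.endswith(
            "is in down status"
        ):
            stamps.append(datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S"))
    if not stamps:
        return 0, None
    # Span between the earliest and latest matching entry.
    return len(stamps), max(stamps) - min(stamps)


# Only the first and last entries quoted in the message; the real log had 100+.
sample = [
    "2014-07-18 14:45:39 LOG: pid 32822: check_pgpool_status_by_hb: "
    "pgpool 1 (10.10.1.128:9999) is in down status",
    "2014-07-18 15:51:05 LOG: pid 32822: check_pgpool_status_by_hb: "
    "pgpool 1 (10.10.1.128:9999) is in down status",
]

count, span = summarize_down_status(sample)
print(count, span)  # prints: 2 1:05:26
```

Run against the full log file instead of the sample, the same function would give the actual number of repeats along with the elapsed time.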
--
Jay



