View Issue Details

ID: 0000228
Project: Pgpool-II
Category: Bug
View Status: public
Last Update: 2016-08-04 23:51
Reporter: supp_k
Assigned To: Muhammad Usama
Priority: high
Severity: major
Reproducibility: always
Status: resolved
Resolution: fixed
Platform: pgpool
OS: CentOS
OS Version: 6 & 7
Product Version: 3.5.3
Summary: 0000228: pgpool doesn't de-escalate IP when the network is restored
Description: Pgpool does not de-escalate the IP address when a split brain is resolved and the node turns from Master into Standby.
Steps To Reproduce:
Environment:
1) Pgpool A (Master) hosts VIP (virtual IP)
2) Pgpool B (Standby)
Watchdog and heartbeat processes are OK.

Steps to reproduce:
- Emulate a network failure between Pgpool A and Pgpool B that lasts longer than the heartbeat receive time period. Once the heartbeat time is exceeded, Pgpool initiates voting and brings up the VIP, which is expected. Now there are 2 Pgpool masters within the network, which is the "split brain" case.

- Restore network connectivity between Pgpool A and Pgpool B. The pgpools restart voting and one of the masters turns into Standby (say Pgpool B), which is also expected. However, at that moment Pgpool B does not bring down the VIP (ip addr del ...). Shouldn't it?
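
To confirm the symptom on the node that became Standby, one option is to check whether the VIP is still listed among its addresses. The following is a minimal sketch, assuming the output of `ip -o -4 addr show` has been captured as text; the function names, the interface name and the sample addresses are hypothetical and not taken from this report.

```python
import ipaddress
import re


def addresses_from_ip_output(ip_output: str) -> set:
    """Collect every IPv4 address that appears after an `inet` keyword."""
    found = set()
    for match in re.finditer(r"\binet\s+(\d+\.\d+\.\d+\.\d+)(?:/\d+)?", ip_output):
        found.add(ipaddress.ip_address(match.group(1)))
    return found


def holds_vip(ip_output: str, vip: str) -> bool:
    """Return True if the virtual IP is still configured on this host."""
    return ipaddress.ip_address(vip) in addresses_from_ip_output(ip_output)


# Hypothetical sample output: interface name and addresses are made up.
SAMPLE = (
    "2: eth0    inet 192.168.7.11/24 brd 192.168.7.255 scope global eth0\n"
    "2: eth0    inet 192.168.7.100/24 scope global secondary eth0:0\n"
)
```

With this sample, `holds_vip(SAMPLE, "192.168.7.100")` reports that the address is still present. On a node that has turned into Standby, that result would match the behaviour described in this report.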
Tags: watchdog

Activities

Muhammad Usama

2016-08-03 23:14

developer  

de-esc_bug_228.diff (643 bytes)   

Muhammad Usama

2016-08-03 23:16

developer   ~0000963

Hi

I was able to reproduce the issue. Can you please try the attached patch "de-esc_bug_228.diff" and check whether it solves your problem?

supp_k

2016-08-04 01:41

reporter   ~0000964

Hi,

Yes, the problem disappeared.

Here are the log records:
2016-08-03 19:37:34: pid 2604: WARNING: "Linux_warm1.local_9999" is the coordinator as per our record but "Linux_warm0.local_9999" is also announcing as a coordinator
2016-08-03 19:37:34: pid 2604: DETAIL: re-initializing the cluster
2016-08-03 19:37:34: pid 2604: LOG: watchdog node state changed from [MASTER] to [JOINING]
2016-08-03 19:37:34: pid 2952: LOG: watchdog: de-escalation started
2016-08-03 19:37:34: pid 2604: WARNING: the coordinator as per our record is not coordinator anymore
2016-08-03 19:37:34: pid 2604: DETAIL: re-initializing the cluster
2016-08-03 19:37:34: pid 2604: LOG: watchdog node state changed from [JOINING] to [INITIALIZING]
2016-08-03 19:37:35: pid 2604: LOG: watchdog node state changed from [INITIALIZING] to [STANDING FOR MASTER]
2016-08-03 19:37:35: pid 2604: LOG: watchdog node state changed from [STANDING FOR MASTER] to [PARTICIPATING IN ELECTION]
2016-08-03 19:37:35: pid 2604: LOG: watchdog node state changed from [PARTICIPATING IN ELECTION] to [INITIALIZING]
2016-08-03 19:37:35: pid 2605: LOG: informing the node status change to watchdog
2016-08-03 19:37:35: pid 2605: DETAIL: node id :1 status = "NODE ALIVE" message:"Heartbeat signal found"
2016-08-03 19:37:35: pid 2604: LOG: new IPC connection received
2016-08-03 19:37:35: pid 2604: LOG: received node status change ipc message
2016-08-03 19:37:35: pid 2604: DETAIL: Heartbeat signal found
2016-08-03 19:37:36: pid 2604: LOG: watchdog node state changed from [INITIALIZING] to [STANDBY]
2016-08-03 19:37:40: pid 2604: LOG: successfully joined the watchdog cluster as standby node
2016-08-03 19:37:40: pid 2604: DETAIL: our join coordinator request is accepted by cluster leader node "Linux_warm0.local_9999"
2016-08-03 19:37:46: pid 2952: WARNING: watchdog failed to ping host"192.168.7.7"
2016-08-03 19:37:46: pid 2952: DETAIL: ping process exits with code: 1
2016-08-03 19:37:46: pid 2952: LOG: watchdog bringing down delegate IP
2016-08-03 19:37:46: pid 2952: DETAIL: if_down_cmd succeeded
2016-08-03 19:37:46: pid 2604: LOG: watchdog de-escalation process with pid: 2952 exit with SUCCESS.
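
When repeating this check on other nodes, the watchdog state transitions and the de-escalation result can be pulled out of such a log automatically. The sketch below assumes the message wording is exactly as in the records above. The helper names are hypothetical, and the sample text is a subset of the lines quoted in this note.

```python
import re

# Matches lines such as:
#   LOG: watchdog node state changed from [MASTER] to [JOINING]
STATE_CHANGE = re.compile(
    r"watchdog node state changed from \[(?P<old>[^\]]+)\] to \[(?P<new>[^\]]+)\]"
)

# Matches the final line reported by the parent process after de-escalation.
DE_ESCALATION_OK = re.compile(
    r"watchdog de-escalation process with pid: \d+ exit with SUCCESS"
)


def state_transitions(log_text: str) -> list:
    """Return (old_state, new_state) pairs in the order they were logged."""
    return [(m.group("old"), m.group("new")) for m in STATE_CHANGE.finditer(log_text)]


def de_escalation_succeeded(log_text: str) -> bool:
    """Return True if the log reports a successful de-escalation process."""
    return DE_ESCALATION_OK.search(log_text) is not None


# A few of the records from this note, copied verbatim.
LOG = """\
2016-08-03 19:37:34: pid 2604: LOG: watchdog node state changed from [MASTER] to [JOINING]
2016-08-03 19:37:34: pid 2952: LOG: watchdog: de-escalation started
2016-08-03 19:37:34: pid 2604: LOG: watchdog node state changed from [JOINING] to [INITIALIZING]
2016-08-03 19:37:36: pid 2604: LOG: watchdog node state changed from [INITIALIZING] to [STANDBY]
2016-08-03 19:37:46: pid 2952: DETAIL: if_down_cmd succeeded
2016-08-03 19:37:46: pid 2604: LOG: watchdog de-escalation process with pid: 2952 exit with SUCCESS.
"""
```

Calling `state_transitions(LOG)` lists the path from MASTER to STANDBY, and `de_escalation_succeeded(LOG)` indicates whether the de-escalation process finished successfully.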



Thank you!

Muhammad Usama

2016-08-04 23:51

developer   ~0000965

Thanks for confirming the fix. I have committed it to the master and 3.5 branches.

http://git.postgresql.org/gitweb?p=pgpool2.git;a=commitdiff;h=cf57d9970f46a92c52315b42eae9dbee73c90525

Issue History

Date Modified Username Field Change
2016-08-02 01:53 supp_k New Issue
2016-08-02 10:22 t-ishii Assigned To => Muhammad Usama
2016-08-02 10:22 t-ishii Status new => assigned
2016-08-02 13:44 t-ishii Tag Attached: watchdog
2016-08-03 23:14 Muhammad Usama File Added: de-esc_bug_228.diff
2016-08-03 23:16 Muhammad Usama Status assigned => feedback
2016-08-03 23:16 Muhammad Usama Note Added: 0000963
2016-08-04 01:41 supp_k Note Added: 0000964
2016-08-04 01:41 supp_k Status feedback => assigned
2016-08-04 23:51 Muhammad Usama Status assigned => resolved
2016-08-04 23:51 Muhammad Usama Resolution open => fixed
2016-08-04 23:51 Muhammad Usama Note Added: 0000965