[pgpool-general: 6503] Re: Pgpool Primary node not automatically returning to cluster

Pierre Timmermans ptim007 at yahoo.com
Thu Apr 4 22:41:26 JST 2019


Hi,
I am not sure I understand your use case. Normally, when the master goes down, pgpool detaches the node and performs a failover. In the failover script you will usually promote the standby node.
When a detached node comes back, pgpool will not re-attach it automatically. I believe this is a feature rather than a bug.
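For reference, a detached node can also be re-attached from the command line with pcp_attach_node instead of through pgpoolAdmin. Below is a minimal Python sketch of that idea, not a tested procedure. It assumes the PCP port is the default 9898, a PCP user named "pgpool" (hypothetical), and a password stored in ~/.pcppass so that -w can be used. The helper names are made up for illustration.

```python
import socket
import subprocess


def backend_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to the backend can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def build_attach_command(node_id, pcp_host="localhost", pcp_port=9898,
                         pcp_user="pgpool"):
    """Build the pcp_attach_node command line (assumed PCP settings)."""
    return [
        "pcp_attach_node",
        "-h", pcp_host,
        "-p", str(pcp_port),
        "-U", pcp_user,       # hypothetical PCP user name
        "-w",                 # never prompt; rely on ~/.pcppass
        "-n", str(node_id),
    ]


def reattach_if_reachable(node_id, backend_host, backend_port=5432):
    """Re-attach the node only if its PostgreSQL port answers again."""
    if not backend_reachable(backend_host, backend_port):
        return False
    subprocess.run(build_attach_command(node_id), check=True)
    return True
```

Something like reattach_if_reachable(0, "172.18.0.160") would cover the case in this thread. This is only safe if no standby was promoted in the meantime. If a failover did promote another node, the old primary has to be resynchronized before it is attached again.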
Pierre 

    On Thursday, April 4, 2019, 2:45:54 PM GMT+2, Nitish Kumar <itcell.mpwz at mp.gov.in> wrote:  
 
 Hi Team,
I am using Pgpool-II 3.7 with 3 PostgreSQL 10.6 nodes at the backend.
Everything is working fine, but today we noticed something unusual.
During a normal production run with heavy traffic, our primary node went down due to a network failure, i.e., the network between the Pgpool-II (master) server and the primary node failed. Pgpool wrote the following line to its log:
2019-04-04 16:12:56: pid 27680:LOG:  failed to connect to PostgreSQL server on "172.18.0.160:5432", getsockopt() detected error "No route to host"
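A line like the one above can be picked out of the log for alerting. Here is a minimal Python sketch, assuming the log format is exactly as shown in this message. The regular expressions and the function name are illustrative and not part of pgpool.

```python
import re

# Generic pgpool log prefix as it appears in this thread:
# "<date> <time>: pid <pid>:<LEVEL>:  <message>"
LOG_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}): "
    r"pid (?P<pid>\d+):\s*(?P<level>[A-Z]+):\s+(?P<msg>.*)"
)

# The backend address inside a "failed to connect" message.
CONNECT_FAIL_RE = re.compile(
    r'failed to connect to PostgreSQL server on "(?P<host>[^":]+):(?P<port>\d+)"'
)


def parse_connect_failure(line):
    """Return details of a backend connection failure, or None."""
    entry = LOG_RE.search(line)
    if entry is None:
        return None
    failure = CONNECT_FAIL_RE.search(entry.group("msg"))
    if failure is None:
        return None
    return {
        "timestamp": entry.group("ts"),
        "pid": int(entry.group("pid")),
        "host": failure.group("host"),
        "port": int(failure.group("port")),
    }
```

Feeding each new log line to parse_connect_failure and alerting on any non-None result would have flagged the unreachable backend at 16:12:56.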
Our write requests started failing! When we got alerted, we debugged and found that the master DB (primary node) was up and working fine. Only the network between the primary node and the Pgpool-II master server was down.
We fixed it, and the Pgpool-II master process was able to connect to the primary node again. But it did not return the primary node to the Pgpool cluster automatically. We got the following lines in the logs continuously:
Apr  4 16:13:12 pgpool2 pgpool[21822]: [2325-1] 2019-04-04 16:13:12: pid 21822:LOG:  find_primary_node: checking backend no 0
Apr  4 16:13:12 pgpool2 pgpool[21822]: [2326-1] 2019-04-04 16:13:12: pid 21822:LOG:  find_primary_node: checking backend no 1
Apr  4 16:13:12 pgpool2 pgpool[21822]: [2327-1] 2019-04-04 16:13:12: pid 21822:LOG:  find_primary_node: checking backend no 2
Apr  4 16:13:13 pgpool2 pgpool[21822]: [2328-1] 2019-04-04 16:13:13: pid 21822:LOG:  find_primary_node: checking backend no 0
Apr  4 16:13:13 pgpool2 pgpool[21822]: [2329-1] 2019-04-04 16:13:13: pid 21822:LOG:  find_primary_node: checking backend no 1
Apr  4 16:13:13 pgpool2 pgpool[21822]: [2330-1] 2019-04-04 16:13:13: pid 21822:LOG:  find_primary_node: checking backend no 2
Apr  4 16:13:14 pgpool2 pgpool[21822]: [2331-1] 2019-04-04 16:13:14: pid 21822:LOG:  find_primary_node: checking backend no 0
Apr  4 16:13:14 pgpool2 pgpool[21822]: [2332-1] 2019-04-04 16:13:14: pid 21822:LOG:  find_primary_node: checking backend no 1
Apr  4 16:13:14 pgpool2 pgpool[21822]: [2333-1] 2019-04-04 16:13:14: pid 21822:LOG:  find_primary_node: checking backend no 2
Apr  4 16:13:15 pgpool2 pgpool[21822]: [2334-1] 2019-04-04 16:13:15: pid 21822:LOG:  find_primary_node: checking backend no 0
Apr  4 16:13:15 pgpool2 pgpool[21822]: [2335-1] 2019-04-04 16:13:15: pid 21822:LOG:  find_primary_node: checking backend no 1
Apr  4 16:13:15 pgpool2 pgpool[21822]: [2336-1] 2019-04-04 16:13:15: pid 21822:LOG:  find_primary_node: checking backend no 2
Apr  4 16:13:16 pgpool2 pgpool[21822]: [2337-1] 2019-04-04 16:13:16: pid 21822:LOG:  find_primary_node: checking backend no 0
Apr  4 16:13:16 pgpool2 pgpool[21822]: [2338-1] 2019-04-04 16:13:16: pid 21822:LOG:  find_primary_node: checking backend no 1
Apr  4 16:13:16 pgpool2 pgpool[21822]: [2339-1] 2019-04-04 16:13:16: pid 21822:LOG:  find_primary_node: checking backend no 2
Apr  4 16:13:17 pgpool2 pgpool[27247]: [2976-1] 2019-04-04 16:13:17: pid 27247:LOG:  Replication of node:2 is behind 695032 bytes from the primary server (node:0)
Apr  4 16:13:17 pgpool2 pgpool[27247]: [2976-2] 2019-04-04 16:13:17: pid 27247:CONTEXT:  while checking replication time lag
Apr  4 16:13:17 pgpool2 pgpool[21822]: [2340-1] 2019-04-04 16:13:17: pid 21822:LOG:  find_primary_node: checking backend no 0
Apr  4 16:13:17 pgpool2 pgpool[21822]: [2341-1] 2019-04-04 16:13:17: pid 21822:LOG:  find_primary_node: checking backend no 1
Apr  4 16:13:17 pgpool2 pgpool[21822]: [2342-1] 2019-04-04 16:13:17: pid 21822:LOG:  find_primary_node: checking backend no 2
Apr  4 16:13:19 pgpool2 pgpool[21822]: [2343-1] 2019-04-04 16:13:19: pid 21822:LOG:  find_primary_node: checking backend no 0
Apr  4 16:13:19 pgpool2 pgpool[21822]: [2344-1] 2019-04-04 16:13:19: pid 21822:LOG:  find_primary_node: checking backend no 1
Apr  4 16:13:19 pgpool2 pgpool[21822]: [2345-1] 2019-04-04 16:13:19: pid 21822:LOG:  find_primary_node: checking backend no 2
Apr  4 16:13:20 pgpool2 pgpool[21822]: [2346-1] 2019-04-04 16:13:20: pid 21822:LOG:  find_primary_node: checking backend no 0
Apr  4 16:13:20 pgpool2 pgpool[21822]: [2347-1] 2019-04-04 16:13:20: pid 21822:LOG:  find_primary_node: checking backend no 1
Apr  4 16:13:20 pgpool2 pgpool[21822]: [2348-1] 2019-04-04 16:13:20: pid 21822:LOG:  find_primary_node: checking backend no 2
Apr  4 16:13:21 pgpool2 pgpool[21822]: [2349-1] 2019-04-04 16:13:21: pid 21822:LOG:  find_primary_node: checking backend no 0
Apr  4 16:13:21 pgpool2 pgpool[21822]: [2350-1] 2019-04-04 16:13:21: pid 21822:LOG:  find_primary_node: checking backend no 1
Apr  4 16:13:21 pgpool2 pgpool[21822]: [2351-1] 2019-04-04 16:13:21: pid 21822:LOG:  find_primary_node: checking backend no 2
Apr  4 16:13:22 pgpool2 pgpool[21822]: [2352-1] 2019-04-04 16:13:22: pid 21822:LOG:  find_primary_node: checking backend no 0
Apr  4 16:13:22 pgpool2 pgpool[21822]: [2353-1] 2019-04-04 16:13:22: pid 21822:LOG:  find_primary_node: checking backend no 1
Apr  4 16:13:22 pgpool2 pgpool[21822]: [2354-1] 2019-04-04 16:13:22: pid 21822:LOG:  find_primary_node: checking backend no 2
To get the primary node back into the Pgpool-II cluster, we had to manually click "Return" in the pgpoolAdmin web app.
My concern is: why did the primary node not return to the cluster automatically after the network issue was resolved?
Kindly help, so that I can avoid this kind of failover in future. Is there something I am missing here?
Regards,
Nitish Kumar