[pgpool-hackers: 3911] Re: Patch: Move auto_failback_interval in to BackendInfo, and update it any time the backend state is set to CON_DOWN

Tatsuo Ishii ishii at sraoss.co.jp
Wed Jun 2 09:14:47 JST 2021


And we should not forget this:

> Also we have found that detach_false_primary should only run on the
> leader watchdog node. Probably we should consider this for
> auto_failback too.
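As a rough illustration (not a tested patch), the same leader-only
restriction could be applied to auto_failback with a small guard in the
health check path. The helper i_am_watchdog_leader() below is a
hypothetical placeholder for whatever "am I the watchdog leader?" check
is finally used; the pool_config->auto_failback and
pool_config->use_watchdog fields are assumed to map to the auto_failback
and use_watchdog configuration parameters:

	/*
	 * Sketch only: decide whether this pgpool node may issue auto
	 * failback requests.  i_am_watchdog_leader() is a hypothetical
	 * placeholder, not an existing pgpool function.
	 */
	static bool
	auto_failback_allowed_here(void)
	{
		if (!pool_config->auto_failback)
			return false;

		/* without watchdog there is only one pgpool node, nothing to restrict */
		if (!pool_config->use_watchdog)
			return true;

		/* with watchdog, only the leader node should request auto failback */
		return i_am_watchdog_leader();
	}

The health check process would then consult auto_failback_allowed_here()
before requesting auto failback, so that standby watchdog nodes never race
with the leader's follow primary command.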

> Note that I have already implemented follow_primary_lock and pushed it to
> the repository.
> 
> https://git.postgresql.org/gitweb/?p=pgpool2.git;a=commit;h=455f00dd5f5b7b94bd91aa0b6b40aab21dceabb9
> 
>> Hi,
>> 
>> I am just coming back to this work now after some time on other projects.
>> 
>> I think there are several proposals around improving auto_failback in this thread:
>> 1) my patch
>> 2) Ishii-san's patch to check follow_primary_count == 0
>> 3) Ishii-san's proposal to implement a lock to avoid the window where follow_primary might run after checking follow_primary_count
>> 
>> My understanding is we think 1+2 are good, and we can look at 3 if there is still a problem - or perhaps we plan to look at 3 as a future improvement, to avoid a potential problem?
>> 
>> Would you like me to test 1+2?
>> 
>>> On 10/05/2021, at 7:16 PM, Takuma Hoshiai <hoshiai.takuma at nttcom.co.jp> wrote:
>>> 
>>> Hi,
>>> 
>>> On 2021/05/05 16:03, Tatsuo Ishii wrote:
>>>>>> On 27/04/2021, at 10:18 AM, Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
>>>>>> 
>>>>>> Hi Nathan,
>>>>>> 
>>>>>>> Hi,
>>>>>>> 
>>>>>>> Sorry about that! I dragged them from the vscode file list directly to Mail - I suspect that that doesn't work when using remote editing..!
>>>>>>> 
>>>>>>> I have attached the files now - does that work?
>>>>>> 
>>>>>> Yes! I will look into the patches. Hoshiai-san, can you please look
>>>>>> into the patches as well because you are the original author of the
>>>>>> feature.
>>>>> 
>>>>> Hi!
>>>>> 
>>>>> I was wondering if you had time to look at these patches yet? :-)
>>>>> 
>>>>> No rush - just making sure it doesn't get missed!
>>>> I have just started to look into your patch. I was also able to
>>>> reproduce the problem.
>>>> 1) create 3-node streaming replication cluster.
>>>> pgpool_setup -n 3
>>>> Enable auto_failback and set health_check_period to 1 so that
>>>> auto_failback runs more aggressively.
>>>> auto_failback = on
>>>> health_check_period0 = 1
>>>> health_check_period1 = 1
>>>> health_check_period2 = 1
>>>> start the whole system.
>>>> 2) detach node 0 (which is the primary)
>>>> 3) node 2 goes down and its PostgreSQL won't start
>>>> psql -p 11000 -c "show pool_nodes" test
>>>>  node_id | hostname | port  | status | pg_status | lb_weight |  role   | pg_role | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
>>>> ---------+----------+-------+--------+-----------+-----------+---------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
>>>>  0       | /tmp     | 11002 | up     | up        | 0.333333  | standby | standby | 0          | true              | 0                 | streaming         | async                  | 2021-05-05 14:10:38
>>>>  1       | /tmp     | 11003 | up     | up        | 0.333333  | primary | primary | 0          | false             | 0                 |                   |                        | 2021-05-05 14:10:25
>>>>  2       | /tmp     | 11004 | down   | down      | 0.333333  | standby | unknown | 0          | false             | 0                 |                   |                        | 2021-05-05 14:10:38
>>>> (3 rows)
>>>> The cause of the problem is a race condition between auto failback and
>>>> follow primary, as you and Hoshiai-san suggested. Here are some
>>>> excerpts from the pgpool.log.
>>>> $ egrep "degeneration|failback" log/pgpool.log|grep -v child
>>>> 2021-05-05 14:10:22: main pid 28630: LOG:  starting degeneration. shutdown host /tmp(11002)
>>>> 2021-05-05 14:10:25: main pid 28630: LOG:  starting follow degeneration. shutdown host /tmp(11002)
>>>> 2021-05-05 14:10:25: main pid 28630: LOG:  starting follow degeneration. shutdown host /tmp(11004)	-- #1
>>>> 2021-05-05 14:10:25: health_check2 pid 28673: LOG:  request auto failback, node id:2	-- #2
>>>> 2021-05-05 14:10:25: health_check2 pid 28673: LOG:  received failback request for node_id: 2 from pid [28673]
>>>> 2021-05-05 14:10:35: main pid 28630: LOG:  failback done. reconnect host /tmp(11004)
>>>> 2021-05-05 14:10:35: main pid 28630: LOG:  failback done. reconnect host /tmp(11002)	-- #3
>>>> 2021-05-05 14:10:36: pcp_child pid 29035: LOG:  starting recovering node 2
>>>> 2021-05-05 14:10:36: pcp_child pid 29035: ERROR:  node recovery failed, node id: 2 is alive	-- #4
>>>> 2021-05-05 14:10:38: child pid 29070: LOG:  failed to connect to PostgreSQL server by unix domain socket
>>>> 2021-05-05 14:10:38: child pid 29070: DETAIL:  executing failover on backend
>>>> 2021-05-05 14:10:38: main pid 28630: LOG:  Pgpool-II parent process has received failover request
>>>> 2021-05-05 14:10:38: main pid 28630: LOG:  starting degeneration. shutdown host /tmp(11004)	-- #5
>>>> 1) Follow primary started to shut down node 2. At this point the
>>>>    backend node 2 was still running.
>>>> 2) Auto failback found that the backend was still alive and sent a
>>>>    failback request for node 2.
>>>> 3) The pgpool main process reported that node 2 was back. But the
>>>>    actual failback had not been done yet; it was still being handled
>>>>    by the follow primary command.
>>>> 4) The follow primary command for node 2 failed because auto failback
>>>>    had set the status of node 2 to "up".
>>>> 5) Node 2's PostgreSQL was down and the health check detected it, so
>>>>    node 2's status became down.
>>>> So if auto failback had not run at #2, the follow primary command
>>>> should have succeeded.
>>>> BTW, a user and I accidentally found a similar situation: a conflicting
>>>> concurrent run of detach_false_primary and the follow primary command:
>>>> https://www.pgpool.net/pipermail/pgpool-general/2021-April/007583.html
>>>> In that discussion I proposed a patch to prevent the concurrent run of
>>>> detach_false_primary and the follow primary command. I think we can
>>>> apply the same method to auto_failback as well. Attached is the patch
>>>> implementing it on top of the patch I posted here for the master branch:
>>>> https://www.pgpool.net/pipermail/pgpool-general/2021-April/007594.html
>>>> This patch actually has a small window between here:
>>>>
>>>> 		if (check_failback && !Req_info->switching && slot &&
>>>> 			Req_info->follow_primary_count == 0)
>>>>
>>>> and here:
>>>>
>>>> 				ereport(LOG,
>>>> 					(errmsg("request auto failback, node id:%d", node)));
>>>> 				/* get current time to use auto_failback_interval */
>>>> 				now = time(NULL);
>>>> 				auto_failback_interval = now + pool_config->auto_failback_interval;
>>>> 				send_failback_request(node, true, REQ_DETAIL_CONFIRMED);
>>>>
>>>> because a follow primary command might start right after the
>>>> Req_info->follow_primary_count check. I think the window is small and
>>>> probably harmless in the wild. If you think it's not so small, we could
>>>> take an exclusive lock like in detach_false_primary to plug the window.
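For illustration only, here is a minimal sketch of how that window could be
plugged with such a lock. The helper names
pool_acquire_follow_primary_lock()/pool_release_follow_primary_lock() and
their signatures are assumptions modelled on the follow_primary_lock work
referenced at the top of this mail, not confirmed API:

	/*
	 * Sketch only: hold the follow primary lock across both the check and
	 * the failback request, so that a follow primary command cannot start
	 * in between.  The lock helper names are assumptions.
	 */
	if (check_failback && !Req_info->switching && slot)
	{
		/* block until any running follow primary command finishes */
		pool_acquire_follow_primary_lock(true);

		if (Req_info->follow_primary_count == 0)
		{
			ereport(LOG,
					(errmsg("request auto failback, node id:%d", node)));

			/* get current time to use auto_failback_interval */
			now = time(NULL);
			auto_failback_interval = now + pool_config->auto_failback_interval;

			send_failback_request(node, true, REQ_DETAIL_CONFIRMED);
		}

		pool_release_follow_primary_lock();
	}

Whether the lock should be held across send_failback_request() or released
earlier is a design decision for the real patch; the sketch only shows where
the critical section would sit.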
>>>> Also we have found that detach_false_primary should only run on the
>>>> leader watchdog node. Probably we should consider this for
>>>> auto_failback too.
>>> 
>>> I have started to look at this patch too. But the pgpool_setup command
>>> fails for me on the latest master branch with auto_failback_fixes-master.patch
>>> applied. I am investigating the cause now (it may be that my environment is bad).
>>> 
>>> As far as I can see, auto_failback_fixes-master.patch is good.
>>> And I think that Ishii-san's suggestion makes this patch better.
>>> 
>>> 
>>> 
>>>> Best regards,
>>>> --
>>>> Tatsuo Ishii
>>>> SRA OSS, Inc. Japan
>>>> English: http://www.sraoss.co.jp/index_en.php
>>>> Japanese: http://www.sraoss.co.jp
>>> 
>>> Best Regards,
>>> 
>>> -- 
>>> Takuma Hoshiai <hoshiai.takuma at nttcom.co.jp>
>> 

