[pgpool-hackers: 4150] Re: exit handler in pgpool main process

Tatsuo Ishii ishii at sraoss.co.jp
Mon Apr 11 18:18:52 JST 2022


> While inspecting buildfarm failures, I noticed that the exit handler
> in the pgpool main process was interrupted while it was executing.
> 
> https://pgpool.net/buildfarm/20220408-buildfarm-CentOS7.tar.gz
> 
>> Subject: [pgpool-buildfarm: 2070] pgpool-II buildfarm results CentOS7
>> From: buildfarm at pgpool.net
>> To: pgpool-buildfarm at pgpool.net
>> Date: Sat, 09 Apr 2022 08:16:00 +0900
>> Sender: "pgpool-buildfarm" <pgpool-buildfarm-bounces at pgpool.net>
>> User-Agent: Heirloom mailx 12.5 7/5/10
>> 
>> =========================================================================
>> * master  PostgreSQL 13  CentOS7
>> testing 011.watchdog_quorum_failover...timeout.
> 
> Here is an excerpt from pgpool.log before the timeout.
> 
> 2022-04-07 22:47:57.565: watchdog pid 16523: LOG:  I am the cluster leader node
> 2022-04-07 22:47:57.565: watchdog pid 16523: DETAIL:  our declare coordinator message is accepted by all nodes
> 2022-04-07 22:47:57.565: watchdog pid 16523: LOG:  setting the local node "localhost:11100 Linux 863294c37d9f" as watchdog cluster leader
> 2022-04-07 22:47:57.565: watchdog pid 16523: LOG:  signal_user1_to_parent_with_reason(1)
> 2022-04-07 22:47:57.565: watchdog pid 16523: LOG:  I am the cluster leader node but we do not have enough nodes in cluster
> 2022-04-07 22:47:57.565: watchdog pid 16523: DETAIL:  waiting for the quorum to start escalation process
> 2022-04-07 22:47:57.565: main pid 16515: LOG:  Pgpool-II parent process received SIGUSR1
> 2022-04-07 22:47:57.565: main pid 16515: LOG:  Pgpool-II parent process received watchdog state change signal from watchdog
> 2022-04-07 22:47:57.565: watchdog pid 16523: LOG:  new IPC connection received
> 2022-04-07 22:47:58.566: watchdog pid 16523: LOG:  adding watchdog node "localhost:11200 Linux 863294c37d9f" to the standby list
> 2022-04-07 22:47:58.566: watchdog pid 16523: LOG:  quorum found
> 2022-04-07 22:47:58.566: watchdog pid 16523: DETAIL:  starting escalation process
> 2022-04-07 22:47:58.567: main pid 16515: LOG:  shutting down
> 2022-04-07 22:47:58.567: main pid 16515: LOG:  terminating all child processes
> 2022-04-07 22:47:58.582: watchdog_utility pid 16751: LOG:  watchdog: escalation started
> 2022-04-07 22:57:03.253: main pid 16515: LOG:  shutting down
> 2022-04-07 22:57:03.254: main pid 16515: LOG:  terminating all child processes
> 
> The main process entered the exit signal handler at 22:47:58.567 and
> then, while it was reaping child processes, it was interrupted again at
> 22:57:03.253.
> 
> The signal handler (exit_handler) is registered for SIGTERM, SIGINT
> and SIGQUIT. It first blocks most signals except SIGTERM, SIGQUIT
> and SIGALRM. As you know, SIGTERM is used for smart shutdown, and
> SIGQUIT is used for immediate shutdown. In my understanding, a signal
> handler is automatically protected from the same signal that invoked
> it. So if exit_handler is invoked by SIGTERM, the next SIGTERM will be
> blocked, BUT a different signal (that is, SIGINT or SIGQUIT) will not
> be blocked. So my theory is that exit_handler was invoked by one of
> SIGTERM, SIGINT or SIGQUIT, and was then interrupted by a different
> one of these signals. I think this should be avoided because it causes
> an infinite wait in terminate_childrens(), which is called from
> exit_handler, as we see in the buildfarm log.
> 
> I will think about a fix for this.
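
To illustrate the theory quoted above, here is a minimal standalone
sketch. It is not pgpool code: demo_handler and nested_depth are
hypothetical names made up for this example. A handler installed with
an empty sa_mask is entered for SIGTERM and then sends itself SIGINT.
Only the signal being handled is blocked automatically, so the handler
is entered a second time while the first invocation is still running.
When both signals are listed in sa_mask, the second signal stays
pending until the first invocation returns.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t depth = 0;      /* handler invocations currently active */
static volatile sig_atomic_t max_depth = 0;  /* deepest nesting observed so far */

/* Hypothetical stand-in for an exit handler shared by several signals. */
static void demo_handler(int sig)
{
    depth++;
    if (depth > max_depth)
        max_depth = depth;

    /* While handling SIGTERM, a different exit signal arrives. */
    if (sig == SIGTERM)
        raise(SIGINT);          /* raise() is async-signal-safe */

    depth--;
}

/*
 * Install demo_handler for SIGTERM and SIGINT, deliver SIGTERM to this
 * process, and return the deepest handler nesting that was observed.
 * If block_both is non-zero, both signals are added to sa_mask so that
 * neither can interrupt the handler while it runs.
 */
int nested_depth(int block_both)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = demo_handler;
    sigemptyset(&sa.sa_mask);
    if (block_both)
    {
        sigaddset(&sa.sa_mask, SIGTERM);
        sigaddset(&sa.sa_mask, SIGINT);
    }
    sigaction(SIGTERM, &sa, NULL);
    sigaction(SIGINT, &sa, NULL);

    depth = 0;
    max_depth = 0;
    raise(SIGTERM);
    return max_depth;
}
```

In a single-threaded process, nested_depth(0) should report a nesting
of 2 (the handler was interrupted by the second signal), while
nested_depth(1) should report 1.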

Attached is the patch. In the patch I protect the variable "exiting"
with a semaphore to make sure that only one instance of exit_handler
runs at a time.
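
The patch itself is not reproduced here. As a rough sketch of the
general idea only, the example below makes sure the shutdown body runs
once, but it swaps in a different technique from the patch: instead of
a semaphore, it blocks all three exit signals through sa_mask while
the handler runs and checks a sig_atomic_t flag. Apart from the flag
name "exiting", all names (guarded_exit_handler,
install_guarded_exit_handler, shutdown_run_count) are hypothetical,
and the counter is only a stand-in for terminating and reaping the
child processes.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t exiting = 0;        /* set once shutdown has begun */
static volatile sig_atomic_t shutdown_runs = 0;  /* times the shutdown body ran */

/* Hypothetical exit handler: the shutdown body runs at most once. */
static void guarded_exit_handler(int sig)
{
    (void) sig;

    if (exiting)
        return;                 /* shutdown already started; ignore */
    exiting = 1;

    shutdown_runs++;            /* stand-in for terminating children */
}

/*
 * Register the handler for SIGTERM, SIGINT and SIGQUIT. All three are
 * placed in sa_mask, so while the handler runs for one of them the
 * other two stay pending instead of interrupting it.
 * Returns 0 on success, -1 if sigaction() fails.
 */
int install_guarded_exit_handler(void)
{
    static const int exit_signals[] = { SIGTERM, SIGINT, SIGQUIT };
    const size_t nsignals = sizeof(exit_signals) / sizeof(exit_signals[0]);
    struct sigaction sa;
    size_t i;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = guarded_exit_handler;
    sigemptyset(&sa.sa_mask);
    for (i = 0; i < nsignals; i++)
        sigaddset(&sa.sa_mask, exit_signals[i]);

    for (i = 0; i < nsignals; i++)
    {
        if (sigaction(exit_signals[i], &sa, NULL) != 0)
            return -1;
    }
    return 0;
}

/* Report how many times the shutdown body has run. */
int shutdown_run_count(void)
{
    return shutdown_runs;
}
```

Because the three signals cannot interrupt each other inside the
handler, the check and the assignment of the flag are not raced by a
second exit signal. A real handler would normally not return after
shutdown, which this sketch does only so that the behaviour can be
observed.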

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
-------------- next part --------------
A non-text attachment was scrubbed...
Name: exit_handler.patch
Type: text/x-patch
Size: 2729 bytes
Desc: not available
URL: <http://www.pgpool.net/pipermail/pgpool-hackers/attachments/20220411/5bca7dac/attachment.bin>