[pgpool-committers: 2569] pgpool: Fix "select() system call interrupted" error.

Tatsuo Ishii ishii at postgresql.org
Fri May 29 09:32:17 JST 2015


Fix "select() system call interrupted" error.

The health check process reports the error above, followed by:

ERROR:  failed to make persistent db connection
DETAIL:  connection to host:"x.x.x.x:5432 failed

However, the health check neither triggers failover nor retries (even
if the retry parameter is configured). So apart from the annoying
message above, everything works fine.

I guess this is caused by a SIGCHLD interrupting select(2) while it
waits for the connect(2) call to complete. The likely source is a
pgpool child exiting because of child_life_time, which terminates a
child once it has been idle for the specified period. Since this
happens independently in each child process, the chance of hitting it
grows with the number of children (in the reported case,
num_init_children is 2000). So the problem occurs more easily if
1) num_init_children is big and 2) pgpool children go idle (no query
arrives from a client for child_life_time seconds).
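
To illustrate the suspected mechanism, here is a minimal standalone
sketch (not pgpool code). The function name and the 200 ms delay are
made up for the demonstration. It installs a SIGCHLD handler, forks a
child that exits shortly afterwards, and checks whether a select(2)
call in the parent fails with EINTR.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* The handler does nothing; it only has to exist so that SIGCHLD is
 * delivered to the process and interrupts the blocking system call. */
static void on_sigchld(int signo)
{
    (void) signo;
}

/* Hypothetical demo function: returns 1 if select(2) failed with EINTR,
 * 0 if it did not, and -1 if the setup itself failed. */
static int select_interrupted_by_sigchld(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGCHLD, &sa, NULL) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
    {
        /* Child: exit after a short delay so that the parent is very
         * likely already blocked in select(2) when SIGCHLD arrives. */
        struct timespec delay = {0, 200 * 1000 * 1000};
        nanosleep(&delay, NULL);
        _exit(0);
    }

    /* Parent: wait on no descriptors at all, for up to 5 seconds. */
    struct timeval tv = {5, 0};
    int rc = select(0, NULL, NULL, NULL, &tv);
    int saved_errno = errno;

    /* Reap the child; waitpid(2) can itself be interrupted. */
    while (waitpid(pid, NULL, 0) < 0 && errno == EINTR)
        ;

    return (rc == -1 && saved_errno == EINTR) ? 1 : 0;
}
```

With many children exiting at independent times, a SIGCHLD arriving
during the health check's select(2) becomes correspondingly more likely.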

We assumed that a system call interruption could only be caused by
SIGALRM, which pgpool sets up to detect a timeout of the connect(2)
call if health_check_timeout is non-zero. However, we did not think
about the case above.

The fix is: if select(2) is interrupted by a signal, check the
health_check_timeout variable. If it is not set, we can assume that
the interruption was caused by something other than SIGALRM, and retry
the select(2).
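
The idea can be sketched roughly as follows. This is a simplified
illustration, not the code in the commit: wait_writable() and its
parameters are hypothetical, and the 1 second select(2) timeout is
chosen only for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait until fd becomes writable, which is how
 * completion of a non-blocking connect(2) is reported.
 *
 * health_check_timeout stands in for the pgpool setting of the same
 * name; 0 means that no SIGALRM-based timeout is armed.
 *
 * Returns 1 if fd is writable, 0 if select(2) timed out, and -1 on an
 * error or on an interruption that may have been the SIGALRM timeout.
 */
static int wait_writable(int fd, int health_check_timeout)
{
    for (;;)
    {
        fd_set wset;
        /* Re-initialized on every pass because select(2) may modify
         * both the descriptor set and the timeout. */
        struct timeval tv = {1, 0};

        FD_ZERO(&wset);
        FD_SET(fd, &wset);

        int rc = select(fd + 1, NULL, &wset, NULL, &tv);
        if (rc > 0)
            return 1;
        if (rc == 0)
            return 0;

        /* rc == -1: with no timeout armed, EINTR cannot come from
         * SIGALRM (for example, it may be SIGCHLD), so just retry. */
        if (errno == EINTR && health_check_timeout == 0)
            continue;

        return -1;
    }
}
```

Note that in this sketch each retry starts a fresh 1 second wait, so
repeated interruptions can extend the total wait.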

The original bug report is [pgpool-general: 3756] Connection Interrupted.
Patch created by me, with enhancements from Usama.

I also got a similar report (but on 3.3) from another user, so I will
make a similar commit to the 3.3 stable tree as well.

Branch
------
master

Details
-------
http://git.postgresql.org/gitweb?p=pgpool2.git;a=commitdiff;h=90ae5d8531907c66099d0ceb12f606e12e9932cf

Modified Files
--------------
src/protocol/pool_connection_pool.c |   22 +++++++++++++++++++---
1 file changed, 19 insertions(+), 3 deletions(-)


