[pgpool-committers: 2567] pgpool: Fix "select() system call interrupted" error.

Tatsuo Ishii ishii at postgresql.org
Fri May 29 09:32:17 JST 2015


Fix "select() system call interrupted" error.

The health check process emits the error above, followed by:

ERROR:  failed to make persistent db connection
DETAIL:  connection to host:"x.x.x.x:5432 failed

But the health check neither triggers failover nor retries (if the
retry parameter is configured). So apart from the annoying message
above, everything works fine.

I guess this is caused by a SIGCHLD interrupt while select(2) is
waiting for completion of the connect(2) call. It is likely that a
pgpool child died because of child_life_time. The directive is
triggered if a child is idle for the specified period. Since this
happens independently in each child process, it is more frequent if
num_init_children is big (in the reported case, it is 2000). So the
case could occur more easily if 1) num_init_children is big and
2) pgpool children go into idle state (no query arrives from a client
for child_life_time seconds).

We assumed that a system call could be interrupted by SIGALRM, which
is set by pgpool to detect a timeout of the connect(2) call if
health_check_timeout is non-zero. However, we did not think about the
case above.

The fix is: if select(2) is interrupted by a signal, check the
health_check_timeout variable; if it is not set, we can assume that
the interruption was caused by something other than SIGALRM, and
retry the select(2).
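
The retry logic described above can be sketched roughly as follows.
This is only an illustration under stated assumptions, not the code
in pool_connection_pool.c: wait_ready(), timer_expired and the demo
helpers are hypothetical names, the descriptor is an idle pipe rather
than a socket with a pending connect(2), and the full timeout is
simply restarted after each interruption.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical flag, set only when the health check timer (SIGALRM) fires. */
static volatile sig_atomic_t timer_expired = 0;
/* Counts signals other than SIGALRM (here: SIGCHLD). */
static volatile sig_atomic_t other_signals = 0;

static void on_alarm(int sig) { (void) sig; timer_expired = 1; }
static void on_other(int sig) { (void) sig; other_signals++; }

static void install(int signo, void (*handler)(int))
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    sigaction(signo, &sa, NULL);
}

/*
 * Wait until fd becomes readable.
 * Returns 1 if ready, 0 if select() timed out, -1 on a real error or
 * when the interruption came from the health check timer.
 */
static int wait_ready(int fd, int timeout_sec)
{
    assert(fd >= 0 && fd < FD_SETSIZE);
    for (;;)
    {
        fd_set rset;
        struct timeval tv;
        int sts;

        FD_ZERO(&rset);
        FD_SET(fd, &rset);
        tv.tv_sec = timeout_sec;    /* simplification: full timeout each try */
        tv.tv_usec = 0;

        sts = select(fd + 1, &rset, NULL, NULL, &tv);
        if (sts > 0)
            return 1;
        if (sts == 0)
            return 0;
        if (errno == EINTR && !timer_expired)
            continue;   /* interrupted by something other than SIGALRM */
        return -1;
    }
}

/* A child exits while the parent sits in select(): SIGCHLD interrupts it. */
static int demo_sigchld(void)
{
    int fds[2], r;
    pid_t pid;
    struct timespec delay = {0, 200 * 1000 * 1000};    /* 200 ms */

    install(SIGCHLD, on_other);
    timer_expired = 0;
    if (pipe(fds) != 0 || (pid = fork()) < 0)
        return -2;
    if (pid == 0)
    {
        nanosleep(&delay, NULL);
        _exit(0);
    }
    r = wait_ready(fds[0], 1);  /* nothing is written: a timeout is expected */
    waitpid(pid, NULL, 0);
    close(fds[0]);
    close(fds[1]);
    return r;
}

/* The timer fires while the process sits in select(): no retry. */
static int demo_sigalrm(void)
{
    int fds[2], r;
    struct itimerval it;

    install(SIGALRM, on_alarm);
    timer_expired = 0;
    if (pipe(fds) != 0)
        return -2;
    memset(&it, 0, sizeof(it));
    it.it_value.tv_usec = 200 * 1000;   /* 200 ms */
    setitimer(ITIMER_REAL, &it, NULL);
    r = wait_ready(fds[0], 1);
    close(fds[0]);
    close(fds[1]);
    return r;
}
```

In this sketch demo_sigchld() ends with an ordinary select() timeout
instead of an error, because the SIGCHLD interruption is retried,
while demo_sigalrm() gives up as soon as the timer flag is set.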

Original bug report is [pgpool-general: 3756] Connection Interrupted.
Patch created by me. Enhancement from Usama.

Also I got a similar report (but on 3.3) from another user. So I will
make a similar commit to the 3.3 stable tree as well.

Branch
------
V3_3_STABLE

Details
-------
http://git.postgresql.org/gitweb?p=pgpool2.git;a=commitdiff;h=dfeaca29bce9e927d688bbb11a5722ad9eea0888

Modified Files
--------------
pool_connection_pool.c |   21 ++++++++++++++++++---
1 file changed, 18 insertions(+), 3 deletions(-)


