[Pgpool-general] Health Check issues
Tatsuo Ishii
ishii at sraoss.co.jp
Wed Jul 8 00:21:47 UTC 2009
> Hi,
>
> We started working with PGPOOL II, and over the last week we ran into the
> following issues:
> I have the following configuration:
>
> 2 servers referred to herein as LOCAL and REMOTE running CentOS.
> PGPOOL running on LOCAL
> PGPOOL VERSION: pgpool-II-2.2.2
>
> We tested the following scenarios:
>
> 1. If I start PGPOOL II on the LOCAL host while the PostgreSQL server on the
> REMOTE is not running, and then invoke a PSQL client on the LOCAL (over PGPOOL
> port 9999), the PSQL aborts.
> Q: We read in the documentation that PGPOOL closes all sessions (kills
> children) and then the client needs to reopen the session:
> (a) Why?
> (b) Is there a way to just shut off the connection to the REMOTE and
> continue working against the local DB?
The answer is given in:
Subject: Re: [Pgpool-general] SQL sessions die when a node fails
From: Tatsuo Ishii <ishii at sraoss.co.jp>
To: murojc at gmail.com
Cc: pgpool-general at pgfoundry.org
Date: Mon, 06 Jul 2009 09:22:34 +0900 (JST)
If you have additional questions, please let me know.
> 2. We set all the health check parameters (timeout, interval and user) and
> then STOPPED (using "kill -STOP") all the postgres processes on the
> REMOTE. PGPOOL (on the LOCAL) does not detect that and continues
> running. By using "strace" on the PGPOOL parent process we could see that
> the READ from REMOTE fails with ERESTARTSYS but the WRITE is successful, and
> therefore the system does not cut off the failed REMOTE.
> When we changed the code so that it does NOT ignore the READ
> failure, all children were killed (both LOCAL and REMOTE).
> Q: Is the failure to detect the connectivity issue evidenced by the READ
> a PGPOOL bug, or is there a reason for that behavior?
>
> Below is the diff showing the change we introduced to the health check (in
> main.c):
>
> [root at lx tmp]# diff pgpool-II-2.2.2.new/main.c pgpool-II-2.2.2/main.c
> 1438,1445c1438
> < if (read(fd, &kind, 1) < 0) {
> < pool_error("health check failed during read. host %s at port %d is down. reason: %s. Perhaps wrong health check user?",
> < BACKEND_INFO(i).backend_hostname,
> < BACKEND_INFO(i).backend_port,
> < strerror(errno));
> < close(fd);
> < return i+1;
> < }
> ---
> > read(fd, &kind, 1);
I don't recall why health_check() ignores the return value of read() :-<
But it seems you are right. Will fix. Thanks.
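
For reference, a minimal sketch of checking that read() in isolation. This is
not the pgpool-II code: the helper name hc_read_kind and the hc_read_result
enum are hypothetical. It separates three outcomes that a bare read() call
discards: a byte arrived, the peer closed the connection (read() returns 0,
which a "< 0" test alone would not catch), or an error occurred, including
EINTR when a signal interrupts the blocking call.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical outcome codes; not part of pgpool-II. */
typedef enum {
    HC_READ_OK,     /* one byte was read into *kind */
    HC_READ_EOF,    /* peer closed the connection */
    HC_READ_ERROR   /* read() failed; errno says why (EINTR, EBADF, ...) */
} hc_read_result;

/*
 * Read the single "kind" byte of a backend reply from fd.
 * EINTR is deliberately NOT retried here: if the timeout is delivered
 * as a signal, an interrupted read is exactly the failure to report.
 */
static hc_read_result hc_read_kind(int fd, char *kind)
{
    ssize_t n;

    assert(kind != NULL);
    n = read(fd, kind, 1);
    if (n == 1)
        return HC_READ_OK;
    if (n == 0)
        return HC_READ_EOF;
    return HC_READ_ERROR;
}
```

A caller would treat anything other than HC_READ_OK as a failed health check
and use strerror(errno) in the log message only for HC_READ_ERROR, since errno
is not set on end of file.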
--
Tatsuo Ishii
SRA OSS, Inc. Japan