[Pgpool-general] Health Check issues

Tatsuo Ishii ishii at sraoss.co.jp
Wed Jul 8 00:21:47 UTC 2009


> Hi,
> 
> We started working with PGPOOL II, and over the last week we ran into the
> following issues.
> We have the following configuration:
> 
> 2 servers referred to herein as LOCAL and REMOTE running CentOS.
> PGPOOL running on LOCAL
> PGPOOL VERSION: pgpool-II-2.2.2
> 
> We tested the following scenarios:
> 
> 1. If I start PGPOOL II on the LOCAL host while the PostgreSQL server on
> REMOTE is not running, then invoking a PSQL client on LOCAL (over PGPOOL
> port 9999) makes PSQL abort.
> Q: We read in the documentation that PGPOOL closes all sessions (kills
> children) and that the client then needs to reopen the session:
> (a) Why?
> (b) Is there a way to just shut off the connection to REMOTE and
> continue working against the local DB?

The answer is given in this earlier post:

Subject: Re: [Pgpool-general] SQL sessions die when a node fails
From: Tatsuo Ishii <ishii at sraoss.co.jp>
To: murojc at gmail.com
Cc: pgpool-general at pgfoundry.org
Date: Mon, 06 Jul 2009 09:22:34 +0900 (JST)

If you have additional questions, please let me know.

> 2. We set all the health check parameters (timeout, interval and user) and
> then STOPPED (using "kill -STOP") all the postgres processes on REMOTE.
> PGPOOL (on LOCAL) does not detect this and continues running. By using
> "strace" on the PGPOOL parent process we could see that the READ from
> REMOTE fails with ERESTARTSYS but the WRITE is successful, and
> therefore the system does not cut off the failed REMOTE.
> When we changed the code so that it does NOT ignore the READ
> failure, all children were killed (both LOCAL and REMOTE).
> Q: Is the failure to detect the connectivity issue revealed by the READ
> a PGPOOL bug, or is there a reason for that behavior?
> 
> Below is the diff showing the change we introduced to the health check (in  
> main.c):
> 
> [root at lx tmp]# diff pgpool-II-2.2.2.new/main.c pgpool-II-2.2.2/main.c
> 1438,1445c1438
> < if (read(fd, &kind, 1) < 0) {
> < pool_error("health check failed during read. host %s at port %d is down. reason: %s. Perhaps wrong health check user?",
> < BACKEND_INFO(i).backend_hostname,
> < BACKEND_INFO(i).backend_port,
> < strerror(errno));
> < close(fd);
> < return i+1;
> < }
> ---
> > read(fd, &kind, 1);

I don't recall why health_check() ignores the result of read() :-< But it seems you are
right. Will fix. Thanks.
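
For illustration only, here is a minimal sketch of the kind of check
discussed above. It is not the actual pgpool-II fix. The helper name
read_one_byte() is hypothetical, and the sketch assumes the health check
timeout is delivered by a signal that interrupts read(), so an
interrupted read is reported as a failure rather than retried. (strace
shows the kernel-internal ERESTARTSYS; user space normally sees EINTR
or a restarted call instead.) Unlike the patch above, it also treats
end-of-file as a failure.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of pgpool-II: read exactly one byte
 * from fd into *kind.
 *
 * Returns 0 on success and -1 if read() fails or hits end-of-file.
 * An interrupted read (errno == EINTR) is deliberately NOT retried:
 * the assumption is that the interrupting signal is the health check
 * timer, so retrying would defeat the timeout.
 */
static int read_one_byte(int fd, char *kind)
{
    ssize_t n = read(fd, kind, 1);

    if (n < 0) {
        /* read() itself failed; errno says why (EINTR, ECONNRESET, ...) */
        fprintf(stderr, "health check read failed: %s\n", strerror(errno));
        return -1;
    }
    if (n == 0) {
        /* peer closed the connection before sending anything */
        fprintf(stderr, "health check read failed: connection closed\n");
        return -1;
    }
    return 0;
}
```

A caller would close the descriptor and mark the backend as failed when
this returns -1, in the same way the patch above returns i+1.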
--
Tatsuo Ishii
SRA OSS, Inc. Japan