[Pgpool-general] Fwd: PGPOOL II 2.3.3 hang in ssl mode

sean finney seanius at seanius.net
Wed May 19 18:55:35 UTC 2010


hi guys,

sorry, i have been busy with a few other things at work, and won't have
time this week to look at anything pgpool/ssl related.  if the problem
persists through next week i should have some time... it sounds like
you might have already found something, though :)


	sean

On Thu, May 20, 2010 at 12:41:19AM +0900, Tatsuo Ishii wrote:
> > Thanks for the info. 
> > 
> > > I have changed the source pool_stream.c as follows:
> > > 
> > > #define READBUF 102400
> > > #define WRITEBUF 819200
> > > 
> > > It solved the previous problem.
> > > Now the image is uploaded.
> > 
> > Before pgpool-II reads data from the SSL layer, it checks the socket
> > using select(2) to see if data is available. The problem is that
> > OpenSSL does its own buffering, so select(2) can indicate that
> > there is no data while pending data remains in OpenSSL's buffer.
> > Enlarging pgpool-II's buffers as you did, so that all pending data in
> > the OpenSSL buffer is read out, might have solved part of the
> > problem, I suspect.
> > 
> > But I am not sure that it solves the problem completely; in fact you
> > still have a problem. I will investigate further, but if somebody on
> > this list who is really familiar with OpenSSL could help me, I would
> > really appreciate it.
> 
> Googling suggested using a function called SSL_pending(), which tells
> us whether the SSL buffer has any pending data. Included is a patch
> that seems to solve the problem by using it. Please try it. It works
> even without the READBUF/WRITEBUF enlargement. I also modified
> pool_ssl_read() by borrowing code from PostgreSQL, which seems to do
> a better job.
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese: http://www.sraoss.co.jp
> 
> > > But it cannot be loaded in the web application.
> > > 
> > > The following query runs from our application:
> > > 
> > > SELECT p.proname, p.oid
> > > FROM pg_catalog.pg_proc p, pg_catalog.pg_namespace n
> > > WHERE p.pronamespace = n.oid AND n.nspname = 'pg_catalog'
> > >   AND (proname = 'lo_open' OR proname = 'lo_close' OR proname = 'lo_creat'
> > >     OR proname = 'lo_unlink' OR proname = 'lo_lseek' OR proname = 'lo_tell'
> > >     OR proname = 'loread' OR proname = 'lowrite')
> > > 
> > > 
> > > Pgpool II hangs at:
> > > 
> > > 2010-05-17 17:24:14 DEBUG: pid 29845: detect_error: kind: V
> > > 2010-05-17 17:24:14 DEBUG: pid 29845: detect_error: kind: V
> > > 2010-05-17 17:24:14 DEBUG: pid 29845: read_kind_from_backend: read kind from 0 th backend V NUM_BACKENDS: 1
> > > 2010-05-17 17:24:14 DEBUG: pid 29845: pool_process_query: kind from backend: V
> > > 
> > > In the PostgreSQL log:
> > > 
> > > fastpath function call: "lo_open" (OID 952)
> > > fastpath function call: "lo_tell" (OID 958)
> > > fastpath function call: "lo_lseek" (OID 956)
> > > fastpath function call: "lo_tell" (OID 958)
> > > fastpath function call: "lo_lseek" (OID 956)
> > > fastpath function call: "lo_lseek" (OID 956)
> > > fastpath function call: "loread" (OID 954)
> > > 
> > > 
> > > Any help on how to solve it, please?
> > > 
> > > On Tue, May 11, 2010 at 8:15 AM, Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
> > > 
> > > > Sean,
> > > >
> > > > It seems the problem only occurs when SSL between pgpool-II and
> > > > PostgreSQL is enabled. It also seems that any SELECT query which
> > > > returns a non-trivial number of rows hangs.
> > > >
> > > > Any idea why the problem occurs? Running pgpool-II in debug mode
> > > > shows that pgpool-II hangs here:
> > > >
> > > > 2010-05-11 09:49:50 DEBUG: pid 2460: pool_process_query: kind from backend: D
> > > > 2010-05-11 09:49:50 DEBUG: pid 2460: read_kind_from_backend: read kind from 0 th backend D NUM_BACKENDS: 1
> > > > 2010-05-11 09:49:50 DEBUG: pid 2460: pool_process_query: kind from backend: D
> > > > --
> > > > Tatsuo Ishii
> > > > SRA OSS, Inc. Japan
> > > > English: http://www.sraoss.co.jp/index_en.php
> > > > Japanese: http://www.sraoss.co.jp
> > > >
> > > >
> > > > > In addition to the above, I found the following.
> > > > >
> > > > > When I use the following query, it works:
> > > > >
> > > > > SELECT d.datname as "Name",
> > > > >        r.rolname as "Owner",
> > > > >        d.encoding as "Encoding"
> > > > > FROM pg_catalog.pg_database d
> > > > >   JOIN pg_catalog.pg_roles r ON d.datdba = r.oid
> > > > > ORDER BY 1
> > > > >
> > > > > But the following query does not work:
> > > > >
> > > > > SELECT d.datname as "Name",
> > > > >        r.rolname as "Owner",
> > > > >        pg_catalog.pg_encoding_to_char(d.encoding) as "Encoding"
> > > > > FROM pg_catalog.pg_database d
> > > > >   JOIN pg_catalog.pg_roles r ON d.datdba = r.oid
> > > > > ORDER BY 1
> > > > >
> > > > > The function call "pg_catalog.pg_encoding_to_char(d.encoding)" somehow
> > > > > makes Pgpool hang in SSL mode.
> > > > >
> > > > > ---------- Forwarded message ----------
> > > > > From: AI Rumman <rummandba at gmail.com>
> > > > > Date: Sun, May 9, 2010 at 12:06 PM
> > > > > Subject: PGPOOL II 2.3.3 hang in ssl mode
> > > > > To: pgpool-general at pgfoundry.org
> > > > >
> > > > >
> > > > > I am using Pgpool II 2.3.3 with PostgreSQL 8.3.8.
> > > > >
> > > > > When I use the command \l from a PostgreSQL client, the query works
> > > > > perfectly.
> > > > >
> > > > > But if I use the command from a pgpool II client which is connected to
> > > > > PostgreSQL in SSL mode, it hangs.
> > > > >
> > > > > Again, if I use the command from a pgpool II client in non-SSL mode, it
> > > > > works fine.
> > > > >
> > > > > Any help please.
> > > >

> Index: pool_process_query.c
> ===================================================================
> RCS file: /cvsroot/pgpool/pgpool-II/pool_process_query.c,v
> retrieving revision 1.202
> diff -c -r1.202 pool_process_query.c
> *** pool_process_query.c	10 May 2010 09:35:45 -0000	1.202
> --- pool_process_query.c	19 May 2010 15:34:16 -0000
> ***************
> *** 1025,1030 ****
> --- 1025,1039 ----
>   	struct timeval timeout;
>   	struct timeval *timeoutp;
>   
> + 	/*
> + 	 * If SSL is enabled, we need to check SSL internal buffer
> + 	 * is empty or not first. Otherwise select(2) will stuck.
> + 	 */
> + 	if (cp->ssl_active > 0 && SSL_pending(cp->ssl) > 0)
> + 	{
> + 		return 0;
> + 	}
> + 		
>   	fd = cp->fd;
>   
>   	if (timeoutsec > 0)
> ***************
> *** 3420,3425 ****
> --- 3429,3441 ----
>   {
>   	int i;
>   
> + 	/*
> + 	 * If SSL is enabled, we need to check SSL internal buffer
> + 	 * is empty or not first.
> + 	 */
> + 	if (frontend->ssl_active > 0 && SSL_pending(frontend->ssl) > 0 && !in_progress)
> + 		return 0;
> + 
>   	if (frontend->len > 0 && !in_progress)
>   		return 0;
>   
> ***************
> *** 3428,3433 ****
> --- 3444,3457 ----
>   		if (!VALID_BACKEND(i))
>   			continue;
>   
> + 		/*
> + 		 * If SSL is enabled, we need to check SSL internal buffer
> + 		 * is empty or not first.
> + 		 */
> + 		if (CONNECTION(backend, i)->ssl_active > 0 &&
> + 			SSL_pending(CONNECTION(backend, i)->ssl) > 0)
> + 			return 0;
> + 
>   		if (CONNECTION(backend, i)->len > 0)
>   			return 0;
>   	}
> Index: pool_ssl.c
> ===================================================================
> RCS file: /cvsroot/pgpool/pgpool-II/pool_ssl.c,v
> retrieving revision 1.5
> diff -c -r1.5 pool_ssl.c
> *** pool_ssl.c	4 Feb 2010 00:31:21 -0000	1.5
> --- pool_ssl.c	19 May 2010 15:34:16 -0000
> ***************
> *** 124,130 ****
>   }
>   
>   int pool_ssl_read(POOL_CONNECTION *cp, void *buf, int size) {
> ! 	return SSL_read(cp->ssl, buf, size);
>   }
>   
>   int pool_ssl_write(POOL_CONNECTION *cp, const void *buf, int size) {
> --- 124,178 ----
>   }
>   
>   int pool_ssl_read(POOL_CONNECTION *cp, void *buf, int size) {
> ! 	int n;
> ! 	int err;
> ! 
> !  retry:
> ! 	errno = 0;
> ! 	n = SSL_read(cp->ssl, buf, size);
> ! 	err = SSL_get_error(cp->ssl, n);
> ! 
> ! 	switch (err)
> ! 	{
> ! 		case SSL_ERROR_NONE:
> ! 			break;
> ! 		case SSL_ERROR_WANT_READ:
> ! 			n = 0;
> ! 			break;
> ! 		case SSL_ERROR_WANT_WRITE:
> ! 
> ! 			/*
> ! 			 * Returning 0 here would cause caller to wait for read-ready,
> ! 			 * which is not correct since what SSL wants is wait for
> ! 			 * write-ready.  The former could get us stuck in an infinite
> ! 			 * wait, so don't risk it; busy-loop instead.
> ! 			 */
> ! 			goto retry;
> ! 
> ! 		case SSL_ERROR_SYSCALL:
> ! 			if (n == -1)
> ! 			{
> ! 				pool_error("SSL_read error: %d", err);
> ! 			}
> ! 			else
> ! 			{
> ! 				pool_error("SSL_read error: EOF detected");
> ! 				n = -1;
> ! 			}
> ! 			break;
> ! 
> ! 		case SSL_ERROR_SSL:
> ! 		case SSL_ERROR_ZERO_RETURN:
> ! 			perror_ssl("SSL_read");
> ! 			n = -1;
> ! 			break;
> ! 		default:
> ! 			perror_ssl("SSL_read");
> ! 			n = -1;
> ! 			break;
> ! 	}
> ! 
> ! 	return n;
>   }
>   
>   int pool_ssl_write(POOL_CONNECTION *cp, const void *buf, int size) {
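
For anyone following the thread, the hang described above (select(2)
reporting nothing to read while data is still sitting in the SSL
layer's own buffer) can be reproduced in miniature without OpenSSL.
The sketch below is only an illustration under stated assumptions: a
pipe stands in for the socket, and buffered_conn, conn_pending(),
conn_read(), fd_readable() and conn_data_ready() are hypothetical
names that exist neither in pgpool-II nor in OpenSSL. conn_pending()
plays the role of SSL_pending(), and conn_data_ready() mirrors the
"check the internal buffer first, then fall back to select(2)" idea
of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for an SSL connection: like an SSL library, it
 * may pull more bytes off the descriptor than the caller asked for and
 * keep the remainder in a private buffer. */
typedef struct {
    int    fd;
    char   buf[4096];
    size_t len;   /* number of valid bytes in buf */
    size_t off;   /* next unread position in buf  */
} buffered_conn;

/* Analogue of SSL_pending(): bytes readable without touching the fd. */
static size_t conn_pending(const buffered_conn *c)
{
    return c->len - c->off;
}

/* Analogue of SSL_read(): refill the private buffer only when it is
 * empty, then hand out at most `size` bytes from it. */
static ssize_t conn_read(buffered_conn *c, void *out, size_t size)
{
    assert(c != NULL && out != NULL);

    if (conn_pending(c) == 0) {
        ssize_t n = read(c->fd, c->buf, sizeof c->buf);
        if (n <= 0)
            return n;           /* EOF or error, passed through */
        c->len = (size_t) n;
        c->off = 0;
    }

    size_t avail = conn_pending(c);
    if (size > avail)
        size = avail;
    memcpy(out, c->buf + c->off, size);
    c->off += size;
    return (ssize_t) size;
}

/* Plain select(2) check: 1 if readable, 0 on timeout, -1 on error. */
static int fd_readable(int fd, int timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    int rc;

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    rc = select(fd + 1, &rfds, NULL, NULL, &tv);
    return rc > 0 ? 1 : rc;
}

/* The idea of the fix: consult the private buffer before select(2),
 * because select(2) only knows about the kernel socket buffer. */
static int conn_data_ready(const buffered_conn *c, int timeout_ms)
{
    if (conn_pending(c) > 0)
        return 1;
    return fd_readable(c->fd, timeout_ms);
}
```

If 11 bytes are written to the pipe and only 5 are requested through
conn_read(), the remaining 6 stay in the private buffer. At that point
fd_readable() times out even though data is available, while
conn_data_ready() reports it immediately. That is the same situation
the SSL_pending() checks in the patch are meant to catch.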
