[Pgpool-general] Fwd: PGPOOL II 2.3.3 hang in ssl mode
    Tatsuo Ishii 
    ishii at sraoss.co.jp
       
    Fri May 21 07:37:27 UTC 2010
    
    
  
Sean,
Glad to see your reply! Your work was great and I appreciate your
contribution. Please take a look at the issue when you have enough
time. Yes, I have come to some conclusions, but I would appreciate it
if you could check whether my modifications are sane at all.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
> hi guys,
> 
> sorry, i have been busy with a few other things at work, and won't have
> time this week to look at anything pgpool/ssl related.  if the problem
> persists through next week i should have some time... but it sounds like
> you might have already found something though :)
> 
> 
> 	sean
> 
> On Thu, May 20, 2010 at 12:41:19AM +0900, Tatsuo Ishii wrote:
> > > Thanks for the info. 
> > > 
> > > > I have changed the source pool_stream.c as follows:
> > > > 
> > > > #define READBUF 102400
> > > > #define WRITEBUF 819200
> > > > 
> > > > It solved the previous problem.
> > > > Now image is uploaded.
> > > 
> > > Before pgpool-II reads data from the SSL layer, it checks the socket
> > > using select(2) to see if data is available. The problem is that
> > > OpenSSL does its own buffering, so select(2) can indicate that
> > > there is no data while pending data remains in OpenSSL's buffer.
> > > Enlarging pgpool-II's buffers as you did, so that it reads out all
> > > pending data in the OpenSSL buffer, might have solved part of the
> > > problem, I suspect.
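
To make the stall described above concrete, here is a minimal sketch of the same effect without OpenSSL. It is only an analogy: a pipe stands in for the socket and a small user-space buffer stands in for OpenSSL's internal buffer. The names buffered_reader, br_read, br_pending, fd_readable and demo_stall are hypothetical and do not come from pgpool-II or OpenSSL.

```c
/* Analogy only: a pipe plays the socket, buffered_reader plays the
 * buffering layer. All names here are hypothetical. */
#include <assert.h>
#include <string.h>
#include <sys/select.h>
#include <unistd.h>

struct buffered_reader {
    int    fd;
    char   buf[64];
    size_t len;   /* bytes currently held in buf */
    size_t pos;   /* bytes already handed to the caller */
};

/* Hand out up to `size` bytes; refill from the fd only when buf is empty. */
static ssize_t br_read(struct buffered_reader *br, void *out, size_t size)
{
    if (br->pos == br->len) {
        ssize_t n = read(br->fd, br->buf, sizeof(br->buf));
        if (n <= 0)
            return n;
        br->len = (size_t) n;
        br->pos = 0;
    }
    size_t avail = br->len - br->pos;
    size_t take = size < avail ? size : avail;
    memcpy(out, br->buf + br->pos, take);
    br->pos += take;
    assert(br->pos <= br->len);
    return (ssize_t) take;
}

/* Bytes read off the fd but not yet consumed by the caller. */
static size_t br_pending(const struct buffered_reader *br)
{
    return br->len - br->pos;
}

/* 1 if select(2) reports fd readable right now, 0 if not, -1 on error. */
static int fd_readable(int fd)
{
    fd_set rfds;
    struct timeval tv = {0, 0};   /* poll, do not block */

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    int r = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (r < 0)
        return -1;
    return r > 0 ? 1 : 0;
}

/* Returns 1 if the stall condition is reproduced, 0 if not, -1 on error. */
static int demo_stall(void)
{
    int fds[2];
    char out[4];
    struct buffered_reader br = {0};
    int result = -1;

    if (pipe(fds) != 0)
        return -1;
    br.fd = fds[0];

    /* 8 bytes arrive; the caller asks for 4, but the reader drains all 8. */
    if (write(fds[1], "12345678", 8) == 8 &&
        br_read(&br, out, sizeof(out)) == 4) {
        int readable = fd_readable(fds[0]);
        result = (readable == 0 && br_pending(&br) == 4) ? 1 : 0;
    }
    close(fds[0]);
    close(fds[1]);
    return result;
}
```

demo_stall() returns 1 when select(2) reports the descriptor as not readable while bytes are still waiting in the user-space buffer. A caller that only consults select(2) would wait forever for data it already has.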
> > > 
> > > But I am not sure that it solves the problem completely. Actually
> > > you still have a problem. I will investigate further, but if
> > > somebody on this list who is really familiar with OpenSSL could
> > > help me, I would really appreciate it.
> > 
> > A Google search suggested using a function called SSL_pending(),
> > which tells us whether the SSL buffer has any pending data. Included
> > is the patch, which seems to solve the problem by using it. Please
> > try it. It works even without the READBUF/WRITEBUF enlargement. I
> > also modified pool_ssl_read() by borrowing code from PostgreSQL,
> > which seems to do a better job.
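
The idea behind the patch can be sketched in isolation: ask the buffering layer whether it already holds data before waiting on the descriptor. This is a simplified model under stated assumptions, not the pgpool-II code. The struct conn, its buffered field, wait_for_data and demo_wait are hypothetical; in real OpenSSL code the buffered check would be SSL_pending(ssl) > 0. The return convention here (1 means data is ready) is also an assumption and differs from the patched functions, which return 0 in that case.

```c
/* Simplified model of "check the internal buffer before select(2)".
 * All names here are hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

struct conn {
    int    fd;
    size_t buffered;  /* bytes already read off fd but not yet consumed */
};

/* 1 if data can be read without blocking, 0 on timeout, -1 on error. */
static int wait_for_data(const struct conn *c, int timeout_sec)
{
    fd_set rfds;
    struct timeval tv = {timeout_sec, 0};

    assert(c != NULL);

    /* Buffered data never shows up in select(2), so check it first. */
    if (c->buffered > 0)
        return 1;

    FD_ZERO(&rfds);
    FD_SET(c->fd, &rfds);
    int r = select(c->fd + 1, &rfds, NULL, NULL, &tv);
    if (r < 0)
        return -1;
    return r > 0 ? 1 : 0;
}

/* Returns 1 if all three cases behave as expected, 0 if not, -1 on error. */
static int demo_wait(void)
{
    int fds[2];
    int result = -1;

    if (pipe(fds) != 0)
        return -1;
    struct conn c = { fds[0], 0 };

    int empty = wait_for_data(&c, 0);      /* nothing anywhere */

    c.buffered = 4;                        /* data only in the buffer */
    int from_buffer = wait_for_data(&c, 0);

    c.buffered = 0;
    if (write(fds[1], "x", 1) == 1) {      /* data only on the descriptor */
        int from_fd = wait_for_data(&c, 0);
        result = (empty == 0 && from_buffer == 1 && from_fd == 1) ? 1 : 0;
    }
    close(fds[0]);
    close(fds[1]);
    return result;
}
```

With the buffered check in place the wait returns immediately whenever unread data is already in user space, so the size of READBUF no longer decides whether the caller stalls.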
> > --
> > Tatsuo Ishii
> > SRA OSS, Inc. Japan
> > English: http://www.sraoss.co.jp/index_en.php
> > Japanese: http://www.sraoss.co.jp
> > 
> > > > But it can't load in the web.
> > > > 
> > > > The following query runs from our application:
> > > > 
> > > > SELECT p.proname, p.oid
> > > > FROM pg_catalog.pg_proc p, pg_catalog.pg_namespace n
> > > > WHERE p.pronamespace = n.oid AND n.nspname = 'pg_catalog'
> > > >   AND (proname = 'lo_open' OR proname = 'lo_close' OR proname = 'lo_creat'
> > > >     OR proname = 'lo_unlink' OR proname = 'lo_lseek' OR proname = 'lo_tell'
> > > >     OR proname = 'loread' OR proname = 'lowrite')
> > > > 
> > > > 
> > > > Pgpool II hangs at:
> > > > 
> > > > 2010-05-17 17:24:14 DEBUG: pid 29845: detect_error: kind: V
> > > > 2010-05-17 17:24:14 DEBUG: pid 29845: detect_error: kind: V
> > > > 2010-05-17 17:24:14 DEBUG: pid 29845: read_kind_from_backend: read kind from
> > > > 0 th backend V NUM_BACKENDS: 1
> > > > 2010-05-17 17:24:14 DEBUG: pid 29845: pool_process_query: kind from backend:
> > > > V
> > > > 
> > > > At Postgresql log:
> > > > 
> > > > fastpath function call: "lo_open" (OID 952)
> > > > fastpath function call: "lo_tell" (OID 958)
> > > > fastpath function call: "lo_lseek" (OID 956)
> > > > fastpath function call: "lo_tell" (OID 958)
> > > > fastpath function call: "lo_lseek" (OID 956)
> > > > fastpath function call: "lo_lseek" (OID 956)
> > > > fastpath function call: "loread" (OID 954)
> > > > 
> > > > 
> > > > Any help on how to solve this would be appreciated.
> > > > 
> > > > On Tue, May 11, 2010 at 8:15 AM, Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
> > > > 
> > > > > Sean,
> > > > >
> > > > > It seems the problem only occurs when SSL between pgpool-II and
> > > > > PostgreSQL is enabled. Also, it seems that any SELECT query that
> > > > > returns a non-trivial number of rows hangs.
> > > > >
> > > > > Any idea why the problem occurs? Running pgpool-II in debug mode
> > > > > shows that pgpool-II hangs here:
> > > > >
> > > > > 2010-05-11 09:49:50 DEBUG: pid 2460: pool_process_query: kind from backend:
> > > > > D
> > > > > 2010-05-11 09:49:50 DEBUG: pid 2460: read_kind_from_backend: read kind from
> > > > > 0 th backend D NUM_BACKENDS: 1
> > > > > 2010-05-11 09:49:50 DEBUG: pid 2460: pool_process_query: kind from backend:
> > > > > D
> > > > > --
> > > > > Tatsuo Ishii
> > > > > SRA OSS, Inc. Japan
> > > > > English: http://www.sraoss.co.jp/index_en.php
> > > > > Japanese: http://www.sraoss.co.jp
> > > > >
> > > > >
> > > > > > > In addition to the above, I found the following.
> > > > > >
> > > > > > When I use the following query it works :
> > > > > >
> > > > > > SELECT d.datname as "Name",
> > > > > >        r.rolname as "Owner",
> > > > > >        d.encoding as "Encoding"
> > > > > > FROM pg_catalog.pg_database d
> > > > > >   JOIN pg_catalog.pg_roles r ON d.datdba = r.oid
> > > > > > ORDER BY 1
> > > > > >
> > > > > > But the following query does not work:
> > > > > >
> > > > > > SELECT d.datname as "Name",
> > > > > >        r.rolname as "Owner",
> > > > > >        pg_catalog.pg_encoding_to_char(d.encoding) as "Encoding"
> > > > > > FROM pg_catalog.pg_database d
> > > > > >   JOIN pg_catalog.pg_roles r ON d.datdba = r.oid
> > > > > > ORDER BY 1
> > > > > >
> > > > > > > The function call "pg_catalog.pg_encoding_to_char(d.encoding)" somehow
> > > > > > > makes Pgpool hang in SSL mode.
> > > > > >
> > > > > > ---------- Forwarded message ----------
> > > > > > From: AI Rumman <rummandba at gmail.com>
> > > > > > Date: Sun, May 9, 2010 at 12:06 PM
> > > > > > Subject: PGPOOL II 2.3.3 hang in ssl mode
> > > > > > To: pgpool-general at pgfoundry.org
> > > > > >
> > > > > >
> > > > > > I am using Pgpool II 2.3.3 with Postgresql 8.3.8.
> > > > > >
> > > > > > > When I use the \l command from a PostgreSQL client, the query works
> > > > > > > perfectly.
> > > > > > >
> > > > > > > But if I use the command from a Pgpool II client that is connected to
> > > > > > > PostgreSQL in SSL mode, it hangs.
> > > > > > >
> > > > > > > Again, if I use the command from a Pgpool II client in non-SSL mode, it
> > > > > > > works fine.
> > > > > >
> > > > > > Any help please.
> > > > >
> 
> > Index: pool_process_query.c
> > ===================================================================
> > RCS file: /cvsroot/pgpool/pgpool-II/pool_process_query.c,v
> > retrieving revision 1.202
> > diff -c -r1.202 pool_process_query.c
> > *** pool_process_query.c	10 May 2010 09:35:45 -0000	1.202
> > --- pool_process_query.c	19 May 2010 15:34:16 -0000
> > ***************
> > *** 1025,1030 ****
> > --- 1025,1039 ----
> >   	struct timeval timeout;
> >   	struct timeval *timeoutp;
> >   
> > + 	/*
> > + 	 * If SSL is enabled, we need to check SSL internal buffer
> > + 	 * is empty or not first. Otherwise select(2) will get stuck.
> > + 	 */
> > + 	if (cp->ssl_active > 0 && SSL_pending(cp->ssl) > 0)
> > + 	{
> > + 		return 0;
> > + 	}
> > + 		
> >   	fd = cp->fd;
> >   
> >   	if (timeoutsec > 0)
> > ***************
> > *** 3420,3425 ****
> > --- 3429,3441 ----
> >   {
> >   	int i;
> >   
> > + 	/*
> > + 	 * If SSL is enabled, we need to check SSL internal buffer
> > + 	 * is empty or not first.
> > + 	 */
> > + 	if (frontend->ssl_active > 0 && SSL_pending(frontend->ssl) > 0 && !in_progress)
> > + 		return 0;
> > + 
> >   	if (frontend->len > 0 && !in_progress)
> >   		return 0;
> >   
> > ***************
> > *** 3428,3433 ****
> > --- 3444,3457 ----
> >   		if (!VALID_BACKEND(i))
> >   			continue;
> >   
> > + 		/*
> > + 		 * If SSL is enabled, we need to check SSL internal buffer
> > + 		 * is empty or not first.
> > + 		 */
> > + 		if (CONNECTION(backend, i)->ssl_active > 0 &&
> > + 			SSL_pending(CONNECTION(backend, i)->ssl) > 0)
> > + 			return 0;
> > + 
> >   		if (CONNECTION(backend, i)->len > 0)
> >   			return 0;
> >   	}
> > Index: pool_ssl.c
> > ===================================================================
> > RCS file: /cvsroot/pgpool/pgpool-II/pool_ssl.c,v
> > retrieving revision 1.5
> > diff -c -r1.5 pool_ssl.c
> > *** pool_ssl.c	4 Feb 2010 00:31:21 -0000	1.5
> > --- pool_ssl.c	19 May 2010 15:34:16 -0000
> > ***************
> > *** 124,130 ****
> >   }
> >   
> >   int pool_ssl_read(POOL_CONNECTION *cp, void *buf, int size) {
> > ! 	return SSL_read(cp->ssl, buf, size);
> >   }
> >   
> >   int pool_ssl_write(POOL_CONNECTION *cp, const void *buf, int size) {
> > --- 124,178 ----
> >   }
> >   
> >   int pool_ssl_read(POOL_CONNECTION *cp, void *buf, int size) {
> > ! 	int n;
> > ! 	int err;
> > ! 
> > !  retry:
> > ! 	errno = 0;
> > ! 	n = SSL_read(cp->ssl, buf, size);
> > ! 	err = SSL_get_error(cp->ssl, n);
> > ! 
> > ! 	switch (err)
> > ! 	{
> > ! 		case SSL_ERROR_NONE:
> > ! 			break;
> > ! 		case SSL_ERROR_WANT_READ:
> > ! 			n = 0;
> > ! 			break;
> > ! 		case SSL_ERROR_WANT_WRITE:
> > ! 
> > ! 			/*
> > ! 			 * Returning 0 here would cause caller to wait for read-ready,
> > ! 			 * which is not correct since what SSL wants is wait for
> > ! 			 * write-ready.  The former could get us stuck in an infinite
> > ! 			 * wait, so don't risk it; busy-loop instead.
> > ! 			 */
> > ! 			goto retry;
> > ! 
> > ! 		case SSL_ERROR_SYSCALL:
> > ! 			if (n == -1)
> > ! 			{
> > ! 				pool_error("SSL_read error: %d", err);
> > ! 			}
> > ! 			else
> > ! 			{
> > ! 				pool_error("SSL_read error: EOF detected");
> > ! 				n = -1;
> > ! 			}
> > ! 			break;
> > ! 
> > ! 		case SSL_ERROR_SSL:
> > ! 		case SSL_ERROR_ZERO_RETURN:
> > ! 			perror_ssl("SSL_read");
> > ! 			n = -1;
> > ! 			break;
> > ! 		default:
> > ! 			perror_ssl("SSL_read");
> > ! 			n = -1;
> > ! 			break;
> > ! 	}
> > ! 
> > ! 	return n;
> >   }
> >   
> >   int pool_ssl_write(POOL_CONNECTION *cp, const void *buf, int size) {
> 
    
    