[Pgpool-general] Online recovery during load

Tatsuo Ishii ishii at sraoss.co.jp
Tue Jan 19 14:47:54 UTC 2010


For further investigation I need a test case that reliably reproduces
the problem. As I understand it, having pgAdmin3 connected will always
trigger the problem. Am I correct?
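
If pgAdmin3 itself is inconvenient to script, something like the following
might serve as a stand-in: a minimal sketch of a client that opens one
connection through pgpool-II and keeps it busy, the way a persistent client
would. Everything specific here is an assumption, not taken from this
thread: the helper name hold_connection is hypothetical, psycopg2 is just
one possible driver, and the host, port 9999, database and user are
placeholders to adjust. It may not behave exactly like pgAdmin3.

```python
import time


def hold_connection(connect, duration_s, interval_s=1.0, sleep=time.sleep):
    """Open one connection and keep issuing a trivial query on it.

    `connect` is any zero-argument callable returning a DB-API connection.
    Returns the number of queries that completed before the time ran out
    or the connection was closed from the other side.
    """
    conn = connect()
    completed = 0
    deadline = time.monotonic() + duration_s
    try:
        cur = conn.cursor()
        while time.monotonic() < deadline:
            cur.execute("SELECT 1")
            cur.fetchone()
            completed += 1
            sleep(interval_s)
    except Exception as exc:
        # Expected if pgpool-II drops the connection during recovery.
        print("connection lost after %d queries: %s" % (completed, exc))
    finally:
        conn.close()
    return completed


if __name__ == "__main__":
    import psycopg2  # assumption: psycopg2 is installed on the test client

    def connect_via_pgpool():
        # Placeholder connection settings; point these at pgpool-II.
        conn = psycopg2.connect(
            host="localhost", port=9999, dbname="postgres", user="postgres"
        )
        conn.autocommit = True  # avoid sitting idle inside a transaction
        return conn

    # Start this, then trigger online recovery while it is running.
    hold_connection(connect_via_pgpool, duration_s=600)
```

Running one or more of these while starting online recovery should show
whether such a connection is enough to hit the wait_connection_closed error.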
--
Tatsuo Ishii
SRA OSS, Inc. Japan

> Hello Tatsuo,
> 
> Do you have any recommendations on how we can proceed in this situation?
> 
> Is there some kind of function that forces persistent connections to be closed, or something similar?
> 
> Regards,
> ---
> 
> Fernando Marcelo
> www.consultorpc.com
> fernando at consultorpc.com
> 
> 
> On 15/01/2010, at 11:21, Fernando Morgenstern wrote:
> 
> > Hello,
> > 
> > I will join this thread (without being invited :P) because I am having the same problem with pgpool.
> > 
> > I have had it running for one week in a high-load environment and decided to test online recovery, which always worked in the test environment, but I get:
> > 
> > ERROR: pid 1199: wait_connection_closed: existing connections did not close in 90 sec.
> > 
> > I tested with different values for client_idle_limit_in_recovery and recovery_timeout, but all of them fail. Pgpool simply can't close all connections.
> > 
> > Just sharing another experience and hopefully waiting for a possible solution.
> > 
> > Regards,
> > ---
> > 
> > Fernando Marcelo
> > www.consultorpc.com
> > fernando at consultorpc.com
> > 
> > On 15/01/2010, at 06:11, Christophe Philemotte wrote:
> > 
> >> Hi Jaume Sabater,
> >> 
> >> Thanks for answering me.
> >> 
> > >>>> Before presenting my problem, just one question about the 2nd stage
> > >>>> (you'll see that this question is linked to my problem). Why do the
> > >>>> client connections have to be closed during this stage? Couldn't the
> > >>>> recovered node catch up with the master node without stopping the service?
> >>> 
> >>> At the second stage, current idle connections are closed, and open
> >>> connections are offered some time to finish their work before being
> >>> closed. Connections need to be closed before the node being recovered
> >>> is started as, when it starts, it will obtain and process the pending
> > >>> log files. This will put the two nodes in sync, and then the queued
> >>> requests will be processed.
> >>> 
> >>> Conclusion: only when all connections are closed, pgpool-II can be
> >>> sure that the two nodes will be perfectly in sync.
> >> 
> > >> Tell me if I'm wrong. That means it is impossible to recover online
> > >> if there are heavily used persistent connections open, and I have to
> > >> design my client application not to use persistent connections if I
> > >> want to perform online recovery. Is that correct?
> >> 
> > >>>> Now, let me present my problem. When I test online recovery under a
> > >>>> typical database load, I get one of two failure scenarios:
> > >>>> 1. when client_idle_limit_in_recovery is set (the best value found
> > >>>> is 10s), online recovery completes, but a few client requests
> > >>>> fail (timeout or closed connection);
> > >>>> 2. when client_idle_limit_in_recovery isn't set, online recovery
> > >>>> does not complete, because a few client connections cannot be closed
> > >>>> (they are actually in use by client processes, not idle ones).
> >>> 
> >>> I have also been having problems under certain circumstances when
> >>> trying to recover a node since I first started with pgpool-II. Tatsuo
> >>> (and other contributors) have fixed some of these scenarios, but I
> >>> think there is still the problem that certain connections are not
> > >>> dropped during the second stage, hence blocking the whole process.
> >> 
> > >> That is what I have noticed when client_idle_limit_in_recovery is not used.
> >> 
> > >>> I believe, but I cannot be sure, that when the DBA of my main client
> > >>> has pgAdmin3 open, it always fails. Usually, when connections are only
> >>> from the front-ends of the web platform (i.e. open connection, send a
> >>> request, (supposedly) close connection), everything goes fine. But
> >>> with "persistent" connections, so to speak, pgpool-II is not always
> >>> capable of dropping them.
> >> 
> > >> OK, that matches the impression I described above.
> >> 
> >>> Does it make any sense to you, Tatsuo?
> >> Does it?
> >> 
> >> Regards,
> >> 
> >> Christophe Philemotte
> >> _______________________________________________
> >> Pgpool-general mailing list
> >> Pgpool-general at pgfoundry.org
> >> http://pgfoundry.org/mailman/listinfo/pgpool-general