[Pgpool-general] Online recovery during load

Jaume Sabater jsabater at gmail.com
Wed Jan 13 16:17:15 UTC 2010


On Wed, Jan 13, 2010 at 4:55 PM, Christophe Philemotte
<christophe.philemotte at apreco.be> wrote:

> Before presenting my problem, just a question about the 2nd stage
> (you'll understand that this question is linked to my problem). Why do the
> client connections have to be closed during this stage? Couldn't the
> recovered node catch up with the master node without stopping the service?

At the second stage, current idle connections are closed, and open
connections are given some time to finish their work before being
closed. Connections need to be closed before the node being recovered
is started because, when it starts, it will obtain and process the pending
log files. This will put the two nodes in sync, and then the queued
requests will be processed.

Conclusion: only when all connections are closed can pgpool-II be
sure that the two nodes will be perfectly in sync.
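
To make the rule above concrete, here is a toy model in Python. It is
only an illustration of the behaviour as I understand it, not pgpool-II
code: the names ClientConn and drain_for_second_stage are made up, and
I am assuming that an idle limit of 0 means "never force-close".

```python
from dataclasses import dataclass


@dataclass
class ClientConn:
    # Hypothetical stand-in for a client connection held by the pooler.
    name: str
    idle_seconds: float  # time since the client last sent a query
    is_open: bool = True


def drain_for_second_stage(conns, idle_limit):
    """Close every connection that has been idle for idle_limit seconds.

    Returns the names of the connections that are still open. In this
    toy model the recovered node may only be started once that list is
    empty. idle_limit <= 0 is assumed to mean "never force-close".
    """
    for conn in conns:
        if idle_limit > 0 and conn.idle_seconds >= idle_limit:
            conn.is_open = False
    return [conn.name for conn in conns if conn.is_open]
```

With a limit of 10, a connection idle for 12 seconds is dropped while
one idle for 3 seconds keeps the second stage waiting. With the limit
unset, nothing is ever dropped, so a single long-lived connection can
hold up the recovery indefinitely.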

> Now, let me present my problem. When I test online recovery during a
> typical database load, I get two failure scenarios:
> 1. when client_idle_limit_in_recovery is set (the best value found
> is 10s), the online recovery completes, but a few client requests
> fail (timeout or closed connection);
> 2. when client_idle_limit_in_recovery isn't set, the online recovery
> does not complete, because a few client connections cannot be closed
> (they are actually in use by client processes, not lazy ones).

I have also been having problems under certain circumstances when
trying to recover a node, ever since I first started with pgpool-II.
Tatsuo (and other contributors) have fixed some of these scenarios, but I
think there is still the problem that certain connections are not
dropped during the second stage, hence stalling the whole process.
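
For the first scenario (requests failing with a timeout or a closed
connection while the limit is set), one possible client-side mitigation
is to retry the failed request. The sketch below is only an assumption
about how a client could do that: run_with_retry is a made-up helper,
and the exception types are Python's built-in ones, so a real database
driver's own error classes would have to be passed in retry_on.

```python
import time


def run_with_retry(operation, attempts=3, delay=0.5,
                   retry_on=(ConnectionError, TimeoutError),
                   sleep=time.sleep):
    """Call operation() and retry it when the connection is lost.

    operation is expected to open its own connection, do its work and
    close the connection again, so each attempt starts from scratch.
    The wait grows linearly with the attempt number.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on as exc:
            last_error = exc
            if attempt < attempts:
                sleep(delay * attempt)
    # All attempts failed: re-raise the last error seen.
    raise last_error
```

This is only safe for requests that can be repeated without side
effects, or that run in a transaction which was rolled back when the
connection was dropped.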

I believe, but I cannot be sure, that when the DBA of my main client
has pgadmin3 open, it always fails. Usually, when connections come only
from the front-ends of the web platform (i.e. open connection, send a
request, (supposedly) close connection), everything goes fine. But
with "persistent" connections, so to speak, pgpool-II is not always
capable of dropping them.

Does it make any sense to you, Tatsuo?

-- 
Jaume Sabater
http://linuxsilo.net/

"Ubi sapientas ibi libertas"

