[Pgpool-general] Online recovery during load

Christophe Philemotte christophe.philemotte at apreco.be
Fri Jan 15 08:11:53 UTC 2010


Hi Jaume Sabater,

Thanks for answering me.

>> Before presenting my problem, just a question about the 2nd stage
>> (you'll see that this question is linked to my problem). Why do the
>> client connections have to be closed during this stage? Couldn't the
>> recovered node catch up with the master node without stopping the service?
> 
> At the second stage, current idle connections are closed, and open
> connections are offered some time to finish their work before being
> closed. Connections need to be closed before the node being recovered
> is started as, when it starts, it will obtain and process the pending
> log files. This will put the two nodes in sync, and then the queued
> requests will be processed.
> 
> Conclusion: only when all connections are closed can pgpool-II be
> sure that the two nodes will be perfectly in sync.
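For reference, the second stage on my side is driven by a pgpool.conf block
roughly like the following (the script names and values here are only
illustrative, not exactly my setup):

    # pgpool.conf -- online recovery settings (illustrative values)
    recovery_user = 'postgres'                  # PostgreSQL user that runs the recovery scripts
    recovery_password = ''                      # its password, if any
    recovery_1st_stage_command = 'copy_base_backup'      # 1st stage: base backup, service keeps running
    recovery_2nd_stage_command = 'pgpool_recovery_pitr'  # 2nd stage: runs once client connections are closed
    recovery_timeout = 90                       # seconds pgpool-II waits for clients to disconnect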

Tell me if I'm wrong, but that means it is impossible to recover online
when there are heavily used, persistent open connections, and that I have
to design my client application not to use persistent connections if I
want to perform online recovery. Is that correct?
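In practice I suppose that would mean something like the following
per-request pattern (a sketch in Python with psycopg2; the host, database
and credentials are placeholders, not my real client code):

    # Sketch: open a short-lived connection per request instead of holding
    # a persistent one, so pgpool-II can close the session between requests.
    import psycopg2

    def run_query(sql, params=None):
        # Connect through pgpool-II for the duration of one request only.
        conn = psycopg2.connect(host="pgpool-host", port=9999,
                                dbname="mydb", user="myuser", password="secret")
        try:
            with conn.cursor() as cur:
                cur.execute(sql, params)
                rows = cur.fetchall()
            conn.commit()
            return rows
        finally:
            # No persistent connection is held, so pgpool-II sees the client
            # go idle as soon as the request is finished.
            conn.close()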

>> Now, let me present my problem. When I test online recovery under a
>> typical database load, I run into two failure scenarios:
>> 1. when client_idle_limit_in_recovery is set (the best value I found
>> is 10s), the online recovery completes, but a few client requests
>> fail (timeout or closed connection);
>> 2. when client_idle_limit_in_recovery isn't set, the online recovery
>> does not complete, because a few client connections cannot be closed
>> (they are actively used by client processes, not idle ones).
> 
> I have also been having problems under certain circumstances when
> trying to recover a node since I first started with pgpool-II. Tatsuo
> (and other contributors) have fixed some of these scenarios, but I
> think there is still the problem that certain connections are not
> dropped during the second stage, hence blocking the whole process.

Without using client_idle_limit_in_recovery, that is exactly what I have noticed.
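Concretely, the only difference between my two tests is this single line in
pgpool.conf (10 is the best value I found; the default, 0, never forces
idle clients off):

    # scenario 1: recovery completes, but a few client requests fail
    client_idle_limit_in_recovery = 10   # close clients idle for more than 10s during 2nd stage

    # scenario 2: recovery never finishes because busy connections stay open
    client_idle_limit_in_recovery = 0    # default: do not force clients off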

> I believe, but I cannot be sure, that when the DBA of my main client
> has pgadmin3 open, it always fails. Usually, when connections are only
> from the front-ends of the web platform (i.e. open connection, send a
> request, (supposedly) close connection), everything goes fine. But
> with "persistent" connections, so to speak, pgpool-II is not always
> capable of dropping them.

OK, that matches the feeling I expressed just above.

> Does it make any sense to you, Tatsuo?
Does it?

Regards,

Christophe Philemotte

