[pgpool-general: 4721] Re: "replication" mode inconsistencies

Muhammad Usama m.usama at gmail.com
Thu Jun 16 01:31:47 JST 2016


On Fri, Aug 7, 2015 at 5:54 AM, Tatsuo Ishii <ishii at postgresql.org> wrote:

> > I've done some more thinking on this, and I think the solution is
> > simply *not* to record pgpool_status when *all* nodes are down (i.e.
> > only record it when at least one node is up).  So pgpool_status will
> > always reflect the last set of nodes to which any data was written.
> > Upon restart, if the up-to-date (previously "up") node is in fact down
> > (regardless of whether the stale ("down") node is back up), pgpool
> > will detect this in its health check and will fail; if the up-to-date
> > (previously "up") node is back up, then pgpool will commence using it.
>
> A downside of this approach is that the first health check after pgpool
> restarts may take a long time, depending on the health check retry
> settings, if the node is still down. However, this downside is not new to
> your approach (and there's a workaround: users can manually edit
> pgpool_status file to set the node status to "down"). Besides this,
> your approach seems attractive.
>

I have been looking into this and thinking about other possible solutions.
I think the proposed solution, not recording the status of the last working
node when it goes down, should work without any problem. I believe it is a
very good solution in the sense that it requires a minimal amount of code
changes. Its downside, that startup may take a long time when the
previously last-alive node is still down, should not be worrisome: that
situation always requires user intervention to fix, so a little extra time
should not hurt.
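
To make the rule concrete, here is a rough sketch in Python. It is not
pgpool code: the function name write_status_if_any_up is made up for
illustration, and the file layout (one status word per line, "up" or
"down") is an assumption for the example only.

```python
import os
from typing import List


def write_status_if_any_up(path: str, statuses: List[str]) -> bool:
    """Persist node statuses unless every node is down.

    Hypothetical helper: returns True if the file was written, False if
    the write was skipped because no node is up. Skipping keeps the file
    pointing at the last set of nodes that could have received writes.
    """
    if not any(s == "up" for s in statuses):
        # All nodes are down: leave the previous status file untouched.
        return False

    # Write to a temporary file first, then rename it over the target so
    # a crash in the middle never leaves a half-written status file.
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        for s in statuses:
            f.write(s + "\n")
    os.replace(tmp_path, path)
    return True
```

With this, a failure of the last live node leaves the file still saying
that node is "up", so after a restart the health check is what decides
whether it can be used.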

While we are at it, I think it would be a good idea to record some more
information in the pgpool_status file about the node that fails. If, along
with the node status, we also store the timestamp of the status change and
the last successful query processed by that backend node, it could help in
deciding the direction of recovery, and maybe in debugging the cause of
the node failure.
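
As a rough illustration of what such a record could look like, here is a
Python sketch. Everything in it is an assumption for discussion: the
NodeRecord fields, the space-separated line format, storing the time of
the last successful query rather than the query itself, and the
recovery_source rule are all hypothetical, not an existing pgpool format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NodeRecord:
    node_id: int
    status: str                      # "up" or "down"
    status_changed_at: int           # epoch seconds of the last status change
    last_query_ok_at: Optional[int]  # epoch seconds, None if never succeeded


def format_record(r: NodeRecord) -> str:
    """Serialize one record as a single space-separated line."""
    last = "-" if r.last_query_ok_at is None else str(r.last_query_ok_at)
    return f"{r.node_id} {r.status} {r.status_changed_at} {last}"


def parse_record(line: str) -> NodeRecord:
    """Parse a line produced by format_record."""
    node_id, status, changed, last = line.split()
    return NodeRecord(
        node_id=int(node_id),
        status=status,
        status_changed_at=int(changed),
        last_query_ok_at=None if last == "-" else int(last),
    )


def recovery_source(records: List[NodeRecord]) -> Optional[int]:
    """Pick the node that most recently processed a query successfully.

    That node is the most likely to hold the newest data, so it is a
    candidate source for recovering the others. Returns None if no node
    has ever processed a query.
    """
    candidates = [r for r in records if r.last_query_ok_at is not None]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.last_query_ok_at).node_id
```

The timestamps would only be a hint for the administrator; the actual
recovery decision would still need to be confirmed by hand.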

Kind regards
Muhammad Usama



> Best regards,
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese:http://www.sraoss.co.jp
> _______________________________________________
> pgpool-general mailing list
> pgpool-general at pgpool.net
> http://www.pgpool.net/mailman/listinfo/pgpool-general
>