No subject


Fri Jan 30 20:15:43 JST 2015


	/*
	 * If no one woke up, we regard the status file bogus
	 */
	if (someone_wakeup == false)
	{
		for (i=0;i< pool_config->backend_desc->num_backends;i++)
		{
			BACKEND_INFO(i).backend_status = CON_CONNECT_WAIT;
		}
		(void)write_status_file();
	}

Here is the commit log:
-------------------------------------------------------------
commit a97eed16ef8c3a481c0cd0282b9950fb4ee28a89
Author: Tatsuo Ishii <ishii at sraoss.co.jp>
Date:   Sat Feb 13 11:23:55 2010 +0000

    Fix read_status_file so that if all nodes were marked down status,
    it is regarded that this file is bogus. This will prevent "all
    node down" syndrome.
-------------------------------------------------------------

I made this decision a long time ago, but as you pointed out, I now
think it was not the correct one. I think we need to remove this part
except in "raw mode", in which the database inconsistency problem will
not happen.
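
To make the proposal concrete, here is a minimal standalone sketch of
the intended behavior. It is not a patch against the pgpool source: the
enum only mirrors the status names used above, and the helper
reset_if_all_down() and its raw_mode flag are hypothetical names chosen
for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for pgpool's backend status values (illustration only). */
typedef enum { CON_CONNECT_WAIT, CON_UP, CON_DOWN } backend_status_t;

/*
 * Hypothetical helper: called after the status file has been loaded.
 * Treats an "all nodes down" file as bogus only in raw mode.
 * Returns true if the statuses were reset, in which case the caller
 * would rewrite the status file.
 */
static bool
reset_if_all_down(backend_status_t *status, size_t num_backends, bool raw_mode)
{
	bool	someone_wakeup = false;
	size_t	i;

	for (i = 0; i < num_backends; i++)
	{
		if (status[i] != CON_DOWN)
			someone_wakeup = true;
	}

	/* Outside raw mode, keep the recorded "down" states as they are. */
	if (someone_wakeup || !raw_mode)
		return false;

	for (i = 0; i < num_backends; i++)
		status[i] = CON_CONNECT_WAIT;

	return true;
}
```

With this shape, a replication-mode restart that finds every node marked
down would leave them down until a node is reattached manually.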

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

> Thank you.  I've confirmed that if only *one* of the two servers is
> unreachable, pgpool behaves as expected (waits for the server to be
> manually reattached).
> 
> Although I wonder also, even if pgpool *did* correctly refuse to send
> traffic if both servers were "down" in pgpool_status on restart, how
> should we know in which direction to recover data (from A to B or B to
> A)?  Pgpool does not record in pgpool_status which "down" server was
> the last to go down (and is thus authoritative).  As a workaround I
> think it would work to write a failover/failback_command which records
> this information.
> 
> On Wed, Aug 5, 2015 at 6:59 PM, Tatsuo Ishii <ishii at postgresql.org> wrote:
>> Pgpool should recognize that both A and B are in down status, but
>> actually not. Let me investigate...
>>
>> Best regards,
>> --
>> Tatsuo Ishii
>> SRA OSS, Inc. Japan
>> English: http://www.sraoss.co.jp/index_en.php
>> Japanese:http://www.sraoss.co.jp
>>
>>> Consider the following sequence, starting from a healthy system of two
>>> PG servers (A and B) joined in "replication" mode:
>>>
>>> 1) Server A loses connectivity.
>>> 2) A write comes in, which pgpool commits to server B.
>>> 3) Server B loses connectivity.
>>> 4) Server A regains connectivity.
>>> 5) pgpool restarts (due to either sysadmin action or failure).
>>>
>>> At this point, pgpool happily directs all traffic to server A, which
>>> does *not* have the most recent commit to server B.  This is very bad
>>> since I have now lost data consistency.
>>>
>>> Rather, I would expect that pgpool remembers that it has written data
>>> to B but not to A, and would refuse incoming connections until A has
>>> been recovered from B.
>>>
>>> Even as a workaround, if before restarting pgpool I had some tool which
>>> checked the state in which pgpool left the two servers and then
>>> rectified them, that would suffice.  However since pgpool doesn't seem
>>> to track at all the fact that it had written some data only to B but
>>> not to A, that information is not available (e.g. from pgpool_status).
>>>
>>> What am I missing?  How is it that others use pgpool in "replication"
>>> mode without encountering data inconsistencies when nodes fail?
>>> _______________________________________________
>>> pgpool-general mailing list
>>> pgpool-general at pgpool.net
>>> http://www.pgpool.net/mailman/listinfo/pgpool-general
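
The workaround mentioned in the quoted reply (a failover_command that
records which node went down last) could be sketched roughly as follows.
This is only an illustration under stated assumptions: the log format,
the file path, and the function names record_node_down() and
last_node_down() are all hypothetical, and the node id is assumed to be
handed over by failover_command, for example through its %d placeholder.

```c
#include <stdio.h>
#include <time.h>

/*
 * Hypothetical: append "<node_id> <unix_time>" to a log file.
 * Returns 0 on success, -1 on error.
 */
static int
record_node_down(const char *path, int node_id, time_t when)
{
	FILE   *fp = fopen(path, "a");
	int		ok;

	if (fp == NULL)
		return -1;

	ok = fprintf(fp, "%d %lld\n", node_id, (long long) when) > 0;
	if (fclose(fp) != 0)
		ok = 0;

	return ok ? 0 : -1;
}

/*
 * Hypothetical: return the node id on the last well-formed line of the
 * log, i.e. the node that went down most recently, or -1 if there is
 * no such line.
 */
static int
last_node_down(const char *path)
{
	FILE	   *fp = fopen(path, "r");
	int			node_id;
	int			last = -1;
	long long	when;

	if (fp == NULL)
		return -1;

	while (fscanf(fp, "%d %lld", &node_id, &when) == 2)
		last = node_id;

	fclose(fp);
	return last;
}
```

Before restarting pgpool, an administrator could then read the log to
see which node failed last and recover the other node from it. A real
script would also need to handle reattached nodes and make sure the log
survives a crash.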

