[pgpool-general: 3929] Re: "replication" mode inconsistencies

Tatsuo Ishii ishii at postgresql.org
Thu Aug 6 15:58:00 JST 2015


Done.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

> Good point. Will fix.
> 
> Best regards,
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese:http://www.sraoss.co.jp
> 
>> As a side note, as I'm browsing the source, I'm surprised that
>> write_status_file does not fsync() the file after it writes to it?
>> Otherwise the status could be lost if the system was restarted
>> unsafely.
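
On the fsync() point quoted above, the usual pattern for making a small
status file survive an unclean restart looks roughly like the sketch below.
It is in Python only for illustration and is not pgpool's actual C code;
the function name write_status_durably is hypothetical, and the directory
fsync assumes a POSIX system.

```python
import os
import tempfile


def write_status_durably(path: str, data: bytes) -> None:
    """Replace the file at `path` with `data` so it survives an unclean restart."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory so the final rename
    # stays on one filesystem and is atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".status-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # push Python's buffer to the kernel
            os.fsync(tmp.fileno())  # ask the kernel to push it to disk
        os.replace(tmp_path, path)  # atomically swap in the new contents
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # fsync the directory so the rename itself is durable (POSIX assumption).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Writing to a temporary file and renaming it means a crash leaves either the
old status or the new one on disk, never a half-written file.
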
>> 
>> On Wed, Aug 5, 2015 at 7:15 PM, Chris Pacejo <cpacejo at clearskydata.com> wrote:
>>> Thank you.  I've confirmed that if only *one* of the two servers is
>>> unreachable, pgpool behaves as expected (waits for the server to be
>>> manually reattached).
>>>
>>> Although I wonder also, even if pgpool *did* correctly refuse to send
>>> traffic if both servers were "down" in pgpool_status on restart, how
>>> should we know in which direction to recover data (from A to B or B to
>>> A)?  Pgpool does not record in pgpool_status which "down" server was
>>> the last to go down (and is thus authoritative).  As a workaround I
>>> think it would work to write a failover/failback_command which records
>>> this information.
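
The workaround described above could be sketched as follows: a small
script, called from failover_command and failback_command, appends each
node event to a log and fsyncs it, and a helper reads the log back to find
which node went down last. This is only a sketch under assumptions: the
log path, the function names, and the "down"/"up" event words are all
hypothetical, and it assumes pgpool invokes the command once for every
node it detaches or reattaches.

```python
import os
import sys
import time

# Hypothetical location; pick a path on durable local storage.
EVENT_LOG = "/var/lib/pgpool/node_events.log"


def record_event(log_path: str, event: str, node_id: int) -> None:
    """Append one 'timestamp event node_id' line and force it to disk."""
    line = f"{time.time():.6f} {event} {node_id}\n"
    fd = os.open(log_path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)
    try:
        os.write(fd, line.encode("ascii"))
        os.fsync(fd)  # the record must survive an unclean restart
    finally:
        os.close(fd)


def last_node_to_go_down(log_path: str):
    """Return the id of the node that is still down and went down last, or None."""
    down_order = []  # ids of nodes currently down, oldest first
    with open(log_path, "r", encoding="ascii") as log:
        for line in log:
            parts = line.split()
            if len(parts) != 3:
                continue  # skip blank or malformed lines
            _, event, node = parts
            node_id = int(node)
            if node_id in down_order:
                down_order.remove(node_id)
            if event == "down":
                down_order.append(node_id)
    return down_order[-1] if down_order else None


# Invoked as: <script> down|up <node_id>
if __name__ == "__main__" and len(sys.argv) == 3:
    record_event(EVENT_LOG, sys.argv[1], int(sys.argv[2]))
```

With pgpool's %d placeholder (the id of the affected node), the commands
might be configured as '/path/to/script down %d' for failover_command and
'/path/to/script up %d' for failback_command, where the script path is a
placeholder. In the A-then-B scenario from the original post, the helper
would report B, so A would be recovered from B.
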
>>>
>>> On Wed, Aug 5, 2015 at 6:59 PM, Tatsuo Ishii <ishii at postgresql.org> wrote:
>>>> Pgpool should recognize that both A and B are in down status, but it
>>>> actually does not. Let me investigate...
>>>>
>>>> Best regards,
>>>> --
>>>> Tatsuo Ishii
>>>> SRA OSS, Inc. Japan
>>>> English: http://www.sraoss.co.jp/index_en.php
>>>> Japanese:http://www.sraoss.co.jp
>>>>
>>>>> Consider the following sequence, starting from a healthy system of two
>>>>> PG servers (A and B) joined in "replication" mode:
>>>>>
>>>>> 1) Server A loses connectivity.
>>>>> 2) A write comes in, which pgpool commits to server B.
>>>>> 3) Server B loses connectivity.
>>>>> 4) Server A regains connectivity.
>>>>> 5) pgpool restarts (due to either sysadmin action or failure).
>>>>>
>>>>> At this point, pgpool happily directs all traffic to server A, which
>>>>> does *not* have the most recent commit to server B.  This is very bad
>>>>> since I have now lost data consistency.
>>>>>
>>>>> Rather, I would expect that pgpool remembers that it has written data
>>>>> to B but not to A, and would refuse incoming connections until A has
>>>>> been recovered from B.
>>>>>
>>>>> Even as a workaround, if I had some tool which, before restarting
>>>>> pgpool, checked the state in which pgpool left the two servers and
>>>>> then rectified them, that would suffice.  However, since pgpool
>>>>> doesn't seem to track at all the fact that it had written some data
>>>>> only to B but not to A, that information is not available (e.g. from
>>>>> pgpool_status).
>>>>>
>>>>> What am I missing?  How is it that others use pgpool in "replication"
>>>>> mode without encountering data inconsistencies when nodes fail?
>>>>> _______________________________________________
>>>>> pgpool-general mailing list
>>>>> pgpool-general at pgpool.net
>>>>> http://www.pgpool.net/mailman/listinfo/pgpool-general

