[pgpool-general: 3921] Re: "replication" mode inconsistencies

Chris Pacejo cpacejo at clearskydata.com
Thu Aug 6 08:41:09 JST 2015


As a side note, while browsing the source, I'm surprised that
write_status_file does not fsync() the file after writing to it.
Without that, the status could be lost if the system is restarted
uncleanly.
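
A minimal sketch of the kind of durable write meant here, in Python
rather than pgpool's C, and not pgpool's actual code. The function name
write_status_durably and the temp-file prefix are made up for
illustration; os.fsync() maps directly onto fsync(2). It assumes a
POSIX filesystem where rename is atomic:

```python
import os
import tempfile


def write_status_durably(path: str, data: bytes) -> None:
    """Replace `path` with `data` and force the result to stable storage."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temp file in the same directory so the rename stays
    # on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".status-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # drain Python's userspace buffer
            os.fsync(tmp.fileno())  # ask the kernel to flush file data
        os.replace(tmp_path, path)  # atomic rename on POSIX
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # fsync the directory too, so the rename itself survives a crash.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Without the fsync() calls, the new contents may sit only in the page
cache when the machine goes down, which is the loss described above.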

On Wed, Aug 5, 2015 at 7:15 PM, Chris Pacejo <cpacejo at clearskydata.com> wrote:
> Thank you.  I've confirmed that if only *one* of the two servers is
> unreachable, pgpool behaves as expected (waits for the server to be
> manually reattached).
>
> I also wonder: even if pgpool *did* correctly refuse to send traffic
> when both servers were "down" in pgpool_status on restart, how would
> we know in which direction to recover data (from A to B or B to A)?
> Pgpool does not record in pgpool_status which "down" server was the
> last to go down (and is thus authoritative).  As a workaround, I
> think it would work to write a failover_command/failback_command
> which records this information.
>
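
The workaround in the quoted paragraph above could be sketched roughly
as follows. This is an untested illustration, not something pgpool
ships: the script, the log location, and the function names are all
hypothetical. It assumes the script is wired in through
failover_command and failback_command, with pgpool's %d placeholder
(the backend node id) passed as an argument. Each event is appended
and fsync()ed, and the log is replayed in file order to find the node
that went down last.

```python
import json
import os
import sys
import time

LOG_PATH = "/var/lib/pgpool/node_events.log"  # hypothetical location


def record_event(log_path, event, node_id):
    """Append one 'down' or 'up' event for a node and force it to disk."""
    entry = {"time": time.time(), "event": event, "node": node_id}
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")
        log.flush()
        os.fsync(log.fileno())


def last_node_down(log_path):
    """Return the node that went down most recently and is still down.

    Events are replayed in file (append) order. Returns None if no
    node is currently recorded as down.
    """
    down_order = []
    with open(log_path, encoding="utf-8") as log:
        for line in log:
            entry = json.loads(line)
            node = entry["node"]
            if node in down_order:
                down_order.remove(node)
            if entry["event"] == "down":
                down_order.append(node)
    return down_order[-1] if down_order else None


if __name__ == "__main__" and len(sys.argv) == 3:
    # Assumed wiring in pgpool.conf (script path is hypothetical):
    #   failover_command = '/path/to/record_node_event.py down %d'
    #   failback_command = '/path/to/record_node_event.py up %d'
    record_event(LOG_PATH, sys.argv[1], int(sys.argv[2]))
```

In the A-then-B scenario, the log would hold "down" for A followed by
"down" for B, so last_node_down() would name B as the node to recover
from before letting pgpool serve traffic again.
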
> On Wed, Aug 5, 2015 at 6:59 PM, Tatsuo Ishii <ishii at postgresql.org> wrote:
>> Pgpool should recognize that both A and B are in down status, but
>> it actually does not. Let me investigate...
>>
>> Best regards,
>> --
>> Tatsuo Ishii
>> SRA OSS, Inc. Japan
>> English: http://www.sraoss.co.jp/index_en.php
>> Japanese:http://www.sraoss.co.jp
>>
>>> Consider the following sequence, starting from a healthy system of two
>>> PG servers (A and B) joined in "replication" mode:
>>>
>>> 1) Server A loses connectivity.
>>> 2) A write comes in, which pgpool commits to server B.
>>> 3) Server B loses connectivity.
>>> 4) Server A regains connectivity.
>>> 5) pgpool restarts (due to either sysadmin action or failure).
>>>
>>> At this point, pgpool happily directs all traffic to server A, which
>>> does *not* have the most recent commit to server B.  This is very bad
>>> since I have now lost data consistency.
>>>
>>> Rather, I would expect that pgpool remembers that it has written data
>>> to B but not to A, and would refuse incoming connections until A has
>>> been recovered from B.
>>>
>>> Even as a workaround, if I had some tool which, before restarting
>>> pgpool, checked the state in which pgpool left the two servers and
>>> then rectified them, that would suffice.  However, since pgpool
>>> doesn't seem to track at all the fact that it had written some data
>>> only to B but not to A, that information is not available (e.g.
>>> from pgpool_status).
>>>
>>> What am I missing?  How is it that others use pgpool in "replication"
>>> mode without encountering data inconsistencies when nodes fail?

