[pgpool-general: 3918] "replication" mode inconsistencies
cpacejo at clearskydata.com
Thu Aug 6 01:12:17 JST 2015
Consider the following sequence, starting from a healthy system of two
PG servers (A and B) joined in "replication" mode:
1) Server A loses connectivity.
2) A write comes in, which pgpool commits to server B.
3) Server B loses connectivity.
4) Server A regains connectivity.
5) pgpool restarts (due to either sysadmin action or failure).
At this point, pgpool happily directs all traffic to server A, which
does *not* have the most recent commit (the one that went only to
server B). This is very bad: I have now lost data consistency.
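
To make the sequence concrete, here is a toy Python model of it. This is
only a sketch of the behavior I am describing, not pgpool code: ToyPool,
node_down, restart and replay are all names I made up, and the model
assumes the pool keeps no record across a restart of which node missed
which write.

```python
# Toy model only. This is not pgpool code; every name here is hypothetical.
class ToyPool:
    def __init__(self, data):
        self.data = data          # node name -> list of committed writes (on disk)
        self.up = set(data)       # in-memory view of which nodes are reachable

    def node_down(self, name):
        self.up.discard(name)

    def write(self, value):
        if not self.up:
            raise RuntimeError("no backend available")
        for name in self.up:      # "replication" mode: write to every live node
            self.data[name].append(value)

    def restart(self, reachable):
        # Assumption: nothing survives the restart except the nodes' own data,
        # so the pool simply trusts whatever is reachable now.
        self.up = set(reachable)


def replay():
    data = {"A": [], "B": []}
    pool = ToyPool(data)
    pool.write("w0")                 # healthy system, both nodes get w0
    pool.node_down("A")              # step 1
    pool.write("w1")                 # step 2: only B gets w1
    pool.node_down("B")              # step 3
    pool.restart(reachable=["A"])    # steps 4 and 5
    return pool.up, data
```

After replay(), the pool serves from A alone, while A holds only "w0"
and B holds both "w0" and "w1".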
Rather, I would expect that pgpool remembers that it has written data
to B but not to A, and would refuse incoming connections until A has
been recovered from B.
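
The behavior I would expect can be sketched roughly as follows. Again,
this is a hypothetical illustration and not something pgpool provides as
far as I can tell: WriteLedger, its JSON file, and its method names are
all my own assumptions. The idea is to durably record which nodes missed
a write *before* the write is sent, and to consult that record after a
restart.

```python
import json
import os
import tempfile


class WriteLedger:
    """Hypothetical durable record of nodes that have missed at least one write."""

    def __init__(self, path):
        self.path = path
        self.stale = set()
        if os.path.exists(path):
            with open(path) as f:
                self.stale = set(json.load(f))

    def _save(self):
        # Write to a temp file, fsync, then rename, so a crash cannot leave
        # a half-written ledger behind.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(sorted(self.stale), f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.path)

    def before_write(self, all_nodes, up_nodes):
        # Must be called before the write is sent to the live nodes.
        missed = set(all_nodes) - set(up_nodes)
        if not missed <= self.stale:
            self.stale |= missed
            self._save()

    def mark_recovered(self, node):
        self.stale.discard(node)
        self._save()

    def may_serve(self, reachable):
        # Accept connections only if some reachable node has missed nothing.
        return any(n not in self.stale for n in reachable)


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "ledger.json")
        WriteLedger(path).before_write(["A", "B"], up_nodes=["B"])  # step 2
        after_restart = WriteLedger(path)                           # step 5
        only_a = after_restart.may_serve(["A"])
        with_b = after_restart.may_serve(["A", "B"])
        after_restart.mark_recovered("A")       # A recovered from B
        recovered = WriteLedger(path).may_serve(["A"])
        return only_a, with_b, recovered
```

With such a record, a restart that finds only A reachable would refuse
connections, and would accept them again once B is back or A has been
recovered.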
Even as a workaround, it would suffice to have some tool that, before
pgpool is restarted, checked the state in which pgpool left the two
servers and then rectified them. However, since pgpool doesn't seem
to track at all the fact that it had written some data only to B and
not to A, that information is not available (e.g. from pgpool_status).
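
For illustration, the kind of check I have in mind could be as small as
the sketch below. It assumes the missing piece exists, namely a per-node
marker of the last write applied (a hypothetical sequence number that,
as far as I can see, pgpool does not record). plan_recovery is a made-up
name.

```python
def plan_recovery(last_applied):
    """Pick a recovery source and list the nodes that need recovering.

    last_applied maps node name -> highest write sequence number applied.
    This input is hypothetical; it is exactly the information that is
    not available today.
    """
    newest = max(last_applied.values())
    sources = sorted(n for n, seq in last_applied.items() if seq == newest)
    stale = sorted(n for n, seq in last_applied.items() if seq < newest)
    return sources[0], stale
```

For the scenario above, plan_recovery({"A": 41, "B": 42}) returns
("B", ["A"]): recover A from B before letting pgpool serve traffic.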
What am I missing? How is it that others use pgpool in "replication"
mode without encountering data inconsistencies when nodes fail?