[pgpool-general: 3924] Re: "replication" mode inconsistencies

Chris Pacejo cpacejo at clearskydata.com
Thu Aug 6 10:42:52 JST 2015

Thank you, that makes sense.

On Wed, Aug 5, 2015 at 9:30 PM, Tatsuo Ishii <ishii at postgresql.org> wrote:
> It appears that this behavior (if all backends are down, pgpool_status
> is ignored) is intentional.
> From src/main/pgpool_main.c:
>         /*
>          * If no one woke up, we regard the status file bogus
>          */
>         if (someone_wakeup == false)
>         {
>                 for (i=0;i< pool_config->backend_desc->num_backends;i++)
>                 {
>                         BACKEND_INFO(i).backend_status = CON_CONNECT_WAIT;
>                 }
>                 (void)write_status_file();
>         }
> Here is the commit log:
> -------------------------------------------------------------
> commit a97eed16ef8c3a481c0cd0282b9950fb4ee28a89
> Author: Tatsuo Ishii <ishii at sraoss.co.jp>
> Date:   Sat Feb 13 11:23:55 2010 +0000
>     Fix read_status_file so that if all nodes were marked down status,
>     it is regarded that this file is bogus. This will prevent "all
>     node down" syndrome.
> -------------------------------------------------------------
> I made this decision a long time ago, but now I think it was not the
> correct decision, as you pointed out. I think we need to remove
> this part except in "raw mode", in which the database inconsistency
> problem will not happen.
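For illustration only, here is a minimal, self-contained sketch of what
"remove this part except in raw mode" could look like. It is not the
actual pgpool patch: the enum, the function name reset_if_all_down and
the raw_mode flag are assumptions made for this example, and only the
someone_wakeup name and CON_CONNECT_WAIT come from the snippet quoted
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for pgpool's internal status values; illustrative only. */
typedef enum { CON_CONNECT_WAIT, CON_DOWN } backend_status_t;

/*
 * Decide what to do with the statuses loaded from the status file.
 * raw_mode is a hypothetical flag meaning "no replication, so there is
 * no consistency risk". Returns true if the file was treated as bogus
 * and every node was reset to CON_CONNECT_WAIT.
 */
bool
reset_if_all_down(backend_status_t *status, size_t num_backends, bool raw_mode)
{
    bool   someone_wakeup = false;
    size_t i;

    assert(status != NULL);

    for (i = 0; i < num_backends; i++)
    {
        if (status[i] != CON_DOWN)
            someone_wakeup = true;
    }

    /* Outside raw mode, an "all nodes down" file is kept as recorded. */
    if (someone_wakeup || !raw_mode)
        return false;

    for (i = 0; i < num_backends; i++)
        status[i] = CON_CONNECT_WAIT;
    return true;
}
```

With this shape, a replication-mode restart that finds both nodes
recorded as down would leave them down and wait for manual recovery,
while raw mode would keep today's behavior.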
> Best regards,
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese:http://www.sraoss.co.jp
>> Thank you.  I've confirmed that if only *one* of the two servers is
>> unreachable, pgpool behaves as expected (waits for the server to be
>> manually reattached).
>> Although I wonder also, even if pgpool *did* correctly refuse to send
>> traffic if both servers were "down" in pgpool_status on restart, how
>> should we know in which direction to recover data (from A to B or B to
>> A)?  Pgpool does not record in pgpool_status which "down" server was
>> the last to go down (and is thus authoritative).  As a workaround I
>> think it would work to write a failover/failback_command which records
>> this information.
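A rough sketch of that workaround, under stated assumptions: a small
helper, invoked from failover_command with the ID of the detached node,
appends "node_id unix_time" to a log, and a recovery tool reads the log
to find the node that went down last. The function names and the log
format are invented for this example, and nothing here is part of
pgpool itself.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Append one "node_id unix_time" line to the log. Intended to be called
 * by a helper that failover_command runs (hypothetical wiring).
 * Returns 0 on success, -1 on a write error.
 */
int
record_node_down(FILE *log, int node_id, long down_at)
{
    assert(log != NULL);
    return fprintf(log, "%d %ld\n", node_id, down_at) < 0 ? -1 : 0;
}

/*
 * Scan the log from the start and return the node with the latest
 * down time, or -1 if the log has no entries. On equal times the
 * later entry wins.
 */
int
last_node_down(FILE *log)
{
    int  node_id;
    int  latest_node = -1;
    long down_at;
    long latest_at = 0;

    assert(log != NULL);

    rewind(log);
    while (fscanf(log, "%d %ld", &node_id, &down_at) == 2)
    {
        if (latest_node == -1 || down_at >= latest_at)
        {
            latest_node = node_id;
            latest_at = down_at;
        }
    }
    return latest_node;
}
```

In the A-then-B scenario this would report B as the node to recover
from. A real version would also need failback_command to clear or
amend the log when a node is reattached, otherwise stale entries could
point at the wrong node.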
>> On Wed, Aug 5, 2015 at 6:59 PM, Tatsuo Ishii <ishii at postgresql.org> wrote:
>>> Pgpool should recognize that both A and B are in down status, but
>>> it actually does not. Let me investigate...
>>> Best regards,
>>> --
>>> Tatsuo Ishii
>>> SRA OSS, Inc. Japan
>>> English: http://www.sraoss.co.jp/index_en.php
>>> Japanese:http://www.sraoss.co.jp
>>>> Consider the following sequence, starting from a healthy system of two
>>>> PG servers (A and B) joined in "replication" mode:
>>>> 1) Server A loses connectivity.
>>>> 2) A write comes in, which pgpool commits to server B.
>>>> 3) Server B loses connectivity.
>>>> 4) Server A regains connectivity.
>>>> 5) pgpool restarts (due to either sysadmin action or failure).
>>>> At this point, pgpool happily directs all traffic to server A, which
>>>> does *not* have the most recent commit to server B.  This is very bad
>>>> since I have now lost data consistency.
>>>> Rather, I would expect that pgpool remembers that it has written data
>>>> to B but not to A, and would refuse incoming connections until A has
>>>> been recovered from B.
>>>> Even as a workaround, if before restarting pgpool I had some tool which
>>>> checked the state in which pgpool left the two servers and then
>>>> rectified them, that would suffice.  However since pgpool doesn't seem
>>>> to track at all the fact that it had written some data only to B but
>>>> not to A, that information is not available (e.g. from pgpool_status).
>>>> What am I missing?  How is it that others use pgpool in "replication"
>>>> mode without encountering data inconsistencies when nodes fail?
>>>> _______________________________________________
>>>> pgpool-general mailing list
>>>> pgpool-general at pgpool.net
>>>> http://www.pgpool.net/mailman/listinfo/pgpool-general
