[pgpool-general: 6693] Re: Query

Tatsuo Ishii ishii at sraoss.co.jp
Mon Sep 2 08:43:11 JST 2019


Hi Lakshmi,

Your attached files are too large to be accepted by the mailing list. Can
you compress them and repost the message with the compressed attachments?

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

From: Lakshmi Raghavendra <lakshmiym108 at gmail.com>
Subject: Fwd: [pgpool-general: 6672] Query
Date: Sun, 1 Sep 2019 23:14:30 +0530
Message-ID: <CAHHVJ5sRoVFEEW4EoZLgudCTTm0cqGjXhbbkpnOiimcs4euUSw at mail.gmail.com>

> ---------- Forwarded message ---------
> From: Lakshmi Raghavendra <lakshmiym108 at gmail.com>
> Date: Sat, Aug 31, 2019 at 10:17 PM
> Subject: Re: [pgpool-general: 6672] Query
> To: Tatsuo Ishii <ishii at sraoss.co.jp>
> Cc: Muhammad Usama <m.usama at gmail.com>, <pgpool-general at pgpool.net>
> 
> 
> Hi Usama / Tatsuo,
> 
> Received the email notification today; sorry for the delayed response.
> Please find the pgpool-II log attached.
> 
> So basically, here is a short summary of the issue:
> 
> 
> Node-1 : Pgpool Master + Postgres Master
> 
> Node-2 : Pgpool Standby + Postgres Standby
> 
> Node-3 : Pgpool Standby + Postgres Standby
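
For reference, a three-node backend layout like the one above is normally declared in pgpool.conf along these lines (a sketch only; the hostnames are the ones that appear later in this thread, while ports, weights, and flags are assumptions, not the actual attached pgpool.conf):

```
# Hypothetical backend section for the 3-node layout above.
# Hostnames are taken from the thread; everything else is assumed.
backend_hostname0 = '10.198.34.188'    # Node-1
backend_port0     = 5432
backend_weight0   = 1
backend_flag0     = 'ALLOW_TO_FAILOVER'

backend_hostname1 = '10.198.34.189'    # Node-2
backend_port1     = 5432
backend_weight1   = 1
backend_flag1     = 'ALLOW_TO_FAILOVER'

backend_hostname2 = '10.198.34.190'    # Node-3
backend_port2     = 5432
backend_weight2   = 1
backend_flag2     = 'ALLOW_TO_FAILOVER'
```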
> 
> 
> When a network failure happens and Node-1 drops off the network, the
> status becomes:
> 
> Node-1 : Pgpool Lost status + Postgres Standby (down)
> 
> Node-2 : Pgpool Master + Postgres Master
> 
> Node-3 : Pgpool Standby + Postgres Standby
> 
> 
> Now when Node-1 comes back onto the network, the status below results,
> leaving the pgpool cluster imbalanced:
> 
> 
> 
> lcm-34-189:~ # psql -h 10.198.34.191 -p 9999 -U pgpool postgres -c "show pool_nodes"
> Password for user pgpool:
>  node_id |   hostname    | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | last_status_change
> ---------+---------------+------+--------+-----------+---------+------------+-------------------+-------------------+---------------------
>  0       | 10.198.34.188 | 5432 | up     | 0.333333  | primary | 0          | true              | 0                 | 2019-08-31 16:40:26
>  1       | 10.198.34.189 | 5432 | up     | 0.333333  | standby | 0          | false             | 1013552           | 2019-08-31 16:40:26
>  2       | 10.198.34.190 | 5432 | up     | 0.333333  | standby | 0          | false             | 0                 | 2019-08-31 16:40:26
> (3 rows)
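
As an aside, output like the above lends itself to simple scripted checks. A minimal sketch (illustrative only; the helper name is made up) that pulls out backends whose status column is not `up` from the `show pool_nodes` rows:

```shell
# Hypothetical helper: print the hostname of every backend whose status
# column is not "up". Expects the data rows of "show pool_nodes" on stdin,
# in the same column order as above (node_id | hostname | port | status | ...).
pool_nodes_down() {
  awk -F'|' '$4 !~ /up/ { gsub(/ /, "", $2); print $2 }'
}
```

For example, `psql -h 10.198.34.191 -p 9999 -U pgpool postgres -Atc "show pool_nodes" | pool_nodes_down` would print nothing for the healthy-looking table above, and the hostname of any backend reported as down.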
> 
> lcm-34-189:~ # /usr/local/bin/pcp_watchdog_info -p 9898 -h 10.198.34.191 -U pgpool
> Password:
> 3 NO lcm-34-188.dev.lcm.local:9999 Linux lcm-34-188.dev.lcm.local 10.198.34.188
> 
> lcm-34-189.dev.lcm.local:9999 Linux lcm-34-189.dev.lcm.local lcm-34-189.dev.lcm.local 9999 9000 7 STANDBY
> lcm-34-188.dev.lcm.local:9999 Linux lcm-34-188.dev.lcm.local 10.198.34.188 9999 9000 4 MASTER
> lcm-34-190.dev.lcm.local:9999 Linux lcm-34-190.dev.lcm.local 10.198.34.190 9999 9000 4 MASTER
> lcm-34-189:~ #
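
The imbalance is visible in the output above: two watchdog nodes report MASTER at the same time. One way a monitoring script could flag that state (a sketch; this helper is not part of pgpool) is to count MASTER entries in the `pcp_watchdog_info` output:

```shell
# Hypothetical helper: count watchdog nodes reporting MASTER in
# pcp_watchdog_info output read from stdin. Any value above 1 indicates
# a watchdog split-brain like the one shown above.
wd_master_count() {
  grep -c 'MASTER$'
}
```

Run as `pcp_watchdog_info -p 9898 -h 10.198.34.191 -U pgpool | wd_master_count`, the broken state above would report 2.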
> 
> 
> 
> Thanks And Regards,
> 
>    Lakshmi Y M
> 
> On Tue, Aug 20, 2019 at 8:55 AM Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
> 
>> > On Sat, Aug 17, 2019 at 12:28 PM Tatsuo Ishii <ishii at sraoss.co.jp>
>> wrote:
>> >
>> >> > Hi Pgpool Team,
>> >> >
>> >> > *We are nearing a production release and running into the below
>> >> > issues.*
>> >> > Replies at the earliest would be highly helpful and greatly
>> >> > appreciated. Please let us know how to get rid of the below issues.
>> >> >
>> >> > We have a 3-node pgpool + postgres cluster - M1, M2, M3. The
>> >> > pgpool.conf is as attached.
>> >> >
>> >> > *Case I:*
>> >> > M1 - Pgpool Master + Postgres Master
>> >> > M2, M3 - Pgpool slave + Postgres slave
>> >> >
>> >> > - M1 goes out of network; it's marked as LOST in the pgpool cluster.
>> >> > - M2 becomes the postgres master.
>> >> > - M3 becomes the pgpool master.
>> >> > - When M1 comes back onto the network, pgpool is able to resolve the
>> >> > split brain. However, it changes the postgres master back to M1,
>> >> > logging the statement "LOG:  primary node was chenged after the sync
>> >> > from new master". So, since M2 was already the postgres master (and
>> >> > its trigger file was not touched), it is not able to sync to the new
>> >> > master.
>> >> > *I somehow want to avoid this postgres master change. Please let us
>> >> > know if there is a way to avoid it.*
>> >>
>> >> Sorry, but I don't know how to prevent this. Probably when the former
>> >> watchdog master recovers from a network outage and there is already a
>> >> PostgreSQL primary server, the watchdog master should not sync the
>> >> state. What do you think, Usama?
>> >>
>> >
>> > Yes, that's true; there is no functionality in Pgpool-II to disable the
>> > backend node status sync. In fact, it would be hazardous if we somehow
>> > disabled the node status syncing.
>> >
>> > Having said that, in the mentioned scenario, when M1 comes back and
>> > joins the watchdog cluster, Pgpool-II should have kept M2 as the true
>> > master while resolving the split brain. The algorithm used to resolve
>> > the true master considers quite a few parameters, and for the scenario
>> > you explained, M2 should have kept the master node status while M1
>> > should have resigned after joining back the cluster; effectively, the
>> > M1 node should have been syncing the status from M2 (keeping the proper
>> > primary node), not the other way around.
>> > Can you please share the Pgpool-II log files so that I can have a look
>> > at what went wrong in this case?
>>
>> Usama,
>>
>> Ok, the scenario (PostgreSQL primary x 2 in the end) should not have
>> happened. That's good news.
>>
>> Lakshmi,
>>
>> Can you please provide the Pgpool-II log files as Usama requested?
>>
>> Best regards,
>> --
>> Tatsuo Ishii
>> SRA OSS, Inc. Japan
>> English: http://www.sraoss.co.jp/index_en.php
>> Japanese:http://www.sraoss.co.jp
>>
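
As a practical footnote to the discussion above: when pgpool's view of the primary is in doubt, each backend can be asked directly rather than trusting the synced state. A sketch (user, port, and passwordless access are illustrative assumptions):

```shell
# Hypothetical check: pg_is_in_recovery() returns "f" on the primary
# and "t" on a standby, independent of what pgpool believes.
is_primary() {
  state=$(psql -h "$1" -p 5432 -U postgres -Atc 'SELECT pg_is_in_recovery()')
  if [ "$state" = "f" ]; then echo primary; else echo standby; fi
}

# e.g.:
# for h in 10.198.34.188 10.198.34.189 10.198.34.190; do
#   echo "$h: $(is_primary "$h")"
# done
```

Exactly one host should print "primary"; in the Case I scenario above, both M1 and M2 answering "primary" would confirm a real split-brain at the PostgreSQL level.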


More information about the pgpool-general mailing list