[pgpool-general: 6696] Re: Query

Tatsuo Ishii ishii at sraoss.co.jp
Mon Sep 2 15:23:45 JST 2019


Usama,

Can you please take a look at this?

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

From: Lakshmi Raghavendra <lakshmiym108 at gmail.com>
Subject: Re: [pgpool-general: 6672] Query
Date: Mon, 2 Sep 2019 10:01:03 +0530
Message-ID: <CAHHVJ5suNKPS9qECEpLvzsEEi04FVPkfwUZo_9nH6ntHyBtrWg at mail.gmail.com>

> Hi Tatsuo,
> 
>           Please find attached the zip file.
> 
> Thanks And Regards,
> 
>   Lakshmi Y M
> 
> On Mon, Sep 2, 2019 at 5:13 AM Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
> 
>> Hi Lakshmi,
>>
>> Your attached files are too large to be accepted by the mailing list. Can
>> you compress them and post the message with the compressed files
>> attached?
>>
>> Best regards,
>> --
>> Tatsuo Ishii
>> SRA OSS, Inc. Japan
>> English: http://www.sraoss.co.jp/index_en.php
>> Japanese: http://www.sraoss.co.jp
>>
>> From: Lakshmi Raghavendra <lakshmiym108 at gmail.com>
>> Subject: Fwd: [pgpool-general: 6672] Query
>> Date: Sun, 1 Sep 2019 23:14:30 +0530
>> Message-ID: <CAHHVJ5sRoVFEEW4EoZLgudCTTm0cqGjXhbbkpnOiimcs4euUSw at mail.gmail.com>
>>
>> > ---------- Forwarded message ---------
>> > From: Lakshmi Raghavendra <lakshmiym108 at gmail.com>
>> > Date: Sat, Aug 31, 2019 at 10:17 PM
>> > Subject: Re: [pgpool-general: 6672] Query
>> > To: Tatsuo Ishii <ishii at sraoss.co.jp>
>> > Cc: Muhammad Usama <m.usama at gmail.com>, <pgpool-general at pgpool.net>
>> >
>> >
>> > Hi Usama / Tatsuo,
>> >
>> >          Received the email notification today, sorry for the delayed
>> > response.
>> > Please find attached the pgpool-II log for the same.
>> >
>> > So basically, below is a short summary of the issue:
>> >
>> >
>> > Node -1 : Pgpool Master + Postgres Master
>> >
>> > Node -2 : Pgpool Standby + Postgres Standby
>> >
>> > Node-3 : Pgpool Standby + Postgres Standby
>> >
>> >
>> > When a network failure happens and Node-1 goes out of the network, the
>> > status is:
>> >
>> > Node-1 : Pgpool Lost status + Postgres Standby (down)
>> >
>> > Node -2 : Pgpool Master + Postgres Master
>> >
>> > Node-3 : Pgpool Standby + Postgres Standby
>> >
>> >
>> > Now when Node-1 comes back onto the network, the status below shows the
>> > pgpool cluster getting into imbalance:
>> >
>> >
>> >
>> > lcm-34-189:~ # psql -h 10.198.34.191 -p 9999 -U pgpool postgres -c "show pool_nodes"
>> > Password for user pgpool:
>> >  node_id |   hostname    | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | last_status_change
>> > ---------+---------------+------+--------+-----------+---------+------------+-------------------+-------------------+---------------------
>> >  0       | 10.198.34.188 | 5432 | up     | 0.333333  | primary | 0          | true              | 0                 | 2019-08-31 16:40:26
>> >  1       | 10.198.34.189 | 5432 | up     | 0.333333  | standby | 0          | false             | 1013552           | 2019-08-31 16:40:26
>> >  2       | 10.198.34.190 | 5432 | up     | 0.333333  | standby | 0          | false             | 0                 | 2019-08-31 16:40:26
>> > (3 rows)
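[Editor's note: the `replication_delay` column in output like the above can be checked mechanically. A minimal sketch, not pgpool tooling; the awk field parsing, the threshold value, and the use of a here-doc in place of live `psql` output are all illustrative assumptions:]

```shell
# Flag nodes from "show pool_nodes" output whose replication_delay
# exceeds a threshold. The threshold (in WAL bytes) is an assumption.
THRESHOLD=1000000

flag_lagging_nodes() {
  # Expects pool_nodes data rows on stdin (pipe-separated columns);
  # prints "node_id hostname replication_delay" for each lagging node.
  awk -F'|' -v t="$THRESHOLD" '
    NF >= 10 {
      gsub(/ /, "", $1); gsub(/ /, "", $2); gsub(/ /, "", $9)
      if ($9 + 0 > t) print $1, $2, $9
    }'
}

# Here-doc stands in for: psql -h 10.198.34.191 -p 9999 -U pgpool ... | tail -n +4
flag_lagging_nodes <<'EOF'
 0       | 10.198.34.188 | 5432 | up     | 0.333333  | primary | 0          | true              | 0                 | 2019-08-31 16:40:26
 1       | 10.198.34.189 | 5432 | up     | 0.333333  | standby | 0          | false             | 1013552           | 2019-08-31 16:40:26
 2       | 10.198.34.190 | 5432 | up     | 0.333333  | standby | 0          | false             | 0                 | 2019-08-31 16:40:26
EOF
```

[On the data above this prints only node 1, whose delay of 1013552 bytes is the imbalance symptom described in the thread.]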
>> >
>> > lcm-34-189:~ # /usr/local/bin/pcp_watchdog_info -p 9898 -h 10.198.34.191 -U pgpool
>> > Password:
>> > 3 NO lcm-34-188.dev.lcm.local:9999 Linux lcm-34-188.dev.lcm.local 10.198.34.188
>> >
>> > lcm-34-189.dev.lcm.local:9999 Linux lcm-34-189.dev.lcm.local lcm-34-189.dev.lcm.local 9999 9000 7 STANDBY
>> > lcm-34-188.dev.lcm.local:9999 Linux lcm-34-188.dev.lcm.local 10.198.34.188 9999 9000 4 MASTER
>> > lcm-34-190.dev.lcm.local:9999 Linux lcm-34-190.dev.lcm.local 10.198.34.190 9999 9000 4 MASTER
>> > lcm-34-189:~ #
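[Editor's note: the symptom above is two watchdog nodes both reporting MASTER. That condition can be detected by counting MASTER entries in `pcp_watchdog_info` output. A minimal sketch under the assumption that the state name is the last whitespace-separated field, with a here-doc standing in for the live command:]

```shell
# Count watchdog nodes claiming MASTER; more than one means split-brain.
count_masters() {
  grep -c ' MASTER$'
}

# Here-doc stands in for: pcp_watchdog_info -p 9898 -h 10.198.34.191 -U pgpool
masters=$(count_masters <<'EOF'
lcm-34-189.dev.lcm.local:9999 Linux lcm-34-189.dev.lcm.local lcm-34-189.dev.lcm.local 9999 9000 7 STANDBY
lcm-34-188.dev.lcm.local:9999 Linux lcm-34-188.dev.lcm.local 10.198.34.188 9999 9000 4 MASTER
lcm-34-190.dev.lcm.local:9999 Linux lcm-34-190.dev.lcm.local 10.198.34.190 9999 9000 4 MASTER
EOF
)
if [ "$masters" -gt 1 ]; then
  echo "split-brain: $masters watchdog nodes claim MASTER"
fi
```

[Such a check could run from a monitoring cron job; a healthy cluster should always report exactly one MASTER.]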
>> >
>> >
>> >
>> > Thanks And Regards,
>> >
>> >    Lakshmi Y M
>> >
>> > On Tue, Aug 20, 2019 at 8:55 AM Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
>> >
>> >> > On Sat, Aug 17, 2019 at 12:28 PM Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
>> >> >
>> >> >> > Hi Pgpool Team,
>> >> >> >
>> >> >> > *We are nearing a production release and running into the below
>> >> >> > issues.*
>> >> >> > Replies at the earliest would be highly helpful and greatly
>> >> >> > appreciated. Please let us know how to get rid of the below issues.
>> >> >> >
>> >> >> > We have a 3-node pgpool + postgres cluster - M1, M2, M3. The
>> >> >> > pgpool.conf is as attached.
>> >> >> >
>> >> >> > *Case I:*
>> >> >> > M1 - Pgpool Master + Postgres Master
>> >> >> > M2, M3 - Pgpool slave + Postgres slave
>> >> >> >
>> >> >> > - M1 goes out of network; it's marked as LOST in the pgpool cluster.
>> >> >> > - M2 becomes postgres master.
>> >> >> > - M3 becomes pgpool master.
>> >> >> > - When M1 comes back to the network, pgpool is able to resolve the
>> >> >> >   split brain. However, it changes the postgres master back to M1,
>> >> >> >   logging the statement "LOG:  primary node was chenged after the
>> >> >> >   sync from new master". Since M2 was already postgres master (and
>> >> >> >   its trigger file is not touched), it is not able to sync to the
>> >> >> >   new master.
>> >> >> > *I somehow want to avoid this postgres master change... please let
>> >> >> > us know if there is a way to avoid it.*
>> >> >>
>> >> >> Sorry, but I don't know how to prevent this. Probably when the
>> >> >> former watchdog master recovers from a network outage and there is
>> >> >> already a PostgreSQL primary server, the watchdog master should not
>> >> >> sync the state. What do you think, Usama?
>> >> >>
>> >> >
>> >> > Yes, that's true; no functionality exists in Pgpool-II to disable the
>> >> > backend node status sync. In fact, it would be hazardous if we
>> >> > somehow disabled the node status syncing.
>> >> >
>> >> > Having said that, in the mentioned scenario, when M1 comes back and
>> >> > joins the watchdog cluster, Pgpool-II should have kept M2 as the true
>> >> > master while resolving the split brain. The algorithm used to resolve
>> >> > the true master considers quite a few parameters, and for the
>> >> > scenario you explained, M2 should have kept the master node status
>> >> > while M1 should have resigned after joining back the cluster;
>> >> > effectively, the M1 node should have been syncing the status from M2
>> >> > (keeping the proper primary node), not the other way around.
>> >> > Can you please share the Pgpool-II log files so that I can have a
>> >> > look at what went wrong in this case?
>> >>
>> >> Usama,
>> >>
>> >> Ok, the scenario (PostgreSQL primary x 2 in the end) should not have
>> >> happened. That's good news.
>> >>
>> >> Lakshmi,
>> >>
>> >> Can you please provide the Pgpool-II log files as Usama requested?
>> >>
>> >> Best regards,
>> >> --
>> >> Tatsuo Ishii
>> >> SRA OSS, Inc. Japan
>> >> English: http://www.sraoss.co.jp/index_en.php
>> >> Japanese: http://www.sraoss.co.jp
>> >>
>>