[pgpool-general: 6903] Re: watchdog fails to start pgpool-4.1.0

Wolf Schwurack wolf at uen.org
Wed Mar 4 23:53:49 JST 2020


Hi Muhammad
I’m sending you the pgpool.log from both node0 and node1. The pgpool.log files are from runs using the pgpool.conf from 4.1.0. I’m also sending the pgpool.conf files from 4.0.5 and 4.1.0. When I start pgpool using the 4.0.5 pgpool.conf I don’t get any errors, but when I use the pgpool.conf from 4.1.0 I get “We are in split brain”.

I tarred all the files into pgpool.tar.gz.

Wolfgang Schwurack
Database/System Administrator
Utah Education Network
801-587-9444
wolf at uen.org



From: Muhammad Usama <m.usama at gmail.com>
Date: Wednesday, March 4, 2020 at 2:20 AM
To: Tatsuo Ishii <ishii at sraoss.co.jp>
Cc: Wolfgang Schwurack <wolf at uen.org>, PgPool General <pgpool-general at pgpool.net>
Subject: Re: [pgpool-general: 6865] Re: watchdog fails to start pgpool-4.1.0

Hi Wolfgang,

Sorry for the late reply. I just realized the email was sitting in my drafts folder and was never sent.

Would it be possible for you to share the Pgpool-II log files for both nodes, preferably with debug enabled?
Meanwhile, I am also trying to reproduce the scenario locally.

Thanks
Best regards
Muhammad Usama



On Tue, Feb 18, 2020 at 12:13 PM Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
Hi Usama,

Any opinion on this?

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

> I turned on enable_consensus_with_half_votes, and now node 0 acquires
> the delegate IP. But when I start pgpool on node 1, I get the following
> in the log file, repeating (see below). When I check which node has the
> virtual IP, it shows that node 0, which is the master node, has it.
>
> 2020-02-12 08:11:52: pid 29493: LOG:  watchdog node state changed from [INITIALIZING] to [MASTER]
> 2020-02-12 08:11:52: pid 29493: LOG:  I am announcing my self as master/coordinator watchdog node
> 2020-02-12 08:11:52: pid 29493: LOG:  remote node "" decided it is the true master
> 2020-02-12 08:11:52: pid 29493: DETAIL:  re-initializing the local watchdog cluster state because of split-brain
> 2020-02-12 08:11:52: pid 29493: LOG:  watchdog node state changed from [MASTER] to [JOINING]
> 2020-02-12 08:11:53: pid 29493: LOG:  new watchdog node connection is received from "10.11.0.202:12399"
> 2020-02-12 08:11:56: pid 29493: LOG:  watchdog node state changed from [JOINING] to [INITIALIZING]
> 2020-02-12 08:11:57: pid 29493: LOG:  I am the only alive node in the watchdog cluster
> 2020-02-12 08:11:57: pid 29493: HINT:  skipping stand for coordinator state
>
> My environment
> 2 pgpool hosts on Ubuntu 18
> 2 PostgreSQL hosts on Ubuntu 18, PostgreSQL 11
>
>
> Wolfgang Schwurack
> Database/System Administrator
> Utah Education Network
> 801-587-9444
> Wolf at uen.org
>
>
>
>
>
> On 2/11/20, 3:50 PM, "Tatsuo Ishii" <ishii at sraoss.co.jp> wrote:
>
>>Have you turned on enable_consensus_with_half_votes?
>>From 4.1 you need to turn this on if you use an even number of
>>Pgpool-II nodes.
>>It's documented in the migration section in the doc:
>>https://www.pgpool.net/docs/latest/en/html/release-4-1-0.html
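>>
>>As a minimal sketch, assuming a two-node watchdog setup like the one
>>in this thread, the line in pgpool.conf would look like this (the
>>comment is illustrative; the parameter stays off unless you set it):
>>
>>```
>># pgpool.conf, Pgpool-II 4.1 or later
>>enable_consensus_with_half_votes = on
>>```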
>>
>>Best regards,
>>--
>>Tatsuo Ishii
>>SRA OSS, Inc. Japan
>>English: http://www.sraoss.co.jp/index_en.php
>>Japanese:http://www.sraoss.co.jp
>>
>>From: Wolf Schwurack <wolf at uen.org>
>>Subject: [pgpool-general: 6865] Re: watchdog fails to start pgpool-4.1.0
>>Date: Tue, 11 Feb 2020 18:10:25 +0000
>>Message-ID: <56216C05-00F8-4C10-A32A-C793411C7891 at umail.utah.edu>
>>
>>> After doing some more testing on version 4.1.0 I have noticed that if
>>>node 0 fails, node 1 never acquires the delegate IP. I compared this to
>>>version 4.0.5, where node 1 acquires the delegate IP when node 0 fails.
>>>
>>> Wolfgang Schwurack
>>> Database/System Administrator
>>> Utah Education Network
>>> 801-587-9444
>>> wolf at uen.org
>>>
>>> From: "pgpool-general-bounces at pgpool.net"
>>><pgpool-general-bounces at pgpool.net> on behalf of Wolfgang Schwurack
>>><wolf at uen.org>
>>> Date: Tuesday, February 11, 2020 at 10:54 AM
>>> To: "pgpool-general at pgpool.net" <pgpool-general at pgpool.net>
>>> Subject: [pgpool-general: 6864] Re: watchdog fails to start pgpool-4.1.0
>>>
>>> It seems that version 4.1.0 requires the second node to be started
>>>before the delegate IP is acquired.
>>> After starting pgpool on node 1 I'm seeing that watchdog
>>>successfully acquired the delegate IP on node 0
>>>
>>> 2020-02-11 10:45:26: pid 9928: LOG:  watchdog: escalation started
>>> 2020-02-11 10:45:33: pid 9928: LOG:  successfully acquired the delegate IP:"10.11.0.204"
>>> 2020-02-11 10:45:33: pid 9928: DETAIL:  'if_up_cmd' returned with success
>>> 2020-02-11 10:45:33: pid 9577: LOG:  watchdog escalation process with pid: 9928 exit with SUCCESS.
>>>
>>> On previous versions watchdog would always acquire the delegate IP
>>>without the second node being started.
>>>
>>>
>>> From: "pgpool-general-bounces at pgpool.net"
>>><pgpool-general-bounces at pgpool.net> on behalf of Wolfgang Schwurack
>>><wolf at uen.org>
>>> Date: Tuesday, February 11, 2020 at 10:22 AM
>>> To: "pgpool-general at pgpool.net" <pgpool-general at pgpool.net>
>>> Subject: [pgpool-general: 6863] watchdog fails to start pgpool-4.1.0
>>>
>>> I'm trying to get watchdog to start using pgpool-4.1.0 but it fails to
>>>start. I have been using pgpool-4.0.5 with watchdog with no issues.
>>> Has something changed in version 4.1.0 for watchdog?
>>> Hosts - Ubuntu 18.04
>>> PostgreSQL 11
>>>
>>> I've been using pgpool for a long time; on each new release I have
>>>always just done ./configure, make, make install
>>>
>>> This is my start command
>>>
>>> /usr/local/bin/pgpool -n -D -f /usr/local/etc/pgpool.conf > /var/log/pgpool/pgpool.log 2>&1 &
>>> pgpool.log would always show whether it acquired the delegate IP.
>>> Version 4.0.5 starts up watchdog:
>>>
>>> 2020-02-11 10:13:05: pid 2195: LOG:  pgpool-II successfully started. version 4.0.5 (torokiboshi)
>>> 2020-02-11 10:13:05: pid 2195: LOG:  node status[0]: 1
>>> 2020-02-11 10:13:05: pid 2195: LOG:  node status[1]: 2
>>> 2020-02-11 10:13:06: pid 2228: LOG:  creating socket for sending heartbeat
>>> 2020-02-11 10:13:06: pid 2228: DETAIL:  bind send socket to device: eth0
>>> 2020-02-11 10:13:06: pid 2228: LOG:  set SO_REUSEPORT option to the socket
>>> 2020-02-11 10:13:06: pid 2228: LOG:  creating socket for sending heartbeat
>>> 2020-02-11 10:13:06: pid 2228: DETAIL:  set SO_REUSEPORT
>>> 2020-02-11 10:13:06: pid 2227: LOG:  createing watchdog heartbeat receive socket.
>>> 2020-02-11 10:13:06: pid 2227: DETAIL:  bind receive socket to device: "eth0"
>>> 2020-02-11 10:13:06: pid 2227: LOG:  set SO_REUSEPORT option to the socket
>>> 2020-02-11 10:13:06: pid 2227: LOG:  creating watchdog heartbeat receive socket.
>>> 2020-02-11 10:13:06: pid 2227: DETAIL:  set SO_REUSEPORT
>>> 2020-02-11 10:13:12: pid 2200: LOG:  successfully acquired the delegate IP:"10.11.0.204"
>>> 2020-02-11 10:13:12: pid 2200: DETAIL:  'if_up_cmd' returned with success
>>> 2020-02-11 10:13:12: pid 2197: LOG:  watchdog escalation process with pid: 2200 exit with SUCCESS.
>>>
>>> Version 4.1.0 fails to start watchdog
>>>
>>> 2020-02-11 10:15:54: pid 8392: LOG:  pgpool-II successfully started. version 4.1.0 (karasukiboshi)
>>> 2020-02-11 10:15:54: pid 8392: LOG:  node status[0]: 1
>>> 2020-02-11 10:15:54: pid 8392: LOG:  node status[1]: 2
>>> 2020-02-11 10:15:55: pid 8425: LOG:  creating socket for sending heartbeat
>>> 2020-02-11 10:15:55: pid 8425: DETAIL:  bind send socket to device: eth0
>>> 2020-02-11 10:15:55: pid 8425: LOG:  set SO_REUSEPORT option to the socket
>>> 2020-02-11 10:15:55: pid 8425: LOG:  creating socket for sending heartbeat
>>> 2020-02-11 10:15:55: pid 8425: DETAIL:  set SO_REUSEPORT
>>> 2020-02-11 10:15:55: pid 8424: LOG:  createing watchdog heartbeat receive socket.
>>> 2020-02-11 10:15:55: pid 8424: DETAIL:  bind receive socket to device: "eth0"
>>> 2020-02-11 10:15:55: pid 8424: LOG:  set SO_REUSEPORT option to the socket
>>> 2020-02-11 10:15:55: pid 8424: LOG:  creating watchdog heartbeat receive socket.
>>> 2020-02-11 10:15:55: pid 8424: DETAIL:  set SO_REUSEPORT
>>>
>>>
>>> Wolfgang Schwurack
>>> Database/System Administrator
>>> Utah Education Network
>>> 801-587-9444
>>> wolf at uen.org
>>>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pgpool.tar.gz
Type: application/x-gzip
Size: 32102 bytes
Desc: pgpool.tar.gz
URL: <http://www.sraoss.jp/pipermail/pgpool-general/attachments/20200304/374e2567/attachment-0001.gz>
