[pgpool-general: 7019] Re: Pgpool-II-4.1.1 and Ubuntu 20.04
Wolf Schwurack
wolf at uen.org
Fri May 15 05:39:03 JST 2020
After more testing: when these parameters are commented out
#other_pgpool_hostname1 = 'pgtest-03'
#other_pgpool_port1 = 9999
#other_wd_port1 = 9000
pgpool fails to start on the third node.
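For a three-node watchdog cluster, each node's pgpool.conf normally lists both of its peers, so on pgtest-03 the section would look something like the following (a sketch based on the host names and ports in this thread; commenting out the hostname1 entries leaves each node aware of only one peer):

```
# - Other pgpool Connection Settings - (sketch for pgtest-03)
other_pgpool_hostname0 = 'pgtest-01'
other_pgpool_port0 = 9999
other_wd_port0 = 9000
other_pgpool_hostname1 = 'pgtest-02'
other_pgpool_port1 = 9999
other_wd_port1 = 9000
```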
On 5/14/20, 2:10 PM, "pgpool-general-bounces at pgpool.net on behalf of Wolf Schwurack" <pgpool-general-bounces at pgpool.net on behalf of wolf at uen.org> wrote:
After some more testing, I found that if I comment out the "other_pgpool_hostname1" parameters in pgpool.conf, the watchdog starts on both 4.0.8 and 4.1.1 with only the master node running.
# - Other pgpool Connection Settings -
other_pgpool_hostname0 = 'pgtest-02'
other_pgpool_port0 = 9999
other_wd_port0 = 9000
#other_pgpool_hostname1 = 'pgtest-03'
#other_pgpool_port1 = 9999
#other_wd_port1 = 9000
pgpool.log
2020-05-14 13:57:20: pid 52482: LOG: successfully acquired the delegate IP:"10.11.0.204"
2020-05-14 13:57:20: pid 52482: DETAIL: 'if_up_cmd' returned with success
2020-05-14 13:57:20: pid 52476: LOG: watchdog escalation process with pid: 52482 exit with SUCCESS.
When I enable these parameters in pgpool.conf
other_pgpool_hostname1 = 'pgtest-03'
other_pgpool_port1 = 9999
other_wd_port1 = 9000
The master node fails to start the watchdog until another pgpool node is started.
Wolfgang Schwurack
Database/System Administrator
Utah Education Network
801-587-9444
wolf at uen.org
On 5/13/20, 10:34 AM, "pgpool-general-bounces at pgpool.net on behalf of Wolf Schwurack" <pgpool-general-bounces at pgpool.net on behalf of wolf at uen.org> wrote:
I installed pgpool-II-4.0.8. Using the pgpool.conf from pgpool-II-4.1.1, the watchdog failed to start, but when I used the pgpool.conf from pgpool-II-4.0.8, the watchdog started on a single node.
2020-05-13 10:28:50: pid 11671: LOG: pgpool-II successfully started. version 4.0.8 (torokiboshi)
2020-05-13 10:28:50: pid 11671: LOG: node status[0]: 1
2020-05-13 10:28:50: pid 11671: LOG: node status[1]: 2
2020-05-13 10:28:51: pid 11685: LOG: creating socket for sending heartbeat
2020-05-13 10:28:51: pid 11685: DETAIL: bind send socket to device: ens160
2020-05-13 10:28:51: pid 11685: LOG: set SO_REUSEPORT option to the socket
2020-05-13 10:28:51: pid 11685: LOG: creating socket for sending heartbeat
2020-05-13 10:28:51: pid 11685: DETAIL: set SO_REUSEPORT
2020-05-13 10:28:51: pid 11684: LOG: createing watchdog heartbeat receive socket.
2020-05-13 10:28:51: pid 11684: DETAIL: bind receive socket to device: "ens160"
2020-05-13 10:28:51: pid 11684: LOG: set SO_REUSEPORT option to the socket
2020-05-13 10:28:51: pid 11684: LOG: creating watchdog heartbeat receive socket.
2020-05-13 10:28:51: pid 11684: DETAIL: set SO_REUSEPORT
2020-05-13 10:28:57: pid 11681: LOG: successfully acquired the delegate IP:"10.11.0.204"
2020-05-13 10:28:57: pid 11681: DETAIL: 'if_up_cmd' returned with success
2020-05-13 10:28:57: pid 11673: LOG: watchdog escalation process with pid: 11681 exit with SUCCESS.
Wolfgang Schwurack
Database/System Administrator
Utah Education Network
801-587-9444
wolf at uen.org
On 5/13/20, 9:39 AM, "pgpool-general-bounces at pgpool.net on behalf of Wolf Schwurack" <pgpool-general-bounces at pgpool.net on behalf of wolf at uen.org> wrote:
Still testing this, I decided to start pgpool on the second node, which started the watchdog. I know you told me before that with an even number of nodes I should set "enable_consensus_with_half_votes = on", which is how it is set in pgpool.conf. But I have a 3-node setup, which is an odd number of nodes.
It seems that the parameter "enable_consensus_with_half_votes" is not working: whether I set it to 'on' or 'off', the delegate IP is only acquired once the second node is running pgpool.
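For what it's worth, this looks consistent with how I understand the watchdog's quorum arithmetic rather than a bug: "enable_consensus_with_half_votes" only changes the outcome for even-sized clusters, so with three nodes configured a lone node can never hold quorum. A minimal sketch of that arithmetic (my own illustration, not pgpool source code):

```python
def watchdog_quorum(n_nodes, live_nodes, half_votes=False):
    """Sketch of watchdog quorum arithmetic as I understand it.

    Normally a strict majority of configured watchdog nodes is
    required.  With enable_consensus_with_half_votes = on, exactly
    half of an even-sized cluster also counts; odd-sized clusters
    are unaffected, since half of an odd number is never a whole
    number of votes."""
    if half_votes and n_nodes % 2 == 0:
        required = n_nodes // 2          # half is enough
    else:
        required = n_nodes // 2 + 1      # strict majority
    return live_nodes >= required

# 3 configured nodes: a lone node (1 vote) never reaches quorum,
# regardless of the half-votes setting -- no delegate IP.
print(watchdog_quorum(3, 1, half_votes=True))   # False
print(watchdog_quorum(3, 2, half_votes=True))   # True

# 2 configured nodes (other_pgpool_hostname1 commented out):
# half-votes lets a single node hold quorum and acquire the IP.
print(watchdog_quorum(2, 1, half_votes=True))   # True
print(watchdog_quorum(2, 1, half_votes=False))  # False
```

That would explain why commenting out the third node's entries lets a single node acquire the delegate IP, while the full 3-node configuration waits for a second node.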
Here's the log output once I start pgpool on the second node.
2020-05-13 09:11:08: pid 4103: LOG: new watchdog node connection is received from "10.11.0.206:17581"
2020-05-13 09:11:08: pid 4103: LOG: new node joined the cluster hostname:"pgtest-02" port:9000 pgpool_port:9999
2020-05-13 09:11:08: pid 4103: DETAIL: Pgpool-II version:"4.1.1" watchdog messaging version: 1.1
2020-05-13 09:11:08: pid 4103: LOG: new outbound connection to pgtest-02:9000
2020-05-13 09:11:14: pid 4103: LOG: adding watchdog node "pgtest-02:9999 Linux pgtest-02" to the standby list
2020-05-13 09:11:14: pid 4103: LOG: quorum found
2020-05-13 09:11:14: pid 4103: DETAIL: starting escalation process
2020-05-13 09:11:14: pid 4103: LOG: escalation process started with PID:4231
2020-05-13 09:11:14: pid 4099: LOG: Pgpool-II parent process received watchdog quorum change signal from watchdog
2020-05-13 09:11:14: pid 4103: LOG: new IPC connection received
2020-05-13 09:11:14: pid 4099: LOG: watchdog cluster now holds the quorum
2020-05-13 09:11:14: pid 4099: DETAIL: updating the state of quarantine backend nodes
2020-05-13 09:11:14: pid 4103: LOG: new IPC connection received
2020-05-13 09:11:14: pid 4231: LOG: watchdog: escalation started
2020-05-13 09:11:21: pid 4231: LOG: successfully acquired the delegate IP:"10.11.0.204"
2020-05-13 09:11:21: pid 4231: DETAIL: 'if_up_cmd' returned with success
2020-05-13 09:11:21: pid 4103: LOG: watchdog escalation process with pid: 4231 exit with SUCCESS.
Then I start pgpool on the third node
2020-05-13 09:26:52: pid 4906: LOG: new watchdog node connection is received from "10.11.0.207:47278"
2020-05-13 09:26:52: pid 4906: LOG: new node joined the cluster hostname:"pgtest-03" port:9000 pgpool_port:9999
2020-05-13 09:26:52: pid 4906: DETAIL: Pgpool-II version:"4.1.1" watchdog messaging version: 1.1
2020-05-13 09:26:52: pid 4906: LOG: new outbound connection to pgtest-03:9000
2020-05-13 09:26:58: pid 4906: LOG: adding watchdog node "pgtest-03:9999 Linux pgtest-03" to the standby list
Now when I stop pgpool on the third node:
2020-05-13 09:34:19: pid 4906: LOG: client socket of pgtest-03:9999 Linux pgtest-03 is closed
2020-05-13 09:34:19: pid 4906: LOG: remote node "pgtest-03:9999 Linux pgtest-03" is shutting down
2020-05-13 09:34:19: pid 4906: LOG: removing watchdog node "pgtest-03:9999 Linux pgtest-03" from the standby list
The watchdog is still running and the node keeps the delegate IP:
root at pgtest-01:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 00:50:56:b3:d3:58 brd ff:ff:ff:ff:ff:ff
inet 10.11.0.205/24 brd 10.11.0.255 scope global ens160
valid_lft forever preferred_lft forever
inet 10.11.0.204/24 brd 10.11.0.255 scope global secondary ens160:1
valid_lft forever preferred_lft forever
Now when I stop pgpool on the second node
2020-05-13 09:36:01: pid 4906: LOG: We have lost the quorum
2020-05-13 09:36:01: pid 4902: LOG: Pgpool-II parent process received watchdog quorum change signal from watchdog
2020-05-13 09:36:01: pid 4906: LOG: new IPC connection received
2020-05-13 09:36:01: pid 5504: LOG: watchdog: de-escalation started
2020-05-13 09:36:01: pid 5504: LOG: successfully released the delegate IP:"10.11.0.204"
2020-05-13 09:36:01: pid 5504: DETAIL: 'if_down_cmd' returned with success
2020-05-13 09:36:01: pid 4906: LOG: watchdog de-escalation process with pid: 5504 exit with SUCCESS.
_______________________________________________
pgpool-general mailing list
pgpool-general at pgpool.net
http://www.pgpool.net/mailman/listinfo/pgpool-general