[pgpool-general: 1355] Re: [PgpoolI 3.2.1] Watchdog: IP address conflict when both pgpool start in the same time.

Thomas Martin tmartincpp at gmail.com
Fri Feb 1 01:33:14 JST 2013


Hi Wolf.

Actually I don't really want to start both pgpools at the same time,
but this kind of situation could happen in real life.
For example, suppose there is a power outage in the cabinet that houses
the servers and they all reboot: with the same hardware and
configuration they often take the same amount of time to launch
services (and so to launch pgpool).

In this kind of situation this issue could be really annoying.
Maybe pgpool could wait a bit to check whether another node is about
to join the party before bringing up the delegate IP (ideally with a
configurable wait time).

I know this is not a common situation, but it could happen and it's
not negligible (in my opinion at least).


Regards.

2013/1/31 Wolf Schwurack <wolf at uen.org>:
> Not sure why you would want both to start at the same time. The primary node needs to start and get watchdog started, then start the secondary node. If both are started at the same time then neither node knows that another node is the primary.
>
> Wolf
>
>
> -----Original Message-----
> From: pgpool-general-bounces at pgpool.net [mailto:pgpool-general-bounces at pgpool.net] On Behalf Of Thomas Martin
> Sent: Thursday, January 31, 2013 2:15 AM
> To: pgpool-general at pgpool.net
> Subject: [pgpool-general: 1353] [PgpoolI 3.2.1] Watchdog: IP address conflict when both pgpool start in the same time.
>
> Hi all.
>
> I'm having a really annoying issue with pgpool-II 3.2.1 and the watchdog
> feature: if both pgpools start at the same time, they both become master.
>
> Here is an example:
>
> 1) Pgpool is down so there is no IP configured:
> root@pgpool2-1:~# ifconfig | grep 10.59.10.67
> root@pgpool2-2:~# ifconfig | grep 10.59.10.67
>
> 2) Start of pgpool on first node:
> root@pgpool2-1:~# date ; /usr/bin/itf/postgres/pgpool -n
> Thu Jan 31 09:06:53 UTC 2013
> 2013-01-31 09:06:53 LOG:   pid 32713: wd_chk_sticky:
> ifup[/sbin/ifconfig] doesn't have sticky bit
> 2013-01-31 09:06:53 LOG:   pid 32713: read_status_file: 1 th backend
> is set to down status
> 2013-01-31 09:06:53 LOG:   pid 32713: wd_create_send_socket: connect()
> reports failure (Connection refused). You can safely ignore this while starting up.
> 2013-01-31 09:06:56 LOG:   pid 32713: wd_escalation: escalated to master pgpool
> 2013-01-31 09:06:56 LOG:   pid 32713: wd_create_send_socket: connect()
> reports failure (Connection refused). You can safely ignore this while starting up.
> 2013-01-31 09:06:56 LOG:   pid 32713: wd_escalation:  escaleted to
> delegate_IP holder
> 2013-01-31 09:06:56 LOG:   pid 32713: wd_init: start watchdog
> 2013-01-31 09:06:56 LOG:   pid 32713: pgpool-II successfully started.
> version 3.2.1 (namameboshi)
> 2013-01-31 09:06:56 LOG:   pid 32713: find_primary_node: primary node id is 0
> 2013-01-31 09:06:57 LOG:   pid 32731: connection received:
> host=10.59.10.68 port=40768
> 2013-01-31 09:06:57 LOG:   pid 32719: watchdog: lifecheck started
> 2013-01-31 09:06:57 LOG:   pid 32731: connection received:
> host=10.59.10.68 port=40771
> 2013-01-31 09:06:57 LOG:   pid 32731: connection received:
> host=10.59.10.66 port=32852
> 2013-01-31 09:06:57 LOG:   pid 32731: connection received:
> host=10.59.10.66 port=32856
>
> 3) Start of pgpool on second node at the same time:
> root@pgpool2-2:~# date ; /usr/bin/itf/postgres/pgpool -n
> Thu Jan 31 09:06:53 UTC 2013
> 2013-01-31 09:06:53 LOG:   pid 11691: wd_chk_sticky:
> ifup[/sbin/ifconfig] doesn't have sticky bit
> 2013-01-31 09:06:53 LOG:   pid 11691: wd_create_send_socket: connect()
> reports failure (Connection refused). You can safely ignore this while starting up.
> 2013-01-31 09:06:56 LOG:   pid 11691: wd_escalation: escalated to master pgpool
> 2013-01-31 09:06:56 LOG:   pid 11691: wd_create_send_socket: connect()
> reports failure (Connection refused). You can safely ignore this while starting up.
> 2013-01-31 09:06:56 LOG:   pid 11691: wd_escalation:  escaleted to
> delegate_IP holder
> 2013-01-31 09:06:56 LOG:   pid 11691: wd_init: start watchdog
> 2013-01-31 09:06:56 LOG:   pid 11691: pgpool-II successfully started.
> version 3.2.1 (namameboshi)
> 2013-01-31 09:06:56 LOG:   pid 11691: find_primary_node: primary node id is 0
> 2013-01-31 09:06:57 LOG:   pid 11706: connection received:
> host=10.59.10.65 port=55593
> 2013-01-31 09:06:57 LOG:   pid 11706: connection received:
> host=10.59.10.65 port=55596
> 2013-01-31 09:06:57 LOG:   pid 11709: connection received:
> host=10.59.10.69 port=54965
> 2013-01-31 09:06:57 LOG:   pid 11697: watchdog: lifecheck started
> 2013-01-31 09:06:57 LOG:   pid 11709: connection received:
> host=10.59.10.69 port=54969
> 2013-01-31 09:07:07 LOG:   pid 11709: connection received:
> host=10.59.10.65 port=55601
> 2013-01-31 09:07:07 LOG:   pid 11709: connection received:
> host=10.59.10.69 port=54975
> 2013-01-31 09:07:17 LOG:   pid 11709: connection received:
> host=10.59.10.65 port=55605
> 2013-01-31 09:07:17 LOG:   pid 11709: connection received:
> host=10.59.10.69 port=54981
> 2013-01-31 09:07:27 LOG:   pid 11709: connection received:
> host=10.59.10.65 port=55609
> 2013-01-31 09:07:27 LOG:   pid 11709: connection received:
> host=10.59.10.69 port=54987
>
> 4) Both nodes have the delegated IP:
> root@pgpool2-1:~# ifconfig | grep 10.59.10.67
>           inet addr:10.59.10.67  Bcast:10.59.10.255  Mask:255.255.255.0
> root@pgpool2-2:~# ifconfig | grep 10.59.10.67
>           inet addr:10.59.10.67  Bcast:10.59.10.255  Mask:255.255.255.0
>
> 5) If I start the second pgpool with a little delay there is no conflict.
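
The double escalation in steps 2) and 3) can also be spotted from the two
logs without logging in to run ifconfig. Below is a rough sketch, assuming
the log format shown above; the function names and the 5-second window are
made up for illustration, and the check is only a heuristic for "both
nodes escalated at about the same time".

```python
import re
from datetime import datetime

# Matches the watchdog escalation line as it appears in the logs above.
ESCALATION = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) LOG:\s+pid \d+: "
    r"wd_escalation: escalated to master pgpool"
)


def escalation_time(log_text):
    """Return the time of the first escalation to master in a log, or None."""
    for line in log_text.splitlines():
        match = ESCALATION.match(line.strip())
        if match:
            return datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
    return None


def both_escalated(log_a, log_b, window_seconds=5):
    """True if both logs show an escalation to master within the window.

    window_seconds is an assumed threshold, not a pgpool setting.
    """
    time_a, time_b = escalation_time(log_a), escalation_time(log_b)
    if time_a is None or time_b is None:
        return False
    return abs((time_a - time_b).total_seconds()) <= window_seconds
```

Fed with the raw logs of the two nodes from this example, both_escalated()
would report the conflict, since both show the escalation at 09:06:56.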
>
>
> If needed, I can send you my pgpool configurations; I can reproduce this every time.
>
> Thanks.
> _______________________________________________
> pgpool-general mailing list
> pgpool-general at pgpool.net
> http://www.pgpool.net/mailman/listinfo/pgpool-general

