[pgpool-general: 7274] Re: Node status "lost" not recognized by standby PgPool

Bo Peng pengbo at sraoss.co.jp
Mon Sep 14 20:10:32 JST 2020


Hi,

Sorry for the late response.

Thank you for logging debug messages.
I found the following logs on both node0 and node2.

It seems that the watchdog can't connect to the other two pgpool nodes to send and receive heartbeat signals.
I think this may be the reason why node2 can't detect that node1 is down.
Could you check whether port 9694 is open and can receive UDP packets?
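
One way to run that check by hand is a small UDP probe. This is only a sketch under assumptions: it is not part of pgpool, the function names are made up for illustration, and it assumes pgpool is stopped on the receiving node so that port 9694 is free to bind.

```python
import socket

HEARTBEAT_PORT = 9694  # port named in this mail; adjust if your config differs


def open_udp_listener(port=HEARTBEAT_PORT, bind_addr="0.0.0.0"):
    """Bind a UDP socket; raises OSError if another process still holds the port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_addr, port))
    return sock


def wait_for_probe(sock, timeout=10.0):
    """Return (payload, sender_address), or None if nothing arrives in time."""
    sock.settimeout(timeout)
    try:
        return sock.recvfrom(1024)
    except socket.timeout:
        return None


def send_probe(host, port=HEARTBEAT_PORT, payload=b"probe"):
    """Send one test datagram; UDP gives the sender no delivery confirmation."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
```

On the receiving node, call wait_for_probe(open_udp_listener()), then call send_probe("centos8i3-int") from another node. If the listener returns None, a firewall or routing rule is probably dropping the UDP packets.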

=======================
Sep  2 11:18:53 centos8i3 pgpool[6663]: [20-1] pid 6663: DEBUG:  watchdog checking life check is ready
Sep  2 11:18:53 centos8i3 pgpool[6663]: [20-2] pid 6663: DETAIL:  pgpool:1 at "centos8i1-int:5432" has not send the heartbeat signal yet
Sep  2 11:18:53 centos8i3 pgpool[6663]: [21-1] pid 6663: DEBUG:  watchdog checking life check is ready
Sep  2 11:18:53 centos8i3 pgpool[6663]: [21-2] pid 6663: DETAIL:  pgpool:2 at "centos8i2-int:5432" has not send the heartbeat signal yet
Sep  2 11:20:33 centos8i3 pgpool[6663]: [22-1] pid 6663: DEBUG:  watchdog checking life check is ready
Sep  2 11:20:33 centos8i3 pgpool[6663]: [22-2] pid 6663: DETAIL:  pgpool:1 at "centos8i1-int:5432" has not send the heartbeat signal yet
Sep  2 11:20:33 centos8i3 pgpool[6663]: [23-1] pid 6663: DEBUG:  watchdog checking life check is ready
Sep  2 11:20:33 centos8i3 pgpool[6663]: [23-2] pid 6663: DETAIL:  pgpool:2 at "centos8i2-int:5432" has not send the heartbeat signal yet
Sep  2 11:22:13 centos8i3 pgpool[6663]: [24-1] pid 6663: DEBUG:  watchdog checking life check is ready
Sep  2 11:22:13 centos8i3 pgpool[6663]: [24-2] pid 6663: DETAIL:  pgpool:1 at "centos8i1-int:5432" has not send the heartbeat signal yet
Sep  2 11:22:13 centos8i3 pgpool[6663]: [25-1] pid 6663: DEBUG:  watchdog checking life check is ready
Sep  2 11:22:13 centos8i3 pgpool[6663]: [25-2] pid 6663: DETAIL:  pgpool:2 at "centos8i2-int:5432" has not send the heartbeat signal yet
Sep  2 11:23:53 centos8i3 pgpool[6663]: [26-1] pid 6663: DEBUG:  watchdog checking life check is ready
Sep  2 11:23:53 centos8i3 pgpool[6663]: [26-2] pid 6663: DETAIL:  pgpool:1 at "centos8i1-int:5432" has not send the heartbeat signal yet
Sep  2 11:23:53 centos8i3 pgpool[6663]: [27-1] pid 6663: DEBUG:  watchdog checking life check is ready
Sep  2 11:23:53 centos8i3 pgpool[6663]: [27-2] pid 6663: DETAIL:  pgpool:2 at "centos8i2-int:5432" has not send the heartbeat signal yet
Sep  2 11:25:33 centos8i3 pgpool[6663]: [28-1] pid 6663: DEBUG:  watchdog checking life check is ready
Sep  2 11:25:33 centos8i3 pgpool[6663]: [28-2] pid 6663: DETAIL:  pgpool:1 at "centos8i1-int:5432" has not send the heartbeat signal yet
Sep  2 11:25:33 centos8i3 pgpool[6663]: [29-1] pid 6663: DEBUG:  watchdog checking life check is ready
Sep  2 11:25:33 centos8i3 pgpool[6663]: [29-2] pid 6663: DETAIL:  pgpool:2 at "centos8i2-int:5432" has not send the heartbeat signal yet
==========================
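
To see at a glance which peers never delivered a heartbeat, the DETAIL lines in the log excerpt can be tallied per peer. This is a minimal sketch, assuming syslog lines in the format shown there; the function name is illustrative and not part of pgpool.

```python
import re
from collections import Counter

# Matches the DETAIL line logged while the life check is still waiting for a peer.
# The wording ("has not send") is copied verbatim from the log messages.
NO_HEARTBEAT = re.compile(
    r'DETAIL:\s+pgpool:(\d+) at "([^"]+)" has not send the heartbeat signal yet'
)


def missing_heartbeats(lines):
    """Count, per (peer index, peer address), how often no heartbeat was reported."""
    counts = Counter()
    for line in lines:
        match = NO_HEARTBEAT.search(line)
        if match:
            counts[(int(match.group(1)), match.group(2))] += 1
    return counts
```

For example, missing_heartbeats(open("pgpool.log")) returns a counter keyed by peer (the file name here is a placeholder). A peer whose count keeps growing on every check has not been heard from at all.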

Regards,

On Mon, 14 Sep 2020 08:26:27 +0300
Anssi Kanninen <anssi at iki.fi> wrote:

> Any update on this?
> Cheers,
>     Anssi
> 
> On 2 September 2020 13:16:59 EEST, Anssi Kanninen <anssi at iki.fi> wrote:
> >Thanks Bo Peng,
> >
> >Here are the "power off" logs with debugging enabled.
> >
> >Cheers,
> >   - Anssi Kanninen
> >
> >Sep  2 11:20:13
> >
> >
> >
> >
> >On Wed, 2 Sep 2020, Bo Peng wrote:
> >
> >> Hi,
> >>
> >> Thank you for sharing the log and config files.
> >>
> >> I can't see a log line like:
> >>
> >>   "LOG:  watchdog: lifecheck started"
> >>
> >> Watchdog lifecheck may not be running properly.
> >> Sometimes watchdog lifecheck takes time,
> >> could you wait a while and check the watchdog status again?
> >>
> >> To check the details of watchdog lifecheck behaviour you need to enable debug mode.
> >> Is it possible to enable debug mode and try again?
> >>
> >> To enable debug mode (If you installed pgpool from rpm):
> >>
> >> vi /etc/sysconfig/pgpool
> >>
> >> OPTS=" -D -n" => OPTS="-d -D -n"
> >>
> >>
> >> On Tue, 1 Sep 2020 12:04:35 +0300 (FLE Daylight Time)
> >> Anssi Kanninen <anssi at iki.fi> wrote:
> >>
> >>> Here are my configs and logs for nodes 0, 1 and 2 regarding the "power off" problem.
> >>>
> >>> CLEAN SHUTDOWN LOGS: pgpool-shutdown.log.nodeX
> >>>
> >>> ***** Node status from node 0 after shutdown of node 1:
> >>>
> >>> [node0]$ pcp_watchdog_info -w -h localhost
> >>> 3 NO centos8i3-int:5432 Linux centos8i3.localdomain centos8i3-int
> >>>
> >>> centos8i1-int:5432 Linux centos8i1.localdomain centos8i1-int 5432 9000 7 STANDBY
> >>> centos8i2-int:5432 Linux centos8i2.localdomain centos8i2-int 5432 9000 10 SHUTDOWN
> >>> centos8i3-int:5432 Linux centos8i3.localdomain centos8i3-int 5432 9000 4 MASTER
> >>>
> >>> ***** Node status from node 2 after shutdown of node 1:
> >>>
> >>> [node2]$ pcp_watchdog_info -w -h localhost
> >>> 3 YES centos8i3-int:5432 Linux centos8i3.localdomain centos8i3-int
> >>>
> >>> centos8i1-int:5432 Linux centos8i1.localdomain centos8i1-int 5432 9000 7 STANDBY
> >>> centos8i2-int:5432 Linux centos8i2.localdomain centos8i2-int 5432 9000 10 SHUTDOWN
> >>> centos8i3-int:5432 Linux centos8i3.localdomain centos8i3-int 5432 9000 4 MASTER
> >>>
> >>> =====> Everything is ok
> >>>
> >>>
> >>> NON-CLEAN ("power switch off") SHUTDOWN LOGS: pgpool-poweroff.log.nodeX
> >>>
> >>> ***** Node status from node 0 after power off of node 1:
> >>>
> >>> [node0]$ pcp_watchdog_info -w -h localhost
> >>> 3 YES centos8i1-int:5432 Linux centos8i1.localdomain centos8i1-int
> >>>
> >>> centos8i1-int:5432 Linux centos8i1.localdomain centos8i1-int 5432 9000 4 MASTER
> >>> centos8i2-int:5432 Linux centos8i2.localdomain centos8i2-int 5432 9000 8 LOST
> >>> centos8i3-int:5432 Linux centos8i3.localdomain centos8i3-int 5432 9000 7 STANDBY
> >>>
> >>>
> >>> ***** Node status from node 2 after power off of node 1:
> >>>
> >>> [node2]$ pcp_watchdog_info -w -h localhost
> >>> 3 NO centos8i1-int:5432 Linux centos8i1.localdomain centos8i1-int
> >>>
> >>> centos8i1-int:5432 Linux centos8i1.localdomain centos8i1-int 5432 9000 4 MASTER
> >>> centos8i2-int:5432 Linux centos8i2.localdomain centos8i2-int 5432 9000 7 STANDBY
> >>> centos8i3-int:5432 Linux centos8i3.localdomain centos8i3-int 5432 9000 7 STANDBY
> >>>
> >>> =====> Node 2 thinks node 1 is still in standby mode
> >>>
> >>> Cheers!
> >>>    - Anssi Kanninen
> >>>
> >>>
> >>> On Mon, 31 Aug 2020, Bo Peng wrote:
> >>>
> >>>> Hello,
> >>>>
> >>>> On Fri, 28 Aug 2020 12:27:48 +0300 (FLE Daylight Time)
> >>>> Anssi Kanninen <anssi at iki.fi> wrote:
> >>>>
> >>>>> Hi everyone!
> >>>>>
> >>>>> I'm having a problem with information exchange between PgPool instances.
> >>>>> I have 3 nodes, each containing one DB backend instance and one PgPool
> >>>>> instance.
> >>>>>
> >>>>> If I shut down one standby node cleanly, everything seems to go ok. The
> >>>>> master PgPool notices that and informs the remaining standby PgPool
> >>>>> about it.
> >>>>>
> >>>>> But the situation changes if a standby node just vanishes from the
> >>>>> network by powering it off without a clean shutdown. The master PgPool
> >>>>> marks the node as "lost" but the remaining standby PgPool still thinks
> >>>>> we have another standby PgPool. It doesn't get any information about
> >>>>> the lost node.
> >>>>
> >>>> How did you shut down the pgpool node?
> >>>> Could you share the pgpool.log of each node?
> >>>>
> >>>>> Here it goes. In the example I'm checking the statuses by connecting
> >>>>> to each node with pcp_watchdog_info. I have sorted the results by node
> >>>>> hostname.
> >>>>>
> >>>>> Nodes are:
> >>>>> * ID 0 (centos8i1-int)
> >>>>> * ID 1 (centos8i2-int)
> >>>>> * ID 2 (centos8i3-int).
> >>>>>
> >>>>> ***** INITIAL SETUP *****
> >>>>>
> >>>>> $ pcp_watchdog_info -w -h centos8i1-int
> >>>>> 3 YES centos8i1-int:5432 Linux centos8i1.localdomain centos8i1-int
> >>>>>
> >>>>> centos8i1-int:5432 Linux centos8i1.localdomain centos8i1-int 5432 9000 4 MASTER
> >>>>> centos8i2-int:5432 Linux centos8i2.localdomain centos8i2-int 5432 9000 7 STANDBY
> >>>>> centos8i3-int:5432 Linux centos8i3.localdomain centos8i3-int 5432 9000 7 STANDBY
> >>>>>
> >>>>> $ pcp_watchdog_info -w -h centos8i2-int
> >>>>> 3 NO centos8i1-int:5432 Linux centos8i1.localdomain centos8i1-int
> >>>>>
> >>>>> centos8i1-int:5432 Linux centos8i1.localdomain centos8i1-int 5432 9000 4 MASTER
> >>>>> centos8i2-int:5432 Linux centos8i2.localdomain centos8i2-int 5432 9000 7 STANDBY
> >>>>> centos8i3-int:5432 Linux centos8i3.localdomain centos8i3-int 5432 9000 7 STANDBY
> >>>>>
> >>>>> $ pcp_watchdog_info -w -h centos8i3-int
> >>>>> 3 NO centos8i1-int:5432 Linux centos8i1.localdomain centos8i1-int
> >>>>>
> >>>>> centos8i1-int:5432 Linux centos8i1.localdomain centos8i1-int 5432 9000 4 MASTER
> >>>>> centos8i2-int:5432 Linux centos8i2.localdomain centos8i2-int 5432 9000 7 STANDBY
> >>>>> centos8i3-int:5432 Linux centos8i3.localdomain centos8i3-int 5432 9000 7 STANDBY
> >>>>>
> >>>>> ***** SHUTDOWN node ID 1 *****
> >>>>>
> >>>>> $ pcp_watchdog_info -w -h centos8i1-int
> >>>>> 3 YES centos8i1-int:5432 Linux centos8i1.localdomain centos8i1-int
> >>>>>
> >>>>> centos8i1-int:5432 Linux centos8i1.localdomain centos8i1-int 5432 9000 4 MASTER
> >>>>> centos8i2-int:5432 Linux centos8i2.localdomain centos8i2-int 5432 9000 10 SHUTDOWN
> >>>>> centos8i3-int:5432 Linux centos8i3.localdomain centos8i3-int 5432 9000 7 STANDBY
> >>>>>
> >>>>> $ pcp_watchdog_info -w -h centos8i3-int
> >>>>> 3 NO centos8i1-int:5432 Linux centos8i1.localdomain centos8i1-int
> >>>>>
> >>>>> centos8i1-int:5432 Linux centos8i1.localdomain centos8i1-int 5432 9000 4 MASTER
> >>>>> centos8i2-int:5432 Linux centos8i2.localdomain centos8i2-int 5432 9000 10 SHUTDOWN
> >>>>> centos8i3-int:5432 Linux centos8i3.localdomain centos8i3-int 5432 9000 7 STANDBY
> >>>>>
> >>>>> ***** RESTART node ID 1 *****
> >>>>>
> >>>>> $ pcp_watchdog_info -w -h centos8i1-int
> >>>>> 3 YES centos8i1-int:5432 Linux centos8i1.localdomain centos8i1-int
> >>>>>
> >>>>> centos8i1-int:5432 Linux centos8i1.localdomain centos8i1-int 5432 9000 4 MASTER
> >>>>> centos8i2-int:5432 Linux centos8i2.localdomain centos8i2-int 5432 9000 7 STANDBY
> >>>>> centos8i3-int:5432 Linux centos8i3.localdomain centos8i3-int 5432 9000 7 STANDBY
> >>>>>
> >>>>> $ pcp_watchdog_info -w -h centos8i2-int
> >>>>> 3 NO centos8i1-int:5432 Linux centos8i1.localdomain centos8i1-int
> >>>>>
> >>>>> centos8i1-int:5432 Linux centos8i1.localdomain centos8i1-int 5432 9000 4 MASTER
> >>>>> centos8i2-int:5432 Linux centos8i2.localdomain centos8i2-int 5432 9000 7 STANDBY
> >>>>> centos8i3-int:5432 Linux centos8i3.localdomain centos8i3-int 5432 9000 7 STANDBY
> >>>>>
> >>>>> $ pcp_watchdog_info -w -h centos8i3-int
> >>>>> 3 NO centos8i1-int:5432 Linux centos8i1.localdomain centos8i1-int
> >>>>>
> >>>>> centos8i1-int:5432 Linux centos8i1.localdomain centos8i1-int 5432 9000 4 MASTER
> >>>>> centos8i2-int:5432 Linux centos8i2.localdomain centos8i2-int 5432 9000 7 STANDBY
> >>>>> centos8i3-int:5432 Linux centos8i3.localdomain centos8i3-int 5432 9000 7 STANDBY
> >>>>>
> >>>>> ***** POWER OFF node ID 1 *****
> >>>>>
> >>>>> $ pcp_watchdog_info -w -h centos8i1-int
> >>>>> 3 YES centos8i1-int:5432 Linux centos8i1.localdomain centos8i1-int
> >>>>>
> >>>>> centos8i1-int:5432 Linux centos8i1.localdomain centos8i1-int 5432 9000 4 MASTER
> >>>>> centos8i2-int:5432 Linux centos8i2.localdomain centos8i2-int 5432 9000 8 LOST
> >>>>> centos8i3-int:5432 Linux centos8i3.localdomain centos8i3-int 5432 9000 7 STANDBY
> >>>>>
> >>>>> $ pcp_watchdog_info -w -h centos8i3-int
> >>>>> 3 NO centos8i1-int:5432 Linux centos8i1.localdomain centos8i1-int
> >>>>>
> >>>>> centos8i1-int:5432 Linux centos8i1.localdomain centos8i1-int 5432 9000 4 MASTER
> >>>>> centos8i2-int:5432 Linux centos8i2.localdomain centos8i2-int 5432 9000 7 STANDBY
> >>>>> centos8i3-int:5432 Linux centos8i3.localdomain centos8i3-int 5432 9000 7 STANDBY
> >>>>>
> >>>>>
> >>>>> Best regards,
> >>>>> Anssi Kanninen
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> anssi at iki.fi
> >>>>> _______________________________________________
> >>>>> pgpool-general mailing list
> >>>>> pgpool-general at pgpool.net
> >>>>> http://www.pgpool.net/mailman/listinfo/pgpool-general
> >>>>
> >>>>
> >>>> --
> >>>> Bo Peng <pengbo at sraoss.co.jp>
> >>>> SRA OSS, Inc. Japan
> >>>>
> >>>
> >>> --
> >>> anssi at iki.fi
> >>
> >>
> >> -- 
> >> Bo Peng <pengbo at sraoss.co.jp>
> >> SRA OSS, Inc. Japan
> >>
> >
> >-- 
> >anssi at iki.fi


-- 
Bo Peng <pengbo at sraoss.co.jp>
SRA OSS, Inc. Japan

