[pgpool-general: 2287] Re: Strange watchdog - trusted server issue
Yugo Nagata
nagata at sraoss.co.jp
Fri Nov 15 16:11:43 JST 2013
Hi,
On Fri, 15 Nov 2013 15:44:24 +0900
Yugo Nagata <nagata at sraoss.co.jp> wrote:
> Hi,
>
> > Nov 14 13:34:11 pgdb92-id03 pgpool[10108]: get_result: ping data: PING
> > 10.201.101.92 (10.201.101.92) 56(84) bytes of data.#012#012---
> > 10.201.101.92 ping statistics ---#0123 packets transmitted, 3 received,
> > 0% packet loss, time 2006ms#012rtt min/avg/max/mdev =
> > 0.000/0.000/0.002/0.001 ms
>
I understand the problem now. It's very simple.
The watchdog regards a ping as successful only when the average RTT is > 0.
So when the average RTT is exactly 0 (as in the 0.000 ms avg in the log above),
the ping is regarded as a failure. :-(
Could you try the attached patch?
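To illustrate the behaviour described above, here is a minimal sketch. It is
not the actual wd_ping.c code and not the attached patch; the function names
parse_avg_rtt, ping_ok_strict and ping_ok_relaxed are hypothetical, and the
relaxed check is only one plausible way to accept a zero average.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not pgpool-II code): locate the
 * "rtt min/avg/max/mdev = a/b/c/d ms" summary in ping output and
 * store the average RTT in *avg_ms. Returns true only if the summary
 * line was found and its first three numbers were parsed. */
static bool parse_avg_rtt(const char *ping_output, double *avg_ms)
{
    assert(ping_output != NULL && avg_ms != NULL);

    const char *p = strstr(ping_output, "min/avg/max");
    if (p == NULL)
        return false;
    p = strchr(p, '=');
    if (p == NULL)
        return false;

    double min_ms, avg, max_ms;
    if (sscanf(p + 1, " %lf/%lf/%lf", &min_ms, &avg, &max_ms) != 3)
        return false;

    *avg_ms = avg;
    return true;
}

/* The check as described in this mail: an average of exactly 0
 * is treated as a failed ping. */
static bool ping_ok_strict(const char *ping_output)
{
    double avg;
    return parse_avg_rtt(ping_output, &avg) && avg > 0.0;
}

/* Assumed relaxation for illustration: any parsed, non-negative
 * average counts as success. The real patch may differ. */
static bool ping_ok_relaxed(const char *ping_output)
{
    double avg;
    return parse_avg_rtt(ping_output, &avg) && avg >= 0.0;
}
```

With the summary line from the log (avg 0.000 ms), ping_ok_strict() reports
failure while ping_ok_relaxed() reports success; output with no summary line
fails both checks.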
> There is a strange string '#012', which should be a line break.
> Is this the same in the other log lines about PING, or only right
> before pgpool-II goes down?
>
>
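On the '#012' strings: 012 is the octal code for a line feed, so it looks as
if the syslog daemon wrote the control character as '#' followed by three
octal digits. A small sketch, under that assumption, for turning such escapes
back into the original bytes when reading the log; decode_syslog_escapes is a
hypothetical helper and not part of pgpool-II.

```c
#include <assert.h>
#include <stddef.h>

static int is_octal(char c)
{
    return c >= '0' && c <= '7';
}

/* Decode "#ooo" sequences (three octal digits, value 1..255) in place.
 * Anything that does not match is copied unchanged, so a plain "#42"
 * is left alone. Returns the length of the decoded string. */
static size_t decode_syslog_escapes(char *s)
{
    assert(s != NULL);

    const char *r = s;  /* read position */
    char *w = s;        /* write position, never ahead of r */

    while (*r != '\0') {
        if (r[0] == '#' && r[1] >= '0' && r[1] <= '3' &&
            is_octal(r[2]) && is_octal(r[3])) {
            int value = ((r[1] - '0') << 6) | ((r[2] - '0') << 3) | (r[3] - '0');
            if (value != 0) {   /* do not emit an embedded NUL */
                *w++ = (char)value;
                r += 4;
                continue;
            }
        }
        *w++ = *r++;
    }
    *w = '\0';
    return (size_t)(w - s);
}
```

Applied to "statistics ---#0123 packets transmitted", this yields a line break
between "---" and "3 packets transmitted", which matches normal ping output.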
> On Thu, 14 Nov 2013 14:45:10 +0100
> Sam Wouters <sam at ericom.be> wrote:
>
> > Hi,
> >
> > I have a running pgpool-II (3.3.1 on ubuntu-lts, package from postgresql
> > repo) cluster, consisting of three nodes with watchdog enabled.
> > After a random period of time (a couple of hours), watchdog goes into down
> > state, with the log lines below. This happens consistently and on different
> > clusters, with no network issues that I know of (checked tcpdumps etc.).
> > The log also says that ping to the trusted servers succeeds, but
> > nevertheless you get the "failed to connect to any trusted servers"?
> >
> > Any help in debugging this issue would be very much appreciated....
> >
> > Sam
> >
> > <LOG SNIPPET>
> > Nov 14 13:34:11 pgdb92-id03 pgpool[10107]: wd_hb_send: send 224 byte packet
> > Nov 14 13:34:11 pgdb92-id03 pgpool[10107]: wd_hb_sender: send heartbeat
> > signal to 10.201.101.92:9694
> > Nov 14 13:34:11 pgdb92-id03 pgpool[10106]: wd_hb_recv: received 224 byte
> > packet
> > Nov 14 13:34:11 pgdb92-id03 pgpool[10106]: wd_hb_receiver: received
> > heartbeat signal from 10.201.101.92:5432
> > Nov 14 13:34:11 pgdb92-id03 pgpool[10108]: exec_ping: succeed to ping
> > 10.201.101.91
> > Nov 14 13:34:11 pgdb92-id03 pgpool[10108]: get_result: ping data: PING
> > 10.201.101.91 (10.201.101.91) 56(84) bytes of data.#012
> > Nov 14 13:34:11 pgdb92-id03 pgpool[10108]: exec_ping: succeed to ping
> > 10.201.101.92
> > Nov 14 13:34:11 pgdb92-id03 pgpool[10108]: get_result: ping data: PING
> > 10.201.101.92 (10.201.101.92) 56(84) bytes of data.#012#012---
> > 10.201.101.92 ping statistics ---#0123 packets transmitted, 3 received,
> > 0% packet loss, time 2006ms#012rtt min/avg/max/mdev =
> > 0.000/0.000/0.002/0.001 ms
> > Nov 14 13:34:11 pgdb92-id03 pgpool[10108]: wd_lifecheck: failed to
> > connect to any trusted servers
> > Nov 14 13:34:11 pgdb92-id03 pgpool[10108]: wd_IP_down: not delegate IP
> > holder
> > </LOG SNIPPET>
> >
> >
>
>
> --
> Yugo Nagata <nagata at sraoss.co.jp>
> _______________________________________________
> pgpool-general mailing list
> pgpool-general at pgpool.net
> http://www.pgpool.net/mailman/listinfo/pgpool-general
--
Yugo Nagata <nagata at sraoss.co.jp>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: wd_ping.c.diff
Type: text/x-diff
Size: 413 bytes
Desc: not available
URL: <http://www.pgpool.net/pipermail/pgpool-general/attachments/20131115/11fca3b2/attachment.bin>