[pgpool-general: 6706] pgpool often goes down on a node

Giancarlo Celli giancarlo.celli at flottaweb.com
Wed Sep 11 19:38:57 JST 2019


Hi,
on a node that had never had problems, the pgpool often goes down from 
about 2 weeks. In the log it looks like a connection problem that really 
doesn't seem to exist. When I restart pgpool after a few hours it goes 
down again. What could be the problem?
The log also seems to fail to ping trusted servers (which are google ones).
An extract of pgpool.log is in attached file.
Please, help me.
Thanks!
Best regards.
Giancarlo
-------------- next part --------------
2019-09-11 00:17:21: pid 26293: DEBUG:  received heartbeat signal from "IPMASTER:5432" node:IPMASTER:5432 Linux NAMESERVER
2019-09-11 00:17:21: pid 26292: WARNING:  watchdog failed to ping host"8.8.8.8"
2019-09-11 00:17:21: pid 26292: DETAIL:  ping process exits with code: 1
2019-09-11 00:17:21: pid 26292: WARNING:  watchdog failed to ping host"8.8.4.4"
2019-09-11 00:17:21: pid 26292: DETAIL:  ping process exits with code: 1
2019-09-11 00:17:21: pid 26292: WARNING:  watchdog lifecheck, failed to connect to any trusted servers
2019-09-11 00:17:21: pid 26292: LOG:  informing the node status change to watchdog
2019-09-11 00:17:21: pid 26292: DETAIL:  node id :0 status = "NODE DEAD" message:"trusted server is unreachable"
2019-09-11 00:17:21: pid 26287: LOG:  new IPC connection received
2019-09-11 00:17:21: pid 26287: LOG:  received node status change ipc message
2019-09-11 00:17:21: pid 26287: DETAIL:  trusted server is unreachable
2019-09-11 00:17:21: pid 26287: DEBUG:  processing node status changed to DEAD event for node ID:0
2019-09-11 00:17:21: pid 26287: DEBUG:  STATE MACHINE INVOKED WITH EVENT = THIS NODE LOST Current State = STANDBY
2019-09-11 00:17:21: pid 26287: WARNING:  watchdog lifecheck reported, we are disconnected from the network
2019-09-11 00:17:21: pid 26287: DETAIL:  changing the state to LOST
2019-09-11 00:17:21: pid 26287: LOG:  watchdog node state changed from [STANDBY] to [LOST]
2019-09-11 00:17:21: pid 26287: DEBUG:  STATE MACHINE INVOKED WITH EVENT = STATE CHANGED Current State = LOST
2019-09-11 00:17:21: pid 26287: FATAL:  system has lost the network
2019-09-11 00:17:21: pid 26287: LOG:  Watchdog is shutting down
2019-09-11 00:17:21: pid 26287: DEBUG:  sending packet, watchdog node:[IPMASTER:5432 Linux NAMESERVER] command id:[9] type:[INFORM I AM GOING DOWN] state:[LOST]
2019-09-11 00:17:21: pid 26287: DEBUG:  sending watchdog packet to socket:8, type:[X], command ID:9, data Length:0
2019-09-11 00:17:21: pid 26287: DEBUG:  sending watchdog packet, command id:[9] type:[INFORM I AM GOING DOWN] state :[LOST]
2019-09-11 00:17:21: pid 26287: DEBUG:  new cluster command X issued with command id 9
2019-09-11 00:17:21: pid 26287: DEBUG:  shmem_exit(3): 0 callbacks to make
2019-09-11 00:17:21: pid 26287: DEBUG:  proc_exit(3): 1 callbacks to make
2019-09-11 00:17:21: pid 26287: DEBUG:  exit(3)
2019-09-11 00:17:21: pid 26287: DEBUG:  shmem_exit(-1): 0 callbacks to make
2019-09-11 00:17:21: pid 26287: DEBUG:  proc_exit(-1): 0 callbacks to make
2019-09-11 00:17:21: pid 26286: DEBUG:  reaper handler
2019-09-11 00:17:21: pid 26286: DEBUG:  watchdog child process with pid: 26287 exit with FATAL ERROR. pgpool-II will be shutdown
2019-09-11 00:17:21: pid 26286: LOG:  watchdog child process with pid: 26287 exits with status 768
2019-09-11 00:17:21: pid 26286: FATAL:  watchdog child process exit with fatal error. exiting pgpool-II
2019-09-11 00:17:21: pid 13268: LOG:  setting the local watchdog node name to "IPSLAVE:5432 Linux NAMESLAVE"
2019-09-11 00:17:21: pid 13268: LOG:  watchdog cluster is configured with 1 remote nodes
2019-09-11 00:17:21: pid 13268: LOG:  watchdog remote node:0 on IPMASTER:9000
2019-09-11 00:17:21: pid 13268: LOG:  interface monitoring is disabled in watchdog
2019-09-11 00:17:21: pid 13268: INFO:  IPC socket path: "/tmp/.s.PGPOOLWD_CMD.9000"
2019-09-11 00:17:21: pid 13268: LOG:  watchdog node state changed from [DEAD] to [LOADING]
2019-09-11 00:17:21: pid 13268: DEBUG:  STATE MACHINE INVOKED WITH EVENT = STATE CHANGED Current State = LOADING
2019-09-11 00:17:21: pid 13268: LOG:  new outbond connection to IPMASTER:9000 
2019-09-11 00:17:21: pid 13268: DEBUG:  STATE MACHINE INVOKED WITH EVENT = NEW OUTBOUND_CONNECTION Current State = LOADING
2019-09-11 00:17:21: pid 13268: DEBUG:  sending packet, watchdog node:[] command id:[4] type:[ADD NODE] state:[LOADING]
2019-09-11 00:17:21: pid 13268: DEBUG:  sending watchdog packet to socket:15, type:[A], command ID:4, data Length:258
2019-09-11 00:17:21: pid 13268: DEBUG:  sending watchdog packet, command id:[4] type:[ADD NODE] state :[LOADING]
2019-09-11 00:17:21: pid 13268: DEBUG:  client socket of  is ready for reading
2019-09-11 00:17:21: pid 13268: DEBUG:  received watchdog packet type:I
2019-09-11 00:17:21: pid 13268: DEBUG:  received packet with command id 4 from watchdog node 
2019-09-11 00:17:21: pid 13268: DEBUG:  reading packet type I of length 272
2019-09-11 00:17:21: pid 13268: DEBUG:  STATE MACHINE INVOKED WITH EVENT = PACKET RECEIVED Current State = LOADING
2019-09-11 00:17:21: pid 13268: DEBUG:  received packet, watchdog node:[] command id:[4] type:[NODE INFO] state:[LOADING]
2019-09-11 00:17:21: pid 13268: DEBUG:  state: "MASTER" Host: "IPMASTER" Name: "IPMASTER:5432 Linux NAMESERVER" WD Port:9000 PP Port: 5432 priority:1
2019-09-11 00:17:21: pid 13268: LOG:  setting the remote node "IPMASTER:5432 Linux NAMESERVER" as watchdog cluster master
2019-09-11 00:17:21: pid 13268: LOG:  watchdog node state changed from [LOADING] to [INITIALIZING]
2019-09-11 00:17:21: pid 13268: DEBUG:  STATE MACHINE INVOKED WITH EVENT = STATE CHANGED Current State = INITIALIZING
2019-09-11 00:17:21: pid 13268: DEBUG:  sending packet, watchdog node:[IPMASTER:5432 Linux NAMESERVER] command id:[5] type:[NODE INFO] state:[INITIALIZING]
2019-09-11 00:17:21: pid 13268: DEBUG:  sending watchdog packet to socket:15, type:[I], command ID:5, data Length:258
2019-09-11 00:17:21: pid 13268: DEBUG:  sending watchdog packet, command id:[5] type:[NODE INFO] state :[INITIALIZING]
2019-09-11 00:17:21: pid 13268: LOG:  new watchdog node connection is received from "IPMASTER:52390"
2019-09-11 00:17:21: pid 13268: DEBUG:  un-identified socket 16 is ready for reading
2019-09-11 00:17:21: pid 13268: DEBUG:  received watchdog packet type:A
2019-09-11 00:17:21: pid 13268: DEBUG:  received packet with command id 1624484 from watchdog node 
2019-09-11 00:17:21: pid 13268: DEBUG:  reading packet type A of length 272
2019-09-11 00:17:21: pid 13268: DEBUG:  state: "MASTER" Host: "IPMASTER" Name: "IPMASTER:5432 Linux NAMESERVER" WD Port:9000 PP Port: 5432 priority:1
2019-09-11 00:17:21: pid 13268: DEBUG:  ADD NODE MESSAGE from hostname:"IPMASTER" port:9000 pgpool_port:5432
2019-09-11 00:17:21: pid 13268: LOG:  new node joined the cluster hostname:"IPMASTER" port:9000 pgpool_port:5432
2019-09-11 00:17:21: pid 13268: DEBUG:  STATE MACHINE INVOKED WITH EVENT = PACKET RECEIVED Current State = INITIALIZING
2019-09-11 00:17:21: pid 13268: DEBUG:  received packet, watchdog node:[IPMASTER:5432 Linux NAMESERVER] command id:[1624484] type:[ADD NODE] state:[INITIALIZING]
2019-09-11 00:17:21: pid 13268: DEBUG:  sending packet, watchdog node:[IPMASTER:5432 Linux NAMESERVER] command id:[1624484] type:[NODE INFO] state:[INITIALIZING]
2019-09-11 00:17:21: pid 13268: DEBUG:  sending watchdog packet to socket:15, type:[I], command ID:1624484, data Length:258
2019-09-11 00:17:21: pid 13268: DEBUG:  sending watchdog packet, command id:[1624484] type:[NODE INFO] state :[INITIALIZING]
2019-09-11 00:17:21: pid 13268: LOG:  Watchdog is shutting down
2019-09-11 00:17:21: pid 13268: DEBUG:  sending packet, watchdog node:[IPMASTER:5432 Linux NAMESERVER] command id:[6] type:[INFORM I AM GOING DOWN] state:[INITIALIZING]
2019-09-11 00:17:21: pid 13268: DEBUG:  sending watchdog packet to socket:15, type:[X], command ID:6, data Length:0
2019-09-11 00:17:21: pid 13268: DEBUG:  sending watchdog packet, command id:[6] type:[INFORM I AM GOING DOWN] state :[INITIALIZING]
2019-09-11 00:17:21: pid 13268: DEBUG:  new cluster command X issued with command id 6
2019-09-11 00:17:21: pid 26327: DEBUG:  PCP child receives shutdown request signal 2, Forwarding to all children
2019-09-11 00:17:21: pid 26327: DEBUG:  PCP child receives fast shutdown request
2019-09-11 00:17:21: pid 26292: DEBUG:  lifecheck child receives shutdown request signal 2, forwarding to all children
2019-09-11 00:17:21: pid 26292: DEBUG:  lifecheck child receives fast shutdown request
2019-09-11 00:17:21: pid 26293: DEBUG:  watchdog heartbeat receiver child receives shutdown request signal 2
2019-09-11 00:17:21: pid 26294: DEBUG:  watchdog heartbeat sender child receives shutdown request signal 2


More information about the pgpool-general mailing list