[pgpool-general: 5977] Automatic Failover Randomly does not work
Pankaj Joshi
pankajjo02 at gmail.com
Mon Mar 5 21:57:46 JST 2018
Yes, I am sure that the DBs on both servers were attached to the pool; that is why
I mentioned that it works most of the time but sometimes does not.
What can I do to provide more details that will help debug this issue?
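
For example, I could capture the following from the surviving node (osboxes44)
right after the next failed failover, if that would help. This is just a sketch;
it assumes the ports from my pgpool.conf (9999 for pgpool, 9898 for pcp), the pcp
user "pgpool", and any existing database user for psql:

# backend status as pgpool sees it (node id, host, status, role)
psql -h localhost -p 9999 -U postgres -c "show pool_nodes;"

# watchdog state on this node
pcp_watchdog_info -h localhost -p 9898 -U pgpool -v
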
Thanks
Pankaj
On Mon, Mar 5, 2018 at 5:59 PM, Pierre Timmermans <ptim007 at yahoo.com> wrote:
> Hello
>
> In the log, the failover command is indeed incorrect, since %H (the
> hostname of the new master) is empty. So I suppose your failover script cannot
> connect to the new master to do the promotion.
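>
> For example, a minimal guard at the top of such a failover script (just a
> sketch; it assumes the script receives the failed node id and %H as its first
> two arguments, which is what the logged invocation suggests) could be:
>
> #!/bin/bash
> # failover.sh <failed_node_id> <new_master_host>  (hypothetical argument order)
> failed_node_id="$1"
> new_master_host="$2"   # filled in from %H by pgpool
>
> if [ -z "$new_master_host" ]; then
>     # %H was empty: pgpool has no candidate to promote, so bail out cleanly
>     logger -t failover.sh "no new master host for failed node $failed_node_id, skipping promotion"
>     exit 0
> fi
>
> # otherwise promote the standby, e.g. via repmgr over ssh
> ssh postgres@"$new_master_host" "repmgr standby promote"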
>
> Maybe it means that pgpool does not have a node available to become a new
> master? For example, if the DB on osboxes44 is not attached to the pool,
> then you will get this behavior. Are you sure that before the reboot of
> osboxes75 both DBs were attached to the pool?
>
> Pierre
>
>
> On Monday, March 5, 2018, 10:41:28 AM GMT+1, Pankaj Joshi <
> pankajjo02 at gmail.com> wrote:
>
>
> Hello Team,
>
> We have a setup of 2 nodes running pgpool-II 3.7.2 (amefuriboshi),
> PostgreSQL 9.6.7, and repmgr. These 2 nodes run both pgpool and PostgreSQL
> in HA mode.
>
> In production as well as in the test environment, when the primary pgpool and
> the primary PostgreSQL are on the same node and that node is shut down or
> restarted, the automatic failover sporadically fails. In my testing, out of 20
> instances of primary node failure, the auto failover fails 3-4 times; the
> rest of the time it works fine.
>
> We have 2 nodes, osboxes44 and osboxes75. On checking the logs, each time
> the failover command which is executed is incorrect:
>
> execute command: /etc/pgpool-II-96/failover.sh 1 ""
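>
> A failover_command template along these lines would produce exactly that
> invocation (this is only a guess at the template, not copied from the config):
>
> failover_command = '/etc/pgpool-II-96/failover.sh %d %H'
>
> so the second argument, which should carry the hostname of the new master
> (%H), arrives as an empty string.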
>
>
> I am posting pgpool logs from 3 instances when the failover failed; pgpool
> keeps looking for the primary node. In all three cases the secondary
> PostgreSQL node was up and running. I am posting pgpool.conf as well,
> after the incident logs.
>
>
> *Instance 1*
>
> Feb 3 20:22:55 osboxes44 pgpool[5288]: [11-1] 2018-02-03 20:22:55: pid
> 5288: LOG: failed to connect to PostgreSQL server on "osboxes75:5432" with
> error "Network is unreachable"
> Feb 3 20:22:55 osboxes44 pgpool[5288]: [12-1] 2018-02-03 20:22:55: pid
> 5288: LOG: health check retrying on DB node: 1 (round:1)
> Feb 3 20:22:55 osboxes44 pgpool[5287]: [11-1] 2018-02-03 20:22:55: pid
> 5287: LOG: failed to connect to PostgreSQL server on "osboxes75:5432" with
> error "Network is unreachable"
> Feb 3 20:22:56 osboxes44 pgpool[5288]: [13-1] 2018-02-03 20:22:56: pid
> 5288: LOG: failed to connect to PostgreSQL server on "osboxes75:5432" with
> error "Network is unreachable"
> Feb 3 20:22:56 osboxes44 pgpool[5288]: [14-1] 2018-02-03 20:22:56: pid
> 5288: LOG: health check retrying on DB node: 1 (round:2)
> Feb 3 20:22:57 osboxes44 pgpool[5288]: [15-1] 2018-02-03 20:22:57: pid
> 5288: LOG: failed to connect to PostgreSQL server on "osboxes75:5432" with
> error "Network is unreachable"
> Feb 3 20:22:57 osboxes44 pgpool[5288]: [16-1] 2018-02-03 20:22:57: pid
> 5288: LOG: health check retrying on DB node: 1 (round:3)
> Feb 3 20:22:58 osboxes44 pgpool[5288]: [17-1] 2018-02-03 20:22:58: pid
> 5288: LOG: failed to connect to PostgreSQL server on "osboxes75:5432" with
> error "Network is unreachable"
> Feb 3 20:22:58 osboxes44 pgpool[5288]: [18-1] 2018-02-03 20:22:58: pid
> 5288: LOG: health check retrying on DB node: 1 (round:4)
> Feb 3 20:22:59 osboxes44 pgpool[5288]: [19-1] 2018-02-03 20:22:59: pid
> 5288: LOG: failed to connect to PostgreSQL server on "osboxes75:5432" with
> error "Network is unreachable"
> Feb 3 20:22:59 osboxes44 pgpool[5288]: [20-1] 2018-02-03 20:22:59: pid
> 5288: LOG: health check retrying on DB node: 1 (round:5)
> Feb 3 20:23:00 osboxes44 pgpool[5288]: [21-1] 2018-02-03 20:23:00: pid
> 5288: LOG: failed to connect to PostgreSQL server on "osboxes75:5432" with
> error "Network is unreachable"
> Feb 3 20:23:00 osboxes44 pgpool[5288]: [22-1] 2018-02-03 20:23:00: pid
> 5288: LOG: health check retrying on DB node: 1 (round:6)
> Feb 3 20:23:01 osboxes44 pgpool[5288]: [23-1] 2018-02-03 20:23:01: pid
> 5288: LOG: failed to connect to PostgreSQL server on "osboxes75:5432" with
> error "Network is unreachable"
> Feb 3 20:23:01 osboxes44 pgpool[5288]: [24-1] 2018-02-03 20:23:01: pid
> 5288: LOG: health check retrying on DB node: 1 (round:7)
> Feb 3 20:23:02 osboxes44 pgpool[5288]: [25-1] 2018-02-03 20:23:02: pid
> 5288: LOG: failed to connect to PostgreSQL server on "osboxes75:5432" with
> error "Network is unreachable"
> Feb 3 20:23:02 osboxes44 pgpool[5288]: [26-1] 2018-02-03 20:23:02: pid
> 5288: LOG: health check retrying on DB node: 1 (round:8)
> Feb 3 20:23:03 osboxes44 pgpool[5288]: [27-1] 2018-02-03 20:23:03: pid
> 5288: LOG: failed to connect to PostgreSQL server on "osboxes75:5432" with
> error "Network is unreachable"
> Feb 3 20:23:03 osboxes44 pgpool[5288]: [28-1] 2018-02-03 20:23:03: pid
> 5288: LOG: health check retrying on DB node: 1 (round:9)
> Feb 3 20:23:04 osboxes44 pgpool[5288]: [29-1] 2018-02-03 20:23:04: pid
> 5288: LOG: failed to connect to PostgreSQL server on "osboxes75:5432" with
> error "Network is unreachable"
> Feb 3 20:23:04 osboxes44 pgpool[5288]: [30-1] 2018-02-03 20:23:04: pid
> 5288: LOG: health check retrying on DB node: 1 (round:10)
> Feb 3 20:23:05 osboxes44 pgpool[5287]: [12-1] 2018-02-03 20:23:05: pid
> 5287: LOG: failed to connect to PostgreSQL server on "osboxes75:5432" with
> error "Network is unreachable"
> Feb 3 20:23:05 osboxes44 pgpool[5288]: [31-1] 2018-02-03 20:23:05: pid
> 5288: LOG: failed to connect to PostgreSQL server on "osboxes75:5432" with
> error "Network is unreachable"
> Feb 3 20:23:05 osboxes44 pgpool[5288]: [32-1] 2018-02-03 20:23:05: pid
> 5288: LOG: health check failed on node 1 (timeout:0)
> Feb 3 20:23:05 osboxes44 pgpool[5288]: [33-1] 2018-02-03 20:23:05: pid
> 5288: LOG: received degenerate backend request for node_id: 1 from pid
> [5288]
> Feb 3 20:23:05 osboxes44 pgpool[5246]: [21-1] 2018-02-03 20:23:05: pid
> 5246: LOG: new IPC connection received
> Feb 3 20:23:05 osboxes44 pgpool[5246]: [22-1] 2018-02-03 20:23:05: pid
> 5246: LOG: failover request from local pgpool-II node received on IPC
> interface is forwarded to master watchdog node "osboxes75:5431 Linux
> osboxes75"
> Feb 3 20:23:05 osboxes44 pgpool[5246]: [22-2] 2018-02-03 20:23:05: pid
> 5246: DETAIL: waiting for the reply...
> Feb 3 20:23:10 osboxes44 pgpool[5246]: [23-1] 2018-02-03 20:23:10: pid
> 5246: LOG: remote node "osboxes75:5431 Linux osboxes75" is not replying..
> Feb 3 20:23:10 osboxes44 pgpool[5246]: [23-2] 2018-02-03 20:23:10: pid
> 5246: DETAIL: marking the node as lost
> Feb 3 20:23:10 osboxes44 pgpool[5246]: [24-1] 2018-02-03 20:23:10: pid
> 5246: LOG: remote node "osboxes75:5431 Linux osboxes75" is lost
> Feb 3 20:23:10 osboxes44 pgpool[5246]: [25-1] 2018-02-03 20:23:10: pid
> 5246: LOG: watchdog cluster has lost the coordinator node
> Feb 3 20:23:10 osboxes44 pgpool[5246]: [26-1] 2018-02-03 20:23:10: pid
> 5246: LOG: unassigning the remote node "osboxes75:5431 Linux osboxes75"
> from watchdog cluster master
> Feb 3 20:23:10 osboxes44 pgpool[5246]: [27-1] 2018-02-03 20:23:10: pid
> 5246: LOG: We have lost the cluster master node "osboxes75:5431 Linux
> osboxes75"
> Feb 3 20:23:10 osboxes44 pgpool[5246]: [28-1] 2018-02-03 20:23:10: pid
> 5246: LOG: watchdog node state changed from [STANDBY] to [JOINING]
> Feb 3 20:23:10 osboxes44 pgpool[5246]: [29-1] 2018-02-03 20:23:10: pid
> 5246: LOG: connect on socket failed
> Feb 3 20:23:10 osboxes44 pgpool[5246]: [29-2] 2018-02-03 20:23:10: pid
> 5246: DETAIL: connect failed with error: "Network is unreachable"
> Feb 3 20:23:10 osboxes44 pgpool[5288]: [34-1] 2018-02-03 20:23:10: pid
> 5288: LOG: degenerate backend request for 1 node(s) from pid [5288] is
> canceled by other pgpool
> Feb 3 20:23:14 osboxes44 pgpool[5246]: [30-1] 2018-02-03 20:23:14: pid
> 5246: LOG: watchdog node state changed from [JOINING] to [INITIALIZING]
> Feb 3 20:23:15 osboxes44 pgpool[5287]: [13-1] 2018-02-03 20:23:15: pid
> 5287: LOG: failed to connect to PostgreSQL server on "osboxes75:5432" with
> error "Network is unreachable"
> Feb 3 20:23:15 osboxes44 pgpool[5246]: [31-1] 2018-02-03 20:23:15: pid
> 5246: LOG: I am the only alive node in the watchdog cluster
> Feb 3 20:23:15 osboxes44 pgpool[5246]: [31-2] 2018-02-03 20:23:15: pid
> 5246: HINT: skiping stand for coordinator state
> Feb 3 20:23:15 osboxes44 pgpool[5246]: [32-1] 2018-02-03 20:23:15: pid
> 5246: LOG: watchdog node state changed from [INITIALIZING] to [MASTER]
> Feb 3 20:23:15 osboxes44 pgpool[5246]: [33-1] 2018-02-03 20:23:15: pid
> 5246: LOG: I am announcing my self as master/coordinator watchdog node
> Feb 3 20:23:19 osboxes44 pgpool[5246]: [34-1] 2018-02-03 20:23:19: pid
> 5246: LOG: I am the cluster leader node
> Feb 3 20:23:19 osboxes44 pgpool[5246]: [34-2] 2018-02-03 20:23:19: pid
> 5246: DETAIL: our declare coordinator message is accepted by all nodes
> Feb 3 20:23:19 osboxes44 pgpool[5246]: [35-1] 2018-02-03 20:23:19: pid
> 5246: LOG: setting the local node "osboxes44:5431 Linux osboxes44" as
> watchdog cluster master
> Feb 3 20:23:19 osboxes44 pgpool[5246]: [36-1] 2018-02-03 20:23:19: pid
> 5246: LOG: I am the cluster leader node. Starting escalation process
> Feb 3 20:23:19 osboxes44 pgpool[5246]: [37-1] 2018-02-03 20:23:19: pid
> 5246: LOG: escalation process started with PID:31303
> Feb 3 20:23:19 osboxes44 pgpool[5246]: [38-1] 2018-02-03 20:23:19: pid
> 5246: LOG: new IPC connection received
> Feb 3 20:23:19 osboxes44 pgpool[31303]: [37-1] 2018-02-03 20:23:19: pid
> 31303: LOG: watchdog: escalation started
> Feb 3 20:23:19 osboxes44 pgpool[31303]: [38-1] 2018-02-03 20:23:19: pid
> 31303: LOG: failed to acquire the delegate IP address
> Feb 3 20:23:19 osboxes44 pgpool[31303]: [38-2] 2018-02-03 20:23:19: pid
> 31303: DETAIL: 'if_up_cmd' failed
> Feb 3 20:23:19 osboxes44 pgpool[5246]: [39-1] 2018-02-03 20:23:19: pid
> 5246: LOG: watchdog escalation process with pid: 31303 exit with SUCCESS.
> Feb 3 20:23:22 osboxes44 pgpool[5288]: [35-1] 2018-02-03 20:23:22: pid
> 5288: LOG: failed to connect to PostgreSQL server on "osboxes75:5432",
> getsockopt() detected error "No route to host"
> Feb 3 20:23:22 osboxes44 pgpool[5288]: [36-1] 2018-02-03 20:23:22: pid
> 5288: LOG: health check retrying on DB node: 1 (round:1)
> Feb 3 20:23:25 osboxes44 pgpool[5288]: [37-1] 2018-02-03 20:23:25: pid
> 5288: LOG: failed to connect to PostgreSQL server on "osboxes75:5432",
> getsockopt() detected error "No route to host"
> Feb 3 20:23:25 osboxes44 pgpool[5288]: [38-1] 2018-02-03 20:23:25: pid
> 5288: LOG: health check retrying on DB node: 1 (round:2)
> Feb 3 20:23:25 osboxes44 pgpool[5287]: [14-1] 2018-02-03 20:23:25: pid
> 5287: LOG: failed to connect to PostgreSQL server on "osboxes75:5432",
> getsockopt() detected error "No route to host"
> Feb 3 20:23:28 osboxes44 pgpool[5288]: [39-1] 2018-02-03 20:23:28: pid
> 5288: LOG: failed to connect to PostgreSQL server on "osboxes75:5432",
> getsockopt() detected error "No route to host"
> Feb 3 20:23:28 osboxes44 pgpool[5288]: [40-1] 2018-02-03 20:23:28: pid
> 5288: LOG: health check retrying on DB node: 1 (round:3)
> Feb 3 20:23:31 osboxes44 pgpool[5288]: [41-1] 2018-02-03 20:23:31: pid
> 5288: LOG: failed to connect to PostgreSQL server on "osboxes75:5432",
> getsockopt() detected error "No route to host"
> Feb 3 20:23:31 osboxes44 pgpool[5288]: [42-1] 2018-02-03 20:23:31: pid
> 5288: LOG: health check retrying on DB node: 1 (round:4)
> Feb 3 20:23:34 osboxes44 pgpool[5288]: [43-1] 2018-02-03 20:23:34: pid
> 5288: LOG: failed to connect to PostgreSQL server on "osboxes75:5432",
> getsockopt() detected error "No route to host"
> Feb 3 20:23:37 osboxes44 pgpool[5287]: [15-1] 2018-02-03 20:23:37: pid
> 5287: LOG: failed to connect to PostgreSQL server on "osboxes75:5432",
> getsockopt() detected error "No route to host"
> Feb 3 20:23:37 osboxes44 pgpool[5288]: [45-1] 2018-02-03 20:23:37: pid
> 5288: LOG: failed to connect to PostgreSQL server on "osboxes75:5432",
> getsockopt() detected error "No route to host"
> Feb 3 20:23:37 osboxes44 pgpool[5288]: [46-1] 2018-02-03 20:23:37: pid
> 5288: LOG: health check retrying on DB node: 1 (round:6)
> Feb 3 20:23:41 osboxes44 pgpool[5288]: [47-1] 2018-02-03 20:23:41: pid
> 5288: LOG: failed to connect to PostgreSQL server on "osboxes75:5432",
> getsockopt() detected error "No route to host"
> Feb 3 20:23:41 osboxes44 pgpool[5288]: [48-1] 2018-02-03 20:23:41: pid
> 5288: LOG: health check retrying on DB node: 1 (round:7)
> Feb 3 20:23:44 osboxes44 pgpool[5288]: [49-1] 2018-02-03 20:23:44: pid
> 5288: LOG: failed to connect to PostgreSQL server on "osboxes75:5432",
> getsockopt() detected error "No route to host"
> Feb 3 20:23:44 osboxes44 pgpool[5288]: [50-1] 2018-02-03 20:23:44: pid
> 5288: LOG: health check retrying on DB node: 1 (round:8)
> Feb 3 20:23:47 osboxes44 pgpool[5288]: [51-1] 2018-02-03 20:23:47: pid
> 5288: LOG: failed to connect to PostgreSQL server on "osboxes75:5432",
> getsockopt() detected error "No route to host"
> Feb 3 20:23:47 osboxes44 pgpool[5288]: [52-1] 2018-02-03 20:23:47: pid
> 5288: LOG: health check retrying on DB node: 1 (round:9)
> Feb 3 20:23:50 osboxes44 pgpool[5288]: [53-1] 2018-02-03 20:23:50: pid
> 5288: LOG: failed to connect to PostgreSQL server on "osboxes75:5432",
> getsockopt() detected error "No route to host"
> Feb 3 20:23:50 osboxes44 pgpool[5288]: [54-1] 2018-02-03 20:23:50: pid
> 5288: LOG: health check retrying on DB node: 1 (round:10)
> Feb 3 20:23:50 osboxes44 pgpool[5287]: [16-1] 2018-02-03 20:23:50: pid
> 5287: LOG: failed to connect to PostgreSQL server on "osboxes75:5432",
> getsockopt() detected error "No route to host"
> Feb 3 20:23:53 osboxes44 pgpool[5288]: [55-1] 2018-02-03 20:23:53: pid
> 5288: LOG: failed to connect to PostgreSQL server on "osboxes75:5432",
> getsockopt() detected error "No route to host"
> Feb 3 20:23:53 osboxes44 pgpool[5288]: [56-1] 2018-02-03 20:23:53: pid
> 5288: LOG: health check failed on node 1 (timeout:0)
> Feb 3 20:23:53 osboxes44 pgpool[5288]: [57-1] 2018-02-03 20:23:53: pid
> 5288: LOG: received degenerate backend request for node_id: 1 from pid
> [5288]
> Feb 3 20:23:53 osboxes44 pgpool[5246]: [40-1] 2018-02-03 20:23:53: pid
> 5246: LOG: new IPC connection received
> Feb 3 20:23:53 osboxes44 pgpool[5246]: [41-1] 2018-02-03 20:23:53: pid
> 5246: LOG: watchdog received the failover command from local pgpool-II on
> IPC interface
> Feb 3 20:23:53 osboxes44 pgpool[5246]: [42-1] 2018-02-03 20:23:53: pid
> 5246: LOG: watchdog is processing the failover command
> [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
> Feb 3 20:23:53 osboxes44 pgpool[5246]: [43-1] 2018-02-03 20:23:53: pid
> 5246: LOG: we have got the consensus to perform the failover
> Feb 3 20:23:53 osboxes44 pgpool[5246]: [43-2] 2018-02-03 20:23:53: pid
> 5246: DETAIL: 1 node(s) voted in the favor
> Feb 3 20:23:53 osboxes44 pgpool[5244]: [12-1] 2018-02-03 20:23:53: pid
> 5244: LOG: Pgpool-II parent process has received failover request
> Feb 3 20:23:53 osboxes44 pgpool[5246]: [44-1] 2018-02-03 20:23:53: pid
> 5246: LOG: new IPC connection received
> Feb 3 20:23:53 osboxes44 pgpool[5246]: [45-1] 2018-02-03 20:23:53: pid
> 5246: LOG: received the failover indication from Pgpool-II on IPC interface
> Feb 3 20:23:53 osboxes44 pgpool[5246]: [46-1] 2018-02-03 20:23:53: pid
> 5246: LOG: watchdog is informed of failover end by the main process
> Feb 3 20:23:53 osboxes44 pgpool[5244]: [13-1] 2018-02-03 20:23:53: pid
> 5244: LOG: starting degeneration. shutdown host osboxes75(5432)
> Feb 3 20:23:53 osboxes44 pgpool[5244]: [14-1] 2018-02-03 20:23:53: pid
> 5244: LOG: failover: no valid backends node found
> Feb 3 20:23:53 osboxes44 pgpool[5244]: [15-1] 2018-02-03 20:23:53: pid
> 5244: LOG: Restart all children
> *Feb 3 20:23:53 osboxes44 pgpool[5244]: [16-1] 2018-02-03 20:23:53: pid
> 5244: LOG: execute command: /etc/pgpool-II-96/failover.sh 1 ""*
> Feb 3 20:24:00 osboxes44 pgpool[5287]: [17-1] 2018-02-03 20:24:00: pid
> 5287: LOG: failed to connect to PostgreSQL server on "osboxes75:5432",
> getsockopt() detected error "No route to host"
> Feb 3 20:24:12 osboxes44 pgpool[5287]: [18-1] 2018-02-03 20:24:12: pid
> 5287: LOG: failed to connect to PostgreSQL server on "osboxes75:5432",
> getsockopt() detected error "No route to host"
> Feb 3 20:24:13 osboxes44 pgpool[5244]: [17-1] 2018-02-03 20:24:13: pid
> 5244: LOG: find_primary_node_repeatedly: waiting for finding a primary node
> Feb 3 20:24:13 osboxes44 pgpool[5244]: [17-1] 2018-02-03 20:24:13: pid
> 5244: LOG: find_primary_node_repeatedly: waiting for finding a primary node
> Feb 3 20:24:13 osboxes44 pgpool[5244]: [18-1] 2018-02-03 20:24:13: pid
> 5244: LOG: find_primary_node: checking backend no 0
> Feb 3 20:24:13 osboxes44 pgpool[5244]: [19-1] 2018-02-03 20:24:13: pid
> 5244: LOG: find_primary_node: checking backend no 1
> Feb 3 20:24:14 osboxes44 pgpool[5244]: [20-1] 2018-02-03 20:24:14: pid
> 5244: LOG: find_primary_node: checking backend no 0
> Feb 3 20:24:14 osboxes44 pgpool[5244]: [21-1] 2018-02-03 20:24:14: pid
> 5244: LOG: find_primary_node: checking backend no 1
> Feb 3 20:24:15 osboxes44 pgpool[5244]: [22-1] 2018-02-03 20:24:15: pid
> 5244: LOG: find_primary_node: checking backend no 0
> Feb 3 20:24:15 osboxes44 pgpool[5244]: [23-1] 2018-02-03 20:24:15: pid
> 5244: LOG: find_primary_node: checking backend no 1
> Feb 3 20:24:16 osboxes44 pgpool[5244]: [24-1] 2018-02-03 20:24:16: pid
> 5244: LOG: find_primary_node: checking backend no 0
> Feb 3 20:24:16 osboxes44 pgpool[5244]: [25-1] 2018-02-03 20:24:16: pid
> 5244: LOG: find_primary_node: checking backend no 1
> Feb 3 20:24:17 osboxes44 pgpool[5244]: [26-1] 2018-02-03 20:24:17: pid
> 5244: LOG: find_primary_node: checking backend no 0
> Feb 3 20:24:17 osboxes44 pgpool[5244]: [27-1] 2018-02-03 20:24:17: pid
> 5244: LOG: find_primary_node: checking backend no 1
> Feb 3 20:24:18 osboxes44 pgpool[5244]: [28-1] 2018-02-03 20:24:18: pid
> 5244: LOG: find_primary_node: checking backend no 0
> Feb 3 20:24:18 osboxes44 pgpool[5244]: [29-1] 2018-02-03 20:24:18: pid
> 5244: LOG: find_primary_node: checking backend no 1
> Feb 3 20:24:19 osboxes44 pgpool[5244]: [30-1] 2018-02-03 20:24:19: pid
> 5244: LOG: find_primary_node: checking backend no 0
> Feb 3 20:24:19 osboxes44 pgpool[5244]: [31-1] 2018-02-03 20:24:19: pid
> 5244: LOG: find_primary_node: checking backend no 1
> Feb 3 20:24:20 osboxes44 pgpool[5244]: [32-1] 2018-02-03 20:24:20: pid
> 5244: LOG: find_primary_node: checking backend no 0
> Feb 3 20:24:20 osboxes44 pgpool[5244]: [33-1] 2018-02-03 20:24:20: pid
> 5244: LOG: find_primary_node: checking backend no 1
> Feb 3 20:24:21 osboxes44 pgpool[5244]: [34-1] 2018-02-03 20:24:21: pid
> 5244: LOG: find_primary_node: checking backend no 0
> Feb 3 20:24:21 osboxes44 pgpool[5244]: [35-1] 2018-02-03 20:24:21: pid
> 5244: LOG: find_primary_node: checking backend no 1
> Feb 3 20:24:22 osboxes44 pgpool[5244]: [36-1] 2018-02-03 20:24:22: pid
> 5244: LOG: find_primary_node: checking backend no 0
> Feb 3 20:24:22 osboxes44 pgpool[5244]: [37-1] 2018-02-03 20:24:22: pid
> 5244: LOG: find_primary_node: checking backend no 1
> Feb 3 20:24:23 osboxes44 pgpool[5244]: [38-1] 2018-02-03 20:24:23: pid
> 5244: LOG: find_primary_node: checking backend no 0
> Feb 3 20:24:23 osboxes44 pgpool[5244]: [39-1] 2018-02-03 20:24:23: pid
> 5244: LOG: find_primary_node: checking backend no 1
> Feb 3 20:24:24 osboxes44 pgpool[5244]: [40-1] 2018-02-03 20:24:24: pid
> 5244: LOG: find_primary_node: checking backend no 0
> Feb 3 20:24:24 osboxes44 pgpool[5244]: [41-1] 2018-02-03 20:24:24: pid
> 5244: LOG: find_primary_node: checking backend no 1
> Feb 3 20:24:25 osboxes44 pgpool[5244]: [42-1] 2018-02-03 20:24:25: pid
> 5244: LOG: find_primary_node: checking backend no 0
> Feb 3 20:24:25 osboxes44 pgpool[5244]: [43-1] 2018-02-03 20:24:25: pid
> 5244: LOG: find_primary_node: checking backend no 1
> Feb 3 20:24:25 osboxes44 pgpool[5287]: [19-1] 2018-02-03 20:24:25: pid
> 5287: LOG: failed to connect to PostgreSQL server on "osboxes75:5432",
> getsockopt() detected error "No route to host"
>
>
>
>
> *Instance 2*
>
>
> Feb 26 13:46:30 osboxes44 pgpool[4398]: [56-1] 2018-02-26 13:46:30: pid
> 4398: LOG: remote node "osboxes75:5431 Linux osboxes75" is shutting down
> Feb 26 13:46:30 osboxes44 pgpool[4398]: [57-1] 2018-02-26 13:46:30: pid
> 4398: LOG: watchdog cluster has lost the coordinator node
> Feb 26 13:46:30 osboxes44 pgpool[4398]: [58-1] 2018-02-26 13:46:30: pid
> 4398: LOG: unassigning the remote node "osboxes75:5431 Linux osboxes75"
> from watchdog cluster master
> Feb 26 13:46:30 osboxes44 pgpool[4398]: [59-1] 2018-02-26 13:46:30: pid
> 4398: LOG: We have lost the cluster master node "osboxes75:5431 Linux
> osboxes75"
> Feb 26 13:46:30 osboxes44 pgpool[4398]: [60-1] 2018-02-26 13:46:30: pid
> 4398: LOG: watchdog node state changed from [STANDBY] to [JOINING]
> Feb 26 13:46:34 osboxes44 pgpool[4398]: [61-1] 2018-02-26 13:46:34: pid
> 4398: LOG: watchdog node state changed from [JOINING] to [INITIALIZING]
> Feb 26 13:46:35 osboxes44 pgpool[4398]: [62-1] 2018-02-26 13:46:35: pid
> 4398: LOG: I am the only alive node in the watchdog cluster
> Feb 26 13:46:35 osboxes44 pgpool[4398]: [62-2] 2018-02-26 13:46:35: pid
> 4398: HINT: skipping stand for coordinator state
> Feb 26 13:46:35 osboxes44 pgpool[4398]: [63-1] 2018-02-26 13:46:35: pid
> 4398: LOG: watchdog node state changed from [INITIALIZING] to [MASTER]
> Feb 26 13:46:35 osboxes44 pgpool[4398]: [64-1] 2018-02-26 13:46:35: pid
> 4398: LOG: I am announcing my self as master/coordinator watchdog node
> Feb 26 13:46:39 osboxes44 pgpool[4398]: [65-1] 2018-02-26 13:46:39: pid
> 4398: LOG: I am the cluster leader node
> Feb 26 13:46:39 osboxes44 pgpool[4398]: [65-2] 2018-02-26 13:46:39: pid
> 4398: DETAIL: our declare coordinator message is accepted by all nodes
> Feb 26 13:46:39 osboxes44 pgpool[4398]: [66-1] 2018-02-26 13:46:39: pid
> 4398: LOG: setting the local node "osboxes44:5431 Linux osboxes44" as
> watchdog cluster master
> Feb 26 13:46:39 osboxes44 pgpool[4398]: [67-1] 2018-02-26 13:46:39: pid
> 4398: LOG: I am the cluster leader node. Starting escalation process
> Feb 26 13:46:39 osboxes44 pgpool[4398]: [68-1] 2018-02-26 13:46:39: pid
> 4398: LOG: escalation process started with PID:5976
> Feb 26 13:46:39 osboxes44 pgpool[4398]: [69-1] 2018-02-26 13:46:39: pid
> 4398: LOG: new IPC connection received
> Feb 26 13:46:39 osboxes44 pgpool[5976]: [68-1] 2018-02-26 13:46:39: pid
> 5976: LOG: watchdog: escalation started
> Feb 26 13:46:39 osboxes44 pgpool[5976]: [69-1] 2018-02-26 13:46:39: pid
> 5976: LOG: failed to acquire the delegate IP address
> Feb 26 13:46:39 osboxes44 pgpool[5976]: [69-2] 2018-02-26 13:46:39: pid
> 5976: DETAIL: 'if_up_cmd' failed
> Feb 26 13:46:39 osboxes44 pgpool[4398]: [70-1] 2018-02-26 13:46:39: pid
> 4398: LOG: watchdog escalation process with pid: 5976 exit with SUCCESS.
> Feb 26 13:47:05 osboxes44 pgpool[4401]: [8-1] 2018-02-26 13:47:05: pid
> 4401: LOG: informing the node status change to watchdog
> Feb 26 13:47:05 osboxes44 pgpool[4401]: [8-2] 2018-02-26 13:47:05: pid
> 4401: DETAIL: node id :1 status = "NODE DEAD" message:"No heartbeat signal
> from node"
> Feb 26 13:47:05 osboxes44 pgpool[4398]: [71-1] 2018-02-26 13:47:05: pid
> 4398: LOG: new IPC connection received
> Feb 26 13:47:05 osboxes44 pgpool[4398]: [72-1] 2018-02-26 13:47:05: pid
> 4398: LOG: received node status change ipc message
> Feb 26 13:47:05 osboxes44 pgpool[4398]: [72-2] 2018-02-26 13:47:05: pid
> 4398: DETAIL: No heartbeat signal from node
> Feb 26 13:47:05 osboxes44 pgpool[4398]: [73-1] 2018-02-26 13:47:05: pid
> 4398: LOG: remote node "osboxes75:5431 Linux osboxes75" is shutting down
> Feb 26 13:48:06 osboxes44 pgpool[4434]: [11-1] 2018-02-26 13:48:06: pid
> 4434: LOG: forked new pcp worker, pid=5996 socket=8
> Feb 26 13:48:06 osboxes44 pgpool[5996]: [11-1] 2018-02-26 13:48:06: pid
> 5996: FATAL: authentication failed for user "pgpool"
> Feb 26 13:48:06 osboxes44 pgpool[5996]: [11-2] 2018-02-26 13:48:06: pid
> 5996: DETAIL: username and/or password does not match
> Feb 26 13:48:06 osboxes44 pgpool[4434]: [12-1] 2018-02-26 13:48:06: pid
> 4434: LOG: PCP process with pid: 5996 exit with SUCCESS.
> Feb 26 13:48:06 osboxes44 pgpool[4434]: [13-1] 2018-02-26 13:48:06: pid
> 4434: LOG: PCP process with pid: 5996 exits with status 256
> Feb 26 13:48:09 osboxes44 pgpool[4434]: [14-1] 2018-02-26 13:48:09: pid
> 4434: LOG: forked new pcp worker, pid=5999 socket=8
> Feb 26 13:48:09 osboxes44 pgpool[4398]: [74-1] 2018-02-26 13:48:09: pid
> 4398: LOG: new IPC connection received
> Feb 26 13:48:09 osboxes44 pgpool[4434]: [15-1] 2018-02-26 13:48:09: pid
> 4434: LOG: PCP process with pid: 5999 exit with SUCCESS.
> Feb 26 13:48:09 osboxes44 pgpool[4434]: [16-1] 2018-02-26 13:48:09: pid
> 4434: LOG: PCP process with pid: 5999 exits with status 0
> Feb 26 13:48:20 osboxes44 pgpool[5951]: [201-1] 2018-02-26 13:48:20: pid
> 5951: LOG: failed to connect to PostgreSQL server on "osboxes75:5432",
> getsockopt() detected error "No route to host"
> Feb 26 13:48:20 osboxes44 pgpool[5951]: [202-1] 2018-02-26 13:48:20: pid
> 5951: LOG: received degenerate backend request for node_id: 1 from pid
> [5951]
> Feb 26 13:48:20 osboxes44 pgpool[4398]: [75-1] 2018-02-26 13:48:20: pid
> 4398: LOG: new IPC connection received
> Feb 26 13:48:20 osboxes44 pgpool[4398]: [76-1] 2018-02-26 13:48:20: pid
> 4398: LOG: watchdog received the failover command from local pgpool-II on
> IPC interface
> Feb 26 13:48:20 osboxes44 pgpool[4398]: [77-1] 2018-02-26 13:48:20: pid
> 4398: LOG: watchdog is processing the failover command
> [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
> Feb 26 13:48:20 osboxes44 pgpool[4398]: [78-1] 2018-02-26 13:48:20: pid
> 4398: LOG: we have got the consensus to perform the failover
> Feb 26 13:48:20 osboxes44 pgpool[4398]: [78-2] 2018-02-26 13:48:20: pid
> 4398: DETAIL: 1 node(s) voted in the favor
> Feb 26 13:48:20 osboxes44 pgpool[5951]: [203-1] 2018-02-26 13:48:20: pid
> 5951: FATAL: failed to create a backend connection
> Feb 26 13:48:20 osboxes44 pgpool[5951]: [203-2] 2018-02-26 13:48:20: pid
> 5951: DETAIL: executing failover on backend
> Feb 26 13:48:20 osboxes44 pgpool[4395]: [230-1] 2018-02-26 13:48:20: pid
> 4395: LOG: Pgpool-II parent process has received failover request
> Feb 26 13:48:20 osboxes44 pgpool[4398]: [79-1] 2018-02-26 13:48:20: pid
> 4398: LOG: new IPC connection received
> Feb 26 13:48:20 osboxes44 pgpool[4398]: [80-1] 2018-02-26 13:48:20: pid
> 4398: LOG: received the failover indication from Pgpool-II on IPC interface
> Feb 26 13:48:20 osboxes44 pgpool[4398]: [81-1] 2018-02-26 13:48:20: pid
> 4398: LOG: watchdog is informed of failover end by the main process
> Feb 26 13:48:20 osboxes44 pgpool[4395]: [231-1] 2018-02-26 13:48:20: pid
> 4395: LOG: starting degeneration. shutdown host osboxes75(5432)
> Feb 26 13:48:20 osboxes44 pgpool[4395]: [232-1] 2018-02-26 13:48:20: pid
> 4395: LOG: failover: no valid backends node found
> Feb 26 13:48:20 osboxes44 pgpool[4395]: [233-1] 2018-02-26 13:48:20: pid
> 4395: LOG: Restart all children
> *Feb 26 13:48:20 osboxes44 pgpool[4395]: [234-1] 2018-02-26 13:48:20: pid
> 4395: LOG: execute command: /etc/pgpool-II-96/failover.sh 1 ""*
> Feb 26 13:48:21 osboxes44 pgpool[4395]: [235-1] 2018-02-26 13:48:21: pid
> 4395: LOG: find_primary_node_repeatedly: waiting for finding a primary node
> Feb 26 13:48:21 osboxes44 pgpool[4395]: [236-1] 2018-02-26 13:48:21: pid
> 4395: LOG: find_primary_node: checking backend no 0
> Feb 26 13:48:21 osboxes44 pgpool[4395]: [237-1] 2018-02-26 13:48:21: pid
> 4395: LOG: find_primary_node: checking backend no 1
> Feb 26 13:48:22 osboxes44 pgpool[4395]: [238-1] 2018-02-26 13:48:22: pid
> 4395: LOG: find_primary_node: checking backend no 0
> Feb 26 13:48:22 osboxes44 pgpool[4395]: [239-1] 2018-02-26 13:48:22: pid
> 4395: LOG: find_primary_node: checking backend no 1
> Feb 26 13:48:23 osboxes44 pgpool[4395]: [240-1] 2018-02-26 13:48:23: pid
> 4395: LOG: find_primary_node: checking backend no 0
> Feb 26 13:48:23 osboxes44 pgpool[4395]: [241-1] 2018-02-26 13:48:23: pid
> 4395: LOG: find_primary_node: checking backend no 1
> Feb 26 13:48:24 osboxes44 pgpool[4395]: [242-1] 2018-02-26 13:48:24: pid
> 4395: LOG: find_primary_node: checking backend no 0
> Feb 26 13:48:24 osboxes44 pgpool[4395]: [243-1] 2018-02-26 13:48:24: pid
> 4395: LOG: find_primary_node: checking backend no 1
> Feb 26 13:48:25 osboxes44 pgpool[4395]: [244-1] 2018-02-26 13:48:25: pid
> 4395: LOG: find_primary_node: checking backend no 0
> Feb 26 13:48:25 osboxes44 pgpool[4395]: [245-1] 2018-02-26 13:48:25: pid
> 4395: LOG: find_primary_node: checking backend no 1
> Feb 26 13:48:26 osboxes44 pgpool[4395]: [246-1] 2018-02-26 13:48:26: pid
> 4395: LOG: find_primary_node: checking backend no 0
> Feb 26 13:48:26 osboxes44 pgpool[4395]: [247-1] 2018-02-26 13:48:26: pid
> 4395: LOG: find_primary_node: checking backend no 1
> Feb 26 13:48:27 osboxes44 pgpool[4395]: [248-1] 2018-02-26 13:48:27: pid
> 4395: LOG: find_primary_node: checking backend no 0
> Feb 26 13:48:27 osboxes44 pgpool[4395]: [249-1] 2018-02-26 13:48:27: pid
> 4395: LOG: find_primary_node: checking backend no 1
> Feb 26 13:48:28 osboxes44 pgpool[4395]: [250-1] 2018-02-26 13:48:28: pid
> 4395: LOG: find_primary_node: checking backend no 0
> Feb 26 13:48:28 osboxes44 pgpool[4395]: [251-1] 2018-02-26 13:48:28: pid
> 4395: LOG: find_primary_node: checking backend no 1
>
>
> *Instance 3*
>
>
> Mar 1 00:43:22 osboxes44 pgpool[948]: [30-1] 2018-03-01 00:43:22: pid
> 948: LOG: remote node "osboxes75:5431 Linux osboxes75" is shutting down
> Mar 1 00:43:22 osboxes44 pgpool[948]: [31-1] 2018-03-01 00:43:22: pid
> 948: LOG: watchdog cluster has lost the coordinator node
> Mar 1 00:43:22 osboxes44 pgpool[948]: [32-1] 2018-03-01 00:43:22: pid
> 948: LOG: unassigning the remote node "osboxes75:5431 Linux osboxes75"
> from watchdog cluster master
> Mar 1 00:43:22 osboxes44 pgpool[948]: [33-1] 2018-03-01 00:43:22: pid
> 948: LOG: We have lost the cluster master node "osboxes75:5431 Linux
> osboxes75"
> Mar 1 00:43:22 osboxes44 pgpool[948]: [34-1] 2018-03-01 00:43:22: pid
> 948: LOG: watchdog node state changed from [STANDBY] to [JOINING]
> Mar 1 00:43:23 osboxes44 pgpool[1089]: [13-1] 2018-03-01 00:43:23: pid
> 1089: ERROR: failed to authenticate
> Mar 1 00:43:23 osboxes44 pgpool[1089]: [13-2] 2018-03-01 00:43:23: pid
> 1089: DETAIL: the database system is shutting down
> Mar 1 00:43:23 osboxes44 pgpool[1090]: [13-1] 2018-03-01 00:43:23: pid
> 1090: ERROR: failed to authenticate
> Mar 1 00:43:23 osboxes44 pgpool[1090]: [13-2] 2018-03-01 00:43:23: pid
> 1090: DETAIL: the database system is shutting down
> Mar 1 00:43:23 osboxes44 pgpool[1090]: [14-1] 2018-03-01 00:43:23: pid
> 1090: LOG: health check retrying on DB node: 1 (round:1)
> Mar 1 00:43:24 osboxes44 pgpool[1090]: [15-1] 2018-03-01 00:43:24: pid
> 1090: ERROR: failed to authenticate
> Mar 1 00:43:24 osboxes44 pgpool[1090]: [15-2] 2018-03-01 00:43:24: pid
> 1090: DETAIL: the database system is shutting down
> Mar 1 00:43:24 osboxes44 pgpool[1090]: [16-1] 2018-03-01 00:43:24: pid
> 1090: LOG: health check retrying on DB node: 1 (round:2)
> Mar 1 00:43:25 osboxes44 pgpool[1090]: [17-1] 2018-03-01 00:43:25: pid
> 1090: ERROR: failed to authenticate
> Mar 1 00:43:25 osboxes44 pgpool[1090]: [17-2] 2018-03-01 00:43:25: pid
> 1090: DETAIL: the database system is shutting down
> Mar 1 00:43:25 osboxes44 pgpool[1090]: [18-1] 2018-03-01 00:43:25: pid
> 1090: LOG: health check retrying on DB node: 1 (round:3)
> Mar 1 00:43:26 osboxes44 pgpool[948]: [35-1] 2018-03-01 00:43:26: pid
> 948: LOG: watchdog node state changed from [JOINING] to [INITIALIZING]
> Mar 1 00:43:26 osboxes44 pgpool[1090]: [19-1] 2018-03-01 00:43:26: pid
> 1090: LOG: failed to connect to PostgreSQL server on "osboxes75:5432",
> getsockopt() detected error "Connection refused"
> Mar 1 00:43:26 osboxes44 pgpool[1090]: [20-1] 2018-03-01 00:43:26: pid
> 1090: ERROR: failed to make persistent db connection
> Mar 1 00:43:26 osboxes44 pgpool[1090]: [20-2] 2018-03-01 00:43:26: pid
> 1090: DETAIL: connection to host:"osboxes75:5432" failed
> Mar 1 00:43:26 osboxes44 pgpool[1090]: [21-1] 2018-03-01 00:43:26: pid
> 1090: LOG: health check retrying on DB node: 1 (round:4)
> Mar 1 00:43:27 osboxes44 pgpool[948]: [36-1] 2018-03-01 00:43:27: pid
> 948: LOG: I am the only alive node in the watchdog cluster
> Mar 1 00:43:27 osboxes44 pgpool[948]: [36-2] 2018-03-01 00:43:27: pid
> 948: HINT: skipping stand for coordinator state
> Mar 1 00:43:27 osboxes44 pgpool[948]: [37-1] 2018-03-01 00:43:27: pid
> 948: LOG: watchdog node state changed from [INITIALIZING] to [MASTER]
> Mar 1 00:43:27 osboxes44 pgpool[948]: [38-1] 2018-03-01 00:43:27: pid
> 948: LOG: I am announcing my self as master/coordinator watchdog node
> Mar 1 00:43:27 osboxes44 pgpool[1090]: [22-1] 2018-03-01 00:43:27: pid
> 1090: LOG: failed to connect to PostgreSQL server on "osboxes75:5432",
> getsockopt() detected error "Connection refused"
> Mar 1 00:43:27 osboxes44 pgpool[1090]: [23-1] 2018-03-01 00:43:27: pid
> 1090: ERROR: failed to make persistent db connection
> Mar 1 00:43:27 osboxes44 pgpool[1090]: [23-2] 2018-03-01 00:43:27: pid
> 1090: DETAIL: connection to host:"osboxes75:5432" failed
> Mar 1 00:43:27 osboxes44 pgpool[1090]: [24-1] 2018-03-01 00:43:27: pid
> 1090: LOG: health check retrying on DB node: 1 (round:5)
> Mar 1 00:43:31 osboxes44 pgpool[948]: [39-1] 2018-03-01 00:43:31: pid
> 948: LOG: I am the cluster leader node
> Mar 1 00:43:31 osboxes44 pgpool[948]: [39-2] 2018-03-01 00:43:31: pid
> 948: DETAIL: our declare coordinator message is accepted by all nodes
> Mar 1 00:43:31 osboxes44 pgpool[948]: [40-1] 2018-03-01 00:43:31: pid
> 948: LOG: setting the local node "osboxes44:5431 Linux osboxes44" as
> watchdog cluster master
> Mar 1 00:43:31 osboxes44 pgpool[948]: [41-1] 2018-03-01 00:43:31: pid
> 948: LOG: I am the cluster leader node. Starting escalation process
> Mar 1 00:43:31 osboxes44 pgpool[948]: [42-1] 2018-03-01 00:43:31: pid
> 948: LOG: escalation process started with PID:5329
> Mar 1 00:43:31 osboxes44 pgpool[948]: [43-1] 2018-03-01 00:43:31: pid
> 948: LOG: new IPC connection received
> Mar 1 00:43:31 osboxes44 pgpool[5329]: [42-1] 2018-03-01 00:43:31: pid
> 5329: LOG: watchdog: escalation started
> Mar 1 00:43:31 osboxes44 pgpool[5329]: [43-1] 2018-03-01 00:43:31: pid
> 5329: LOG: failed to acquire the delegate IP address
> Mar 1 00:43:31 osboxes44 pgpool[5329]: [43-2] 2018-03-01 00:43:31: pid
> 5329: DETAIL: 'if_up_cmd' failed
> Mar 1 00:43:31 osboxes44 pgpool[5329]: [44-1] 2018-03-01 00:43:31: pid
> 5329: WARNING: watchdog escalation failed to acquire delegate IP
> Mar 1 00:43:31 osboxes44 pgpool[948]: [44-1] 2018-03-01 00:43:31: pid
> 948: LOG: watchdog escalation process with pid: 5329 exit with SUCCESS.
> Mar 1 00:43:38 osboxes44 pgpool[1090]: [25-1] 2018-03-01 00:43:38: pid
> 1090: LOG: failed to connect to PostgreSQL server on "osboxes75:5432",
> timed out
> Mar 1 00:43:38 osboxes44 pgpool[1090]: [26-1] 2018-03-01 00:43:38: pid
> 1090: ERROR: failed to make persistent db connection
> Mar 1 00:43:38 osboxes44 pgpool[1090]: [26-2] 2018-03-01 00:43:38: pid
> 1090: DETAIL: connection to host:"osboxes75:5432" failed
> Mar 1 00:43:38 osboxes44 pgpool[1090]: [27-1] 2018-03-01 00:43:38: pid
> 1090: LOG: health check retrying on DB node: 1 (round:6)
> Mar 1 00:43:41 osboxes44 pgpool[1084]: [13-1] 2018-03-01 00:43:41: pid
> 1084: LOG: trying connecting to PostgreSQL server on "osboxes75:5432" by
> INET socket
> Mar 1 00:43:41 osboxes44 pgpool[1084]: [13-2] 2018-03-01 00:43:41: pid
> 1084: DETAIL: timed out. retrying...
> Mar 1 00:43:43 osboxes44 pgpool[1089]: [14-1] 2018-03-01 00:43:43: pid
> 1089: LOG: trying connecting to PostgreSQL server on "osboxes75:5432" by
> INET socket
> Mar 1 00:43:43 osboxes44 pgpool[1089]: [14-2] 2018-03-01 00:43:43: pid
> 1089: DETAIL: timed out. retrying...
> Mar 1 00:43:49 osboxes44 pgpool[1090]: [28-1] 2018-03-01 00:43:49: pid
> 1090: LOG: failed to connect to PostgreSQL server on "osboxes75:5432",
> timed out
> Mar 1 00:43:49 osboxes44 pgpool[1090]: [29-1] 2018-03-01 00:43:49: pid
> 1090: ERROR: failed to make persistent db connection
> Mar 1 00:43:49 osboxes44 pgpool[1090]: [29-2] 2018-03-01 00:43:49: pid
> 1090: DETAIL: connection to host:"osboxes75:5432" failed
> Mar 1 00:43:49 osboxes44 pgpool[1090]: [30-1] 2018-03-01 00:43:49: pid
> 1090: LOG: health check retrying on DB node: 1 (round:7)
> Mar 1 00:43:51 osboxes44 pgpool[1084]: [14-1] 2018-03-01 00:43:51: pid
> 1084: LOG: trying connecting to PostgreSQL server on "osboxes75:5432" by
> INET socket
> Mar 1 00:43:51 osboxes44 pgpool[1084]: [14-2] 2018-03-01 00:43:51: pid
> 1084: DETAIL: timed out. retrying...
> Mar 1 00:43:53 osboxes44 pgpool[1046]: [12-1] 2018-03-01 00:43:53: pid
> 1046: LOG: informing the node status change to watchdog
> Mar 1 00:43:53 osboxes44 pgpool[1046]: [12-2] 2018-03-01 00:43:53: pid
> 1046: DETAIL: node id :1 status = "NODE DEAD" message:"No heartbeat signal
> from node"
> Mar 1 00:43:53 osboxes44 pgpool[948]: [45-1] 2018-03-01 00:43:53: pid
> 948: LOG: new IPC connection received
> Mar 1 00:43:53 osboxes44 pgpool[948]: [46-1] 2018-03-01 00:43:53: pid
> 948: LOG: received node status change ipc message
> Mar 1 00:43:53 osboxes44 pgpool[948]: [46-2] 2018-03-01 00:43:53: pid
> 948: DETAIL: No heartbeat signal from node
> Mar 1 00:43:53 osboxes44 pgpool[948]: [47-1] 2018-03-01 00:43:53: pid
> 948: LOG: remote node "osboxes75:5431 Linux osboxes75" is shutting down
> Mar 1 00:43:53 osboxes44 pgpool[1089]: [15-1] 2018-03-01 00:43:53: pid
> 1089: LOG: trying connecting to PostgreSQL server on "osboxes75:5432" by
> INET socket
> Mar 1 00:43:53 osboxes44 pgpool[1089]: [15-2] 2018-03-01 00:43:53: pid
> 1089: DETAIL: timed out. retrying...
> Mar 1 00:43:57 osboxes44 pgpool[1090]: [31-1] 2018-03-01 00:43:57: pid
> 1090: LOG: failed to connect to PostgreSQL server on "osboxes75:5432",
> getsockopt() detected error "Connection refused"
> Mar 1 00:43:57 osboxes44 pgpool[1090]: [32-1] 2018-03-01 00:43:57: pid
> 1090: ERROR: failed to make persistent db connection
> Mar 1 00:43:57 osboxes44 pgpool[1090]: [32-2] 2018-03-01 00:43:57: pid
> 1090: DETAIL: connection to host:"osboxes75:5432" failed
> Mar 1 00:43:57 osboxes44 pgpool[1090]: [33-1] 2018-03-01 00:43:57: pid
> 1090: LOG: health check retrying on DB node: 1 (round:8)
> Mar 1 00:43:58 osboxes44 pgpool[1090]: [34-1] 2018-03-01 00:43:58: pid
> 1090: LOG: failed to connect to PostgreSQL server on "osboxes75:5432",
> getsockopt() detected error "Connection refused"
> Mar 1 00:43:58 osboxes44 pgpool[1090]: [35-1] 2018-03-01 00:43:58: pid
> 1090: ERROR: failed to make persistent db connection
> Mar 1 00:43:58 osboxes44 pgpool[1090]: [35-2] 2018-03-01 00:43:58: pid
> 1090: DETAIL: connection to host:"osboxes75:5432" failed
> Mar 1 00:43:58 osboxes44 pgpool[1090]: [36-1] 2018-03-01 00:43:58: pid
> 1090: LOG: health check retrying on DB node: 1 (round:9)
> Mar 1 00:43:59 osboxes44 pgpool[1090]: [37-1] 2018-03-01 00:43:59: pid
> 1090: LOG: failed to connect to PostgreSQL server on "osboxes75:5432",
> getsockopt() detected error "Connection refused"
> Mar 1 00:43:59 osboxes44 pgpool[1090]: [38-1] 2018-03-01 00:43:59: pid
> 1090: ERROR: failed to make persistent db connection
> Mar 1 00:43:59 osboxes44 pgpool[1090]: [38-2] 2018-03-01 00:43:59: pid
> 1090: DETAIL: connection to host:"osboxes75:5432" failed
> Mar 1 00:43:59 osboxes44 pgpool[1090]: [39-1] 2018-03-01 00:43:59: pid
> 1090: LOG: health check retrying on DB node: 1 (round:10)
> Mar 1 00:44:00 osboxes44 pgpool[1090]: [40-1] 2018-03-01 00:44:00: pid
> 1090: LOG: failed to connect to PostgreSQL server on "osboxes75:5432",
> getsockopt() detected error "Connection refused"
> Mar 1 00:44:00 osboxes44 pgpool[1090]: [41-1] 2018-03-01 00:44:00: pid
> 1090: ERROR: failed to make persistent db connection
> Mar 1 00:44:00 osboxes44 pgpool[1090]: [41-2] 2018-03-01 00:44:00: pid
> 1090: DETAIL: connection to host:"osboxes75:5432" failed
> Mar 1 00:44:00 osboxes44 pgpool[1090]: [42-1] 2018-03-01 00:44:00: pid
> 1090: LOG: health check failed on node 1 (timeout:0)
> Mar 1 00:44:00 osboxes44 pgpool[1090]: [43-1] 2018-03-01 00:44:00: pid
> 1090: LOG: received degenerate backend request for node_id: 1 from pid
> [1090]
> Mar 1 00:44:00 osboxes44 pgpool[948]: [48-1] 2018-03-01 00:44:00: pid
> 948: LOG: new IPC connection received
> Mar 1 00:44:00 osboxes44 pgpool[948]: [49-1] 2018-03-01 00:44:00: pid
> 948: LOG: watchdog received the failover command from local pgpool-II on
> IPC interface
> Mar 1 00:44:00 osboxes44 pgpool[948]: [50-1] 2018-03-01 00:44:00: pid
> 948: LOG: watchdog is processing the failover command
> [DEGENERATE_BACKEND_REQUEST] received from local pgpool-II on IPC interface
> Mar 1 00:44:00 osboxes44 pgpool[948]: [51-1] 2018-03-01 00:44:00: pid
> 948: LOG: we have got the consensus to perform the failover
> Mar 1 00:44:00 osboxes44 pgpool[948]: [51-2] 2018-03-01 00:44:00: pid
> 948: DETAIL: 1 node(s) voted in the favor
> Mar 1 00:44:00 osboxes44 pgpool[905]: [20-1] 2018-03-01 00:44:00: pid
> 905: LOG: Pgpool-II parent process has received failover request
> Mar 1 00:44:00 osboxes44 pgpool[948]: [52-1] 2018-03-01 00:44:00: pid
> 948: LOG: new IPC connection received
> Mar 1 00:44:00 osboxes44 pgpool[948]: [53-1] 2018-03-01 00:44:00: pid
> 948: LOG: received the failover indication from Pgpool-II on IPC interface
> Mar 1 00:44:00 osboxes44 pgpool[948]: [54-1] 2018-03-01 00:44:00: pid
> 948: LOG: watchdog is informed of failover end by the main process
> Mar 1 00:44:00 osboxes44 pgpool[905]: [21-1] 2018-03-01 00:44:00: pid
> 905: LOG: starting degeneration. shutdown host osboxes75(5432)
> Mar 1 00:44:00 osboxes44 pgpool[905]: [22-1] 2018-03-01 00:44:00: pid
> 905: WARNING: All the DB nodes are in down status and skip writing status
> file.
> Mar 1 00:44:00 osboxes44 pgpool[905]: [23-1] 2018-03-01 00:44:00: pid
> 905: LOG: failover: no valid backends node found
> Mar 1 00:44:00 osboxes44 pgpool[905]: [24-1] 2018-03-01 00:44:00: pid
> 905: LOG: Restart all children
> *Mar 1 00:44:00 osboxes44 pgpool[905]: [25-1] 2018-03-01 00:44:00: pid
> 905: LOG: execute command: /etc/pgpool-II-96/failover.sh 1 ""*
> Mar 1 00:44:01 osboxes44 pgpool[905]: [26-1] 2018-03-01 00:44:01: pid
> 905: LOG: find_primary_node_repeatedly: waiting for finding a primary node
> Mar 1 00:44:01 osboxes44 pgpool[905]: [27-1] 2018-03-01 00:44:01: pid
> 905: LOG: find_primary_node: checking backend no 0
> Mar 1 00:44:01 osboxes44 pgpool[905]: [28-1] 2018-03-01 00:44:01: pid
> 905: LOG: find_primary_node: checking backend no 1
> Mar 1 00:44:02 osboxes44 pgpool[905]: [29-1] 2018-03-01 00:44:02: pid
> 905: LOG: find_primary_node: checking backend no 0
> Mar 1 00:44:02 osboxes44 pgpool[905]: [30-1] 2018-03-01 00:44:02: pid
> 905: LOG: find_primary_node: checking backend no 1
> Mar 1 00:44:03 osboxes44 pgpool[905]: [31-1] 2018-03-01 00:44:03: pid
> 905: LOG: find_primary_node: checking backend no 0
> Mar 1 00:44:03 osboxes44 pgpool[905]: [32-1] 2018-03-01 00:44:03: pid
> 905: LOG: find_primary_node: checking backend no 1
> Mar 1 00:44:03 osboxes44 pgpool[1089]: [16-1] 2018-03-01 00:44:03: pid
> 1089: LOG: trying connecting to PostgreSQL server on "osboxes75:5432" by
> INET socket
> Mar 1 00:44:03 osboxes44 pgpool[1089]: [16-2] 2018-03-01 00:44:03: pid
> 1089: DETAIL: timed out. retrying...
> Mar 1 00:44:04 osboxes44 pgpool[905]: [33-1] 2018-03-01 00:44:04: pid
> 905: LOG: find_primary_node: checking backend no 0
> Mar 1 00:44:04 osboxes44 pgpool[905]: [34-1] 2018-03-01 00:44:04: pid
> 905: LOG: find_primary_node: checking backend no 1
> Mar 1 00:44:04 osboxes44 pgpool[1089]: [17-1] 2018-03-01 00:44:04: pid
> 1089: LOG: failed to connect to PostgreSQL server on "osboxes75:5432",
> getsockopt() detected error "Connection refused"
> Mar 1 00:44:04 osboxes44 pgpool[1089]: [18-1] 2018-03-01 00:44:04: pid
> 1089: ERROR: failed to make persistent db connection
> Mar 1 00:44:04 osboxes44 pgpool[1089]: [18-2] 2018-03-01 00:44:04: pid
> 1089: DETAIL: connection to host:"osboxes75:5432" failed
> Mar 1 00:44:05 osboxes44 pgpool[948]: [55-1] 2018-03-01 00:44:05: pid
> 948: LOG: new watchdog node connection is received from "
> 192.168.0.6:52436"
> Mar 1 00:44:05 osboxes44 pgpool[948]: [56-1] 2018-03-01 00:44:05: pid
> 948: LOG: new node joined the cluster hostname:"osboxes75" port:9000
> pgpool_port:5431
> Mar 1 00:44:05 osboxes44 pgpool[948]: [57-1] 2018-03-01 00:44:05: pid
> 948: LOG: new outbound connection to osboxes75:9000
> Mar 1 00:44:05 osboxes44 pgpool[905]: [35-1] 2018-03-01 00:44:05: pid
> 905: LOG: find_primary_node: checking backend no 0
> Mar 1 00:44:05 osboxes44 pgpool[905]: [36-1] 2018-03-01 00:44:05: pid
> 905: LOG: find_primary_node: checking backend no 1
> Mar 1 00:44:06 osboxes44 pgpool[948]: [58-1] 2018-03-01 00:44:06: pid
> 948: LOG: adding watchdog node "osboxes75:5431 Linux osboxes75" to the
> standby list
> Mar 1 00:44:06 osboxes44 pgpool[905]: [37-1] 2018-03-01 00:44:06: pid
> 905: LOG: find_primary_node: checking backend no 0
> Mar 1 00:44:06 osboxes44 pgpool[905]: [38-1] 2018-03-01 00:44:06: pid
> 905: LOG: find_primary_node: checking backend no 1
> Mar 1 00:44:06 osboxes44 pgpool[905]: [39-1] 2018-03-01 00:44:06: pid
> 905: LOG: Pgpool-II parent process received watchdog quorum change signal
> from watchdog
> Mar 1 00:44:06 osboxes44 pgpool[948]: [59-1] 2018-03-01 00:44:06: pid
> 948: LOG: new IPC connection received
> Mar 1 00:44:06 osboxes44 pgpool[905]: [40-1] 2018-03-01 00:44:06: pid
> 905: LOG: watchdog cluster now holds the quorum
> Mar 1 00:44:06 osboxes44 pgpool[905]: [40-2] 2018-03-01 00:44:06: pid
> 905: DETAIL: updating the state of quarantine backend nodes
> Mar 1 00:44:06 osboxes44 pgpool[948]: [60-1] 2018-03-01 00:44:06: pid
> 948: LOG: new IPC connection received
>
>
>
>
> The pgpool.conf is given below:
>
>
> # ----------------------------
> # pgPool-II configuration file
> # ----------------------------
> #
> # This file consists of lines of the form:
> #
> # name = value
> #
> # Whitespace may be used. Comments are introduced with "#" anywhere on a
> line.
> # The complete list of parameter names and allowed values can be found in
> the
> # pgPool-II documentation.
> #
> # This file is read on server startup and when the server receives a SIGHUP
> # signal. If you edit the file on a running system, you have to SIGHUP the
> # server for the changes to take effect, or use "pgpool reload". Some
> # parameters, which are marked below, require a server shutdown and
> restart to
> # take effect.
> #
>
>
> #-----------------------------------------------------------
> -------------------
> # CONNECTIONS
> #-----------------------------------------------------------
> -------------------
>
> # - pgpool Connection Settings -
>
> listen_addresses = 'localhost'
> # Host name or IP address to listen on:
> # '*' for all, '' for no TCP/IP
> connections
> # (change requires restart)
> port = 9999
> # Port number
> # (change requires restart)
> socket_dir = '/tmp'
> # Unix domain socket path
> # The Debian package defaults to
> # /var/run/postgresql
> # (change requires restart)
>
>
> # - pgpool Communication Manager Connection Settings -
>
> pcp_listen_addresses = '*'
> # Host name or IP address for pcp
> process to listen on:
> # '*' for all, '' for no TCP/IP
> connections
> # (change requires restart)
> pcp_port = 9898
> # Port number for pcp
> # (change requires restart)
> pcp_socket_dir = '/tmp'
> # Unix domain socket path for pcp
> # The Debian package defaults to
> # /var/run/postgresql
> # (change requires restart)
> listen_backlog_multiplier = 2
> # Set the backlog parameter of
> listen(2) to
> #
> num_init_children * listen_backlog_multiplier.
> # (change requires restart)
> serialize_accept = off
> # whether to serialize accept() call
> to avoid thundering herd problem
> # (change requires restart)
>
> # - Backend Connection Settings -
>
> backend_hostname0 = 'host1'
> # Host name or IP address to connect
> to for backend 0
> backend_port0 = 5432
> # Port number for backend 0
> backend_weight0 = 1
> # Weight for backend 0 (only in load
> balancing mode)
> backend_data_directory0 = '/data'
> # Data directory for backend 0
> backend_flag0 = 'ALLOW_TO_FAILOVER'
> # Controls various backend behavior
> # ALLOW_TO_FAILOVER,
> DISALLOW_TO_FAILOVER
> # or ALWAYS_MASTER
> #backend_hostname1 = 'host2'
> #backend_port1 = 5433
> #backend_weight1 = 1
> #backend_data_directory1 = '/data1'
> #backend_flag1 = 'ALLOW_TO_FAILOVER'
>
> # - Authentication -
>
> enable_pool_hba = off
> # Use pool_hba.conf for client
> authentication
> pool_passwd = 'pool_passwd'
> # File name of pool_passwd for md5
> authentication.
> # "" disables pool_passwd.
> # (change requires restart)
> authentication_timeout = 60
> # Delay in seconds to complete client
> authentication
> # 0 means no timeout.
>
> # - SSL Connections -
>
> ssl = off
> # Enable SSL support
> # (change requires restart)
> #ssl_key = './server.key'
> # Path to the SSL private key file
> # (change requires restart)
> #ssl_cert = './server.cert'
> # Path to the SSL public certificate
> file
> # (change requires restart)
> #ssl_ca_cert = ''
> # Path to a single PEM format file
> # containing CA root certificate(s)
> # (change requires restart)
> #ssl_ca_cert_dir = ''
> # Directory containing CA root
> certificate(s)
> # (change requires restart)
>
>
> #-----------------------------------------------------------
> -------------------
> # POOLS
> #-----------------------------------------------------------
> -------------------
>
> # - Concurrent session and pool size -
>
> num_init_children = 32
> # Number of concurrent sessions allowed
> # (change requires restart)
> max_pool = 4
> # Number of connection pool caches per
> connection
> # (change requires restart)
>
> # - Life time -
>
> child_life_time = 300
> # Pool exits after being idle for this
> many seconds
> child_max_connections = 0
> # Pool exits after receiving that many
> connections
> # 0 means no exit
> connection_life_time = 0
> # Connection to backend closes after
> being idle for this many seconds
> # 0 means no close
> client_idle_limit = 0
> # Client is disconnected after being
> idle for that many seconds
> # (even inside an explicit
> transactions!)
> # 0 means no disconnection
>
>
> #-----------------------------------------------------------
> -------------------
> # LOGS
> #-----------------------------------------------------------
> -------------------
>
> # - Where to log -
>
> log_destination = 'stderr'
> # Where to log
> # Valid values are combinations of
> stderr,
> # and syslog. Default to stderr.
>
> # - What to log -
>
> log_line_prefix = '%t: pid %p: ' # printf-style string to output at
> beginning of each log line.
>
> log_connections = off
> # Log connections
> log_hostname = off
> # Hostname will be shown in ps status
> # and in logs if connections are logged
> log_statement = off
> # Log all statements
> log_per_node_statement = off
> # Log all statements
> # with node and backend informations
> log_standby_delay = 'if_over_threshold'
> # Log standby delay
> # Valid values are combinations of
> always,
> # if_over_threshold, none
>
> # - Syslog specific -
>
> syslog_facility = 'LOCAL0'
> # Syslog local facility. Default to
> LOCAL0
> syslog_ident = 'pgpool'
> # Syslog program identification string
> # Default to 'pgpool'
>
> # - Debug -
>
> #log_error_verbosity = default # terse, default, or verbose
> messages
>
> #client_min_messages = notice # values in order of decreasing
> detail:
> # debug5
> # debug4
> # debug3
> # debug2
> # debug1
> # log
> # notice
> # warning
> # error
>
> #log_min_messages = warning # values in order of decreasing
> detail:
> # debug5
> # debug4
> # debug3
> # debug2
> # debug1
> # info
> # notice
> # warning
> # error
> # log
> # fatal
> # panic
>
> #-----------------------------------------------------------
> -------------------
> # FILE LOCATIONS
> #-----------------------------------------------------------
> -------------------
>
> pid_file_name = '/var/run/pgpool/pgpool.pid'
> # PID file name
> # Can be specified as relative to the"
> # location of pgpool.conf file or
> # as an absolute path
> # (change requires restart)
> logdir = '/tmp'
> # Directory of pgPool status file
> # (change requires restart)
>
>
> #-----------------------------------------------------------
> -------------------
> # CONNECTION POOLING
> #-----------------------------------------------------------
> -------------------
>
> connection_cache = on
> # Activate connection pools
> # (change requires restart)
>
> # Semicolon separated list of queries
> # to be issued at the end of a session
> # The default is for 8.3 and later
> reset_query_list = 'ABORT; DISCARD ALL'
> # The following one is for 8.2 and
> before
> #reset_query_list = 'ABORT; RESET ALL; SET SESSION AUTHORIZATION DEFAULT'
>
>
> #-----------------------------------------------------------
> -------------------
> # REPLICATION MODE
> #-----------------------------------------------------------
> -------------------
>
> replication_mode = off
> # Activate replication mode
> # (change requires restart)
> replicate_select = off
> # Replicate SELECT statements
> # when in replication mode
> # replicate_select is higher priority
> than
> # load_balance_mode.
>
> insert_lock = off
> # Automatically locks a dummy row or a
> table
> # with INSERT statements to keep
> SERIAL data
> # consistency
> # Without SERIAL, no lock will be
> issued
> lobj_lock_table = ''
> # When rewriting lo_creat command in
> # replication mode, specify table name
> to
> # lock
>
> # - Degenerate handling -
>
> replication_stop_on_mismatch = off
> # On disagreement with the packet kind
> # sent from backend, degenerate the
> node
> # which is most likely "minority"
> # If off, just force to exit this
> session
>
> failover_if_affected_tuples_mismatch = off
> # On disagreement with the number of
> affected
> # tuples in UPDATE/DELETE queries, then
> # degenerate the node which is most
> likely
> # "minority".
> # If off, just abort the transaction to
> # keep the consistency
>
>
> #-----------------------------------------------------------
> -------------------
> # LOAD BALANCING MODE
> #-----------------------------------------------------------
> -------------------
>
> load_balance_mode = on
> # Activate load balancing mode
> # (change requires restart)
> ignore_leading_white_space = on
> # Ignore leading white spaces of each
> query
> white_function_list = ''
> # Comma separated list of function
> names
> # that don't write to database
> # Regexp are accepted
> black_function_list = 'currval,lastval,nextval,setval'
> # Comma separated list of function
> names
> # that write to database
> # Regexp are accepted
>
> database_redirect_preference_list = ''
> #
> comma separated list of pairs of database and node id.
> #
> example: postgres:primary,mydb[0-4]:1,mydb[5-9]:2'
> #
> valid for streaming replicaton mode only.
>
> app_name_redirect_preference_list = ''
> #
> comma separated list of pairs of app name and node id.
> #
> example: 'psql:primary,myapp[0-4]:1,myapp[5-9]:standby'
> #
> valid for streaming replicaton mode only.
> allow_sql_comments = off
> # if
> on, ignore SQL comments when judging if load balance or
> #
> query cache is possible.
> # If
> off, SQL comments effectively prevent the judgment
> # (pre
> 3.4 behavior).
>
> #-----------------------------------------------------------
> -------------------
> # MASTER/SLAVE MODE
> #-----------------------------------------------------------
> -------------------
>
> master_slave_mode = on
> # Activate master/slave mode
> # (change requires restart)
> master_slave_sub_mode = 'stream'
> # Master/slave sub mode
> # Valid values are combinations
> stream, slony
> # or logical. Default is stream.
> # (change requires restart)
>
> # - Streaming -
>
> sr_check_period = 10
> # Streaming replication check period
> # Disabled (0) by default
> sr_check_user = 'nobody'
> # Streaming replication check user
> # This is neccessary even if you
> disable streaming
> # replication delay check by
> sr_check_period = 0
> sr_check_password = ''
> # Password for streaming replication
> check user
> sr_check_database = 'postgres'
> # Database name for streaming
> replication check
> delay_threshold = 10000000
> # Threshold before not dispatching
> query to standby node
> # Unit is in bytes
> # Disabled (0) by default
>
> # - Special commands -
>
> follow_master_command = ''
> # Executes this command after master failover
> # Special values:
> #   %d = node id
> #   %h = host name
> #   %p = port number
> #   %D = database cluster path
> #   %m = new master node id
> #   %H = hostname of the new master node
> #   %M = old master node id
> #   %P = old primary node id
> #   %r = new master port number
> #   %R = new master database cluster path
> #   %% = '%' character
>
> #------------------------------------------------------------------------------
> # HEALTH CHECK GLOBAL PARAMETERS
> #------------------------------------------------------------------------------
>
> health_check_period = 0
> # Health check period
> # Disabled (0) by default
> health_check_timeout = 20
> # Health check timeout
> # 0 means no timeout
> health_check_user = 'nobody'
> # Health check user
> health_check_password = ''
> # Password for health check user
> health_check_database = ''
> # Database name for health check. If '', tries 'postgres' first,
> # then 'template1'.
> health_check_max_retries = 0
> # Maximum number of times to retry a failed health check before giving up.
> health_check_retry_delay = 1
> # Amount of time to wait (in seconds) between retries.
> connect_timeout = 10000
> # Timeout value in milliseconds before giving up connecting to the backend.
> # Default is 10000 ms (10 seconds). Users on a flaky network may want to
> # increase the value. 0 means no timeout.
> # Note that this value is not only used for health checks,
> # but also for ordinary connections to the backend.
>
> #------------------------------------------------------------------------------
> # HEALTH CHECK PER NODE PARAMETERS (OPTIONAL)
> #------------------------------------------------------------------------------
> health_check_period0 = 0
> health_check_timeout0 = 20
> health_check_user0 = 'nobody'
> health_check_password0 = ''
> health_check_database0 = ''
> health_check_max_retries0 = 0
> health_check_retry_delay0 = 1
> connect_timeout0 = 10000
>
> #------------------------------------------------------------------------------
> # FAILOVER AND FAILBACK
> #------------------------------------------------------------------------------
>
> failover_command = ''
> # Executes this command at failover
> # Special values:
> #   %d = node id
> #   %h = host name
> #   %p = port number
> #   %D = database cluster path
> #   %m = new master node id
> #   %H = hostname of the new master node
> #   %M = old master node id
> #   %P = old primary node id
> #   %r = new master port number
> #   %R = new master database cluster path
> #   %% = '%' character
> failback_command = ''
> # Executes this command at failback.
> # Special values:
> #   %d = node id
> #   %h = host name
> #   %p = port number
> #   %D = database cluster path
> #   %m = new master node id
> #   %H = hostname of the new master node
> #   %M = old master node id
> #   %P = old primary node id
> #   %r = new master port number
> #   %R = new master database cluster path
> #   %% = '%' character
>
> fail_over_on_backend_error = on
> # Initiates failover when reading/writing to the
> # backend communication socket fails
> # If set to off, pgpool will report an
> # error and disconnect the session.
>
> search_primary_node_timeout = 300
> # Timeout in seconds to search for the
> # primary node when a failover occurs.
> # 0 means no timeout, keep searching
> # for a primary node forever.
>
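> # A quick way to confirm that both backends are attached before each test,
> # and to see which node pgpool treats as primary afterwards, is to query
> # pgpool itself. The host is an assumption (run it on one of the pgpool
> # nodes); port and user are the ones set at the end of this file:
> #
> #   psql -h localhost -p 5431 -U pgpool postgres -c "show pool_nodes"
> #
> # Both rows should report status "up" before the primary host is rebooted.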
> #------------------------------------------------------------------------------
> # ONLINE RECOVERY
> #------------------------------------------------------------------------------
>
> recovery_user = 'nobody'
> # Online recovery user
> recovery_password = ''
> # Online recovery password
> recovery_1st_stage_command = ''
> # Executes a command in first stage
> recovery_2nd_stage_command = ''
> # Executes a command in second stage
> recovery_timeout = 90
> # Timeout in seconds to wait for the
> # recovering node's postmaster to start up
> # 0 means no wait
> client_idle_limit_in_recovery = 0
> # Client is disconnected after being idle
> # for that many seconds in the second stage
> # of online recovery
> # 0 means no disconnection
> # -1 means immediate disconnection
>
>
> #------------------------------------------------------------------------------
> # WATCHDOG
> #------------------------------------------------------------------------------
>
> # - Enabling -
>
> use_watchdog = off
> # Activates watchdog
> # (change requires restart)
>
> # - Connection to upstream servers -
>
> trusted_servers = ''
> # trusted server list which are used
> # to confirm network connection
> # (hostA,hostB,hostC,...)
> # (change requires restart)
> ping_path = '/bin'
> # ping command path
> # (change requires restart)
>
> # - Watchdog communication Settings -
>
> wd_hostname = ''
> # Host name or IP address of this watchdog
> # (change requires restart)
> wd_port = 9000
> # port number for watchdog service
> # (change requires restart)
> wd_priority = 1
> # priority of this watchdog in leader election
> # (change requires restart)
>
> wd_authkey = ''
> # Authentication key for watchdog communication
> # (change requires restart)
>
> wd_ipc_socket_dir = '/tmp'
> # Unix domain socket path for watchdog IPC socket
> # The Debian package defaults to /var/run/postgresql
> # (change requires restart)
>
>
> # - Virtual IP control Setting -
>
> delegate_IP = ''
> # delegate IP address
> # If this is empty, the virtual IP is never brought up.
> # (change requires restart)
> if_cmd_path = '/sbin'
> # path to the directory where if_up/down_cmd exists
> # (change requires restart)
> if_up_cmd = 'ip addr add $_IP_$/24 dev eth0 label eth0:0'
> # startup delegate IP command
> # (change requires restart)
> if_down_cmd = 'ip addr del $_IP_$/24 dev eth0'
> # shutdown delegate IP command
> # (change requires restart)
> arping_path = '/usr/sbin'
> # arping command path
> # (change requires restart)
> arping_cmd = 'arping -U $_IP_$ -w 1'
> # arping command
> # (change requires restart)
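> # As a concrete illustration, with the delegate_IP and the if_up_cmd /
> # if_down_cmd overrides set near the end of this file, pgpool substitutes
> # $_IP_$ and would run roughly the following at escalation:
> #   ip addr add 192.168.0.200/24 dev enp0s3:1 label enp0s3:1
> #   arping -U 192.168.0.200 -w 1
> # and at de-escalation:
> #   ip addr del 192.168.0.200/24 dev enp0s3:1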
>
> # - Behavior on escalation Setting -
>
> clear_memqcache_on_escalation = on
> # Clear all the query cache on shared memory
> # when a standby pgpool escalates to the active pgpool
> # (= virtual IP holder).
> # This should be off if clients connect to pgpool
> # without using the virtual IP.
> # (change requires restart)
> wd_escalation_command = ''
> # Executes this command at escalation on the new active pgpool.
> # (change requires restart)
> wd_de_escalation_command = ''
> # Executes this command when the master pgpool resigns from being master.
> # (change requires restart)
>
> # - Watchdog consensus settings for failover -
>
> failover_when_quorum_exists = on
> # Only perform backend node failover
> # when the watchdog cluster holds the quorum
> # (change requires restart)
>
> failover_require_consensus = on
> # Perform failover when the majority of Pgpool-II nodes
> # agree on the backend node status change
> # (change requires restart)
>
> allow_multiple_failover_requests_from_node = off
> # A Pgpool-II node can cast multiple votes
> # for building the consensus on failover
> # (change requires restart)
>
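> # Since failover_when_quorum_exists and failover_require_consensus are both
> # on, it may be worth capturing the watchdog state while a test is failing.
> # A quick check (the pcp port and user below are assumptions; they come from
> # pcp.conf rather than from this file):
> #   pcp_watchdog_info -h localhost -p 9898 -U pgpool -v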
>
> # - Lifecheck Setting -
>
> # -- common --
>
> wd_monitoring_interfaces_list = ''
> # Comma separated list of interface names to monitor.
> # If any interface from the list is active, the watchdog
> # considers the network to be fine
> # 'any' to enable monitoring on all interfaces except loopback
> # '' to disable monitoring
> # (change requires restart)
>
> wd_lifecheck_method = 'heartbeat'
> # Method of watchdog lifecheck
> # ('heartbeat' or 'query' or 'external')
> # (change requires restart)
> wd_interval = 10
> # lifecheck interval (sec) > 0
> # (change requires restart)
>
> # -- heartbeat mode --
>
> wd_heartbeat_port = 9694
> # Port number for receiving heartbeat signal
> # (change requires restart)
> wd_heartbeat_keepalive = 2
> # Interval time of sending heartbeat signal (sec)
> # (change requires restart)
> wd_heartbeat_deadtime = 30
> # Deadtime interval for heartbeat signal (sec)
> # (change requires restart)
> heartbeat_destination0 = 'host0_ip1'
> # Host name or IP address of destination 0
> # for sending heartbeat signal.
> # (change requires restart)
> heartbeat_destination_port0 = 9694
> # Port number of destination 0 for sending
> # heartbeat signal. Usually this is the
> # same as wd_heartbeat_port.
> # (change requires restart)
> heartbeat_device0 = ''
> # Name of NIC device (such as 'eth0')
> # used for sending/receiving heartbeat
> # signal to/from destination 0.
> # This works only when this is not empty
> # and pgpool has root privilege.
> # (change requires restart)
>
> #heartbeat_destination1 = 'host0_ip2'
> #heartbeat_destination_port1 = 9694
> #heartbeat_device1 = ''
>
> # -- query mode --
>
> wd_life_point = 3
> # lifecheck retry times
> # (change requires restart)
> wd_lifecheck_query = 'SELECT 1'
> # lifecheck query to pgpool from watchdog
> # (change requires restart)
> wd_lifecheck_dbname = 'template1'
> # Database name connected for lifecheck
> # (change requires restart)
> wd_lifecheck_user = 'nobody'
> # watchdog user monitoring pgpools in lifecheck
> # (change requires restart)
> wd_lifecheck_password = ''
> # Password for watchdog user in lifecheck
> # (change requires restart)
>
> # - Other pgpool Connection Settings -
>
> #other_pgpool_hostname0 = 'host0'
> # Host name or IP address to connect to for other pgpool 0
> # (change requires restart)
> #other_pgpool_port0 = 5432
> # Port number for other pgpool 0
> # (change requires restart)
> #other_wd_port0 = 9000
> # Port number for other watchdog 0
> # (change requires restart)
> #other_pgpool_hostname1 = 'host1'
> #other_pgpool_port1 = 5432
> #other_wd_port1 = 9000
>
>
> #------------------------------------------------------------------------------
> # OTHERS
> #------------------------------------------------------------------------------
> relcache_expire = 0
> # Life time of relation cache in seconds.
> # 0 means no cache expiration (the default).
> # The relation cache is used for caching the
> # query result against the PostgreSQL system
> # catalog to obtain various information,
> # including table structures or whether a table
> # is a temporary table or not. The cache is
> # maintained in a pgpool child's local memory
> # and kept as long as the child survives.
> # If someone modifies the table by using
> # ALTER TABLE or the like, the relcache is
> # no longer consistent.
> # For this purpose, relcache_expire
> # controls the life time of the cache.
> relcache_size = 256
> # Number of relation cache entries.
> # If you see frequently:
> # "pool_search_relcache: cache replacement happened"
> # in the pgpool log, you might want to increase this number.
>
> check_temp_table = on
> # If on, enable temporary table check in SELECT statements.
> # This initiates queries against the system catalog of primary/master,
> # thus increasing the load on the master.
> # If you are absolutely sure that your system never uses temporary tables
> # and you want to save access to primary/master, you could turn this off.
> # Default is on.
>
> check_unlogged_table = on
> # If on, enable unlogged table check in SELECT statements.
> # This initiates queries against the system catalog of primary/master,
> # thus increasing the load on the master.
> # If you are absolutely sure that your system never uses unlogged tables
> # and you want to save access to primary/master, you could turn this off.
> # Default is on.
>
> #------------------------------------------------------------------------------
> # IN MEMORY QUERY MEMORY CACHE
> #------------------------------------------------------------------------------
> memory_cache_enabled = off
> # If on, use the memory cache functionality, off by default
> memqcache_method = 'shmem'
> # Cache storage method. Either 'shmem' (shared memory) or
> # 'memcached'. 'shmem' by default
> # (change requires restart)
> memqcache_memcached_host = 'localhost'
> # Memcached host name or IP address. Mandatory if
> # memqcache_method = 'memcached'.
> # Defaults to localhost.
> # (change requires restart)
> memqcache_memcached_port = 11211
> # Memcached port number. Mandatory if memqcache_method = 'memcached'.
> # Defaults to 11211.
> # (change requires restart)
> memqcache_total_size = 67108864
> # Total memory size in bytes for storing memory cache.
> # Mandatory if memqcache_method = 'shmem'.
> # Defaults to 64MB.
> # (change requires restart)
> memqcache_max_num_cache = 1000000
> # Total number of cache entries. Mandatory
> # if memqcache_method = 'shmem'.
> # Each cache entry consumes 48 bytes on shared memory.
> # Defaults to 1,000,000 (45.8MB).
> # (change requires restart)
> memqcache_expire = 0
> # Memory cache entry life time specified in seconds.
> # 0 means infinite life time. 0 by default.
> # (change requires restart)
> memqcache_auto_cache_invalidation = on
> # If on, invalidation of query cache is triggered by corresponding
> # DDL/DML/DCL (and memqcache_expire). If off, it is only triggered
> # by memqcache_expire. on by default.
> # (change requires restart)
> memqcache_maxcache = 409600
> # Maximum SELECT result size in bytes.
> # Must be smaller than memqcache_cache_block_size. Defaults to 400KB.
> # (change requires restart)
> memqcache_cache_block_size = 1048576
> # Cache block size in bytes. Mandatory if memqcache_method = 'shmem'.
> # Defaults to 1MB.
> # (change requires restart)
> memqcache_oiddir = '/var/log/pgpool/oiddir'
> # Temporary work directory to record table oids
> # (change requires restart)
> white_memqcache_table_list = ''
> # Comma separated list of table names to memcache
> # that don't write to database
> # Regexp are accepted
> black_memqcache_table_list = ''
> # Comma separated list of table names not to memcache
> # that don't write to database
> # Regexp are accepted
>
> listen_addresses = '*'
> port = 5431
> backend_hostname0 = 'osboxes44'
> backend_port0 = 5432
> backend_weight0 = 1
> backend_data_directory0 = '/var/lib/pgsql/9.6/data'
> backend_flag0 = 'ALLOW_TO_FAILOVER'
> backend_hostname1 = 'osboxes75'
> backend_port1 = 5432
> backend_weight1 = 1
> backend_data_directory1 = '/var/lib/pgsql/9.6/data'
> backend_flag1 = 'ALLOW_TO_FAILOVER'
> enable_pool_hba = on
> pid_file_name = '/var/run/pgpool-II-96/pgpool.pid'
> sr_check_user = 'pgpool'
> sr_check_password = 'secret'
> health_check_period = 10
> health_check_user = 'pgpool'
> health_check_password = 'secret'
> failover_command = '/etc/pgpool-II-96/failover.sh %d %H'
> recovery_user = 'pgpool'
> recovery_password = 'secret'
> recovery_1st_stage_command = 'basebackup.sh'
> log_destination = 'syslog,stderr'
> client_min_messages = log
> log_min_messages = info
> health_check_max_retries = 10
> socket_dir = '/var/run/pgpool-II-96'
>
> use_watchdog = on
> delegate_IP = '192.168.0.200'
> wd_hostname = 'osboxes44'
> wd_port = 9000
> ifconfig_path = '/usr/sbin'
> arping_path = '/usr/sbin'
> wd_lifecheck_method = 'heartbeat'
> wd_interval = 5
> wd_heartbeat_port = 9694
> heartbeat_destination0 = 'osboxes75'
> heartbeat_destination_port0 = 9694
> other_pgpool_hostname0 = 'osboxes75'
> other_pgpool_port0 = 5431
> other_wd_port0 = 9000
> load_balance_mode = off
> if_up_cmd = 'ip addr add $_IP_$/24 dev enp0s3:1 label enp0s3:1'
> if_down_cmd = 'ip addr del $_IP_$/24 dev enp0s3:1'
> pcp_socket_dir = '/var/run/pgpool-II-96'
> wd_ipc_socket_dir = '/var/run/pgpool-II-96'
>
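> The failover.sh referenced above is not included in this mail, so here is a
> minimal sketch of the kind of script behind it, assuming passwordless ssh as
> the postgres user and promotion via pg_ctl under the standard PGDG 9.6 paths
> (the real script may differ in these details):
>
>     #!/bin/bash
>     # failover.sh - invoked by pgpool as: failover.sh <failed_node_id> <new_master_host>
>     set -u
>
>     FAILED_NODE_ID="$1"
>     NEW_MASTER_HOST="$2"
>     PGDATA=/var/lib/pgsql/9.6/data
>
>     # If pgpool could not determine a new master, the second argument is an
>     # empty string and there is nothing to promote.
>     if [ -z "$NEW_MASTER_HOST" ]; then
>         echo "failover.sh: no new master host passed for failed node $FAILED_NODE_ID" >&2
>         exit 1
>     fi
>
>     # Promote the standby on the new master host.
>     ssh -T postgres@"$NEW_MASTER_HOST" \
>         "/usr/pgsql-9.6/bin/pg_ctl promote -D $PGDATA"
>     exit 0
>
> With failover_command = '/etc/pgpool-II-96/failover.sh %d %H', a failover of
> node 1 with node 0 as the promotion candidate would be invoked roughly as
> /etc/pgpool-II-96/failover.sh 1 osboxes44.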
>
>
> Thanks
> Pankaj Joshi
>
>
>
>
>
> _______________________________________________
> pgpool-general mailing list
> pgpool-general at pgpool.net
> http://www.pgpool.net/mailman/listinfo/pgpool-general
>