View Issue Details

ID: 0000720
Project: Pgpool-II
Category: General
View Status: public
Last Update: 2021-07-12 17:37
Reporter: Ken
Assigned To: pengbo
Priority: normal
Severity: minor
Reproducibility: always
Status: closed
Resolution: open
Product Version: 4.2.3
Summary: 0000720: Connection timed out for node of cluster
Description

Hello,

I have a cluster made up of 3 nodes: 2 database nodes (10.234.226.66 and 10.234.234.66) and 1 witness (10.234.227.161). I noticed entries like the following in the logs:

2021-06-25 03:00:54: pid 17502: LOG: read from socket failed with error :"Connection timed out"
2021-06-25 03:00:54: pid 17502: LOG: client socket of 10.234.234.66:5433 Linux m2mdbsz.unx.t-mobile.pl is closed
2021-06-25 03:00:54: pid 17502: LOG: new outbound connection to 10.234.234.66:9000

or

2021-06-25 03:41:04: pid 17502: LOG: new watchdog node connection is received from "10.234.234.66:64585"
2021-06-25 03:41:04: pid 17502: LOG: new node joined the cluster hostname:"10.234.234.66" port:9000 pgpool_port:5433
2021-06-25 03:41:04: pid 17502: DETAIL: Pgpool-II version:"4.2.2" watchdog messaging version: 1.2

Why does the connection to the database node time out? Is this normal behavior? I have other clusters and have not noticed these log entries there. If this is normal behavior, could you describe the process?

The Pgpool-II logs and configuration file are attached.
Tags: No tags attached.

Activities

Ken

2021-06-25 17:53

reporter  

logs.txt (22,122 bytes)   
2021-06-25 00:10:31: pid 17592: LOG:  Replication of node:1 is behind 11143832 bytes from the primary server (node:0)
2021-06-25 00:10:31: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 00:49:17: pid 17502: LOG:  read from socket failed with error :"Connection timed out"
2021-06-25 00:49:17: pid 17502: LOG:  client socket of 10.234.234.66:5433 Linux m2mdbsz.unx.t-mobile.pl is closed
2021-06-25 00:49:17: pid 17502: LOG:  new outbound connection to 10.234.234.66:9000
2021-06-25 01:09:28: pid 17592: LOG:  Replication of node:1 is behind 29638200 bytes from the primary server (node:0)
2021-06-25 01:09:28: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 01:20:00: pid 17592: LOG:  Replication of node:1 is behind 25174408 bytes from the primary server (node:0)
2021-06-25 01:20:00: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 01:21:51: pid 17592: LOG:  Replication of node:1 is behind 78286648 bytes from the primary server (node:0)
2021-06-25 01:21:51: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 01:30:01: pid 17592: LOG:  Replication of node:1 is behind 15301472 bytes from the primary server (node:0)
2021-06-25 01:30:01: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 01:40:55: pid 17502: LOG:  new watchdog node connection is received from "10.234.234.66:4591"
2021-06-25 01:40:55: pid 17502: LOG:  new node joined the cluster hostname:"10.234.234.66" port:9000 pgpool_port:5433
2021-06-25 01:40:55: pid 17502: DETAIL:  Pgpool-II version:"4.2.2" watchdog messaging version: 1.2
2021-06-25 01:57:14: pid 17592: LOG:  Replication of node:1 is behind 14143008 bytes from the primary server (node:0)
2021-06-25 01:57:14: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 01:57:34: pid 17592: LOG:  Replication of node:1 is behind 28096632 bytes from the primary server (node:0)
2021-06-25 01:57:34: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 01:57:44: pid 17592: LOG:  Replication of node:1 is behind 27645056 bytes from the primary server (node:0)
2021-06-25 01:57:44: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 01:57:54: pid 17592: LOG:  Replication of node:1 is behind 16777304 bytes from the primary server (node:0)
2021-06-25 01:57:54: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 01:58:04: pid 17592: LOG:  Replication of node:1 is behind 19392000 bytes from the primary server (node:0)
2021-06-25 01:58:04: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 01:58:15: pid 17592: LOG:  Replication of node:1 is behind 29633616 bytes from the primary server (node:0)
2021-06-25 01:58:15: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 02:27:07: pid 17592: LOG:  Replication of node:1 is behind 10803768 bytes from the primary server (node:0)
2021-06-25 02:27:07: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 02:27:47: pid 17592: LOG:  Replication of node:1 is behind 15997472 bytes from the primary server (node:0)
2021-06-25 02:27:47: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 02:28:08: pid 17592: LOG:  Replication of node:1 is behind 24119712 bytes from the primary server (node:0)
2021-06-25 02:28:08: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 02:30:38: pid 17592: LOG:  Replication of node:1 is behind 38062712 bytes from the primary server (node:0)
2021-06-25 02:30:38: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 02:30:48: pid 17592: LOG:  Replication of node:1 is behind 12222856 bytes from the primary server (node:0)
2021-06-25 02:30:48: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 02:58:42: pid 17592: LOG:  Replication of node:1 is behind 10465816 bytes from the primary server (node:0)
2021-06-25 02:58:42: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 02:58:52: pid 17592: LOG:  Replication of node:1 is behind 61209936 bytes from the primary server (node:0)
2021-06-25 02:58:52: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 03:00:54: pid 17502: LOG:  read from socket failed with error :"Connection timed out"
2021-06-25 03:00:54: pid 17502: LOG:  client socket of 10.234.234.66:5433 Linux m2mdbsz.unx.t-mobile.pl is closed
2021-06-25 03:00:54: pid 17502: LOG:  new outbound connection to 10.234.234.66:9000
2021-06-25 03:10:33: pid 17592: LOG:  Replication of node:1 is behind 12215416 bytes from the primary server (node:0)
2021-06-25 03:10:33: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 03:30:35: pid 17592: LOG:  Replication of node:1 is behind 26449736 bytes from the primary server (node:0)
2021-06-25 03:30:35: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 03:30:45: pid 17592: LOG:  Replication of node:1 is behind 12482864 bytes from the primary server (node:0)
2021-06-25 03:30:45: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 03:32:26: pid 17592: LOG:  Replication of node:1 is behind 15789304 bytes from the primary server (node:0)
2021-06-25 03:32:26: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 03:41:04: pid 17502: LOG:  new watchdog node connection is received from "10.234.234.66:64585"
2021-06-25 03:41:04: pid 17502: LOG:  new node joined the cluster hostname:"10.234.234.66" port:9000 pgpool_port:5433
2021-06-25 03:41:04: pid 17502: DETAIL:  Pgpool-II version:"4.2.2" watchdog messaging version: 1.2
2021-06-25 04:04:00: pid 17592: LOG:  Replication of node:1 is behind 41926824 bytes from the primary server (node:0)
2021-06-25 04:04:00: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 04:05:00: pid 17592: LOG:  Replication of node:1 is behind 19306856 bytes from the primary server (node:0)
2021-06-25 04:05:00: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 04:05:40: pid 17592: LOG:  Replication of node:1 is behind 16142216 bytes from the primary server (node:0)
2021-06-25 04:05:40: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 04:07:30: pid 17592: LOG:  Replication of node:1 is behind 20570992 bytes from the primary server (node:0)
2021-06-25 04:07:30: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 04:07:40: pid 17592: LOG:  Replication of node:1 is behind 11906288 bytes from the primary server (node:0)
2021-06-25 04:07:40: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 04:08:00: pid 17592: LOG:  Replication of node:1 is behind 15595520 bytes from the primary server (node:0)
2021-06-25 04:08:00: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 04:08:10: pid 17592: LOG:  Replication of node:1 is behind 32614944 bytes from the primary server (node:0)
2021-06-25 04:08:10: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 04:08:20: pid 17592: LOG:  Replication of node:1 is behind 13813112 bytes from the primary server (node:0)
2021-06-25 04:08:20: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 04:08:40: pid 17592: LOG:  Replication of node:1 is behind 13375560 bytes from the primary server (node:0)
2021-06-25 04:08:40: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 04:08:50: pid 17592: LOG:  Replication of node:1 is behind 11752712 bytes from the primary server (node:0)
2021-06-25 04:08:50: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 04:09:00: pid 17592: LOG:  Replication of node:1 is behind 11403280 bytes from the primary server (node:0)
2021-06-25 04:09:00: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 04:10:00: pid 17592: LOG:  Replication of node:1 is behind 24212064 bytes from the primary server (node:0)
2021-06-25 04:10:00: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 04:30:42: pid 17592: LOG:  Replication of node:1 is behind 184068896 bytes from the primary server (node:0)
2021-06-25 04:30:42: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 04:30:52: pid 17592: LOG:  Replication of node:1 is behind 340004488 bytes from the primary server (node:0)
2021-06-25 04:30:52: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 04:31:02: pid 17592: LOG:  Replication of node:1 is behind 669585576 bytes from the primary server (node:0)
2021-06-25 04:31:02: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 04:31:13: pid 17592: LOG:  Replication of node:1 is behind 1011669024 bytes from the primary server (node:0)
2021-06-25 04:31:13: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 04:31:23: pid 17592: LOG:  Replication of node:1 is behind 1065417727 bytes from the primary server (node:0)
2021-06-25 04:31:23: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 04:31:33: pid 17592: LOG:  Replication of node:1 is behind 1312002064 bytes from the primary server (node:0)
2021-06-25 04:31:33: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 04:31:43: pid 17592: LOG:  Replication of node:1 is behind 1445769816 bytes from the primary server (node:0)
2021-06-25 04:31:43: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 04:31:53: pid 17592: LOG:  Replication of node:1 is behind 1701446928 bytes from the primary server (node:0)
2021-06-25 04:31:53: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 04:32:03: pid 17592: LOG:  Replication of node:1 is behind 1383392807 bytes from the primary server (node:0)
2021-06-25 04:32:03: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 04:32:13: pid 17592: LOG:  Replication of node:1 is behind 684064039 bytes from the primary server (node:0)
2021-06-25 04:32:13: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 04:37:53: pid 17592: LOG:  Replication of node:1 is behind 11166376 bytes from the primary server (node:0)
2021-06-25 04:37:53: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 04:53:55: pid 17592: LOG:  Replication of node:1 is behind 12970992 bytes from the primary server (node:0)
2021-06-25 04:53:55: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 05:07:06: pid 17592: LOG:  Replication of node:1 is behind 126447088 bytes from the primary server (node:0)
2021-06-25 05:07:06: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 05:07:16: pid 17592: LOG:  Replication of node:1 is behind 256557456 bytes from the primary server (node:0)
2021-06-25 05:07:16: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 05:07:26: pid 17592: LOG:  Replication of node:1 is behind 603987584 bytes from the primary server (node:0)
2021-06-25 05:07:26: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 05:07:36: pid 17592: LOG:  Replication of node:1 is behind 484786272 bytes from the primary server (node:0)
2021-06-25 05:07:36: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 05:10:17: pid 17592: LOG:  Replication of node:1 is behind 33542696 bytes from the primary server (node:0)
2021-06-25 05:10:17: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 05:12:31: pid 17502: LOG:  read from socket failed with error :"Connection timed out"
2021-06-25 05:12:31: pid 17502: LOG:  client socket of 10.234.234.66:5433 Linux m2mdbsz.unx.t-mobile.pl is closed
2021-06-25 05:12:31: pid 17502: LOG:  new outbound connection to 10.234.234.66:9000
2021-06-25 05:21:08: pid 17592: LOG:  Replication of node:1 is behind 83917640 bytes from the primary server (node:0)
2021-06-25 05:21:08: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 05:31:10: pid 17592: LOG:  Replication of node:1 is behind 16778976 bytes from the primary server (node:0)
2021-06-25 05:31:10: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 05:35:50: pid 17592: LOG:  Replication of node:1 is behind 12578712 bytes from the primary server (node:0)
2021-06-25 05:35:50: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 05:36:10: pid 17592: LOG:  Replication of node:1 is behind 15762520 bytes from the primary server (node:0)
2021-06-25 05:36:10: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 05:41:13: pid 17502: LOG:  new watchdog node connection is received from "10.234.234.66:38530"
2021-06-25 05:41:13: pid 17502: LOG:  new node joined the cluster hostname:"10.234.234.66" port:9000 pgpool_port:5433
2021-06-25 05:41:13: pid 17502: DETAIL:  Pgpool-II version:"4.2.2" watchdog messaging version: 1.2
2021-06-25 06:30:07: pid 17592: LOG:  Replication of node:1 is behind 12802888 bytes from the primary server (node:0)
2021-06-25 06:30:07: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 06:30:57: pid 17592: LOG:  Replication of node:1 is behind 134870512 bytes from the primary server (node:0)
2021-06-25 06:30:57: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 06:31:17: pid 17592: LOG:  Replication of node:1 is behind 48984000 bytes from the primary server (node:0)
2021-06-25 06:31:17: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 06:57:01: pid 17592: LOG:  Replication of node:1 is behind 17832600 bytes from the primary server (node:0)
2021-06-25 06:57:01: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 06:57:41: pid 17592: LOG:  Replication of node:1 is behind 20547496 bytes from the primary server (node:0)
2021-06-25 06:57:41: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 06:57:51: pid 17592: LOG:  Replication of node:1 is behind 14190384 bytes from the primary server (node:0)
2021-06-25 06:57:51: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 06:58:11: pid 17592: LOG:  Replication of node:1 is behind 10466200 bytes from the primary server (node:0)
2021-06-25 06:58:11: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 07:06:02: pid 17592: LOG:  Replication of node:1 is behind 33151000 bytes from the primary server (node:0)
2021-06-25 07:06:02: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 07:10:33: pid 17592: LOG:  Replication of node:1 is behind 37871655 bytes from the primary server (node:0)
2021-06-25 07:10:33: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 07:24:08: pid 17502: LOG:  read from socket failed with error :"Connection timed out"
2021-06-25 07:24:08: pid 17502: LOG:  client socket of 10.234.234.66:5433 Linux m2mdbsz.unx.t-mobile.pl is closed
2021-06-25 07:24:08: pid 17502: LOG:  new outbound connection to 10.234.234.66:9000
2021-06-25 07:30:35: pid 17592: LOG:  Replication of node:1 is behind 51795096 bytes from the primary server (node:0)
2021-06-25 07:30:35: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 07:30:55: pid 17592: LOG:  Replication of node:1 is behind 176378648 bytes from the primary server (node:0)
2021-06-25 07:30:55: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 07:35:05: pid 17592: LOG:  Replication of node:1 is behind 15470144 bytes from the primary server (node:0)
2021-06-25 07:35:05: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 07:35:15: pid 17592: LOG:  Replication of node:1 is behind 20059368 bytes from the primary server (node:0)
2021-06-25 07:35:15: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 07:41:22: pid 17502: LOG:  new watchdog node connection is received from "10.234.234.66:62651"
2021-06-25 07:41:22: pid 17502: LOG:  new node joined the cluster hostname:"10.234.234.66" port:9000 pgpool_port:5433
2021-06-25 07:41:22: pid 17502: DETAIL:  Pgpool-II version:"4.2.2" watchdog messaging version: 1.2
2021-06-25 08:01:28: pid 17592: LOG:  Replication of node:1 is behind 33408928 bytes from the primary server (node:0)
2021-06-25 08:01:28: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 08:01:48: pid 17592: LOG:  Replication of node:1 is behind 83890368 bytes from the primary server (node:0)
2021-06-25 08:01:48: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 08:05:39: pid 17592: LOG:  Replication of node:1 is behind 36896936 bytes from the primary server (node:0)
2021-06-25 08:05:39: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 08:30:05: pid 17592: LOG:  Replication of node:1 is behind 76604560 bytes from the primary server (node:0)
2021-06-25 08:30:05: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 08:30:15: pid 17592: LOG:  Replication of node:1 is behind 200199072 bytes from the primary server (node:0)
2021-06-25 08:30:15: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 08:30:25: pid 17592: LOG:  Replication of node:1 is behind 18552224 bytes from the primary server (node:0)
2021-06-25 08:30:25: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 08:30:35: pid 17592: LOG:  Replication of node:1 is behind 177315816 bytes from the primary server (node:0)
2021-06-25 08:30:35: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 08:30:45: pid 17592: LOG:  Replication of node:1 is behind 261111984 bytes from the primary server (node:0)
2021-06-25 08:30:45: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 08:30:55: pid 17592: LOG:  Replication of node:1 is behind 406269712 bytes from the primary server (node:0)
2021-06-25 08:30:55: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 08:31:05: pid 17592: LOG:  Replication of node:1 is behind 189502680 bytes from the primary server (node:0)
2021-06-25 08:31:05: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 08:31:15: pid 17592: LOG:  Replication of node:1 is behind 507883671 bytes from the primary server (node:0)
2021-06-25 08:31:15: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 08:31:25: pid 17592: LOG:  Replication of node:1 is behind 846031887 bytes from the primary server (node:0)
2021-06-25 08:31:25: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 08:31:35: pid 17592: LOG:  Replication of node:1 is behind 867297576 bytes from the primary server (node:0)
2021-06-25 08:31:35: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 08:31:45: pid 17592: LOG:  Replication of node:1 is behind 816141280 bytes from the primary server (node:0)
2021-06-25 08:31:45: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 08:31:55: pid 17592: LOG:  Replication of node:1 is behind 427226768 bytes from the primary server (node:0)
2021-06-25 08:31:55: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 08:32:05: pid 17592: LOG:  Replication of node:1 is behind 24198544 bytes from the primary server (node:0)
2021-06-25 08:32:05: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 09:02:31: pid 17592: LOG:  Replication of node:1 is behind 29540960 bytes from the primary server (node:0)
2021-06-25 09:02:31: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 09:02:42: pid 17592: LOG:  Replication of node:1 is behind 145053616 bytes from the primary server (node:0)
2021-06-25 09:02:42: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 09:05:22: pid 17592: LOG:  Replication of node:1 is behind 34939768 bytes from the primary server (node:0)
2021-06-25 09:05:22: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 09:20:09: pid 17592: LOG:  Replication of node:1 is behind 42216472 bytes from the primary server (node:0)
2021-06-25 09:20:09: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 09:21:14: pid 17591: LOG:  forked new pcp worker, pid=16327 socket=7
2021-06-25 09:21:14: pid 17502: LOG:  new IPC connection received
2021-06-25 09:21:14: pid 17591: LOG:  PCP process with pid: 16327 exit with SUCCESS.
2021-06-25 09:21:14: pid 17591: LOG:  PCP process with pid: 16327 exits with status 0
2021-06-25 09:30:11: pid 17592: LOG:  Replication of node:1 is behind 24477296 bytes from the primary server (node:0)
2021-06-25 09:30:11: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 09:35:48: pid 17502: LOG:  read from socket failed with error :"Connection timed out"
2021-06-25 09:35:48: pid 17502: LOG:  client socket of 10.234.234.66:5433 Linux m2mdbsz.unx.t-mobile.pl is closed
2021-06-25 09:35:48: pid 17502: LOG:  new outbound connection to 10.234.234.66:9000
2021-06-25 09:41:34: pid 17502: LOG:  new watchdog node connection is received from "10.234.234.66:29941"
2021-06-25 09:41:34: pid 17502: LOG:  new node joined the cluster hostname:"10.234.234.66" port:9000 pgpool_port:5433
2021-06-25 09:41:34: pid 17502: DETAIL:  Pgpool-II version:"4.2.2" watchdog messaging version: 1.2
2021-06-25 09:57:04: pid 17592: LOG:  Replication of node:1 is behind 11585912 bytes from the primary server (node:0)
2021-06-25 09:57:04: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 09:58:36: pid 17502: WARNING:  we have not received a beacon message from leader node "10.234.227.161:5433 Linux m2mps1an.unx.t-mobile.pl"
2021-06-25 09:58:36: pid 17502: DETAIL:  requesting info message from leader node
2021-06-25 10:20:29: pid 17592: LOG:  Replication of node:1 is behind 99738296 bytes from the primary server (node:0)
2021-06-25 10:20:29: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 10:30:31: pid 17592: LOG:  Replication of node:1 is behind 89146648 bytes from the primary server (node:0)
2021-06-25 10:30:31: pid 17592: CONTEXT:  while checking replication time lag
2021-06-25 10:30:41: pid 17592: LOG:  Replication of node:1 is behind 197526304 bytes from the primary server (node:0)
2021-06-25 10:30:41: pid 17592: CONTEXT:  while checking replication time lag
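One pattern worth noting in the attachment: the "Connection timed out" events occur at an almost perfectly regular interval of about 2 h 11 min 37 s, which may suggest a fixed idle-connection or keepalive timer rather than random network failures. A minimal sketch that computes the spacing (timestamps copied from the log excerpt above):

```python
from datetime import datetime

# Timestamps of 'read from socket failed with error :"Connection timed out"'
# copied from the attached logs.txt excerpt.
timeouts = [
    "2021-06-25 00:49:17",
    "2021-06-25 03:00:54",
    "2021-06-25 05:12:31",
    "2021-06-25 07:24:08",
    "2021-06-25 09:35:48",
]
ts = [datetime.strptime(t, "%Y-%m-%d %H:%M:%S") for t in timeouts]
gaps = [(b - a).total_seconds() for a, b in zip(ts, ts[1:])]
# Gaps are 7897, 7897, 7897 and 7900 seconds (~2 h 11.5 min each),
# i.e. the timeouts recur on a near-fixed cadence.
print(gaps)
```

Whether the timer lives in Pgpool-II's watchdog, the OS TCP keepalive settings, or a firewall in between cannot be determined from this log alone.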
pgpool.conf (47,025 bytes)   
# ----------------------------
# pgPool-II configuration file
# ----------------------------
#
# This file consists of lines of the form:
#
#   name = value
#
# Whitespace may be used.  Comments are introduced with "#" anywhere on a line.
# The complete list of parameter names and allowed values can be found in the
# pgPool-II documentation.
#
# This file is read on server startup and when the server receives a SIGHUP
# signal.  If you edit the file on a running system, you have to SIGHUP the
# server for the changes to take effect, or use "pgpool reload".  Some
# parameters, which are marked below, require a server shutdown and restart to
# take effect.
#

#------------------------------------------------------------------------------
# BACKEND CLUSTERING MODE
# Choose one of: 'streaming_replication', 'native_replication',
#       'logical_replication', 'slony', 'raw' or 'snapshot_isolation'
# (change requires restart)
#------------------------------------------------------------------------------

backend_clustering_mode = 'streaming_replication'

#------------------------------------------------------------------------------
# CONNECTIONS
#------------------------------------------------------------------------------

# - pgpool Connection Settings -

listen_addresses = '*'
                                   # Host name or IP address to listen on:
                                   # '*' for all, '' for no TCP/IP connections
                                   # (change requires restart)
port = 5433
                                   # Port number
                                   # (change requires restart)
socket_dir = '/var/run/postgresql'
                                   # Unix domain socket path
                                   # The Debian package defaults to
                                   # /var/run/postgresql
                                   # (change requires restart)
reserved_connections = 0
                                   # Number of reserved connections.
                                   # Pgpool-II does not accept connections if over
                                   # num_init_children - reserved_connections.


# - pgpool Communication Manager Connection Settings -

pcp_listen_addresses = '*'
                                   # Host name or IP address for pcp process to listen on:
                                   # '*' for all, '' for no TCP/IP connections
                                   # (change requires restart)
pcp_port = 9898
                                   # Port number for pcp
                                   # (change requires restart)
pcp_socket_dir = '/var/run/postgresql'
                                   # Unix domain socket path for pcp
                                   # The Debian package defaults to
                                   # /var/run/postgresql
                                   # (change requires restart)
listen_backlog_multiplier = 2
                                   # Set the backlog parameter of listen(2) to
                                   # num_init_children * listen_backlog_multiplier.
                                   # (change requires restart)
serialize_accept = off
                                   # whether to serialize accept() call to avoid thundering herd problem
                                   # (change requires restart)

# - Backend Connection Settings -

backend_hostname0 = '10.234.226.66'
                                   # Host name or IP address to connect to for backend 0
backend_port0 = 5432
                                   # Port number for backend 0
backend_weight0 = 1
                                   # Weight for backend 0 (only in load balancing mode)
backend_data_directory0 = '/pgdata/m2mdbprod/data'
                                   # Data directory for backend 0
backend_flag0 = 'ALLOW_TO_FAILOVER'
                                   # Controls various backend behavior
                                   # ALLOW_TO_FAILOVER, DISALLOW_TO_FAILOVER
                                   # or ALWAYS_PRIMARY
backend_application_name0 = 'server0'
                                   # walsender's application_name, used for "show pool_nodes" command
backend_hostname1 = '10.234.234.66'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/pgdata/m2mdbprod/data'
backend_flag1 = 'ALLOW_TO_FAILOVER'
backend_application_name1 = 'server1'

# - Authentication -

enable_pool_hba = off
                                   # Use pool_hba.conf for client authentication
pool_passwd = 'pool_passwd'
                                   # File name of pool_passwd for md5 authentication.
                                   # "" disables pool_passwd.
                                   # (change requires restart)
authentication_timeout = 1min
                                   # Delay in seconds to complete client authentication
                                   # 0 means no timeout.

allow_clear_text_frontend_auth = off
                                   # Allow Pgpool-II to use clear text password authentication
                                   # with clients, when pool_passwd does not
                                   # contain the user password

# - SSL Connections -

ssl = off
                                   # Enable SSL support
                                   # (change requires restart)
#ssl_key = 'server.key'
                                   # SSL private key file
                                   # (change requires restart)
#ssl_cert = 'server.crt'
                                   # SSL public certificate file
                                   # (change requires restart)
#ssl_ca_cert = ''
                                   # Single PEM format file containing
                                   # CA root certificate(s)
                                   # (change requires restart)
#ssl_ca_cert_dir = ''
                                   # Directory containing CA root certificate(s)
                                   # (change requires restart)
#ssl_crl_file = ''
                                   # SSL certificate revocation list file
                                   # (change requires restart)

ssl_ciphers = 'HIGH:MEDIUM:+3DES:!aNULL'
                                   # Allowed SSL ciphers
                                   # (change requires restart)
ssl_prefer_server_ciphers = off
                                   # Use server's SSL cipher preferences,
                                   # rather than the client's
                                   # (change requires restart)
ssl_ecdh_curve = 'prime256v1'
                                   # Name of the curve to use in ECDH key exchange
ssl_dh_params_file = ''
                                   # Name of the file containing Diffie-Hellman parameters used
                                   # for so-called ephemeral DH family of SSL cipher.
#ssl_passphrase_command=''
                                   # Sets an external command to be invoked when a passphrase
                                   # for decrypting an SSL file needs to be obtained
                                   # (change requires restart)

#------------------------------------------------------------------------------
# POOLS
#------------------------------------------------------------------------------

# - Concurrent session and pool size -

num_init_children = 75
                                   # Number of concurrent sessions allowed
                                   # (change requires restart)
max_pool = 4
                                   # Number of connection pool caches per connection
                                   # (change requires restart)

# - Life time -

child_life_time = 5min
                                   # Pool exits after being idle for this many seconds
child_max_connections = 0
                                   # Pool exits after receiving that many connections
                                   # 0 means no exit
connection_life_time = 0
                                   # Connection to backend closes after being idle for this many seconds
                                   # 0 means no close
client_idle_limit = 0
                                   # Client is disconnected after being idle for that many seconds
                                   # (even inside an explicit transactions!)
                                   # 0 means no disconnection
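The two pool-size parameters above multiply: each of the num_init_children child processes can cache up to max_pool backend connections, so the backends must be prepared for num_init_children * max_pool connections in the worst case. A quick check with the values from this file:

```python
# Pool sizing taken from the pgpool.conf above.
num_init_children = 75   # concurrent client sessions
max_pool = 4             # connection pool caches per child process

# Worst-case number of connections Pgpool-II may open to each backend;
# the backends' max_connections should accommodate this.
max_backend_connections = num_init_children * max_pool
print(max_backend_connections)  # 300
```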


#------------------------------------------------------------------------------
# LOGS
#------------------------------------------------------------------------------

# - Where to log -

log_destination = 'stderr'
                                   # Where to log
                                   # Valid values are combinations of stderr
                                   # and syslog. Defaults to stderr.

# - What to log -

log_line_prefix = '%t: pid %p: '   # printf-style string to output at beginning of each log line.

log_connections = off
                                   # Log connections
log_disconnections = off
                                   # Log disconnections
log_hostname = off
                                   # Hostname will be shown in ps status
                                   # and in logs if connections are logged
log_statement = off
                                   # Log all statements
log_per_node_statement = off
                                   # Log all statements
                                   # with node and backend information
log_client_messages = off
                                   # Log any client messages
log_standby_delay = 'if_over_threshold'
                                   # Log standby delay
                                   # Valid values are combinations of always,
                                   # if_over_threshold, none

# - Syslog specific -

syslog_facility = 'LOCAL0'
                                   # Syslog local facility. Defaults to LOCAL0
syslog_ident = 'pgpool'
                                   # Syslog program identification string
                                   # Defaults to 'pgpool'

# - Debug -

#log_error_verbosity = default          # terse, default, or verbose messages

#client_min_messages = notice           # values in order of decreasing detail:
                                        #   debug5
                                        #   debug4
                                        #   debug3
                                        #   debug2
                                        #   debug1
                                        #   log
                                        #   notice
                                        #   warning
                                        #   error

#log_min_messages = warning             # values in order of decreasing detail:
                                        #   debug5
                                        #   debug4
                                        #   debug3
                                        #   debug2
                                        #   debug1
                                        #   info
                                        #   notice
                                        #   warning
                                        #   error
                                        #   log
                                        #   fatal
                                        #   panic

# This is used when logging to stderr:
logging_collector = on                # Enable capturing of stderr
                                        # into log files.
                                        # (change requires restart)

# -- Only used if logging_collector is on ---

log_directory = '/var/log/pgpool'      # directory where log files are written,
                                        # can be absolute
log_filename = 'pgpool-%a.log'
                                        # log file name pattern,
                                        # can include strftime() escapes

log_file_mode = 0644                   # creation mode for log files,
                                        # begin with 0 to use octal notation

log_truncate_on_rotation = on         # If on, an existing log file with the
                                        # same name as the new log file will be
                                        # truncated rather than appended to.
                                        # But such truncation only occurs on
                                        # time-driven rotation, not on restarts
                                        # or size-driven rotation.  Default is
                                        # off, meaning append to existing files
                                        # in all cases.

log_rotation_age = 1d                  # Automatic rotation of logfiles will
                                        # happen after this much time.
                                        # 0 disables time based rotation.
log_rotation_size = 0               # Automatic rotation of logfiles will
                                        # happen after that much (KB) log output.
                                        # 0 disables size based rotation.
#------------------------------------------------------------------------------
# FILE LOCATIONS
#------------------------------------------------------------------------------

pid_file_name = '/var/run/pgpool/pgpool.pid'
                                   # PID file name
                                   # Can be specified as relative to the
                                   # location of pgpool.conf file or
                                   # as an absolute path
                                   # (change requires restart)
logdir = '/tmp'
                                   # Directory of pgPool status file
                                   # (change requires restart)


#------------------------------------------------------------------------------
# CONNECTION POOLING
#------------------------------------------------------------------------------

connection_cache = on
                                   # Activate connection pools
                                   # (change requires restart)

                                   # Semicolon separated list of queries
                                   # to be issued at the end of a session
                                   # The default is for 8.3 and later
reset_query_list = 'ABORT; DISCARD ALL'
                                   # The following one is for 8.2 and before
#reset_query_list = 'ABORT; RESET ALL; SET SESSION AUTHORIZATION DEFAULT'


#------------------------------------------------------------------------------
# REPLICATION MODE
#------------------------------------------------------------------------------

replicate_select = off
                                   # Replicate SELECT statements
                                   # when in replication mode
                                   # replicate_select takes precedence over
                                   # load_balance_mode.

insert_lock = off
                                   # Automatically locks a dummy row or a table
                                   # with INSERT statements to keep SERIAL data
                                   # consistency
                                   # Without SERIAL, no lock will be issued
lobj_lock_table = ''
                                   # When rewriting lo_creat command in
                                   # replication mode, specify table name to
                                   # lock

# - Degenerate handling -

replication_stop_on_mismatch = off
                                   # On disagreement with the packet kind
                                   # sent from backend, degenerate the node
                                   # which is most likely "minority"
                                   # If off, just terminate the session

failover_if_affected_tuples_mismatch = off
                                   # On disagreement with the number of affected
                                   # tuples in UPDATE/DELETE queries, then
                                   # degenerate the node which is most likely
                                   # "minority".
                                   # If off, just abort the transaction to
                                   # keep the consistency


#------------------------------------------------------------------------------
# LOAD BALANCING MODE
#------------------------------------------------------------------------------

load_balance_mode = on
                                   # Activate load balancing mode
                                   # (change requires restart)
ignore_leading_white_space = on
                                   # Ignore leading white spaces of each query
read_only_function_list = ''
                                   # Comma separated list of function names
                                   # that don't write to database
                                   # Regexp are accepted
write_function_list = ''
                                   # Comma separated list of function names
                                   # that write to database
                                   # Regexp are accepted
                                   # If both read_only_function_list and write_function_list
                                   # is empty, function's volatile property is checked.
                                   # If it's volatile, the function is regarded as a
                                   # writing function.

primary_routing_query_pattern_list = ''
                                   # Semicolon separated list of query patterns
                                   # that should be sent to primary node
                                   # Regexp are accepted
                                   # valid for streaming replication mode only.

database_redirect_preference_list = ''
                                   # comma separated list of pairs of database and node id.
                                   # example: postgres:primary,mydb[0-4]:1,mydb[5-9]:2'
                                   # valid for streaming replication mode only.

app_name_redirect_preference_list = ''
                                   # comma separated list of pairs of app name and node id.
                                   # example: 'psql:primary,myapp[0-4]:1,myapp[5-9]:standby'
                                   # valid for streaming replication mode only.
allow_sql_comments = off
                                   # If on, ignore SQL comments when judging whether
                                   # load balancing or query caching is possible.
                                   # If off, SQL comments effectively prevent the judgment
                                   # (pre-3.4 behavior).

disable_load_balance_on_write = 'transaction'
                                   # Load balance behavior when write query is issued
                                   # in an explicit transaction.
                                   #
                                   # Valid values:
                                   #
                                   # 'transaction' (default):
                                   #     if a write query is issued, subsequent
                                   #     read queries will not be load balanced
                                   #     until the transaction ends.
                                   #
                                   # 'trans_transaction':
                                   #     if a write query is issued, subsequent
                                   #     read queries in an explicit transaction
                                   #     will not be load balanced until the session ends.
                                   #
                                   # 'dml_adaptive':
                                   #     Queries on the tables that have already been
                                   #     modified within the current explicit transaction will
                                   #     not be load balanced until the end of the transaction.
                                   #
                                   # 'always':
                                   #     if a write query is issued, read queries will
                                   #     not be load balanced until the session ends.
                                   #
                                   # Note that any query not in an explicit transaction
                                   # is not affected by the parameter.

dml_adaptive_object_relationship_list= ''
                                   # comma separated list of object pairs
                                   # [object]:[dependent-object], to disable load balancing
                                   # of dependent objects within the explicit transaction
                                   # after WRITE statement is issued on (depending-on) object.
                                   #
                                   # example: 'tb_t1:tb_t2,insert_tb_f_func():tb_f,tb_v:my_view'
                                   # Note: function name in this list must also be present in
                                   # the write_function_list
                                   # only valid for disable_load_balance_on_write = 'dml_adaptive'.

statement_level_load_balance = off
                                   # Enables statement level load balancing

#------------------------------------------------------------------------------
# NATIVE REPLICATION MODE
#------------------------------------------------------------------------------

# - Streaming -

sr_check_period = 10
                                   # Streaming replication check period
                                   # Disabled (0) by default
sr_check_user = 'pgpool'
                                   # Streaming replication check user
                                   # This is necessary even if you disable the streaming
                                   # replication delay check by setting sr_check_period = 0
sr_check_password = ''
                                   # Password for streaming replication check user
                                   # Leaving it empty makes Pgpool-II first look for the
                                   # password in the pool_passwd file before using the empty password

sr_check_database = 'postgres'
                                   # Database name for streaming replication check
delay_threshold = 10000000
                                   # Threshold of standby replication delay beyond which
                                   # queries are not dispatched to the standby node
                                   # Unit is in bytes
                                   # Disabled (0) by default

# - Special commands -

follow_primary_command = ''
                                   # Executes this command after main node failover
                                   # Special values:
                                   #   %d = failed node id
                                   #   %h = failed node host name
                                   #   %p = failed node port number
                                   #   %D = failed node database cluster path
                                   #   %m = new main node id
                                   #   %H = new main node hostname
                                   #   %M = old main node id
                                   #   %P = old primary node id
                                   #   %r = new main port number
                                   #   %R = new main database cluster path
                                   #   %N = old primary node hostname
                                   #   %S = old primary node port number
                                   #   %% = '%' character
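
                                   # A commented-out sketch only; the script name and
                                   # path below are illustrative, not part of this setup:
#follow_primary_command = '/etc/pgpool-II/follow_primary.sh %d %h %p %D %m %H %M %P %r %R'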

#------------------------------------------------------------------------------
# HEALTH CHECK GLOBAL PARAMETERS
#------------------------------------------------------------------------------

health_check_period = 5
                                   # Health check period
                                   # Disabled (0) by default
health_check_timeout = 30
                                   # Health check timeout
                                   # 0 means no timeout
health_check_user = 'pgpool'
                                   # Health check user
health_check_password = ''
                                   # Password for health check user
                                   # Leaving it empty makes Pgpool-II first look for the
                                   # password in the pool_passwd file before using the empty password

health_check_database = ''
                                   # Database name for health check. If '', tries 'postgres'
                                   # first, then 'template1'.
health_check_max_retries = 3
                                   # Maximum number of times to retry a failed health check before giving up.
health_check_retry_delay = 1
                                   # Amount of time to wait (in seconds) between retries.
connect_timeout = 10000
                                   # Timeout value in milliseconds before giving up
                                   # connecting to a backend.
                                   # Default is 10000 ms (10 seconds). Users on flaky
                                   # networks may want to increase the value. 0 means no timeout.
                                   # Note that this value is used not only for health checks,
                                   # but also for ordinary connections to backends.

#------------------------------------------------------------------------------
# HEALTH CHECK PER NODE PARAMETERS (OPTIONAL)
#------------------------------------------------------------------------------
#health_check_period0 = 0
#health_check_timeout0 = 20
#health_check_user0 = 'nobody'
#health_check_password0 = ''
#health_check_database0 = ''
#health_check_max_retries0 = 0
#health_check_retry_delay0 = 1
#connect_timeout0 = 10000

#------------------------------------------------------------------------------
# FAILOVER AND FAILBACK
#------------------------------------------------------------------------------

failover_command = '/etc/pgpool-II/failover.sh %d %h %p %D %m %H %M %P %r %R %N %S'
                                   # Executes this command at failover
                                   # Special values:
                                   #   %d = failed node id
                                   #   %h = failed node host name
                                   #   %p = failed node port number
                                   #   %D = failed node database cluster path
                                   #   %m = new main node id
                                   #   %H = new main node hostname
                                   #   %M = old main node id
                                   #   %P = old primary node id
                                   #   %r = new main port number
                                   #   %R = new main database cluster path
                                   #   %N = old primary node hostname
                                   #   %S = old primary node port number
                                   #   %% = '%' character
failback_command = ''
                                   # Executes this command at failback.
                                   # Special values:
                                   #   %d = failed node id
                                   #   %h = failed node host name
                                   #   %p = failed node port number
                                   #   %D = failed node database cluster path
                                   #   %m = new main node id
                                   #   %H = new main node hostname
                                   #   %M = old main node id
                                   #   %P = old primary node id
                                   #   %r = new main port number
                                   #   %R = new main database cluster path
                                   #   %N = old primary node hostname
                                   #   %S = old primary node port number
                                   #   %% = '%' character

failover_on_backend_error = on
                                   # Initiates failover when reading/writing to the
                                   # backend communication socket fails
                                   # If set to off, pgpool will report an
                                   # error and disconnect the session.

detach_false_primary = off
                                   # Detach false primary if on. Only
                                   # valid in streaming replication
                                   # mode and with PostgreSQL 9.6 or
                                   # after.

search_primary_node_timeout = 5min
                                   # Timeout in seconds to search for the
                                   # primary node when a failover occurs.
                                   # 0 means no timeout, keep searching
                                   # for a primary node forever.

#------------------------------------------------------------------------------
# ONLINE RECOVERY
#------------------------------------------------------------------------------

recovery_user = 'postgres'
                                   # Online recovery user
recovery_password = ''
                                   # Online recovery password
                                   # Leaving it empty makes Pgpool-II first look for the
                                   # password in the pool_passwd file before using the empty password

recovery_1st_stage_command = 'recovery_1st_stage'
                                   # Executes a command in first stage
recovery_2nd_stage_command = ''
                                   # Executes a command in second stage
recovery_timeout = 90
                                   # Timeout in seconds to wait for the
                                   # recovering node's postmaster to start up
                                   # 0 means no wait
client_idle_limit_in_recovery = 0
                                   # Client is disconnected after being idle
                                   # for that many seconds in the second stage
                                   # of online recovery
                                   # 0 means no disconnection
                                   # -1 means immediate disconnection

auto_failback = off
                                   # Detached backend nodes are reattached automatically
                                   # if their replication_state is 'streaming'.
auto_failback_interval = 1min
                                   # Minimum interval in seconds between auto_failback
                                   # executions.

#------------------------------------------------------------------------------
# WATCHDOG
#------------------------------------------------------------------------------

# - Enabling -

use_watchdog = on
                                    # Activates watchdog
                                    # (change requires restart)

# - Connection to upstream servers -

trusted_servers = ''
                                    # trusted server list which are used
                                    # to confirm network connection
                                    # (hostA,hostB,hostC,...)
                                    # (change requires restart)
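
                                    # Commented-out illustration only; the gateway
                                    # addresses below are hypothetical for this network:
#trusted_servers = '10.234.226.1,10.234.234.1'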
ping_path = '/bin'
                                    # ping command path
                                    # (change requires restart)

# - Watchdog communication Settings -

hostname0 = '10.234.226.66'
                                    # Host name or IP address of pgpool node
                                    # for watchdog connection
                                    # (change requires restart)
wd_port0 = 9000
                                    # Port number for watchdog service
                                    # (change requires restart)
pgpool_port0 = 5433
                                    # Port number for pgpool
                                    # (change requires restart)

hostname1 = '10.234.234.66'
wd_port1 = 9000
pgpool_port1 = 5433

hostname2 = '10.234.227.161'
wd_port2 = 9000
pgpool_port2 = 5433

wd_priority = 1
                                    # priority of this watchdog in leader election
                                    # (change requires restart)

wd_authkey = ''
                                    # Authentication key for watchdog communication
                                    # (change requires restart)
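
                                    # Commented-out example (the secret is illustrative;
                                    # the same value must be set on every watchdog node):
#wd_authkey = 'some-shared-secret'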

wd_ipc_socket_dir = '/var/run/postgresql'
                                    # Unix domain socket path for watchdog IPC socket
                                    # The Debian package defaults to
                                    # /var/run/postgresql
                                    # (change requires restart)


# - Virtual IP control Setting -
delegate_IP = ''
                                    # delegate IP address
                                    # If this is empty, the virtual IP is never brought up.
                                    # (change requires restart)
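                                    # Commented-out example only; the address below is
                                    # hypothetical and must be a free IP on the pgpool subnet:
#delegate_IP = '10.234.226.100'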
if_cmd_path = '/sbin'
                                    # path to the directory where if_up/down_cmd exists
                                    # If if_up/down_cmd starts with "/", if_cmd_path will be ignored.
                                    # (change requires restart)
if_up_cmd = '/usr/bin/sudo /sbin/ip addr add $_IP_$/24 dev eth0 label eth0:0'
                                    # startup delegate IP command
                                    # (change requires restart)
if_down_cmd = '/usr/bin/sudo /sbin/ip addr del $_IP_$/24 dev eth0'
                                    # shutdown delegate IP command
                                    # (change requires restart)
arping_path = '/usr/sbin'
                                    # arping command path
                                    # If arping_cmd starts with "/", if_cmd_path will be ignored.
                                    # (change requires restart)
arping_cmd = '/usr/bin/sudo /usr/sbin/arping -U $_IP_$ -w 1 -I eth0'
                                    # arping command
                                    # (change requires restart)

# - Behavior on escalation Setting -

clear_memqcache_on_escalation = on
                                    # Clear all the query cache on shared memory
                                    # when a standby pgpool escalates to active pgpool
                                    # (= virtual IP holder).
                                    # This should be off if clients connect to pgpool
                                    # without using the virtual IP.
                                    # (change requires restart)
wd_escalation_command = ''
                                    # Executes this command at escalation on new active pgpool.
                                    # (change requires restart)
wd_de_escalation_command = ''
                                    # Executes this command when leader pgpool resigns from being leader.
                                    # (change requires restart)

# - Watchdog consensus settings for failover -

failover_when_quorum_exists = on
                                    # Only perform backend node failover
                                    # when the watchdog cluster holds the quorum
                                    # (change requires restart)

failover_require_consensus = on
                                    # Perform failover when majority of Pgpool-II nodes
                                    # agrees on the backend node status change
                                    # (change requires restart)

allow_multiple_failover_requests_from_node = off
                                    # A Pgpool-II node can cast multiple votes
                                    # for building the consensus on failover
                                    # (change requires restart)


enable_consensus_with_half_votes = off
                                    # Apply majority rule for consensus and quorum computation
                                    # at 50% of votes in a cluster with an even number of nodes.
                                    # When enabled, quorum and consensus on failover are
                                    # resolved after receiving half of the total votes in
                                    # the cluster; otherwise both decisions require at
                                    # least one vote more than half of the total votes.
                                    # (change requires restart)

# - Lifecheck Setting -

# -- common --

wd_monitoring_interfaces_list = ''  # Comma separated list of interfaces names to monitor.
                                    # If any interface from the list is active, the watchdog
                                    # considers the network fine
                                    # 'any' to enable monitoring on all interfaces except loopback
                                    # '' to disable monitoring
                                    # (change requires restart)

wd_lifecheck_method = 'heartbeat'
                                    # Method of watchdog lifecheck ('heartbeat' or 'query' or 'external')
                                    # (change requires restart)
wd_interval = 10
                                    # lifecheck interval (sec) > 0
                                    # (change requires restart)

# -- heartbeat mode --

heartbeat_hostname0 = '10.234.226.66'
                                    # Host name or IP address used
                                    # for sending heartbeat signal.
                                    # (change requires restart)
heartbeat_port0 = 9694
                                    # Port number used for receiving/sending heartbeat signal
                                    # Usually this is the same as heartbeat_portX.
                                    # (change requires restart)
heartbeat_device0 = ''
                                    # Name of NIC device (such as 'eth0')
                                    # used for sending/receiving heartbeat
                                    # signal to/from destination 0.
                                    # This works only when this is not empty
                                    # and pgpool has root privilege.
                                    # (change requires restart)

heartbeat_hostname1 = '10.234.234.66'
heartbeat_port1 = 9694
heartbeat_device1 = ''
heartbeat_hostname2 = '10.234.227.161'
heartbeat_port2 = 9694
heartbeat_device2 = ''

wd_heartbeat_keepalive = 2
                                    # Interval time of sending heartbeat signal (sec)
                                    # (change requires restart)
wd_heartbeat_deadtime = 30
                                    # Deadtime interval for heartbeat signal (sec)
                                    # (change requires restart)

# -- query mode --

wd_life_point = 3
                                    # lifecheck retry times
                                    # (change requires restart)
wd_lifecheck_query = 'SELECT 1'
                                    # lifecheck query to pgpool from watchdog
                                    # (change requires restart)
wd_lifecheck_dbname = 'template1'
                                    # Database name connected for lifecheck
                                    # (change requires restart)
wd_lifecheck_user = 'nobody'
                                    # watchdog user monitoring pgpools in lifecheck
                                    # (change requires restart)
wd_lifecheck_password = ''
                                    # Password for watchdog user in lifecheck
                                    # Leaving it empty makes Pgpool-II first look for the
                                    # password in the pool_passwd file before using the empty password
                                    # (change requires restart)

#------------------------------------------------------------------------------
# OTHERS
#------------------------------------------------------------------------------
relcache_expire = 0
                                   # Life time of relation cache in seconds.
                                   # 0 means no cache expiration (the default).
                                   # The relation cache is used to cache query results
                                   # against the PostgreSQL system catalog to obtain
                                   # various information, including table structure and
                                   # whether a table is temporary. The cache is
                                   # maintained in each pgpool child's local memory
                                   # and is kept as long as the child survives.
                                   # If someone modifies a table with ALTER TABLE or
                                   # the like, the relcache becomes inconsistent.
                                   # For this purpose, relcache_expire controls the
                                   # life time of the cache.
relcache_size = 256
                                   # Number of relation cache
                                   # entries. If you frequently see
                                   # "pool_search_relcache: cache replacement happend"
                                   # in the pgpool log, you might want to increase this number.

check_temp_table = catalog
                                   # Temporary table check method. catalog, trace or none.
                                   # Default is catalog.

check_unlogged_table = on
                                   # If on, enable unlogged table check in SELECT statements.
                                   # This initiates queries against system catalog of primary/main
                                   # thus increases load of primary.
                                   # If you are absolutely sure that your system never uses unlogged tables
                                   # and you want to save access to primary/main, you could turn this off.
                                   # Default is on.
enable_shared_relcache = on
                                   # If on, the relation cache is stored in the
                                   # in-memory cache and shared among child processes.
                                   # Default is on.
                                   # (change requires restart)

relcache_query_target = primary     # Target node to send relcache queries. Default is primary node.
                                   # If load_balance_node is specified, queries will be sent to load balance node.
#------------------------------------------------------------------------------
# IN MEMORY QUERY MEMORY CACHE
#------------------------------------------------------------------------------
memory_cache_enabled = off
                                   # If on, use the memory cache functionality, off by default
                                   # (change requires restart)
memqcache_method = 'shmem'
                                   # Cache storage method. either 'shmem'(shared memory) or
                                   # 'memcached'. 'shmem' by default
                                   # (change requires restart)
memqcache_memcached_host = 'localhost'
                                   # Memcached host name or IP address. Mandatory if
                                   # memqcache_method = 'memcached'.
                                   # Defaults to localhost.
                                   # (change requires restart)
memqcache_memcached_port = 11211
                                   # Memcached port number. Mandatory if memqcache_method = 'memcached'.
                                   # Defaults to 11211.
                                   # (change requires restart)
memqcache_total_size = 64MB
                                   # Total memory size in bytes for storing memory cache.
                                   # Mandatory if memqcache_method = 'shmem'.
                                   # Defaults to 64MB.
                                   # (change requires restart)
memqcache_max_num_cache = 1000000
                                   # Total number of cache entries. Mandatory
                                   # if memqcache_method = 'shmem'.
                                   # Each cache entry consumes 48 bytes on shared memory.
                                   # Defaults to 1,000,000(45.8MB).
                                   # (change requires restart)
memqcache_expire = 0
                                   # Memory cache entry life time specified in seconds.
                                   # 0 means infinite life time. 0 by default.
                                   # (change requires restart)
memqcache_auto_cache_invalidation = on
                                   # If on, invalidation of query cache is triggered by corresponding
                                   # DDL/DML/DCL(and memqcache_expire).  If off, it is only triggered
                                   # by memqcache_expire.  on by default.
                                   # (change requires restart)
memqcache_maxcache = 400kB
                                   # Maximum SELECT result size in bytes.
                                   # Must be smaller than memqcache_cache_block_size. Defaults to 400KB.
                                   # (change requires restart)
memqcache_cache_block_size = 1MB
                                   # Cache block size in bytes. Mandatory if memqcache_method = 'shmem'.
                                   # Defaults to 1MB.
                                   # (change requires restart)
memqcache_oiddir = '/var/log/pgpool/oiddir'
                                   # Temporary work directory to record table oids
                                   # (change requires restart)
cache_safe_memqcache_table_list = ''
                                   # Comma separated list of table names to memcache
                                   # that don't write to database
                                   # Regexps are accepted
cache_unsafe_memqcache_table_list = ''
                                   # Comma separated list of table names not to memcache
                                   # that don't write to database
                                   # Regexps are accepted

pgpool.conf (47,025 bytes)   

pengbo

2021-06-28 17:54

developer   ~0003885

I think this "timed out" error may be caused by a network failure.
Does the same error appear in pgpool.log on 10.234.234.66?
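For what it's worth, ETIMEDOUT on a read usually means the kernel gave up retransmitting to a peer that stopped responding mid-connection, as opposed to "connection refused", where the peer actively rejects. A minimal local sketch of that failure mode, using an application-level timeout to stand in for the kernel's TCP retransmission timeout:

```python
import socket

# A listener that completes the TCP handshake (via the backlog)
# but never sends anything, i.e. a peer that has gone silent.
srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)

cli = socket.socket()
cli.connect(srv.getsockname())
cli.settimeout(1.0)            # stand-in for the kernel-level timeout
try:
    cli.recv(1)                # peer sends nothing, so the read times out
    result = "got data"
except socket.timeout:
    result = "timed out"
print(result)                  # timed out
srv.close()
cli.close()
```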

Ken

2021-06-28 19:17

reporter   ~0003887

Yes, the same error appears on machine 10.234.234.66. What should I do to stop this error from appearing? Should I increase a timeout? But if the problem were in the network, replication would have problems too, and replication is fine.

What may happen if the cluster keeps hitting this connection timeout? Does it affect the operation of the whole cluster?

Ken

2021-06-28 21:57

reporter   ~0003889

Additionally, the witness does not have this problem. So if the problem were in the network, the witness server would show it too.

Ken

2021-06-29 16:35

reporter   ~0003890

I now see additional entries on the primary and standby. The witness does not have these entries:

2021-06-29 07:47:09: pid 8315: LOG: failover or failback event detected
2021-06-29 07:47:09: pid 8315: DETAIL: restarting myself
2021-06-29 07:47:09: pid 8313: LOG: failover or failback event detected
2021-06-29 07:47:09: pid 8313: DETAIL: restarting myself
2021-06-29 07:47:09: pid 1080: LOG: failover or failback event detected
2021-06-29 07:47:09: pid 1080: DETAIL: restarting myself

The whole log is in the attachment.
primary_log.txt (24,377 bytes)   
2021-06-29 00:10:29: pid 24250: LOG:  Replication of node:1 is behind 133404544 bytes from the primary server (node:0)
2021-06-29 00:10:29: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 00:22:07: pid 15137: LOG:  read from socket failed with error :"Connection timed out"
2021-06-29 00:22:07: pid 15137: LOG:  client socket of 10.234.234.66:5433 Linux m2mdbsz.unx.t-mobile.pl is closed
2021-06-29 00:22:07: pid 15137: LOG:  new outbound connection to 10.234.234.66:9000
2021-06-29 00:29:01: pid 24250: LOG:  Replication of node:1 is behind 14021088 bytes from the primary server (node:0)
2021-06-29 00:29:01: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 00:29:11: pid 24250: LOG:  Replication of node:1 is behind 76323104 bytes from the primary server (node:0)
2021-06-29 00:29:11: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 00:30:42: pid 24250: LOG:  Replication of node:1 is behind 125890696 bytes from the primary server (node:0)
2021-06-29 00:30:42: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 01:01:15: pid 24250: LOG:  Replication of node:1 is behind 58599768 bytes from the primary server (node:0)
2021-06-29 01:01:15: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 01:01:25: pid 24250: LOG:  Replication of node:1 is behind 25261192 bytes from the primary server (node:0)
2021-06-29 01:01:25: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 01:05:25: pid 24250: LOG:  Replication of node:1 is behind 41593728 bytes from the primary server (node:0)
2021-06-29 01:05:25: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 01:06:05: pid 24250: LOG:  Replication of node:1 is behind 10109688 bytes from the primary server (node:0)
2021-06-29 01:06:05: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 01:07:10: pid 15137: LOG:  new watchdog node connection is received from "10.234.234.66:60032"
2021-06-29 01:07:10: pid 15137: LOG:  new node joined the cluster hostname:"10.234.234.66" port:9000 pgpool_port:5433
2021-06-29 01:07:10: pid 15137: DETAIL:  Pgpool-II version:"4.2.2" watchdog messaging version: 1.2
2021-06-29 01:08:56: pid 24250: LOG:  Replication of node:1 is behind 10902232 bytes from the primary server (node:0)
2021-06-29 01:08:56: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 01:20:57: pid 24250: LOG:  Replication of node:1 is behind 13287120 bytes from the primary server (node:0)
2021-06-29 01:20:57: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 01:30:48: pid 24250: LOG:  Replication of node:1 is behind 16779368 bytes from the primary server (node:0)
2021-06-29 01:30:48: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 01:55:41: pid 24250: LOG:  Replication of node:1 is behind 11724752 bytes from the primary server (node:0)
2021-06-29 01:55:41: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 01:55:51: pid 24250: LOG:  Replication of node:1 is behind 44401856 bytes from the primary server (node:0)
2021-06-29 01:55:51: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 01:56:01: pid 24250: LOG:  Replication of node:1 is behind 43045144 bytes from the primary server (node:0)
2021-06-29 01:56:01: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 01:56:22: pid 24250: LOG:  Replication of node:1 is behind 50884912 bytes from the primary server (node:0)
2021-06-29 01:56:22: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 01:56:32: pid 24250: LOG:  Replication of node:1 is behind 145428384 bytes from the primary server (node:0)
2021-06-29 01:56:32: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 02:05:03: pid 24250: LOG:  Replication of node:1 is behind 35012128 bytes from the primary server (node:0)
2021-06-29 02:05:03: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 02:05:43: pid 24250: LOG:  Replication of node:1 is behind 12115048 bytes from the primary server (node:0)
2021-06-29 02:05:43: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 02:20:15: pid 24250: LOG:  Replication of node:1 is behind 10686360 bytes from the primary server (node:0)
2021-06-29 02:20:15: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 02:25:25: pid 24250: LOG:  Replication of node:1 is behind 18296856 bytes from the primary server (node:0)
2021-06-29 02:25:25: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 02:26:15: pid 24250: LOG:  Replication of node:1 is behind 23071936 bytes from the primary server (node:0)
2021-06-29 02:26:15: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 02:26:25: pid 24250: LOG:  Replication of node:1 is behind 167110704 bytes from the primary server (node:0)
2021-06-29 02:26:25: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 02:30:16: pid 24250: LOG:  Replication of node:1 is behind 134225000 bytes from the primary server (node:0)
2021-06-29 02:30:16: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 02:33:44: pid 15137: LOG:  read from socket failed with error :"Connection timed out"
2021-06-29 02:33:44: pid 15137: LOG:  client socket of 10.234.234.66:5433 Linux m2mdbsz.unx.t-mobile.pl is closed
2021-06-29 02:33:44: pid 15137: LOG:  new outbound connection to 10.234.234.66:9000
2021-06-29 02:55:19: pid 24250: LOG:  Replication of node:1 is behind 21476384 bytes from the primary server (node:0)
2021-06-29 02:55:19: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 02:55:29: pid 24250: LOG:  Replication of node:1 is behind 23143320 bytes from the primary server (node:0)
2021-06-29 02:55:29: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 02:55:39: pid 24250: LOG:  Replication of node:1 is behind 92030936 bytes from the primary server (node:0)
2021-06-29 02:55:39: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 03:07:19: pid 15137: LOG:  new watchdog node connection is received from "10.234.234.66:1210"
2021-06-29 03:07:19: pid 15137: LOG:  new node joined the cluster hostname:"10.234.234.66" port:9000 pgpool_port:5433
2021-06-29 03:07:19: pid 15137: DETAIL:  Pgpool-II version:"4.2.2" watchdog messaging version: 1.2
2021-06-29 03:28:02: pid 24250: LOG:  Replication of node:1 is behind 18846568 bytes from the primary server (node:0)
2021-06-29 03:28:02: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 03:30:03: pid 24250: LOG:  Replication of node:1 is behind 28660584 bytes from the primary server (node:0)
2021-06-29 03:30:03: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 03:30:23: pid 24250: LOG:  Replication of node:1 is behind 16765928 bytes from the primary server (node:0)
2021-06-29 03:30:23: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 03:30:33: pid 24250: LOG:  Replication of node:1 is behind 30675344 bytes from the primary server (node:0)
2021-06-29 03:30:33: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 03:55:25: pid 24250: LOG:  Replication of node:1 is behind 14049336 bytes from the primary server (node:0)
2021-06-29 03:55:25: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 04:05:26: pid 24250: LOG:  Replication of node:1 is behind 28285160 bytes from the primary server (node:0)
2021-06-29 04:05:26: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 04:07:06: pid 24250: LOG:  Replication of node:1 is behind 46823888 bytes from the primary server (node:0)
2021-06-29 04:07:06: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 04:07:16: pid 24250: LOG:  Replication of node:1 is behind 218047440 bytes from the primary server (node:0)
2021-06-29 04:07:16: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 04:07:26: pid 24250: LOG:  Replication of node:1 is behind 364145744 bytes from the primary server (node:0)
2021-06-29 04:07:26: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 04:07:36: pid 24250: LOG:  Replication of node:1 is behind 262000600 bytes from the primary server (node:0)
2021-06-29 04:07:36: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 04:09:37: pid 24250: LOG:  Replication of node:1 is behind 79957160 bytes from the primary server (node:0)
2021-06-29 04:09:37: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 04:09:47: pid 24250: LOG:  Replication of node:1 is behind 317662024 bytes from the primary server (node:0)
2021-06-29 04:09:47: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 04:09:57: pid 24250: LOG:  Replication of node:1 is behind 585409152 bytes from the primary server (node:0)
2021-06-29 04:09:57: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 04:10:07: pid 24250: LOG:  Replication of node:1 is behind 776344456 bytes from the primary server (node:0)
2021-06-29 04:10:07: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 04:10:17: pid 24250: LOG:  Replication of node:1 is behind 533161040 bytes from the primary server (node:0)
2021-06-29 04:10:17: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 04:11:58: pid 11771: LOG:  failover or failback event detected
2021-06-29 04:11:58: pid 11771: DETAIL:  restarting myself
2021-06-29 04:11:58: pid 15134: LOG:  child process with pid: 11771 exits with status 256
2021-06-29 04:11:58: pid 15134: LOG:  fork a new child process with pid: 28594
2021-06-29 04:30:19: pid 24250: LOG:  Replication of node:1 is behind 1328266239 bytes from the primary server (node:0)
2021-06-29 04:30:19: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 04:30:30: pid 24250: LOG:  Replication of node:1 is behind 1146858767 bytes from the primary server (node:0)
2021-06-29 04:30:30: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 04:30:40: pid 24250: LOG:  Replication of node:1 is behind 1050516928 bytes from the primary server (node:0)
2021-06-29 04:30:40: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 04:30:50: pid 24250: LOG:  Replication of node:1 is behind 396405416 bytes from the primary server (node:0)
2021-06-29 04:30:50: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 04:36:40: pid 24250: LOG:  Replication of node:1 is behind 86744448 bytes from the primary server (node:0)
2021-06-29 04:36:40: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 04:36:50: pid 24250: LOG:  Replication of node:1 is behind 1197558840 bytes from the primary server (node:0)
2021-06-29 04:36:50: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 04:37:00: pid 24250: LOG:  Replication of node:1 is behind 1748617655 bytes from the primary server (node:0)
2021-06-29 04:37:00: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 04:37:10: pid 24250: LOG:  Replication of node:1 is behind 794987696 bytes from the primary server (node:0)
2021-06-29 04:37:10: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 04:45:21: pid 15137: LOG:  read from socket failed with error :"Connection timed out"
2021-06-29 04:45:21: pid 15137: LOG:  client socket of 10.234.234.66:5433 Linux m2mdbsz.unx.t-mobile.pl is closed
2021-06-29 04:45:21: pid 15137: LOG:  new outbound connection to 10.234.234.66:9000
2021-06-29 04:53:52: pid 24250: LOG:  Replication of node:1 is behind 55982104 bytes from the primary server (node:0)
2021-06-29 04:53:52: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 04:57:03: pid 24250: LOG:  Replication of node:1 is behind 11365512 bytes from the primary server (node:0)
2021-06-29 04:57:03: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 05:01:53: pid 24250: LOG:  Replication of node:1 is behind 12067168 bytes from the primary server (node:0)
2021-06-29 05:01:53: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 05:07:28: pid 15137: LOG:  new watchdog node connection is received from "10.234.234.66:31987"
2021-06-29 05:07:28: pid 15137: LOG:  new node joined the cluster hostname:"10.234.234.66" port:9000 pgpool_port:5433
2021-06-29 05:07:28: pid 15137: DETAIL:  Pgpool-II version:"4.2.2" watchdog messaging version: 1.2
2021-06-29 05:08:04: pid 24250: LOG:  Replication of node:1 is behind 152545072 bytes from the primary server (node:0)
2021-06-29 05:08:04: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 05:08:14: pid 24250: LOG:  Replication of node:1 is behind 305148984 bytes from the primary server (node:0)
2021-06-29 05:08:14: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 05:08:24: pid 24250: LOG:  Replication of node:1 is behind 610603888 bytes from the primary server (node:0)
2021-06-29 05:08:24: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 05:08:34: pid 24250: LOG:  Replication of node:1 is behind 1010388064 bytes from the primary server (node:0)
2021-06-29 05:08:34: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 05:08:44: pid 24250: LOG:  Replication of node:1 is behind 1593603567 bytes from the primary server (node:0)
2021-06-29 05:08:44: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 05:08:54: pid 24250: LOG:  Replication of node:1 is behind 1692611288 bytes from the primary server (node:0)
2021-06-29 05:08:54: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 05:09:04: pid 24250: LOG:  Replication of node:1 is behind 1761608016 bytes from the primary server (node:0)
2021-06-29 05:09:04: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 05:09:14: pid 24250: LOG:  Replication of node:1 is behind 1530934800 bytes from the primary server (node:0)
2021-06-29 05:09:14: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 05:09:24: pid 24250: LOG:  Replication of node:1 is behind 548216760 bytes from the primary server (node:0)
2021-06-29 05:09:24: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 05:30:36: pid 24250: LOG:  Replication of node:1 is behind 18961576 bytes from the primary server (node:0)
2021-06-29 05:30:36: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 05:55:30: pid 24250: LOG:  Replication of node:1 is behind 16302136 bytes from the primary server (node:0)
2021-06-29 05:55:30: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 05:55:40: pid 24250: LOG:  Replication of node:1 is behind 46455536 bytes from the primary server (node:0)
2021-06-29 05:55:40: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 06:05:21: pid 24250: LOG:  Replication of node:1 is behind 12323592 bytes from the primary server (node:0)
2021-06-29 06:05:21: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 06:20:23: pid 24250: LOG:  Replication of node:1 is behind 23056712 bytes from the primary server (node:0)
2021-06-29 06:20:23: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 06:25:03: pid 24250: LOG:  Replication of node:1 is behind 41496904 bytes from the primary server (node:0)
2021-06-29 06:25:03: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 06:25:33: pid 24250: LOG:  Replication of node:1 is behind 19144528 bytes from the primary server (node:0)
2021-06-29 06:25:33: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 06:25:43: pid 24250: LOG:  Replication of node:1 is behind 56200680 bytes from the primary server (node:0)
2021-06-29 06:25:43: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 06:25:53: pid 24250: LOG:  Replication of node:1 is behind 12174536 bytes from the primary server (node:0)
2021-06-29 06:25:53: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 06:26:03: pid 24250: LOG:  Replication of node:1 is behind 10922496 bytes from the primary server (node:0)
2021-06-29 06:26:03: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 06:26:13: pid 24250: LOG:  Replication of node:1 is behind 39313528 bytes from the primary server (node:0)
2021-06-29 06:26:13: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 06:30:04: pid 24250: LOG:  Replication of node:1 is behind 14443576 bytes from the primary server (node:0)
2021-06-29 06:30:04: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 06:30:24: pid 24250: LOG:  Replication of node:1 is behind 14670744 bytes from the primary server (node:0)
2021-06-29 06:30:24: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 06:55:16: pid 24250: LOG:  Replication of node:1 is behind 52451464 bytes from the primary server (node:0)
2021-06-29 06:55:16: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 06:55:26: pid 24250: LOG:  Replication of node:1 is behind 22808808 bytes from the primary server (node:0)
2021-06-29 06:55:26: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 06:55:36: pid 24250: LOG:  Replication of node:1 is behind 14684648 bytes from the primary server (node:0)
2021-06-29 06:55:36: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 06:56:58: pid 15137: LOG:  read from socket failed with error :"Connection timed out"
2021-06-29 06:56:58: pid 15137: LOG:  client socket of 10.234.234.66:5433 Linux m2mdbsz.unx.t-mobile.pl is closed
2021-06-29 06:56:58: pid 15137: LOG:  new outbound connection to 10.234.234.66:9000
2021-06-29 07:05:27: pid 24250: LOG:  Replication of node:1 is behind 28560128 bytes from the primary server (node:0)
2021-06-29 07:05:27: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 07:07:37: pid 15137: LOG:  new watchdog node connection is received from "10.234.234.66:17232"
2021-06-29 07:07:37: pid 15137: LOG:  new node joined the cluster hostname:"10.234.234.66" port:9000 pgpool_port:5433
2021-06-29 07:07:37: pid 15137: DETAIL:  Pgpool-II version:"4.2.2" watchdog messaging version: 1.2
2021-06-29 07:20:39: pid 24250: LOG:  Replication of node:1 is behind 68994168 bytes from the primary server (node:0)
2021-06-29 07:20:39: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 07:28:51: pid 24250: LOG:  Replication of node:1 is behind 10424040 bytes from the primary server (node:0)
2021-06-29 07:28:51: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 07:29:31: pid 24250: LOG:  Replication of node:1 is behind 13306160 bytes from the primary server (node:0)
2021-06-29 07:29:31: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 07:30:41: pid 24250: LOG:  Replication of node:1 is behind 110902768 bytes from the primary server (node:0)
2021-06-29 07:30:41: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 07:41:12: pid 24250: LOG:  Replication of node:1 is behind 14005128 bytes from the primary server (node:0)
2021-06-29 07:41:12: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 07:47:09: pid 8315: LOG:  failover or failback event detected
2021-06-29 07:47:09: pid 8315: DETAIL:  restarting myself
2021-06-29 07:47:09: pid 8313: LOG:  failover or failback event detected
2021-06-29 07:47:09: pid 8313: DETAIL:  restarting myself
2021-06-29 07:47:09: pid 1080: LOG:  failover or failback event detected
2021-06-29 07:47:09: pid 1080: DETAIL:  restarting myself
2021-06-29 07:47:09: pid 15134: LOG:  child process with pid: 8313 exits with status 256
2021-06-29 07:47:09: pid 15134: LOG:  fork a new child process with pid: 5766
2021-06-29 07:47:09: pid 15134: LOG:  child process with pid: 1080 exits with status 256
2021-06-29 07:47:09: pid 15134: LOG:  fork a new child process with pid: 5767
2021-06-29 07:47:09: pid 15134: LOG:  child process with pid: 8315 exits with status 256
2021-06-29 07:47:09: pid 15134: LOG:  fork a new child process with pid: 5768
2021-06-29 07:53:41: pid 15134: LOG:  child process with pid: 10599 exits with status 256
2021-06-29 07:53:41: pid 15134: LOG:  fork a new child process with pid: 9041
2021-06-29 07:53:41: pid 15134: LOG:  child process with pid: 18914 exits with status 256
2021-06-29 07:53:41: pid 15134: LOG:  fork a new child process with pid: 9042
2021-06-29 08:11:41: pid 15134: LOG:  child process with pid: 18915 exits with status 256
2021-06-29 08:11:41: pid 15134: LOG:  fork a new child process with pid: 18834
2021-06-29 08:11:41: pid 15134: LOG:  child process with pid: 5767 exits with status 256
2021-06-29 08:11:41: pid 15134: LOG:  fork a new child process with pid: 18835
2021-06-29 08:11:41: pid 15134: LOG:  child process with pid: 9041 exits with status 256
2021-06-29 08:11:41: pid 15134: LOG:  fork a new child process with pid: 18836
2021-06-29 08:11:41: pid 15134: LOG:  child process with pid: 20467 exits with status 256
2021-06-29 08:11:41: pid 15134: LOG:  fork a new child process with pid: 18837
2021-06-29 08:20:36: pid 24250: LOG:  Replication of node:1 is behind 13785600 bytes from the primary server (node:0)
2021-06-29 08:20:36: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 08:28:57: pid 24250: LOG:  Replication of node:1 is behind 166531920 bytes from the primary server (node:0)
2021-06-29 08:28:57: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 08:30:27: pid 24250: LOG:  Replication of node:1 is behind 60385496 bytes from the primary server (node:0)
2021-06-29 08:30:27: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 08:30:37: pid 24250: LOG:  Replication of node:1 is behind 70722048 bytes from the primary server (node:0)
2021-06-29 08:30:37: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 09:01:32: pid 24250: LOG:  Replication of node:1 is behind 320343736 bytes from the primary server (node:0)
2021-06-29 09:01:32: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 09:01:42: pid 24250: LOG:  Replication of node:1 is behind 260935560 bytes from the primary server (node:0)
2021-06-29 09:01:42: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 09:01:52: pid 24250: LOG:  Replication of node:1 is behind 466766216 bytes from the primary server (node:0)
2021-06-29 09:01:52: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 09:05:22: pid 24250: LOG:  Replication of node:1 is behind 80170136 bytes from the primary server (node:0)
2021-06-29 09:05:22: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 09:07:46: pid 15137: LOG:  new watchdog node connection is received from "10.234.234.66:58761"
2021-06-29 09:07:46: pid 15137: LOG:  new node joined the cluster hostname:"10.234.234.66" port:9000 pgpool_port:5433
2021-06-29 09:07:46: pid 15137: DETAIL:  Pgpool-II version:"4.2.2" watchdog messaging version: 1.2
2021-06-29 09:08:35: pid 15137: LOG:  read from socket failed with error :"Connection timed out"
2021-06-29 09:08:35: pid 15137: LOG:  client socket of 10.234.234.66:5433 Linux m2mdbsz.unx.t-mobile.pl is closed
2021-06-29 09:08:35: pid 15137: LOG:  new outbound connection to 10.234.234.66:9000
2021-06-29 09:10:23: pid 24250: LOG:  Replication of node:1 is behind 47193120 bytes from the primary server (node:0)
2021-06-29 09:10:23: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 09:27:12: pid 15134: LOG:  child process with pid: 18913 exits with status 256
2021-06-29 09:27:12: pid 15134: LOG:  fork a new child process with pid: 31414
2021-06-29 09:28:25: pid 24250: LOG:  Replication of node:1 is behind 32233632 bytes from the primary server (node:0)
2021-06-29 09:28:25: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 09:30:05: pid 24250: LOG:  Replication of node:1 is behind 15912952 bytes from the primary server (node:0)
2021-06-29 09:30:05: pid 24250: CONTEXT:  while checking replication time lag
2021-06-29 09:30:25: pid 24250: LOG:  Replication of node:1 is behind 71441048 bytes from the primary server (node:0)
2021-06-29 09:30:25: pid 24250: CONTEXT:  while checking replication time lag
standby_log.txt (23,673 bytes)   
2021-06-29 00:05:25: pid 2990: LOG:  Replication of node:1 is behind 176513375 bytes from the primary server (node:0)
2021-06-29 00:05:25: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 00:10:26: pid 2990: LOG:  Replication of node:1 is behind 100665528 bytes from the primary server (node:0)
2021-06-29 00:10:26: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 00:10:56: pid 6083: LOG:  read from socket failed with error :"Connection reset by peer"
2021-06-29 00:10:56: pid 6083: LOG:  outbound socket of 10.234.226.66:5433 Linux m2mdban is closed
2021-06-29 00:20:27: pid 2990: LOG:  Replication of node:1 is behind 179354712 bytes from the primary server (node:0)
2021-06-29 00:20:27: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 00:22:08: pid 6083: LOG:  new watchdog node connection is received from "10.234.226.66:36210"
2021-06-29 00:22:08: pid 6083: LOG:  new node joined the cluster hostname:"10.234.226.66" port:9000 pgpool_port:5433
2021-06-29 00:22:08: pid 6083: DETAIL:  Pgpool-II version:"4.2.2" watchdog messaging version: 1.2
2021-06-29 00:28:58: pid 2990: LOG:  Replication of node:1 is behind 48987096 bytes from the primary server (node:0)
2021-06-29 00:28:58: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 00:29:18: pid 2990: LOG:  Replication of node:1 is behind 28469600 bytes from the primary server (node:0)
2021-06-29 00:29:18: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 00:30:38: pid 2990: LOG:  Replication of node:1 is behind 30674688 bytes from the primary server (node:0)
2021-06-29 00:30:38: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 00:59:32: pid 2990: LOG:  Replication of node:1 is behind 11102616 bytes from the primary server (node:0)
2021-06-29 00:59:32: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 01:01:12: pid 2990: LOG:  Replication of node:1 is behind 16781136 bytes from the primary server (node:0)
2021-06-29 01:01:12: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 01:01:22: pid 2990: LOG:  Replication of node:1 is behind 79053728 bytes from the primary server (node:0)
2021-06-29 01:01:22: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 01:01:32: pid 2990: LOG:  Replication of node:1 is behind 48498119 bytes from the primary server (node:0)
2021-06-29 01:01:32: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 01:07:11: pid 6083: LOG:  read from socket failed with error :"Connection reset by peer"
2021-06-29 01:07:11: pid 6083: LOG:  client socket of 10.234.226.66:5433 Linux m2mdban is closed
2021-06-29 01:07:11: pid 6083: LOG:  new outbound connection to 10.234.226.66:9000
2021-06-29 01:22:04: pid 2990: LOG:  Replication of node:1 is behind 17256624 bytes from the primary server (node:0)
2021-06-29 01:22:04: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 01:30:55: pid 2990: LOG:  Replication of node:1 is behind 11275480 bytes from the primary server (node:0)
2021-06-29 01:30:55: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 01:34:16: pid 2990: LOG:  Replication of node:1 is behind 15594896 bytes from the primary server (node:0)
2021-06-29 01:34:16: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 01:55:48: pid 2990: LOG:  Replication of node:1 is behind 22630728 bytes from the primary server (node:0)
2021-06-29 01:55:48: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 01:55:58: pid 2990: LOG:  Replication of node:1 is behind 131529200 bytes from the primary server (node:0)
2021-06-29 01:55:58: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 01:56:08: pid 2990: LOG:  Replication of node:1 is behind 61038000 bytes from the primary server (node:0)
2021-06-29 01:56:08: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 01:56:18: pid 2990: LOG:  Replication of node:1 is behind 28720344 bytes from the primary server (node:0)
2021-06-29 01:56:18: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 01:56:28: pid 2990: LOG:  Replication of node:1 is behind 187760728 bytes from the primary server (node:0)
2021-06-29 01:56:28: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 02:20:01: pid 2990: LOG:  Replication of node:1 is behind 31685408 bytes from the primary server (node:0)
2021-06-29 02:20:01: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 02:20:21: pid 2990: LOG:  Replication of node:1 is behind 73970736 bytes from the primary server (node:0)
2021-06-29 02:20:21: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 02:22:33: pid 6083: LOG:  read from socket failed with error :"Connection reset by peer"
2021-06-29 02:22:33: pid 6083: LOG:  outbound socket of 10.234.226.66:5433 Linux m2mdban is closed
2021-06-29 02:25:32: pid 2990: LOG:  Replication of node:1 is behind 10996096 bytes from the primary server (node:0)
2021-06-29 02:25:32: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 02:26:22: pid 2990: LOG:  Replication of node:1 is behind 88087448 bytes from the primary server (node:0)
2021-06-29 02:26:22: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 02:26:32: pid 2990: LOG:  Replication of node:1 is behind 188221560 bytes from the primary server (node:0)
2021-06-29 02:26:32: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 02:30:12: pid 2990: LOG:  Replication of node:1 is behind 14371472 bytes from the primary server (node:0)
2021-06-29 02:30:12: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 02:30:22: pid 2990: LOG:  Replication of node:1 is behind 118489704 bytes from the primary server (node:0)
2021-06-29 02:30:22: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 02:33:45: pid 6083: LOG:  new watchdog node connection is received from "10.234.226.66:41908"
2021-06-29 02:33:45: pid 6083: LOG:  new node joined the cluster hostname:"10.234.226.66" port:9000 pgpool_port:5433
2021-06-29 02:33:45: pid 6083: DETAIL:  Pgpool-II version:"4.2.2" watchdog messaging version: 1.2
2021-06-29 02:55:25: pid 2990: LOG:  Replication of node:1 is behind 21343184 bytes from the primary server (node:0)
2021-06-29 02:55:25: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 02:55:35: pid 2990: LOG:  Replication of node:1 is behind 16779040 bytes from the primary server (node:0)
2021-06-29 02:55:35: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 03:07:20: pid 6083: LOG:  read from socket failed with error :"Connection reset by peer"
2021-06-29 03:07:20: pid 6083: LOG:  client socket of 10.234.226.66:5433 Linux m2mdban is closed
2021-06-29 03:07:20: pid 6083: LOG:  new outbound connection to 10.234.226.66:9000
2021-06-29 03:28:09: pid 2990: LOG:  Replication of node:1 is behind 19410840 bytes from the primary server (node:0)
2021-06-29 03:28:09: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 03:55:02: pid 2990: LOG:  Replication of node:1 is behind 10346456 bytes from the primary server (node:0)
2021-06-29 03:55:02: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 03:55:32: pid 2990: LOG:  Replication of node:1 is behind 12978064 bytes from the primary server (node:0)
2021-06-29 03:55:32: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 03:56:22: pid 2990: LOG:  Replication of node:1 is behind 10950424 bytes from the primary server (node:0)
2021-06-29 03:56:22: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 04:07:03: pid 2990: LOG:  Replication of node:1 is behind 104605000 bytes from the primary server (node:0)
2021-06-29 04:07:03: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 04:07:13: pid 2990: LOG:  Replication of node:1 is behind 134902112 bytes from the primary server (node:0)
2021-06-29 04:07:13: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 04:07:23: pid 2990: LOG:  Replication of node:1 is behind 303472656 bytes from the primary server (node:0)
2021-06-29 04:07:23: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 04:07:33: pid 2990: LOG:  Replication of node:1 is behind 477477232 bytes from the primary server (node:0)
2021-06-29 04:07:33: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 04:09:35: pid 2990: LOG:  Replication of node:1 is behind 33204512 bytes from the primary server (node:0)
2021-06-29 04:09:35: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 04:09:45: pid 2990: LOG:  Replication of node:1 is behind 225770040 bytes from the primary server (node:0)
2021-06-29 04:09:45: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 04:09:55: pid 2990: LOG:  Replication of node:1 is behind 500162592 bytes from the primary server (node:0)
2021-06-29 04:09:55: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 04:10:05: pid 2990: LOG:  Replication of node:1 is behind 645736240 bytes from the primary server (node:0)
2021-06-29 04:10:05: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 04:10:15: pid 2990: LOG:  Replication of node:1 is behind 809913832 bytes from the primary server (node:0)
2021-06-29 04:10:15: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 04:30:17: pid 2990: LOG:  Replication of node:1 is behind 800400288 bytes from the primary server (node:0)
2021-06-29 04:30:17: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 04:30:27: pid 2990: LOG:  Replication of node:1 is behind 1312596279 bytes from the primary server (node:0)
2021-06-29 04:30:27: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 04:30:37: pid 2990: LOG:  Replication of node:1 is behind 1112920712 bytes from the primary server (node:0)
2021-06-29 04:30:37: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 04:30:47: pid 2990: LOG:  Replication of node:1 is behind 592686048 bytes from the primary server (node:0)
2021-06-29 04:30:47: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 04:33:47: pid 2990: LOG:  Replication of node:1 is behind 12249208 bytes from the primary server (node:0)
2021-06-29 04:33:47: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 04:34:10: pid 6083: LOG:  read from socket failed with error :"Connection reset by peer"
2021-06-29 04:34:10: pid 6083: LOG:  outbound socket of 10.234.226.66:5433 Linux m2mdban is closed
2021-06-29 04:36:48: pid 2990: LOG:  Replication of node:1 is behind 823102112 bytes from the primary server (node:0)
2021-06-29 04:36:48: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 04:36:58: pid 2990: LOG:  Replication of node:1 is behind 1849934583 bytes from the primary server (node:0)
2021-06-29 04:36:58: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 04:37:08: pid 2990: LOG:  Replication of node:1 is behind 1108250695 bytes from the primary server (node:0)
2021-06-29 04:37:08: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 04:37:18: pid 2990: LOG:  Replication of node:1 is behind 195220240 bytes from the primary server (node:0)
2021-06-29 04:37:18: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 04:45:22: pid 6083: LOG:  new watchdog node connection is received from "10.234.226.66:33782"
2021-06-29 04:45:22: pid 6083: LOG:  new node joined the cluster hostname:"10.234.226.66" port:9000 pgpool_port:5433
2021-06-29 04:45:22: pid 6083: DETAIL:  Pgpool-II version:"4.2.2" watchdog messaging version: 1.2
2021-06-29 04:53:50: pid 2990: LOG:  Replication of node:1 is behind 17337904 bytes from the primary server (node:0)
2021-06-29 04:53:50: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 05:01:21: pid 2990: LOG:  Replication of node:1 is behind 27242680 bytes from the primary server (node:0)
2021-06-29 05:01:21: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 05:07:29: pid 6083: LOG:  read from socket failed with error :"Connection reset by peer"
2021-06-29 05:07:29: pid 6083: LOG:  client socket of 10.234.226.66:5433 Linux m2mdban is closed
2021-06-29 05:07:29: pid 6083: LOG:  new outbound connection to 10.234.226.66:9000
2021-06-29 05:08:02: pid 2990: LOG:  Replication of node:1 is behind 44491936 bytes from the primary server (node:0)
2021-06-29 05:08:02: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 05:08:12: pid 2990: LOG:  Replication of node:1 is behind 266210424 bytes from the primary server (node:0)
2021-06-29 05:08:12: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 05:08:22: pid 2990: LOG:  Replication of node:1 is behind 519484944 bytes from the primary server (node:0)
2021-06-29 05:08:22: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 05:08:32: pid 2990: LOG:  Replication of node:1 is behind 852377160 bytes from the primary server (node:0)
2021-06-29 05:08:32: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 05:08:42: pid 2990: LOG:  Replication of node:1 is behind 1504522215 bytes from the primary server (node:0)
2021-06-29 05:08:42: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 05:08:52: pid 2990: LOG:  Replication of node:1 is behind 1604292855 bytes from the primary server (node:0)
2021-06-29 05:08:52: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 05:09:02: pid 2990: LOG:  Replication of node:1 is behind 1628427064 bytes from the primary server (node:0)
2021-06-29 05:09:02: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 05:09:12: pid 2990: LOG:  Replication of node:1 is behind 1805297336 bytes from the primary server (node:0)
2021-06-29 05:09:12: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 05:09:22: pid 2990: LOG:  Replication of node:1 is behind 834576472 bytes from the primary server (node:0)
2021-06-29 05:09:22: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 05:32:05: pid 2990: LOG:  Replication of node:1 is behind 16682232 bytes from the primary server (node:0)
2021-06-29 05:32:05: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 05:55:38: pid 2990: LOG:  Replication of node:1 is behind 26308576 bytes from the primary server (node:0)
2021-06-29 05:55:38: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 05:55:58: pid 2990: LOG:  Replication of node:1 is behind 10007760 bytes from the primary server (node:0)
2021-06-29 05:55:58: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 06:20:21: pid 2990: LOG:  Replication of node:1 is behind 10809264 bytes from the primary server (node:0)
2021-06-29 06:20:21: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 06:25:32: pid 2990: LOG:  Replication of node:1 is behind 12058192 bytes from the primary server (node:0)
2021-06-29 06:25:32: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 06:25:42: pid 2990: LOG:  Replication of node:1 is behind 22942992 bytes from the primary server (node:0)
2021-06-29 06:25:42: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 06:26:02: pid 2990: LOG:  Replication of node:1 is behind 13879392 bytes from the primary server (node:0)
2021-06-29 06:26:02: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 06:26:12: pid 2990: LOG:  Replication of node:1 is behind 138906200 bytes from the primary server (node:0)
2021-06-29 06:26:12: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 06:26:32: pid 2990: LOG:  Replication of node:1 is behind 12469056 bytes from the primary server (node:0)
2021-06-29 06:26:32: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 06:30:03: pid 2990: LOG:  Replication of node:1 is behind 13686272 bytes from the primary server (node:0)
2021-06-29 06:30:03: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 06:30:23: pid 2990: LOG:  Replication of node:1 is behind 21356872 bytes from the primary server (node:0)
2021-06-29 06:30:23: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 06:45:47: pid 6083: LOG:  read from socket failed with error :"Connection reset by peer"
2021-06-29 06:45:47: pid 6083: LOG:  outbound socket of 10.234.226.66:5433 Linux m2mdban is closed
2021-06-29 06:55:26: pid 2990: LOG:  Replication of node:1 is behind 15357432 bytes from the primary server (node:0)
2021-06-29 06:55:26: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 06:56:59: pid 6083: LOG:  new watchdog node connection is received from "10.234.226.66:4699"
2021-06-29 06:56:59: pid 6083: LOG:  new node joined the cluster hostname:"10.234.226.66" port:9000 pgpool_port:5433
2021-06-29 06:56:59: pid 6083: DETAIL:  Pgpool-II version:"4.2.2" watchdog messaging version: 1.2
2021-06-29 07:05:27: pid 2990: LOG:  Replication of node:1 is behind 13370992 bytes from the primary server (node:0)
2021-06-29 07:05:27: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 07:07:38: pid 6083: LOG:  read from socket failed with error :"Connection reset by peer"
2021-06-29 07:07:38: pid 6083: LOG:  client socket of 10.234.226.66:5433 Linux m2mdban is closed
2021-06-29 07:07:38: pid 6083: LOG:  new outbound connection to 10.234.226.66:9000
2021-06-29 07:20:39: pid 2990: LOG:  Replication of node:1 is behind 19195600 bytes from the primary server (node:0)
2021-06-29 07:20:39: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 07:29:31: pid 2990: LOG:  Replication of node:1 is behind 21679512 bytes from the primary server (node:0)
2021-06-29 07:29:31: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 07:30:41: pid 2990: LOG:  Replication of node:1 is behind 138639448 bytes from the primary server (node:0)
2021-06-29 07:30:41: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 07:41:13: pid 2990: LOG:  Replication of node:1 is behind 13710216 bytes from the primary server (node:0)
2021-06-29 07:41:13: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 07:47:10: pid 20237: LOG:  failover or failback event detected
2021-06-29 07:47:10: pid 20237: DETAIL:  restarting myself
2021-06-29 07:47:10: pid 6079: LOG:  child process with pid: 20237 exits with status 256
2021-06-29 07:47:10: pid 6079: LOG:  fork a new child process with pid: 8372
2021-06-29 07:47:10: pid 20304: LOG:  selecting backend connection
2021-06-29 07:47:10: pid 20304: DETAIL:  failover or failback event detected, discarding existing connections
2021-06-29 07:47:10: pid 20286: LOG:  selecting backend connection
2021-06-29 07:47:10: pid 20286: DETAIL:  failover or failback event detected, discarding existing connections
2021-06-29 07:53:42: pid 6079: LOG:  child process with pid: 13014 exits with status 256
2021-06-29 07:53:42: pid 6079: LOG:  fork a new child process with pid: 10207
2021-06-29 07:53:42: pid 6079: LOG:  child process with pid: 1356 exits with status 256
2021-06-29 07:53:42: pid 6079: LOG:  fork a new child process with pid: 10208
2021-06-29 07:53:42: pid 6079: LOG:  child process with pid: 20304 exits with status 256
2021-06-29 07:53:42: pid 6079: LOG:  fork a new child process with pid: 10209
2021-06-29 07:53:42: pid 6079: LOG:  child process with pid: 3137 exits with status 256
2021-06-29 07:53:42: pid 6079: LOG:  fork a new child process with pid: 10210
2021-06-29 07:53:42: pid 6079: LOG:  child process with pid: 4136 exits with status 256
2021-06-29 07:53:42: pid 6079: LOG:  fork a new child process with pid: 10211
2021-06-29 07:53:42: pid 6079: LOG:  child process with pid: 5914 exits with status 256
2021-06-29 07:53:42: pid 6079: LOG:  fork a new child process with pid: 10212
2021-06-29 08:11:42: pid 6079: LOG:  child process with pid: 10207 exits with status 256
2021-06-29 08:11:42: pid 6079: LOG:  fork a new child process with pid: 15316
2021-06-29 08:11:42: pid 6079: LOG:  child process with pid: 10209 exits with status 256
2021-06-29 08:11:42: pid 6079: LOG:  fork a new child process with pid: 15317
2021-06-29 08:11:42: pid 6079: LOG:  child process with pid: 10208 exits with status 256
2021-06-29 08:11:42: pid 6079: LOG:  fork a new child process with pid: 15318
2021-06-29 08:11:42: pid 6079: LOG:  child process with pid: 8372 exits with status 256
2021-06-29 08:11:42: pid 6079: LOG:  fork a new child process with pid: 15319
2021-06-29 08:28:58: pid 2990: LOG:  Replication of node:1 is behind 142914336 bytes from the primary server (node:0)
2021-06-29 08:28:58: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 08:30:28: pid 2990: LOG:  Replication of node:1 is behind 64445752 bytes from the primary server (node:0)
2021-06-29 08:30:28: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 08:30:38: pid 2990: LOG:  Replication of node:1 is behind 85200912 bytes from the primary server (node:0)
2021-06-29 08:30:38: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 08:57:24: pid 6083: LOG:  read from socket failed with error :"Connection reset by peer"
2021-06-29 08:57:24: pid 6083: LOG:  outbound socket of 10.234.226.66:5433 Linux m2mdban is closed
2021-06-29 09:01:33: pid 2990: LOG:  Replication of node:1 is behind 318634880 bytes from the primary server (node:0)
2021-06-29 09:01:33: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 09:01:43: pid 2990: LOG:  Replication of node:1 is behind 263166792 bytes from the primary server (node:0)
2021-06-29 09:01:43: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 09:01:53: pid 2990: LOG:  Replication of node:1 is behind 467253920 bytes from the primary server (node:0)
2021-06-29 09:01:53: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 09:05:23: pid 2990: LOG:  Replication of node:1 is behind 74451328 bytes from the primary server (node:0)
2021-06-29 09:05:23: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 09:07:47: pid 6083: LOG:  read from socket failed with error :"Connection reset by peer"
2021-06-29 09:07:47: pid 6083: LOG:  client socket of 10.234.226.66:5433 Linux m2mdban is closed
2021-06-29 09:07:47: pid 6083: LOG:  new outbound connection to 10.234.226.66:9000
2021-06-29 09:08:36: pid 6083: LOG:  new watchdog node connection is received from "10.234.226.66:12955"
2021-06-29 09:08:36: pid 6083: LOG:  new node joined the cluster hostname:"10.234.226.66" port:9000 pgpool_port:5433
2021-06-29 09:08:36: pid 6083: DETAIL:  Pgpool-II version:"4.2.2" watchdog messaging version: 1.2
2021-06-29 09:10:04: pid 2990: LOG:  Replication of node:1 is behind 10226368 bytes from the primary server (node:0)
2021-06-29 09:10:04: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 09:10:24: pid 2990: LOG:  Replication of node:1 is behind 50857656 bytes from the primary server (node:0)
2021-06-29 09:10:24: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 09:28:26: pid 2990: LOG:  Replication of node:1 is behind 19693960 bytes from the primary server (node:0)
2021-06-29 09:28:26: pid 2990: CONTEXT:  while checking replication time lag
2021-06-29 09:30:26: pid 2990: LOG:  Replication of node:1 is behind 70674064 bytes from the primary server (node:0)
2021-06-29 09:30:26: pid 2990: CONTEXT:  while checking replication time lag
witness_log.txt (15,882 bytes)   
2021-06-29 00:05:21: pid 23556: LOG:  Replication of node:1 is behind 70688232 bytes from the primary server (node:0)
2021-06-29 00:05:21: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 00:20:23: pid 23556: LOG:  Replication of node:1 is behind 93510544 bytes from the primary server (node:0)
2021-06-29 00:20:23: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 00:29:04: pid 23556: LOG:  Replication of node:1 is behind 43173688 bytes from the primary server (node:0)
2021-06-29 00:29:04: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 00:30:45: pid 23556: LOG:  Replication of node:1 is behind 82844648 bytes from the primary server (node:0)
2021-06-29 00:30:45: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 00:57:08: pid 23556: LOG:  Replication of node:1 is behind 16287048 bytes from the primary server (node:0)
2021-06-29 00:57:08: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 01:01:18: pid 23556: LOG:  Replication of node:1 is behind 106494392 bytes from the primary server (node:0)
2021-06-29 01:01:18: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 01:05:29: pid 23556: LOG:  Replication of node:1 is behind 184550584 bytes from the primary server (node:0)
2021-06-29 01:05:29: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 01:32:03: pid 23556: LOG:  Replication of node:1 is behind 10654792 bytes from the primary server (node:0)
2021-06-29 01:32:03: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 01:34:13: pid 23556: LOG:  Replication of node:1 is behind 12739792 bytes from the primary server (node:0)
2021-06-29 01:34:13: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 01:55:05: pid 23556: LOG:  Replication of node:1 is behind 22054680 bytes from the primary server (node:0)
2021-06-29 01:55:05: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 01:55:55: pid 23556: LOG:  Replication of node:1 is behind 91099040 bytes from the primary server (node:0)
2021-06-29 01:55:55: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 01:56:05: pid 23556: LOG:  Replication of node:1 is behind 24831304 bytes from the primary server (node:0)
2021-06-29 01:56:05: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 01:56:26: pid 23556: LOG:  Replication of node:1 is behind 79351184 bytes from the primary server (node:0)
2021-06-29 01:56:26: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 02:20:29: pid 23556: LOG:  Replication of node:1 is behind 41241784 bytes from the primary server (node:0)
2021-06-29 02:20:29: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 02:25:30: pid 23556: LOG:  Replication of node:1 is behind 18529648 bytes from the primary server (node:0)
2021-06-29 02:25:30: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 02:26:20: pid 23556: LOG:  Replication of node:1 is behind 44386256 bytes from the primary server (node:0)
2021-06-29 02:26:20: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 02:26:30: pid 23556: LOG:  Replication of node:1 is behind 227139640 bytes from the primary server (node:0)
2021-06-29 02:26:30: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 02:30:21: pid 23556: LOG:  Replication of node:1 is behind 206151784 bytes from the primary server (node:0)
2021-06-29 02:30:21: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 02:55:25: pid 23556: LOG:  Replication of node:1 is behind 14114840 bytes from the primary server (node:0)
2021-06-29 02:55:25: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 03:28:09: pid 23556: LOG:  Replication of node:1 is behind 15339416 bytes from the primary server (node:0)
2021-06-29 03:28:09: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 03:30:19: pid 23556: LOG:  Replication of node:1 is behind 16782568 bytes from the primary server (node:0)
2021-06-29 03:30:19: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 04:07:04: pid 23556: LOG:  Replication of node:1 is behind 101984544 bytes from the primary server (node:0)
2021-06-29 04:07:04: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 04:07:14: pid 23556: LOG:  Replication of node:1 is behind 151462752 bytes from the primary server (node:0)
2021-06-29 04:07:14: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 04:07:24: pid 23556: LOG:  Replication of node:1 is behind 314299104 bytes from the primary server (node:0)
2021-06-29 04:07:24: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 04:07:34: pid 23556: LOG:  Replication of node:1 is behind 470608432 bytes from the primary server (node:0)
2021-06-29 04:07:34: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 04:09:44: pid 23556: LOG:  Replication of node:1 is behind 214143800 bytes from the primary server (node:0)
2021-06-29 04:09:44: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 04:09:54: pid 23556: LOG:  Replication of node:1 is behind 477149128 bytes from the primary server (node:0)
2021-06-29 04:09:54: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 04:10:04: pid 23556: LOG:  Replication of node:1 is behind 665042504 bytes from the primary server (node:0)
2021-06-29 04:10:04: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 04:10:14: pid 23556: LOG:  Replication of node:1 is behind 901907088 bytes from the primary server (node:0)
2021-06-29 04:10:14: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 04:10:54: pid 23556: LOG:  Replication of node:1 is behind 27791896 bytes from the primary server (node:0)
2021-06-29 04:10:54: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 04:20:45: pid 23556: LOG:  Replication of node:1 is behind 17679464 bytes from the primary server (node:0)
2021-06-29 04:20:45: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 04:30:17: pid 23556: LOG:  Replication of node:1 is behind 688768920 bytes from the primary server (node:0)
2021-06-29 04:30:17: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 04:30:27: pid 23556: LOG:  Replication of node:1 is behind 1371738663 bytes from the primary server (node:0)
2021-06-29 04:30:27: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 04:30:37: pid 23556: LOG:  Replication of node:1 is behind 1134302584 bytes from the primary server (node:0)
2021-06-29 04:30:37: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 04:30:47: pid 23556: LOG:  Replication of node:1 is behind 652029536 bytes from the primary server (node:0)
2021-06-29 04:30:47: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 04:33:47: pid 23556: LOG:  Replication of node:1 is behind 16780160 bytes from the primary server (node:0)
2021-06-29 04:33:47: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 04:36:47: pid 23556: LOG:  Replication of node:1 is behind 722309264 bytes from the primary server (node:0)
2021-06-29 04:36:47: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 04:36:57: pid 23556: LOG:  Replication of node:1 is behind 1772802119 bytes from the primary server (node:0)
2021-06-29 04:36:57: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 04:37:07: pid 23556: LOG:  Replication of node:1 is behind 1183484247 bytes from the primary server (node:0)
2021-06-29 04:37:07: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 04:37:17: pid 23556: LOG:  Replication of node:1 is behind 268736328 bytes from the primary server (node:0)
2021-06-29 04:37:17: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 05:01:22: pid 23556: LOG:  Replication of node:1 is behind 22669616 bytes from the primary server (node:0)
2021-06-29 05:01:22: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 05:08:03: pid 23556: LOG:  Replication of node:1 is behind 44234200 bytes from the primary server (node:0)
2021-06-29 05:08:03: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 05:08:13: pid 23556: LOG:  Replication of node:1 is behind 266298280 bytes from the primary server (node:0)
2021-06-29 05:08:13: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 05:08:23: pid 23556: LOG:  Replication of node:1 is behind 518640816 bytes from the primary server (node:0)
2021-06-29 05:08:23: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 05:08:33: pid 23556: LOG:  Replication of node:1 is behind 852275416 bytes from the primary server (node:0)
2021-06-29 05:08:33: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 05:08:43: pid 23556: LOG:  Replication of node:1 is behind 1504818831 bytes from the primary server (node:0)
2021-06-29 05:08:43: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 05:08:53: pid 23556: LOG:  Replication of node:1 is behind 1607090175 bytes from the primary server (node:0)
2021-06-29 05:08:53: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 05:09:03: pid 23556: LOG:  Replication of node:1 is behind 1627392088 bytes from the primary server (node:0)
2021-06-29 05:09:03: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 05:09:13: pid 23556: LOG:  Replication of node:1 is behind 1803102840 bytes from the primary server (node:0)
2021-06-29 05:09:13: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 05:09:23: pid 23556: LOG:  Replication of node:1 is behind 832285736 bytes from the primary server (node:0)
2021-06-29 05:09:23: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 05:55:28: pid 23556: LOG:  Replication of node:1 is behind 17664336 bytes from the primary server (node:0)
2021-06-29 05:55:28: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 05:55:38: pid 23556: LOG:  Replication of node:1 is behind 26457808 bytes from the primary server (node:0)
2021-06-29 05:55:38: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 05:55:48: pid 23556: LOG:  Replication of node:1 is behind 12462384 bytes from the primary server (node:0)
2021-06-29 05:55:48: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 05:55:58: pid 23556: LOG:  Replication of node:1 is behind 30543152 bytes from the primary server (node:0)
2021-06-29 05:55:58: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 06:05:29: pid 23556: LOG:  Replication of node:1 is behind 12132832 bytes from the primary server (node:0)
2021-06-29 06:05:29: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 06:20:22: pid 23556: LOG:  Replication of node:1 is behind 15070072 bytes from the primary server (node:0)
2021-06-29 06:20:22: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 06:25:43: pid 23556: LOG:  Replication of node:1 is behind 10552808 bytes from the primary server (node:0)
2021-06-29 06:25:43: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 06:25:53: pid 23556: LOG:  Replication of node:1 is behind 11844784 bytes from the primary server (node:0)
2021-06-29 06:25:53: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 06:26:03: pid 23556: LOG:  Replication of node:1 is behind 19620840 bytes from the primary server (node:0)
2021-06-29 06:26:03: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 06:26:13: pid 23556: LOG:  Replication of node:1 is behind 136836304 bytes from the primary server (node:0)
2021-06-29 06:26:13: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 06:55:26: pid 23556: LOG:  Replication of node:1 is behind 15181408 bytes from the primary server (node:0)
2021-06-29 06:55:26: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 07:05:27: pid 23556: LOG:  Replication of node:1 is behind 15470344 bytes from the primary server (node:0)
2021-06-29 07:05:27: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 07:20:40: pid 23556: LOG:  Replication of node:1 is behind 41345280 bytes from the primary server (node:0)
2021-06-29 07:20:40: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 07:29:31: pid 23556: LOG:  Replication of node:1 is behind 14453232 bytes from the primary server (node:0)
2021-06-29 07:29:31: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 07:30:41: pid 23556: LOG:  Replication of node:1 is behind 148510664 bytes from the primary server (node:0)
2021-06-29 07:30:41: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 07:41:13: pid 23556: LOG:  Replication of node:1 is behind 13915016 bytes from the primary server (node:0)
2021-06-29 07:41:13: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 07:56:05: pid 23556: LOG:  Replication of node:1 is behind 14291744 bytes from the primary server (node:0)
2021-06-29 07:56:05: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 08:20:38: pid 23556: LOG:  Replication of node:1 is behind 21249072 bytes from the primary server (node:0)
2021-06-29 08:20:38: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 08:28:59: pid 23556: LOG:  Replication of node:1 is behind 176490536 bytes from the primary server (node:0)
2021-06-29 08:28:59: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 08:30:29: pid 23556: LOG:  Replication of node:1 is behind 43422536 bytes from the primary server (node:0)
2021-06-29 08:30:29: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 08:30:39: pid 23556: LOG:  Replication of node:1 is behind 37924232 bytes from the primary server (node:0)
2021-06-29 08:30:39: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 09:00:04: pid 23556: LOG:  Replication of node:1 is behind 29481112 bytes from the primary server (node:0)
2021-06-29 09:00:04: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 09:01:34: pid 23556: LOG:  Replication of node:1 is behind 334392920 bytes from the primary server (node:0)
2021-06-29 09:01:34: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 09:01:44: pid 23556: LOG:  Replication of node:1 is behind 297161288 bytes from the primary server (node:0)
2021-06-29 09:01:44: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 09:01:54: pid 23556: LOG:  Replication of node:1 is behind 486774152 bytes from the primary server (node:0)
2021-06-29 09:01:54: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 09:05:05: pid 23556: LOG:  Replication of node:1 is behind 10937776 bytes from the primary server (node:0)
2021-06-29 09:05:05: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 09:05:25: pid 23556: LOG:  Replication of node:1 is behind 74505880 bytes from the primary server (node:0)
2021-06-29 09:05:25: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 09:10:05: pid 23556: LOG:  Replication of node:1 is behind 14035056 bytes from the primary server (node:0)
2021-06-29 09:10:05: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 09:10:25: pid 23556: LOG:  Replication of node:1 is behind 59770632 bytes from the primary server (node:0)
2021-06-29 09:10:25: pid 23556: CONTEXT:  while checking replication time lag
2021-06-29 09:30:28: pid 23556: LOG:  Replication of node:1 is behind 98808040 bytes from the primary server (node:0)
2021-06-29 09:30:28: pid 23556: CONTEXT:  while checking replication time lag
witness_log.txt (15,882 bytes)   

pengbo

2021-06-30 11:05

developer   ~0003891

> Yes, the same error also appears on machine 10.234.234.66. What should I do to stop this error from appearing? Maybe increase a timeout?
> But if there were a network problem, replication would have problems too, and replication is fine.

It may be caused by a network failure between 10.234.226.66 and 10.234.234.66.

There are many replication delay messages in the log files, which suggests that the network connection between 10.234.226.66 and 10.234.234.66 is unstable.

---
2021-06-29 01:05:25: pid 24250: LOG: Replication of node:1 is behind 41593728 bytes from the primary server (node:0)
2021-06-29 01:05:25: pid 24250: CONTEXT: while checking replication time lag
2021-06-29 01:06:05: pid 24250: LOG: Replication of node:1 is behind 10109688 bytes from the primary server (node:0)
2021-06-29 01:06:05: pid 24250: CONTEXT: while checking replication time lag
---
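The byte figures in these lag messages are LSN differences: pgpool periodically compares the primary's current WAL write position with the standby's replay position, the same arithmetic PostgreSQL exposes as pg_wal_lsn_diff(). A minimal sketch of that computation follows; the function names and sample LSNs are illustrative, not pgpool's actual code:

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '16/B374D848' to an absolute byte position.

    An LSN is two hex fields: the high 32 bits and the low 32 bits of a
    64-bit WAL byte offset.
    """
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def replication_lag_bytes(primary_lsn: str, standby_lsn: str) -> int:
    """Byte distance between the primary's WAL position and the standby's replay position."""
    return lsn_to_bytes(primary_lsn) - lsn_to_bytes(standby_lsn)


# Hypothetical LSNs producing a lag of the same order as the log entries above
print(replication_lag_bytes("1A/F0000000", "1A/8F000000"))
```

A lag that repeatedly grows into the hundreds of megabytes and then drains, as in the log above, points at the link between the nodes rather than at the standby's apply speed.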

Ken

2021-07-12 16:35

reporter   ~0003896

This topic can be closed.

You were right: the problem is the network. I ran a test by building a new environment with the same infrastructure as production, and the test environment began logging the same entries as production.
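As a side note on the earlier timeout question: when the link itself is flaky, the usual approach is to relax pgpool's health-check retry settings so that a transient stall is retried instead of immediately failing a node. The values below are a sketch only, not tuned recommendations:

```ini
# pgpool.conf (illustrative values; tune to your network)
health_check_period = 30        # seconds between health checks
health_check_timeout = 20       # allow a slow link more time before a check fails
health_check_max_retries = 3    # retry transient failures before declaring the node down
health_check_retry_delay = 5    # seconds between retries
```

Raising these does not fix the underlying network, as this ticket shows; it only reduces noise and false detach events while the network issue is investigated.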

pengbo

2021-07-12 17:37

developer   ~0003897

I am going to close this issue.

Issue History

Date Modified Username Field Change
2021-06-25 17:53 Ken New Issue
2021-06-25 17:53 Ken File Added: logs.txt
2021-06-25 17:53 Ken File Added: pgpool.conf
2021-06-28 17:54 pengbo Note Added: 0003885
2021-06-28 17:54 pengbo Assigned To => pengbo
2021-06-28 17:54 pengbo Status new => feedback
2021-06-28 17:54 pengbo Description Updated
2021-06-28 19:17 Ken Note Added: 0003887
2021-06-28 19:17 Ken Status feedback => assigned
2021-06-28 21:57 Ken Note Added: 0003889
2021-06-29 16:35 Ken Note Added: 0003890
2021-06-29 16:35 Ken File Added: primary_log.txt
2021-06-29 16:35 Ken File Added: standby_log.txt
2021-06-29 16:35 Ken File Added: witness_log.txt
2021-06-30 11:05 pengbo Note Added: 0003891
2021-06-30 11:08 administrator Status assigned => feedback
2021-07-12 16:35 Ken Note Added: 0003896
2021-07-12 16:35 Ken Status feedback => assigned
2021-07-12 17:37 pengbo Note Added: 0003897
2021-07-12 17:37 pengbo Status assigned => closed