[pgpool-general: 5013] New connection to master cannot initiate when slave goes down

Krzysztof Mościcki stivi at kity.pl
Thu Sep 22 06:12:27 JST 2016


Hello

I have a problem initiating new connections to the master node through 
pgpool when the slave node goes down.

My configuration uses streaming replication mode, with 
fail_over_on_backend_error set to off. The system is Ubuntu 14.04 with 
Pgpool 3.5.3 plus the patch from 
http://www.sraoss.jp/pipermail/pgpool-general/2016-July/004884.html, 
and Postgres 9.4.
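
For reference, the settings described above correspond roughly to the following pgpool.conf fragment. This is only a sketch: the backend addresses and port are taken from the server list and logs in this message, the node numbering follows the logs (node 0 is the master, node 1 is the slave), and every other setting is omitted.

```
# streaming replication mode
master_slave_mode = on
master_slave_sub_mode = 'stream'

# do not trigger failover on backend communication errors
fail_over_on_backend_error = off

# node 0: Postgres master
backend_hostname0 = '10.1.0.19'
backend_port0 = 5432

# node 1: Postgres slave
backend_hostname1 = '10.1.0.23'
backend_port1 = 5432
```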

I tested this on Google Cloud with 3 servers:

1. Postgres master + Pgpool (ip: 10.1.0.19),

2. Postgres slave + Pgpool (ip: 10.1.0.23),

3. Pgpool

On the second server I ran "ifdown eth0". Existing connections keep 
working, but I can't initiate a new connection.
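
To see how the downed node looks from the pgpool host, a small probe like the one below can help. It is a minimal sketch and not part of pgpool; the helper name `probe` and the 5 second default timeout are my own choices. It makes one TCP connection attempt and reports whether it connected, was refused, or timed out, and how long that took.

```python
import socket
import time


def probe(host, port, timeout=5.0):
    """Make one TCP connection attempt; return (outcome, elapsed_seconds)."""
    start = time.monotonic()
    try:
        # create_connection raises socket.timeout if nothing answers in time
        with socket.create_connection((host, port), timeout=timeout):
            outcome = "connected"
    except socket.timeout:
        outcome = "timed out"
    except ConnectionRefusedError:
        # the host is reachable but nothing is listening on the port
        outcome = "refused"
    except OSError as exc:
        # anything else, e.g. "No route to host" or "Network is unreachable"
        outcome = "error: %s" % exc
    return outcome, time.monotonic() - start
```

For example, `probe("10.1.0.23", 5432)` run on the first server after the "ifdown eth0" shows whether a connection attempt to the slave fails quickly or only after the timeout expires.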

Some logs from the first pgpool:
Sep 21 20:47:38 usc-sprod-pg-00 pgpool: 2016-09-21 20:47:38: pid 10639: 
DETAIL:  timed out. retrying...
Sep 21 20:47:38 usc-sprod-pg-00 pgpool: 2016-09-21 20:47:38: pid 9969: 
LOG:  trying connecting to PostgreSQL server on "10.1.0.23:5432" by INET 
socket
Sep 21 20:47:38 usc-sprod-pg-00 pgpool: 2016-09-21 20:47:38: pid 9969: 
DETAIL:  timed out. retrying...
[the above messages repeat the whole time after the second node goes down]

Sep 21 20:48:06 usc-sprod-pg-00 pgpool: 2016-09-21 20:48:06: pid 1258: 
LOG:  connect_inet_domain_socket: select() interrupted by certain 
signal. retrying...

Sep 21 20:48:11 usc-sprod-pg-00 pgpool: 2016-09-21 20:48:11: pid 1258: 
DETAIL:  select() system call failed with an error "Interrupted system call"
Sep 21 20:48:11 usc-sprod-pg-00 pgpool: 2016-09-21 20:48:11: pid 1258: 
ERROR:  failed to make persistent db connection
Sep 21 20:48:11 usc-sprod-pg-00 pgpool: 2016-09-21 20:48:11: pid 1258: 
DETAIL:  connection to host:"10.1.0.23:5432" failed

Sep 21 20:48:17 usc-sprod-pg-00 pgpool: 2016-09-21 20:48:17: pid 3290: 
LOG:  pool_send_and_wait: Error or notice message from backend: : DB 
node id: 0 backend pid: 3473 statement: "select * from pgtest;" message: 
"relation "pgtest" does not exist"

Sep 21 20:49:12 usc-sprod-pg-00 pgpool: 2016-09-21 20:49:12: pid 1258: 
LOG:  failed to connect to PostgreSQL server on "10.1.0.23:5432" using 
INET socket
Sep 21 20:49:12 usc-sprod-pg-00 pgpool: 2016-09-21 20:49:12: pid 1258: 
DETAIL:  select() system call failed with an error "Interrupted system call"
Sep 21 20:49:12 usc-sprod-pg-00 pgpool: 2016-09-21 20:49:12: pid 1258: 
ERROR:  failed to make persistent db connection
Sep 21 20:49:12 usc-sprod-pg-00 pgpool: 2016-09-21 20:49:12: pid 1258: 
DETAIL:  connection to host:"10.1.0.23:5432" failed
Sep 21 20:49:12 usc-sprod-pg-00 pgpool: 2016-09-21 20:49:12: pid 1258: 
LOG:  setting backend node 1 status to NODE DOWN
Sep 21 20:49:12 usc-sprod-pg-00 pgpool: 2016-09-21 20:49:12: pid 1258: 
LOG:  received degenerate backend request for node_id: 1 from pid [1258]
Sep 21 20:49:12 usc-sprod-pg-00 pgpool: 2016-09-21 20:49:12: pid 1284: 
LOG:  new IPC connection received
Sep 21 20:49:13 usc-sprod-pg-00 pgpool: 2016-09-21 20:49:13: pid 10938: 
LOG:  trying connecting to PostgreSQL server on "10.1.0.23:5432" by INET 
socket
Sep 21 20:49:13 usc-sprod-pg-00 pgpool: 2016-09-21 20:49:13: pid 10938: 
DETAIL:  timed out. retrying...
Sep 21 20:49:13 usc-sprod-pg-00 pgpool: 2016-09-21 20:49:13: pid 9583: 
LOG:  trying connecting to PostgreSQL server on "10.1.0.23:5432" by INET 
socket

In previous tests, after this "setting backend node 1 status to NODE 
DOWN" message I could initiate new connections, but in this case I can't 
(new connections remain impossible the whole time), and the logs show:

Sep 21 20:49:35 usc-sprod-pg-00 pgpool: 2016-09-21 20:49:35: pid 10639: 
LOG:  failed to connect to PostgreSQL server on "10.1.0.23:5432", 
getsockopt() detected error "Connection timed out"
Sep 21 20:49:35 usc-sprod-pg-00 pgpool: 2016-09-21 20:49:35: pid 9969: 
LOG:  failed to connect to PostgreSQL server on "10.1.0.23:5432", 
getsockopt() detected error "Connection timed out"
Sep 21 20:49:35 usc-sprod-pg-00 pgpool: 2016-09-21 20:49:35: pid 10639: 
LOG:  failed to create a backend 1 connection
Sep 21 20:49:35 usc-sprod-pg-00 pgpool: 2016-09-21 20:49:35: pid 10639: 
DETAIL:  skip this backend because because fail_over_on_backend_error is 
off and we are in streaming replication mode and node is standby node
Sep 21 20:49:35 usc-sprod-pg-00 pgpool: 2016-09-21 20:49:35: pid 9969: 
LOG:  failed to create a backend 1 connection
Sep 21 20:49:35 usc-sprod-pg-00 pgpool: 2016-09-21 20:49:35: pid 9969: 
DETAIL:  skip this backend because because fail_over_on_backend_error is 
off and we are in streaming replication mode and node is standby node
Sep 21 20:49:35 usc-sprod-pg-00 pgpool: 2016-09-21 20:49:35: pid 10639: 
FATAL:  unable to read data from DB node 0
Sep 21 20:49:35 usc-sprod-pg-00 pgpool: 2016-09-21 20:49:35: pid 10639: 
DETAIL:  EOF encountered with backend
Sep 21 20:49:35 usc-sprod-pg-00 pgpool: 2016-09-21 20:49:35: pid 9969: 
FATAL:  unable to read data from DB node 0
Sep 21 20:49:35 usc-sprod-pg-00 pgpool: 2016-09-21 20:49:35: pid 9969: 
DETAIL:  EOF encountered with backend


And of course, the whole time:
Sep 21 20:49:36 usc-sprod-pg-00 pgpool: 2016-09-21 20:49:36: pid 10763: 
LOG:  trying connecting to PostgreSQL server on "10.1.0.23:5432" by INET 
socket
Sep 21 20:49:36 usc-sprod-pg-00 pgpool: 2016-09-21 20:49:36: pid 10763: 
DETAIL:  timed out. retrying...

Regards,
Kris

