[pgpool-general: 930] Connections failures

Juan Jose Perez jperez at oteara.com
Thu Aug 30 09:57:07 JST 2012


     Hi all,

     We have pgpool 3.1.2, with postgres 9.1 in streaming replication. 
Pgpool is load balancing. There are two hosts both with postgres and 
pgpool. They are in high availability with corosync.
     It has been working fine but recently it has started to fail. First 
disconnecting one backend and then the other, and fortunately, corosync 
detects this situation and restarts pgpool. Some client connections are 
lost during restarting but from there on, it continues normal operation.
     And it looks like the cause is some network failure, even between 
the pgpool and the postgres in the same host. This is the part of the 
log where the failure happens.

     What does it mean "Cannot assign requested address" in this 
context? Any idea what could be the cause of failure and how can we 
avoid it?

     Many thanks.

2012-08-29 20:37:48 ERROR: pid 19289: connect_inet_domain_socket: 
connect() failed: Cannot assign requested address
2012-08-29 20:37:48 ERROR: pid 19289: connection to host0(5432) failed
2012-08-29 20:37:48 ERROR: pid 19289: new_connection: create_cp() failed
2012-08-29 20:37:48 LOG:   pid 19289: degenerate_backend_set: 0 fail 
over request from pid 19289
2012-08-29 20:37:48 ERROR: pid 19300: connect_inet_domain_socket: 
connect() failed: Cannot assign requested address
2012-08-29 20:37:48 LOG:   pid 19307: DB node id: 0 backend pid: 32408 
statement: B message
2012-08-29 20:37:48 LOG:   pid 19286: DB node id: 1 backend pid: 23636 
statement: B message
2012-08-29 20:37:48 ERROR: pid 19300: connection to host0(5432) failed
2012-08-29 20:37:48 LOG:   pid 19274: starting degeneration. shutdown 
host host0(5432)
2012-08-29 20:37:48 ERROR: pid 19300: new_connection: create_cp() failed
2012-08-29 20:37:48 LOG:   pid 19274: Restart all children
2012-08-29 20:37:48 LOG:   pid 19307: DB node id: 1 backend pid: 23637 
statement: B message
2012-08-29 20:37:48 LOG:   pid 19274: execute command: /etc/failover.sh 
0 "host0" 5432 /var/lib/postgresql/9.1/main/ 1 0 "host1" 1
2012-08-29 20:37:48 LOG:   pid 19274: find_primary_node_repeatedly: 
waiting for finding a primary node
2012-08-29 20:37:48 LOG:   pid 19274: find_primary_node: primary node id 
is 1
2012-08-29 20:37:48 LOG:   pid 19274: failover: set new primary node: 1
2012-08-29 20:37:48 LOG:   pid 19274: failover: set new master node: 1
###
### It seems to start receiving normal connections from clients again
###
2012-08-29 20:37:48 LOG:   pid 32414: connection received: 
host=client30011 port=58746
2012-08-29 20:37:48 LOG:   pid 32415: connection received: 
host=client30011 port=58747
2012-08-29 20:37:48 LOG:   pid 32416: connection received: 
host=client30011 port=58748
2012-08-29 20:37:48 LOG:   pid 32417: connection received: 
host=client30011 port=58750
2012-08-29 20:37:48 LOG:   pid 32418: connection received: 
host=client30011 port=58752
2012-08-29 20:37:48 LOG:   pid 32419: connection received: 
host=client30121 port=55205
2012-08-29 20:37:48 LOG:   pid 32420: connection received: 
host=client30023 port=42473
2012-08-29 20:37:48 LOG:   pid 32414: DB node id: 1 backend pid: 23641 
statement: Parse: SET extra_float_digits = 3
2012-08-29 20:37:48 LOG:   pid 32414: DB node id: 1 backend pid: 23641 
statement: B message
2012-08-29 20:37:48 LOG:   pid 32414: DB node id: 1 backend pid: 23641 
statement: Execute: SET extra_float_digits = 3
2012-08-29 20:37:48 LOG:   pid 32414: DB node id: 1 backend pid: 23641 
statement: SELECT count(*) FROM pg_catalog.pg_proc AS p WHERE p.proname 
= 'pgpool_regclass'
2012-08-29 20:37:48 LOG:   pid 32416: DB node id: 1 backend pid: 23642 
statement: Parse: SET extra_float_digits = 3
2012-08-29 20:37:48 LOG:   pid 32417: DB node id: 1 backend pid: 23643 
statement: Parse: SET extra_float_digits = 3
2012-08-29 20:37:48 LOG:   pid 32419: DB node id: 1 backend pid: 23644 
statement: Parse: SET extra_float_digits = 3
2012-08-29 20:37:48 LOG:   pid 32416: DB node id: 1 backend pid: 23642 
statement: B message
2012-08-29 20:37:48 LOG:   pid 32417: DB node id: 1 backend pid: 23643 
statement: B message
2012-08-29 20:37:48 LOG:   pid 32416: DB node id: 1 backend pid: 23642 
statement: Execute: SET extra_float_digits = 3
2012-08-29 20:37:48 LOG:   pid 32419: DB node id: 1 backend pid: 23644 
statement: B message
2012-08-29 20:37:48 LOG:   pid 32418: DB node id: 1 backend pid: 23645 
statement: Parse: SET extra_float_digits = 3
2012-08-29 20:37:48 LOG:   pid 32439: connection received: 
host=client30021 port=48696
2012-08-29 20:37:48 LOG:   pid 32417: DB node id: 1 backend pid: 23643 
statement: Execute: SET extra_float_digits = 3
2012-08-29 20:37:48 LOG:   pid 32419: DB node id: 1 backend pid: 23644 
statement: Execute: SET extra_float_digits = 3
2012-08-29 20:37:48 LOG:   pid 32418: DB node id: 1 backend pid: 23645 
statement: B message
2012-08-29 20:37:48 LOG:   pid 19274: failover done. shutdown host 
host0(5432)
2012-08-29 20:37:48 LOG:   pid 32418: DB node id: 1 backend pid: 23645 
statement: Execute: SET extra_float_digits = 3
2012-08-29 20:37:48 LOG:   pid 32416: DB node id: 1 backend pid: 23642 
statement: SELECT count(*) FROM pg_catalog.pg_proc AS p WHERE p.proname 
= 'pgpool_regclass'
2012-08-29 20:37:48 LOG:   pid 32417: DB node id: 1 backend pid: 23643 
statement: SELECT count(*) FROM pg_catalog.pg_proc AS p WHERE p.proname 
= 'pgpool_regclass'
2012-08-29 20:37:48 LOG:   pid 32419: DB node id: 1 backend pid: 23644 
statement: SELECT count(*) FROM pg_catalog.pg_proc AS p WHERE p.proname 
= 'pgpool_regclass'
2012-08-29 20:37:48 LOG:   pid 32418: DB node id: 1 backend pid: 23645 
statement: Parse: UPDATE "Documents2" SET "Lemas" = '', "SentimentAuto" 
= 0, "Franja" = 67878000, "Processed" = true WHERE "Id" = 5694304
2012-08-29 20:37:48 LOG:   pid 32415: DB node id: 1 backend pid: 23646 
statement: Parse: SET extra_float_digits = 3
2012-08-29 20:37:48 LOG:   pid 32415: DB node id: 1 backend pid: 23646 
statement: B message
2012-08-29 20:37:48 LOG:   pid 32415: DB node id: 1 backend pid: 23646 
statement: Execute: SET extra_float_digits = 3
2012-08-29 20:37:48 LOG:   pid 32418: DB node id: 1 backend pid: 23645 
statement: B message
2012-08-29 20:37:48 LOG:   pid 32418: DB node id: 1 backend pid: 23645 
statement: D message
2012-08-29 20:37:48 LOG:   pid 32415: DB node id: 1 backend pid: 23646 
statement: SELECT count(*) FROM pg_catalog.pg_proc AS p WHERE p.proname 
= 'pgpool_regclass'
2012-08-29 20:37:48 LOG:   pid 32418: DB node id: 1 backend pid: 23645 
statement: Execute: UPDATE "Documents2" SET "Lemas" = '', 
"SentimentAuto" = 0, "Franja" = 67878000, "Processed" = true WHERE "Id" 
= 5694304
2012-08-29 20:37:48 LOG:   pid 32420: DB node id: 1 backend pid: 23647 
statement: Parse: SET extra_float_digits = 3
2012-08-29 20:37:48 LOG:   pid 32420: DB node id: 1 backend pid: 23647 
statement: B message
2012-08-29 20:37:48 LOG:   pid 32420: DB node id: 1 backend pid: 23647 
statement: Execute: SET extra_float_digits = 3
2012-08-29 20:37:48 LOG:   pid 32420: DB node id: 1 backend pid: 23647 
statement: SELECT count(*) FROM pg_catalog.pg_proc AS p WHERE p.proname 
= 'pgpool_regclass'
2012-08-29 20:37:48 LOG:   pid 32439: DB node id: 1 backend pid: 23648 
statement: Parse: SET extra_float_digits = 3
2012-08-29 20:37:48 LOG:   pid 32439: DB node id: 1 backend pid: 23648 
statement: B message
2012-08-29 20:37:48 LOG:   pid 32439: DB node id: 1 backend pid: 23648 
statement: Execute: SET extra_float_digits = 3
2012-08-29 20:37:48 LOG:   pid 32439: DB node id: 1 backend pid: 23648 
statement: SELECT count(*) FROM pg_catalog.pg_proc AS p WHERE p.proname 
= 'pgpool_regclass'
2012-08-29 20:37:48 LOG:   pid 32437: connection received: 
host=client30021 port=48698
2012-08-29 20:37:48 LOG:   pid 32444: connection received: 
host=client30011 port=58757
2012-08-29 20:37:48 LOG:   pid 32437: DB node id: 1 backend pid: 23650 
statement: Parse: SET extra_float_digits = 3
2012-08-29 20:37:48 LOG:   pid 32437: DB node id: 1 backend pid: 23650 
statement: B message
2012-08-29 20:37:48 LOG:   pid 32437: DB node id: 1 backend pid: 23650 
statement: Execute: SET extra_float_digits = 3
2012-08-29 20:37:48 LOG:   pid 32437: DB node id: 1 backend pid: 23650 
statement: SELECT count(*) FROM pg_catalog.pg_proc AS p WHERE p.proname 
= 'pgpool_regclass'
###
### But in the same second, connections to the second postgres failed too
###
2012-08-29 20:37:48 ERROR: pid 32444: connect_inet_domain_socket: 
connect() failed: Cannot assign requested address
2012-08-29 20:37:48 ERROR: pid 32444: connection to host1(5432) failed
2012-08-29 20:37:48 ERROR: pid 32444: new_connection: create_cp() failed
2012-08-29 20:37:48 LOG:   pid 32444: degenerate_backend_set: 1 fail 
over request from pid 32444
###
### These last lines with more connections from clients and initial 
statements repeat about twenty times. And then...
###
2012-08-29 20:37:49 LOG:   pid 19274: starting degeneration. shutdown 
host host1(5432)
2012-08-29 20:37:49 ERROR: pid 19274: failover_handler: no valid DB node 
found
2012-08-29 20:37:49 LOG:   pid 19274: Restart all children
2012-08-29 20:37:49 LOG:   pid 19308: pcp child process received restart 
request
2012-08-29 20:37:49 LOG:   pid 19274: execute command: /etc/failover.sh 
1 "host1" 5432 /var/lib/postgresql/9.1/main/ -1 1 "" 1
2012-08-29 20:37:49 LOG:   pid 19274: find_primary_node_repeatedly: 
waiting for finding a primary node
2012-08-29 20:38:50 LOG:   pid 19274: received immediate shutdown request
2012-08-29 20:38:51 ERROR: pid 458: pool_init_pool_passwd: couldn't open 
/etc/pool_passwd. reason: Permission denied
2012-08-29 20:38:51 LOG:   pid 458: read_status_file: 0 th backend is 
set to down status
2012-08-29 20:38:51 LOG:   pid 458: read_status_file: 1 th backend is 
set to down status
2012-08-29 20:38:51 LOG:   pid 465: connection received: host=client000 
port=35578
2012-08-29 20:38:51 LOG:   pid 470: connection received: host=client000 
port=35582
2012-08-29 20:38:51 LOG:   pid 466: connection received: host=client000 
port=35579
2012-08-29 20:38:51 LOG:   pid 467: connection received: 
host=client30021 port=49154
2012-08-29 20:38:51 LOG:   pid 464: connection received: host=client000 
port=35577
2012-08-29 20:38:51 LOG:   pid 469: connection received: host=client000 
port=35581
2012-08-29 20:38:51 LOG:   pid 468: connection received: host=client000 
port=35580
2012-08-29 20:38:51 LOG:   pid 458: pgpool-II successfully started. 
version 3.1.2 (hatsuiboshi)

The same happened two more times in the next 30 minutes, but now it can 
take up to 20 hours before a new failure happens.

-- 
Juanjo Pérez

www.oteara.com



More information about the pgpool-general mailing list