[pgpool-general: 932] Re: Connections failures

Tatsuo Ishii ishii at postgresql.org
Thu Aug 30 10:04:38 JST 2012


> 2012-08-29 20:37:48 ERROR: pid 19289: connect_inet_domain_socket:
> connect() failed: Cannot assign requested address

The error message just returns error string set by connect(2).
Googling the error message might give you a hint. It seems the cause
is network high load. There are discussions around tweaking some
kernel resources such as ip_local_port_range and tcp_tw_recycle.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

>     We have pgpool 3.1.2, with postgres 9.1 in streaming
>     replication. Pgpool is load balancing. There are two hosts both with
>     postgres and pgpool. They are in high availability with corosync.
>     It has been working fine but recently it has started to fail. First
>     disconnecting one backend and then the other, and fortunately,
>     corosync detects this situation and restarts pgpool. Some client
>     connections are lost during restarting but from there on, it continues
>     normal operation.
>     And it looks like the cause is some network failure, even between the
>     pgpool and the postgres in the same host. This is the part of the log
>     where the failure happens.
> 
>     What does it mean "Cannot assign requested address" in this context?
>     Any idea what could be the cause of failure and how can we avoid it?
> 
>     Many thanks.
> 
> 2012-08-29 20:37:48 ERROR: pid 19289: connect_inet_domain_socket:
> connect() failed: Cannot assign requested address
> 2012-08-29 20:37:48 ERROR: pid 19289: connection to host0(5432) failed
> 2012-08-29 20:37:48 ERROR: pid 19289: new_connection: create_cp()
> failed
> 2012-08-29 20:37:48 LOG: pid 19289: degenerate_backend_set: 0 fail
> over request from pid 19289
> 2012-08-29 20:37:48 ERROR: pid 19300: connect_inet_domain_socket:
> connect() failed: Cannot assign requested address
> 2012-08-29 20:37:48 LOG: pid 19307: DB node id: 0 backend pid: 32408
> statement: B message
> 2012-08-29 20:37:48 LOG: pid 19286: DB node id: 1 backend pid: 23636
> statement: B message
> 2012-08-29 20:37:48 ERROR: pid 19300: connection to host0(5432) failed
> 2012-08-29 20:37:48 LOG: pid 19274: starting degeneration. shutdown
> host host0(5432)
> 2012-08-29 20:37:48 ERROR: pid 19300: new_connection: create_cp()
> failed
> 2012-08-29 20:37:48 LOG:   pid 19274: Restart all children
> 2012-08-29 20:37:48 LOG: pid 19307: DB node id: 1 backend pid: 23637
> statement: B message
> 2012-08-29 20:37:48 LOG: pid 19274: execute command: /etc/failover.sh
> 0 "host0" 5432 /var/lib/postgresql/9.1/main/ 1 0 "host1" 1
> 2012-08-29 20:37:48 LOG: pid 19274: find_primary_node_repeatedly:
> waiting for finding a primary node
> 2012-08-29 20:37:48 LOG: pid 19274: find_primary_node: primary node id
> is 1
> 2012-08-29 20:37:48 LOG: pid 19274: failover: set new primary node: 1
> 2012-08-29 20:37:48 LOG:   pid 19274: failover: set new master node: 1
> ###
> ### It seems to start receiving normal connections from clients again
> ###
> 2012-08-29 20:37:48 LOG: pid 32414: connection received:
> host=client30011 port=58746
> 2012-08-29 20:37:48 LOG: pid 32415: connection received:
> host=client30011 port=58747
> 2012-08-29 20:37:48 LOG: pid 32416: connection received:
> host=client30011 port=58748
> 2012-08-29 20:37:48 LOG: pid 32417: connection received:
> host=client30011 port=58750
> 2012-08-29 20:37:48 LOG: pid 32418: connection received:
> host=client30011 port=58752
> 2012-08-29 20:37:48 LOG: pid 32419: connection received:
> host=client30121 port=55205
> 2012-08-29 20:37:48 LOG: pid 32420: connection received:
> host=client30023 port=42473
> 2012-08-29 20:37:48 LOG: pid 32414: DB node id: 1 backend pid: 23641
> statement: Parse: SET extra_float_digits = 3
> 2012-08-29 20:37:48 LOG: pid 32414: DB node id: 1 backend pid: 23641
> statement: B message
> 2012-08-29 20:37:48 LOG: pid 32414: DB node id: 1 backend pid: 23641
> statement: Execute: SET extra_float_digits = 3
> 2012-08-29 20:37:48 LOG: pid 32414: DB node id: 1 backend pid: 23641
> statement: SELECT count(*) FROM pg_catalog.pg_proc AS p WHERE
> p.proname = 'pgpool_regclass'
> 2012-08-29 20:37:48 LOG: pid 32416: DB node id: 1 backend pid: 23642
> statement: Parse: SET extra_float_digits = 3
> 2012-08-29 20:37:48 LOG: pid 32417: DB node id: 1 backend pid: 23643
> statement: Parse: SET extra_float_digits = 3
> 2012-08-29 20:37:48 LOG: pid 32419: DB node id: 1 backend pid: 23644
> statement: Parse: SET extra_float_digits = 3
> 2012-08-29 20:37:48 LOG: pid 32416: DB node id: 1 backend pid: 23642
> statement: B message
> 2012-08-29 20:37:48 LOG: pid 32417: DB node id: 1 backend pid: 23643
> statement: B message
> 2012-08-29 20:37:48 LOG: pid 32416: DB node id: 1 backend pid: 23642
> statement: Execute: SET extra_float_digits = 3
> 2012-08-29 20:37:48 LOG: pid 32419: DB node id: 1 backend pid: 23644
> statement: B message
> 2012-08-29 20:37:48 LOG: pid 32418: DB node id: 1 backend pid: 23645
> statement: Parse: SET extra_float_digits = 3
> 2012-08-29 20:37:48 LOG: pid 32439: connection received:
> host=client30021 port=48696
> 2012-08-29 20:37:48 LOG: pid 32417: DB node id: 1 backend pid: 23643
> statement: Execute: SET extra_float_digits = 3
> 2012-08-29 20:37:48 LOG: pid 32419: DB node id: 1 backend pid: 23644
> statement: Execute: SET extra_float_digits = 3
> 2012-08-29 20:37:48 LOG: pid 32418: DB node id: 1 backend pid: 23645
> statement: B message
> 2012-08-29 20:37:48 LOG: pid 19274: failover done. shutdown host
> host0(5432)
> 2012-08-29 20:37:48 LOG: pid 32418: DB node id: 1 backend pid: 23645
> statement: Execute: SET extra_float_digits = 3
> 2012-08-29 20:37:48 LOG: pid 32416: DB node id: 1 backend pid: 23642
> statement: SELECT count(*) FROM pg_catalog.pg_proc AS p WHERE
> p.proname = 'pgpool_regclass'
> 2012-08-29 20:37:48 LOG: pid 32417: DB node id: 1 backend pid: 23643
> statement: SELECT count(*) FROM pg_catalog.pg_proc AS p WHERE
> p.proname = 'pgpool_regclass'
> 2012-08-29 20:37:48 LOG: pid 32419: DB node id: 1 backend pid: 23644
> statement: SELECT count(*) FROM pg_catalog.pg_proc AS p WHERE
> p.proname = 'pgpool_regclass'
> 2012-08-29 20:37:48 LOG: pid 32418: DB node id: 1 backend pid: 23645
> statement: Parse: UPDATE "Documents2" SET "Lemas" = '',
> "SentimentAuto" = 0, "Franja" = 67878000, "Processed" = true WHERE
> "Id" = 5694304
> 2012-08-29 20:37:48 LOG: pid 32415: DB node id: 1 backend pid: 23646
> statement: Parse: SET extra_float_digits = 3
> 2012-08-29 20:37:48 LOG: pid 32415: DB node id: 1 backend pid: 23646
> statement: B message
> 2012-08-29 20:37:48 LOG: pid 32415: DB node id: 1 backend pid: 23646
> statement: Execute: SET extra_float_digits = 3
> 2012-08-29 20:37:48 LOG: pid 32418: DB node id: 1 backend pid: 23645
> statement: B message
> 2012-08-29 20:37:48 LOG: pid 32418: DB node id: 1 backend pid: 23645
> statement: D message
> 2012-08-29 20:37:48 LOG: pid 32415: DB node id: 1 backend pid: 23646
> statement: SELECT count(*) FROM pg_catalog.pg_proc AS p WHERE
> p.proname = 'pgpool_regclass'
> 2012-08-29 20:37:48 LOG: pid 32418: DB node id: 1 backend pid: 23645
> statement: Execute: UPDATE "Documents2" SET "Lemas" = '',
> "SentimentAuto" = 0, "Franja" = 67878000, "Processed" = true WHERE
> "Id" = 5694304
> 2012-08-29 20:37:48 LOG: pid 32420: DB node id: 1 backend pid: 23647
> statement: Parse: SET extra_float_digits = 3
> 2012-08-29 20:37:48 LOG: pid 32420: DB node id: 1 backend pid: 23647
> statement: B message
> 2012-08-29 20:37:48 LOG: pid 32420: DB node id: 1 backend pid: 23647
> statement: Execute: SET extra_float_digits = 3
> 2012-08-29 20:37:48 LOG: pid 32420: DB node id: 1 backend pid: 23647
> statement: SELECT count(*) FROM pg_catalog.pg_proc AS p WHERE
> p.proname = 'pgpool_regclass'
> 2012-08-29 20:37:48 LOG: pid 32439: DB node id: 1 backend pid: 23648
> statement: Parse: SET extra_float_digits = 3
> 2012-08-29 20:37:48 LOG: pid 32439: DB node id: 1 backend pid: 23648
> statement: B message
> 2012-08-29 20:37:48 LOG: pid 32439: DB node id: 1 backend pid: 23648
> statement: Execute: SET extra_float_digits = 3
> 2012-08-29 20:37:48 LOG: pid 32439: DB node id: 1 backend pid: 23648
> statement: SELECT count(*) FROM pg_catalog.pg_proc AS p WHERE
> p.proname = 'pgpool_regclass'
> 2012-08-29 20:37:48 LOG: pid 32437: connection received:
> host=client30021 port=48698
> 2012-08-29 20:37:48 LOG: pid 32444: connection received:
> host=client30011 port=58757
> 2012-08-29 20:37:48 LOG: pid 32437: DB node id: 1 backend pid: 23650
> statement: Parse: SET extra_float_digits = 3
> 2012-08-29 20:37:48 LOG: pid 32437: DB node id: 1 backend pid: 23650
> statement: B message
> 2012-08-29 20:37:48 LOG: pid 32437: DB node id: 1 backend pid: 23650
> statement: Execute: SET extra_float_digits = 3
> 2012-08-29 20:37:48 LOG: pid 32437: DB node id: 1 backend pid: 23650
> statement: SELECT count(*) FROM pg_catalog.pg_proc AS p WHERE
> p.proname = 'pgpool_regclass'
> ###
> ### But in the same second, connections to the second postgres failed too
> ###
> 2012-08-29 20:37:48 ERROR: pid 32444: connect_inet_domain_socket:
> connect() failed: Cannot assign requested address
> 2012-08-29 20:37:48 ERROR: pid 32444: connection to host1(5432) failed
> 2012-08-29 20:37:48 ERROR: pid 32444: new_connection: create_cp()
> failed
> 2012-08-29 20:37:48 LOG: pid 32444: degenerate_backend_set: 1 fail
> over request from pid 32444
> ###
> ### These last lines with more connections from clients and initial
> ### statements repeat about twenty times. And then...
> ###
> 2012-08-29 20:37:49 LOG: pid 19274: starting degeneration. shutdown
> host host1(5432)
> 2012-08-29 20:37:49 ERROR: pid 19274: failover_handler: no valid DB
> node found
> 2012-08-29 20:37:49 LOG:   pid 19274: Restart all children
> 2012-08-29 20:37:49 LOG: pid 19308: pcp child process received restart
> request
> 2012-08-29 20:37:49 LOG: pid 19274: execute command: /etc/failover.sh
> 1 "host1" 5432 /var/lib/postgresql/9.1/main/ -1 1 "" 1
> 2012-08-29 20:37:49 LOG: pid 19274: find_primary_node_repeatedly:
> waiting for finding a primary node
> 2012-08-29 20:38:50 LOG: pid 19274: received immediate shutdown
> request
> 2012-08-29 20:38:51 ERROR: pid 458: pool_init_pool_passwd: couldn't
> open /etc/pool_passwd. reason: Permission denied
> 2012-08-29 20:38:51 LOG: pid 458: read_status_file: 0 th backend is
> set to down status
> 2012-08-29 20:38:51 LOG: pid 458: read_status_file: 1 th backend is
> set to down status
> 2012-08-29 20:38:51 LOG: pid 465: connection received: host=client000
> port=35578
> 2012-08-29 20:38:51 LOG: pid 470: connection received: host=client000
> port=35582
> 2012-08-29 20:38:51 LOG: pid 466: connection received: host=client000
> port=35579
> 2012-08-29 20:38:51 LOG: pid 467: connection received:
> host=client30021 port=49154
> 2012-08-29 20:38:51 LOG: pid 464: connection received: host=client000
> port=35577
> 2012-08-29 20:38:51 LOG: pid 469: connection received: host=client000
> port=35581
> 2012-08-29 20:38:51 LOG: pid 468: connection received: host=client000
> port=35580
> 2012-08-29 20:38:51 LOG: pid 458: pgpool-II successfully
> started. version 3.1.2 (hatsuiboshi)
> 
> The same happened two more times in the next 30 minutes, but now it
> can take up to 20 hours before a new failure happens.
> 
> -- 
> Juanjo Pérez
> 
> www.oteara.com
> 
> _______________________________________________
> pgpool-general mailing list
> pgpool-general at pgpool.net
> http://www.pgpool.net/mailman/listinfo/pgpool-general


More information about the pgpool-general mailing list