[Pgpool-general] pcp_child: pcp_read() failed. reason: Success

Tatsuo Ishii ishii at sraoss.co.jp
Fri Nov 6 04:26:07 UTC 2009


Besides the useless error message from pcp_child (it seems someone
believed that EOF would set an error number in the global errno
variable; I will fix this anyway), it seems to me that the socket
files are going dead. I suspect a network stack bug could cause this,
but I'm not sure. One thing you might want to try is changing this:

backend_hostname0 = 'db1.xxx.xxx'

to:

backend_hostname0 = ''

This will make pgpool use a UNIX domain socket for the communication
channel to PostgreSQL, rather than TCP/IP. It may or may not affect
the problem you have, since the network code involved in the kernel
will be different.
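
To make the difference concrete, here is a minimal C sketch of the
address such a connection would use. It is not pgpool's actual code:
the helper name backend_unix_addr is made up for illustration, and it
assumes the socket is looked up under backend_socket_dir. The file
name pattern <dir>/.s.PGSQL.<port> is the one PostgreSQL uses for its
server socket.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper (not pgpool code): build the AF_UNIX address of a
 * PostgreSQL server socket, which is named <dir>/.s.PGSQL.<port>.
 * Returns 0 on success, -1 if the path does not fit in sun_path. */
int backend_unix_addr(const char *socket_dir, int port,
                      struct sockaddr_un *addr)
{
    int len;

    assert(socket_dir != NULL && addr != NULL);

    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;   /* local socket: no TCP/IP stack involved */
    len = snprintf(addr->sun_path, sizeof(addr->sun_path),
                   "%s/.s.PGSQL.%d", socket_dir, port);
    if (len < 0 || (size_t) len >= sizeof(addr->sun_path))
        return -1;                /* path too long for sockaddr_un */
    return 0;
}
```

With backend_socket_dir = '/usr/local/pgpool' and backend_port0 = 5433
from the quoted configuration, this gives
/usr/local/pgpool/.s.PGSQL.5433, so it is worth checking that
PostgreSQL really creates its socket in that directory (its
unix_socket_directory setting) before trying the change.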

(I assume you are running pgpool on db1.xxx.xxx)
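
On the "reason: Success" text itself, here is a minimal C sketch (not
pgpool's actual code) of why the message is misleading. read() returns
0 at EOF without reporting an error, so errno carries no information
about it, and formatting an errno of 0 with strerror() gives "Success"
on glibc. The helper names describe_read and read_at_eof are made up
for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: turn the result of read() into a log message.
 * errno is only consulted when read() returned -1. */
void describe_read(ssize_t n, int saved_errno, char *out, size_t outlen)
{
    assert(out != NULL && outlen > 0);

    if (n > 0)
        snprintf(out, outlen, "read %ld bytes", (long) n);
    else if (n == 0)
        snprintf(out, outlen, "peer closed the connection (EOF)");
    else
        snprintf(out, outlen, "read failed: %s", strerror(saved_errno));
}

/* Read from a pipe whose write end is already closed, so read() hits EOF.
 * Returns read()'s result and stores errno as seen right after the call. */
ssize_t read_at_eof(int *errno_after)
{
    int fds[2];
    char buf[16];
    ssize_t n;

    if (pipe(fds) != 0) {
        *errno_after = errno;
        return -1;
    }
    close(fds[1]);                 /* no writer left: read() returns 0 */
    errno = 0;
    n = read(fds[0], buf, sizeof(buf));
    *errno_after = errno;
    close(fds[0]);
    return n;
}
```

Treating a return value of 0 as its own case, as describe_read does,
avoids printing an errno string that has nothing to do with the EOF.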
--
Tatsuo Ishii
SRA OSS, Inc. Japan

> Has anyone else run into this:
> 
> My pgpool instance runs without problems for days on end and then suddenly
> stops responding to all requests.
> At the same moment, one of my three backend db hosts becomes completely
> inaccessible.
> Pgpool will not respond to shutdown, or even kill, and must be kill -9'd.
> Once all pgpool processes are out of the way, the inaccessible postgres
> server once again becomes responsive.
> I restart pgpool and everything works properly for a few more days.
> 
> At the moment the problem occurs, pgpool's log output, which typically
> consists of just connection logging, turns into a steady stream of this:
> Nov  5 11:33:18 src at obfuscated pgpool: 2009-11-05 11:33:18 ERROR: pid 12811:
> pcp_child: pcp_read() failed. reason: Success
> These errors show up sporadically in my pgpool logs all the time but don't
> appear to have any adverse effects until the whole thing takes a dive.
> I would desperately like to know what this error message is trying to tell
> me.
> 
> I have not been able to correlate any given query/connection/process to the
> timing of the outages.
> Sometimes they happen at peak usage periods, sometimes they happen in the
> middle of the night.
> 
> I experienced this problem using pgpool-II v1.3 and have recently upgraded
> to pgpool-II v2.2.5 but am still seeing the same issue.
> 
> It may be relevant to point out that I am running pgpool on one of the
> machines that is also acting as a postgres backend and it is always the
> postgres instance on the pgpool host that locks up.
> This morning I moved the pgpool instance onto another one of the postgres
> backend hosts in an effort to see if the cohabitation of pgpool and postgres
> is causing problems or if there is simply an issue with postgres on that
> host or if this is just a coincidence.
> I likely won't gain anything from this test for a day or more.
> 
> Also relevant is that I am running mammoth replicator and am only using
> pgpool for connection load balancing and high availability.
> 
> Below is my pgpool.conf.
> 
> Any thoughts appreciated.
> 
> -steve crandell
> 
> 
> 
> 
> #
> # pgpool-II configuration file sample
> # $Header: /cvsroot/pgpool/pgpool-II/pgpool.conf.sample,v 1.4.2.3 2007/10/12 09:15:02 y-asaba Exp $
> 
> # Host name or IP address to listen on: '*' for all, '' for no TCP/IP
> # connections
> #listen_addresses = 'localhost'
> listen_addresses = '10.xxx.xxx.xxx'
> 
> # Port number for pgpool
> port = 5432
> 
> # Port number for pgpool communication manager
> pcp_port = 9898
> 
> # Unix domain socket path.  (The Debian package defaults to
> # /var/run/postgresql.)
> socket_dir = '/usr/local/pgpool'
> 
> # Unix domain socket path for pgpool communication manager.
> pcp_socket_dir = '/usr/local/pgpool'
> 
> # Unix domain socket path for the backend. Debian package defaults to
> # /var/run/postgresql!
> backend_socket_dir = '/usr/local/pgpool'
> 
> # pgpool communication manager timeout. 0 means no timeout, but
> # strongly not recommended!
> pcp_timeout = 10
> 
> # number of pre-forked child process
> num_init_children = 32
> 
> 
> # Number of connection pools allowed for a child process
> max_pool = 4
> 
> 
> # If idle for this many seconds, child exits.  0 means no timeout.
> child_life_time = 30
> 
> # If idle for this many seconds, connection to PostgreSQL closes.
> # 0 means no timeout.
> #connection_life_time = 0
> connection_life_time = 30
> 
> # If child_max_connections connections were received, child exits.
> # 0 means no exit.
> # change
> child_max_connections = 0
> 
> # Maximum time in seconds to complete client authentication.
> # 0 means no timeout.
> authentication_timeout = 60
> 
> # Logging directory (more accurately, the directory for the PID file)
> logdir = '/usr/local/pgpool'
> 
> # Replication mode
> replication_mode = false
> 
> # Set this to true if you want to avoid deadlock situations when
> # replication is enabled.  There will, however, be a noticeable performance
> # degradation.  A workaround is to set this to false and insert a /*STRICT*/
> # comment at the beginning of the SQL command.
> replication_strict = false
> 
> # When replication_strict is set to false, there will be a chance for
> # deadlocks.  Set this to nonzero (in milliseconds) to detect this
> # situation and resolve the deadlock by aborting current session.
> replication_timeout = 5000
> 
> # Load balancing mode, i.e., all SELECTs except in a transaction block
> # are load balanced.  This is ignored if replication_mode is false.
> # change
> load_balance_mode = true
> 
> # if there's a data mismatch between master and secondary
> # start degeneration to stop replication mode
> replication_stop_on_mismatch = false
> 
> # If true, replicate SELECT statement when load balancing is disabled.
> # If false, it is only sent to the master node.
> # change
> replicate_select = true
> 
> # Semicolon separated list of queries to be issued at the end of a session
> reset_query_list = 'ABORT; RESET ALL; SET SESSION AUTHORIZATION DEFAULT'
> 
> # If true print timestamp on each log line.
> print_timestamp = true
> 
> # If true, operate in master/slave mode.
> # change
> master_slave_mode = true
> 
> # If true, cache connection pool.
> connection_cache = false
> 
> # Health check timeout.  0 means no timeout.
> health_check_timeout = 20
> 
> # Health check period.  0 means no health check.
> health_check_period = 0
> 
> # Health check user
> health_check_user = 'nobody'
> 
> # If true, automatically lock table with INSERT statements to keep SERIAL
> # data consistency.  An /*INSERT LOCK*/ comment has the same effect.  A
> # /*NO INSERT LOCK*/ comment disables the effect.
> insert_lock = false
> 
> # If true, ignore leading white spaces of each query while pgpool judges
> # whether the query is a SELECT so that it can be load balanced.  This
> # is useful for certain APIs such as DBI/DBD which are known to add an
> # extra leading white space.
> ignore_leading_white_space = false
> 
> # If true, print all statements to the log.  Like the log_statement option
> # to PostgreSQL, this allows for observing queries without engaging in full
> # debugging.
> log_statement = false
> 
> # If true, incoming connections will be printed to the log.
> # change
> log_connections = true
> 
> # If true, hostname will be shown in ps status. Also shown in
> # connection log if log_connections = true.
> # Be warned that this feature will add overhead to look up hostname.
> log_hostname = false
> 
> # if non 0, run in parallel query mode
> parallel_mode = false
> 
> # if non 0, use query cache
> enable_query_cache = 0
> 
> #set pgpool2 hostname
> pgpool2_hostname = ''
> 
> # system DB info
> #system_db_hostname = 'localhost'
> #system_db_port = 5432
> #system_db_dbname = 'pgpool'
> #system_db_schema = 'pgpool_catalog'
> #system_db_user = 'pgpool'
> #system_db_password = ''
> 
> # backend_hostname, backend_port, backend_weight
> # here are examples
> backend_hostname0 = 'db1.xxx.xxx'
> backend_port0 = 5433
> backend_weight0 = 0.0
> 
> backend_hostname1 = 'db2.xxx.xxx'
> backend_port1 = 5433
> backend_weight1 = 0.4
> 
> backend_hostname2 = 'db3.xxx.xxx'
> backend_port2 = 5433
> backend_weight2 = 0.6
> 
> 
> 
> # - HBA -
> 
> # If true, use pool_hba.conf for client authentication. In pgpool-II
> # 1.1, the default value is false. The default value will be true in
> # 1.2.
> enable_pool_hba = false

