[Pgpool-general] pcp_child: pcp_read() failed. reason: Success

Thu Nov 5 19:36:15 UTC 2009

Has anyone else run into this:

My pgpool instance runs without problems for days on end and then suddenly
stops responding to all requests.
At the same moment, one of my three backend db hosts becomes completely
inaccessible.
Pgpool will not respond to shutdown, or even kill, and must be kill -9'd.
Once all pgpool processes are out of the way, the inaccessible postgres
server once again becomes responsive.
I restart pgpool and everything works properly for a few more days.

At the moment the problem occurs, pgpool's log output, which typically
consists of just connection logging, turns into a steady stream of this:
Nov  5 11:33:18 src at obfuscated pgpool: 2009-11-05 11:33:18 ERROR: pid 12811:
pcp_child: pcp_read() failed. reason: Success
These errors show up sporadically in my pgpool logs all the time but don't
appear to have any adverse effects until the whole thing takes a dive.
I would desperately like to know what this error message is trying to tell
me.
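
One reading worth checking, offered as an assumption rather than something confirmed against the pgpool source: if the message is built from strerror(errno) after a read that returned 0 bytes (the peer closed the connection), then no error code was ever set, and glibc renders error code 0 as "Success". The Python sketch below reproduces that situation with a local socket pair. read_or_report is a hypothetical helper, not a pgpool function.

```python
import os
import socket


def read_or_report(sock, nbytes):
    """Hypothetical helper: read up to nbytes, or describe why the read failed."""
    data = sock.recv(nbytes)
    if len(data) == 0:
        # End of file: the peer closed its end. No OS error occurred,
        # so the only error code available to report is 0.
        return None, "pcp_read() failed. reason: " + os.strerror(0)
    return data, None


# A connected pair of local sockets stands in for pgpool and a PCP client.
server, client = socket.socketpair()
client.close()  # the client hangs up without sending anything

data, message = read_or_report(server, 4)
server.close()
print(message)  # on Linux/glibc the reason text is "Success"
```

If that reading holds, each occurrence would simply mean that something connected to the PCP port and disconnected, which would fit the errors being harmless most of the time. It would not by itself explain the hang.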

I have not been able to correlate any given query/connection/process to the
timing of the outages.
Sometimes they happen at peak usage periods, sometimes they happen in the
middle of the night.
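
To line the errors up against other logs, something like the following could bucket them by minute. It is a rough sketch that assumes every log line carries a YYYY-MM-DD HH:MM:SS timestamp as in the excerpt above. count_errors_per_minute is a name made up for this example, and the sample lines are modelled on that excerpt rather than copied from a real log.

```python
import re
from collections import Counter

# Matches the pgpool timestamp, e.g. "2009-11-05 11:33:18", and captures
# everything up to the minute.
TIMESTAMP = re.compile(r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2}")


def count_errors_per_minute(lines, needle="pcp_read() failed"):
    """Count log lines containing `needle`, grouped by minute."""
    counts = Counter()
    for line in lines:
        if needle not in line:
            continue
        match = TIMESTAMP.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample lines modelled on the excerpt; the third is invented filler that
# should be ignored because it does not contain the error text.
sample = [
    "Nov  5 11:33:18 host pgpool: 2009-11-05 11:33:18 ERROR: pid 12811: "
    "pcp_child: pcp_read() failed. reason: Success",
    "Nov  5 11:33:19 host pgpool: 2009-11-05 11:33:19 ERROR: pid 12811: "
    "pcp_child: pcp_read() failed. reason: Success",
    "Nov  5 11:34:02 host pgpool: 2009-11-05 11:34:02 some unrelated line",
]
for minute, n in sorted(count_errors_per_minute(sample).items()):
    print(minute, n)
```

The per-minute counts can then be compared with the postgres log or system load figures for the same minutes.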

I experienced this problem using pgpool-II v1.3 and have recently upgraded
to pgpool-II v2.2.5 but am still seeing the same issue.

It may be relevant to point out that I am running pgpool on one of the
machines that is also acting as a postgres backend and it is always the
postgres instance on the pgpool host that locks up.
This morning I moved the pgpool instance onto another one of the postgres
backend hosts in an effort to see if the cohabitation of pgpool and postgres
is causing problems, or if there is simply an issue with the postgres on that
host, or if this is just a coincidence.
I likely won't gain anything from this test for a day or more.

Also relevant is that I am running mammoth replicator and am only using
pgpool for connection load balancing and high availability.

Below is my pgpool.conf.

Any thoughts appreciated.

-steve crandell


#
# pgpool-II configuration file sample
# $Header: /cvsroot/pgpool/pgpool-II/pgpool.conf.sample,v 1.4.2.3 2007/10/12 09:15:02 y-asaba Exp $

# Host name or IP address to listen on: '*' for all, '' for no TCP/IP
# connections
#listen_addresses = 'localhost'
listen_addresses = '10.xxx.xxx.xxx'

# Port number for pgpool
port = 5432

# Port number for pgpool communication manager
pcp_port = 9898

# Unix domain socket path.  (The Debian package defaults to
# /var/run/postgresql.)
socket_dir = '/usr/local/pgpool'

# Unix domain socket path for pgpool communication manager.
pcp_socket_dir = '/usr/local/pgpool'

# Unix domain socket path for the backend. Debian package defaults to
# /var/run/postgresql!
backend_socket_dir = '/usr/local/pgpool'

# pgpool communication manager timeout. 0 means no timeout, but
# strongly not recommended!
pcp_timeout = 10

# number of pre-forked child process
num_init_children = 32

# Number of connection pools allowed for a child process
max_pool = 4

# If idle for this many seconds, child exits.  0 means no timeout.
child_life_time = 30

# If idle for this many seconds, connection to PostgreSQL closes.
# 0 means no timeout.
#connection_life_time = 0
connection_life_time = 30

# If child_max_connections connections were received, child exits.
# 0 means no exit.
# change
child_max_connections = 0

# Maximum time in seconds to complete client authentication.
# 0 means no timeout.
authentication_timeout = 60

# Logging directory (more accurately, the directory for the PID file)
logdir = '/usr/local/pgpool'

# Replication mode
replication_mode = false

# Set this to true if you want to avoid deadlock situations when
# replication is enabled.  There will, however, be a noticeable performance
# degradation.  A workaround is to set this to false and insert a /*STRICT*/
# comment at the beginning of the SQL command.
replication_strict = false

# When replication_strict is set to false, there will be a chance for
# deadlocks.  Set this to nonzero (in milliseconds) to detect this
# situation and resolve the deadlock by aborting current session.
replication_timeout = 5000

# Load balancing mode, i.e., all SELECTs except in a transaction block
# are load balanced.  This is ignored if replication_mode is false.
# change
load_balance_mode = true

# if there's a data mismatch between master and secondary
# start degeneration to stop replication mode
replication_stop_on_mismatch = false

# If true, replicate SELECT statement when load balancing is disabled.
# If false, it is only sent to the master node.
# change
replicate_select = true

# Semicolon separated list of queries to be issued at the end of a session
reset_query_list = 'ABORT; RESET ALL; SET SESSION AUTHORIZATION DEFAULT'

# If true print timestamp on each log line.
print_timestamp = true

# If true, operate in master/slave mode.
# change
master_slave_mode = true

# If true, cache connection pool.
connection_cache = false

# Health check timeout.  0 means no timeout.
health_check_timeout = 20

# Health check period.  0 means no health check.
health_check_period = 0

# Health check user
health_check_user = 'nobody'

# If true, automatically lock table with INSERT statements to keep SERIAL
# data consistency.  An /*INSERT LOCK*/ comment has the same effect.  A
# /*NO INSERT LOCK*/ comment disables the effect.
insert_lock = false

# If true, ignore leading white spaces of each query while pgpool judges
# whether the query is a SELECT so that it can be load balanced.  This
# is useful for certain APIs such as DBI/DBD which are known to add an
# extra leading white space.
ignore_leading_white_space = false

# If true, print all statements to the log.  Like the log_statement option
# to PostgreSQL, this allows for observing queries without engaging in full
# debugging.
log_statement = false

# If true, incoming connections will be printed to the log.
# change
log_connections = true

# If true, hostname will be shown in ps status. Also shown in
# connection log if log_connections = true.
# Be warned that this feature will add overhead to look up hostname.
log_hostname = false

# if non 0, run in parallel query mode
parallel_mode = false

# if non 0, use query cache
enable_query_cache = 0

#set pgpool2 hostname
pgpool2_hostname = ''

# system DB info
#system_db_hostname = 'localhost'
#system_db_port = 5432
#system_db_dbname = 'pgpool'
#system_db_schema = 'pgpool_catalog'
#system_db_user = 'pgpool'
#system_db_password = ''

# backend_hostname, backend_port, backend_weight
# here are examples
backend_hostname0 = 'db1.xxx.xxx'
backend_port0 = 5433
backend_weight0 = 0.0

backend_hostname1 = 'db2.xxx.xxx'
backend_port1 = 5433
backend_weight1 = 0.4

backend_hostname2 = 'db3.xxx.xxx'
backend_port2 = 5433
backend_weight2 = 0.6

# - HBA -

# If true, use pool_hba.conf for client authentication. In pgpool-II
# 1.1, the default value is false. The default value will be true in
# 1.2.
enable_pool_hba = false
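
For reference, the three backend_weight values above can be read as relative shares of the load-balanced SELECTs. The sketch below assumes pgpool scales the weights so that they sum to 1 and picks a backend at random in proportion. That is an assumption about its behaviour, not something taken from the source code, and pick_backend is a hypothetical name.

```python
import random

# Weights copied from the configuration above.
weights = {"db1.xxx.xxx": 0.0, "db2.xxx.xxx": 0.4, "db3.xxx.xxx": 0.6}


def normalized(weights):
    """Scale the weights so that they sum to 1."""
    total = sum(weights.values())
    return {host: w / total for host, w in weights.items()}


def pick_backend(weights, rng):
    """Pick one host at random, in proportion to its weight."""
    hosts = list(weights)
    return rng.choices(hosts, weights=[weights[h] for h in hosts], k=1)[0]


shares = normalized(weights)
rng = random.Random(0)  # fixed seed so repeated runs give the same picks
picks = [pick_backend(weights, rng) for _ in range(1000)]

for host in weights:
    print(host, shares[host], picks.count(host))
```

Under that assumption, db1 (weight 0.0) would receive no load-balanced SELECTs at all, and the rest would be split roughly 40/60 between db2 and db3.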