[Pgpool-general] pcp_child: pcp_read() failed. reason: Success

Fri Nov 6 22:17:07 UTC 2009

I've tried setting the local backend_hostname = ''
Same problems are occurring.
Pgpool has actually failed something like 4 separate times today, all but
one of them using this local socket configuration.

Any other thoughts?

thx
-s

On Thu, Nov 5, 2009 at 10:52 PM, Tatsuo Ishii <ishii at sraoss.co.jp> wrote:

> > Actually I was running pgpool on db2 (backend_hostname1) and am now
> > running it on db3 (backend_hostname2).
> > I have actually suspected that pgpool might be opting for some sort of
> > socket connection to the local instance of postgres instead of using the
> > TCP/IP connection parameters in an effort to speed things up.
> >
> > I have done my best to ensure that pgpool has completely separate socket
> > directories but it wouldn't be hard for pgpool to find a local postgres
> > socket if it wanted.  If I end up with another outage and this time db3 is
> > the postgres instance that locks up, I'll be fairly certain that this is the
> > problem but for the moment I can only speculate.
> >
> > I'm assuming you're suggesting I set backend_hostname0 = '' because it is
> > already weighted to 0.0 anyway?
>
> No, I suggested it because I thought you were running pgpool on db1. ''
> forces pgpool to use a UNIX domain socket. So if you are running pgpool on
> db3, you could set:
>
> backend_hostname2 = ''
>
> > I have db1 (backend_hostname0) weighted to 0.0 in an effort to direct all
> > selects to the two slave hosts (db2 and db3) but still benefit from pgpool
> > intelligently sending writes to db1.
> > db1 is the mammoth master host and needs all available i/o to deal with
> > writes.
> > My understanding is that this is how "master_slave_mode = true" works.
> > Writes are always directed to backend_hostname0.
> >
> > If I need to reevaluate that thinking, please advise but that has been
> > working for me for months now.
> >
> > thx
> > -s
> >
> > On Thu, Nov 5, 2009 at 9:26 PM, Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
> >
> > > Besides the useless error message from pcp_child (it seems someone
> > > believed that EOF would set an error number in the global errno
> > > variable; I will fix this anyway), it seems to me that socket files are
> > > going dead. I suspect a network stack bug could cause this but I'm
> > > not sure. One thing you might want to try is changing this:
> > >
> > > backend_hostname0 = 'db1.xxx.xxx'
> > >
> > > to:
> > >
> > > backend_hostname0 = ''
> > >
> > > This will make pgpool use a UNIX domain socket as the communication
> > > channel to PostgreSQL, rather than TCP/IP. It may or may not affect
> > > the problem you have, since the network code path in the kernel will be
> > > different.
> > >
> > > (I assume you are running pgpool on db1.xxx.xxx)
> > > --
> > > Tatsuo Ishii
> > > SRA OSS, Inc. Japan
> > >
> > > > Has anyone else run into this:
> > > >
> > > > My pgpool instance runs without problems for days on end and then suddenly
> > > > stops responding to all requests.
> > > > At the same moment, one of my three backend db hosts becomes completely
> > > > inaccessible.
> > > > Pgpool will not respond to shutdown, or even kill, and must be kill -9'd.
> > > > Once all pgpool processes are out of the way, the inaccessible postgres
> > > > server once again becomes responsive.
> > > > I restart pgpool and everything works properly for a few more days.
> > > >
> > > > At the moment the problem occurs, pgpool's log output, which typically
> > > > consists of just connection logging, turns into a steady stream of this:
> > > > Nov  5 11:33:18 src at obfuscated pgpool: 2009-11-05 11:33:18 ERROR: pid 12811:
> > > > pcp_child: pcp_read() failed. reason: Success
> > > > These errors show up sporadically in my pgpool logs all the time but don't
> > > > appear to have any adverse effects until the whole thing takes a dive.
> > > > I would desperately like to know what this error message is trying to tell
> > > > me.
> > > >
> > > > I have not been able to correlate any given query/connection/process to the
> > > > timing of the outages.
> > > > Sometimes they happen at peak usage periods, sometimes they happen in the
> > > > middle of the night.
> > > >
> > > > I experienced this problem using pgpool-II v1.3 and have recently upgraded
> > > > to pgpool-II v2.2.5 but am still seeing the same issue.
> > > >
> > > > It may be relevant to point out that I am running pgpool on one of the
> > > > machines that is also acting as a postgres backend and it is always the
> > > > postgres instance on the pgpool host that locks up.
> > > > This morning I moved the pgpool instance onto another one of the postgres
> > > > backend hosts in an effort to see if the cohabitation of pgpool and postgres
> > > > is causing problems, or if there is simply an issue with the postgres on that
> > > > host, or if this is just a coincidence.
> > > > I likely won't gain anything from this test for a day or more.
> > > >
> > > > Also relevant is that I am running mammoth replicator and am only using
> > > > pgpool for connection load balancing and high availability.
> > > >
> > > > Below is my pgpool.conf.
> > > >
> > > > Any thoughts appreciated.
> > > >
> > > > -steve crandell
> > > >
> > > >
> > > > #
> > > > # pgpool-II configuration file sample
> > > > # $Header: /cvsroot/pgpool/pgpool-II/pgpool.conf.sample,v 1.4.2.3
> > > > # 2007/10/12 09:15:02 y-asaba Exp $
> > > >
> > > > # Host name or IP address to listen on: '*' for all, '' for no TCP/IP
> > > > # connections
> > > > #listen_addresses = 'localhost'
> > > > listen_addresses = '10.xxx.xxx.xxx'
> > > >
> > > > # Port number for pgpool
> > > > port = 5432
> > > >
> > > > # Port number for pgpool communication manager
> > > > pcp_port = 9898
> > > >
> > > > # Unix domain socket path.  (The Debian package defaults to
> > > > # /var/run/postgresql.)
> > > > socket_dir = '/usr/local/pgpool'
> > > >
> > > > # Unix domain socket path for pgpool communication manager.
> > > > pcp_socket_dir = '/usr/local/pgpool'
> > > >
> > > > # Unix domain socket path for the backend. Debian package defaults to
> > > > # /var/run/postgresql!
> > > > backend_socket_dir = '/usr/local/pgpool'
> > > >
> > > > # pgpool communication manager timeout. 0 means no timeout, but
> > > > # strongly not recommended!
> > > > pcp_timeout = 10
> > > >
> > > > # number of pre-forked child process
> > > > num_init_children = 32
> > > >
> > > >
> > > > # Number of connection pools allowed for a child process
> > > > max_pool = 4
> > > >
> > > >
> > > > # If idle for this many seconds, child exits.  0 means no timeout.
> > > > child_life_time = 30
> > > >
> > > > # If idle for this many seconds, connection to PostgreSQL closes.
> > > > # 0 means no timeout.
> > > > #connection_life_time = 0
> > > > connection_life_time = 30
> > > >
> > > > # If child_max_connections connections were received, child exits.
> > > > # 0 means no exit.
> > > > # change
> > > > child_max_connections = 0
> > > >
> > > > # Maximum time in seconds to complete client authentication.
> > > > # 0 means no timeout.
> > > > authentication_timeout = 60
> > > >
> > > > # Logging directory (more accurately, the directory for the PID file)
> > > > logdir = '/usr/local/pgpool'
> > > >
> > > > # Replication mode
> > > > replication_mode = false
> > > >
> > > > # Set this to true if you want to avoid deadlock situations when
> > > > # replication is enabled.  There will, however, be a noticeable
> > > > # performance degradation.  A workaround is to set this to false and
> > > > # insert a /*STRICT*/ comment at the beginning of the SQL command.
> > > > replication_strict = false
> > > >
> > > > # When replication_strict is set to false, there will be a chance for
> > > > # deadlocks.  Set this to nonzero (in milliseconds) to detect this
> > > > # situation and resolve the deadlock by aborting current session.
> > > > replication_timeout = 5000
> > > >
> > > > # Load balancing mode, i.e., all SELECTs except in a transaction block
> > > > # are load balanced.  This is ignored if replication_mode is false.
> > > > # change
> > > > load_balance_mode = true
> > > >
> > > > # if there's a data mismatch between master and secondary
> > > > # start degeneration to stop replication mode
> > > > replication_stop_on_mismatch = false
> > > >
> > > > # If true, replicate SELECT statement when load balancing is disabled.
> > > > # If false, it is only sent to the master node.
> > > > # change
> > > > replicate_select = true
> > > >
> > > > # Semicolon separated list of queries to be issued at the end of a session
> > > > reset_query_list = 'ABORT; RESET ALL; SET SESSION AUTHORIZATION DEFAULT'
> > > >
> > > > # If true print timestamp on each log line.
> > > > print_timestamp = true
> > > >
> > > > # If true, operate in master/slave mode.
> > > > # change
> > > > master_slave_mode = true
> > > >
> > > > # If true, cache connection pool.
> > > > connection_cache = false
> > > >
> > > > # Health check timeout.  0 means no timeout.
> > > > health_check_timeout = 20
> > > >
> > > > # Health check period.  0 means no health check.
> > > > health_check_period = 0
> > > >
> > > > # Health check user
> > > > health_check_user = 'nobody'
> > > >
> > > > # If true, automatically lock table with INSERT statements to keep SERIAL
> > > > # data consistency.  An /*INSERT LOCK*/ comment has the same effect.  A
> > > > # /*NO INSERT LOCK*/ comment disables the effect.
> > > > insert_lock = false
> > > >
> > > > # If true, ignore leading white spaces of each query while pgpool judges
> > > > # whether the query is a SELECT so that it can be load balanced.  This
> > > > # is useful for certain APIs such as DBI/DBD which are known to add an
> > > > # extra leading white space.
> > > > ignore_leading_white_space = false
> > > >
> > > > # If true, print all statements to the log.  Like the log_statement option
> > > > # to PostgreSQL, this allows for observing queries without engaging in full
> > > > # debugging.
> > > > log_statement = false
> > > >
> > > > # If true, incoming connections will be printed to the log.
> > > > # change
> > > > log_connections = true
> > > >
> > > > # If true, hostname will be shown in ps status. Also shown in
> > > > # connection log if log_connections = true.
> > > > # Be warned that this feature will add overhead to look up hostname.
> > > > log_hostname = false
> > > >
> > > > # if non 0, run in parallel query mode
> > > > parallel_mode = false
> > > >
> > > > # if non 0, use query cache
> > > > enable_query_cache = 0
> > > >
> > > > #set pgpool2 hostname
> > > > pgpool2_hostname = ''
> > > >
> > > > # system DB info
> > > > #system_db_hostname = 'localhost'
> > > > #system_db_port = 5432
> > > > #system_db_dbname = 'pgpool'
> > > > #system_db_schema = 'pgpool_catalog'
> > > > #system_db_user = 'pgpool'
> > > > #system_db_password = ''
> > > >
> > > > # backend_hostname, backend_port, backend_weight
> > > > # here are examples
> > > > backend_hostname0 = 'db1.xxx.xxx'
> > > > backend_port0 = 5433
> > > > backend_weight0 = 0.0
> > > >
> > > > backend_hostname1 = 'db2.xxx.xxx'
> > > > backend_port1 = 5433
> > > > backend_weight1 = 0.4
> > > >
> > > > backend_hostname2 = 'db3.xxx.xxx'
> > > > backend_port2 = 5433
> > > > backend_weight2 = 0.6
> > > >
> > > >
> > > >
> > > > # - HBA -
> > > >
> > > > # If true, use pool_hba.conf for client authentication. In pgpool-II
> > > > # 1.1, the default value is false. The default value will be true in
> > > > # 1.2.
> > > > enable_pool_hba = false
> > >
>
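
A note on the "reason: Success" text in the quoted log line. As Tatsuo
explains above, a read that hits EOF is not an OS-level error, so no error
number is set; formatting error number 0 anyway yields the platform's "no
error" string (typically "Success" on Linux with glibc). The sketch below
shows the same behaviour in Python rather than in pgpool's C source. The
names read_or_eof and misleading_reason are hypothetical and are not pgpool
functions; this is only an illustration of the idea, not the actual fix.

```python
import os
import socket


def read_or_eof(sock, nbytes):
    """Read up to nbytes and return (data, reason); reason is None on success."""
    try:
        data = sock.recv(nbytes)
    except OSError as exc:
        # A genuine failure: the OS supplied an error number we can describe.
        return b"", os.strerror(exc.errno)
    if data == b"":
        # EOF: the peer closed the connection. No error number was set,
        # so report the condition explicitly instead of formatting one.
        return b"", "EOF (peer closed the connection)"
    return data, None


def misleading_reason():
    # What a log message looks like when error number 0 is formatted
    # as though an error had occurred (platform-dependent wording).
    return os.strerror(0)


if __name__ == "__main__":
    a, b = socket.socketpair()
    b.close()  # simulate the peer going away
    print(read_or_eof(a, 16))
    print("formatting error number 0 gives:", misleading_reason())
    a.close()
```

Read this way, each "pcp_read() failed. reason: Success" line most likely
means the other end of the PCP connection closed it, which matches the
messages also appearing while pgpool is otherwise healthy.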
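
A note on the backend_weight values in the quoted pgpool.conf (0.0, 0.4 and
0.6). The thread describes the intent as keeping SELECTs off db1 and
splitting them between db2 and db3. The sketch below assumes that the
weights act as relative ratios for a weighted random choice; it is not
pgpool's actual selection code, and normalized_weights and pick_backend are
hypothetical names used only for illustration.

```python
import random


def normalized_weights(weights):
    """Scale weights so they sum to 1.0, giving each backend's expected share."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one backend needs a positive weight")
    return [w / total for w in weights]


def pick_backend(weights, rng):
    """Return a backend index chosen at random in proportion to its weight."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


if __name__ == "__main__":
    weights = [0.0, 0.4, 0.6]  # db1, db2, db3 as in the quoted config
    rng = random.Random(2009)  # fixed seed so repeated runs match
    counts = [0, 0, 0]
    for _ in range(10000):
        counts[pick_backend(weights, rng)] += 1
    print("expected shares:", normalized_weights(weights))
    print("simulated SELECT counts:", counts)
```

Under that assumption a backend with weight 0.0 is never chosen for a
load-balanced SELECT, which is consistent with db1 being reserved for writes
as described in the thread.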