[pgpool-general: 7784] Re: Error: sorry, too many clients

michail alexakis drmalex07 at gmail.com
Wed Oct 13 18:00:51 JST 2021


Hello Bo,

In our environment:
reserved_connections=0
listen_backlog_multiplier=2

So, if I understand correctly, we have 2 different problematic cases:

(1) a client silently disappears (doesn't properly close the connection):
In this case, setting `client_idle_limit` to a non-zero value (say 15min)
could mitigate the problem, and eventually the child worker will become
available again to handle an incoming connection (see the sketch after
case (2) below).

(2) a server-side child process crashes: In this case, Pgpool still keeps a
dangling reference to this process, and any client routed to it will get a
"sorry, too many clients" error. For now, this problem cannot be mitigated
unless the (parent) Pgpool process is restarted.
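
For case (1), a minimal pgpool.conf sketch of what we intend to try (the
900-second value is just an illustration, not a recommendation):

  # disconnect a client that has been idle for more than 15 minutes,
  # so that the child process becomes available for new connections again
  client_idle_limit = 900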

One more question:
What can make a child process crash? (I mean, what are the most common
reasons for that?)
Does such a crash leave any trace (that I should search for) in Pgpool's
error log? Does the parent Pgpool process get notified of it (e.g. via
waitpid or SIGCHLD)?
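
Meanwhile, the naive check I plan to run is a grep over the Pgpool log.
This assumes child terminations are logged with a message that mentions the
child process and its exit status or signal; the exact wording may differ
between versions:

  # hypothetical check -- adjust the log path and the pattern to your setup
  grep -iE 'child process.*(exit|signal)' /var/log/pgpool/pgpool.log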

Best regards, Michail



On Wed, Oct 13, 2021 at 10:10 AM Bo Peng <pengbo at sraoss.co.jp> wrote:

> Hello,
>
> I could reproduce this "sorry, too many clients" issue,
> even if reserved_connections=0 and pgpool child processes are available.
>
> The cause is that a pgpool child process to which a client was connected
> terminated unexpectedly.
> After that, if a client connects to that terminated child process, the
> "sorry, too many clients" error occurs.
>
> So far, the only workaround is to restart pgpool.
> We are considering a solution to avoid the problem.
>
> Regards,
>
> > Hello,
> >
> > I want to confirm whether the message "sorry, too many clients" is
> > issued by Pgpool, because Pgpool-II issues "sorry, too many clients"
> > only when reserved_connections is a non-zero value.
> > What is the setting of "reserved_connections"?
> >
> > > I will try the `client_idle_limit` parameter.
> > > I was reluctant to use it because some clients probably won't like it
> > > (but it may be easier to fix it client-side).
> > >
> > > I have a basic question: how can I detect leaked (occupied and not
> > > reused) connection slots?
> > > Can `pcp_proc_info --all` provide some meaningful information in this
> > > direction?
> >
> > You can use the pcp_proc_info command to confirm the used (occupied)
> > connections.
> > However, the "not reused" connections are due to your application.
> > You may need to find out on the application side why the application
> > doesn't reuse the pooled connections.
> >
> > For example, application processes terminated unexpectedly, but the
> > connections remained on the pgpool side.
> >
> > BTW, in the coming major version 4.3, pcp_proc_info command is able to
> > display more useful information such as:
> > - connection created timestamp
> > - last client connected timestamp
> > - client idle duration (sec)
> > - last client disconnected timestamp
> > - connection-reuse counter
> > - etc.
> >
> > > Let M be the maximum number (worst case) of connections that clients
> > > can ask for.
> > > Let N be the number of per-backend connection slots
> > > (= num_init_children * max_pool) available in Pgpool.
> > > Of course: M < N.
> > >
> > > I can measure the number m of occupied slots (say for backend #0) as:
> > > m=$(pcp_proc_info ${pcp_opts} --all | grep '0$' | grep -c -v '^[[:space:]]')
> > >
> > > If m grows greater than M for some "steady" period of time (say for
> > > over 10min), can I interpret this as a sign of leaked (non-reused)
> > > connection slots?
> >
> > I think m cannot be greater than M,
> > because M is the maximum number of connections that clients can ask for.
> >
> >   M = num_init_children - reserved_connections
> >
> > Regards,
> >
> > > On Tue, Oct 12, 2021 at 7:10 AM Bo Peng <pengbo at sraoss.co.jp> wrote:
> > >
> > > > Hello,
> > > >
> > > > > Hello PgPool
> > > > >
> > > > > We are using PgPool 4.2.4 (from Alpine 3.13) as the frontend of a
> > > > > master/standby PostgreSQL 10.14 cluster (2 synchronous standbys in
> > > > > streaming replication mode). PgPool is operating as a load
> > > > > balancer and a connection pool.
> > > > >
> > > > > PgPool forks 90 pool workers (num_init_children=90, max_pool=2).
> > > > > All PostgreSQL backend servers are configured properly to accept a
> > > > > maximum of 190 connections (so, num_init_children * max_pool is
> > > > > below this maximum with a safety margin of 10).
> > > > >
> > > > > The relevant (pool-specific) part of the configuration is:
> > > > > num_init_children = 90
> > > > > max_pool = 2
> > > > > child_life_time = 5min
> > > > > child_max_connections = 0
> > > > > connection_life_time = 0
> > > > > client_idle_limit = 0
> > > > >
> > > > > After a while (5-10 days of operation), we face several errors of
> > > > > the type "Sorry, too many clients". These errors are not
> > > > > originating from a PostgreSQL backend (error logs are completely
> > > > > clear and pg_stat_activity shows no more than 140 (from 190)
> > > > > active connections).
> > > > >
> > > > > This problem has been around for some time now, and we have to
> > > > > resort to restarting the entire PgPool service. Of course, this is
> > > > > not a perfect solution because it disrupts some database clients
> > > > > (so, we also have to cascade those restarts to some other
> > > > > services, and so on).
> > > > >
> > > > > The majority of database clients are connecting from a Hikari pool
> > > > > (Java Spring/Boot) which creates long-standing connections
> > > > > (i.e. not one per-transaction).
> > > > >
> > > > > Any light on this issue is highly appreciated!
> > > >
> > > > The connections haven't been reused?
> > > > Because you are using connection pooling between client and pgpool,
> > > > I think the cached connections should be reused.
> > > >
> > > > You may need to configure the "client_idle_limit" parameter
> > > > to prevent the Pgpool-II child processes from being occupied for a
> > > > long time by broken connections.
> > > >
> > > >
> > > >
> > > > https://www.pgpool.net/docs/latest/en/html/runtime-config-connection-pooling.html#GUC-CLIENT-IDLE-LIMIT
> > > >
> > > > --
> > > > Bo Peng <pengbo at sraoss.co.jp>
> > > > SRA OSS, Inc. Japan
> > > > http://www.sraoss.co.jp/
> > > >
> >
> >
> > --
> > Bo Peng <pengbo at sraoss.co.jp>
> > SRA OSS, Inc. Japan
> > http://www.sraoss.co.jp/
> > _______________________________________________
> > pgpool-general mailing list
> > pgpool-general at pgpool.net
> > http://www.pgpool.net/mailman/listinfo/pgpool-general
>
>
> --
> Bo Peng <pengbo at sraoss.co.jp>
> SRA OSS, Inc. Japan
> http://www.sraoss.co.jp/
>