[pgpool-general: 2157] Re: pgpool latency, hesitation under load

Tatsuo Ishii ishii at postgresql.org
Wed Sep 25 09:02:51 JST 2013


Thanks for your precise report. Your analysis is very helpful and I'm
going to think about how to enhance the connection queuing mechanism
used in pgpool-II. In the mean time I think adding attachments should
be allowed in the list. I'm very interested in the graph and query
plan. If you have problem in attaching them, please let me know.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

> Hello,
> Thanks for your message.
> Yes, we were cascading pgpool-II.
> At first, we tried only one, but when connections exceeded around 10% the
> available pool, it increased latency for new connections to up 30 seconds.
> By cascading pooling the maximum latency reduced to around 3 seconds.
> But even 3 seconds were too much trouble at peak times. 
> The web application impose around 2 thousand transactions per second, and even
> more at peak times.
> By my limited knowledge about pgpool2, it seems that each process forked has its
> own queue. Therefore, one given connection could be assigned by kernel to a long
> queue composed of long running queries, while at the same moment a side process
> containing short queries has its queue flushed.
> As we observed many queries running at 0.5 seconds, it takes only 6 queries
> queued at same process to sum 3 seconds.
> Our team gave pgbouncer a try, as it uses threads and linux kernel assigns tcp
> connections to threads in the same process, afaik.
> The results were astounding for pooling, with clear visibility of queues by
> pgbouncer system queries. The latencies were reduced to around 0.5 seconds most
> of time. At some peak hours one or other frontend reach 2 seconds, rarely.
> (we have graphs of this difference, but this list may not allow attachments)
> It seems that the kernel logic to assign tcp connections to process queues is
> causing this behaviour.
> The custom queries for monitoring in pgbouncer show the waiting connections
> clearly at peak hours and they are flushed very fast. I guess it is because of
> the single queue.
> A very used feature by application is the reserved connections for slow queries.
> Is there a way to configure pgpool to behave similarly and show waiting
> connections by process?
> 
> We use pg_activity, zabbix, log threshold, and queries agains pg_stats to
> monitor backend. This way we see that application issues thousands of short
> queries and many queries between 0.1 to 0.5 seconds all time. Sometimes it
> issues very long queries of 30 seconds.
> The server has enough cpu for the present load (after many tunings and
> indexing), as now is running around 12%.
> 
> There has been some weeks and app versions since we blacklisted some tables, but
> as far I can remember, some query cache data was invalidated during the
> execution of some queries before other portion of query was executed. The
> queries were complex and  execution plan was not simple, so was reasonable to
> pgpool, that do not have the execution plan, to not process the query the same
> way backend do.
> Since then, as our cache hit was reduced to 0.55, we disabled query cache (as
> stated on docs if below 0.7)  and penalty on backend, giving our present
> environment  (running on RAM) was negligible.
> I have one of such queries recorded and  its present execution plan and may
> attach the txt file if list allows it.
> 
> Regards.
> Andre Felipe
> http://www.techforce.com.br
> 
> 
>  
> 
> 
> On 18/Sep/2013 20:09 Tatsuo Ishii <ishii at postgresql.org> wrote ..
>> I am a little bit confused.
>> 
>> You said:
>> > Using pgpool 3.3.0 as connection pooling (1 backend) with shared memory query
>> and:
>> > The pgpool at each web frontend does not have query cache enabled.
>> 
>> So you are "cascading" pgpool-II?
>> 
>> Also I am confused by this:
>> > Most of queries are below 0,5 seconds.
>> and this:
>> > answer psql requests under 100 ms, about 12% cpu load. Zabbix monitoring.
>> 
>> In the mean time I'm interested in this:
>> > complex queries, because postgresql used a non literal execution and query cache
>> > caused errors.
>> 
>> What kind of errors exactly do you have?
>> --
>> Tatsuo Ishii
>> SRA OSS, Inc. Japan
>> English: http://www.sraoss.co.jp/index_en.php
>> Japanese: http://www.sraoss.co.jp
>> 
>> > Hello,
>> > Using pgpool 3.3.0 as connection pooling (1 backend) with shared memory query
>> > cache, and connection pooling only, we are trying to spot a misconfiguration
>> > that is causing a hesitation, latency, staggering under load.
>> > During these hesitation periods, sgbd transactions per second drops.
>> > If we connect web server directly to sgbd backend (too many processes open) 
>> > obviously %sys cpu is too high, but there are not staggering latencies.
>> > Sometimes a delay of up o 2.5 seconds to get a new connection for the web app.
>> > Also, we had to use 2 level pooling (one level at each web server, other in 
>> > front of sgbd) because when connections exceeded about 10% of num_init_children,
>> > it hesitated for about 30 seconds. 
>> > num_init_children = 160
>> > max_pool = 1
>> > When leaving child_max_connections at 0, the problem is reduced, but still 
>> > happens.
>> > connection_life_time = 360
>> > client_idle_limit = 360
>> > The backend is executing about 2000 transactions per second, around 300 
>> > connections per second.
>> > Most of queries are below 0,5 seconds.
>> > Query cache hit at same machine is 0.55. We had to blacklist some tables with
>> > complex queries, because postgresql used a non literal execution and query cache
>> > caused errors.
>> > The pgpool at each web frontend does not have query cache enabled.
>> > The postgresql 9.2.4 backend on debian linux 7.1, kernel 3.2, 40 cores 80 
>> > threads, 512 GB ram, raid controller 1GB NVRAM,  is running db on RAM and always
>> > answer psql requests under 100 ms, about 12% cpu load. Zabbix monitoring.
>> > max_connections = 256
>> > 
>> > How could we track our misconfiguration that cause pgpool latency staggering?
>> > 
>> > Regards.
>> > Andre Felipe Machado
>> > http://www.techforce.com.br


More information about the pgpool-general mailing list