[pgpool-general: 498] Re: load balancing seems to be bottlenecked by performance of master

Lonni J Friedman netllama at gmail.com
Fri May 25 01:36:38 JST 2012


On Mon, May 21, 2012 at 6:13 PM, Lonni J Friedman <netllama at gmail.com> wrote:
> On Mon, May 21, 2012 at 5:24 PM, Tatsuo Ishii <ishii at postgresql.org> wrote:
>>> On Mon, May 21, 2012 at 3:57 PM, Tatsuo Ishii <ishii at postgresql.org> wrote:
>>>>> Greetings,
>>>>> I'm running pgpool-II-3.1.2 purely for load balancing in front of a 4
>>>>> node postgresql-9.1.3 cluster (all running on Linux-x86_64).  I'm
>>>>> using streaming replication with 3 hot standby servers.  I have the
>>>>> weighting distributed such that the master weight is set to '1' and
>>>>> the three standby servers are set to '33' each.  My intent was that
>>>>> this would nearly eliminate all read queries going to the master,
>>>>> however what I'm observing is that the overall performance seems to be
>>>>> bottlenecked by the performance of the master, even when queries
>>>>> aren't ultimately getting sent to the master.
>>>>>
>>>>> For example, there are times when the load on the master is much
>>>>> higher than on the standby servers, and the master's overall
>>>>> performance is noticeably degraded.  When this occurs, I've found that
>>>>> establishing a database connection through pgpool with a SQL query
>>>>> that should normally go to one of the standby servers has very poor
>>>>> performance.  It seems like pgpool is silently doing something with
>>>>> the master (which is under load and slower to respond), before passing
>>>>> the query itself off to a standby server.
>>>>>
>>>>> Is this expected behavior?
>>>>
>>>> Yes. Pgpool needs to send query to system catalogs on primary to
>>>> analyze user queries (for example, if the refereed table is a
>>>> temporary table or not).
>>>
>>> Thanks, that's good to know, but also disappointing.  This behavior is
>>> serving as a huge bottleneck in my environment.  It basically means
>>> that no standby server can respond any faster than the master, which
>>> defeats much of the benefit of load balancing.
>>>
>>> Is there any way to work around this, or mitigate it to some degree?
>>> For example, I'm not using temp tables for anything in my environment,
>>> so not performing that check wouldn't have any impact on me.
>>
>> I thought the access to system catalogs are not big pain because
>> pgpool has session life time caches for this. However it seems they
>> might be bottle neck in your case. I would like to know what kind of
>> quries are major bottle neck in your case. Could you show me the query
>> log on the primary?
>
> I don't have a query log, as we're not logging queries.  I guess I
> could enable query logging in pgpool if you think that's going to help
> you investigate this further, but I'm concerned that's going to create
> significant disk activity on the pgpool server.
>
> But the issue isn't that specific queries are a bottleneck but rather
> that if there's a large amount of write activity, the load on the
> master will be much greater than on the standby/slaves, thereby
> reducing the performance of both the master & the standby/slaves.

Does the lack of reply mean that I'm basically out of luck on this?
Personally, I see this behavior as a significant flaw in pgpool's
design.  Its not true load balancing if every query is gated by the
performance of 1 server.


More information about the pgpool-general mailing list