[pgpool-general: 556] Re: load balancing seems to be bottlenecked by performance of master

Lonni J Friedman netllama at gmail.com
Wed May 30 07:55:49 JST 2012


On Tue, May 29, 2012 at 3:47 PM, Tatsuo Ishii <ishii at postgresql.org> wrote:
>> On 05/29/2012 07:40 AM, Lonni J Friedman wrote:
>>> On Mon, May 28, 2012 at 10:28 PM, Matt Solnit<msolnit at soasta.com>
>>> wrote:
>>>> On May 28, 2012, at 6:53 PM, Lonni J Friedman<netllama at gmail.com>
>>>> wrote:
>>>>> On Mon, May 28, 2012 at 5:23 PM, Tatsuo Ishii<ishii at postgresql.org>
>>>>> wrote:
>>>>>>> On Mon, May 28, 2012 at 3:54 PM, Tatsuo Ishii<ishii at sraoss.co.jp>
>>>>>>> wrote:
>>>>>>>>> What are the reasons for analysing system catalogs on primary server?
>>>>>>>>
>>>>>>>> For example, if a table is a temporary one or not.
>>>>>>>
>>>>>>> Yes, but as I noted, I don't use temp tables at all. ?If this is the
>>>>>>> primary justification, then its not doing me any good, and causing
>>>>>>> unnecessary negative performance impact.
>>>>>>
>>>>>> But how does pgpool know that you are not going to use temporary
>>>>>> tables beforehand?
>>>>>
>>>>> Provide a new pgpool.conf option that tells it to ignore them (with
>>>>> the assumption that they do not exist).
>>>>
>>>> +1 for new pgpool.conf setting, although I think the default should be
>>>> the
>>>> current behavior, for backward compatibility.
>>>>
>>>> Maybe something like "enable_temp_tables", although that might be too
>>>> vague.  "on" means current behavior, all system catalog queries go to
>>>> the
>>>> master.  "off" means that system catalog queries are load-balanced
>>>> just
>>>> like any other.
>>>
>>> agreed.
>>>
>>> Another idea is to create a new pgpool.conf option which makes the
>>> system catalogue check a configurable interval, as one of the
>>> following:
>>> * every query (current/default behavior)
>>> * when pgpool is started only
>>> * once every X minutes or X queries
>>>
>>> This would provide people with a tradeoff between performance and
>>> accuracy.  Anyone who never or rarely makes changes (and/or never uses
>>> temp tables) would likely prefer the 2nd or 3rd setting, while those
>>> who often make changes and/or use temp tables, would stick with the
>>> default (current behavior).
>>
>> Yeah that's sort of where I was going with it, it's apparently not
>> easy to move those checks.
>>
>> enable_temp_tables=y/n seems like a possible workaround for now, but I
>> have to assume that there are people out there that want to use
>> temp/unlogged tables and also will be write heavy and want to scale
>> out in this manner.
>
> Before jump into any of these ideas, I would like to confirm if the
> those accesses are really a bottle neck or not. Pgpool caches results
> of accessing system catalogs and the cache persists as long as the
> connection pool remains. So unless your set the life time of
> connection pool very short, I guess those system catalog accesses are
> not the bottle neck in the real world.

My settings are as follows:
child_life_time = 300
child_max_connections = 2990
connection_life_time = 0
client_idle_limit = 180

and I have the weighting split 33/33/33/1 with 33 going to the read
servers, and 1 going to the master.

All that I'm certain of is that the responsiveness of the database
cluster to any query degrades as the load on the master increases.
I've seen numerous cases where the standby/slave servers had a load of
less than 1.00, yet I was still waiting seconds to get a response from
a trivial query.  I've even had cases where I've manually run the same
query on the standby/slave directly, and it came before the query
going through pgpool.


More information about the pgpool-general mailing list