[pgpool-general: 545] Re: load balancing seems to be bottlenecked by performance of master

Tatsuo Ishii ishii at postgresql.org
Tue May 29 11:45:05 JST 2012


> On 05/28/2012 06:52 PM, Tatsuo Ishii wrote:
>>> On 05/28/2012 06:43 PM, Tatsuo Ishii wrote:
>>>>> On 05/28/2012 05:55 PM, Tatsuo Ishii wrote:
>>>>>>> On Mon, May 28, 2012 at 5:23 PM, Tatsuo Ishii<ishii at postgresql.org>
>>>>>>> wrote:
>>>>>>>>> On Mon, May 28, 2012 at 3:54 PM, Tatsuo Ishii<ishii at sraoss.co.jp>
>>>>>>>>> wrote:
>>>>>>>>>>> What are the reasons for analysing system catalogs on primary server?
>>>>>>>>>>
>>>>>>>>>> For example, if a table is a temporary one or not.
>>>>>>>>>
>>>>>>>>> Yes, but as I noted, I don't use temp tables at all.  If this is the
>>>>>>>>> primary justification, then it's not doing me any good, and it is
>>>>>>>>> causing an unnecessary negative performance impact.
>>>>>>>>
>>>>>>>> But how does pgpool know that you are not going to use temporary
>>>>>>>> tables beforehand?
>>>>>>>
>>>>>>> Provide a new pgpool.conf option that tells it to ignore them (with
>>>>>>> the assumption that they do not exist).
>>>>>>>
>>>>>>>>
>>>>>>>>> I understand if this isn't something you can fix right now, but I'm
>>>>>>>>> not even getting the impression that you consider this to be a design
>>>>>>>>> flaw.  A high write volume on the master should never impact the
>>>>>>>>> response time of any standby/slave with a read query.  This literally
>>>>>>>>> means that pgpool doesn't scale well in write heavy environments.
>>>>>>>>
>>>>>>>> That's why I asked whether you have any idea how to solve the problem.
>>>>>>>
>>>>>>> I guess I don't understand why pgpool needs to look up the system
>>>>>>> catalogs on the write server.   Shouldn't they be identical on all
>>>>>>> servers?
>>>>>>
>>>>>> You cannot assume that, because streaming replication and Slony are
>>>>>> both asynchronous replication.
>>>>>>
>>>>>> Also remember that temp tables can only be used on primary.
>>>>>
>>>>> could you create a list of valid tables at startup and periodically
>>>>> poll for new tables? if it's an unlogged table you'll know, and if
>>>>> it's a temp table (or a very new table) you just wouldn't allow it to
>>>>> be load balanced.
>>>>
>>>> Interesting idea. However I am afraid you will not know if it's a temp
>>>> table or not until you query the system catalog.
>>>
>>> you wouldn't need to, if it's not in your list of "ok" tables it
>>> doesn't get load balanced. temp and unlogged tables would never be in
>>> the list. Worst case would be that a valid new table wouldn't be
>>> allowed to be load balanced until the list was refreshed.
>>
>> Ok, so your idea is that any new table (regardless of whether it's a
>> regular, temp or unlogged one) will not be load balanced if it's not
>> in the predefined list.
> 
> right, and you could have a hook into pgpool reload to refresh the
> table, or have pgpool poll the system catalog on a regular (and
> tuneable) interval.
> 
>> Another issue with this approach is that it may not work if the user
>> changes the schema search path on the fly. For example, a user could
>> have a table "foo" in the public schema *and* a table also named "foo"
>> in schema "bar". If the list contains "foo" in public only, we need to
>> distinguish "foo" in public from "foo" in bar. For this purpose we
>> need to check the schema search path of this session, which is only
>> doable on the fly.
> 
> hmm, how do you handle that situation now? as long as the in-memory
> list had schema defined it seems like it would be still doable.

We ask the backend by using pgpool_regclass() (a variant of regclass()
of PostgreSQL).  Doing the same job inside pgpool is not easy as far as
I know, because we would need to consider the access permissions of
each schema.
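
To make the discussion concrete, here is a minimal Python sketch of the
proposed in-memory list with schema-qualified entries. It is only an
illustration, not pgpool code: the class TableAllowList and its methods
are hypothetical names, and the search path is assumed to be supplied
by the caller for each session.

```python
from typing import Iterable, Optional, Set, Tuple

QualifiedName = Tuple[str, str]  # (schema, table)


class TableAllowList:
    """Hypothetical list of tables considered safe to load balance."""

    def __init__(self, tables: Iterable[QualifiedName]) -> None:
        self._tables: Set[QualifiedName] = set(tables)

    def refresh(self, tables: Iterable[QualifiedName]) -> None:
        # Called on reload or on a timer with a fresh catalog snapshot.
        self._tables = set(tables)

    def resolve(self, name: str,
                search_path: Iterable[str]) -> Optional[QualifiedName]:
        # A schema-qualified name is looked up directly.
        if "." in name:
            schema, table = name.split(".", 1)
            candidate = (schema, table)
            return candidate if candidate in self._tables else None
        # An unqualified name takes the first schema on the session's
        # search path that holds a listed table of that name.
        for schema in search_path:
            if (schema, name) in self._tables:
                return (schema, name)
        return None

    def can_load_balance(self, name: str,
                         search_path: Iterable[str]) -> bool:
        # Anything not in the list (new, temp, unlogged) is refused.
        return self.resolve(name, search_path) is not None


allow = TableAllowList([("public", "foo")])
print(allow.can_load_balance("foo", ["bar", "public"]))
print(allow.can_load_balance("bar.foo", ["bar", "public"]))
```

The sketch also shows where the approach gets hard. It does not check
whether the session has permission to use each schema, and it cannot
see a table that is missing from the list but would shadow a listed
one (for example a temporary table named "foo", since temporary tables
are searched before the schemas in the search path by default). Asking
the backend avoids both problems.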
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

