[pgpool-hackers: 3143] Re: Proposal: allow to share query cache by multiple Pgpool-II instances
ishii at sraoss.co.jp
Tue Nov 20 11:25:13 JST 2018
[This is a repost of [pgpool-hackers: 2845] to follow the original
thread just for an archiving purpose.]
It turned out that the bottle neck was the lock acquisition. Since
there's no lock API provided by memcached, I simulated it by creating
an object in memcached. If someone else already created the object,
pgpool child sleep 100 ms and retries. If I decrease the sleep time to
as small as 1 micro sec, the TPS increased from 100 to 9600. However,
current master branch performs 16000 TPS. So still performance is too
low. Moreover, I constantly get error from memcached saying that
creating object is too large. In this implementation, I have key =
db_oid/table_oid, data = array of query cache hash. The array could
So I tried to change the implementation so that key =
db_oid/table_oid/sequence_number, data = single_query_cache_hash. With
this, I didn't have the error any more. But the TPS is as low as 6000.
Moreover, memcached seems to feel free purge the data. (I checked it
by using memccat). It's fine that the query result cache is
gone. Since in that case we can retrieve fresh data from PostgreSQL
anyway. However, losing cache invalidating data will lead to cache
consistency problem. So this is not acceptable.
In summary, my challenge to implement cross-pgpool query result cache
seems going into a dead end: less performance and loss of cache
Probably time to abandon the attempt.
SRA OSS, Inc. Japan
> I have tested this using pgbench -S and the result was disappointing.
> I guess the bottle neck is, large data updating, which is necessary to
> maintain query cache hash vs. table oid list. I will work on improving
> this, but if I still get poor performance, maybe I should abandon
From: Tatsuo Ishii <ishii at sraoss.co.jp>
Subject: [pgpool-hackers: 2839] Proposal: allow to share query cache by multiple Pgpool-II instances
Date: Fri, 22 Jun 2018 13:48:03 +0900 (JST)
Message-ID: <20180622.134803.1464886995477839487.t-ishii at sraoss.co.jp>
> Currently, Pgpool-II does not share query cache data among Pgpool-II
> servers. If the query cache is on shared memory, without doubt this is
> impossible or at least vary hard. However if we use memcached for the
> query cache storage, we can already share the cache among Pgpool-II
> servers today. The only problem is cache validation. Since Pgpool-II
> relies on table oid data stored in the local files, Pgpool-II server
> which does not have the file will not invalidate query cache even if
> DML is sent from clients and this leads to cache inconsistency state.
> To solve the problem, I propose to store the data used for cache
> invalidation in the memcached server, rather than local files when
> using memcached for the cache storage.
> The data used for invalidating query cache in the memcached will be:
> Key: database oid (32 bits)+table oid (32 bits)
> Data: array of cache key (32 bytes) which use the table in the SELECT query.
> Memcached allows up to 1MB of data for each cache entry, so we could
> have up to 32768 cache entries for a table specified by the data above.
> In addition to above, we need to have index data when drop database
> command gets executed. The for the data will be:
> Key: database oid (32 bits)
> Data: array of table oids (32 bits)
> BTW before implementing this, I would like to split the one giant
> query related source file (src/query_cache/pool_memqcache.c, about 5k
> lines) into multiple smaller files to make implementations and future
> maintenance works easier.
> Best regards,
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> pgpool-hackers mailing list
> pgpool-hackers at pgpool.net
More information about the pgpool-hackers