[pgpool-hackers: 3124] New feature proposal: shared relation cache

Tatsuo Ishii ishii at sraoss.co.jp
Thu Nov 15 13:20:44 JST 2018


* Proposal background

Unlike PostgreSQL Pgpool-II cannot directly access system catalogs, it
issues SQL to PostgreSQL system catalog whenever necessary. For
example to execute a simple query

select count(*) from pgbench_accounts;

Pgpool-II issues 9 SELECTs to system catalogs:

2018-11-15 12:57:34: pid 14885: LOG:  DB node id: 0 backend pid: 14904 statement: SELECT count(*) FROM pg_catalog.pg_proc AS p WHERE p.proname = 'count' AND p.provolatile = 'i'
2018-11-15 12:57:34: pid 14885: LOG:  DB node id: 0 backend pid: 14904 statement: SELECT count(*) FROM pg_catalog.pg_class AS c, pg_attribute AS a WHERE c.relname = 'pg_class' AND a.attrelid = c.oid AND a.attname = 'relistemp'
2018-11-15 12:57:34: pid 14885: LOG:  DB node id: 0 backend pid: 14904 statement: SELECT count(*) FROM pg_catalog.pg_class AS c, pg_namespace AS n WHERE c.relname = 'pgbench_accounts' AND c.relnamespace = n.oid AND n.nspname ~ '^pg_temp_'
2018-11-15 12:57:34: pid 14885: LOG:  DB node id: 0 backend pid: 14904 statement: SELECT count(*) from (SELECT has_function_privilege('t-ishii', 'pg_catalog.to_regclass(cstring)', 'execute') WHERE EXISTS(SELECT * FROM pg_catalog.pg_proc AS p WHERE p.proname = 'to_regclass')) AS s
2018-11-15 12:57:34: pid 14885: LOG:  DB node id: 0 backend pid: 14904 statement: SELECT count(*) FROM pg_catalog.pg_class AS c WHERE c.relname = 'pg_namespace'
2018-11-15 12:57:34: pid 14885: LOG:  DB node id: 0 backend pid: 14904 statement: SELECT count(*) FROM pg_class AS c, pg_namespace AS n WHERE c.oid = pg_catalog.to_regclass('"pgbench_accounts"') AND c.relnamespace = n.oid AND n.nspname = 'pg_catalog'
2018-11-15 12:57:34: pid 14885: LOG:  DB node id: 0 backend pid: 14904 statement: SELECT count(*) FROM pg_catalog.pg_class AS c WHERE c.oid = pg_catalog.to_regclass('"pgbench_accounts"') AND (c.relkind = 'v' OR c.relkind = 'm')
2018-11-15 12:57:34: pid 14885: LOG:  DB node id: 0 backend pid: 14904 statement: SELECT count(*) FROM pg_catalog.pg_class AS c, pg_catalog.pg_attribute AS a WHERE c.relname = 'pg_class' AND a.attrelid = c.oid AND a.attname = 'relpersistence'
2018-11-15 12:57:34: pid 14885: LOG:  DB node id: 0 backend pid: 14904 statement: SELECT count(*) FROM pg_catalog.pg_class AS c WHERE c.oid = pg_catalog.to_regclass('"pgbench_accounts"') AND c.relpersistence = 'u'

Of course this is only done for the first time the table (in this case
pgbench_accounts) is accessed. The query result is cached and
subsequent queries use the cache.

However the cache is in pgpool child process local memory. That means:

1) the cache is not shared among pgpool child process

2) if the child process dies, newly created pgpool child process needs
to re-select the system catalogs.

Especially #1 is a headache if num_init_children is large since each
pgpool child needs to issue query. For example if num_init_children
is 100, as many as 900 queries will be sent to PostgreSQL for the
first time.

And I only show the data for single table. If user needs to access 100
different tables, 900*1000 = 90,000 SELECTs are necessary.

* The proposal

So I propose to move the relation cache (relcache) to shared
memory. This should eliminate the problem #1 since now the cache can
be shared different pgpool child process. Moreover the cache remains
even after the pgpool child exits, and problem #2 will be eliminated
together. In the example above the number of SELECTs issued with this
approch will reduce from 90,000 to 9.

* Implementation

Instead of re-invent a relcache in shared memory, I propose to use
existing infrastructure, namely the in memory query cache. By adding a
few low level API, the relcache data can be easily placed on query
cache storage. A bonus of this plan is, the relache can be on
memcached without any additional coding.

Comments and/or suggestuons are welcome.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


More information about the pgpool-hackers mailing list