pgpool-II Tutrial [ On Memory Query Cache ]

About this document

In this Tutrial, I explain a simple way to try "On memory query cache" with shared memory.

What you need is a Linux box in which PostgreSQL and pgpool-II are installed. You can use on memory query cache with pgpool in any mode: replication mode, master/slave mode and raw mode.

On memory query cache

Configuring pgpool.conf

Basic parameters

Set "memory_cache_enabled" on to enable the functionality of on memory query cache.

# If on, use the memory cache functionality, off by default
memory_cache_enabled = on

Pgpool creates "OID map" files which indicates that each cache are for which database and table. Specify the directory for it as you like.

# Temporary work directory to record table oids
memqcache_oiddir = '/var/log/pgpool/oiddir'

About shared mamory

The parameter "memqcache_method" is the choice of the cache storage, and it defaults to "shmem".

# Cache storage method. either 'shmem'(shared memory) or 'memcached'. 'shmem' by default
memqcache_method = 'shmem'

About log

The parameter "log_per_node_statement" helps you to know in which backend node a query is executed. Set this parameter on.

log_per_node_statement = on
                                   # Log all statements
                                   # with node and backend informations

Starting pgpool

Start pgpool-II with "-n" (which means not-daemon mode) and redirect log messages into pgpool.log.

$ {installed_dir}/bin/pgpool -n -f {installed_dir}/etc/pgpool.conf > pgpool.log 2>&1

Create a database and a table.

$ createdb -p 9999 test

$ psql -p 9999 test
test=> CREATE TABLE table_cache_ok (a INT);
CREATE TABLE

test=> INSERT INTO table_cache_ok VALUES (1), (2), (3);
INSERT 0 3

test=> SELECT * FROM table_cache_ok ORDER BY a;
 a
---
 1
 2
 3
(3 rows)

Trying out query cache

Execute a SELECT.

test=> SELECT * FROM table_cache_ok WHERE a = 1;
 a
---
 1
(1 row)

The output of log_per_node_statement shows that the SELECT is executed exactly.

LOG:   DB node id: 0 backend pid: 11203 statement: SELECT * FROM table_cache_ok WHERE a = 1;

Execute the same SELECT.

test=> SELECT * FROM table_cache_ok WHERE a = 1;
 a
---
 1
(1 row)

This time the log message shows that the result of SELECT is from caches.

LOG:   query result fetched from cache. statement: SELECT * FROM table_cache_ok WHERE a = 1;

More fun with query cache

Auto invalidation

When a table is updated

When a table is updated, the cached results of SELECT from the table can be outdated. Pgpool discards all the cache of that table automatically by default.

Configuration

Specify the action by "memqcache_auto_cache_invalidation". It's set "on" by default.

# If on, invalidation of query cache is triggered by corresponding
# DDL/DML/DCL(and memqcache_expire).  If off, it is only triggered
# by memqcache_expire.  on by default.
memqcache_auto_cache_invalidation = on

Restart pgpool after changing the parameter's value.

Trying the cache invalidation

Confirm that you can get the result of a SELECT from cache.

test=> SELECT * FROM table_cache_ok WHERE a = 1;
LOG:   query result fetched from cache. statement: SELECT * FROM table_cache_ok WHERE a = 1;

Execute INSERT INTO the table.

test=> INSERT INTO table_cache_ok VALUES (5);
INSERT 0 1

Execute the same SELECT, and the SELECT is executed exactly.

test=> SELECT * FROM table_cache_ok WHERE a = 1;
LOG:   DB node id: 0 backend pid: 11203 statement: SELECT * FROM table_cache_ok WHERE a = 1;

Cache expiration

Pgpool discards caches older than the specified expiration period.

Configuration

Specify the expiration period by "memqcache_expire". Default is 0 seconds, but set 5 seconds to confirm for trial this time.

# Memory cache entry life time specified in seconds.
# 0 means infinite life time. 0 by default.
memqcache_expire = 5

Restart pgpool after changing the parameter's value.

Try

Confirm that you can get the result of a SELECT from cache.

test=> SELECT * FROM table_cache_ok WHERE a = 1;
LOG:   query result fetched from cache. statement: SELECT * FROM table_cache_ok WHERE a = 1;

Wait in 5 seconds...

Execute the same SELECT, and the SELECT is executed exactly.

test=> SELECT * FROM table_cache_ok WHERE a = 1;
LOG:   DB node id: 0 backend pid: 11203 statement: SELECT * FROM table_cache_ok WHERE a = 1;

White list and black list

Cache the result of some tables only

Configuration

If you want to cache the result of only some tabeles, specify the tables by "white_memqcache_table_list".

You can use regular expression (to which added automatically ^ and $).

white_memqcache_table_list = '.*_cache_ok'
                                   # Comma separated list of table names to memcache
                                   # that don't write to database
                                   # Regexp are accepted

Reload pgpool after changing the parameter's value.

$ {installed_dir}/bin/pgpool reload

Try

Create a table not to cache.

$ psql -p 9999 test
test=> CREATE TABLE table_cache_ng (a INT);
CREATE TABLE

test=> INSERT INTO table_cache_ng VALUES (1), (2), (3);
INSERT 0 3

Confirm if the SELECT FROM a table in white_memqcache_table_list is cached.

test=> SELECT * FROM table_cache_ok WHERE a = 1;
LOG:   DB node id: 0 backend pid: 11203 statement: SELECT oid FROM pg_database WHERE datname = 'test'

test=> SELECT * FROM table_cache_ok WHERE a = 1;
LOG:   query result fetched from cache. statement: SELECT * FROM table_cache_ok WHERE a = 1;

Confirm if the SELECT FROM a table NOT in white_memqcache_table_list is NOT cached.

test=> SELECT * FROM table_cache_ng WHERE a = 1;
LOG:   DB node id: 0 backend pid: 11203 statement: SELECT * FROM table_cache_ok WHERE a = 1;

test=> SELECT * FROM table_cache_ng WHERE a = 1;
LOG:   DB node id: 0 backend pid: 11203 statement: SELECT * FROM table_cache_ok WHERE a = 1;

Not cache the result of some tables

Config

If you don't want to cache the result of some tabeles, specify the tables by "black_memqcache_table_list".

You can use regular expression (to which added automatically ^ and $).

black_memqcache_table_list = '.*_cache_ng'
                                   # Comma separated list of table names not to memcache
                                   # that don't write to database
                                   # Regexp are accepted

Reload pgpool after changing the parameter's value.

$ {installed_dir}/bin/pgpool reload

Try

Confirm if the SELECT FROM a table in black_memqcache_table_list is NOT cached.

test=> SELECT * FROM table_cache_ng WHERE a = 1;
LOG:   DB node id: 0 backend pid: 11203 statement: SELECT * FROM table_cache_ok WHERE a = 1;

test=> SELECT * FROM table_cache_ng WHERE a = 1;
LOG:   DB node id: 0 backend pid: 11203 statement: SELECT * FROM table_cache_ok WHERE a = 1;

Confirm if the SELECT FROM a table NOT in black_memqcache_table_list is cached.

test=> SELECT * FROM table_cache_ok WHERE a = 1;
LOG:   DB node id: 0 backend pid: 11203 statement: SELECT oid FROM pg_database WHERE datname = 'test'

test=> SELECT * FROM table_cache_ok WHERE a = 1;
LOG:   query result fetched from cache. statement: SELECT * FROM table_cache_ok WHERE a = 1;

What else?

Too big results

The size of each SELECT result is limited by "memqcache_maxcache".

# Maximum SELECT result size in bytes.
# Must be smaller than memqcache_cache_block_size. Defaults to 400KB.
memqcache_maxcache = 409600

If a result is bigger than memqcache_maxcache, the log message tells us that. This is an example when memqcache_maxcache is set 1024 byte and SELECTed table is bigger than 10 MB.

LOG:   DB node id: 0 backend pid: 17749 statement: SELECT * FROM pgbench_accounts ;
LOG:   pool_add_temp_query_cache: data size exceeds memqcache_maxcache.
                                  current:983 requested:110 memq_maxcache:1024

Specify each query not to cache

If white list and black list are not enough for you, put the text "/* NO QUERY CACHE */" at the head of the queries.

LOG:   statement: SELECT * FROM table_cache_ok WHERE a = 1;
LOG:   query result fetched from cache. statement: SELECT * FROM table_cache_ok WHERE a = 1;

LOG:   statement: /* NO QUERY CACHE */ SELECT * FROM table_cache_ok WHERE a = 1;
LOG:   DB node id: 0 backend pid: 18070 statement: /* NO QUERY CACHE */ SELECT * FROM table_cache_ok WHERE a = 1;
LOG:   statement: /* NO QUERY CACHE */ SELECT * FROM table_cache_ok WHERE a = 1;
LOG:   DB node id: 0 backend pid: 18070 statement: /* NO QUERY CACHE */ SELECT * FROM table_cache_ok WHERE a = 1;

This way requires to change your program of course.