[Pgpool-hackers] patch "Caching query results in pgpool-II"

Tatsuo Ishii ishii at sraoss.co.jp
Tue Jun 28 00:35:40 UTC 2011


Other points after reviewing your code.

First of all I want to make sure that what are not included in the
patches:

1) Shared memory cache
2) Cache data when extended protocol is used
3) Cache invalidation when receives DDL/DML related to query cache
4) Cache should not be registered if transaction aborts
5) Cache should not be registered if the table is temporary table
6) configure does not recognize "--with-memcached" option. This part,
   I will write a patch.
7) All memory cache codes using memcached should be surrounded by
   #ifdef USE_MEMCACHED ... #endif
8) directive "memqcache_dir" is missing

Second, questionable codes in the patches:

1) No memory allocation error check in add_buf.

2) Excess memory allocation in add_buf. If it needs n byte, it allways
   allocate memqcache_total_size + n bytes. Why do we need extra
   memqcache_total_size?

3) What is this code doing? "con" is never
   used. memqcache_total_size's 3rd arg is "oid", while you are
   passing a file descriptor.

	/* save the received result to buf for each kind */
	if (pool_config->memory_cache_enabled)
	{
		char* con;
		memqcache_register(kind, frontend, frontend->fd, p1, len1);
	}

4) write_tableoid()

   - directory should be obtained from memqcache_dir directive

   - file extention ."txt" is not appropreate becauase the file does
     not need to be a text file.

   - writing md5 hash key as a string is wate of space. Just use fwrite().

   - The function should store md5 hash key + cache ids. Without cache
     ids, how can you invalidate cache entries?

5) About cache key

   Cache key should be md5(username+query_string+database_name), not
   just query_string. Also we need to store username at the same time
   to make sure that the cache data is not accessed by someone
   else. Otherwise we will have a security concern. See discussions in
   pgpool-hackers.

   http://lists.pgfoundry.org/pipermail/pgpool-hackers/2011-June/000817.html
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

> Hello
> 
> I have implemented memcached's query cache function.
> I am sending the patch file.
> 
> It can be compile, but no complete, so all function I am going to implement
> is not included
> because I have some questions and I am at a loss what to do about them now.
> It's so nice to give me some advices.
> 
> patch file detail.
> 
> [Modify files]
> - Makefile.am
> - Makefile.in
> - pgpool.conf.sample
> - pool_config.h
> - pool_config.l
> - pool_config.c
> - pool_proto_modules.c
> - pool_process_query.c
> - pool_select_walker.c
> - pool_select_walker.h
> 
> [New File]
> - pool_memqcache.c
> - pool_memqcache.h
> 
> You need to install Memcached or libmemcached(C client library) because
> memcached is used for caching query.
> You have to edit Makefile before "make".
> The way to edit it is the following.
> 
> - Makefile
> CFLAGS = -g -O2 -Wall -Wmissing-prototypes -Wmissing-declarations
> -I/usr/local/include -lmemcached
> 
> 
> What bothers me are the following.
> 
> * First, I don't understand the way to edit configure.in for each external
> software(Memcached).
> Although I appended "AC_ARG_WITH", "AC_SUBST" in configure.in, there is a
> warning message
> when I execute "./configure --with-memcached=/usr/local/include".
> 
> [configure.in]
> AC_ARG_WITH(memcached,
>     [--with-memcache=DIR : site library files for Memcached in DIR],
>     [
>     case "$withval" in
>     "" | y | ye | yes | n | no)
>         AC_MSG_ERROR([*** You must supply an argument to the --with-memcache
> option.])
>       ;;
>     esac
>     MEMCACHED_DIR="$withval"
>     ])
> 
> AC_SUBST(MEMCACHED_DIR)
> 
> 
> bash-3.2$ ./configure --with-memcached=/usr/local/include/
> configure: WARNING: unrecognized options: --with-memcached
> ......
> 
> 
> * Next, I don't know about the way to fetch SELECT results in DataRow packet
> or tableoid in RowDescription packet.
> I try to GDB debug, but I cannot see in DataRow packet.
> There are functions of getting SELECT results or tableoid in
> memqcache_register(pool_memqcache.c).
> I thought that this code execute as I intended, but cannot get SELECT
> results or tableoid.
> What should I do owing to correct them.
> 
> ---
> best regards
> Masanori YAMAZAKI
> 
> 
> 
> 
> 
> 2011年6月20日0:14 Masanori Yamazaki <m.yamazaki23 at gmail.com>:
> 
>> pool_memcached.h is included in pool_process_query.c
>> I forgot it.
>>
>>
>>
>> > I got compile errors after applying your patches.
>> > Forgot to include pool_memqcache.h?
>> --
>> Tatsuo Ishii
>> SRA OSS, Inc. Japan
>> English: http://www.sraoss.co.jp/index_en.php
>> Japanese: http://www.sraoss.co.jp
>>
>> gcc -DHAVE_CONFIG_H -DDEFAULT_CONFIGDIR=\"/usr/local/etc\" -I.
>> -D_GNU_SOURCE -I /usr/local/pgsql/include   -g -O2 -Wall
>> -Wmissing-prototypes -Wmissing-declarations -MT pool_process_query.o -MD -MP
>> -MF .deps/pool_process_query.Tpo -c -o pool_process_query.o
>> pool_process_query.c
>>
>> pool_process_query.c:55:28: error: pool_memqcache.h: No such file or
>> directory
>> pool_process_query.c: In function 'pool_process_query':
>> pool_process_query.c:112: warning: implicit declaration of function
>> 'init_buf'
>> pool_process_query.c: In function 'SimpleForwardToFrontend':
>> pool_process_query.c:1173: warning: implicit declaration of function
>> 'memqcache_register'
>> make[2]: *** [pool_process_query.o] Error 1
>> make[2]: Leaving directory `/home/t-ishii/work/pgfoundry/GSoC/pgpool-II'
>> make[1]: *** [all-recursive] Error 1
>> make[1]: Leaving directory `/home/t-ishii/work/pgfoundry/GSoC/pgpool-II'
>> make: *** [all] Error 2
>>
>>
>>
>> 2011年6月19日23:44 Masanori Yamazaki <m.yamazaki23 at gmail.com>:
>>
>> > So where is new document patch?
>>>
>>> I'm sending new document patch.
>>> I described the way to install memcached or libmemcached(c client library)
>>> for memcached users.
>>>
>>>
>>> regards
>>> Masanori YAMAZAKI
>>>
>>>
>>>
>>> 2011/6/19 Tatsuo Ishii <ishii at sraoss.co.jp>
>>>
>>>> Forgot to attach patches.
>>>> --
>>>> Tatsuo Ishii
>>>> SRA OSS, Inc. Japan
>>>> English: http://www.sraoss.co.jp/index_en.php
>>>> Japanese: http://www.sraoss.co.jp
>>>>
>>>> >> 2011/6/14 Tatsuo Ishii <ishii at sraoss.co.jp>
>>>> >>
>>>> >>> Please explain what is implemented and what is not implemented in
>>>> this
>>>> >>> patch.
>>>> >>>
>>>> >>
>>>> >> What is implemented in this patch are as below.
>>>> >> - connect to memcached.
>>>> >> - set RowDescription or DataRow on memcached.
>>>> >> - get RowDescription or DataRow on memcached by using key of
>>>> md5(SELECT
>>>> >> convert).
>>>> >> - delete cache on memcached by using key of md5(SELECT convert). only
>>>> >> module.
>>>> >> - fetch and parse the fellowing data in pgpool.conf.
>>>> >>
>>>> >> memory_cache_enabled
>>>> >> memqcache_method
>>>> >> memqcache_memcached_host
>>>> >> memqcache_memcached_port
>>>> >> memqcache_total_size
>>>> >> memqcache_expire
>>>> >> memqcache_maxcache
>>>> >> memqcache_cache_block_size
>>>> >>
>>>> >> - add data to buffer before set data on memcached.
>>>> >> - get buffer in order to set it on memcached
>>>> >> - initialize buffer
>>>> >>
>>>> >> I am implemented only in Simple Query case.
>>>> >>
>>>> >>
>>>> >> What is not implemented in this patch are as below.
>>>> >> - shmem cache.
>>>> >> structure of  managing table oids.
>>>> >> the space management map.
>>>> >> on memory query cache structure.
>>>> >>
>>>> >> - implementation in Extended Query case.
>>>> >>
>>>> >> - cache validation.
>>>> >> The result of SELECT using none IMMUTABLE functions is not cached.
>>>> >
>>>> > I have made a patch to implement a function which checks if a SELECT
>>>> > statement includes non IMMUTABLE function call.
>>>> >
>>>> > extern bool pool_has_non_immutable_function_call(Node *node);
>>>> >
>>>> > See attached patches.
>>>> >
>>>> > TODO:
>>>> >
>>>> > Some statements including:
>>>> >
>>>> > SELECT current_timestamp;
>>>> >
>>>> > is not detected by the function even if it uses IMMUTABLE function
>>>> > because it is translated to a non function form. Obviously we should
>>>> > not cache the result of current_timestamp. What should we do? Probably
>>>> > we have to have a heuristics to detect the particular form?
>>>> >
>>>> >> - remove cache
>>>> >> If cache is full, the least recently used cache is removed.
>>>> >> If a table is dropped or modified, related cache data is deleted.
>>>> >> If a table definition is modified(ALTER TABLE), related cache data is
>>>> >> deleted.
>>>> >> If a schema or database is dropped, related cache data is deleted.
>>>> >>
>>>> >>
>>>> >> This week I am mainly implementing the shmem cache.
>>>> >>
>>>> >> Regards
>>>> >> Masanori YAMAZAKI
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >>> --
>>>> >>> Tatsuo Ishii
>>>> >>> SRA OSS, Inc. Japan
>>>> >>> English: http://www.sraoss.co.jp/index_en.php
>>>> >>> Japanese: http://www.sraoss.co.jp
>>>> >>>
>>>> >>> > Hello, hackers
>>>> >>> >
>>>> >>> > I have implemented a function caching query on memcached.
>>>> >>> > I am sending the patch file and it's so nice to give me some
>>>> opinions.
>>>> >>> >
>>>> >>> > Main program which I implemented is as below.
>>>> >>> > - Search for cache data on memcached by using md5 which converted
>>>> SELECT
>>>> >>> > into.
>>>> >>> > - Set data on memcached. The key is md5 which converted  SELECT
>>>> into.
>>>> >>> > The value is RowDescription, DataRow from Backend.
>>>> >>> >
>>>> >>> > [Modify files]
>>>> >>> > - pool.h
>>>> >>> > - child.c
>>>> >>> > - pool_config.l
>>>> >>> > - pool_config.h
>>>> >>> > - pool_config.c
>>>> >>> > - pool_system.c
>>>> >>> > - pool_proto_modules.c
>>>> >>> > - pool_process_query.c
>>>> >>> >
>>>> >>> > [New File]
>>>> >>> > - pool_memqcache.c
>>>> >>> >
>>>> >>> >
>>>> >>> > --
>>>> >>> > Regards
>>>> >>> > Masanori YAMAZAKI
>>>> >>>
>>>>
>>>
>>>
>>>
>>> --
>>> best regards
>>> Masanori YAMAZAKI
>>> email: m.yamazaki23 at gmail.com
>>>
>>>
>>


More information about the Pgpool-hackers mailing list