[pgpool-general: 9428] Re: Clarification on query results cache visibility
Tatsuo Ishii
ishii at postgresql.org
Sat Apr 26 13:16:21 JST 2025
Hi,
Thank you for the detailed report.
> Dear pgpool people
>
> Yesterday during some tests which involve : java app (wildfly) -->
> pgpool-II 4.6 --> pgsql 17.4 we came to a situation that a new
> connection could see stale or invalid cache data, meaning that the
> actual DB contents (5432) were not depicted on certain pgpool
> connections (9999).
>
> The problem exists seemingly only for inherited tables that share the
> same name, and belong to different schemas.
>
> The tables are :
>
> amantzio@[local]/dynacom=# \d+ public.useroptions
> Table
> "public.useroptions"
> Column | Type | Collation | Nullable | Default |
> Storage | Compression | Stats target | Description
> ----------+------------------------+-----------+----------+---------+----------+-------------+--------------+-------------
>
> username | character varying(200) | | not null | |
> extended | | |
> app | text | | not null | |
> extended | | |
> detail | text | | not null | |
> extended | | |
> urlext | text | | not null | |
> extended | | |
> Indexes:
> "useroptions_pkey" PRIMARY KEY, btree (username, app, detail)
> Child tables: bdelosnav.useroptions,
> bdynacom.useroptions,
> bdynagas.useroptions,
> prominencemaritime.useroptions
> Access method: heap
>
> amantzio@[local]/dynacom=#
>
> In pgpool we have : memory_cache_enabled = on . All other cache
> related params are at default.
>
> In the specific tests we use access to bdynacom.useroptions, but
> accessing without fully qualified via search_path. But access of those
> rows from within public.useroptions also exhibit the problem(s).
>
> We have on the actual DB directly :
>
> achill at smadevnu:~/workspace/gatewaynet % psql
> psql (17.4)
> Type "help" for help.
>
> amantzio@[local]/dynacom=# SELECT tableoid::regclass, app, urlext FROM
> useroptions WHERE username = 'Ioannis Mazarakis' AND app IN
> ('performreport') AND detail = '';
> tableoid | app | urlext
> ----------------------+---------------+------------
> bdynacom.useroptions | performreport | ?zz=foobar
> (1 row)
>
> However querying against pgpool I see a different version of the data:
>
> achill at smadevnu:~/workspace/gatewaynet % psql -p 9999
> psql (17.4)
> Type "help" for help.
>
> amantzio@[local]/dynacom=# SELECT app, urlext FROM useroptions WHERE
> username = 'Ioannis Mazarakis' AND app IN ('performreport') AND detail
> = '';
> app | urlext
> ---------------+----------------------
> performreport | ?group=yes&groupno=7
> (1 row)
>
> amantzio@[local]/dynacom=#
>
> repeating with including tableoid::regclass or just /*FORCE QUERY
> CACHE*/ to beat the cache :
>
> /*FORCE QUERY CACHE*/ SELECT app, urlext FROM useroptions WHERE
> username = 'Ioannis Mazarakis' AND app IN ('performreport') AND detail
> = '';
> app | urlext
> ---------------+------------
> performreport | ?zz=foobar
> (1 row)
>
> Using a another new user does not demo the problem :
>
> achill at smadevnu:~/workspace/gatewaynet % psql -p 9999 -U stsoukalas
> psql (17.4)
> Type "help" for help.
>
> stsoukalas@[local]/dynacom=# SELECT app, urlext FROM useroptions WHERE
> username = 'Ioannis Mazarakis' AND app IN ('performreport') AND detail
> = '';
> app | urlext
> ---------------+------------
> performreport | ?zz=foobar
> (1 row)
>
> now logging with another user :
>
> achill at smadevnu:~ % psql -p 9999 -U imazarakis dynacom
> psql (17.4)
> Type "help" for help.
>
> imazarakis@[local]/dynacom=> SELECT app, urlext FROM useroptions WHERE
> username = 'Ioannis Mazarakis' AND app IN ('performreport') AND detail
> = '';
> app | urlext
> ---------------+------------
> performreport | ?zz=foobar
> (1 row)
>
> imazarakis@[local]/dynacom=> /*FORCE QUERY CACHE*/ SELECT app, urlext
> FROM useroptions WHERE username = 'Ioannis Mazarakis' AND app IN
> ('performreport') AND detail = '';
> app | urlext
> ---------------+---------------
> performreport | ?zz=achillbar
> (1 row)
>
> imazarakis@[local]/dynacom=> update useroptions set urlext=urlext
> WHERE username = 'Ioannis Mazarakis' AND app IN ('performreport') AND
> detail = '';
> UPDATE 1
> imazarakis@[local]/dynacom=> SELECT app, urlext FROM useroptions WHERE
> username = 'Ioannis Mazarakis' AND app IN ('performreport') AND detail
> = '';
> app | urlext
> ---------------+---------------
> performreport | ?zz=achillbar
> (1 row)
>
> As we said, non inherited tables do not seem to suffer from the same
> issue.
I think the issue is not related to the table inheritance, but schema
search path. Since you use the same table name in the queries, you
should have changed the schema search path so that you can access the
appropriate table. I can reproduce similar issue just changing search
path.
create schema s1;
CREATE SCHEMA
create schema s2;
CREATE SCHEMA
create table s1.t1(i int);
CREATE TABLE
create table s2.t1(i int);
CREATE TABLE
insert into s1.t1 values(1);
INSERT 0 1
insert into s2.t1 values(2);
INSERT 0 1
set search_path to s1,public;
SET
select * from t1;
i
---
1
(1 row)
set search_path to s2,public;
SET
select * from t1;
i
---
1
(1 row)
The last select should have returned 2 since it accessed s2.t1, but it
returned 1 because of the cached row produced by the previous "select
* from t1" which accessed s1.t1.
> Now granted , either :
>
> a) disabling the memory cache or
>
> b) placing a specified comment HINT in front of the query or
>
> c) specifying the affected tables in cache_unsafe_memqcache_table_list
> or
>
> d)or using BEGIN ; SET TRANSACTION ISOLATION LEVEL REPEATABLE READ ,
> or above
>
> as you explain here :
> https://pgsqlpgpool.blogspot.com/2021/04/visibility-with-query-cache.html
Sorry but I don't understand why "BEGIN ; SET TRANSACTION ISOLATION
LEVEL REPEATABLE READ ," is useful for your problem.
> (btw, why non inherited tables work so nice with the cache ??? without
> the issue above?)
I don't think inheritance is relevant. See above.
> If the blog is still valid, why not state this explicitly inside the
> docs?
I think the blog is still valid but it's not connected to your
problem.
> And if the blog is no longer relevant , please include such a
> statement.
>
> So the above could help alleviate the problem ... however:
>
> a) disabling the cache ... is not ideal , I mean why not have such a
> powerful feature ?
>
> b) placing comments in 1000s of queries is not an option
>
> c) we would not like to restrict tables based on a bug
>
> d) same as b) rewriting all our transactions is not an option , also
> we would not like to change the default default_transaction_isolation.
>
> So can you please explain what is the state of affairs regarding
> pgpool query cache and inherited tables? or pgpool caching and
> visibility ?
As stated above, the query cache does not take account of schema
search path. I think that's source of the problem. Probably in
addition to user name, database and bind parameters (if any), the
schema search path should be added to the data when pgpool calculates
a MD5 hash (which is the cache key) on it. However this requires
pgpool to obtain the search path using "show search_path", which adds
an additional overhead while fetching query cache. Any idea?
Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
More information about the pgpool-general
mailing list