[pgpool-general: 9463] Re: Clarification on query results cache visibility
Tatsuo Ishii
ishii at postgresql.org
Sun May 4 12:42:06 JST 2025
Achilleas,
Please disregard the patch. I think I have an oeversight with the
patch.
From: Tatsuo Ishii <ishii at postgresql.org>
Subject: [pgpool-general: 9462] Re: Clarification on query results cache visibility
Date: Sun, 04 May 2025 11:05:38 +0900 (JST)
Message-ID: <20250504.110538.1845818308821205486.ishii at postgresql.org>
>> On 3/5/25 11:55, Achilleas Mantzios wrote:
>>
>>> Good day Tatsuo
>>>
>>> On 3/5/25 09:21, Tatsuo Ishii wrote:
>>>>> On 3/5/25 03:41, Tatsuo Ishii wrote:
>>>>>> Dear Achilleas,
>>>>>>
>>>>>> Thank you for the report and logs. I will look into them. BTW, Can I
>>>>>> assume that you did below with patch?
>>>>> Thank you, yes with the second patch.
>>>> So, with no patch (stock 4.6.0) it works with or without
>>>> prepareThreshold=0?
>>>
>>> Yes, with the stock 4.6.0 it works for this scenario, no matter the
>>> prepareThreshold setting, I run this twice, one with
>>> prepareThreshold=0 and then prepareThreshold=5, no problem with this
>>> set of queries.
>>>
>>> So, without the patch, we have the first problem of the update as
>>> shown on test PgPoolTest.java :
>>>
>>> SELECT app, urlext FROM testpgpool WHERE username = ? AND app IN
>>> ('performreport') AND detail = ''
>>>
>>> which kept using the cache improperly because previous update did not
>>> invalidate, and you solved it with 2 different patches, however this
>>> second patch (I dont know about the first) introduced this new bug.
>>>
>>> I will try with the first patch applied and let you know ASAP.
>>>
>>
>> With the first patch, it behaves differently, the first occurrence of
>> the problem is delayed in comparison to the second patch, but the
>> problem is still there.
>>
>> With the second patch at least to my eyes as much as I can imagine
>> what happens is :
>>
>> as long as the prepared statements remain unnamed (nameless portals),
>> (till the insertion of description='sys5' in the logs), so till the
>> 5th call, it works ok, and the cache gets invalidated after each
>> insertion as it should.
>>
>> However, after the 5th run, the prepared statements start to be named,
>> and given a name by jdbc.
>>
>> As shown in the log :
>> pgpool-2025-05-02.log_problem_prepared_threshold_eq_5_no_first_function_nunu
>>
>> the while the "insert into machdefs " is unnamed and the the first
>> time it is parsed and named (S_9) the query
>>
>> select defid from machdefs where parents[1]=$1 order by
>> description,partid
>>
>> runs on the backend.
>>
>> However the very first time that S_9 is used to bind directly (the
>> parse had been done at the previous call) , the cache is not
>> invalidated and the subsequent :
>
> Bind without parse could cause difference: when a parse message
> arrives, pgpool extracts table oids used in the SQL. For example, for
> "INSERT INTO machdefs..." pgpool searches the oid of machdefs and
> keeps it in memory so that later on (at CommandComplete) pgpool
> invalidates query cache belongs to the oid (pgpool stores oids under
> logdir/pgpool to remember the relationship between cache key and
> oid). Unfortunately there's only one memory area to store the oids. If
> other parse message arrives, the area might be overwritten and fails to
> invalidate query cache.
>
> On the other hand, prepareThreshold=0 case always parse is used before
> bind and the problem is not there, I guess.
>
> Attached patch tries to solve the issue by by adding new member to
> "query context" structure (holding various data including parse tree
> etc.) to save the table oids, and restore at CommandComplete time.
>
> The patch is on top the second patch (query_cache-v2.patch). So please
> apply it after applying query_cache-v2.patch.
>
> Note that I don't have an explanation (yet) why 4.6.0 does not have
> the probolem you have. I have a simple test case which demonstrates
> the issue even 4.6.0.
>
> Best regards,
> --
> Tatsuo Ishii
> SRA OSS K.K.
> English: http://www.sraoss.co.jp/index_en/
> Japanese:http://www.sraoss.co.jp
>
>> select defid from machdefs where parents[1]=$1 order by
>> description,partid
>>
>> uses the cache. You can see this in the said log :
>>
>> egrep -e 'S_9|select defid from machdefs where pare|insert into
>> machdefs' -B 1 pgpool-20
>> 25-05-02.log_problem_prepared_threshold_eq_5_no_first_function_nunu
>>
>> I believe this call to S_9 is located here :
>>
>> [59728] 2025-05-02 14:10:45.038 SMA amantzio at dynacom line:2607LOG:
>> Bind message from frontend.
>> [59728] 2025-05-02 14:10:45.038 SMA amantzio at dynacom
>> line:2608DETAIL: portal: "", statement: "S_9"
>> [59728] 2025-05-02 14:10:45.038 SMA amantzio at dynacom line:2609LOG:
>> DB node id: 0 backend pid: 59852 statement: Bind: insert into
>> machdefs(description,partid,machtypeid,rhbec,rhdue,periodbec,perioddue,parents,specialper,action,application,pms_importance,
>> pms_risk, actionnote , jobnote) values($1,$2,$3,$4,$5,$6,$7,(SELECT
>> intarray_push_array(itoar($8),coalesce((select parents from machdefs
>> where defid=$9),'{}'::integer[]))),$10,$11,$12,$13, $14,$15,$16)
>> --
>> [59728] 2025-05-02 14:10:45.038 SMA amantzio at dynacom
>> line:2611DETAIL: portal: ""
>> [59728] 2025-05-02 14:10:45.038 SMA amantzio at dynacom line:2612LOG:
>> statement: insert into
>> machdefs(description,partid,machtypeid,rhbec,rhdue,periodbec,perioddue,parents,specialper,action,application,pms_importance,
>> pms_risk, actionnote , jobnote) values($1,$2,$3,$4,$5,$6,$7,(SELECT
>> intarray_push_array(itoar($8),coalesce((select parents from machdefs
>> where defid=$9),'{}'::integer[]))),$10,$11,$12,$13, $14,$15,$16)
>> [59728] 2025-05-02 14:10:45.038 SMA amantzio at dynacom line:2613LOG:
>> DB node id: 0 backend pid: 59852 statement: Execute: insert into
>> machdefs(description,partid,machtypeid,rhbec,rhdue,periodbec,perioddue,parents,specialper,action,application,pms_importance,
>> pms_risk, actionnote , jobnote) values($1,$2,$3,$4,$5,$6,$7,(SELECT
>> intarray_push_array(itoar($8),coalesce((select parents from machdefs
>> where defid=$9),'{}'::integer[]))),$10,$11,$12,$13, $14,$15,$16)
>>
>>
>> Please if you need anything else, do not hesitate to contact me.
>>
>>
>>>>>> Best regards,
>>>>>> --
>>>>>> Tatsuo Ishii
>>>>>> SRA OSS K.K.
>>>>>> English:http://www.sraoss.co.jp/index_en/
>>>>>> Japanese:http://www.sraoss.co.jp
>>>>>>
>>>>>>> Dear Tatsuo I was not able to replicate the problem, lots of queries
>>>>>>> lots of tables involved to make it identical.
>>>>>>>
>>>>>>> I wrote a test but no success in making the problem happen.
>>>>>>>
>>>>>>> On all the below I use
>>>>>>>
>>>>>>> parents[1]=?
>>>>>>>
>>>>>>> instead of
>>>>>>>
>>>>>>> first(parents)=?
>>>>>>>
>>>>>>> to eliminate exotic factors.
>>>>>>>
>>>>>>> At least I observed that for some reason , setting in java :
>>>>>>> prepareThreshold = 0, which means use only unnamed statements,
>>>>>>> consistently gives correct results .
>>>>>>>
>>>>>>> But there is clearly a problem :
>>>>>>>
>>>>>>> an insertion happens in machdefs table and a subsequent select fetches
>>>>>>> from memory cache. So I attach to you full logs of both pgsql and
>>>>>>> pgpool for both runs, one with prepareThreshold=5 (problem) and one
>>>>>>> with prepareThreshold=0 (no problem).
>>>>>>>
>>>>>>>
>>>>>>> Since in pgpool log there is no way to get the values of the
>>>>>>> parameters in Bind and Execute (whereas in pgsql this is possible) , I
>>>>>>> was only "joining" the two log files by the exact timestamp . My
>>>>>>> observations, regarding the problematic run are :
>>>>>>>
>>>>>>> on pgpool
>>>>>>>
>>>>>>> [59728] 2025-05-02 14:10:47.582 SMA amantzio at dynacomline:2743LOG:
>>>>>>> Parse message from frontend.
>>>>>>> [59728] 2025-05-02 14:10:47.582 SMA amantzio at dynacom
>>>>>>> line:2744DETAIL: statement: "", query: "select defid from machdefs
>>>>>>> where parents[1]=$1 order by description,partid"
>>>>>>> [59728] 2025-05-02 14:10:47.582 SMA amantzio at dynacomline:2745LOG:
>>>>>>> DB node id: 0 backend pid: 59852 statement: Parse: select defid from
>>>>>>> machdefs where parents[1]=$1 order by description,partid
>>>>>>> [59728] 2025-05-02 14:10:47.582 SMA amantzio at dynacomline:2746LOG:
>>>>>>> Bind message from frontend.
>>>>>>> [59728] 2025-05-02 14:10:47.582 SMA amantzio at dynacom
>>>>>>> line:2747DETAIL: portal: "", statement: ""
>>>>>>> [59728] 2025-05-02 14:10:47.582 SMA amantzio at dynacomline:2748LOG:
>>>>>>> DB node id: 0 backend pid: 59852 statement: Bind: select defid from
>>>>>>> machdefs where parents[1]=$1 order by description,partid
>>>>>>> [59728] 2025-05-02 14:10:47.582 SMA amantzio at dynacomline:2749LOG:
>>>>>>> Describe message from frontend.
>>>>>>> [59728] 2025-05-02 14:10:47.582 SMA amantzio at dynacom
>>>>>>> line:2750DETAIL: portal: ""
>>>>>>> [59728] 2025-05-02 14:10:47.582 SMA amantzio at dynacomline:2751LOG:
>>>>>>> DB node id: 0 backend pid: 59852 statement: D message
>>>>>>> [59728] 2025-05-02 14:10:47.582 SMA amantzio at dynacomline:2752LOG:
>>>>>>> Execute message from frontend.
>>>>>>> [59728] 2025-05-02 14:10:47.582 SMA amantzio at dynacom
>>>>>>> line:2753DETAIL: portal: ""
>>>>>>> [59728] 2025-05-02 14:10:47.582 SMA amantzio at dynacomline:2754LOG:
>>>>>>> statement: select defid from machdefs where parents[1]=$1 order by
>>>>>>> description,partid
>>>>>>> [59728] 2025-05-02 14:10:47.582 SMA amantzio at dynacomline:2755LOG:
>>>>>>> fetch from memory cache
>>>>>>> [59728] 2025-05-02 14:10:47.582 SMA amantzio at dynacom
>>>>>>> line:2756DETAIL: query result fetched from cache. statement: select
>>>>>>> defid from machdefs where parents[1]=$1 order by description,partid
>>>>>>> 0001000100010
>>>>>>> 00000040CCD179D0000
>>>>>>>
>>>>>>> on pgsql
>>>>>>>
>>>>>>> 127.0.0.1(15112) [59852] 6814a7ed.e9cc 2025-05-02 14:10:47.582 EEST
>>>>>>> SMA amantzio at dynacomline:1340 LOG: duration: 0.042 ms parse
>>>>>>> <unnamed>: select defid from machdefs where parents[1]=$1 order by
>>>>>>> description,partid
>>>>>>> 127.0.0.1(15112) [59852] 6814a7ed.e9cc 2025-05-02 14:10:47.582 EEST
>>>>>>> SMA amantzio at dynacomline:1341 LOG: duration: 0.070 ms bind
>>>>>>> <unnamed>: select defid from machdefs where parents[1]=$1 order by
>>>>>>> description,partid
>>>>>>> 127.0.0.1(15112) [59852] 6814a7ed.e9cc 2025-05-02 14:10:47.582 EEST
>>>>>>> SMA amantzio at dynacomline:1342 DETAIL: Parameters: $1 = '214767517'
>>>>>>>
>>>>>>> (no execute)
>>>>>>>
>>>>>>> ---
>>>>>>>
>>>>>>> the problem is manifested here :
>>>>>>> [59728] 2025-05-02 14:10:47.582 SMA amantzio at dynacomline:2755LOG:
>>>>>>> fetch from memory cache
>>>>>>> [59728] 2025-05-02 14:10:47.582 SMA amantzio at dynacom
>>>>>>> line:2756DETAIL: query result fetched from cache. statement: select
>>>>>>> defid from machdefs where parents[1]=$1 order by description,partid
>>>>>>> 0001000100010
>>>>>>> 00000040CCD179D0000
>>>>>>>
>>>>>>> because earlier he have :
>>>>>>> [59728] 2025-05-02 14:10:45.038 SMA amantzio at dynacomline:2613LOG:
>>>>>>> DB node id: 0 backend pid: 59852 statement: Execute: insert into
>>>>>>> machdefs(description,partid,machtypeid,rhbec,rhdue,periodbec,perioddue,p
>>>>>>> arents,specialper,action,application,pms_importance, pms_risk,
>>>>>>> actionnote , jobnote) values($1,$2,$3,$4,$5,$6,$7,(SELECT
>>>>>>> intarray_push_array(itoar($8),coalesce((select parents from machdefs
>>>>>>> where defid=$9),'{
>>>>>>> }'::integer[]))),$10,$11,$12,$13, $14,$15,$16)
>>>>>>>
>>>>>>> this insertion should have invalidated the cache for this query :
>>>>>>> select defid from machdefs where parents[1]=$1 order by
>>>>>>> description,partid
>>>>>>>
>>>>>>> ========================================
>>>>>>>
>>>>>>>
>>>>>>> please have a look, I am so sorry I could not reprooduce by a simple
>>>>>>> program.
>>>>>>>
>>>>>>> (yes I looked on every detail, no luck).
>>>>>>>
>>>>>>>
>>>>>>> On 5/1/25 06:02, Achilleas Mantzios wrote:
>>>>>>>> On 1/5/25 03:02, Tatsuo Ishii wrote:
>>>>>>>>>> update : the bug (this new bug) is not present prior to the first
>>>>>>>>>> patch. So the bug (the new one) is not present in plain vanilla :
>>>>>>>>>> pgpool-II-4.6.0 . Reverting both patches solves the issue with the new
>>>>>>>>>> bug which has most probably to do with an array function called
>>>>>>>>>> "first" :
>>>>>>>>> "first" is coming with the extension? I am not familiar with it.
>>>>>>>> This (and a couple of others I wrote back then) is a C function
>>>>>>>> written by me, when still intarray was not an extension, I mean before
>>>>>>>> the extensions framework, sometime in 2004 or so. This is a simple C
>>>>>>>> function which returns the first element of an array.
>>>>>>>>
>>>>>>>> dynacom=# \df+ first
>>>>>>>> List
>>>>>>>> of functions
>>>>>>>> Schema | Name | Result data type | Argument data types | Type |
>>>>>>>> Volatility | Parallel | Owner | Security | Access privileges
>>>>>>>> | Language | Source code |
>>>>>>>> Description
>>>>>>>> --------+-------+------------------+---------------------+------+------------+----------+----------+----------+--------------------------+----------+-------------+
>>>>>>>> -------------
>>>>>>>> public | first | integer | integer[] | func |
>>>>>>>> immutable | unsafe | postgres | invoker | postgres=X/postgres
>>>>>>>> +| c | first |
>>>>>>>>
>>>>>>>> | | | | |
>>>>>>>> | | | | powerprom=X/postgres
>>>>>>>> +| | |
>>>>>>>>
>>>>>>>> | | | | |
>>>>>>>> | | | | default_group=X/postgres
>>>>>>>> | | |
>>>>>>>>
>>>>>>>> (1 row)
>>>>>>>>
>>>>>>>> dynacom=#
>>>>>>>>
>>>>>>>>
>>>>>>>>>> select defid, description from machdefs where first(parents)=214766150
>>>>>>>>>> order by description;
>>>>>>>>>>
>>>>>>>>>> I haven't been able to reproduce with SQL (from psql).
>>>>>>>>> Is it possible for you to create a java reproducer?
>>>>>>>> I will try.
>>>>>>>>>> Also reverting both patches, i.e. going back to plain vanilla
>>>>>>>>>> pgpool-II-4.6.0 solves this new bug, but re-introduces the previous (
>>>>>>>>>> with seeing stale and data and not invalidate data on table
>>>>>>>>>> testpgpool)
>>>>>>>>>>
>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> Tatsuo Ishii
>>>>>>>>>>>>>> SRA OSS K.K.
>>>>>>>>>>>>>> English:http://www.sraoss.co.jp/index_en/
>>>>>>>>>>>>>> Japanese:http://www.sraoss.co.jp
>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>> pgpool-general mailing list
>>>>>>>>>>>>> pgpool-general at pgpool.net
>>>>>>>>>>>>> http://www.pgpool.net/mailman/listinfo/pgpool-general
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> pgpool-general mailing list
>>>>>>>>>>>> pgpool-general at pgpool.net
>>>>>>>>>>>> http://www.pgpool.net/mailman/listinfo/pgpool-general
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> pgpool-general mailing list
>>>>>>>>>>> pgpool-general at pgpool.net
>>>>>>>>>>> http://www.pgpool.net/mailman/listinfo/pgpool-general
>>>>>>>> _______________________________________________
>>>>>>>> pgpool-general mailing list
>>>>>>>> pgpool-general at pgpool.net
>>>>>>>> http://www.pgpool.net/mailman/listinfo/pgpool-general
>>>>> _______________________________________________
>>>>> pgpool-general mailing list
>>>>> pgpool-general at pgpool.net
>>>>> http://www.pgpool.net/mailman/listinfo/pgpool-general
>>>
>>> _______________________________________________
>>> pgpool-general mailing list
>>> pgpool-general at pgpool.net
>>> http://www.pgpool.net/mailman/listinfo/pgpool-general
More information about the pgpool-general
mailing list