[pgpool-general: 9464] Re: Clarification on query results cache visibility

Achilleas Mantzios a.mantzios at cloud.gatewaynet.com
Sun May 4 13:55:41 JST 2025


On 4/5/25 06:42, Tatsuo Ishii wrote:

> Achilleas,
>
> Please disregard the patch. I think I have an oeversight with the
> patch.

Good Day Tatsuo. Thank you for all your work!

I will wait !


> From: Tatsuo Ishii <ishii at postgresql.org>
> Subject: [pgpool-general: 9462] Re: Clarification on query results cache visibility
> Date: Sun, 04 May 2025 11:05:38 +0900 (JST)
> Message-ID: <20250504.110538.1845818308821205486.ishii at postgresql.org>
>
>>> On 3/5/25 11:55, Achilleas Mantzios wrote:
>>>
>>>> Good day Tatsuo
>>>>
>>>> On 3/5/25 09:21, Tatsuo Ishii wrote:
>>>>>> On 3/5/25 03:41, Tatsuo Ishii wrote:
>>>>>>> Dear Achilleas,
>>>>>>>
>>>>>>> Thank you for the report and logs. I will look into them.  BTW, Can I
>>>>>>> assume that you did below with patch?
>>>>>> Thank you, yes with the second patch.
>>>>> So, with no patch (stock 4.6.0) it works with or without
>>>>> prepareThreshold=0?
>>>> Yes, with the stock 4.6.0 it works for this scenario, no matter the
>>>> prepareThreshold setting, I run this twice, one with
>>>> prepareThreshold=0 and then prepareThreshold=5, no problem with this
>>>> set of queries.
>>>>
>>>> So, without the patch, we have the first problem of the update as
>>>> shown on test PgPoolTest.java :
>>>>
>>>> SELECT app, urlext FROM testpgpool WHERE username = ? AND app IN
>>>> ('performreport') AND detail = ''
>>>>
>>>> which kept using the cache improperly because previous update did not
>>>> invalidate, and you solved it with 2 different patches, however this
>>>> second patch (I dont know about the first) introduced this new bug.
>>>>
>>>> I will try with the first patch applied and let you know ASAP.
>>>>
>>> With the first patch, it behaves differently, the first occurrence of
>>> the problem is delayed in comparison to the second patch, but the
>>> problem is still there.
>>>
>>> With the second patch at least to my eyes as much as I can imagine
>>> what happens is :
>>>
>>> as long as the prepared statements remain unnamed (nameless portals),
>>> (till the insertion of description='sys5' in the logs), so till the
>>> 5th call, it works ok, and the cache gets invalidated after each
>>> insertion as it should.
>>>
>>> However, after the 5th run, the prepared statements start to be named,
>>> and given a name by jdbc.
>>>
>>> As shown in the log :
>>> pgpool-2025-05-02.log_problem_prepared_threshold_eq_5_no_first_function_nunu
>>>
>>> the while the "insert into machdefs " is unnamed and the the first
>>> time it is parsed and named (S_9) the query
>>>
>>> select defid from machdefs where parents[1]=$1  order by
>>> description,partid
>>>
>>> runs on the backend.
>>>
>>> However the very first time that S_9 is used to bind directly (the
>>> parse had been done at the previous call) , the cache is not
>>> invalidated and the subsequent :
>> Bind without parse could cause difference: when a parse message
>> arrives, pgpool extracts table oids used in the SQL. For example, for
>> "INSERT INTO machdefs..." pgpool searches the oid of machdefs and
>> keeps it in memory so that later on (at CommandComplete) pgpool
>> invalidates query cache belongs to the oid (pgpool stores oids under
>> logdir/pgpool to remember the relationship between cache key and
>> oid). Unfortunately there's only one memory area to store the oids. If
>> other parse message arrives, the area might be overwritten and fails to
>> invalidate query cache.
>>
>> On the other hand, prepareThreshold=0 case always parse is used before
>> bind and the problem is not there, I guess.
>>
>> Attached patch tries to solve the issue by by adding new member to
>> "query context" structure (holding various data including parse tree
>> etc.) to save the table oids, and restore at CommandComplete time.
>>
>> The patch is on top the second patch (query_cache-v2.patch). So please
>> apply it after applying query_cache-v2.patch.
>>
>> Note that I don't have an explanation (yet) why 4.6.0 does not have
>> the probolem you have. I have a simple test case which demonstrates
>> the issue even 4.6.0.
>>
>> Best regards,
>> --
>> Tatsuo Ishii
>> SRA OSS K.K.
>> English: http://www.sraoss.co.jp/index_en/
>> Japanese:http://www.sraoss.co.jp
>>
>>> select defid from machdefs where parents[1]=$1  order by
>>> description,partid
>>>
>>> uses the cache. You can see this in the said log :
>>>
>>> egrep -e 'S_9|select defid from machdefs  where pare|insert into
>>> machdefs' -B 1 pgpool-20
>>> 25-05-02.log_problem_prepared_threshold_eq_5_no_first_function_nunu
>>>
>>> I believe this call to S_9 is located here :
>>>
>>>   [59728]  2025-05-02 14:10:45.038 SMA  amantzio at dynacom line:2607LOG:
>>> Bind message from frontend.
>>>   [59728]  2025-05-02 14:10:45.038 SMA  amantzio at dynacom
>>> line:2608DETAIL:  portal: "", statement: "S_9"
>>>   [59728]  2025-05-02 14:10:45.038 SMA  amantzio at dynacom line:2609LOG:
>>> DB node id: 0 backend pid: 59852 statement: Bind: insert into
>>> machdefs(description,partid,machtypeid,rhbec,rhdue,periodbec,perioddue,parents,specialper,action,application,pms_importance,
>>> pms_risk, actionnote , jobnote) values($1,$2,$3,$4,$5,$6,$7,(SELECT
>>> intarray_push_array(itoar($8),coalesce((select parents from machdefs
>>> where defid=$9),'{}'::integer[]))),$10,$11,$12,$13, $14,$15,$16)
>>> --
>>>   [59728]  2025-05-02 14:10:45.038 SMA  amantzio at dynacom
>>> line:2611DETAIL:  portal: ""
>>>   [59728]  2025-05-02 14:10:45.038 SMA  amantzio at dynacom line:2612LOG:
>>> statement: insert into
>>> machdefs(description,partid,machtypeid,rhbec,rhdue,periodbec,perioddue,parents,specialper,action,application,pms_importance,
>>> pms_risk, actionnote , jobnote) values($1,$2,$3,$4,$5,$6,$7,(SELECT
>>> intarray_push_array(itoar($8),coalesce((select parents from machdefs
>>> where defid=$9),'{}'::integer[]))),$10,$11,$12,$13, $14,$15,$16)
>>>   [59728]  2025-05-02 14:10:45.038 SMA  amantzio at dynacom line:2613LOG:
>>> DB node id: 0 backend pid: 59852 statement: Execute: insert into
>>> machdefs(description,partid,machtypeid,rhbec,rhdue,periodbec,perioddue,parents,specialper,action,application,pms_importance,
>>> pms_risk, actionnote , jobnote) values($1,$2,$3,$4,$5,$6,$7,(SELECT
>>> intarray_push_array(itoar($8),coalesce((select parents from machdefs
>>> where defid=$9),'{}'::integer[]))),$10,$11,$12,$13, $14,$15,$16)
>>>
>>>
>>> Please if you need anything else, do not hesitate to contact me.
>>>
>>>
>>>>>>> Best regards,
>>>>>>> --
>>>>>>> Tatsuo Ishii
>>>>>>> SRA OSS K.K.
>>>>>>> English:http://www.sraoss.co.jp/index_en/
>>>>>>> Japanese:http://www.sraoss.co.jp
>>>>>>>
>>>>>>>> Dear Tatsuo I was not able to replicate the problem, lots of queries
>>>>>>>> lots of tables involved to make it identical.
>>>>>>>>
>>>>>>>> I wrote a test but no success in making the problem happen.
>>>>>>>>
>>>>>>>> On all the below I use
>>>>>>>>
>>>>>>>> parents[1]=?
>>>>>>>>
>>>>>>>> instead of
>>>>>>>>
>>>>>>>> first(parents)=?
>>>>>>>>
>>>>>>>> to eliminate exotic factors.
>>>>>>>>
>>>>>>>> At least I observed that for some reason , setting in java :
>>>>>>>> prepareThreshold = 0, which means use only unnamed statements,
>>>>>>>> consistently gives correct results .
>>>>>>>>
>>>>>>>> But there is clearly a problem :
>>>>>>>>
>>>>>>>> an insertion happens in machdefs table and a subsequent select fetches
>>>>>>>> from memory cache. So I attach to you full logs of both pgsql and
>>>>>>>> pgpool for both runs, one with prepareThreshold=5 (problem) and one
>>>>>>>> with prepareThreshold=0 (no problem).
>>>>>>>>
>>>>>>>>
>>>>>>>> Since in pgpool log there is no way to get the values of the
>>>>>>>> parameters in Bind and Execute (whereas in pgsql this is possible) , I
>>>>>>>> was only "joining" the two log files by the exact timestamp . My
>>>>>>>> observations, regarding the problematic run are :
>>>>>>>>
>>>>>>>> on pgpool
>>>>>>>>
>>>>>>>>     [59728]  2025-05-02 14:10:47.582 SMA  amantzio at dynacomline:2743LOG:
>>>>>>>> Parse message from frontend.
>>>>>>>>     [59728]  2025-05-02 14:10:47.582 SMA  amantzio at dynacom
>>>>>>>> line:2744DETAIL:  statement: "", query: "select defid from machdefs
>>>>>>>> where parents[1]=$1  order by description,partid"
>>>>>>>>     [59728]  2025-05-02 14:10:47.582 SMA  amantzio at dynacomline:2745LOG:
>>>>>>>> DB node id: 0 backend pid: 59852 statement: Parse: select defid from
>>>>>>>> machdefs  where parents[1]=$1  order by description,partid
>>>>>>>>     [59728]  2025-05-02 14:10:47.582 SMA  amantzio at dynacomline:2746LOG:
>>>>>>>> Bind message from frontend.
>>>>>>>>     [59728]  2025-05-02 14:10:47.582 SMA  amantzio at dynacom
>>>>>>>> line:2747DETAIL:  portal: "", statement: ""
>>>>>>>>     [59728]  2025-05-02 14:10:47.582 SMA  amantzio at dynacomline:2748LOG:
>>>>>>>> DB node id: 0 backend pid: 59852 statement: Bind: select defid from
>>>>>>>> machdefs  where parents[1]=$1  order by description,partid
>>>>>>>>     [59728]  2025-05-02 14:10:47.582 SMA  amantzio at dynacomline:2749LOG:
>>>>>>>> Describe message from frontend.
>>>>>>>>     [59728]  2025-05-02 14:10:47.582 SMA  amantzio at dynacom
>>>>>>>> line:2750DETAIL:  portal: ""
>>>>>>>>     [59728]  2025-05-02 14:10:47.582 SMA  amantzio at dynacomline:2751LOG:
>>>>>>>> DB node id: 0 backend pid: 59852 statement: D message
>>>>>>>>     [59728]  2025-05-02 14:10:47.582 SMA  amantzio at dynacomline:2752LOG:
>>>>>>>> Execute message from frontend.
>>>>>>>>     [59728]  2025-05-02 14:10:47.582 SMA  amantzio at dynacom
>>>>>>>> line:2753DETAIL:  portal: ""
>>>>>>>>     [59728]  2025-05-02 14:10:47.582 SMA  amantzio at dynacomline:2754LOG:
>>>>>>>> statement: select defid from machdefs  where parents[1]=$1  order by
>>>>>>>> description,partid
>>>>>>>>     [59728]  2025-05-02 14:10:47.582 SMA  amantzio at dynacomline:2755LOG:
>>>>>>>> fetch from memory cache
>>>>>>>>     [59728]  2025-05-02 14:10:47.582 SMA  amantzio at dynacom
>>>>>>>> line:2756DETAIL:  query result fetched from cache. statement: select
>>>>>>>> defid from machdefs  where parents[1]=$1  order by description,partid
>>>>>>>> 0001000100010
>>>>>>>> 00000040CCD179D0000
>>>>>>>>
>>>>>>>> on pgsql
>>>>>>>>
>>>>>>>> 127.0.0.1(15112) [59852] 6814a7ed.e9cc 2025-05-02 14:10:47.582 EEST
>>>>>>>> SMA  amantzio at dynacomline:1340 LOG:  duration: 0.042 ms parse
>>>>>>>> <unnamed>: select defid from machdefs  where parents[1]=$1  order by
>>>>>>>> description,partid
>>>>>>>> 127.0.0.1(15112) [59852] 6814a7ed.e9cc 2025-05-02 14:10:47.582 EEST
>>>>>>>> SMA  amantzio at dynacomline:1341 LOG:  duration: 0.070 ms bind
>>>>>>>> <unnamed>: select defid from machdefs  where parents[1]=$1  order by
>>>>>>>> description,partid
>>>>>>>> 127.0.0.1(15112) [59852] 6814a7ed.e9cc 2025-05-02 14:10:47.582 EEST
>>>>>>>> SMA  amantzio at dynacomline:1342 DETAIL:  Parameters: $1 = '214767517'
>>>>>>>>
>>>>>>>> (no execute)
>>>>>>>>
>>>>>>>> ---
>>>>>>>>
>>>>>>>> the problem is manifested here :
>>>>>>>>     [59728]  2025-05-02 14:10:47.582 SMA  amantzio at dynacomline:2755LOG:
>>>>>>>> fetch from memory cache
>>>>>>>>     [59728]  2025-05-02 14:10:47.582 SMA  amantzio at dynacom
>>>>>>>> line:2756DETAIL:  query result fetched from cache. statement: select
>>>>>>>> defid from machdefs  where parents[1]=$1  order by description,partid
>>>>>>>> 0001000100010
>>>>>>>> 00000040CCD179D0000
>>>>>>>>
>>>>>>>> because earlier he have :
>>>>>>>>     [59728]  2025-05-02 14:10:45.038 SMA  amantzio at dynacomline:2613LOG:
>>>>>>>> DB node id: 0 backend pid: 59852 statement: Execute: insert into
>>>>>>>> machdefs(description,partid,machtypeid,rhbec,rhdue,periodbec,perioddue,p
>>>>>>>> arents,specialper,action,application,pms_importance, pms_risk,
>>>>>>>> actionnote , jobnote) values($1,$2,$3,$4,$5,$6,$7,(SELECT
>>>>>>>> intarray_push_array(itoar($8),coalesce((select parents from machdefs
>>>>>>>> where defid=$9),'{
>>>>>>>> }'::integer[]))),$10,$11,$12,$13, $14,$15,$16)
>>>>>>>>
>>>>>>>> this insertion should have invalidated the cache for this query :
>>>>>>>> select defid from machdefs  where parents[1]=$1  order by
>>>>>>>> description,partid
>>>>>>>>
>>>>>>>> ========================================
>>>>>>>>
>>>>>>>>
>>>>>>>> please have a look, I am so sorry I could not reprooduce by a simple
>>>>>>>> program.
>>>>>>>>
>>>>>>>> (yes I looked on every detail, no luck).
>>>>>>>>
>>>>>>>>
>>>>>>>> On 5/1/25 06:02, Achilleas Mantzios wrote:
>>>>>>>>> On 1/5/25 03:02, Tatsuo Ishii wrote:
>>>>>>>>>>> update : the bug (this new bug) is not present prior to the first
>>>>>>>>>>> patch. So the bug (the new one) is not present in plain vanilla :
>>>>>>>>>>> pgpool-II-4.6.0 . Reverting both patches solves the issue with the new
>>>>>>>>>>> bug which has most probably to do with an array function called
>>>>>>>>>>> "first" :
>>>>>>>>>> "first" is coming with the extension? I am not familiar with it.
>>>>>>>>> This (and a couple of others I wrote back then)  is a C function
>>>>>>>>> written by me, when still intarray was not an extension, I mean before
>>>>>>>>> the extensions framework, sometime in 2004 or so. This is a simple C
>>>>>>>>> function which returns the first element of an array.
>>>>>>>>>
>>>>>>>>> dynacom=# \df+ first
>>>>>>>>>                                                                                  List
>>>>>>>>> of functions
>>>>>>>>> Schema | Name  | Result data type | Argument data types | Type |
>>>>>>>>> Volatility | Parallel |  Owner   | Security |    Access privileges
>>>>>>>>>        | Language | Source code |
>>>>>>>>> Description
>>>>>>>>> --------+-------+------------------+---------------------+------+------------+----------+----------+----------+--------------------------+----------+-------------+
>>>>>>>>> -------------
>>>>>>>>> public | first | integer          | integer[]           | func |
>>>>>>>>> immutable  | unsafe   | postgres | invoker  | postgres=X/postgres
>>>>>>>>>        +| c        | first       |
>>>>>>>>>
>>>>>>>>>           |       |                  |                     |      |
>>>>>>>>>               |          |          |          | powerprom=X/postgres
>>>>>>>>>       +|          |             |
>>>>>>>>>
>>>>>>>>>           |       |                  |                     |      |
>>>>>>>>>               |          |          |          | default_group=X/postgres
>>>>>>>>> |          |             |
>>>>>>>>>
>>>>>>>>> (1 row)
>>>>>>>>>
>>>>>>>>> dynacom=#
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>> select defid, description from machdefs where first(parents)=214766150
>>>>>>>>>>> order by description;
>>>>>>>>>>>
>>>>>>>>>>> I haven't been able to reproduce with SQL (from psql).
>>>>>>>>>> Is it possible for you to create a java reproducer?
>>>>>>>>> I will try.
>>>>>>>>>>> Also reverting both patches, i.e. going back to plain vanilla
>>>>>>>>>>> pgpool-II-4.6.0 solves this new bug, but re-introduces the previous (
>>>>>>>>>>> with seeing stale and data and not invalidate data on table
>>>>>>>>>>> testpgpool)
>>>>>>>>>>>
>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>> -- 
>>>>>>>>>>>>>>> Tatsuo Ishii
>>>>>>>>>>>>>>> SRA OSS K.K.
>>>>>>>>>>>>>>> English:http://www.sraoss.co.jp/index_en/
>>>>>>>>>>>>>>> Japanese:http://www.sraoss.co.jp
>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>> pgpool-general mailing list
>>>>>>>>>>>>>> pgpool-general at pgpool.net
>>>>>>>>>>>>>> http://www.pgpool.net/mailman/listinfo/pgpool-general
>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>> pgpool-general mailing list
>>>>>>>>>>>>> pgpool-general at pgpool.net
>>>>>>>>>>>>> http://www.pgpool.net/mailman/listinfo/pgpool-general
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> pgpool-general mailing list
>>>>>>>>>>>> pgpool-general at pgpool.net
>>>>>>>>>>>> http://www.pgpool.net/mailman/listinfo/pgpool-general
>>>>>>>>> _______________________________________________
>>>>>>>>> pgpool-general mailing list
>>>>>>>>> pgpool-general at pgpool.net
>>>>>>>>> http://www.pgpool.net/mailman/listinfo/pgpool-general
>>>>>> _______________________________________________
>>>>>> pgpool-general mailing list
>>>>>> pgpool-general at pgpool.net
>>>>>> http://www.pgpool.net/mailman/listinfo/pgpool-general
>>>> _______________________________________________
>>>> pgpool-general mailing list
>>>> pgpool-general at pgpool.net
>>>> http://www.pgpool.net/mailman/listinfo/pgpool-general


More information about the pgpool-general mailing list