[pgpool-general: 8326] Re: Timestamp cast not cached

Mon Jul 11 17:43:21 JST 2022

I am start to think that the first fix you suggested to just remove the
condition to check the type cast is the best.

And I shared the consultation with you because you are the expert and I
wanted to hear your opinion.

Thanks,

Avi

On Mon, 11 Jul 2022 at 11:39 Tatsuo Ishii <ishii at sraoss.co.jp> wrote:

> Thank you for the thought. However I have no clear idea what you
> suggest here to enhance pgpool in terms of query cache. Can you
> clarify more?
>
> > Hi after a consultation with some pg expert that what they answer to me.
> > Maybe it could help you with your fix.
> >
> > Hi  Avi Raboah,
> >
> > To expand on Andrew’s answer, we need to explore the difference between
> > IMMUTABLE, STABLE and VOLATILE declared function types in PostgreSQL
> >
> > IMMUTABLE functions must adhere to three requirements:
> > # Not change the database during execution.
> > # Not execute any database lookups. Can only use values passed as
> arguments.
> > # Always return the same value when supplied the same arguments.
> >
> > STABLE functions must adhere to two requirements:
> > # Not change the database during execution.
> > # Always return the same values when supplied the same arguments during a
> > single table scan.  However, the return values can change during
> execution
> > of later SQL statements within a transaction.
> >
> > VOLATILE functions have no ridge requirements, and it is the default type
> > for functions.
> >
> > What does the above mean? PostgreSQL uses the three function types to
> > decide on optimization paths to improve performance.
> >
> > Immutable functions offer the most optimization options as PostgreSQL can
> > inline and cache the results of a function call during execution.
> Example:
> >
> > <pre><code class="sql">
> > CREATE TEMPORARY TABLE test (id integer, tt text);
> >
> > INSERT INTO test (SELECT (random() * 10)::integer, random()::text FROM
> > GENERATE_SERIES(0,999));
> >
> > EXPLAIN SELECT true  FROM test  WHERE id = 2+2 ;
> > --returns
> > --Seq Scan on test  (cost=0.00..19.50 rows=88 width=1)
> > --  Filter: (id = 4)
> > </code></pre>
> >
> > We can see the FILTER condition was executed in-lining the filtering
> > condition on the value 4, not on 2 + 2. PostgreSQL rewrote the query to
> use
> > a static value of 4.
> >
> > <pre><code class="sql">
> > SELECT true  FROM test  WHERE id = 4
> > </code></pre>
> >
> > PostgreSQL can do this as the + operator is marked as immutable. This
> saved
> > executing the + operator an additional 999 times when comparing id to
> 2+2.
> >
> > Stable functions have similar properties to immutable functions, which
> > allows PostgreSQL to cache and inline results while executing a table
> scan
> > to avoid executing costly functions multiple times when scanning
> thousands
> > to millions of rows. However, a stable function must be executed again
> > later in a transaction. Example:
> >
> > <pre><code class="sql">
> > BEGIN;
> >
> > SELECT statement_timestamp(), now();
> > SELECT pg_sleep(2);
> > SELECT statement_timestamp(), now();
> >
> > ROLLBACK;
> > </code></pre>
> >
> >
> > With a volatile function, PostgreSQL cannot make optimization decisions
> and
> > must execute the function every time it is called. This makes volatile
> > functions very expensive if used in a table scan, for example.
> >
> >
> > <pre><code class="sql">
> > SELECT timeofday() as volatile_time, NOW() stable_time_during_transaction
> > FROM generate_series(0,999)
> > </code></pre>
> >
> > This creates a thousand row result set where  TIMEOFDAY () is changing
> > while NOW () is stable and does not change during a transaction.
> >
> > PostgreSQL functions marked as stable and immutable will always return
> that
> > same value when the same arguments are supplied. They’re not going to be
> > volatile, returning an unexpected value.
> >
> > The three function types Immutable, Stable, Volatile are used by the
> > planner to decide which execution plan to create to improve performance.
> >
> > When would an IMMUTABLE or STABLE function become unstable? An immutable
> > function will not become unstable as long as the same arguments are
> > supplied. A stable function can become unstable if the function does
> > database lookups that changes its behavior. This is the reason stable
> > functions have to be executed in later SQL commands in a transaction,
> while
> > IMMUTABLE functions do not.
> >
> > How do we know which function PostgreSQL chooses to execute/use?
> PostgreSQL
> > will search through the catalog for matching function name, number of
> > arguments, argument types, if arguments have default values and will
> order
> > the functions by the session’s SEARCH_PATH. PostgreSQL will use the first
> > function it finds that matches all the above requirements.
> >
> > Leaving out the precision for casting operation, PostgreSQL will match
> to a
> > different function as the arguments can change as TIMESTAMP can have
> > variable precision, making the output unstable. This limits optimization
> > paths and will cause PostgreSQL to execute the function more times
> during a
> > table scan vs the immutable version of the function.
> >
> > This leads us to Andrew’s statement about sanitizing the text input for
> > security reasons to protect against SQL injection and prevent PostgreSQL
> > from executing IMMUTABLE and STABLE functions multiple times as the input
> > will be variable.
> >
> > In summary, the immutable, stable, and volatile type functions are used
> by
> > the planner to decide optimization paths; it does not guarantee/guard
> > unexpected behavior. However, if the arguments passed to the function are
> > consistent, the results will be immutable for immutable functions. Stable
> > functions depend on the specific function. Volatile functions should be
> > assumed to be unstable.
> >
> > Thanks,
> >
> > Avi.
> >
> >
> > On Thu, 7 Jul 2022 at 15:06 Avi Raboah <avi.raboah at gmail.com> wrote:
> >
> >> Great, thanks for the update and your efforts.
> >>
> >> really appreciate it
> >>
> >> Avi.
> >>
> >>
> >> On Thu, 7 Jul 2022 at 4:53 Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
> >>
> >>> > I have posted a question to pgsql-hackers regarding to_timestamp.
> >>> >
> >>>
> https://www.postgresql.org/message-id/flat/20220705.172957.2068967435108479827.t-ishii%40sranhm.sra.co.jp
> >>> >
> >>> > It turned out that the provolatile value for the two forms of
> >>> > to_timestamp are correct.  Surely to_timestamp(1 argument) returns
> >>> > different result depending on the time zone but the actual internal
> >>> > value of timestamptz is identical. The output difference is merely
> how
> >>> > the function prints timestamptz according to the time zone.
> >>> >
> >>> > Here are examples provided by Tom Lane.
> >>> >
> >>> > regression=# show timezone;
> >>> >      TimeZone
> >>> > ------------------
> >>> >  America/New_York
> >>> > (1 row)
> >>> >
> >>> > regression=# select to_timestamp(0);
> >>> >       to_timestamp
> >>> > ------------------------
> >>> >  1969-12-31 19:00:00-05
> >>> > (1 row)
> >>> >
> >>> > regression=# set timezone = 'utc';
> >>> > SET
> >>> > regression=# select to_timestamp(0);
> >>> >       to_timestamp
> >>> > ------------------------
> >>> >  1970-01-01 00:00:00+00
> >>> > (1 row)
> >>> >
> >>> > "1969-12-31 19:00:00-05" and "1970-01-01 00:00:00+00" are actually
> >>> > same value as timestamptz data type, which means labeling this form
> of
> >>> > to_timestamp as immutable is correct.
> >>> >
> >>> > However this does not solve the problem of query cache in Pgpool-II.
> >>> >
> >>> > The disccusion above suggests that even if the function is labeled as
> >>> > immutable, there are cases when pgpool should not create a cache for
> >>> > the SELECT which uses the function (you already showed a good
> >>> > example). I thinkk pgpool should not create a cache if a function is
> >>> > labeled other than immutable (we already do it) and if its return
> type
> >>> > is timestamptz or timetz (we have not done it yet).
> >>> >
> >>> > But there's more.
> >>> >
> >>> > PostgreSQL has similar config settings that change the output style
> of
> >>> > functions/expressions: namely datestyle and client_encoding etc.
> >>> >
> >>> > test=# show datestyle;
> >>> >  DateStyle
> >>> > -----------
> >>> >  ISO, MDY
> >>> > (1 row)
> >>> >
> >>> > test=# select '2022-07-06'::date;
> >>> >     date
> >>> > ------------
> >>> >  2022-07-06
> >>> > (1 row)
> >>> >
> >>> > test=# set datestyle to 'Postgres, mdy';
> >>> > SET
> >>> > test=# select '2022-07-06'::date;
> >>> >     date
> >>> > ------------
> >>> >  07-06-2022
> >>> > (1 row)
> >>> >
> >>> > If we execute these using pgpool with query cache enabled, we get:
> >>> >
> >>> > test=# show datestyle;
> >>> >  DateStyle
> >>> > -----------
> >>> >  ISO, MDY
> >>> > (1 row)
> >>> >
> >>> > test=# select '2022-07-06'::date;
> >>> >     date
> >>> > ------------
> >>> >  2022-07-06
> >>> > (1 row)
> >>> >
> >>> > test=# set datestyle to 'Postgres, mdy';
> >>> > SET
> >>> > test=# select '2022-07-06'::date;
> >>> >     date
> >>> > ------------
> >>> >  2022-07-06
> >>> > (1 row)
> >>> >
> >>> > Notice the last result does reflect the datestyle setting change
> >>> > because of query cache.
> >>> >
> >>> > I think there are more config parameters that induce the wrong
> >>> > behavior of pgpool:
> >>> >
> >>> > IntervalStyle, extra_float_digits, lc_messages, lc_monetary,
> >>> lc_numeric, lc_time
> >>> >
> >>> > For now I don't know how to deal with the problem. Maybe we should
> >>> > just add this as a restriction to the doc?
> >>>
> >>> I have pushed commits to partially solve the problem.
> >>>
> >>>
> https://git.postgresql.org/gitweb/?p=pgpool2.git;a=commit;h=47fd46cbcf31b46d93a8ade62fda43b82c39b4e6
> >>>
> >>> With the commit, SELECTs having functions with return type is
> >>> timestamptz or timetz are not cached anymore. This will avoid the
> >>> problem due to the time zone setting difference.  Note that the fix
> >>> has been only applied to Pgpool-II 4.2 or later. Pgpool-II 4.1 an
> >>> before are lacking necessary infrastructures to support the commit.
> >>>
> >>> Remaining problems are:
> >>> > IntervalStyle, extra_float_digits, lc_messages, lc_monetary,
> >>> lc_numeric, lc_time
> >>>
> >>> I think we have to add these to the restriction section of Pgpool-II
> >>> doc. I am going work on this later.
> >>>
> >>> Best reagards,
> >>> --
> >>> Tatsuo Ishii
> >>> SRA OSS, Inc. Japan
> >>> English: http://www.sraoss.co.jp/index_en.php
> >>> Japanese:http://www.sraoss.co.jp
> >>>
> >>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.pgpool.net/pipermail/pgpool-general/attachments/20220711/05703887/attachment-0001.htm>