[pgpool-general: 8418] Re: Extended query and cache

Sun Sep 18 21:12:06 JST 2022

> Yes I meant remove the S message.
> And you this is the expected behaviour the fronted should send the Sync
> message.
> 
> My pgproto looks like this
> 
> ‘P'     ""      "SELECT 1"      0
> 'B'     ""      ""      0       0       0
> 'D'     'P'     ""
> 'E'     ""      0
> 'Y'
> 'X'
> 
> I Remove the ‘S’ because I want to prove that the state change from active
> to idle only if we send ‘S’ message.
> When I remove like I show above the state stays active.
> 
> And that’s what I am saying when I use extended query and I get miss from
> cache execute forward  to BE and then,
> Pgpool immediately read FE next message which is ‘S’ and forward to BE.
> 
> But when I get hit from cache pgpool not forward execute to BE and skip
> reading the next frontend message which is ‘S’ and instead start read
> backend response.
> 
> This case make the state stay on active until pgpool finish to read
> everything from BE.

So your question is, why pgpool reads all data from backend when
pgpool hits cache?  Because pgpool needs to gather returning data from
backend (in the example above '1' (parse complete), '2' (bind
complete), 'T' (row description), 'D' (data row)) so that pgpool can
return data plus cache data to frontend.

Note that backend stays on active state anyway until it sends ready
for query message regardless whether pgpool hits cache data or not.
This means that even if pgpool does not hit the query cache, it needs
to read all backend data anyway before the backend goes to idle state
because unless backend returns all data, it will not return ready for
query.

> Thanks,
> 
> Avi
> 
> On Sun, 18 Sep 2022 at 12:35 Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
> 
>> > Ok,
>> >
>> > In your example you explicitly add the ‘S’ message in your
>> > extended-query-cache.data file.
>> > So in that case off course the state changed from active to idle.
>> >
>> > I use the pgproto also in my lab and I found that if  I moved the ‘S’
>> > message the connection stay on active.
>>
>> You mean "remove"? Anyway, that's an expected behavior. Frontend needs
>> to explicitely issue Sync message.
>> >
>> > That’s what I mentioned in my latest mails.
>> > That in case pgpool read the extended query from cache (means that pgpool
>> > not forward execute message)
>> > I can see when debug that instead of read the next message from frontend
>> > which is ‘S’ message and forward to BE.
>> > What happened is that we read a BE response which will be ‘1’ which is
>> > ParseComplete.
>> >
>> > This scenario not happened when the query not in cache, here after
>> execute
>> > send from FE to BE
>> > pgpool read the next message from frontend which is ‘S’ message and
>> forward
>> > to BE what change the state from active to idle.
>> >
>> > Which that make sense because when we send ‘S’ to BE we actually tell him
>> > that I got everything I need and I am ready for query.
>> >
>> > Let me know if more information needed.
>>
>> Can you share the exact pgproto data file for query cache enabled
>> case?  I am asking because I use the same extended-query-cache.data
>> file and got exactly the same result as to when query cache is
>> disabled.
>>
>> 'P'     ""      "SELECT 1"      0
>> 'B'     ""      ""      0       0       0
>> 'D'     'P'     ""
>> 'E'     ""      0
>> 'S'
>> 'Y'
>> 'X'
>>
>> pgproto -d test -p 11000 -f ../extended-query-cache.data
>> FE=> Parse(stmt="", query="SELECT 1")
>> FE=> Bind(stmt="", portal="")
>> FE=> Describe(portal="")
>> FE=> Execute(portal="")
>> FE=> Sync
>> <= BE ParseComplete
>> <= BE BindComplete
>> <= BE RowDescription
>> <= BE DataRow
>> <= BE CommandComplete(SELECT 1)
>> <= BE ReadyForQuery(I)
>> FE=> Terminate
>>
>> (query cache is enabled and the data is fetchd from query cache)
>>
>> > Thanks,
>> >
>> > Avi.
>> >
>> > On Sat, 17 Sep 2022 at 12:34 Avi Raboah <avi.raboah at gmail.com> wrote:
>> >
>> >> Hi,
>> >>
>> >> I’ll try to do a demo later today. And i’ll send the flow as I see it
>> when
>> >> debugging. For sure when sync message forward to backend the state
>> change
>> >> from active to idle.
>> >>
>> >> And I agree in case sync not forwarded readyForQuery change the state
>> from
>> >> active to idle.
>> >>
>> >> Thanks,
>> >>
>> >> Avi.
>> >>
>> >>
>> >>
>> >> On Sat, 17 Sep 2022 at 12:26 Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
>> >>
>> >>> > Hi,
>> >>> >
>> >>> > So I check that, after pgpool read the result from cache I found the
>> >>> next
>> >>> > step to be was message type ‘1’ parse complete from backend and not
>> sync
>> >>> > message from frontend as I expected.
>> >>> >
>> >>> > Notice that after sync message send to the backend the connection
>> >>> becomes
>> >>> > idle. In case the sync message didn’t read from frontend to backend,
>> >>> then
>> >>> > the connection becomes idle only after command complete.
>> >>>
>> >>> Probably you misunderstand here. Command complete message never makes
>> >>> PostgreSQL in idle state. PostgreSQL becomes idle (including idle in
>> >>> transaction) state only after issuing Ready for Query message.
>> >>>
>> >>> > As I mentioned earlier Sync message not send from frontend to
>> backend in
>> >>> > case pgpool read the result from cache.
>> >>> >
>> >>> > Could you check that please ? Because what I saw it’s not what you
>> >>> > described.
>> >>>
>> >>> That's not what I saw here. Here's log emitted by pgproto (testing tool
>> >>> provided by pgpool distribution).
>> >>>
>> >>> t-ishii$ cat ../extended-query-cache.data
>> >>> 'P'     ""      "SELECT 1"      0
>> >>> 'B'     ""      ""      0       0       0
>> >>> 'D'     'P'     ""
>> >>> 'E'     ""      0
>> >>> 'S'
>> >>> 'Y'
>> >>> 'X'
>> >>>
>> >>> t-ishii$ pgproto -d test -p 11000 -f ../extended-query-cache.data
>> >>> FE=> Parse(stmt="", query="SELECT 1")
>> >>> FE=> Bind(stmt="", portal="")
>> >>> FE=> Describe(portal="")
>> >>> FE=> Execute(portal="")
>> >>> FE=> Sync
>> >>> <= BE ParseComplete
>> >>> <= BE BindComplete
>> >>> <= BE RowDescription
>> >>> <= BE DataRow
>> >>> <= BE CommandComplete(SELECT 1)
>> >>> <= BE ReadyForQuery(I)
>> >>> FE=> Terminate
>> >>>
>> >>> "FE=> " means that a message sent from frontend to pgpool.
>> >>> "<= BE" means that a message sent from pgpool to frontend.
>> >>>
>> >>> Note that this output is exactly same as the case when query cache is
>> >>> disabled or pgproto directly connects to PostgreSQL.
>> >>>
>> >>> Also here's output from pg_stat_activity.
>> >>>
>> >>> test=# select * from pg_stat_activity where application_name =
>> 'pgproto';
>> >>> -[ RECORD 1 ]----+------------------------------
>> >>> datid            | 16392
>> >>> datname          | test
>> >>> pid              | 65254
>> >>> leader_pid       |
>> >>> usesysid         | 10
>> >>> usename          | t-ishii
>> >>> application_name | pgproto
>> >>> client_addr      | 127.0.0.1
>> >>> client_hostname  |
>> >>> client_port      | 56758
>> >>> backend_start    | 2022-09-17 18:16:30.854156+09
>> >>> xact_start       |
>> >>> query_start      |
>> >>> state_change     | 2022-09-17 18:16:30.856935+09
>> >>> wait_event_type  | Client
>> >>> wait_event       | ClientRead
>> >>> state            | idle
>> >>> backend_xid      |
>> >>> backend_xmin     |
>> >>> query_id         |
>> >>> query            |
>> >>> backend_type     | client backend
>> >>>
>> >>> > Thanks,
>> >>> >
>> >>> > Avi.
>> >>> >
>> >>> >
>> >>> > On Sat, 17 Sep 2022 at 4:05 Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
>> >>> >
>> >>> >> > The issue with the aws performance insights
>> >>> >> > Is a result of the way pgpool read frontend packets when cache
>> >>> enabled
>> >>> >> and
>> >>> >> > disabled.
>> >>> >> > If we querying pgpool in extended query mode and cache disabled I
>> >>> can see
>> >>> >> > the following -
>> >>> >> > 1. Read parse from frontend and send to backend.
>> >>> >> > 2. Read bind message and send to the backend
>> >>> >> > 3. Read describe message and send to the backend
>> >>> >> > 4. Read execute message and send to the backend —>
>> >>> >> >
>> >>> >> > Until here the connection is in active state under
>> pg_stats_activity
>> >>> >> >
>> >>> >> > 5. Read sync message and send to backend. —>
>> >>> >> > Now the connection moved to idle state.
>> >>> >> >
>> >>> >> > When cache enabled and the query was found in cache :
>> >>> >> >
>> >>> >> > 1. Read parse from frontend and send to backend.
>> >>> >> > 2. Read bind message and send to the backend
>> >>> >> > 3. Read describe message and send to the backend
>> >>> >> > 4. Read execute message and send to the backend —>
>> >>> >> >
>> >>> >> > Now pgpool found the query in cache so execute message not send to
>> >>> the
>> >>> >> > backend.
>> >>> >> > In that case instead of read sync from frontend and update the
>> >>> backend in
>> >>> >> > order to move the connection state from active to idle . Pgpool
>> start
>> >>> >> read
>> >>> >> > packet from backend which will be ‘1’, ‘2’ and so on untill
>> >>> >> > commandComplete. This behaviour cause the connection to be in
>> active
>> >>> >> state
>> >>> >> > until command complete.
>> >>> >>
>> >>> >> Really? pgpool does not try to read command complete from backend in
>> >>> >> this case because the command complete message is part of the cache
>> >>> >> data.
>> >>> >>
>> >>> >> After 4 (actually pgpool does not send the execute message to
>> backend
>> >>> >> because the cache is there)
>> >>> >>
>> >>> >> 1) pgpool read Sync message from frontend and forward to backend.
>> >>> >>
>> >>> >> 2) pgpool read from backend and returns '1' (command complete,
>> >>> >>    response to parse message), '2' (bind complete, response to bind
>> >>> >>    message), Row description (response to describe message) to
>> >>> >>    frontend.
>> >>> >>
>> >>> >> 3) pgpool returns (without reading from backend) Command Complete
>> and
>> >>> row
>> >>> >> data (if any).
>> >>> >>
>> >>> >> 4) pgpool read ready for query message from backend and returns it
>> to
>> >>> >> frontend.
>> >>> >>    at this point, pg_stat_activity should show 'idle' state.
>> >>> >>
>> >>> >> > I am trying to understand if it’s a bug or it should be like that.
>> >>> >> >
>> >>> >> > Thanks,
>> >>> >> >
>> >>> >> > Avi.
>> >>> >> >
>> >>> >> > On Thu, 15 Sep 2022 at 7:34 Avi Raboah <avi.raboah at gmail.com>
>> wrote:
>> >>> >> >
>> >>> >> >> Thank you for the clarification.
>> >>> >> >>
>> >>> >> >> On Thu, 15 Sep 2022 at 3:47 Tatsuo Ishii <ishii at sraoss.co.jp>
>> >>> wrote:
>> >>> >> >>
>> >>> >> >>> > Hi,
>> >>> >> >>> > I am not sure I understood what you mean when you mentioned
>> >>> “close
>> >>> >> >>> message”
>> >>> >> >>> >
>> >>> >> >>> > But i’ll try to ask you in a different way.
>> >>> >> >>> >
>> >>> >> >>> > Let look on the following example:
>> >>> >> >>> >
>> >>> >> >>> > 1. Select * from users where id = $1; using extended query. —>
>> >>> miss
>> >>> >> >>> >
>> >>> >> >>> > 2. Now in case i’ll run the same query i’ll get it back from
>> >>> cache.
>> >>> >> >>> >
>> >>> >> >>> > So I am understanding why the parse message should be pass to
>> the
>> >>> >> >>> backend
>> >>> >> >>> > and not read the result from cache due to that Parse message
>> >>> didn’t
>> >>> >> >>> > contains the query params. That’s make sense.
>> >>> >> >>> >
>> >>> >> >>> > But in Bind message the packet already contains the query
>> >>> params. So
>> >>> >> why
>> >>> >> >>> > you can’t read the result from cache here and we need to wait
>> to
>> >>> the
>> >>> >> >>> > execute message?
>> >>> >> >>>
>> >>> >> >>> Bind message does not return the "result". It just returns it
>> >>> succeeded
>> >>> >> >>> (bind complete message) or failed (error response). The actual
>> >>> result
>> >>> >> >>> (which is called "portal") is only in PostgreSQL's memory. So
>> >>> "caching
>> >>> >> >>> result of bind message" is almost meaningless.
>> >>> >> >>>
>> >>> >> >>> > One more question is why we need to send the bind message
>> anyway
>> >>> to
>> >>> >> the
>> >>> >> >>> > backend in case we have already the result of the execute
>> >>> message in
>> >>> >> >>> cache?
>> >>> >> >>>
>> >>> >> >>> Think about this scenario:
>> >>> >> >>>
>> >>> >> >>> 1. pgpool receives a bind message but finds that the
>> corresponding
>> >>> >> >>> query cache exists. So pgpool does not send the bind message to
>> >>> >> >>> backend.
>> >>> >> >>>
>> >>> >> >>> 2. In other session the table used in the query has been
>> modified
>> >>> and
>> >>> >> >>> the query is gone.
>> >>> >> >>>
>> >>> >> >>> 3. pgpool receives an execute message and tries to extract the
>> >>> query
>> >>> >> >>> cache, but it has already gone. However pgpool cannot send the
>> >>> execute
>> >>> >> >>> message to backend because the result of bind message (portal)
>> does
>> >>> >> >>> not exist in the backend.
>> >>> >> >>>
>> >>> >> >>> > I am asking the above questions because I am looking in the
>> rds
>> >>> >> >>> performance
>> >>> >> >>> > insights.
>> >>> >> >>> > And I can see that in case I am sending a lot of queries
>> >>> concurrently
>> >>> >> >>> when
>> >>> >> >>> > I have cache enabled I can see a very big wait because of a
>> lot
>> >>> of
>> >>> >> >>> > connections in idle ClientRead wait event.
>> >>> >> >>>
>> >>> >> >>> I am not familiar with performance insights and cannot comment
>> on
>> >>> it.
>> >>> >> >>>
>> >>> >> >>> > And I am asking myself why the db needs to know about queries
>> we
>> >>> >> already
>> >>> >> >>> > have in cache?
>> >>> >> >>> >
>> >>> >> >>> > Hope it was clear.
>> >>> >> >>> >
>> >>> >> >>> >
>> >>> >> >>> > Thanks a lot,
>> >>> >> >>> >
>> >>> >> >>> > Avi
>> >>> >> >>> >
>> >>> >> >>> >
>> >>> >> >>> >
>> >>> >> >>> > On Wed, 14 Sep 2022 at 2:42 Tatsuo Ishii <ishii at sraoss.co.jp>
>> >>> wrote:
>> >>> >> >>> >
>> >>> >> >>> >> > Hi,
>> >>> >> >>> >> >
>> >>> >> >>> >> > In case query already cached, why in extended query mode
>> the
>> >>> db
>> >>> >> knows
>> >>> >> >>> >> about
>> >>> >> >>> >> > that query?
>> >>> >> >>> >> > Because in that case pgpool should return the result from
>> the
>> >>> >> cache
>> >>> >> >>> in
>> >>> >> >>> >> > order to save db resources.
>> >>> >> >>> >> > But I found that although we have the query in cache pgpool
>> >>> still
>> >>> >> >>> send
>> >>> >> >>> >> > parse and bind request to the db.
>> >>> >> >>> >> >
>> >>> >> >>> >> > Please share your thoughts 🙏
>> >>> >> >>> >>
>> >>> >> >>> >> Yes, Pgpool-II only caches the result of execute message. The
>> >>> main
>> >>> >> >>> >> reason is, to not return stale cache.  After receiving a
>> close
>> >>> >> message
>> >>> >> >>> >> for the statement or the portal that is bound to the execute
>> >>> >> message,
>> >>> >> >>> >> the cache for the execute message should not be returned.
>> When
>> >>> >> >>> >> Pgpool-II receives close messages, they remove the internal
>> >>> record
>> >>> >> of
>> >>> >> >>> >> previpusly received statement or portal. When an execute
>> message
>> >>> >> >>> >> arrives, Pgpool-II checks whether the record for the execute
>> >>> message
>> >>> >> >>> >> exists. If does not, the request fails. This strategy is
>> simple
>> >>> but
>> >>> >> >>> >> works well.
>> >>> >> >>> >>
>> >>> >> >>> >> I think the saving by caching parse message is not small for
>> >>> >> >>> >> especially complex queries. But in this case users already
>> >>> reuse the
>> >>> >> >>> >> prepared statement anyway.
>> >>> >> >>> >>
>> >>> >> >>> >> I think the saving of bind message is usually small because
>> it
>> >>> does
>> >>> >> >>> >> not involve planning in most cases.
>> >>> >> >>> >>
>> >>> >> >>> >> Best reagards,
>> >>> >> >>> >> --
>> >>> >> >>> >> Tatsuo Ishii
>> >>> >> >>> >> SRA OSS LLC
>> >>> >> >>> >> English: http://www.sraoss.co.jp/index_en/
>> >>> >> >>> >> Japanese:http://www.sraoss.co.jp
>> >>> >> >>> >>
>> >>> >> >>>
>> >>> >> >>
>> >>> >>
>> >>>
>> >>
>>