[pgpool-general: 8416] Re: Extended query and cache

Tatsuo Ishii ishii at sraoss.co.jp
Sun Sep 18 18:35:22 JST 2022


> Ok,
> 
> In your example you explicitly add the ‘S’ message in your
> extended-query-cache.data file.
> So in that case off course the state changed from active to idle.
> 
> I use the pgproto also in my lab and I found that if  I moved the ‘S’
> message the connection stay on active.

You mean "remove"? Anyway, that's an expected behavior. Frontend needs
to explicitely issue Sync message.
> 
> That’s what I mentioned in my latest mails.
> That in case pgpool read the extended query from cache (means that pgpool
> not forward execute message)
> I can see when debug that instead of read the next message from frontend
> which is ‘S’ message and forward to BE.
> What happened is that we read a BE response which will be ‘1’ which is
> ParseComplete.
> 
> This scenario not happened when the query not in cache, here after execute
> send from FE to BE
> pgpool read the next message from frontend which is ‘S’ message and forward
> to BE what change the state from active to idle.
> 
> Which that make sense because when we send ‘S’ to BE we actually tell him
> that I got everything I need and I am ready for query.
> 
> Let me know if more information needed.

Can you share the exact pgproto data file for query cache enabled
case?  I am asking because I use the same extended-query-cache.data
file and got exactly the same result as to when query cache is
disabled.

'P'	""	"SELECT 1"	0
'B'	""	""	0	0	0
'D'	'P'	""
'E'	""	0
'S'
'Y'
'X'

pgproto -d test -p 11000 -f ../extended-query-cache.data
FE=> Parse(stmt="", query="SELECT 1")
FE=> Bind(stmt="", portal="")
FE=> Describe(portal="")
FE=> Execute(portal="")
FE=> Sync
<= BE ParseComplete
<= BE BindComplete
<= BE RowDescription
<= BE DataRow
<= BE CommandComplete(SELECT 1)
<= BE ReadyForQuery(I)
FE=> Terminate

(query cache is enabled and the data is fetchd from query cache)

> Thanks,
> 
> Avi.
> 
> On Sat, 17 Sep 2022 at 12:34 Avi Raboah <avi.raboah at gmail.com> wrote:
> 
>> Hi,
>>
>> I’ll try to do a demo later today. And i’ll send the flow as I see it when
>> debugging. For sure when sync message forward to backend the state change
>> from active to idle.
>>
>> And I agree in case sync not forwarded readyForQuery change the state from
>> active to idle.
>>
>> Thanks,
>>
>> Avi.
>>
>>
>>
>> On Sat, 17 Sep 2022 at 12:26 Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
>>
>>> > Hi,
>>> >
>>> > So I check that, after pgpool read the result from cache I found the
>>> next
>>> > step to be was message type ‘1’ parse complete from backend and not sync
>>> > message from frontend as I expected.
>>> >
>>> > Notice that after sync message send to the backend the connection
>>> becomes
>>> > idle. In case the sync message didn’t read from frontend to backend,
>>> then
>>> > the connection becomes idle only after command complete.
>>>
>>> Probably you misunderstand here. Command complete message never makes
>>> PostgreSQL in idle state. PostgreSQL becomes idle (including idle in
>>> transaction) state only after issuing Ready for Query message.
>>>
>>> > As I mentioned earlier Sync message not send from frontend to backend in
>>> > case pgpool read the result from cache.
>>> >
>>> > Could you check that please ? Because what I saw it’s not what you
>>> > described.
>>>
>>> That's not what I saw here. Here's log emitted by pgproto (testing tool
>>> provided by pgpool distribution).
>>>
>>> t-ishii$ cat ../extended-query-cache.data
>>> 'P'     ""      "SELECT 1"      0
>>> 'B'     ""      ""      0       0       0
>>> 'D'     'P'     ""
>>> 'E'     ""      0
>>> 'S'
>>> 'Y'
>>> 'X'
>>>
>>> t-ishii$ pgproto -d test -p 11000 -f ../extended-query-cache.data
>>> FE=> Parse(stmt="", query="SELECT 1")
>>> FE=> Bind(stmt="", portal="")
>>> FE=> Describe(portal="")
>>> FE=> Execute(portal="")
>>> FE=> Sync
>>> <= BE ParseComplete
>>> <= BE BindComplete
>>> <= BE RowDescription
>>> <= BE DataRow
>>> <= BE CommandComplete(SELECT 1)
>>> <= BE ReadyForQuery(I)
>>> FE=> Terminate
>>>
>>> "FE=> " means that a message sent from frontend to pgpool.
>>> "<= BE" means that a message sent from pgpool to frontend.
>>>
>>> Note that this output is exactly same as the case when query cache is
>>> disabled or pgproto directly connects to PostgreSQL.
>>>
>>> Also here's output from pg_stat_activity.
>>>
>>> test=# select * from pg_stat_activity where application_name = 'pgproto';
>>> -[ RECORD 1 ]----+------------------------------
>>> datid            | 16392
>>> datname          | test
>>> pid              | 65254
>>> leader_pid       |
>>> usesysid         | 10
>>> usename          | t-ishii
>>> application_name | pgproto
>>> client_addr      | 127.0.0.1
>>> client_hostname  |
>>> client_port      | 56758
>>> backend_start    | 2022-09-17 18:16:30.854156+09
>>> xact_start       |
>>> query_start      |
>>> state_change     | 2022-09-17 18:16:30.856935+09
>>> wait_event_type  | Client
>>> wait_event       | ClientRead
>>> state            | idle
>>> backend_xid      |
>>> backend_xmin     |
>>> query_id         |
>>> query            |
>>> backend_type     | client backend
>>>
>>> > Thanks,
>>> >
>>> > Avi.
>>> >
>>> >
>>> > On Sat, 17 Sep 2022 at 4:05 Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
>>> >
>>> >> > The issue with the aws performance insights
>>> >> > Is a result of the way pgpool read frontend packets when cache
>>> enabled
>>> >> and
>>> >> > disabled.
>>> >> > If we querying pgpool in extended query mode and cache disabled I
>>> can see
>>> >> > the following -
>>> >> > 1. Read parse from frontend and send to backend.
>>> >> > 2. Read bind message and send to the backend
>>> >> > 3. Read describe message and send to the backend
>>> >> > 4. Read execute message and send to the backend —>
>>> >> >
>>> >> > Until here the connection is in active state under pg_stats_activity
>>> >> >
>>> >> > 5. Read sync message and send to backend. —>
>>> >> > Now the connection moved to idle state.
>>> >> >
>>> >> > When cache enabled and the query was found in cache :
>>> >> >
>>> >> > 1. Read parse from frontend and send to backend.
>>> >> > 2. Read bind message and send to the backend
>>> >> > 3. Read describe message and send to the backend
>>> >> > 4. Read execute message and send to the backend —>
>>> >> >
>>> >> > Now pgpool found the query in cache so execute message not send to
>>> the
>>> >> > backend.
>>> >> > In that case instead of read sync from frontend and update the
>>> backend in
>>> >> > order to move the connection state from active to idle . Pgpool start
>>> >> read
>>> >> > packet from backend which will be ‘1’, ‘2’ and so on untill
>>> >> > commandComplete. This behaviour cause the connection to be in active
>>> >> state
>>> >> > until command complete.
>>> >>
>>> >> Really? pgpool does not try to read command complete from backend in
>>> >> this case because the command complete message is part of the cache
>>> >> data.
>>> >>
>>> >> After 4 (actually pgpool does not send the execute message to backend
>>> >> because the cache is there)
>>> >>
>>> >> 1) pgpool read Sync message from frontend and forward to backend.
>>> >>
>>> >> 2) pgpool read from backend and returns '1' (command complete,
>>> >>    response to parse message), '2' (bind complete, response to bind
>>> >>    message), Row description (response to describe message) to
>>> >>    frontend.
>>> >>
>>> >> 3) pgpool returns (without reading from backend) Command Complete and
>>> row
>>> >> data (if any).
>>> >>
>>> >> 4) pgpool read ready for query message from backend and returns it to
>>> >> frontend.
>>> >>    at this point, pg_stat_activity should show 'idle' state.
>>> >>
>>> >> > I am trying to understand if it’s a bug or it should be like that.
>>> >> >
>>> >> > Thanks,
>>> >> >
>>> >> > Avi.
>>> >> >
>>> >> > On Thu, 15 Sep 2022 at 7:34 Avi Raboah <avi.raboah at gmail.com> wrote:
>>> >> >
>>> >> >> Thank you for the clarification.
>>> >> >>
>>> >> >> On Thu, 15 Sep 2022 at 3:47 Tatsuo Ishii <ishii at sraoss.co.jp>
>>> wrote:
>>> >> >>
>>> >> >>> > Hi,
>>> >> >>> > I am not sure I understood what you mean when you mentioned
>>> “close
>>> >> >>> message”
>>> >> >>> >
>>> >> >>> > But i’ll try to ask you in a different way.
>>> >> >>> >
>>> >> >>> > Let look on the following example:
>>> >> >>> >
>>> >> >>> > 1. Select * from users where id = $1; using extended query. —>
>>> miss
>>> >> >>> >
>>> >> >>> > 2. Now in case i’ll run the same query i’ll get it back from
>>> cache.
>>> >> >>> >
>>> >> >>> > So I am understanding why the parse message should be pass to the
>>> >> >>> backend
>>> >> >>> > and not read the result from cache due to that Parse message
>>> didn’t
>>> >> >>> > contains the query params. That’s make sense.
>>> >> >>> >
>>> >> >>> > But in Bind message the packet already contains the query
>>> params. So
>>> >> why
>>> >> >>> > you can’t read the result from cache here and we need to wait to
>>> the
>>> >> >>> > execute message?
>>> >> >>>
>>> >> >>> Bind message does not return the "result". It just returns it
>>> succeeded
>>> >> >>> (bind complete message) or failed (error response). The actual
>>> result
>>> >> >>> (which is called "portal") is only in PostgreSQL's memory. So
>>> "caching
>>> >> >>> result of bind message" is almost meaningless.
>>> >> >>>
>>> >> >>> > One more question is why we need to send the bind message anyway
>>> to
>>> >> the
>>> >> >>> > backend in case we have already the result of the execute
>>> message in
>>> >> >>> cache?
>>> >> >>>
>>> >> >>> Think about this scenario:
>>> >> >>>
>>> >> >>> 1. pgpool receives a bind message but finds that the corresponding
>>> >> >>> query cache exists. So pgpool does not send the bind message to
>>> >> >>> backend.
>>> >> >>>
>>> >> >>> 2. In other session the table used in the query has been modified
>>> and
>>> >> >>> the query is gone.
>>> >> >>>
>>> >> >>> 3. pgpool receives an execute message and tries to extract the
>>> query
>>> >> >>> cache, but it has already gone. However pgpool cannot send the
>>> execute
>>> >> >>> message to backend because the result of bind message (portal) does
>>> >> >>> not exist in the backend.
>>> >> >>>
>>> >> >>> > I am asking the above questions because I am looking in the rds
>>> >> >>> performance
>>> >> >>> > insights.
>>> >> >>> > And I can see that in case I am sending a lot of queries
>>> concurrently
>>> >> >>> when
>>> >> >>> > I have cache enabled I can see a very big wait because of a lot
>>> of
>>> >> >>> > connections in idle ClientRead wait event.
>>> >> >>>
>>> >> >>> I am not familiar with performance insights and cannot comment on
>>> it.
>>> >> >>>
>>> >> >>> > And I am asking myself why the db needs to know about queries we
>>> >> already
>>> >> >>> > have in cache?
>>> >> >>> >
>>> >> >>> > Hope it was clear.
>>> >> >>> >
>>> >> >>> >
>>> >> >>> > Thanks a lot,
>>> >> >>> >
>>> >> >>> > Avi
>>> >> >>> >
>>> >> >>> >
>>> >> >>> >
>>> >> >>> > On Wed, 14 Sep 2022 at 2:42 Tatsuo Ishii <ishii at sraoss.co.jp>
>>> wrote:
>>> >> >>> >
>>> >> >>> >> > Hi,
>>> >> >>> >> >
>>> >> >>> >> > In case query already cached, why in extended query mode the
>>> db
>>> >> knows
>>> >> >>> >> about
>>> >> >>> >> > that query?
>>> >> >>> >> > Because in that case pgpool should return the result from the
>>> >> cache
>>> >> >>> in
>>> >> >>> >> > order to save db resources.
>>> >> >>> >> > But I found that although we have the query in cache pgpool
>>> still
>>> >> >>> send
>>> >> >>> >> > parse and bind request to the db.
>>> >> >>> >> >
>>> >> >>> >> > Please share your thoughts 🙏
>>> >> >>> >>
>>> >> >>> >> Yes, Pgpool-II only caches the result of execute message. The
>>> main
>>> >> >>> >> reason is, to not return stale cache.  After receiving a close
>>> >> message
>>> >> >>> >> for the statement or the portal that is bound to the execute
>>> >> message,
>>> >> >>> >> the cache for the execute message should not be returned. When
>>> >> >>> >> Pgpool-II receives close messages, they remove the internal
>>> record
>>> >> of
>>> >> >>> >> previpusly received statement or portal. When an execute message
>>> >> >>> >> arrives, Pgpool-II checks whether the record for the execute
>>> message
>>> >> >>> >> exists. If does not, the request fails. This strategy is simple
>>> but
>>> >> >>> >> works well.
>>> >> >>> >>
>>> >> >>> >> I think the saving by caching parse message is not small for
>>> >> >>> >> especially complex queries. But in this case users already
>>> reuse the
>>> >> >>> >> prepared statement anyway.
>>> >> >>> >>
>>> >> >>> >> I think the saving of bind message is usually small because it
>>> does
>>> >> >>> >> not involve planning in most cases.
>>> >> >>> >>
>>> >> >>> >> Best reagards,
>>> >> >>> >> --
>>> >> >>> >> Tatsuo Ishii
>>> >> >>> >> SRA OSS LLC
>>> >> >>> >> English: http://www.sraoss.co.jp/index_en/
>>> >> >>> >> Japanese:http://www.sraoss.co.jp
>>> >> >>> >>
>>> >> >>>
>>> >> >>
>>> >>
>>>
>>


More information about the pgpool-general mailing list