[pgpool-hackers: 4299] Re: Occasional 005.jdbc test failure

Tue Apr 4 21:23:02 JST 2023

> We occasionaly see 005.jdbc test failure. Typical error is something
> like this:
> 
> 2023-02-22 08:51:47.704: PostgreSQL JDBC Driver pid 12420: LOG:  DB node id: 0 backend pid: 12488 statement: Parse: COMMIT
> 2023-02-22 08:51:47.705: PostgreSQL JDBC Driver pid 12420: LOG:  pool_send_and_wait: Error or notice message from backend: : DB node id: 0 backend pid: 12488 statement: "COMMIT" message: "prepared statement "S_1" already exists"
> 2023-02-22 08:51:47.705: PostgreSQL JDBC Driver pid 12420: LOG:  Parse: Error or notice message from backend: : DB node id: 0 backend pid: 12488 statement: "COMMIT" message: "prepared statement "S_1" already exists"
> 
> I found possible cause of the error two seconds ago:
> 
> 2023-02-22 08:51:45.241: PostgreSQL JDBC Driver pid 12420: LOG:  Sync message from frontend.
> 2023-02-22 08:51:45.241: PostgreSQL JDBC Driver pid 12420: LOG:  Terminate message from frontend.
> 2023-02-22 08:51:45.241: PostgreSQL JDBC Driver pid 12420: LOG:  DB node id: 0 backend pid: 12488 statement: DISCARD ALL
> 2023-02-22 08:51:45.242: PostgreSQL JDBC Driver pid 12420: LOG:  pool_send_and_wait: Error or notice message from backend: : DB node id: 0 backend pid: 12488 statement: "DISCARD ALL" message: "DISCARD ALL cannot be executed within a pipeline"
> 
> "DISCARD ALL" was generated by pgpool (reset_query_list) to discard
> some objects including prepared statements created in the
> session. Since DISCARD ALL failed, the prepared statement S_1 was not
> removed. Thus the next session failed because S_1 already existed.
> 
> Question is why "DISCARD ALL cannot be executed within a pipeline"
> error?
> 
> So I consult PostgreSQL document:
> https://www.postgresql.org/docs/15/protocol-flow.html#PROTOCOL-FLOW-PIPELINING
> 
> "However, there are a few DDL commands (such as CREATE DATABASE) that
> cannot be executed inside a transaction block. If one of these is
> executed in a pipeline, it will fail unless it is the first command in
> the pipeline."
> 
> "DISCARD ALL" is on the same boat as CREATE DATABASE (i.e. cannot be
> executed inside a transaction block). It seems we have to wait for
> pipeline being closed before issuing DISCARD ALL. How? Our PostgreSQL
> doc says:
> 
> "When using this method, completion of the pipeline must be determined
> by counting ReadyForQuery messages and waiting for that to reach the
> number of Syncs sent."
> 
> This means that we should not issue DISCARD ALL before receiving at
> least one ReadyForQuery from backend.
> 
> Waiting for ReadyForQuery is not a big deal. But pgpool cannot
> unconditionaly wait for ReadyForQuery. What if ReadyForQuery was
> already sent? I am going to study this issue.

After studying the issue I came to a conclusion that there's no
reliable way to wait or not wait for ReadyForQuery in a reliable
way. So attached is a patch trying to attack the issue in completely
different way: instead of reading ReadyForQuery, discard the
connection pool if pipleline or any other error occurred.  For this
purpose a global variable "reset_query_error" is introduced. In the
reset query loop, SimpleQuery() is called, then it calls
pool_send_and_wait(), and it calls
per_node_error_log(). per_node_error_log() checks whether backend
returns error response. If so, reset_query_error" is set to true in
per_node_error_log(). If it is set to true, backend_cleanup() discards
the connection pool. This is a little bit ugly but I cannot think of
better implementation at this point.

Comments/suggestions are welcome.
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

-------------- next part --------------
A non-text attachment was scrubbed...
Name: reset_query_error.patch
Type: text/x-patch
Size: 3470 bytes
Desc: not available
URL: <http://www.pgpool.net/pipermail/pgpool-hackers/attachments/20230404/cd61bbbb/attachment.bin>