[pgpool-general: 8267] Re: Pg_stat_activity hung queries issue

Tatsuo Ishii ishii at sraoss.co.jp
Tue Jun 28 15:57:26 JST 2022


> Thanks for your email.
> 
> I am not sure why everything runs as expected in that scenario if the
> cache is off.
>
> This issue happened only when the cache is on.

Really? The log you posted was with the cache being on?

2022-06-21 12:34:38.857: [unknown] pid 238: DETAIL:  statement: "__asyncpg_stmt_5__", query: "select * from "group" limit 1"
2022-06-21 12:34:38.858: [unknown] pid 238: FATAL:  Backend throw an error message
2022-06-21 12:34:38.858: [unknown] pid 238: DETAIL:  Exiting current session because of an error from backend
2022-06-21 12:34:38.858: [unknown] pid 238: HINT:  BACKEND Error: "current transaction is aborted, commands ignored until end of transaction block"
2022-06-21 12:34:38.858: [unknown] pid 238: CONTEXT:  while searching system catalog, When relcache is missed
2022-06-21 12:34:38.859: main pid 1: LOG:  child process with pid: 238 exits with status 256
2022-06-21 12:34:38.859: main pid 1: LOG:  fork a new child process with pid: 270

> And from my understanding, the only difference between these scenarios
> is the catalog queries.
> 
> In both cases the extended queries work the same and get the same error
> in the transaction about email_unique. But when the cache is on the
> session is aborted, and when the cache is off it is not. Any thoughts?

My scenario causes a session abort even without the query cache.

> On Tue, 28 Jun 2022 at 7:33 Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
> 
>> It turned out that the Flush message is irrelevant. Here is a scenario
>> file for pgproto to raise such an error.
>>
>> 'Q'     "DROP TABLE IF EXISTS t1 "
>> 'Y'
>> 'Q'     "CREATE TABLE t1(i int primary key)"
>> 'Y'
>> 'Q'     "INSERT INTO t1 VALUES(1)"
>> 'Y'
>> 'Q'     "BEGIN"
>> 'Y'
>> 'P'     ""      "INSERT INTO t1 VALUES(1)"      0
>> 'B'     ""      ""      0       0       0
>> 'E'     ""      0
>> 'Q'     "SELECT count(*) from pg_class" 0
>> 'P'     ""      "END"   0
>> 'B'     ""      ""      0       0       0
>> 'E'     ""      0
>> 'S'
>> 'Y'
>> 'X'
>>
>> And the result is here:
>>
>> FE=> Query (query="DROP TABLE IF EXISTS t1 ")
>> <= BE CommandComplete(DROP TABLE)
>> <= BE ReadyForQuery(I)
>> FE=> Query (query="CREATE TABLE t1(i int primary key)")
>> <= BE CommandComplete(CREATE TABLE)
>> <= BE ReadyForQuery(I)
>> FE=> Query (query="INSERT INTO t1 VALUES(1)")
>> <= BE CommandComplete(INSERT 0 1)
>> <= BE ReadyForQuery(I)
>> FE=> Query (query="BEGIN")
>> <= BE CommandComplete(BEGIN)
>> <= BE ReadyForQuery(T)
>> FE=> Parse(stmt="", query="INSERT INTO t1 VALUES(1)")
>> FE=> Bind(stmt="", portal="")
>> FE=> Execute(portal="")
>> FE=> Query (query="SELECT count(*) from pg_class")
>> FE=> Parse(stmt="", query="END")
>> FE=> Bind(stmt="", portal="")
>> FE=> Execute(portal="")
>> FE=> Sync
>> <= BE ErrorResponse(S FATAL C XX000 M Backend throw an error message D
>> Exiting current session because of an error from backend H BACKEND Error:
>> "duplicate key value violates unique constraint "t1_pkey"" F
>> pool_process_query.c L 2130 )
>> read_it: EOF detected (session aborted).
>>
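As context for the scenario above, the extended-protocol frontend messages used by pgproto ('P' Parse, 'B' Bind, 'E' Execute, 'S' Sync) share a simple wire framing. Here is a minimal Python sketch of that framing (the helper names are illustrative, not pgproto's API):

```python
import struct

def _msg(tag: bytes, payload: bytes) -> bytes:
    # Frontend message framing: 1 tag byte, then an int32 length that
    # counts itself plus the payload, then the payload.
    return tag + struct.pack("!I", len(payload) + 4) + payload

def parse(stmt: str, query: str) -> bytes:
    # Parse ('P'): statement name, query text, zero parameter-type OIDs.
    return _msg(b"P", stmt.encode() + b"\x00" +
                query.encode() + b"\x00" + struct.pack("!H", 0))

def bind(portal: str = "", stmt: str = "") -> bytes:
    # Bind ('B'): portal, statement, no format codes, parameters,
    # or result-format codes.
    return _msg(b"B", portal.encode() + b"\x00" + stmt.encode() + b"\x00" +
                struct.pack("!HHH", 0, 0, 0))

def execute(portal: str = "", max_rows: int = 0) -> bytes:
    # Execute ('E'): portal name plus a row limit (0 = no limit).
    return _msg(b"E", portal.encode() + b"\x00" + struct.pack("!I", max_rows))

def sync() -> bytes:
    # Sync ('S') has an empty payload; it is what finally triggers the
    # backend's ReadyForQuery response.
    return _msg(b"S", b"")
```

The point of the scenario is that the execute-phase error only becomes visible to a proxy once sync() has been sent and the backend answers with ReadyForQuery.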
>> Suppose an extended query causes an error ("duplicate key value
>> violates unique constraint "t1_pkey""):
>>
>> INSERT INTO t1 VALUES(1);
>>
>> At the parse/describe/bind phase this is fine as long as the table t1
>> exists. However, at the execute phase it could fail, for example
>> because of a duplicate key violation. If we are in an explicit
>> transaction, there is no way to know that such an error occurred and
>> that the transaction is in abort state until a Sync message is issued,
>> because Sync forces the backend to emit a "Ready for query" message,
>> which carries the transaction state information ("Ready for query" is
>> the only message which returns the transaction state). If the
>> transaction state is "abort", no query will be accepted by the
>> backend. Thus, if the client sends a query after the transaction is in
>> abort status, the query always fails. Unfortunately pgpool needs to
>> issue such queries to inspect PostgreSQL's system catalogs. Such
>> queries will fail in this situation, and pgpool has no choice other
>> than raising a fatal error and disconnecting the frontend (that's what
>> you are seeing).
>>
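To make the "Ready for query" point concrete: ReadyForQuery ('Z') is the only backend message carrying the transaction status byte ('I' idle, 'T' in transaction, 'E' failed). A small illustrative sketch of encoding and decoding it (not pgpool code):

```python
import struct

# ReadyForQuery ('Z'): tag byte, int32 length (always 5), then one
# status byte: b"I" = idle, b"T" = in transaction, b"E" = failed.
def ready_for_query(status: bytes) -> bytes:
    return b"Z" + struct.pack("!I", 5) + status

def transaction_status(msg: bytes) -> str:
    # A proxy only learns the transaction state from this message, which
    # the backend sends in response to Sync (or after each
    # simple-protocol Query).
    assert msg[0:1] == b"Z" and struct.unpack("!I", msg[1:5])[0] == 5
    return msg[5:6].decode()
```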
>> In the simple query protocol, each statement is always followed by a
>> "Ready for query" message, so pgpool can know what the transaction
>> state is, and the issue does not happen.
>>
>> I'm afraid I was not able to find a solution for this. Sorry.
>>
>> > Ok, thanks for the update.
>> >
>> > On Mon, 27 Jun 2022 at 3:12 Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
>> >
>> >> I am investigating this issue using pgproto with the data you
>> >> provided.  It seems there are lots of "Flush" messages in the data,
>> >> and I think the Flush messages are causing trouble in pgpool. I
>> >> guess it will take some time before I find a solution.
>> >>
>> >> > Any thoughts?
>> >> > On Wed, 22 Jun 2022 at 8:45 Avi Raboah <avi.raboah at gmail.com> wrote:
>> >> >
>> >> >> Yea you right
>> >> >>
>> >> >> On Wed, 22 Jun 2022 at 8:45 Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
>> >> >>
>> >> >>> The default for backend_clustering_mode varies depending on the
>> >> >>> pgpool version. Can I assume you are using 4.3?
>> >> >>>
>> >> >>> > All the configurations I didn’t provided are using the default.
>> >> >>> >
>> >> >>> > One more thing, the issue is occurring only if query cache enabled
>> >> >>> >
>> >> >>> > Thanks,
>> >> >>> >
>> >> >>> > Avi
>> >> >>> >
>> >> >>> > On Wed, 22 Jun 2022 at 4:10 Tatsuo Ishii <ishii at sraoss.co.jp>
>> wrote:
>> >> >>> >
>> >> >>> >> What is backend_clustering_mode?
>> >> >>> >>
>> >> >>> >> > Pgpool.conf-
>> >> >>> >> >
>> >> >>> >> > listen_addresses = '*'
>> >> >>> >> >
>> >> >>> >> > port = '9999'
>> >> >>> >> >
>> >> >>> >> > socket_dir = '/var/run/pgpool'
>> >> >>> >> >
>> >> >>> >> > pcp_socket_dir = '/var/run/pgpool'
>> >> >>> >> >
>> >> >>> >> > wd_ipc_socket_dir = '/var/run/pgpool'
>> >> >>> >> >
>> >> >>> >> > pid_file_name = '/var/run/pgpool/pgpool.pid'
>> >> >>> >> >
>> >> >>> >> > backend_hostname0 = db_instance
>> >> >>> >> >
>> >> >>> >> > backend_port0 = '5432'
>> >> >>> >> >
>> >> >>> >> > backend_flag0 = 'ALLOW_TO_FAILOVER'
>> >> >>> >> >
>> >> >>> >> > backend_weight0 = '1'
>> >> >>> >> >
>> >> >>> >> > connection_cache = on
>> >> >>> >> >
>> >> >>> >> > num_init_children = 200
>> >> >>> >> >
>> >> >>> >> > child_life_time = 0
>> >> >>> >> >
>> >> >>> >> > max_pool = 4
>> >> >>> >> >
>> >> >>> >> > log_destination = 'stderr'
>> >> >>> >> >
>> >> >>> >> > log_line_prefix = '%m: %a pid %p: '
>> >> >>> >> >
>> >> >>> >> > log_connections = off
>> >> >>> >> >
>> >> >>> >> > log_disconnections = off
>> >> >>> >> >
>> >> >>> >> > log_statement = off
>> >> >>> >> >
>> >> >>> >> > log_client_messages = on
>> >> >>> >> >
>> >> >>> >> > client_min_messages = info
>> >> >>> >> >
>> >> >>> >> > log_min_messages = debug
>> >> >>> >> >
>> >> >>> >> > logging_collector = on
>> >> >>> >> >
>> >> >>> >> > log_directory = '/tmp/pgpool_logs'
>> >> >>> >> >
>> >> >>> >> > log_filename = 'pgpool-%Y-%m-%d_%H%M%S.log'
>> >> >>> >> >
>> >> >>> >> > log_file_mode = 0600
>> >> >>> >> >
>> >> >>> >> > log_truncate_on_rotation = off
>> >> >>> >> >
>> >> >>> >> > log_rotation_age = 1d
>> >> >>> >> >
>> >> >>> >> > log_rotation_size = 200MB
>> >> >>> >> >
>> >> >>> >> > sr_check_user = 'user'
>> >> >>> >> >
>> >> >>> >> > sr_check_password = 'password'
>> >> >>> >> >
>> >> >>> >> > sr_check_period = '10'
>> >> >>> >> >
>> >> >>> >> > memory_cache_enabled = 'on'
>> >> >>> >> >
>> >> >>> >> > memqcache_total_size = 64MB
>> >> >>> >> >
>> >> >>> >> > memqcache_max_num_cache = 1000000
>> >> >>> >> >
>> >> >>> >> > memqcache_maxcache = 400kB
>> >> >>> >> >
>> >> >>> >> > memqcache_cache_block_size = 1MB
>> >> >>> >> >
>> >> >>> >> > memqcache_oiddir = '/var/log/pgpool/oiddir'
>> >> >>> >> >
>> >> >>> >> > cache_safe_memqcache_table_list = ''
>> >> >>> >> >
>> >> >>> >> > cache_unsafe_memqcache_table_list = ''
>> >> >>> >> >
>> >> >>> >> > check_unlogged_table = on
>> >> >>> >> >
>> >> >>> >> > health_check_period = '10'
>> >> >>> >> >
>> >> >>> >> > health_check_user = 'user'
>> >> >>> >> >
>> >> >>> >> > failover_on_backend_error = 'off'
>> >> >>> >> >
>> >> >>> >> >
>> >> >>> >> >
>> >> >>> >> >
>> >> >>> >> > On Tue, 21 Jun 2022 at 10:51 Tatsuo Ishii <ishii at sraoss.co.jp>
>> >> >>> wrote:
>> >> >>> >> >
>> >> >>> >> >> Please share the log with log_client_messages = on so that
>> >> >>> >> >> I can see what your client is actually sending.  Also I
>> >> >>> >> >> need pgpool.conf.
>> >> >>> >> >>
>> >> >>> >> >> > We are using the asyncpg module in Python in order to send
>> >> >>> >> >> > our queries; the difference between that module and psycopg
>> >> >>> >> >> > is that asyncpg uses extended queries.
>> >> >>> >> >> > And I can see that the issue is reproduced when the
>> >> >>> >> >> > transaction is aborted and the catalog queries are launched.
>> >> >>> >> >> >
>> >> >>> >> >> > I checked in the code and found that
>> >> >>> check_transaction_state_and_abort
>> >> >>> >> >> > happens only in simpleQuery process.
>> >> >>> >> >> >
>> >> >>> >> >> > Thanks,
>> >> >>> >> >> >
>> >> >>> >> >> > Avi
>> >> >>> >> >> >
>> >> >>> >> >> > On Tue, 21 Jun 2022 at 6:43 Tatsuo Ishii <
>> ishii at sraoss.co.jp>
>> >> >>> wrote:
>> >> >>> >> >> >
>> >> >>> >> >> >> Can you elaborate more?
>> >> >>> >> >> >>
>> >> >>> >> >> >> I tried the following case, and pgpool ignores all
>> >> >>> >> >> >> queries from the client after a parse error occurs, as
>> >> >>> >> >> >> expected.
>> >> >>> >> >> >>
>> >> >>> >> >> >> pgproto -d test -p 11000 -f ../pgproto.data
>> >> >>> >> >> >> FE=> Parse(stmt="", query="BEGIN")
>> >> >>> >> >> >> FE=> Bind(stmt="", portal="")
>> >> >>> >> >> >> FE=> Execute(portal="")
>> >> >>> >> >> >> FE=> Parse(stmt="", query="INSERT INTO non_existing_table
>> >> >>> values(1)")
>> >> >>> >> >> >> FE=> Bind(stmt="", portal="")
>> >> >>> >> >> >> FE=> Execute(portal="")
>> >> >>> >> >> >> FE=> Parse(stmt="", query="END")
>> >> >>> >> >> >> FE=> Bind(stmt="", portal="")
>> >> >>> >> >> >> FE=> Execute(portal="")
>> >> >>> >> >> >> FE=> Sync
>> >> >>> >> >> >> <= BE ParseComplete
>> >> >>> >> >> >> <= BE BindComplete
>> >> >>> >> >> >> <= BE CommandComplete(BEGIN)
>> >> >>> >> >> >> <= BE ErrorResponse(S ERROR V ERROR C 42P01 M relation
>> >> >>> >> >> >> "non_existing_table" does not exist P 13 F
>> parse_relation.c L
>> >> >>> 1381 R
>> >> >>> >> >> >> parserOpenTable )
>> >> >>> >> >> >> <= BE ReadyForQuery(E)
>> >> >>> >> >> >> FE=> Terminate
>> >> >>> >> >> >>
>> >> >>> >> >> >> > Hi,
>> >> >>> >> >> >> >
>> >> >>> >> >> >> > After investigating the code I found that the patch you
>> >> >>> >> >> >> > provided fixes the problem only for the simple query
>> >> >>> >> >> >> > protocol. In the extended query protocol the issue still
>> >> >>> >> >> >> > exists.
>> >> >>> >> >> >> >
>> >> >>> >> >> >> > Do you know the right place to add the transaction
>> >> >>> >> >> >> > failure check in the extended query process?
>> >> >>> >> >> >> > On Mon, 9 May 2022 at 16:17 Tatsuo Ishii <
>> >> ishii at sraoss.co.jp>
>> >> >>> >> wrote:
>> >> >>> >> >> >> >
>> >> >>> >> >> >> >> The patch (plus a further fix) has been committed to
>> >> >>> >> >> >> >> the master and 4.3 stable branches.
>> >> >>> >> >> >> >>
>> >> >>> >> >> >> >>
>> >> >>> >> >>
>> >> >>>
>> >> https://www.pgpool.net/pipermail/pgpool-committers/2022-May/008592.html
>> >> >>> >> >> >> >>
>> >> >>> >> >>
>> >> >>>
>> >> https://www.pgpool.net/pipermail/pgpool-committers/2022-May/008591.html
>> >> >>> >> >> >> >>
>> >> >>> >> >> >> >> The patches will appear in the next release (May 19,
>> 2022).
>> >> >>> >> >> >> >>
>> >> >>> >> >> >> >> I have not pushed it to other stable branches because
>> >> >>> >> >> >> >> the fix is not small and I wanted to minimize risk.
>> >> >>> >> >> >> >>
>> >> >>> >> >> >> >> > Thanks a lot for the patch; have a great weekend.
>> >> >>> >> >> >> >> >
>> >> >>> >> >> >> >> > On Sat, 7 May 2022 at 13:35 Tatsuo Ishii <
>> >> >>> ishii at sraoss.co.jp>
>> >> >>> >> >> wrote:
>> >> >>> >> >> >> >> >
>> >> >>> >> >> >> >> >> > Hi,
>> >> >>> >> >> >> >> >> >
>> >> >>> >> >> >> >> >> > Still thinking.
>> >> >>> >> >> >> >> >> >
>> >> >>> >> >> >> >> >> > Best regards,
>> >> >>> >> >> >> >> >> > --
>> >> >>> >> >> >> >> >> > Tatsuo Ishii
>> >> >>> >> >> >> >> >> > SRA OSS, Inc. Japan
>> >> >>> >> >> >> >> >> > English: http://www.sraoss.co.jp/index_en.php
>> >> >>> >> >> >> >> >> > Japanese:http://www.sraoss.co.jp
>> >> >>> >> >> >> >> >> >
>> >> >>> >> >> >> >> >> >
>> >> >>> >> >> >> >> >> >> Hi,
>> >> >>> >> >> >> >> >> >>
>> >> >>> >> >> >> >> >> >> Any news about this thread?
>> >> >>> >> >> >> >> >> >>
>> >> >>> >> >> >> >> >> >> Thanks,
>> >> >>> >> >> >> >> >> >> Avi.
>> >> >>> >> >> >> >> >> >>
>> >> >>> >> >> >> >> >> >> On Mon, 25 Apr 2022 at 11:13 Avi Raboah <
>> >> >>> >> avi.raboah at gmail.com>
>> >> >>> >> >> >> >> wrote:
>> >> >>> >> >> >> >> >> >>
>> >> >>> >> >> >> >> >> >>> It will be perfect thanks a lot!
>> >> >>> >> >> >> >> >> >>>
>> >> >>> >> >> >> >> >> >>> On Mon, 25 Apr 2022 at 10:47 Tatsuo Ishii <
>> >> >>> >> ishii at sraoss.co.jp
>> >> >>> >> >> >
>> >> >>> >> >> >> >> wrote:
>> >> >>> >> >> >> >> >> >>>
>> >> >>> >> >> >> >> >> >>>> > No, it doesn't.
>> >> >>> >> >> >> >> >> >>>> > In that case, when you make the select
>> >> >>> >> >> >> >> >> >>>> > statement, the piggybacked internal queries
>> >> >>> >> >> >> >> >> >>>> > fail and the session is rebooted
>> >> >>> >> >> >> >> >> >>>>
>> >> >>> >> >> >> >> >> >>>> I see it now.
>> >> >>> >> >> >> >> >> >>>>
>> >> >>> >> >> >> >> >> >>>> test=# begin;
>> >> >>> >> >> >> >> >> >>>> BEGIN
>> >> >>> >> >> >> >> >> >>>> test=*# insert into ttt values(1);
>> >> >>> >> >> >> >> >> >>>> ERROR:  relation "ttt" does not exist
>> >> >>> >> >> >> >> >> >>>> LINE 1: insert into ttt values(1);
>> >> >>> >> >> >> >> >> >>>>                     ^
>> >> >>> >> >> >> >> >> >>>> test=!# select * from t1;
>> >> >>> >> >> >> >> >> >>>> FATAL:  Backend throw an error message
>> >> >>> >> >> >> >> >> >>>> DETAIL:  Exiting current session because of an
>> >> >>> >> >> >> >> >> >>>> error from backend
>> >> >>> >> >> >> >> >> >>>> HINT:  BACKEND Error: "current transaction is
>> >> >>> >> >> >> >> >> >>>> aborted, commands ignored until end of
>> >> >>> >> >> >> >> >> >>>> transaction block"
>> >> >>> >> >> >> >> >> >>>> server closed the connection unexpectedly
>> >> >>> >> >> >> >> >> >>>>         This probably means the server terminated
>> >> >>> >> >> >> >> >> >>>>         abnormally before or while processing
>> >> >>> >> >> >> >> >> >>>>         the request.
>> >> >>> >> >> >> >> >> >>>> The connection to the server was lost. Attempting
>> >> >>> >> >> >> >> >> >>>> reset: Succeeded.
>> >> >>> >> >> >> >> >> >>>>
>> >> >>> >> >> >> >> >> >>>> I think pgpool could remember that the current
>> >> >>> >> >> >> >> >> >>>> transaction is in abort status and not issue the
>> >> >>> >> >> >> >> >> >>>> piggyback query when the select is issued. Let
>> >> >>> >> >> >> >> >> >>>> me see what we can do for this.
>> >> >>> >> >> >> >> >>
>> >> >>> >> >> >> >> >> I have come up with a patch. With the patch, pgpool
>> >> >>> >> >> >> >> >> checks whether the current transaction is in abort
>> >> >>> >> >> >> >> >> status. If so, it does not accept a new query at all
>> >> >>> >> >> >> >> >> and just replies back to the frontend with the message:
>> >> >>> >> >> >> >> >>
>> >> >>> >> >> >> >> >> "current transaction is aborted, commands ignored
>> >> >>> >> >> >> >> >> until end of transaction block"
>> >> >>> >> >> >> >> >>
>> >> >>> >> >> >> >> >> until a commit or abort command is sent from the
>> >> >>> >> >> >> >> >> frontend. After applying the patch, the example
>> >> >>> >> >> >> >> >> session above looks like this:
>> >> >>> >> >> >> >> >>
>> >> >>> >> >> >> >> >> test=# begin;
>> >> >>> >> >> >> >> >> BEGIN
>> >> >>> >> >> >> >> >> test=*# insert into ttt values(1);
>> >> >>> >> >> >> >> >> ERROR:  relation "ttt" does not exist
>> >> >>> >> >> >> >> >> LINE 1: insert into ttt values(1);
>> >> >>> >> >> >> >> >>                     ^
>> >> >>> >> >> >> >> >> test=!# select * from t1;
>> >> >>> >> >> >> >> >> ERROR:  current transaction is aborted, commands
>> >> >>> >> >> >> >> >> ignored until end of transaction block
>> >> >>> >> >> >> >> >>
>> >> >>> >> >> >> >> >> The patch was tested on the master branch, but I
>> >> >>> >> >> >> >> >> believe it can be applied to the 4.3 stable branch
>> >> >>> >> >> >> >> >> as well.
>> >> >>> >> >> >> >> >>
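At a high level, the behavior the patch describes can be modeled like this (an illustrative Python sketch only; pgpool's actual implementation is in C, and the names here are hypothetical):

```python
ABORT_STATUS = "E"  # status byte reported by ReadyForQuery

class SessionState:
    """Tracks the last transaction status reported by the backend."""

    def __init__(self):
        self.tx_status = "I"  # idle until told otherwise

    def on_ready_for_query(self, status: str):
        # Called whenever a ReadyForQuery message arrives from the backend.
        self.tx_status = status

    def handle_query(self, query: str):
        """Short-circuit queries while the transaction is aborted,
        instead of forwarding them (and the catalog lookups they would
        trigger) to the backend."""
        q = query.strip().rstrip(";").upper()
        if self.tx_status == ABORT_STATUS and q not in (
                "COMMIT", "ROLLBACK", "END", "ABORT"):
            return ("ERROR", "current transaction is aborted, commands "
                             "ignored until end of transaction block")
        return ("FORWARD", query)
```

The key design choice is that only transaction-ending commands are forwarded while the session is in abort state, so the frontend sees an ordinary ERROR rather than a FATAL disconnect.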
>> >> >>> >> >> >> >> >> Best regards,
>> >> >>> >> >> >> >> >> --
>> >> >>> >> >> >> >> >> Tatsuo Ishii
>> >> >>> >> >> >> >> >> SRA OSS, Inc. Japan
>> >> >>> >> >> >> >> >> English: http://www.sraoss.co.jp/index_en.php
>> >> >>> >> >> >> >> >> Japanese:http://www.sraoss.co.jp
>> >> >>> >> >> >> >> >>
>> >> >>> >> >> >> >>
>> >> >>> >> >> >>
>> >> >>> >> >>
>> >> >>> >>
>> >> >>>
>> >> >>
>> >>
>>

