[pgpool-general: 685] Re: Transaction never finishes

Tatsuo Ishii ishii at postgresql.org
Sat Jun 30 08:54:33 JST 2012


> On Fri, Jun 29, 2012 at 12:09 PM, Luiz Pasqual <luiz at pasquall.com> wrote:
>> On Thu, Jun 28, 2012 at 8:06 PM, Tatsuo Ishii <ishii at postgresql.org> wrote:
>>>> Hello,
>>>>
>>>> I'm using pgpool-II 3.1.2 with streaming replication and it's working
>>>> pretty well. But I'm dealing with a weird situation and I don't know
>>>> how to debug:
>>>>
>>>> Sometimes, some transactions never finishes in the master, here is an
>>>> example, the following query:
>>>> select * from pg_stat_activity where xact_start < current_timestamp -
>>>> '10 minutes'::interval
>>>>
>>>> Results:
>>>> 20994;"****";2445;16385;"****";"";"192.168.**.**";"";44083;"2012-06-27
>>>> 05:55:39.525881-03";"2012-06-27 11:17:46.475347-03";"2012-06-27
>>>> 11:18:10.044718-03";f;"<IDLE> in transaction"
>>>>
>>>> This transaction gets AccessShareLock in the relations:
>>>> pg_class_relname_nsp_index
>>>> pg_class
>>>> pg_class_oid_index
>>>> pg_namespace
>>>> pg_namespace_oid_index
>>>> pg_namespace_nspname_index
>>>>
>>>> And one ExclusiveLock that I couldn't identify the relation.
>>>>
>>>> Sometimes, depending on the relations that are locked everything fails
>>>> (timeout) and a pgpool restart must be done. Anyone knows what is
>>>> going on?
>>>
>>> You want to identify the process id of pgpool which is dealing with
>>> PostgreSQL backend process id(in the case above it's 2445).
>>>
>>> Here is the step to find pgpool process id:
>>>
>>> 1) Execute pcp_proc_count to find pgpool process list. This command
>>>   returns all pgpool process ids.
>>>
>>> 2) For each process id in #1, execute pcp_proc_info. This will tell
>>>   what are the PostgreSQL process ids. Note that the command returns
>>>   multiple rows sorting by node id. Usually node id 0 (thus the first
>>>   line) is for primary.
>>>
>>> 3) Look for 2445 in #2 to find the pgpool process id.
>>>
>>> 4) If you could find the pgpool process id (say 12345), grep pgpool
>>>   log by using 12345. This will show what's going on with 12345.
>>
>> Well, they are all "idle in transaction".
>
> With a little more debugging, I discovered that the client who uses
> this specific process already disconnects from pgpool:
> 
> # ps aux | grep 25003
> pgpool: *** *** 127.0.0.1(51456) idle in transaction
> 
> # lsof -i -P -n | grep 25003
> pgpool    25003       root    5u  IPv4 39609135      0t0  TCP *:5432 (LISTEN)
> pgpool    25003       root    6u  IPv4 39621031      0t0  TCP
> 127.0.0.1:5432->127.0.0.1:51456 (CLOSE_WAIT)
> pgpool    25003       root    7u  IPv4 39621043      0t0  TCP
> 192.168.10.55:33684->192.168.10.101:5432 (ESTABLISHED)
> pgpool    25003       root    8u  IPv4 39621044      0t0  TCP
> 192.168.10.55:51761->192.168.10.11:5432 (CLOSE_WAIT)
> 
> There is no connection between the client (127.0.0.1:51456) and ( <- )
> pgpool (127.0.0.1:5432).
> 
> For some reason, this transaction was aborted, client disconnected
> from pgpool, pgpool aborted the transaction on the slave server
> (192.168.10.11) but not on the master (192.168.10.55).
> 
> I'm running out of ideas.

> 192.168.10.55:33684->192.168.10.101:5432 (ESTABLISHED)

What is on 192.168.10.101:5432?

Also can you attach gdb to 25003 and take back trace?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


More information about the pgpool-general mailing list