[Pgpool-general] pgpool 2.2.4: DEALLOCATED children
Agustin Almonte Ferrada
aalmonte at antica.cl
Fri Sep 25 08:11:11 UTC 2009
Hi Tatsuo,
filtered logs are attached.
Can you validate the patches applied?
Thanks,
Agustín Almonte F.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pgpool_pid11723.log
Type: application/octet-stream
Size: 40815 bytes
Desc: not available
URL: <http://pgfoundry.org/pipermail/pgpool-general/attachments/20090925/b8c2b6f6/attachment-0001.obj>
-------------- next part --------------
On 25-09-2009, at 4:00, Tatsuo Ishii wrote:
> Xavier,
>
> Thanks for the analysis and the patches! I don't know what 0x0049050000
> is either. Can you send me the log?
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
>
>> Tatsuo,
>>
>> I think we found what the problem was. During the reset of a backend
>> the pgpool process sends a BEGIN command to start a transaction and
>> expects to receive a message kind 'N', 'E', 'C' or 'Z', but in our
>> case the backend sends something different (0x0049050000). The
>> process interprets part of what it received as the length of the data
>> it needs to read from the backend, and so blocks itself indefinitely
>> while waiting to read that much data.
>>
>> I don't know what it is that the backend is sending, but it seems to
>> be always the same data (0x0049050000), and the first byte of it is
>> not any known message kind ('N', 'E', 'C', etc...).
>>
>> I've attached a patch which aborts the reset operation if what was
>> read from the backend is none of the expected message kinds.
>>
>> We also have some logs which might make it easier to understand the
>> code flow in case you want to examine them.
>>
>> Cheers
>>
>>
>> On Thu, Sep 24, 2009 at 9:41 AM, Xavier Noguer <xnoguer at antica.cl>
>> wrote:
>>> Tatsuo,
>>>
>>> Our test case was this: two backends running postgres 8.1; a few
>>> differences between them, with the master node always having more
>>> records.
>>>
>>> We tried to reproduce the effect on our development environment, but
>>> it didn't work the first time. I'll try again to see if I can provide
>>> you with the necessary database dumps to reproduce it.
>>>
>>> Cheers
>>>
>>> On Thu, Sep 24, 2009 at 4:05 AM, Tatsuo Ishii <ishii at sraoss.co.jp>
>>> wrote:
>>>> Thanks for the investigation.
>>>>
>>>> But I could not reproduce Agustín's problem. I ran test/jdbc for
>>>> testing. If you have a self-contained test case, please let me know.
>>>> I would like to know why my patches did not work; that should help
>>>> me with future troubleshooting.
>>>> --
>>>> Tatsuo Ishii
>>>> SRA OSS, Inc. Japan
>>>>
>>>>> Hello Tatsuo,
>>>>>
>>>>> I'm working with Agustín Almonte on this same issue, and after
>>>>> trying the latest patch you provided we realized that when a
>>>>> DEALLOCATE was being sent for a prepared statement, that prepared
>>>>> statement was not being taken off prepared_list. This meant that
>>>>> prepared_list was not updated and the same DEALLOCATE was sent over
>>>>> and over again.
>>>>>
>>>>> Attached you'll find a patch that takes the prepared statement off
>>>>> prepared_list after having sent the DEALLOCATE for that prepared
>>>>> statement. We tested it and it seems to work fine.
>>>>>
>>>>> Cheers
>>>>
>>>
>>
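The prepared_list fix described in the oldest message above can be illustrated with a small standalone sketch. It is not pgpool's code: the PreparedList struct, the helper names and the send_deallocate callback are all hypothetical stand-ins. The point is only that the reset loop ends because each name is removed from the list right after its DEALLOCATE is sent; without the removal step the loop would keep sending the same statement.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_PREPARED 16

/* Hypothetical stand-in for pgpool's prepared_list; not the real struct. */
typedef struct {
    int cnt;
    char *names[MAX_PREPARED];
} PreparedList;

/* Remember a prepared statement name. Returns 0 on success, -1 on failure. */
int prepared_list_add(PreparedList *list, const char *name)
{
    size_t n = strlen(name) + 1;
    char *copy;

    if (list->cnt >= MAX_PREPARED)
        return -1;
    copy = malloc(n);
    if (copy == NULL)
        return -1;
    memcpy(copy, name, n);
    list->names[list->cnt++] = copy;
    return 0;
}

/* Forget a name once it has been deallocated. Returns 0 if it was found. */
int prepared_list_remove(PreparedList *list, const char *name)
{
    int i;

    for (i = 0; i < list->cnt; i++)
    {
        if (strcmp(list->names[i], name) == 0)
        {
            free(list->names[i]);
            /* Close the gap left by the removed entry. */
            memmove(&list->names[i], &list->names[i + 1],
                    (size_t)(list->cnt - i - 1) * sizeof(char *));
            list->cnt--;
            return 0;
        }
    }
    return -1;
}

/* Stand-in for sending "DEALLOCATE <name>" to the backend: only counts calls. */
int deallocate_calls = 0;

int fake_send_deallocate(const char *name)
{
    (void) name;
    deallocate_calls++;
    return 0;
}

/*
 * Send one DEALLOCATE per remembered statement, removing each name right
 * after it is sent so the loop makes progress. Returns the number of
 * statements sent, or -1 if the send callback reports an error.
 */
int reset_prepared(PreparedList *list, int (*send_deallocate)(const char *name))
{
    int sent = 0;

    assert(list != NULL && send_deallocate != NULL);

    while (list->cnt > 0)
    {
        const char *name = list->names[0];

        if (send_deallocate(name) != 0)
            return -1;
        prepared_list_remove(list, name);
        sent++;
    }
    return sent;
}
```

With two names added, reset_prepared(&list, fake_send_deallocate) calls the sender twice and leaves list.cnt at 0.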
>> --- pool_process_query.c 2009-09-24 01:56:59.000000000 -0400
>> +++ pool_process_query.c.new 2009-09-25 03:00:23.000000000 -0400
>> @@ -2619,6 +2619,12 @@
>> return POOL_END;
>> }
>> len = ntohl(len) - 4;
>> +
>> + if (kind != 'N' && kind != 'E' && kind != 'C')
>> + {
>> + pool_error("do_command: error, kind is not N, E or C");
>> + return POOL_END;
>> + }
>> string = pool_read2(backend, len);
>> if (string == NULL)
>> {
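As a standalone illustration of the check discussed in this thread, here is a minimal sketch that validates the message kind before trusting the length field. It works on an in-memory buffer instead of a backend connection, and parse_header, is_expected_kind and the status codes are hypothetical names, not pgpool functions. It accepts 'N', 'E', 'C' and 'Z' as listed in the message text, whereas the patch above checks only 'N', 'E' and 'C'. Assuming the five reported bytes are read as one kind byte followed by a big-endian length, the length would be 0x49050000 (1,225,064,448), which would be consistent with a read that never completes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes for this sketch. */
enum header_status {
    HEADER_OK = 0,
    HEADER_SHORT = -1,     /* fewer than 5 bytes available */
    HEADER_BAD_KIND = -2,  /* first byte is not an expected message kind */
    HEADER_BAD_LEN = -3    /* length field smaller than its own 4 bytes */
};

/* Kinds the thread says are expected in reply to BEGIN during reset. */
int is_expected_kind(unsigned char kind)
{
    return kind == 'N' || kind == 'E' || kind == 'C' || kind == 'Z';
}

/*
 * Parse a 5-byte header: 1 kind byte, then a 4-byte big-endian length
 * that counts itself. The kind is checked first, so an unexpected byte
 * never causes the length to be trusted.
 */
int parse_header(const unsigned char *buf, size_t avail,
                 char *kind, uint32_t *body_len)
{
    uint32_t len;

    assert(buf != NULL && kind != NULL && body_len != NULL);

    if (avail < 5)
        return HEADER_SHORT;
    if (!is_expected_kind(buf[0]))
        return HEADER_BAD_KIND;

    /* Decode the big-endian (network order) length by hand. */
    len = ((uint32_t) buf[1] << 24) | ((uint32_t) buf[2] << 16) |
          ((uint32_t) buf[3] << 8) | (uint32_t) buf[4];
    if (len < 4)
        return HEADER_BAD_LEN;

    *kind = (char) buf[0];
    *body_len = len - 4;   /* bytes still to read after the header */
    return HEADER_OK;
}
```

Fed the bytes 00 49 05 00 00, parse_header returns HEADER_BAD_KIND without looking at the length. A header of 'C' followed by 00 00 00 0a yields a body length of 6.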