[pgpool-hackers: 4405] Re: [pgpool-hackers] ERROR: unable to read message kind from backend

Tatsuo Ishii ishii at sraoss.co.jp
Wed Oct 4 12:01:23 JST 2023


Hi SadhuPrasad,

> Hi Tatsuo Ishii,
> 
> The issue happens in Pgpool 4.4.2.
> It actually happens with EDB Pgpool. The customer test case has some
> EDB-specific messages, but the suspected code is the same as in the
> community Pgpool code.
> 
> Some guidance on how and what to check further would also help.

You had better provide a self-contained test case that can be run
against the community Pgpool-II, or ask the EDB support team for help.

> Thanks & Regards
> SadhuPrasad
> 
> On Fri, Sep 29, 2023 at 12:21 PM Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
>>
>> > Hi Everyone,
>> >
>> > I am running a Java customer application with PgPool LB feature with 2 DB nodes.
>> > After some analysis, I figured out something as below:
>> >
>> > In pool_pending_message_query_context_dest_set(), it sets
>> > where_to_send field in query context, which tells to which node the
>> > msg was sent.
>> >
>> > /* Save where_to_send map */
>> > memcpy(s->where_to_send_save, query_context->where_to_send,
>> > sizeof(s->where_to_send_save));
>> > s->need_to_restore_where_to_send = true;
>> >
>> >  /* Rewrite where_to_send map */
>> > memset(query_context->where_to_send, 0, sizeof(query_context->where_to_send));
>> >
>> > for (i = 0; i < MAX_NUM_BACKENDS; i++)
>> > {
>> >       if (message->node_ids[i])
>> >            query_context->where_to_send[i] = 1;
>> > }
>> >
>> > Here where_to_send field has been stored in session_context and memset
>> > to 0 in query_context. Then we are setting it back only w.r.t nodeid
>> > for which the msg has been sent.
>> >
>> > At the end of the function, where_to_send field will be set to 1 only
>> > for Node0, but for all other nodes, it will be 0.
>> >
>> > LOG:  Message is SHOW TRANSACTION ISOLATION LEVEL
>> > LOG:  pool_pending_message_query_context_dest_set:1251: Where to send
>> > for Node0 is 1 & Where to send for Node1 is 0
>> >
>> > Next it is going to read_kind_from_one_backend(), with MAIN_NODE_ID
>> > as 1, for which where_to_send is already set to 0 as explained above.
>> > Here we are considering the backend node 1 is not valid and raising
>> > the error as:
>> > ERROR: unable to read message kind from backend
>> > Detail: 1 th backend is not valid
>> >
>> > I am now checking whether MAIN_NODE_ID is somehow set incorrectly.
>> > Any help or suggestions would be welcome. Please let me know.

One possibility is that the load balance node (which directly affects
which node is determined to be the MAIN_NODE_ID) was changed between the
time the message was parsed and the time read_kind_from_one_backend()
was called. Various load balancing related parameters could affect
this. I suggest you look into those settings.
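
To illustrate the first possibility, here is a toy model in C. It is
not Pgpool-II code: the Toy* types and toy_* functions are hypothetical
stand-ins, MAX_NUM_BACKENDS is set to 2 only to match the two-node setup
in the report, and the validity check is an assumption based on the
error described above. It only shows how a map built for one node stops
matching if a different node is selected at read time.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_NUM_BACKENDS 2	/* toy value: the report uses 2 DB nodes */

/* Hypothetical, simplified stand-ins for the pending message and query context. */
typedef struct { bool node_ids[MAX_NUM_BACKENDS]; } ToyPendingMessage;
typedef struct { bool where_to_send[MAX_NUM_BACKENDS]; } ToyQueryContext;

/* Mirrors the quoted rewrite step: clear the map, then mark only the
 * nodes the pending message was actually sent to. */
static void
toy_rewrite_where_to_send(ToyQueryContext *qc, const ToyPendingMessage *msg)
{
	memset(qc->where_to_send, 0, sizeof(qc->where_to_send));
	for (int i = 0; i < MAX_NUM_BACKENDS; i++)
	{
		if (msg->node_ids[i])
			qc->where_to_send[i] = true;
	}
}

/* Assumed check: reading from a single backend is only possible when
 * the map selects that node. */
static bool
toy_can_read_from(const ToyQueryContext *qc, int main_node)
{
	assert(main_node >= 0 && main_node < MAX_NUM_BACKENDS);
	return qc->where_to_send[main_node];
}

/* Returns true when the node chosen at read time is still covered by
 * the map that was built from the node chosen at send time. */
static bool
toy_scenario(int node_at_send_time, int node_at_read_time)
{
	ToyPendingMessage msg = { .node_ids = { false } };
	ToyQueryContext qc = { .where_to_send = { true, true } };

	assert(node_at_send_time >= 0 && node_at_send_time < MAX_NUM_BACKENDS);
	msg.node_ids[node_at_send_time] = true;
	toy_rewrite_where_to_send(&qc, &msg);
	return toy_can_read_from(&qc, node_at_read_time);
}
```

In this model toy_scenario(0, 0) returns true, while toy_scenario(0, 1)
returns false, which corresponds to the reported case where the message
went to node 0 but node 1 is treated as the main node at read time.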

Another possibility is that EDB may have changed the load balancing
code, but this is just my guess because I don't have access to EDB
pgpool.

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

