[Pgpool-general] "FATAL: invalid frontend message type 4"

Xavier Noguer xnoguer at antica.cl
Fri Jan 8 19:32:42 UTC 2010


 I found out more about this situation. The 0x04 is always read after
receiving a terminate ('X') message from the frontend, which makes me
think that the frontend might be sending garbage after finishing the
session, or somehow reading the length of the terminate message
(0x00000004) fails in line 1529 of pool_proto_modules.c and leaves the
0x04 part in the read buffer (I don't really know how that might
happen).

 As part of the termination of the session pool_process_query() gets
called, who in turn calls reset_backend() in order to send the commands
in reset_query_list. reset_backend() returns 1 and so the block of code
which starts with the "else if" in line 136 of pool_process_query.c gets
executed, setting the "state" variable with a value of 1. This is
necessary so that the code in line 471 is executed after we read a
ReadyForQuery ('Z') message from the backends and reset_backend() gets
called again and the termination process can continue.

 The problem is that after the call to reset_backend(), but before
attempting to read a new message from the backend, pgpool may attempt to
read a new message from the frontend (lines 307 and 326) and read
garbage, which is then forwarded to the backends.

 I attach a patch that makes sure pgpool will not try to read from the
frontend while the "state" variable is 0 (waiting for a ReadyForQuery
message).

 I'm sorry about the long explanation, but I thought it was necessary to
understand what the patch tries to accomplish. 


 Regards

On Sun, 2010-01-03 at 21:11 +0900, Tatsuo Ishii wrote:
> I looked into this but couldn't find why pgpool-II sent 4 to
> PostgreSQL. Any test case to reproduce the error please.
> 
> In the mean time I start to think that pgpool-II should not trigger
> failover if failed to read data from backend. Looking into PostgreSQL
> I found lots of places where FATAL errors are generated and PostgreSQL
> exits. If it exits, pgpool-II will detect a read error from socket
> that will cause failover. I feel this behavior is too fragile. I'm
> looking forward to tweak pool_stream.c so that it does not failover.
> Comments?
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> 
> >  pgpool-II version is 2.2.4 compiled against libpq 8.1.15, and postgres
> > version is 8.1.15.
> > 
> >  Both run on Debian etch amd64.
> > 
> >  I haven't been able to create a test case yet, but I've been able to
> > verify that it always happens after a COMMIT (like my previous example)
> > or a BEGIN.
> > 
> > 
> >  Regards
> > 
> > On Wed, 2009-12-30 at 17:41 +0900, Tatsuo Ishii wrote:
> > > PostgreSQL, pgpool-II version?
> > > 
> > > Do you have self contained test case?
> > > --
> > > Tatsuo Ishii
> > > SRA OSS, Inc. Japan
> > > 
> > > >  On this same issue, I found that pgpool detected an error just before
> > > > failing a backend as a response to a COMMIT (this is on a different
> > > > setup than the one from my previous post). This is the relevant part
> > > > from the pgpool log:
> > > > 
> > > > 
> > > > 2009-12-26 00:13:04 DEBUG: pid 16329: ReadyForQuery: transaction state:
> > > > T
> > > > 2009-12-26 00:13:04 DEBUG: pid 16329: do_command: Query: COMMIT
> > > > 2009-12-26 00:13:04 DEBUG: pid 16329: wait_for_query_response: waiting
> > > > for backend 1 completing the query
> > > > 2009-12-26 00:13:04 DEBUG: pid 16329: detect_error: kind: E
> > > > 2009-12-26 00:13:04 DEBUG: pid 16329: do_command: kind: E
> > > > 2009-12-26 00:13:04 ERROR: pid 16329: pool_read: read failed (Connection
> > > > reset by peer)
> > > > 2009-12-26 00:13:04 LOG:   pid 16329: notice_backend_error: 1 fail over
> > > > request from pid 16329
> > > > 
> > > > 
> > > >  I can see the commit on the postgres log of the failed backend,
> > > > followed by the "invalid frontend message type 4" which closes the
> > > > session and the connection to the client. That would explain why the
> > > > pool_read call on pgpool fails, but I still can't figure out why does
> > > > the backend receive that invalid message.
> > > > 
> > > > 
> > > > 
> > > >  Regards
> > > > 
> > > > 
> > > > On Thu, Dec 24, 2009 at 11:38 AM, Xavier Noguer <xnoguer at antica.cl>
> > > > wrote:
> > > > >
> > > > >  Hello,
> > > > >
> > > > >  I encountered a problem with a pgpool installation. Pgpool failed and
> > > > > detached a backend, but I don't know the reason because that
> > > > particular
> > > > > pgpool server is not writing to its log (problem with cronolog, and I
> > > > > can't restart pgpool since its a live system).
> > > > >
> > > > >  What I did find was a suspicious message in the postgres log of the
> > > > > failed backend. Here are the relevant lines of the log:
> > > > >
> > > > >
> > > > > 2009-12-23 15:07:20 CLST, sess LOG:  duration: 39.809 ms  statement:
> > > > > EXECUTE <unnamed>  [PREPARE:  COMMIT]
> > > > > 2009-12-23 15:07:20 CLST, sess FATAL:  invalid frontend message type 4
> > > > > 2009-12-23 15:07:20 CLST, sess LOG:  unexpected EOF on client
> > > > connection
> > > > > 2009-12-23 15:07:20 CLST, sess LOG:  unexpected EOF on client
> > > > connection
> > > > > 2009-12-23 15:07:20 CLST, sess LOG:  unexpected EOF on client
> > > > connection
> > > > > 2009-12-23 15:07:20 CLST, sess LOG:  unexpected EOF on client
> > > > connection
> > > > > 2009-12-23 15:07:20 CLST, sess LOG:  unexpected EOF on client
> > > > connection
> > > > > 2009-12-23 15:07:20 CLST, sess LOG:  unexpected EOF on client
> > > > connection
> > > > > 2009-12-23 15:07:20 CLST, sess LOG:  unexpected EOF on client
> > > > connection
> > > > >
> > > > >
> > > > >  Since this happened at the same time as the failing of the backend
> > > > I'm
> > > > > assuming this condition made postgres close the connection to pgpool
> > > > and
> > > > > so triggered the failing and detaching of the backend.
> > > > >
> > > > >  Does anybody know what message type 4 is ?
> > > > >
> > > > >
> > > > >  Regards
> > > > >
> > > > >
> > > > 
> > > > 
> > > > _______________________________________________
> > > > Pgpool-general mailing list
> > > > Pgpool-general at pgfoundry.org
> > > > http://pgfoundry.org/mailman/listinfo/pgpool-general
> > 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: patch_reset_backend_20100108.diff
Type: text/x-patch
Size: 577 bytes
Desc: not available
URL: <http://pgfoundry.org/pipermail/pgpool-general/attachments/20100108/57f1c329/attachment.bin>


More information about the Pgpool-general mailing list