View Issue Details

IDProjectCategoryView StatusLast Update
0000035Pgpool-IIBugpublic2013-04-04 11:31
ReportertuomasAssigned Toanzai 
PrioritynormalSeverityminorReproducibilityalways
Status resolvedResolutionfixed 
Platformamd64OSUbuntuOS Version12.04
Product Version 
Target VersionFixed in Version 
Summary0000035: Authentication is timeout
DescriptionHi,

I get this error message in the logs all the time, tens of thousands of times a day. The postgresql servers respond fast and there's no problems when manually connecting through pgpool or directly.

In the config authentication_timeout is set to 60.

Any ideas?
Additional InformationPGPool version is 3.2.1 but it happened with earlier versions too
Postgresql version is 9.1.x and streaming replication is used.
Load balancing is off, paraller mode is off, query cache is off, watchdog is off.
TagsNo tags attached.

Activities

anzai

2012-11-26 17:41

developer   ~0000169

Could you show us the log messages of PostgreSQL and pgpool-II in detail when timeout occurs?

tuomas

2012-11-26 23:16

reporter  

pgpool.log (56,259 bytes)

tuomas

2012-11-26 23:17

reporter  

primary_postgresql.log (48,338 bytes)

tuomas

2012-11-26 23:32

reporter  

pgpool.conf (28,404 bytes)

tuomas

2012-11-26 23:36

reporter   ~0000170

I attached the pgpool log and config and the log from primary postgresql.

The timeouts come in pretty much evenly during the day, 1000-1100 per hour or so.

The application's connections (user=app, db=app_production in the logs) are long lasting and there's usually 50-70 connections open to pgpool (and to the postgresql server), usually only 1-3 are processing something and others are idle. Max is 200. New connection is only opened when a new application thread is started (not so often).

anzai

2012-12-13 11:34

developer   ~0000183

The message of "authentication is timeout" means that pgpool's child process can't read a startup packet.
Could you confirm your system's networking again?

tuomas

2012-12-13 15:46

reporter   ~0000184

I haven't seen any issues with the network, this is also happening both in qa and production environments and I think it would be visible in other ways too, if there was network issues..

We've done performance testing to it and I don't get any connection failures through pgpool or directly to postgresql so this looks more of like a false error, something caused by pgpool's internal connections (health checks?) or an error that is retried by pgpool.

The application doesn't get any db errors and it also records the time spent in db operations and there hasn't been any slowness there.

tuomas

2012-12-16 10:27

reporter   ~0000191

I debugged this further. It seems that the timeout message occurs because pool_read fails in child.c:825 (/* read startup packet length */) and it just returns null without clearing the timer.

in pool_read len=4, readlen=0 and cp->isbackend is false, so it returns -1
in pool_stream.c:203 (/* if backend offers authentication method, frontend could close connection */)

I thought it was a problem communicating with the backend but it seems to be with frontend.

readlen is 0, so is there no data coming in? could it be just a client opening the connection but sending no data? Or are we maybe not waiting long enough for it?

The frontend connection is closed nearly instantly in child.c:187 (/* connection request from frontend timed out */), within the second from the connection and long ago before the timeout message comes.

tuomas

2012-12-16 11:20

reporter   ~0000192

Ok, I know what is causing those, it's a haproxy doing health checks by opening the tcp connection and closing it. I guess the authentication timeout timer should be cleared if/when the frontend closes the connection.

t-ishii

2012-12-16 12:45

developer   ~0000193

How? As far as I know, there's no way to know reading socket is closed in UNUX/Linux API.

tuomas

2012-12-16 22:51

reporter   ~0000194

Well, currently it seems to be detected (nearly?) instantly in child.c:187. I added some debug messages to see what happens:
Dec 16 02:10:03 pgpool[31272]: I am 31272 accept fd 12
Dec 16 02:10:03 pgpool[31272]: connection received: host=app01 port=41484
Dec 16 02:10:03 pgpool[31272]: pool_read debug: len: 4, consume_size: 0
Dec 16 02:10:03 pgpool[31272]: pool_read debug: ssl_active: 0
Dec 16 02:10:03 pgpool[31272]: pool_read debug: readlen: 0
Dec 16 02:10:03 pgpool[31272]: pool_read debug: returning -1 in pool_stream.c:203
Dec 16 02:10:03 pgpool[31272]: read_startup_packet debug: pool_read failed in child.c:825
Dec 16 02:10:03 pgpool[31272]: do_child debug: failed to read the startup packet, close frontend connection in child.c:187
Dec 16 02:11:03 pgpool[31272]: authentication is timeout
Dec 16 02:11:03 pgpool[2935]: child 31272 exits with status 256

The read failed and pgpool closed it's side nearly instantly when the client had closed the tcp-connection and the "authentication is timeout" message came about 60 seconds later (60 seconds is the authentication timeout). I suppose the child moved back to accept loop then.

anzai

2012-12-17 13:55

developer   ~0000195

Thanks for your debugging.
I fixed read_startup_packet() to reset timeout alarm correctly.
Could you try this snapshot?

http://git.postgresql.org/gitweb/?p=pgpool2.git;a=commit;h=646c1eba2b38a4d4577fee0a956451ca55b38729

tuomas

2012-12-22 05:08

reporter   ~0000206

Thanks, no longer getting the "Authentication is timeout". I would probably change the log line from error to debug, though. If a monitoring is polling the port, then it will cause quite a bit of noise just like the authentication timeout message.

Issue History

Date Modified Username Field Change
2012-11-20 11:54 tuomas New Issue
2012-11-26 15:20 anzai Assigned To => anzai
2012-11-26 15:20 anzai Status new => assigned
2012-11-26 17:41 anzai Note Added: 0000169
2012-11-26 23:16 tuomas File Added: pgpool.log
2012-11-26 23:17 tuomas File Added: primary_postgresql.log
2012-11-26 23:32 tuomas File Added: pgpool.conf
2012-11-26 23:36 tuomas Note Added: 0000170
2012-12-13 11:34 anzai Note Added: 0000183
2012-12-13 15:46 tuomas Note Added: 0000184
2012-12-16 10:27 tuomas Note Added: 0000191
2012-12-16 11:20 tuomas Note Added: 0000192
2012-12-16 12:45 t-ishii Note Added: 0000193
2012-12-16 22:51 tuomas Note Added: 0000194
2012-12-17 13:55 anzai Note Added: 0000195
2012-12-22 05:08 tuomas Note Added: 0000206
2013-04-04 11:31 anzai Status assigned => resolved
2013-04-04 11:31 anzai Resolution open => fixed