View Issue Details

ID: 0000395
Project: Pgpool-II
Category: Bug
View Status: public
Last Update: 2018-06-04 10:07
Reporter: Justin.Gill
Assigned To: t-ishii
Priority: high
Severity: crash
Reproducibility: random
Status: resolved
Resolution: open
Product Version: 3.7.2
Target Version:
Fixed in Version:
Summary: 0000395: Degenerate backend request for node_id: 0
Description: PGPool randomly can't connect to node_id: 0. This makes PGPool think all the databases are unavailable, and consequently our client becomes unresponsive. We seem to trigger this issue when stopping the Sidekiq jobs, but it can also occur without stopping Sidekiq. A bounce of the pgpool2 service resolves the issue immediately. I've ensured that the backend_host host (1 Postgres server) is available and that we can get through the firewall. What would cause PGPool to get into this state? Thank you.

LOG: received degenerate backend request for node_id: 0 from pid [5668]
WARNING: write on backend 0 failed with error :"Success"
DETAIL: while trying to write data from offset: 0 wlen: 5
LOG: Pgpool-II parent process has received failover request
LOG: starting degeneration. shutdown host XX.XX.XX.XXX(5432)
WARNING: All the DB nodes are in down status and skip writing status file.
LOG: failover: no valid backends node found
LOG: Restart all children
....
WARNING: All the DB nodes are in down status and skip writing status file.
ERROR: unable to connect to backend
DETAIL: connection cache is full
Tags: No tags attached.

Activities

t-ishii

2018-05-03 06:47

developer   ~0002007

The message probably indicates that write(2) to the socket returned 0. I am not sure whether this is normal, but the current code assumes that it never happens. If it actually does happen, the attached patch should help by retrying write(2). Can you please try it out?

write.diff (523 bytes)
diff --git a/src/utils/pool_stream.c b/src/utils/pool_stream.c
index bd67971..ac709b5 100644
--- a/src/utils/pool_stream.c
+++ b/src/utils/pool_stream.c
@@ -546,7 +546,7 @@ static int pool_write_flush(POOL_CONNECTION *cp, void *buf, int len)
 			sts = write(cp->fd, buf+offset, wlen);
 		}
 
-		if (sts > 0)
+		if (sts >= 0)
 		{
 			wlen -= sts;
 
@@ -633,7 +633,7 @@ int pool_flush_it(POOL_CONNECTION *cp)
 		  sts = write(cp->fd, cp->wbuf + offset, wlen);
 		}
 
-		if (sts > 0)
+		if (sts >= 0)
 		{
 			wlen -= sts;
 
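The change in write.diff treats a zero return from write(2) as "no progress yet" rather than as a failure, so the surrounding loop tries again. That reading is also consistent with the "Success" text in the log, since errno is not set when write(2) returns 0. Below is a minimal standalone sketch of the same retry pattern. It is not the Pgpool-II code: write_all is a hypothetical helper, and the EINTR handling is an assumption added for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * write_all: hypothetical helper, not a Pgpool-II function.
 * Returns 0 once all len bytes have been written, or -1 on a real error.
 */
static int write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;
	size_t remaining = len;

	while (remaining > 0)
	{
		ssize_t sts = write(fd, p, remaining);

		if (sts >= 0)
		{
			/* sts == 0 means nothing was written, not an error: retry. */
			p += sts;
			remaining -= (size_t) sts;
			continue;
		}

		if (errno == EINTR)
			continue;	/* interrupted by a signal (assumed retryable) */

		return -1;	/* real error; errno describes it */
	}

	return 0;
}
```

This sketch retries without limit while write(2) keeps returning 0. A production version would probably want to bound the number of retries or wait for the descriptor to become writable first.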

Justin.Gill

2018-05-03 13:09

reporter   ~0002008

Thanks @t-ishii for the quick response. I will try this change and let you know.

Justin.Gill

2018-05-22 02:59

reporter   ~0002029

Please close this ticket. The issue is resolved.

When we migrated from Resque to Sidekiq, PGPool began having issues. It was as if Sidekiq was using 'pg_terminate_backend'. When we migrated to Rails 5.2, the issue went away.

t-ishii

2018-05-22 07:53

developer   ~0002030

Thanks for the report! Issue resolved.

Issue History

Date Modified Username Field Change
2018-05-03 05:15 Justin.Gill New Issue
2018-05-03 06:44 t-ishii Assigned To => t-ishii
2018-05-03 06:44 t-ishii Status new => assigned
2018-05-03 06:47 t-ishii File Added: write.diff
2018-05-03 06:47 t-ishii Note Added: 0002007
2018-05-03 06:47 t-ishii Status assigned => feedback
2018-05-03 13:09 Justin.Gill Note Added: 0002008
2018-05-03 13:09 Justin.Gill Status feedback => assigned
2018-05-22 02:59 Justin.Gill Note Added: 0002029
2018-05-22 07:53 t-ishii Note Added: 0002030
2018-05-22 07:54 t-ishii Status assigned => resolved