[pgpool-hackers: 1161] Re: pgpool-II 3.4 problems
Muhammad Usama
m.usama at gmail.com
Tue Nov 17 21:50:37 JST 2015
Hi Ishii-San
Thanks for reviewing the patch.
On Tue, Nov 17, 2015 at 7:31 AM, Tatsuo Ishii <ishii at postgresql.org> wrote:
> Hi Usama,
>
> I have taken a look at your patch.
>
> 1) Below diff seems to be unnecessary because even without the "+ if
> (cache_connection)", the result would be same.
>
Yes this is correct the result will be same after this new if statement.
The reason I made this change was to bypass if statement with four
strcmp()s when we already knew the cache_connection will remain false.
Although might not make much difference but done this to improve
performance of this condition statement.
> - /*
> - * For those special databases, and when frontend client exits
> abnormally,
> - * we don't cache connection to backend.
> - */
> - if ((sp &&
> - (!strcmp(sp->database, "template0") ||
> - !strcmp(sp->database, "template1") ||
> - !strcmp(sp->database, "postgres") ||
> - !strcmp(sp->database, "regression"))) ||
> - (*frontend != NULL &&
> - ((*frontend)->socket_state == POOL_SOCKET_EOF ||
> - (*frontend)->socket_state == POOL_SOCKET_ERROR)))
> - cache_connection = false;
> -
> + if (cache_connection)
> + {
> + /*
> + * For those special databases, and when frontend client
> exits abnormally,
> + * we don't cache connection to backend.
> + */
> + if ((sp &&
> + (!strcmp(sp->database, "template0") ||
> + !strcmp(sp->database, "template1") ||
> + !strcmp(sp->database, "postgres") ||
> + !strcmp(sp->database, "regression"))) ||
> + (*frontend != NULL &&
> + ((*frontend)->socket_state == POOL_SOCKET_EOF ||
> + (*frontend)->socket_state == POOL_SOCKET_ERROR)))
> + cache_connection = false;
> + }
>
> 2) Shouldn't frontend_invalid variable be bool because it takes only 0
> or 1. If so, geterrposition()'s return type should be declared as
> bool.
>
Yes, you are right. frontend_invalid should've been bool. I have updated
this in attached version of the patch.
Thanks
Best regards
Muhammad Usama
>
> Best regards,
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese:http://www.sraoss.co.jp
>
> > Hi Ishii San
> >
> > Can you please have a look at the attached patch for 3.4 branch.
> > Basically it introduces a new error level FRONTEND_ERROR for ereport(),
> to
> > inform the error handler not to cache the current backend connection.
> Other
> > than that, this FRONTEND_ERROR error level behaves and works similar to
> the
> > ereport(ERROR..)
> >
> > Thanks
> > Best regards
> > Muhammad Usama
> >
> >
> > On Thu, Oct 15, 2015 at 4:45 PM, Muhammad Usama <m.usama at gmail.com>
> wrote:
> >
> >>
> >>
> >> On Thu, Oct 15, 2015 at 2:23 AM, Tatsuo Ishii <ishii at postgresql.org>
> >> wrote:
> >>
> >>> Usama,
> >>>
> >>> Can you please propose a patch to implement similar thing with 3.4? I
> >>> think it will make 3.4 more robust. I have no idea how to implement
> >>> it with 3.4's exception mechanism.
> >>>
> >>
> >> sure I will work on this
> >>
> >> Thanks
> >> best regards
> >> Muhammad Usama
> >>
> >>>
> >>> Best regards,
> >>> --
> >>> Tatsuo Ishii
> >>> SRA OSS, Inc. Japan
> >>> English: http://www.sraoss.co.jp/index_en.php
> >>> Japanese:http://www.sraoss.co.jp
> >>>
> >>> > Hi Ishii San
> >>> >
> >>> > I think this difference of behavior between 3.3. And 3.4 is caused by
> >>> the
> >>> > commit "cb735f22441001b6afdbb5bac72808db66094ca9" that was not
> ported to
> >>> > the 3.4 branch.
> >>> >
> >>> >
> >>>
> http://git.postgresql.org/gitweb/?p=pgpool2.git;a=commit;h=cb735f22441001b6afdbb5bac72808db66094ca9
> >>> >
> >>> > Thanks
> >>> > Best regards
> >>> > Muhammad Usama
> >>> >
> >>> >
> >>> >
> >>> > On Tue, Oct 13, 2015 at 2:45 PM, Tatsuo Ishii <ishii at postgresql.org>
> >>> wrote:
> >>> >
> >>> >> > Usama,
> >>> >> >
> >>> >> > While I have been working on:
> >>> >> > http://www.pgpool.net/mantisbt/view.php?id=147
> >>> >>
> >>> >> Oops. This is wrong. Here is the correct one.
> >>> >>
> >>> >> http://www.pgpool.net/mantisbt/view.php?id=145
> >>> >>
> >>> >> > I found there was an interesting behavior change between pgpool-II
> >>> 3.3
> >>> >> > and 3.4.
> >>> >> >
> >>> >> > It turns out that the bug was related to the action when
> >>> >> > client_idle_limit is reached. In 3.3, when client_idle_limit is
> >>> >> > reached, following statement is executed (pool_process_query.c):
> >>> >> >
> >>> >> > if (idle_count > pool_config->client_idle_limit)
> >>> >> > {
> >>> >> > pool_log("pool_process_query: child connection forced to
> >>> terminate
> >>> >> due to client_idle_limit (%d) reached,
> >>> >> > pool_config->client_idle_limit);
> >>> >> > pool_send_error_message(frontend, MAJOR(backend),
> >>> >> > "57000", "connection terminated due to client idle
> >>> limit
> >>> >> reached",
> >>> >> > "","", __FILE__, __LINE__);
> >>> >> > return POOL_END_WITH_FRONTEND_ERROR;
> >>> >> > }
> >>> >> >
> >>> >> > then goes to this (child.c)
> >>> >> >
> >>> >> > if (status == POOL_END_WITH_FRONTEND_ERROR ||
> >>> >> > pool_config->connection_cache == 0 ||
> >>> >> > !strcmp(sp->database, "template0") ||
> >>> >> > !strcmp(sp->database, "template1") ||
> >>> >> > !strcmp(sp->database, "postgres") ||
> >>> >> > !strcmp(sp->database, "regression"))
> >>> >> > {
> >>> >> > reset_connection();
> >>> >> > pool_close(frontend);
> >>> >> > pool_send_frontend_exits(backend);
> >>> >> > pool_discard_cp(sp->user, sp->database, sp->major);
> >>> >> > }
> >>> >> >
> >>> >> > This results in closing connection to backend.
> >>> >> >
> >>> >> > In 3.4,
> >>> >> >
> >>> >> > if (idle_count > pool_config->client_idle_limit)
> >>> >> > {
> >>> >> > ereport(ERROR,
> >>> >> > (pool_error_code("57000"),
> >>> >> > errmsg("unable to read data"),
> >>> >> > errdetail("child connection forced to terminate
> due
> >>> to
> >>> >> client_idle_limit:%d is reached",
> >>> >> > pool_config->client_idle_limit)));
> >>> >> > }
> >>> >> >
> >>> >> > Then jump to backend_cleanup() and call pool_process_query() to
> >>> >> > execute queries defined in reset_query_list. This results in
> sending
> >>> >> > typically "DISCARD ALL" to backend. The problem reported in the
> bug
> >>> id
> >>> >> > 147 happens in this route (actually the reporter claims that
> >>> pgpool-II
> >>> >> > hangs in sending "DISCARD").
> >>> >> >
> >>> >> > In summary, when client idle limit reaches, pgpool-II 3.3
> disconnects
> >>> >> > connections to backend, while 3.4 does not close the connections
> to
> >>> >> > backend and sends DISCARD to backend.
> >>> >> >
> >>> >> > We have a few reports that pgpool-II 3.4 hangs while sending
> DISCARD
> >>> >> > and I wonder those problems somewhat related to the behavior
> change
> >>> >> > described above because the code path in 3.4 is executed even if
> >>> >> > client_idle_limit is not reached (elog(ERROR...)).
> >>> >> >
> >>> >> > Usama,
> >>> >> >
> >>> >> > Was this behavior change intentional?
> >>> >> >
> >>> >> > Best regards,
> >>> >> > --
> >>> >> > Tatsuo Ishii
> >>> >> > SRA OSS, Inc. Japan
> >>> >> > English: http://www.sraoss.co.jp/index_en.php
> >>> >> > Japanese:http://www.sraoss.co.jp
> >>> >> > _______________________________________________
> >>> >> > pgpool-hackers mailing list
> >>> >> > pgpool-hackers at pgpool.net
> >>> >> > http://www.pgpool.net/mailman/listinfo/pgpool-hackers
> >>> >> _______________________________________________
> >>> >> pgpool-hackers mailing list
> >>> >> pgpool-hackers at pgpool.net
> >>> >> http://www.pgpool.net/mailman/listinfo/pgpool-hackers
> >>> >>
> >>>
> >>
> >>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.sraoss.jp/pipermail/pgpool-hackers/attachments/20151117/115a6c85/attachment-0001.html>
-------------- next part --------------
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index fd7ab1d..0947337 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -118,6 +118,10 @@ typedef enum
#endif
#define FATAL 21 /* fatal error - abort process */
#define PANIC 22 /* take down the other backends with me */
+/* pgpool-II extension. This is same as ERROR but sets the
+ * do not cache connection flag before transforming to ERROR.
+ */
+#define FRONTEND_ERROR 23 /* transformed to ERROR at errstart */
/* #define DEBUG DEBUG1 */ /* Backward compatibility with pre-7.3 */
#define POOL_EXIT_SUCCESS 0 /* failure exit and child gets restarted*/
@@ -280,6 +284,7 @@ extern int errposition(int cursorpos);
#define pg_unreachable() exit(0)
+extern bool getfrontendinvalid(void);
extern int geterrcode(void);
extern int geterrposition(void);
extern int getinternalerrposition(void);
@@ -431,6 +436,7 @@ typedef struct ErrorData
const char *domain; /* message domain */
const char *context_domain; /* message domain for context message */
int sqlerrcode; /* encoded ERRSTATE */
+ bool frontend_invalid;/* non zero means frontend connection is invalid */
char *pgpool_errcode; /* error code to be sent to client */
char *message; /* primary error message */
char *detail; /* detail error message */
diff --git a/src/protocol/child.c b/src/protocol/child.c
index c0cc066..6f5f7f0 100644
--- a/src/protocol/child.c
+++ b/src/protocol/child.c
@@ -87,7 +87,7 @@ static POOL_CONNECTION *get_connection(int front_end_fd, SockAddr *saddr);
static POOL_CONNECTION_POOL *get_backend_connection(POOL_CONNECTION *frontend);
static StartupPacket *StartupPacketCopy(StartupPacket *sp);
static void print_process_status(char *remote_host,char* remote_port);
-static bool backend_cleanup(POOL_CONNECTION* volatile *frontend, POOL_CONNECTION_POOL* volatile backend);
+static bool backend_cleanup(POOL_CONNECTION* volatile *frontend, POOL_CONNECTION_POOL* volatile backend, bool frontend_invalid);
static void free_persisten_db_connection_memory(POOL_CONNECTION_POOL_SLOT *cp);
static int choose_db_node_id(char *str);
static void child_will_go_down(int code, Datum arg);
@@ -211,6 +211,8 @@ void do_child(int *fds)
if (sigsetjmp(local_sigjmp_buf, 1) != 0)
{
POOL_PROCESS_CONTEXT *session;
+ bool frontend_invalid = getfrontendinvalid();
+
disable_authentication_timeout();
/* Since not using PG_TRY, must reset error stack by hand */
error_context_stack = NULL;
@@ -233,7 +235,7 @@ void do_child(int *fds)
if (accepted)
connection_count_down();
- backend_cleanup(&child_frontend, backend);
+ backend_cleanup(&child_frontend, backend, frontend_invalid);
session = pool_get_process_context();
@@ -372,7 +374,7 @@ void do_child(int *fds)
status = pool_process_query(child_frontend, backend, 0);
if(status != POOL_CONTINUE)
{
- backend_cleanup(&child_frontend, backend);
+ backend_cleanup(&child_frontend, backend, 0);
break;
}
}
@@ -415,7 +417,7 @@ void do_child(int *fds)
* return true if backend connection is cached
*/
static bool
-backend_cleanup(POOL_CONNECTION* volatile *frontend, POOL_CONNECTION_POOL* volatile backend)
+backend_cleanup(POOL_CONNECTION* volatile *frontend, POOL_CONNECTION_POOL* volatile backend, bool frontend_invalid)
{
StartupPacket *sp;
bool cache_connection = false;
@@ -426,11 +428,11 @@ backend_cleanup(POOL_CONNECTION* volatile *frontend, POOL_CONNECTION_POOL* volat
sp = MASTER_CONNECTION(backend)->sp;
/*
- * cach connection if connection is not for
+ * cach connection if frontend is not invalid, connection is not for
* system db and connection cache configuration
* parameter is enabled
*/
- if (sp && pool_config->connection_cache != 0 && sp->system_db == false)
+ if (sp && pool_config->connection_cache != 0 && sp->system_db == false && frontend_invalid == false )
{
if (*frontend)
{
@@ -453,20 +455,22 @@ backend_cleanup(POOL_CONNECTION* volatile *frontend, POOL_CONNECTION_POOL* volat
}
}
- /*
- * For those special databases, and when frontend client exits abnormally,
- * we don't cache connection to backend.
- */
- if ((sp &&
- (!strcmp(sp->database, "template0") ||
- !strcmp(sp->database, "template1") ||
- !strcmp(sp->database, "postgres") ||
- !strcmp(sp->database, "regression"))) ||
- (*frontend != NULL &&
- ((*frontend)->socket_state == POOL_SOCKET_EOF ||
- (*frontend)->socket_state == POOL_SOCKET_ERROR)))
- cache_connection = false;
-
+ if (cache_connection)
+ {
+ /*
+ * For those special databases, and when frontend client exits abnormally,
+ * we don't cache connection to backend.
+ */
+ if ((sp &&
+ (!strcmp(sp->database, "template0") ||
+ !strcmp(sp->database, "template1") ||
+ !strcmp(sp->database, "postgres") ||
+ !strcmp(sp->database, "regression"))) ||
+ (*frontend != NULL &&
+ ((*frontend)->socket_state == POOL_SOCKET_EOF ||
+ (*frontend)->socket_state == POOL_SOCKET_ERROR)))
+ cache_connection = false;
+ }
/*
* Close frontend connection
*/
diff --git a/src/protocol/pool_process_query.c b/src/protocol/pool_process_query.c
index ff97bf1..271f19f 100644
--- a/src/protocol/pool_process_query.c
+++ b/src/protocol/pool_process_query.c
@@ -4695,7 +4695,7 @@ SELECT_RETRY:
if (idle_count > pool_config->client_idle_limit)
{
- ereport(ERROR,
+ ereport(FRONTEND_ERROR,
(pool_error_code("57000"),
errmsg("unable to read data"),
errdetail("child connection forced to terminate due to client_idle_limit:%d is reached",
@@ -4708,7 +4708,7 @@ SELECT_RETRY:
if (idle_count_in_recovery > pool_config->client_idle_limit_in_recovery)
{
- ereport(ERROR,
+ ereport(FRONTEND_ERROR,
(pool_error_code("57000"),
errmsg("unable to read data"),
errdetail("child connection forced to terminate due to client_idle_limit_in_recovery:%d is reached",pool_config->client_idle_limit_in_recovery)));
@@ -4720,7 +4720,7 @@ SELECT_RETRY:
* If we are in recovery and client_idle_limit_in_recovery is -1, then
* exit immediately.
*/
- ereport(ERROR,
+ ereport(FRONTEND_ERROR,
(pool_error_code("57000"),
errmsg("connection terminated due to online recovery"),
errdetail("child connection forced to terminate due to client_idle_limitis:-1")));
diff --git a/src/utils/error/elog.c b/src/utils/error/elog.c
index 18ab006..aab24ed 100644
--- a/src/utils/error/elog.c
+++ b/src/utils/error/elog.c
@@ -217,11 +217,17 @@ errstart(int elevel, const char *filename, int lineno,
bool output_to_server;
bool output_to_client = false;
int i;
-
+ int frontend_invalid = 0;
/*
* Check some cases in which we want to promote an error into a more
* severe error. None of this logic applies for non-error messages.
*/
+ if (elevel == FRONTEND_ERROR)
+ {
+ frontend_invalid = true;
+ elevel = ERROR;
+ }
+
if (elevel >= ERROR)
{
/*
@@ -321,6 +327,7 @@ errstart(int elevel, const char *filename, int lineno,
edata = &errordata[errordata_stack_depth];
MemSet(edata, 0, sizeof(ErrorData));
edata->elevel = elevel;
+ edata->frontend_invalid = frontend_invalid;
edata->output_to_server = output_to_server;
edata->output_to_client = output_to_client;
if(elevel == FATAL && PG_exception_stack == NULL) /* This is startup failure. Take down main process with it */
@@ -969,6 +976,21 @@ geterrcode(void)
}
/*
+ * getfrontendinvalid --- return the currently frontend_invalid value
+ *
+ * This is only intended for use in error callback subroutines, since there
+ * is no other place outside elog.c where the concept is meaningful.
+ */
+bool getfrontendinvalid(void)
+{
+ ErrorData *edata = &errordata[errordata_stack_depth];
+
+ /* we don't bother incrementing recursion_depth */
+ CHECK_STACK_DEPTH();
+
+ return edata->frontend_invalid;
+}
+/*
* geterrposition --- return the currently set error position (0 if none)
*
* This is only intended for use in error callback subroutines, since there
@@ -1580,7 +1602,7 @@ write_eventlog(int level, const char *line, int len)
if (evtHandle == INVALID_HANDLE_VALUE)
{
- evtHandle = RegisterEventSource(NULL, event_source ? event_source : "PostgreSQL");
+ evtHandle = RegisterEventSource(NULL, event_source ? event_source : "pgpool");
if (evtHandle == NULL)
{
evtHandle = INVALID_HANDLE_VALUE;
More information about the pgpool-hackers
mailing list