[pgpool-hackers: 1151] Re: pgpool-II 3.4 problems
Muhammad Usama
m.usama at gmail.com
Wed Nov 11 01:03:26 JST 2015
Hi Ishii San
Can you please have a look at the attached patch for 3.4 branch.
Basically it introduces a new error level FRONTEND_ERROR for ereport(), to
inform the error handler not to cache the current backend connection. Other
than that, this FRONTEND_ERROR error level behaves and works similar to the
ereport(ERROR..)
Thanks
Best regards
Muhammad Usama
On Thu, Oct 15, 2015 at 4:45 PM, Muhammad Usama <m.usama at gmail.com> wrote:
>
>
> On Thu, Oct 15, 2015 at 2:23 AM, Tatsuo Ishii <ishii at postgresql.org>
> wrote:
>
>> Usama,
>>
>> Can you please propose a patch to implement similar thing with 3.4? I
>> think it will make 3.4 more robust. I have no idea how to implement
>> it with 3.4's exception mechanism.
>>
>
> sure I will work on this
>
> Thanks
> best regards
> Muhammad Usama
>
>>
>> Best regards,
>> --
>> Tatsuo Ishii
>> SRA OSS, Inc. Japan
>> English: http://www.sraoss.co.jp/index_en.php
>> Japanese:http://www.sraoss.co.jp
>>
>> > Hi Ishii San
>> >
>> > I think this difference of behavior between 3.3. And 3.4 is caused by
>> the
>> > commit "cb735f22441001b6afdbb5bac72808db66094ca9" that was not ported to
>> > the 3.4 branch.
>> >
>> >
>> http://git.postgresql.org/gitweb/?p=pgpool2.git;a=commit;h=cb735f22441001b6afdbb5bac72808db66094ca9
>> >
>> > Thanks
>> > Best regards
>> > Muhammad Usama
>> >
>> >
>> >
>> > On Tue, Oct 13, 2015 at 2:45 PM, Tatsuo Ishii <ishii at postgresql.org>
>> wrote:
>> >
>> >> > Usama,
>> >> >
>> >> > While I have been working on:
>> >> > http://www.pgpool.net/mantisbt/view.php?id=147
>> >>
>> >> Oops. This is wrong. Here is the correct one.
>> >>
>> >> http://www.pgpool.net/mantisbt/view.php?id=145
>> >>
>> >> > I found there was an interesting behavior change between pgpool-II
>> 3.3
>> >> > and 3.4.
>> >> >
>> >> > It turns out that the bug was related to the action when
>> >> > client_idle_limit is reached. In 3.3, when client_idle_limit is
>> >> > reached, following statement is executed (pool_process_query.c):
>> >> >
>> >> > if (idle_count > pool_config->client_idle_limit)
>> >> > {
>> >> > pool_log("pool_process_query: child connection forced to
>> terminate
>> >> due to client_idle_limit (%d) reached,
>> >> > pool_config->client_idle_limit);
>> >> > pool_send_error_message(frontend, MAJOR(backend),
>> >> > "57000", "connection terminated due to client idle
>> limit
>> >> reached",
>> >> > "","", __FILE__, __LINE__);
>> >> > return POOL_END_WITH_FRONTEND_ERROR;
>> >> > }
>> >> >
>> >> > then goes to this (child.c)
>> >> >
>> >> > if (status == POOL_END_WITH_FRONTEND_ERROR ||
>> >> > pool_config->connection_cache == 0 ||
>> >> > !strcmp(sp->database, "template0") ||
>> >> > !strcmp(sp->database, "template1") ||
>> >> > !strcmp(sp->database, "postgres") ||
>> >> > !strcmp(sp->database, "regression"))
>> >> > {
>> >> > reset_connection();
>> >> > pool_close(frontend);
>> >> > pool_send_frontend_exits(backend);
>> >> > pool_discard_cp(sp->user, sp->database, sp->major);
>> >> > }
>> >> >
>> >> > This results in closing connection to backend.
>> >> >
>> >> > In 3.4,
>> >> >
>> >> > if (idle_count > pool_config->client_idle_limit)
>> >> > {
>> >> > ereport(ERROR,
>> >> > (pool_error_code("57000"),
>> >> > errmsg("unable to read data"),
>> >> > errdetail("child connection forced to terminate due
>> to
>> >> client_idle_limit:%d is reached",
>> >> > pool_config->client_idle_limit)));
>> >> > }
>> >> >
>> >> > Then jump to backend_cleanup() and call pool_process_query() to
>> >> > execute queries defined in reset_query_list. This results in sending
>> >> > typically "DISCARD ALL" to backend. The problem reported in the bug
>> id
>> >> > 147 happens in this route (actually the reporter claims that
>> pgpool-II
>> >> > hangs in sending "DISCARD").
>> >> >
>> >> > In summary, when client idle limit reaches, pgpool-II 3.3 disconnects
>> >> > connections to backend, while 3.4 does not close the connections to
>> >> > backend and sends DISCARD to backend.
>> >> >
>> >> > We have a few reports that pgpool-II 3.4 hangs while sending DISCARD
>> >> > and I wonder those problems somewhat related to the behavior change
>> >> > described above because the code path in 3.4 is executed even if
>> >> > client_idle_limit is not reached (elog(ERROR...)).
>> >> >
>> >> > Usama,
>> >> >
>> >> > Was this behavior change intentional?
>> >> >
>> >> > Best regards,
>> >> > --
>> >> > Tatsuo Ishii
>> >> > SRA OSS, Inc. Japan
>> >> > English: http://www.sraoss.co.jp/index_en.php
>> >> > Japanese:http://www.sraoss.co.jp
>> >> > _______________________________________________
>> >> > pgpool-hackers mailing list
>> >> > pgpool-hackers at pgpool.net
>> >> > http://www.pgpool.net/mailman/listinfo/pgpool-hackers
>> >> _______________________________________________
>> >> pgpool-hackers mailing list
>> >> pgpool-hackers at pgpool.net
>> >> http://www.pgpool.net/mailman/listinfo/pgpool-hackers
>> >>
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.sraoss.jp/pipermail/pgpool-hackers/attachments/20151110/59a1ef6c/attachment-0001.html>
-------------- next part --------------
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index fd7ab1d..ecaf801 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -118,6 +118,10 @@ typedef enum
#endif
#define FATAL 21 /* fatal error - abort process */
#define PANIC 22 /* take down the other backends with me */
+/* pgpool-II extension. This is same as ERROR but sets the
+ * do not cache connection flag before transforming to ERROR.
+ */
+#define FRONTEND_ERROR 23 /* transformed to ERROR at errstart */
/* #define DEBUG DEBUG1 */ /* Backward compatibility with pre-7.3 */
#define POOL_EXIT_SUCCESS 0 /* failure exit and child gets restarted*/
@@ -280,6 +284,7 @@ extern int errposition(int cursorpos);
#define pg_unreachable() exit(0)
+extern int getfrontendinvalid(void);
extern int geterrcode(void);
extern int geterrposition(void);
extern int getinternalerrposition(void);
@@ -431,6 +436,7 @@ typedef struct ErrorData
const char *domain; /* message domain */
const char *context_domain; /* message domain for context message */
int sqlerrcode; /* encoded ERRSTATE */
+ int frontend_invalid;/* non zero means frontend connection is invalid */
char *pgpool_errcode; /* error code to be sent to client */
char *message; /* primary error message */
char *detail; /* detail error message */
diff --git a/src/protocol/child.c b/src/protocol/child.c
index c0cc066..7ca1079 100644
--- a/src/protocol/child.c
+++ b/src/protocol/child.c
@@ -87,7 +87,7 @@ static POOL_CONNECTION *get_connection(int front_end_fd, SockAddr *saddr);
static POOL_CONNECTION_POOL *get_backend_connection(POOL_CONNECTION *frontend);
static StartupPacket *StartupPacketCopy(StartupPacket *sp);
static void print_process_status(char *remote_host,char* remote_port);
-static bool backend_cleanup(POOL_CONNECTION* volatile *frontend, POOL_CONNECTION_POOL* volatile backend);
+static bool backend_cleanup(POOL_CONNECTION* volatile *frontend, POOL_CONNECTION_POOL* volatile backend, int frontend_invalid);
static void free_persisten_db_connection_memory(POOL_CONNECTION_POOL_SLOT *cp);
static int choose_db_node_id(char *str);
static void child_will_go_down(int code, Datum arg);
@@ -211,6 +211,8 @@ void do_child(int *fds)
if (sigsetjmp(local_sigjmp_buf, 1) != 0)
{
POOL_PROCESS_CONTEXT *session;
+ int frontend_invalid = getfrontendinvalid();
+
disable_authentication_timeout();
/* Since not using PG_TRY, must reset error stack by hand */
error_context_stack = NULL;
@@ -233,7 +235,7 @@ void do_child(int *fds)
if (accepted)
connection_count_down();
- backend_cleanup(&child_frontend, backend);
+ backend_cleanup(&child_frontend, backend, frontend_invalid);
session = pool_get_process_context();
@@ -372,7 +374,7 @@ void do_child(int *fds)
status = pool_process_query(child_frontend, backend, 0);
if(status != POOL_CONTINUE)
{
- backend_cleanup(&child_frontend, backend);
+ backend_cleanup(&child_frontend, backend, 0);
break;
}
}
@@ -415,7 +417,7 @@ void do_child(int *fds)
* return true if backend connection is cached
*/
static bool
-backend_cleanup(POOL_CONNECTION* volatile *frontend, POOL_CONNECTION_POOL* volatile backend)
+backend_cleanup(POOL_CONNECTION* volatile *frontend, POOL_CONNECTION_POOL* volatile backend, int frontend_invalid)
{
StartupPacket *sp;
bool cache_connection = false;
@@ -426,11 +428,11 @@ backend_cleanup(POOL_CONNECTION* volatile *frontend, POOL_CONNECTION_POOL* volat
sp = MASTER_CONNECTION(backend)->sp;
/*
- * cach connection if connection is not for
+ * cach connection if frontend is not invalid, connection is not for
* system db and connection cache configuration
* parameter is enabled
*/
- if (sp && pool_config->connection_cache != 0 && sp->system_db == false)
+ if (sp && pool_config->connection_cache != 0 && sp->system_db == false && frontend_invalid == 0 )
{
if (*frontend)
{
@@ -453,20 +455,22 @@ backend_cleanup(POOL_CONNECTION* volatile *frontend, POOL_CONNECTION_POOL* volat
}
}
- /*
- * For those special databases, and when frontend client exits abnormally,
- * we don't cache connection to backend.
- */
- if ((sp &&
- (!strcmp(sp->database, "template0") ||
- !strcmp(sp->database, "template1") ||
- !strcmp(sp->database, "postgres") ||
- !strcmp(sp->database, "regression"))) ||
- (*frontend != NULL &&
- ((*frontend)->socket_state == POOL_SOCKET_EOF ||
- (*frontend)->socket_state == POOL_SOCKET_ERROR)))
- cache_connection = false;
-
+ if (cache_connection)
+ {
+ /*
+ * For those special databases, and when frontend client exits abnormally,
+ * we don't cache connection to backend.
+ */
+ if ((sp &&
+ (!strcmp(sp->database, "template0") ||
+ !strcmp(sp->database, "template1") ||
+ !strcmp(sp->database, "postgres") ||
+ !strcmp(sp->database, "regression"))) ||
+ (*frontend != NULL &&
+ ((*frontend)->socket_state == POOL_SOCKET_EOF ||
+ (*frontend)->socket_state == POOL_SOCKET_ERROR)))
+ cache_connection = false;
+ }
/*
* Close frontend connection
*/
diff --git a/src/protocol/pool_process_query.c b/src/protocol/pool_process_query.c
index ff97bf1..271f19f 100644
--- a/src/protocol/pool_process_query.c
+++ b/src/protocol/pool_process_query.c
@@ -4695,7 +4695,7 @@ SELECT_RETRY:
if (idle_count > pool_config->client_idle_limit)
{
- ereport(ERROR,
+ ereport(FRONTEND_ERROR,
(pool_error_code("57000"),
errmsg("unable to read data"),
errdetail("child connection forced to terminate due to client_idle_limit:%d is reached",
@@ -4708,7 +4708,7 @@ SELECT_RETRY:
if (idle_count_in_recovery > pool_config->client_idle_limit_in_recovery)
{
- ereport(ERROR,
+ ereport(FRONTEND_ERROR,
(pool_error_code("57000"),
errmsg("unable to read data"),
errdetail("child connection forced to terminate due to client_idle_limit_in_recovery:%d is reached",pool_config->client_idle_limit_in_recovery)));
@@ -4720,7 +4720,7 @@ SELECT_RETRY:
* If we are in recovery and client_idle_limit_in_recovery is -1, then
* exit immediately.
*/
- ereport(ERROR,
+ ereport(FRONTEND_ERROR,
(pool_error_code("57000"),
errmsg("connection terminated due to online recovery"),
errdetail("child connection forced to terminate due to client_idle_limitis:-1")));
diff --git a/src/utils/error/elog.c b/src/utils/error/elog.c
index 18ab006..dd8f44a 100644
--- a/src/utils/error/elog.c
+++ b/src/utils/error/elog.c
@@ -217,11 +217,17 @@ errstart(int elevel, const char *filename, int lineno,
bool output_to_server;
bool output_to_client = false;
int i;
-
+ int frontend_invalid = 0;
/*
* Check some cases in which we want to promote an error into a more
* severe error. None of this logic applies for non-error messages.
*/
+ if (elevel == FRONTEND_ERROR)
+ {
+ frontend_invalid = true;
+ elevel = ERROR;
+ }
+
if (elevel >= ERROR)
{
/*
@@ -321,6 +327,7 @@ errstart(int elevel, const char *filename, int lineno,
edata = &errordata[errordata_stack_depth];
MemSet(edata, 0, sizeof(ErrorData));
edata->elevel = elevel;
+ edata->frontend_invalid = frontend_invalid;
edata->output_to_server = output_to_server;
edata->output_to_client = output_to_client;
if(elevel == FATAL && PG_exception_stack == NULL) /* This is startup failure. Take down main process with it */
@@ -969,6 +976,21 @@ geterrcode(void)
}
/*
+ * getfrontendinvalid --- return the currently frontend_invalid value
+ *
+ * This is only intended for use in error callback subroutines, since there
+ * is no other place outside elog.c where the concept is meaningful.
+ */
+int getfrontendinvalid(void)
+{
+ ErrorData *edata = &errordata[errordata_stack_depth];
+
+ /* we don't bother incrementing recursion_depth */
+ CHECK_STACK_DEPTH();
+
+ return edata->frontend_invalid;
+}
+/*
* geterrposition --- return the currently set error position (0 if none)
*
* This is only intended for use in error callback subroutines, since there
@@ -1580,7 +1602,7 @@ write_eventlog(int level, const char *line, int len)
if (evtHandle == INVALID_HANDLE_VALUE)
{
- evtHandle = RegisterEventSource(NULL, event_source ? event_source : "PostgreSQL");
+ evtHandle = RegisterEventSource(NULL, event_source ? event_source : "pgpool");
if (evtHandle == NULL)
{
evtHandle = INVALID_HANDLE_VALUE;
More information about the pgpool-hackers
mailing list