[Pgpool-general] Reloading the server while under load and recovering the degenerated node

Tatsuo Ishii ishii at sraoss.co.jp
Tue Feb 24 02:30:19 UTC 2009


I looked into the problem closely and fixed it.

Initially I thought PostgreSQL would not send a parameter status packet
unless one of the sensitive parameters (datestyle etc.) was changed. In
reality, PostgreSQL sends those packets virtually at random whenever
postgresql.conf is reloaded.

Now pgpool ignores those randomly :-) sent packets, so you should be
able to reload postgresql.conf at any time.
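
For reference, the ParameterStatus ('S') message that the backend sends
on a reload has the following layout (this is just the v3
frontend/backend protocol format, shown here as a C-style comment, not
pgpool code); the patch below simply reads and discards any number of
these before trusting the next message kind:

/*
 * ParameterStatus message, as sent by the backend:
 *   Byte1('S')   message type
 *   Int32        length of the rest of the message, including itself
 *   String       parameter name  (null-terminated)
 *   String       parameter value (null-terminated)
 */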

Please grab the newest pool_process_query.c (1.141) or apply the
included patch and try it out.
--
Tatsuo Ishii
SRA OSS, Inc. Japan

> Hi Jaume,
> 
> After reading the below I decided to test the configuration reload  
> functionality, both through pgpool using the pg_reload_conf() function  
> and by reloading via the init.d script on the master node.
> When using the init.d script to reload the conf I already expected  
> pgpool not to like that and to degenerate nodes.
> 
> In my personal opinion I would not reload the configuration using the  
> init.d script on a particular node, since this is a replication  
> environment.
> Also, the configuration change would be done on both nodes, but I see  
> your point and I tested it by modifying just one configuration file.
> 
> I've tested this on both pgpool2 version 2.1 (cvs 20090120) and  
> pgpool2 2.2 RC1 (patched).
> At first, I thought that pgpool2 was not handling the reload of the  
> configuration on both nodes too well.
> But then I remembered a few things and realized that what I was  
> expecting to happen was wrong and I should not expect it.
> 
> 1) I knew that a "SELECT pg_reload_conf()" would only run on one node.
> 
> 2) If I had the "SELECT pg_reload_conf()" wrapped within a transaction,  
> I was expecting it to run on both nodes (WRONG).
>       Since it's a SELECT it only runs on one node; the transaction  
> block, though, is indeed opened on both nodes (see the sketch below).
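> 
> How such a transaction is distributed, as I understand it from the  
> above (this is a sketch, not actual server output):
> 
>       BEGIN;                     -- sent to both nodes
>       SELECT pg_reload_conf();   -- a plain SELECT, runs on one node only
>       COMMIT;                    -- sent to both nodes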
> 
> Then I remembered (from a previous email by Tatsuo) that in order to  
> make that function run on both nodes I should add the /*REPLICATION*/  
> comment before the SELECT.
> 
> So, if I change the "log_statement" parameter on one of the nodes,  
> either node 0 or node 1, and reload the configuration files of both  
> nodes through pgpool using the command below, everything works OK and  
> nothing gets degenerated.
> 
> /*REPLICATION*/SELECT pg_reload_conf();
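> 
> A quick way to send that through pgpool from the shell (host, port,  
> user and database taken from the examples in this thread; adjust as  
> needed) would be something like:
> 
>       psql -h postgresql -p 9999 -U pgpool2 -d bench_replication \
>            -c "/*REPLICATION*/SELECT pg_reload_conf();"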
> 
> 
> As regards online recovery, it works fine for me, although I haven't  
> tried doing it with pgbench in a while.
> I have, though, tested it while pgpool/the DBs had quite a bit of load  
> created by our in-house application.
> 
> 
> Marcelo
> PostgreSQL DBA
> Linux/Solaris System Administrator
> 
> 
> On Feb 23, 2009, at 5:47 AM, Jaume Sabater wrote:
> 
> > Hi everyone!
> >
> > I am opening this new thread to deal with this problem. It's now
> > pgpool-II 2.2 RC2 under Debian GNU/Linux Lenny AMD64 with PostgreSQL
> > 8.3.6. Two nodes, with pgpool-II running on node 0, which is the
> > master node at the moment.
> >
> > Commands executed from my laptop:
> >
> > dropdb -h postgresql -p 9999 -U pgpool2 bench_replication
> > createdb -h postgresql -p 9999 -U pgpool2 bench_replication
> > createlang -h postgresql -p 9999 -U pgpool2 -d bench_replication plpgsql
> > pgbench -i -h postgresql -p 9999 -U pgpool2 -d bench_replication
> > pgbench -h postgresql -p 9999 -U pgpool2 -d bench_replication -c 2 -t 100
> >
> > During the execution of the last command, I reload the configuration
> > on the master node. The change is in the file
> > /etc/postgresql/8.3/main/postgresql.conf and consists of turning
> > log_statement from "all" to "none". This is what appears in the syslog
> > (from pgpool-II):
> >
> > Feb 23 12:12:37 pgsql1 pgpool: ERROR: pid 2632: pool_process_query: 1
> > th kind C does not match with master or majority connection kind S
> > Feb 23 12:12:37 pgsql1 pgpool: ERROR: pid 2632: kind mismatch among
> > backends. Possible last query was: "BEGIN;" kind details are: 0[S]
> > 1[C]
> > Feb 23 12:12:37 pgsql1 pgpool: LOG:   pid 2632: notice_backend_error:
> > 1 fail over request from pid 2632
> > Feb 23 12:12:37 pgsql1 pgpool: LOG:   pid 1116: starting degeneration.
> > shutdown host pgsql2.freyatest.domain(5432)
> > Feb 23 12:12:37 pgsql1 pgpool: LOG:   pid 1116: execute command:
> > /var/lib/postgresql/8.3/main/pgpool-failover 1 pgsql2.freyatest.domain
> > 5432 /var/lib/postgresql/8.3/main 0 0
> > Feb 23 12:12:37 pgsql1 pgpool[2887]: Executing pgpool-failover as user postgres
> > Feb 23 12:12:37 pgsql1 pgpool[2888]: Failover of node 1 at hostname
> > pgsql2.freyatest.domain. New master node is 0. Old master node was 0.
> > Feb 23 12:12:37 pgsql1 pgpool: LOG:   pid 1116: failover_handler: set
> > new master node: 0
> > Feb 23 12:12:37 pgsql1 pgpool: LOG:   pid 1116: failover done.
> > shutdown host pgsql2.freyatest.domain(5432)
> >
> >
> > Please note the slave node is being shut down, not the master node
> > (which is the one that was being reloaded).
> >
> > This is seen in the PostgreSQL log of the master node (the one being  
> > reloaded):
> >
> > [..]
> >
> > 2009-02-23 12:12:37 CET LOG:  statement: BEGIN;
> > 2009-02-23 12:12:37 CET LOG:  statement: INSERT INTO history (tid,
> > bid, aid, delta, mtime) VALUES (5, 1, 63021, -3004,
> > CURRENT_TIMESTAMP);
> > 2009-02-23 12:12:37 CET LOG:  statement: UPDATE accounts SET abalance
> > = abalance + 2213 WHERE aid = 82551;
> > 2009-02-23 12:12:37 CET LOG:  statement: SELECT abalance FROM accounts
> > WHERE aid = 82551;
> > 2009-02-23 12:12:37 CET LOG:  statement: UPDATE tellers SET tbalance =
> > tbalance + 2213 WHERE tid = 9;
> > 2009-02-23 12:12:37 CET LOG:  statement: UPDATE branches SET bbalance
> > = bbalance + 2213 WHERE bid = 1;
> > 2009-02-23 12:12:37 CET LOG:  incomplete startup packet
> > 2009-02-23 12:12:37 CET LOG:  statement: END;
> > 2009-02-23 12:12:37 CET LOG:  received SIGHUP, reloading  
> > configuration files
> >
> >
> > And this is from the PostgreSQL log on the slave node:
> >
> > [..]
> > 2009-02-23 12:12:37 CET LOG:  statement: BEGIN;
> > 2009-02-23 12:12:37 CET LOG:  statement: INSERT INTO history (tid,
> > bid, aid, delta, mtime) VALUES (5, 1, 63021, -3004,
> > CURRENT_TIMESTAMP);
> > 2009-02-23 12:12:37 CET LOG:  statement: UPDATE accounts SET abalance
> > = abalance + 2213 WHERE aid = 82551;
> > 2009-02-23 12:12:37 CET LOG:  statement: END;
> > 2009-02-23 12:12:37 CET LOG:  statement: UPDATE tellers SET tbalance =
> > tbalance + 2213 WHERE tid = 9;
> > 2009-02-23 12:12:37 CET LOG:  statement: UPDATE branches SET bbalance
> > = bbalance + 2213 WHERE bid = 1;
> > 2009-02-23 12:12:37 CET LOG:  statement: BEGIN;
> > 2009-02-23 12:12:37 CET LOG:  unexpected EOF on client connection
> > 2009-02-23 12:12:37 CET LOG:  unexpected EOF on client connection
> > 2009-02-23 12:12:37 CET LOG:  unexpected EOF on client connection
> > 2009-02-23 12:12:37 CET LOG:  unexpected EOF on client connection
> > 2009-02-23 12:12:37 CET LOG:  unexpected EOF on client connection
> > 2009-02-23 12:12:37 CET LOG:  unexpected EOF on client connection
> > 2009-02-23 12:12:37 CET LOG:  unexpected EOF on client connection
> > 2009-02-23 12:12:37 CET LOG:  unexpected EOF on client connection
> > 2009-02-23 12:12:37 CET LOG:  unexpected EOF on client connection
> > 2009-02-23 12:12:37 CET LOG:  unexpected EOF on client connection
> >
> > PCP says this:
> >
> > # pcp_node_info 5 postgresql 9898 root <password> 0
> > pgsql1.freyatest.domain 5432 2 1073741823.500000
> >
> > # pcp_node_info 5 postgresql 9898 root <password> 1
> > pgsql2.freyatest.domain 5432 3 1073741823.500000
> >
> > So it is, in fact, the slave node that is being degenerated (status 3
> > in the pcp_node_info output above means the node is down), and it was
> > NOT the node being reloaded. I think this is because the error was
> > triggered by the directive "replication_stop_on_mismatch = true",
> > which leads pgpool-II to always stay with the master node.
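> >
> > For reference, the relevant pgpool.conf directive (the comments are
> > mine, summarising the documented behaviour; the value is the one
> > assumed from this setup):
> >
> >   # When the backends return different message kinds for the same query:
> >   #   true  -> degenerate the node that disagrees with the
> >   #            master/majority and continue with the remaining nodes
> >   #   false -> report the error and terminate the session without
> >   #            degenerating any node
> >   replication_stop_on_mismatch = true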
> >
> > This is the output of the last command executed on my laptop:
> >
> > [..]
> > client 0 sending BEGIN;
> > client 1 receiving
> > client 1 sending INSERT INTO history (tid, bid, aid, delta, mtime)
> > VALUES (9, 1, 82551, 2213, CURRENT_TIMESTAMP);
> > client 0 receiving
> > Client 0 aborted in state 7: ERROR:  kind mismatch among backends.
> > Possible last query was: "BEGIN;" kind details are: 0[S] 1[C]
> > HINT:  check data consistency among db nodes
> > client 1 receiving
> > client 1 sending END;
> > client 1 receiving
> > Client 1 aborted in state 13. Probably the backend died while  
> > processing.
> > transaction type: TPC-B (sort of)
> > scaling factor: 1
> > number of clients: 2
> > number of transactions per client: 100
> > number of transactions actually processed: 32/200
> > tps = 16.542050 (including connections establishing)
> > tps = 16.646950 (excluding connections establishing)
> >
> >
> > After this happened, I copied the above logs from the consoles and
> > pasted them here. Then I stopped the PostgreSQL server on the slave
> > node and tried to recover the degenerated node:
> >
> > $ pcp_recovery_node 5 postgresql 9898 root <password> 1
> > BackendError
> >
> > Despite the BackendError output, the process went fine and the slave
> > node is now part of the cluster.
> >
> > I've double-checked the PostgreSQL logs on the master node and they
> > always say "received SIGHUP, reloading configuration files", yet the
> > above problem still happens.
> >
> > After this, I restarted the slave node. It happened that a user was
> > launching some queries while I did that, so the node went down. This
> > is the syslog of pgpool-II:
> >
> > Feb 23 12:26:05 pgsql1 pgpool: ERROR: pid 3346: pool_read: read failed
> > (Connection reset by peer)
> > Feb 23 12:26:05 pgsql1 pgpool: LOG:   pid 3346:
> > ProcessFrontendResponse: failed to read kind from frontend. frontend
> > abnormally exited
> > Feb 23 12:29:27 pgsql1 pgpool: LOG:   pid 3343: notice_backend_error:
> > 1 fail over request from pid 3343
> > Feb 23 12:29:27 pgsql1 pgpool: LOG:   pid 3369: notice_backend_error:
> > 1 fail over request from pid 3369
> > Feb 23 12:29:27 pgsql1 pgpool: LOG:   pid 1116: starting degeneration.
> > shutdown host pgsql2.freyatest.domain(5432)
> > Feb 23 12:29:27 pgsql1 pgpool: LOG:   pid 1116: execute command:
> > /var/lib/postgresql/8.3/main/pgpool-failover 1 pgsql2.freyatest.domain
> > 5432 /var/lib/postgresql/8.3/main 0 0
> > Feb 23 12:29:27 pgsql1 pgpool[3422]: Executing pgpool-failover as user postgres
> > Feb 23 12:29:27 pgsql1 pgpool[3423]: Failover of node 1 at hostname
> > pgsql2.freyatest.domain. New master node is 0. Old master node was 0.
> > Feb 23 12:29:27 pgsql1 pgpool: LOG:   pid 1116: failover_handler: set
> > new master node: 0
> > Feb 23 12:29:27 pgsql1 pgpool: LOG:   pid 1116: failover done.
> > shutdown host pgsql2.freyatest.domain(5432)
> > Feb 23 12:29:27 pgsql1 pgpool: LOG:   pid 1116: failover: no backends
> > are degenerated
> >
> >
> > Okay, here we go again. I stopped the slave PostgreSQL and requested
> > an online recovery, which failed. This is the syslog:
> >
> > Feb 23 12:33:36 pgsql1 pgpool: LOG:   pid 3017: starting recovering  
> > node 1
> > Feb 23 12:33:36 pgsql1 pgpool: LOG:   pid 3017: CHECKPOINT in the 1st
> > stage done
> > Feb 23 12:33:36 pgsql1 pgpool: LOG:   pid 3017: starting recovery
> > command: "SELECT pgpool_recovery('base-backup',
> > 'pgsql2.freyatest.domain', '/var/lib/postgresql/8.3/main')"
> > Feb 23 12:33:36 pgsql1 pgpool[3508]: Executing base-backup as user  
> > postgres
> > Feb 23 12:33:36 pgsql1 pgpool[3509]: Executing pg_start_backup
> > Feb 23 12:33:37 pgsql1 pgpool[3512]: Creating file recovery.conf
> > Feb 23 12:33:37 pgsql1 pgpool[3513]: Rsyncing directory base
> > Feb 23 12:33:37 pgsql1 pgpool[3516]: Rsyncing directory global
> > Feb 23 12:33:37 pgsql1 pgpool[3519]: Rsyncing directory pg_clog
> > Feb 23 12:33:37 pgsql1 pgpool[3522]: Rsyncing directory pg_multixact
> > Feb 23 12:33:37 pgsql1 pgpool[3525]: Rsyncing directory pg_subtrans
> > Feb 23 12:33:37 pgsql1 pgpool[3528]: Rsyncing directory pg_tblspc
> > Feb 23 12:33:37 pgsql1 pgpool[3531]: Rsyncing directory pg_twophase
> > Feb 23 12:33:37 pgsql1 pgpool[3534]: Rsyncing directory pg_xlog
> > Feb 23 12:33:40 pgsql1 pgpool[3537]: Rsyncing file recovery.conf (with
> > source deletion)
> > Feb 23 12:33:40 pgsql1 pgpool[3540]: Executing pg_stop_backup
> > Feb 23 12:33:41 pgsql1 pgpool: LOG:   pid 3017: 1st stage is done
> > Feb 23 12:33:41 pgsql1 pgpool: LOG:   pid 3017: starting 2nd stage
> > Feb 23 12:34:11 pgsql1 pgpool: LOG:   pid 3444: pool_process_query:
> > child connection forced to terminate due to
> > client_idle_limit_in_recovery(30) reached
> > Feb 23 12:34:11 pgsql1 pgpool: LOG:   pid 3425: pool_process_query:
> > child connection forced to terminate due to
> > client_idle_limit_in_recovery(30) reached
> > Feb 23 12:34:14 pgsql1 pgpool: ERROR: pid 3017:
> > wait_connection_closed: existing connections did not close in 30 sec.
> > Feb 23 12:34:14 pgsql1 pgpool: ERROR: pid 3017: start_recovery:
> > timeover for waiting connection closed
> >
> >
> > Users were using the database while I did the recovery, which, in my
> > opinion, made it fail. So I tried again, another recovery, and it
> > failed again. Same error.
> >
> > And I tried again, and on this occasion it succeeded. There were no
> > connections or queries in the PostgreSQL log while it happened, but I
> > still have doubts about the real problem.
> >
> > So I went for something else: I repeated all the tests with pgbench
> > and, again, did a reload of the master node while under load, but this
> > time without any changes to postgresql.conf. And it failed again: the
> > slave node was degenerated.
> >
> > I stopped PostgreSQL on the slave node immediately and requested an
> > online recovery. Luckily, no connections or queries were sent to the
> > cluster during the recovery, and it went fine.
> >
> > Conclusion: I am still not sure what makes it fail, but it does. I am
> > almost capable of reproducing the error, but it does not always happen
> > (7 times out of 10, more or less). There is the "reloading while under
> > load" issue, and there is the "not being able to recover the node"
> > issue.
> >
> > Tatsuo, could you please try to do what I do here on your
> > computers/servers and see what happens? I could understand it if
> > Debian's /etc/init.d/postgresql-8.3 script did more than a reload, but
> > the line in the PostgreSQL log says otherwise. And there is also this
> > "wait_connection_closed: existing connections did not close in 30
> > sec" error which I don't understand :'(
> >
> > Should you need any more help with this, any more information, tests,
> > whatever, please let me know.
> >
> > Thanks in advance.
> >
> > -- 
> > Jaume Sabater
> > http://linuxsilo.net/
> >
> > "Ubi sapientas ibi libertas"
> 
-------------- next part --------------
Index: pool_process_query.c
===================================================================
RCS file: /cvsroot/pgpool/pgpool-II/pool_process_query.c,v
retrieving revision 1.140
retrieving revision 1.141
diff -c -r1.140 -r1.141
*** pool_process_query.c	20 Feb 2009 01:34:15 -0000	1.140
--- pool_process_query.c	24 Feb 2009 02:20:40 -0000	1.141
***************
*** 1,6 ****
  /* -*-pgsql-c-*- */
  /*
!  * $Header: /cvsroot/pgpool/pgpool-II/pool_process_query.c,v 1.140 2009/02/20 01:34:15 t-ishii Exp $
   *
   * pgpool: a language independent connection pool server for PostgreSQL 
   * written by Tatsuo Ishii
--- 1,6 ----
  /* -*-pgsql-c-*- */
  /*
!  * $Header: /cvsroot/pgpool/pgpool-II/pool_process_query.c,v 1.141 2009/02/24 02:20:40 t-ishii Exp $
   *
   * pgpool: a language independent connection pool server for PostgreSQL 
   * written by Tatsuo Ishii
***************
*** 3434,3444 ****
  
  		if (VALID_BACKEND(i))
  		{
! 			if (pool_read(CONNECTION(backend, i), &kind, 1) < 0)
  			{
! 				pool_error("pool_process_query: failed to read kind from %d th backend", i);
! 				return POOL_ERROR;
! 			}
  
  			kind_list[i] = kind;
  
--- 3434,3472 ----
  
  		if (VALID_BACKEND(i))
  		{
! 			do
  			{
! 				char *p, *value;
! 				int len;
! 
! 				if (pool_read(CONNECTION(backend, i), &kind, 1) < 0)
! 				{
! 					pool_error("read_kind_from_backend: failed to read kind from %d th backend", i);
! 					return POOL_ERROR;
! 				}
! 
! 				/*
! 				 * Read and discard parameter status
! 				 */
! 				if (kind != 'S')
! 				{
! 					break;
! 				}
! 					
! 				if (pool_read(CONNECTION(backend, i), &len, sizeof(len)) < 0)
! 				{
! 					pool_error("read_kind_from_backend: failed to read parameter status packet length from %d th backend", i);
! 					return POOL_ERROR;
! 				}
! 				len = htonl(len) - 4;
! 				p = pool_read2(CONNECTION(backend, i), len);
! 				if (p == NULL)
! 				{
! 					pool_error("read_kind_from_backend: failed to read parameter status packet from %d th backend", i);
! 				}
! 				value = p + strlen(p) + 1;
! 				pool_log("read_kind_from_backend: parameter name: %s value: %s", p, value);
! 			} while (kind == 'S');
  
  			kind_list[i] = kind;
  
***************
*** 3502,3508 ****
  			if (kind_list[i] != 0 && trust_kind != kind_list[i])
  			{
  				/* degenerate */
! 				pool_error("pool_process_query: %d th kind %c does not match with master or majority connection kind %c",
  						   i, kind_list[i], trust_kind);
  				degenerate_node[degenerate_node_num++] = i;
  			}
--- 3530,3536 ----
  			if (kind_list[i] != 0 && trust_kind != kind_list[i])
  			{
  				/* degenerate */
! 				pool_error("read_kind_from_backend: %d th kind %c does not match with master or majority connection kind %c",
  						   i, kind_list[i], trust_kind);
  				degenerate_node[degenerate_node_num++] = i;
  			}

