[Pgpool-general] client_idle_limit_in_recovery

Tue Nov 11 00:13:49 UTC 2008

Marcelo,

Could you apply the following patch and try again so that we can get
more debug info?
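
For reference, the check that the patch instruments can be sketched
roughly as below. This is only a standalone illustration, not the
actual pgpool code: struct idle_limits, should_disconnect() and
ticks_until_disconnect() are hypothetical names, and it assumes the
idle counter advances once per second and that a limit of 0 disables
the check in both modes.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two pool_config fields involved. */
struct idle_limits {
    int client_idle_limit;             /* used outside recovery; 0 = off */
    int client_idle_limit_in_recovery; /* used during recovery; 0 = off */
};

/*
 * Returns true if a frontend that has been idle for idle_count ticks
 * should be disconnected. Only one of the two limits applies at a time,
 * depending on whether recovery is in progress.
 */
static bool should_disconnect(const struct idle_limits *cfg,
                              bool in_recovery, int idle_count)
{
    int limit = in_recovery ? cfg->client_idle_limit_in_recovery
                            : cfg->client_idle_limit;

    return limit > 0 && idle_count > limit;
}

/*
 * Counts idle ticks until the frontend would be dropped.
 * Returns -1 if that never happens within max_ticks.
 */
static int ticks_until_disconnect(const struct idle_limits *cfg,
                                  bool in_recovery, int max_ticks)
{
    for (int idle_count = 1; idle_count <= max_ticks; idle_count++) {
        if (should_disconnect(cfg, in_recovery, idle_count))
            return idle_count;
    }
    return -1;
}
```

With client_idle_limit = 0 and client_idle_limit_in_recovery = 10 as in
your pgpool.conf, ticks_until_disconnect(&cfg, true, 90) gives 11, well
inside recovery_timeout = 90, while the non-recovery case gives -1. So
if the debug line never appears during the 2nd stage, the child is
probably not reaching this check at all.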
--
Tatsuo Ishii
SRA OSS, Inc. Japan

> Hmm, unfortunately it does not seem to be working for me, or perhaps I
> have a setting that you may not have turned on, Tiago.
> 
> Basically I created one connection from my apache (10.1.101.152) going
> to pgpool and then let it sit idle.
> pgpool has two connections open to backends 0 and 1, and I'm trying
> to recover node 2.
> 
> It pretty much hangs on stage 2. On pgpool startup I do see that
> the new setting is indeed turned on.
> What I think is strange is that in the pgpool log I don't even see
> it trying to kill the clients by checking the
> client_idle_limit_in_recovery setting.
> 
> 
> Could one of these settings be the problem?
> 
> num_init_children = 70
> max_pool = 1
> child_life_time = 300
> connection_life_time = 300
> child_max_connections = 0
> client_idle_limit = 0
> authentication_timeout = 60
> replication_mode = true
> load_balance_mode = true
> replication_stop_on_mismatch = false
> connection_cache = true
> health_check_timeout = 20
> health_check_period = 15
> health_check_user = 'postgres'
> recovery_user = 'postgres'
> recovery_password = ''
> recovery_1st_stage_command = 'copy-base-backup'
> recovery_2nd_stage_command = 'pgpool_recovery_pitr'
> recovery_timeout = 90
> client_idle_limit_in_recovery = 10
> 
> 
> 
> 
> LOG:
> -----
> Nov 10 12:29:59 debian-db1 pgpool: 2008-11-10 12:29:59 LOG:   pid 30668: pgpool successfully started
> Nov 10 12:29:59 debian-db1 pgpool: 2008-11-10 12:29:59 LOG:   pid 30668: set 2 th backend down status
> Nov 10 12:29:59 debian-db1 pgpool: 2008-11-10 12:29:59 LOG:   pid 30668: starting degeneration. shutdown host debian-db4(5432)
> Nov 10 12:30:00 debian-db1 pgpool: 2008-11-10 12:30:00 LOG:   pid 30668: failover_handler: set new master node: 0
> Nov 10 12:30:01 debian-db1 pgpool: 2008-11-10 12:30:01 LOG:   pid 30668: failover done. shutdown host debian-db4(5432)
> Nov 10 12:30:02 debian-db1 pgpool: 2008-11-10 12:30:02 LOG:   pid 30821: ProcessFrontendResponse: failed to read kind from frontend. frontend abnormally exited
> Nov 10 12:34:07 debian-db1 pgpool: 2008-11-10 12:34:07 LOG:   pid 30741: starting recovering node 2
> Nov 10 12:34:08 debian-db1 pgpool: 2008-11-10 12:34:08 LOG:   pid 30741: CHECKPOINT in the 1st stage done
> Nov 10 12:34:08 debian-db1 pgpool: 2008-11-10 12:34:08 LOG:   pid 30741: starting recovery command: "SELECT pgpool_recovery('copy-base-backup', 'debian-db4', '/var/lib/postgresql/8.3/main')"
> Nov 10 12:34:09 debian-db1 pgpool: 2008-11-10 12:34:09 LOG:   pid 30741: 1st stage is done
> Nov 10 12:34:09 debian-db1 pgpool: 2008-11-10 12:34:09 LOG:   pid 30741: starting 2nd stage
> Nov 10 12:34:17 debian-db1 pgpool: 2008-11-10 12:34:17 DEBUG: pid 30668: starting health checking
> Nov 10 12:34:32 debian-db1 pgpool: 2008-11-10 12:34:32 DEBUG: pid 30668: starting health checking
> Nov 10 12:34:47 debian-db1 pgpool: 2008-11-10 12:34:47 DEBUG: pid 30668: starting health checking
> Nov 10 12:35:02 debian-db1 pgpool: 2008-11-10 12:35:02 DEBUG: pid 30668: starting health checking
> Nov 10 12:35:17 debian-db1 pgpool: 2008-11-10 12:35:17 DEBUG: pid 30668: starting health checking
> Nov 10 12:35:32 debian-db1 pgpool: 2008-11-10 12:35:32 DEBUG: pid 30668: starting health checking
> Nov 10 12:35:42 debian-db1 pgpool: 2008-11-10 12:35:42 ERROR: pid 30741: wait_connection_closed: existing connections did not close in 90 sec.
> Nov 10 12:35:42 debian-db1 pgpool: 2008-11-10 12:35:42 ERROR: pid 30741: start_recovery: timeover for waiting connection closed
> Nov 10 12:35:42 debian-db1 pgpool: 2008-11-10 12:35:42 DEBUG: pid 30741: pcp_child: received PCP packet type of service 'X'
> Nov 10 12:35:42 debian-db1 pgpool: 2008-11-10 12:35:42 DEBUG: pid 30741: pcp_child: client disconnecting. close connection
> 
> 
> 
> 
> NETSTAT
> -------------
> tcp        0      0 0.0.0.0:5432            0.0.0.0:*               LISTEN      30668/pgpool
> tcp        0      0 10.1.100.213:5432       10.1.101.152:2936       ESTABLISHED 30821/pgpool:
> tcp        0      0 10.1.100.213:1947       10.1.100.217:5432       ESTABLISHED 30821/pgpool:
> tcp        1      0 127.0.0.1:4652          127.0.0.1:5432          CLOSE_WAIT  1973/python
> tcp        0      0 10.1.100.213:4814       10.1.100.116:5432       ESTABLISHED 30821/pgpool:
> 
> 
> 
> 
> Thank you,
> Marcelo
> 
> 
> On Nov 8, 2008, at 9:05 AM, Tiago Macedo wrote:
> 
> > Hi,
> >
> > I've tested this and it works perfectly. Exactly what I needed.
> >
> > Thank you so much for your hard work,
> > Tiago Macedo
> >
> > On Fri, Nov 7, 2008 at 9:28 AM, Tatsuo Ishii <ishii at sraoss.co.jp>  
> > wrote:
> > Hi,
> >
> > I have implemented a new directive "client_idle_limit_in_recovery" per
> > discussion. This is useful for online recovery. From the doc:
> >
> >    client_idle_limit_in_recovery
> >
> >           Similar to client_idle_limit but only takes effect in the
> >           second stage of recovery. Disconnects a client that has been
> >           idle for client_idle_limit_in_recovery seconds since its last
> >           query was sent. This is useful for preventing pgpool recovery
> >           from being disturbed by a lazy client or by a TCP/IP
> >           connection between the client and pgpool that is accidentally
> >           down. The default value for client_idle_limit_in_recovery is
> >           0, which means the functionality is turned off. You need to
> >           reload pgpool.conf if you change
> >           client_idle_limit_in_recovery.
> >
> > Note that client_idle_limit now takes effect only *outside* the
> > second stage of recovery. I think this makes more sense. Please let me
> > know if you have any questions/comments.
> >
> > Thanks,
> > --
> > Tatsuo Ishii
> > SRA OSS, Inc. Japan
> > _______________________________________________
> > Pgpool-general mailing list
> > Pgpool-general at pgfoundry.org
> > http://pgfoundry.org/mailman/listinfo/pgpool-general
> 
-------------- next part --------------
Index: pool_process_query.c
===================================================================
RCS file: /cvsroot/pgpool/pgpool-II/pool_process_query.c,v
retrieving revision 1.114
diff -c -r1.114 pool_process_query.c
*** pool_process_query.c	10 Nov 2008 00:58:24 -0000	1.114
--- pool_process_query.c	11 Nov 2008 00:12:18 -0000
***************
*** 376,381 ****
--- 376,386 ----
  				 (*InRecovery && pool_config->client_idle_limit_in_recovery > 0)) && fds == 0)
  			{
  				frontend_idle_count++;
+ 
+ 				pool_debug("idle count:%d InRecovery:%d client_idle_limit:%d client_idle_limit_in_recovery:%d",
+ 						   frontend_idle_count, *InRecovery, pool_config->client_idle_limit,
+ 						   pool_config->client_idle_limit_in_recovery);
+ 
  				if (*InRecovery == 0 && (frontend_idle_count > pool_config->client_idle_limit))
  				{
  					pool_log("pool_process_query: child connection forced to terminate due to client_idle_limit(%d) reached", pool_config->client_idle_limit);