[Pgpool-general] pcp_attach_node problem?

Wed Jan 21 23:04:54 UTC 2009

Definitely, my box is not good: I tested UNinstalling pgpool-II-2.1 from
the current servers that I'm using, and INstalled the latest CVS version
(I got it today). Everything worked exactly as Marcelo said. Life is
good.

I have no idea what is wrong with my box, but exactly the same thing
happened when I transitioned from 2.0.1 to 2.1: I had to reimage. The
uninstallation process would not work there, I don't know why. I did this
same uninstallation process on the servers I'm working on, and everything
is working.

Thanks a lot, Marcelo, for enlightening me.

Daniel

> -----Original Message-----
> From: pgpool-general-bounces at pgfoundry.org 
> [mailto:pgpool-general-bounces at pgfoundry.org] On Behalf Of 
> Daniel.Crespo at l-3com.com
> Sent: Wednesday, January 21, 2009 1:33 PM
> To: Marcelo Martins
> Cc: pgpool-general at pgfoundry.org
> Subject: Re: [Pgpool-general] pcp_attach_node problem?
> 
> 
> Thanks for your response, I really appreciate it.
> 
> > First, I don't really agree with just attaching a node back into the
> > pool in the manner you are doing with the steps shown below. If a
> > PostgreSQL backend node goes down, for some reason out of anyone's
> > control, you should bring that node back into the pool by using
> > online_recovery; that's why that mechanism is in place.
> >
> > Now there are times that we may need to purposely take one of the
> > PostgreSQL backend nodes down (I agree on that), but when that is the
> > case one should have some maintenance procedures in place. There are
> > several scenarios, though, depending on your setup. You may need to
> > keep your environment in read/write mode at all times, which means you
> > would use the pcp utilities to detach the PG node, do whatever you
> > need to do and then use the pcp online recovery to bring that node
> > back into the pool (not pcp attach).
> > If you happen to be able to have your environment in read-only mode
> > then you could use the pcp detach to take the backend node out of the
> > pool and then use pcp attach to bring that node back into the pool.
> 
> I understand your point and that's what I think too. But my example only
> shows unit testing.
> 
> My real case is as follows:
> 
> I have either a 2-server or a 4-server configuration.
>
> 2-server configuration:
> Application and DB run in each server
>
> 4-server configuration:
> Application runs in two of the servers
> DB runs in the other two servers.
>
> Pgpool-II would run only in the servers where the application is running
> (a total of 2), but only one application would be active at a time. The
> applications would always connect to localhost port 9999.
> 
> In any case, when we are installing the applications and DBs, it's
> always done one at a time (this is the procedure and can not currently
> be changed).
> 
> The worst-case scenario for pgpool is at installation time with the
> 2-server setup:
> 1. Install first server (App & pgpool and DB)
> 2. Install second server (App & pgpool and DB)
> 
> For changes to take effect, the installation reboots the server (don't
> ask me... It's the way it has been and takes a lot of time/money to
> replace this procedure). So imagine it:
> At the end of step 1, the system reboots. When it comes up, only the
> first of the two servers is up; the other one does not even have an IP
> address set. Pgpool starts and sees that there's no secondary database.
> With failover_command I trigger a script that looks for
> availability of the secondary database.
> At the end of step 2, after rebooting, the secondary server is up and
> running. Its pgpool will successfully connect to both databases since
> the first one is already up. However, the script running in the primary
> server detects that the secondary database is running (I check for
> specific tables in the database, so I know it's up and ready for running
> application requests). If specific data in the tables is not the same
> between the primary and secondary databases for any reason, I will do
> *manual* pcp recovery; otherwise (which is the most likely case at
> installation time, since both databases have just been installed and
> should have the same data), I do pcp attach.
> 
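The attach-or-recover decision described above could be sketched roughly as follows. This is only an illustration, not the actual script: the row fingerprinting, the function names, and the defaults (PCP port 9898, user "pcpuser", the password placeholder) are assumptions, and the positional argument order for the pcp tools is the one I recall from the 2.x-era documentation, so check it against your installed version. The sketch only builds the command line; it does not run anything.

```python
import hashlib
from typing import Iterable, List, Tuple

Row = Tuple[object, ...]


def fingerprint(rows: Iterable[Row]) -> str:
    """Order-independent digest of the rows fetched from one check table."""
    digest = hashlib.sha256()
    for line in sorted(repr(tuple(row)) for row in rows):
        digest.update(line.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


def choose_action(primary_rows: Iterable[Row], secondary_rows: Iterable[Row]) -> str:
    """Return "attach" when both databases hold the same data, else "recovery"."""
    if fingerprint(primary_rows) == fingerprint(secondary_rows):
        return "attach"
    return "recovery"


def pcp_command(action: str, node_id: int, host: str = "localhost",
                port: int = 9898, user: str = "pcpuser",
                password: str = "CHANGE_ME", timeout: int = 10) -> List[str]:
    """Build (not run) the pcp command line; argument order is an assumption."""
    tool = {"attach": "pcp_attach_node", "recovery": "pcp_recovery_node"}[action]
    return [tool, str(timeout), host, str(port), user, password, str(node_id)]


if __name__ == "__main__":
    primary = [(1, "alpha"), (2, "beta")]
    secondary = [(2, "beta"), (1, "alpha")]  # same data, different order
    action = choose_action(primary, secondary)
    print(action, pcp_command(action, node_id=1))
```

In the procedure described here the "recovery" branch stays manual, so a real script would probably just log or report that outcome rather than launch pcp_recovery_node on its own.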
> Why don't I do pcp recovery in all cases? Because pcp recovery requires
> no connections from the application at the second stage of the recovery.
> With the release that is working for me (2.1) I can not disconnect
> clients at second stage only (using client_idle_limit_in_recovery in the
> latest copy of pgpool-II), so I need to close the application on
> purpose. Therefore, I need manual recovery. In this regard, I'm going to
> re-image my development box and install a fresh latest CVS version of
> pgpool-II, because something funny like this happened when I went from
> 2.0.1 to 2.1, so... No clue. The thing is that I'm running out of time.
> 
> In conclusion, it should not behave the way it does when I disconnect a
> backend and do pcp attach after that.
> 
> > I have downloaded the latest CVS version and tried the following a few
> > times and did not see any issues.
> 
> I'll push very hard to use it, starting with re-imaging my box.
> 
> > On your last step though, you mentioned that you "re-attached the  
> > primary" backend but I guess you meant the secondary backend since  
> > that was the one you stopped.
> 
> Yes, you are right: I meant 'secondary'.
> 
> > Marcelo
> > PostgreSQL DBA
> > Linux/Solaris System Administrator
> 
> Thanks, Marcelo
> 
> Daniel
> 
> > 
> > On Jan 20, 2009, at 5:46 PM, Daniel.Crespo at l-3com.com wrote:
> > 
> > > I think the patch is for debugging purposes, but I'm not sure.
> > >
> > > The weird thing that happens to me is the following (I just tested it
> > > again):
> > >
> > > 1. The two backends start
> > > 2. start pgpool. So both backend statuses are 2.
> > > 3.a stop primary backend,
> > >    The connection is lost with the message "server closed the
> > > connection unexpectedly
> > >        This probably means the server terminated abnormally
> > >        before or while processing the request.
> > > The connection to the server was lost. Attempting reset: Succeeded.",
> > > every time I try to re-run the query.
> > >    If I re-attach the primary backend, the connection works just fine
> > > again.
> > > 3.b stop secondary backend.
> > >    The connection keeps going (good).
> > >    If I re-attach the primary backend, the connection blocks.
> > >
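For reference while reading the steps above, the backend status numbers could be modelled like this. The thread itself only shows status 2 after both backends start and status 3 for a slot that is skipped; the meanings of 0 and 1 are as I understand them from the pcp_node_info documentation, and the enum and helper names are made up for illustration.

```python
from enum import IntEnum


class BackendStatus(IntEnum):
    """Backend status values as reported by pgpool-II (names are illustrative)."""
    UNUSED = 0      # only seen during initialization
    UP_NO_CONN = 1  # node is up, no pooled connections yet
    UP_POOLED = 2   # node is up, connections are pooled
    DOWN = 3        # node is down or detached


def usable(status: int) -> bool:
    """True when pgpool would still route connections to the backend."""
    return BackendStatus(status) in (BackendStatus.UP_NO_CONN,
                                     BackendStatus.UP_POOLED)


if __name__ == "__main__":
    for value in (2, 3):
        print(value, BackendStatus(value).name, usable(value))
```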
> > > It's weird
> > >
> > > Daniel
> > >
> > >
> > >> -----Original Message-----
> > >> From: Marcelo Martins [mailto:pglists at zeroaccess.org]
> > >> Sent: Tuesday, January 20, 2009 6:03 PM
> > >> To: Crespo, Daniel @ SDS
> > >> Cc: pgpool-general at pgfoundry.org
> > >> Subject: Re: [Pgpool-general] pcp_attach_node problem?
> > >>
> > >> yeah just saw your new one when sent mine :)
> > >>
> > >> weird  that it just keeps throwing that error.
> > >> I think I have done the PG shutdown and then re-attaching about 15
> > >> times now and I only get the "server closed the connection
> > >> unexpectedly" once.
> > >>
> > >> I haven't tried to apply the patch that Tatsuo mentioned on 18th
> > >> though to see what difference it makes. might try that today
> > >>
> > >>
> > >> Marcelo
> > >> PostgreSQL DBA
> > >> Linux/Solaris System Administrator
> > >>
> > >> On Jan 20, 2009, at 4:52 PM, Daniel.Crespo at l-3com.com wrote:
> > >>
> > >>> Hi, Marcelo,
> > >>>
> > >>> I just wrote to the mail list something about exactly this.
> > >>>
> > >>> In your description, it doesn't happen to me... I don't know why...
> > >>> After doing failover, when a query is executed it throws back that
> > >>> "server closed the connection unexpectedly", and keeps throwing that
> > >>> for every try I make. No idea about this.
> > >>>
> > >>> Thanks for the information!
> > >>>
> > >>> Daniel
> > >>>
> > >>>> -----Original Message-----
> > >>>> From: Marcelo Martins [mailto:pglists at zeroaccess.org]
> > >>>> Sent: Tuesday, January 20, 2009 5:34 PM
> > >>>> To: Crespo, Daniel @ SDS
> > >>>> Subject: Re: [Pgpool-general] pcp_attach_node problem?
> > >>>>
> > >>>> Hi Daniel,
> > >>>>
> > >>>> I have just tested that with pgpool 2.1 and I also have the same
> > >>>> issue. When I re-attach node 1 (second node) back, the psql
> > >>>> connection that I had opened hangs after executing a second query.
> > >>>>
> > >>>> ERROR: pid 31003: pool_read2: EOF encountered with backend
> > >>>>
> > >>>> On the latest CVS version though the hanging issue seems to be fixed.
> > >>>> Now when the failover/failback happens, though, it seems like the
> > >>>> pgpool failover_handler process kills the children that pgpool had
> > >>>> open with node 1 (second node, at least that is what I can tell from
> > >>>> what I see), therefore when a query is executed it throws back that
> > >>>> "server closed the connection unexpectedly". When I execute the query
> > >>>> a second time, pgpool uses a new child that has a connection opened
> > >>>> to node 0: "new_connection: skipping slot 1 because
> > >>>> backend_status = 3"
> > >>>>
> > >>>>
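Given the behaviour described above (the first query after a failover/failback fails, the second one succeeds on a new child), a client could retry once on a fresh session. The following is a minimal sketch with stand-ins only: ConnectionLost, FlakyPool and run_with_reconnect are hypothetical names, not part of pgpool-II or any driver, and a real application would catch its own driver's connection error instead.

```python
from typing import Callable


class ConnectionLost(Exception):
    """Stand-in for a driver error such as 'server closed the connection'."""


def run_with_reconnect(connect: Callable[[], Callable[[str], object]],
                       query: str, attempts: int = 2) -> object:
    """Run the query, opening a fresh session for each attempt."""
    last_error = None
    for _ in range(attempts):
        execute = connect()  # new session, so a new pgpool child is used
        try:
            return execute(query)
        except ConnectionLost as exc:
            last_error = exc
    raise RuntimeError(f"query failed after {attempts} attempts") from last_error


class FlakyPool:
    """Fake pool whose first session is dead, like a child killed on failover."""

    def __init__(self) -> None:
        self.sessions = 0

    def connect(self) -> Callable[[str], object]:
        self.sessions += 1
        session_number = self.sessions

        def execute(query: str) -> object:
            if session_number == 1:
                raise ConnectionLost("server closed the connection unexpectedly")
            return 1  # pretend result of "select 1"

        return execute


if __name__ == "__main__":
    pool = FlakyPool()
    print(run_with_reconnect(pool.connect, "select 1"), pool.sessions)
```

This only papers over the single dropped connection; it does not help with the blocking case, where the session hangs instead of failing.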
> > >>>> Marcelo
> > >>>> PostgreSQL DBA
> > >>>> Linux/Solaris System Administrator
> > >>>>
> > >>>> On Jan 13, 2009, at 8:18 AM, Daniel.Crespo at l-3com.com wrote:
> > >>>>
> > >>>>> Sorry for the delay, I haven't had enough time.
> > >>>>>
> > >>>>>> 1. Show us the logs. Full logs, but only the relevant parts (got
> > >>>>>> tons of things to read every day here). :)
> > >>>>>
> > >>>>> I'll try it again with full logs to give them to you guys
> > >>>>>
> > >>>>>> 2. Check whether PostgreSQL is having some problem of some sort
> > >>>>>> before blaming it on pgpool-II. Can you run the same queries on
> > >>>>>> both nodes and get the same results?
> > >>>>>
> > >>>>> PostgreSQL is not having any problems. It's not a query problem.
> > >>>>> When I install the latest CVS head, what I showed to you is what
> > >>>>> happens. However, when I uninstall it and install the 2.1 released
> > >>>>> version, it doesn't happen anymore. The problem with this 2.1
> > >>>>> release is that it doesn't keep the connection when a node is
> > >>>>> detached or attached (if I have an already opened connection and do
> > >>>>> attach/detach node, it locks. I must disconnect and reconnect in
> > >>>>> order to keep doing queries). Another problem is that I need the
> > >>>>> newly introduced insert lock to apply automatically to tables with
> > >>>>> serial fields.
> > >>>>>
> > >>>>>> 3. Check permissions in both pg_hba.conf files.
> > >>>>> No problem with this.
> > >>>>>
> > >>>>>> 4. Have you considered using version 8.3.5 of PostgreSQL and see
> > >>>>>> how it goes? Or at least, the last revision of the 8.1 branch.
> > >>>>> No. I can not update PostgreSQL. I'm using 8.2.1.
> > >>>>>
> > >>>>> When I have the logs, I'll post them for sure. Thanks!
> > >>>>>
> > >>>>> Daniel
> > >>>>>
> > >>>>>
> > >>>>>> -----Original Message-----
> > >>>>>> From: pgpool-general-bounces at pgfoundry.org
> > >>>>>> [mailto:pgpool-general-bounces at pgfoundry.org] On Behalf Of
> > >>>>>> Jaume Sabater
> > >>>>>> Sent: Friday, January 09, 2009 2:32 AM
> > >>>>>> To: pgpool-general at pgfoundry.org
> > >>>>>> Subject: Re: [Pgpool-general] pcp_attach_node problem?
> > >>>>>>
> > >>>>>> On Thu, Jan 8, 2009 at 10:14 PM,
> > >> <Daniel.Crespo at l-3com.com> wrote:
> > >>>>>>
> > >>>>>>>     And issue a SQL Select command on a table, like:
> > >>>>>>>         postgres=# select * from pg_stat_activity ;
> > >>>>>>>
> > >>>>>>> It returns:
> > >>>>>>> postgres=# select 1;
> > >>>>>>> server closed the connection unexpectedly
> > >>>>>>>     This probably means the server terminated abnormally
> > >>>>>>>     before or while processing the request.
> > >>>>>>> The connection to the server was lost. Attempting reset: Succeeded.
> > >>>>>>>
> > >>>>>>> postgres=# select 1;
> > >>>>>>
> > >>>>>> Some ideas:
> > >>>>>>
> > >>>>>> 1. Show us the logs. Full logs, but only the relevant parts (got
> > >>>>>> tons of things to read every day here). :)
> > >>>>>> 2. Check whether PostgreSQL is having some problem of some sort
> > >>>>>> before blaming it on pgpool-II. Can you run the same queries on
> > >>>>>> both nodes and get the same results?
> > >>>>>> 3. Check permissions in both pg_hba.conf files.
> > >>>>>> 4. Have you considered using version 8.3.5 of PostgreSQL and see
> > >>>>>> how it goes? Or at least, the last revision of the 8.1 branch.
> > >>>>>>
> > >>>>>> -- 
> > >>>>>> Jaume Sabater
> > >>>>>> http://linuxsilo.net/
> > >>>>>>
> > >>>>>> "Ubi sapientas ibi libertas"
> > >>>>>> _______________________________________________
> > >>>>>> Pgpool-general mailing list
> > >>>>>> Pgpool-general at pgfoundry.org
> > >>>>>> http://pgfoundry.org/mailman/listinfo/pgpool-general
> > >>>>>>
> > >>>>
> > >>>>
> > >>
> > >>
> > 
> > 