[Pgpool-general] SQL sessions die when a node fails

Tatsuo Ishii ishii at sraoss.co.jp
Mon Jul 13 00:28:38 UTC 2009


> On Mon, Jul 6, 2009 at 3:22 AM, Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
> 
> > > I am studying the source code of pgpool trying to add such a
> > functionality
> > > through an option in pgpool.conf (failover_command = false). If this
> > option
> > > = true, in the case that a node was degraded, anyways continue to execute
> > > queries on still alive connections to still alive nodes.. This would
> > > guarantee applications yet to work even if one db node falls..
> > >
> > > But, first, I have to ask again: was not that functionality included in
> > > pgpool yet?
> > >
> > > Also, I ask you for help with this task, as I am not very much skilled
> > with
> > > developing, and there are many things that I dont yet understand in the
> > > code. Some guidelines would save me lots of time.. So, I will subscribe
> > to
> > > pgpool-hackers mail list..
> > > Is it a really difficult feature to implement?
> >
> > Yes, it's pretty hard. The reason includes:
> >
> > - Any query involves number of interactions against PostgreSQL. If you
> >  want to restart the session, you need to remember which data has
> >  been already sent to client and which data has not.
> >
> 
> In case described, we do not want to RESTART the session but rather continue
> the session with the functioning (not failed)  postgress nodes.
> 
> So unless we miss something, we are not clear as to what you refer to by
> "restart the session" - why restart, and which session? (client to pgpool?
> pgpool to active nodes? pgpool to failed nodes?)

1) PostgreSQL fails and stops working while returning a query result
2) pgpool detects #1 and tries to continue the session between client
   and pgpool
3) #2 is difficult because of the reason I mentioned

> > - Degeneration can be happen by just network cable pulled out. In this
> >  case pgpool can detect the failure by using timeout and starts
> >  degeration. However the kernel keeps on retrying with a hope the
> >  cable reconnected again. The only thing we can do is just restart
> >  the pgpool process.
> >
> 
> Wouldn't just closing the failed socket be enough? We do not expect pgpool
> to recover  the failed node automatically if and when it comes back - it's
> enough if the recovery shall be invoked by an admin using the exisiting
> manual scripting.

Is it possible to close the socket while the kernel is retrying? I
doubt.
--
Tatsuo Ishii
SRA OSS, Inc. Japan

> > Any ideas to solve these issues are welcome.
> > --
> > Tatsuo Ishii
> > SRA OSS, Inc. Japan
> >
> > > Thanks to Tatsuo and his team for this art of coding piece called pgpool!
> > :)
> > > Sincerely,
> > > J. Carlos Muro
> > >
> > > 2009/6/26 Jeroen Nijhof <jeroenn at sara.nl>
> > >
> > > > Hi,
> > > >
> > > > If I may suggest, is it possible to let the sql session to pgpool still
> > > > alive and only close the sql session to the failed node behind pgpool.
> > In
> > > > that way the client won't notice a postgresql server failed.
> > > >
> > > > Regards,
> > > >
> > > > --
> > > > Jeroen Nijhof
> > > > SARA ICT Services
> > > > Email: jeroen.nijhof at sara.nl
> > > > Telefoon: 036-5238080
> > > > Fax: 036-5238030
> > > >
> > > >
> > > > -----Original Message-----
> > > > From: pgpool-general-bounces at pgfoundry.org [mailto:
> > > > pgpool-general-bounces at pgfoundry.org] On Behalf Of Tatsuo Ishii
> > > > Sent: vrijdag 26 juni 2009 7:11
> > > > To: murojc at gmail.com
> > > > Cc: pgpool-general at pgfoundry.org
> > > > Subject: Re: [Pgpool-general] SQL sessions die when a node fails
> > > >
> > > > Yes, it's pretty normal. What behavior are you expecting?
> > > > --
> > > > Tatsuo Ishii
> > > > SRA OSS, Inc. Japan
> > > >
> > > > > Hi! If I shutdown a postgresql node while I am in a sql session, it
> > will
> > > > > stop working. Is that normal?
> > > > >
> > > > > I have the next scenario:
> > > > >   vapp1a: server with pgpool, listening port 5431
> > > > >   vpgsql1a: server with postgresql, backend 0
> > > > >   vpgsql1b: server with postgresql, backend 1
> > > > >
> > > > > Setups for pgpool:
> > > > >     replication_mode = true
> > > > >     load_balance_mode = true
> > > > >     replicate_select = true
> > > > >
> > > > > Pgpool start logs look this way:
> > > > >   Jun 25 18:31:49 vapp1a pgpool: 2009-06-25 18:31:41 DEBUG: pid
> > 30820:
> > > > > num_backends: 2 num_backends: 2 total_weight: 2.000000
> > > > >   Jun 25 18:31:49 vapp1a pgpool: 2009-06-25 18:31:41 DEBUG: pid
> > 30820:
> > > > > backend 0 weight: 1073741823.500000
> > > > >   Jun 25 18:31:49 vapp1a pgpool: 2009-06-25 18:31:41 DEBUG: pid
> > 30820:
> > > > > backend 1 weight: 1073741823.500000
> > > > >   [....]
> > > > >   Jun 25 18:31:51 app1a pgpool: 2009-06-25 18:31:42 LOG:   pid 30820:
> > > > pgpool
> > > > > successfully started
> > > > >   Jun 25 18:31:51 app1a pgpool: 2009-06-25 18:31:42 DEBUG: pid 30861:
> > I
> > > > am
> > > > > PCP 30861
> > > > >
> > > > > I log-in to the database through  pgpool and execute some inserts:
> > > > >   pgpool at vapp1:~$ psql -h localhost -p 5431 -U auser mybase
> > > > >   mybase=> create table test (id integer);
> > > > >   mybase=> insert into test (1);
> > > > >
> > > > > I see that row on both vpgsql1a and vpgsql1b. I won't exit the psql
> > > > session
> > > > > yet...
> > > > >
> > > > > Now I stop backend 1 and get the next pgpool logs:
> > > > >   Jun 25 18:36:58 vapp1a pgpool: 2009-06-25 18:36:58 DEBUG: pid
> > 30860:
> > > > > detect_error: kind: E
> > > > >   Jun 25 18:36:58 vapp1a pgpool: 2009-06-25 18:36:58 DEBUG: pid
> > 30860:
> > > > > detect_stop_postmaster_error: receive admin shutdown error from a
> > node.
> > > > >   Jun 25 18:36:58 vapp1a pgpool: 2009-06-25 18:36:58 LOG:   pid
> > 30860:
> > > > > notice_backend_error: 1 fail over request from pid 30860
> > > > >   Jun 25 18:36:58 vapp1a pgpool: 2009-06-25 18:36:58 DEBUG: pid
> > 30820:
> > > > > failover_handler called
> > > > >   Jun 25 18:36:58 vapp1a pgpool: 2009-06-25 18:36:58 DEBUG: pid
> > 30820:
> > > > > failover_handler: starting to select new master node
> > > > >   Jun 25 18:36:59 vapp1a pgpool: 2009-06-25 18:36:58 LOG:   pid
> > 30820:
> > > > > starting degeneration. shutdown host vpgsql1b(5432)
> > > > >   Jun 25 18:36:59 vapp1a pgpool: 2009-06-25 18:36:58 DEBUG: pid
> > 30820:
> > > > > failover_handler: kill 30829
> > > > >   [....]
> > > > >   Jun 25 18:37:00 vapp1a pgpool: 2009-06-25 18:36:58 LOG:   pid
> > 30820:
> > > > > failover_handler: set new master node: 0
> > > > >   Jun 25 18:37:00 vapp1a pgpool: 2009-06-25 18:36:58 DEBUG: pid
> > 30852:
> > > > child
> > > > > receives shutdown request signal 3
> > > > >   [....]
> > > > >   Jun 25 18:37:04 vapp1a pgpool: 2009-06-25 18:36:58 LOG:   pid
> > 30820:
> > > > > failover done. shutdown host vpgsql1b(5432)
> > > > >   Jun 25 18:37:04 vapp1a pgpool: 2009-06-25 18:36:58 DEBUG: pid
> > 30820:
> > > > > reap_handler called
> > > > >   Jun 25 18:37:04 vapp1a pgpool: 2009-06-25 18:36:58 DEBUG: pid
> > 30820:
> > > > > reap_handler: call wait3
> > > > >   Jun 25 18:37:04 vapp1a pgpool: 2009-06-25 18:36:58 DEBUG: pid
> > 30820:
> > > > child
> > > > > 30829 exits with status 0 by signal 0
> > > > >   [....]
> > > > >   Jun 25 18:37:06 vapp1a pgpool: 2009-06-25 18:36:58 DEBUG: pid
> > 30820:
> > > > > reap_handler: normally exited
> > > > >
> > > > > In the psql session I try to insert a new row, getting:
> > > > >   mybase=> insert into kk values (2);
> > > > >   server closed the connection unexpectedly
> > > > >           This probably means the server terminated abnormally
> > > > >           before or while processing the request.
> > > > >   The connection to the server was lost. Attempting reset: Succeeded.
> > > > >
> > > > > Now I exit psql, and log-in again, now successfully inserting a new
> > row:
> > > > >   pgpool at vapp1:~$ psql -h localhost -p 5431 -U auser mybase
> > > > >   mybase=> create table test (id integer);
> > > > >   mybase=> insert into test (2);
> > > > >   INSERT 0 1
> > > > >
> > > > > Is that normal?
> > > > > Thanks in advance,
> > > > > J. Carlos Muro
> > > > _______________________________________________
> > > > Pgpool-general mailing list
> > > > Pgpool-general at pgfoundry.org
> > > > http://pgfoundry.org/mailman/listinfo/pgpool-general
> > > >
> > _______________________________________________
> > Pgpool-general mailing list
> > Pgpool-general at pgfoundry.org
> > http://pgfoundry.org/mailman/listinfo/pgpool-general
> >


More information about the Pgpool-general mailing list