[Pgpool-hackers] [Pgpool-general] SQL sessions die when a node fails

Tue Jul 7 08:56:36 UTC 2009

Hi! I will redirect this thread from now and on to pgpool-hackers
...<pgpool-hackers at pgfoundry.org>

2009/7/6 Tatsuo Ishii <ishii at sraoss.co.jp>

> > Is it a really difficult feature to implement?
>
> Yes, it's pretty hard. The reason includes:

I got it ;) I am anyways yet studying the code... Almost done with
multitasking/frontend/backend magement stuff.. After many years of no Unix
programming (and almost no programming at all), it was nice to come back to
battle.. :)
I dont know yet how, but I will be glad to contribute somehow to the project
if my time and skill allow me to do..

- Any query involves number of interactions against PostgreSQL. If you
>  want to restart the session, you need to remember which data has
>  been already sent to client and which data has not.

Yes, it is still difficult for me.. Well, let's see what future says..

- Degeneration can be happen by just network cable pulled out. In this
>  case pgpool can detect the failure by using timeout and starts
>  degeration. However the kernel keeps on retrying with a hope the
>  cable reconnected again. The only thing we can do is just restart
>  the pgpool process.

Well, pgpool should not have to be restarted unless there are "hard"
configuration changes or system failure (there where pgpool is running).
But, maybe you mean, in case that pgpool detects that a dead backend came
back to life, it will make the child processes die and start new ones (in
this case pgpool does it by itself)? Or do you mean that when a dead backend
came back to life we have to restart the whole pgpool with the intervention
of an administrator..?

So, to make me better understand: could we say that degeneration is executed
by "if (failover_request) failover();"? Can we say that when pgpool detects
wrong conditions in a certain backend (say, in the middel of a query) it
throws the USR1 signal to make the kernel failover, then degenerate faultry
backend?

Thanks!
J. Carlos Muro

> > Thanks to Tatsuo and his team for this art of coding piece called pgpool!
> :)
> > Sincerely,
> > J. Carlos Muro
> >
> > 2009/6/26 Jeroen Nijhof <jeroenn at sara.nl>
> >
> > > Hi,
> > >
> > > If I may suggest, is it possible to let the sql session to pgpool still
> > > alive and only close the sql session to the failed node behind pgpool.
> In
> > > that way the client won't notice a postgresql server failed.
> > >
> > > Regards,
> > >
> > > --
> > > Jeroen Nijhof
> > > SARA ICT Services
> > > Email: jeroen.nijhof at sara.nl
> > > Telefoon: 036-5238080
> > > Fax: 036-5238030
> > >
> > >
> > > -----Original Message-----
> > > From: pgpool-general-bounces at pgfoundry.org [mailto:
> > > pgpool-general-bounces at pgfoundry.org] On Behalf Of Tatsuo Ishii
> > > Sent: vrijdag 26 juni 2009 7:11
> > > To: murojc at gmail.com
> > > Cc: pgpool-general at pgfoundry.org
> > > Subject: Re: [Pgpool-general] SQL sessions die when a node fails
> > >
> > > Yes, it's pretty normal. What behavior are you expecting?
> > > --
> > > Tatsuo Ishii
> > > SRA OSS, Inc. Japan
> > >
> > > > Hi! If I shutdown a postgresql node while I am in a sql session, it
> will
> > > > stop working. Is that normal?
> > > >
> > > > I have the next scenario:
> > > >   vapp1a: server with pgpool, listening port 5431
> > > >   vpgsql1a: server with postgresql, backend 0
> > > >   vpgsql1b: server with postgresql, backend 1
> > > >
> > > > Setups for pgpool:
> > > >     replication_mode = true
> > > >     load_balance_mode = true
> > > >     replicate_select = true
> > > >
> > > > Pgpool start logs look this way:
> > > >   Jun 25 18:31:49 vapp1a pgpool: 2009-06-25 18:31:41 DEBUG: pid
> 30820:
> > > > num_backends: 2 num_backends: 2 total_weight: 2.000000
> > > >   Jun 25 18:31:49 vapp1a pgpool: 2009-06-25 18:31:41 DEBUG: pid
> 30820:
> > > > backend 0 weight: 1073741823.500000
> > > >   Jun 25 18:31:49 vapp1a pgpool: 2009-06-25 18:31:41 DEBUG: pid
> 30820:
> > > > backend 1 weight: 1073741823.500000
> > > >   [....]
> > > >   Jun 25 18:31:51 app1a pgpool: 2009-06-25 18:31:42 LOG:   pid 30820:
> > > pgpool
> > > > successfully started
> > > >   Jun 25 18:31:51 app1a pgpool: 2009-06-25 18:31:42 DEBUG: pid 30861:
> I
> > > am
> > > > PCP 30861
> > > >
> > > > I log-in to the database through  pgpool and execute some inserts:
> > > >   pgpool at vapp1:~$ psql -h localhost -p 5431 -U auser mybase
> > > >   mybase=> create table test (id integer);
> > > >   mybase=> insert into test (1);
> > > >
> > > > I see that row on both vpgsql1a and vpgsql1b. I won't exit the psql
> > > session
> > > > yet...
> > > >
> > > > Now I stop backend 1 and get the next pgpool logs:
> > > >   Jun 25 18:36:58 vapp1a pgpool: 2009-06-25 18:36:58 DEBUG: pid
> 30860:
> > > > detect_error: kind: E
> > > >   Jun 25 18:36:58 vapp1a pgpool: 2009-06-25 18:36:58 DEBUG: pid
> 30860:
> > > > detect_stop_postmaster_error: receive admin shutdown error from a
> node.
> > > >   Jun 25 18:36:58 vapp1a pgpool: 2009-06-25 18:36:58 LOG:   pid
> 30860:
> > > > notice_backend_error: 1 fail over request from pid 30860
> > > >   Jun 25 18:36:58 vapp1a pgpool: 2009-06-25 18:36:58 DEBUG: pid
> 30820:
> > > > failover_handler called
> > > >   Jun 25 18:36:58 vapp1a pgpool: 2009-06-25 18:36:58 DEBUG: pid
> 30820:
> > > > failover_handler: starting to select new master node
> > > >   Jun 25 18:36:59 vapp1a pgpool: 2009-06-25 18:36:58 LOG:   pid
> 30820:
> > > > starting degeneration. shutdown host vpgsql1b(5432)
> > > >   Jun 25 18:36:59 vapp1a pgpool: 2009-06-25 18:36:58 DEBUG: pid
> 30820:
> > > > failover_handler: kill 30829
> > > >   [....]
> > > >   Jun 25 18:37:00 vapp1a pgpool: 2009-06-25 18:36:58 LOG:   pid
> 30820:
> > > > failover_handler: set new master node: 0
> > > >   Jun 25 18:37:00 vapp1a pgpool: 2009-06-25 18:36:58 DEBUG: pid
> 30852:
> > > child
> > > > receives shutdown request signal 3
> > > >   [....]
> > > >   Jun 25 18:37:04 vapp1a pgpool: 2009-06-25 18:36:58 LOG:   pid
> 30820:
> > > > failover done. shutdown host vpgsql1b(5432)
> > > >   Jun 25 18:37:04 vapp1a pgpool: 2009-06-25 18:36:58 DEBUG: pid
> 30820:
> > > > reap_handler called
> > > >   Jun 25 18:37:04 vapp1a pgpool: 2009-06-25 18:36:58 DEBUG: pid
> 30820:
> > > > reap_handler: call wait3
> > > >   Jun 25 18:37:04 vapp1a pgpool: 2009-06-25 18:36:58 DEBUG: pid
> 30820:
> > > child
> > > > 30829 exits with status 0 by signal 0
> > > >   [....]
> > > >   Jun 25 18:37:06 vapp1a pgpool: 2009-06-25 18:36:58 DEBUG: pid
> 30820:
> > > > reap_handler: normally exited
> > > >
> > > > In the psql session I try to insert a new row, getting:
> > > >   mybase=> insert into kk values (2);
> > > >   server closed the connection unexpectedly
> > > >           This probably means the server terminated abnormally
> > > >           before or while processing the request.
> > > >   The connection to the server was lost. Attempting reset: Succeeded.
> > > >
> > > > Now I exit psql, and log-in again, now successfully inserting a new
> row:
> > > >   pgpool at vapp1:~$ psql -h localhost -p 5431 -U auser mybase
> > > >   mybase=> create table test (id integer);
> > > >   mybase=> insert into test (2);
> > > >   INSERT 0 1
> > > >
> > > > Is that normal?
> > > > Thanks in advance,
> > > > J. Carlos Muro
> > > _______________________________________________
> > > Pgpool-general mailing list
> > > Pgpool-general at pgfoundry.org
> > > http://pgfoundry.org/mailman/listinfo/pgpool-general
> > >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://pgfoundry.org/pipermail/pgpool-hackers/attachments/20090707/92781836/attachment.html>