[Pgpool-general] What are the causes of pgpool failover

Tue Dec 12 17:29:47 UTC 2006

> > Hi there!
> >
> > In my web server farm, I installed pgpool on each web server (32
> > servers in total) and configured it for replication without
> > load balancing.
> > I also turned on the health check function, checking every 5 seconds
> > with a timeout of 30 seconds.
>
> I suggest health_check_period > health_check_timeout.

I'll modify my configuration with your suggestion, thanks.  :)
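
A minimal sketch of how the rule suggested above could be checked on each
web server, assuming the settings are plain "name = value" lines as in the
appendix below. The helper names are hypothetical and not part of pgpool:

```python
import re


def read_int_settings(conf_text):
    """Collect `name = integer` lines from pgpool.conf-style text, ignoring comments."""
    settings = {}
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        match = re.match(r"^(\w+)\s*=\s*'?(\d+)'?$", line)
        if match:
            settings[match.group(1)] = int(match.group(2))
    return settings


def health_check_ok(settings):
    """True when health_check_period is larger than health_check_timeout."""
    return settings["health_check_period"] > settings["health_check_timeout"]


# The values from the appendix of this mail: period 5, timeout 30.
current = read_int_settings("health_check_timeout = 30\nhealth_check_period = 5\n")
print(health_check_ok(current))  # False
```

With my current values the check fails, which matches the suggestion to
make the period larger than the timeout.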

>
> > The web server farm has run for more than 2 months, and it seems
> > everything is OK, but I'm still worried that the pgpool health check
> > status is different at every web server.
> > I ran a test program and found that the pgpool health check status
> > always remains the same on every web server.  :)
>
> Can you be more specific about "the pgpool health check status is
> different at every web server"? Maybe you find something regarding
> health checking in the pgpool's log file?

The pgpool instances on each web server are running fine; I just want to
know about possible problems in advance.

What I mean is:
Is it possible for just one (or some) of the pgpools to turn to degeneration mode?

>
> > Here, I'd like to raise three questions.
> > Firstly, I want to know what causes pgpool to turn to degeneration mode.
> > Maybe I can do something to monitor those causes on each server.
>
> Needless to say, if health checking detects something wrong, it causes
> the degeneration.
> The second one is pgpool detecting network errors while .
> Unless you turn replication_stop_on_mismatch to true, that's all.

The default value of replication_stop_on_mismatch is false, and
http://pgfoundry.org/pipermail/pgpool-general/2005-August/000137.html
points out the reason.
In other words, pgpool's data mismatch detection may raise a false
alarm even when the data is consistent between the two databases, right?

And, in my setup, which value is better: true or false?

>
> > The second question is how many postgresql's folders/files must be
> > synchronized after degeneration mode?
> > All the data folders/files ? Or just some specific folders/files?
>
> Theoretically any file related to your queries in the database cluster
> (i.e. under "data" directory). For example, if you are going to make a
> query for table foo in DB bar, you need to sync data/base/bar/foo and
> foo's index file. Moreover system catalogs related to foo must be in
> sync. But I don't think the above is doable. A more realistic way is to
> sync the tables you are interested in by using PostgreSQL's standard
> tools such as COPY. If you are not sure which tables you are going to
> use, I recommend syncing everything under the data directory. This is
> the safest way.

I use rsync to synchronize the whole data directory between two
databases in my recovery program.
I also ran the recovery program in my test environment, and everything
seems to be OK.   :)
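
A rough sketch of the rsync step, not my actual recovery program. The
paths and the host name "db2" are hypothetical, and it assumes PostgreSQL
is stopped on both servers while the files are copied, since a file-level
copy of a running cluster is not consistent. It only builds the command
line and does not run it:

```python
import shlex


def build_rsync_command(data_dir, target_host, target_dir):
    """Build an rsync command that mirrors the whole data directory.

    -a keeps permissions, ownership and timestamps; --delete removes files
    on the target that no longer exist on the source. The trailing slash on
    the source makes rsync copy the directory contents, not the directory.
    """
    source = data_dir.rstrip("/") + "/"
    destination = f"{target_host}:{target_dir.rstrip('/')}/"
    return ["rsync", "-a", "--delete", source, destination]


# Hypothetical paths and host name, for illustration only.
cmd = build_rsync_command("/var/lib/pgsql/data", "db2", "/var/lib/pgsql/data")
print(shlex.join(cmd))
```

The list form can be passed to subprocess.run() once the servers are
confirmed to be stopped.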

>
> > The third question is as follows.
> > I once had the two db have different data on purpose, and the pgpool
> > notified me that the data were mismatched between two db servers, and
> > interrupted the request. (1) How does the pgpool check it?
> > (2) I also want to know if the check function is trustworthy. Because,
> > even if some pgpools turn to degeneration mode while others do not,
> > when pgpool tells me the data are mismatched, I can still perform
> > the failback procedure immediately.
>
> pgpool detects data mismatch by inspecting "packet kind".
>
> For SELECT, data returned by PostgreSQL consists of a stream of
> packets. Each packet corresponds to a row, and the "final" packet
> follows. If the number of rows returned from each server differs,
> pgpool will receive the final packet from one server while the other
> server is still sending data packets.
>
> Another case is the result of the query. The result of a
> query is represented by packet kind. If one server returns the normal
> packet while the other returns the error packet, pgpool will detect
> data mismatch.
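
To check my understanding of the explanation above, here is a small
sketch of the idea. It is not pgpool's actual code. It assumes each
server's reply is reduced to a sequence of one-letter packet kinds, using
the PostgreSQL protocol version 3 letters 'D' (data row), 'C' (command
complete, the "final" packet) and 'E' (error). The function name is
hypothetical:

```python
from itertools import zip_longest


def first_kind_mismatch(master_kinds, secondary_kinds):
    """Return (position, master_kind, secondary_kind) for the first packet
    whose kind differs between the two servers, or None if all kinds match.
    A server that runs out of packets early is reported as None."""
    pairs = zip_longest(master_kinds, secondary_kinds)
    for position, (master, secondary) in enumerate(pairs):
        if master != secondary:
            return position, master, secondary
    return None


# Master returns 3 rows, secondary returns 2: the secondary sends its
# final packet while the master is still sending a data packet.
print(first_kind_mismatch("DDDC", "DDC"))  # (2, 'D', 'C')

# One server completes normally while the other returns an error.
print(first_kind_mismatch("C", "E"))  # (0, 'C', 'E')
```

Note that only the packet kinds are compared here, not the row contents,
so two servers returning the same number of rows with different values
would not be reported by this sketch.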

If I turn off the load-balance function in pgpool, SELECT statements are
sent only to the master server. Is the detection still workable?

>
> > Thanks a million for your help!
> >
> > =======
> > Appendix:
> > Some configuration modified by me as below:
> > num_init_children = 32
> > replication_mode = true
> > health_check_timeout = 30
> > health_check_period = 5
> > log_statement = true