[Pgpool-general] Detecting type of failure

Daniel.Crespo at l-3com.com Daniel.Crespo at l-3com.com
Fri Dec 19 20:32:26 UTC 2008


Hi,

I think many people will be interested in this, so I hope someone on
this list is as well and we can work out a solution together.

I'm trying to set up two servers, each running one instance of:
   - Custom Server Application (AP)
   - pgpool
   - postgresql

The diagram below shows the connections:

    Server A                   Server B
+--------------+           +--------------+
|     AP0      |           |     AP1      |
|      |       |           |      |       |
|      V       |           |      V       |
|   pgpool0----|---.   .---|---pgpool1    |
|      |       |    \ /    |      |       |
|      V       |     X     |      V       |
|     DB0 <----|----' '----|---> DB1      |
+--------------+           +--------------+
  172.10.10.2                172.10.10.3

In each pgpool.conf I have:

backend_hostname0 = '172.10.10.2'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/var/lib/pgsql/data'
backend_hostname1 = '172.10.10.3'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/var/lib/pgsql/data'


PROBLEM:
--------
The problem comes up during installation, since the servers must be
installed (at least in my case) one at a time.

Server A:
---------
When I install AP0, pgpool0 and DB0 on Server A, nothing is installed on
Server B yet. Therefore, pgpool0 on Server A starts with access to DB0
only.

#pcp_node_info 10 localhost 9898 postgres postgres 0
172.10.10.2 5432 2 1073741823.500000

#pcp_node_info 10 localhost 9898 postgres postgres 1
172.10.10.3 5432 3 1073741823.500000

Server B:
---------
When I install AP1, pgpool1 and DB1 on Server B, Server A is already up,
DB0 in particular. In this case, when the services start on Server B,
pgpool1 successfully connects to both DB0 and DB1.

#pcp_node_info 10 localhost 9898 postgres postgres 0
172.10.10.2 5432 2 1073741823.500000

#pcp_node_info 10 localhost 9898 postgres postgres 1
172.10.10.3 5432 2 1073741823.500000

The goal, of course, is to have both pgpool0 and pgpool1 connected to
both DB0 and DB1. By "connected" I mean that the "connection status" of
a backend is 0, 1, or 2. A connection status of 3 means "disconnected".
The status is the third field in the pcp_node_info output shown above.
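
To check the status from a script, the pcp_node_info output can be
parsed. Below is a minimal Python sketch, assuming the output is always
the four whitespace-separated fields seen in this post (host, port,
status, weight). The names NodeInfo, parse_node_info and is_connected
are made up for illustration.

```python
from typing import NamedTuple


class NodeInfo(NamedTuple):
    """One backend as reported by pcp_node_info (hypothetical container)."""
    host: str
    port: int
    status: int
    weight: float


def parse_node_info(line: str) -> NodeInfo:
    # Assumes exactly four whitespace-separated fields, as in the
    # output pasted in this post: host, port, status, weight.
    host, port, status, weight = line.split()
    return NodeInfo(host, int(port), int(status), float(weight))


def is_connected(info: NodeInfo) -> bool:
    # Status 0, 1, or 2 counts as connected; 3 means disconnected.
    return info.status in (0, 1, 2)
```

For the Server A output above, node 0 would be reported as connected
and node 1 as disconnected.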

Possible solutions:
-------------------
1. Install DB0 and DB1 first, then pgpool0 and pgpool1. The problem I
have is that I have to do a whole server first, then the other one.
2. After installing Server A and Server B, restart pgpool in Server A,
and therefore restart AP0 (if it does not handle reconnection).

Neither of these is good enough for me. What I want is to install
Server A, then Server B, and have everything connected properly. For
this, DB1 must be re-attached in pgpool0, but how can pgpool0 know when
to do it, and do it automatically?

An idea:
--------
Upon failover detection, pgpool triggers failover_command. This could be
used to start a script that does the following:

	Periodically check the failed backend (e.g. keep pinging it and,
	once it responds, try connecting to the database).

	If the failed backend is back (say, "select 1;" succeeds from
	*this* server)
	{
		Check whether the data is ok (e.g. compare count(*) on
		some indicative tables through the local pgpool
		connection against the same query on the failed backend).

		If *this* pgpool is the one where the active Server
		Application is running (e.g. AP0), then
		{
			If the data [from the check above] is ok
			{
				run pcp_attach_node to attach the failed
				node.
			}
			Otherwise
			{
				run pcp_recovery_node for the failed node.
				[Somehow] restart pgpool on the other
				server.
			}
		}
	}
	Otherwise, do nothing.
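
The decision part of this idea could be sketched in Python roughly as
follows. This is only an illustration under some assumptions: the three
boolean inputs are supposed to come from checks that are not shown here
(the ping/"select 1;" probe, the count(*) comparison, and knowing which
AP is active), and the PCP tools are assumed to take the same positional
arguments as the pcp_node_info calls earlier in this post (timeout,
host, port, user, password, node id). It uses pgpool-II's
pcp_attach_node and pcp_recovery_node tools. The names Action, decide
and pcp_command are made up.

```python
from enum import Enum
from typing import List, Optional


class Action(Enum):
    NOTHING = "nothing"
    ATTACH = "attach"
    RECOVER = "recover"


def decide(backend_alive: bool, data_ok: bool, is_active: bool) -> Action:
    # Act only if the failed backend answers again and this pgpool
    # serves the active application; otherwise leave things alone.
    if not backend_alive or not is_active:
        return Action.NOTHING
    # Consistent data: re-attach. Inconsistent data: run online recovery.
    return Action.ATTACH if data_ok else Action.RECOVER


def pcp_command(action: Action, node_id: int, timeout: int = 10,
                host: str = "localhost", port: int = 9898,
                user: str = "postgres",
                password: str = "postgres") -> Optional[List[str]]:
    # Argument order is an assumption, mirroring the pcp_node_info
    # invocations in this post.
    tools = {Action.ATTACH: "pcp_attach_node",
             Action.RECOVER: "pcp_recovery_node"}
    tool = tools.get(action)
    if tool is None:
        return None
    return [tool, str(timeout), host, str(port), user, password,
            str(node_id)]
```

The returned list could be handed to subprocess.run() by the monitoring
script. Restarting pgpool on the other server after a recovery is not
covered by this sketch.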

Any suggestions?

Thanks,
Daniel

