[Pgpool-hackers] Major problem in find_primary_node()

Gilles Darold gilles.darold at dalibo.com
Thu Jan 20 10:08:49 UTC 2011


Le 20/01/2011 09:42, Tatsuo Ishii a écrit :
>> Hi,
>>
>> I've found something critical into the way that find_primary_node()
>> detect if a node is a standby host and by deduction the primary node in
>> streaming replication mode.
>>
>> When it call "SELECT pg_is_in_recovery() AND pgpool_walrecrunning()" it
>> always return false if pg_is_in_recovery = 'true' and
>> pgpool_walrecrunning = 'false'. That mean that the host can be promoted
>> as master, the situation will going worse if a failover occurs and
>> PgPool takes this fake host as the master.
>>
>> This is not a problem that can occurs easily but let check the following:
>>
>> - Master has failover to node 1.
>> - Node 0 is reconstructed but the primary_conninfo host in recovery.conf
>> is wrong/broken for some reason.
>>
>> At this stage, both nodes are running with PgPool status 2 and currently
>> the only problem is that node 0 is not replicated from the master node.
> This is not what observe. In this case I see node 0 is in status
> 3(down). If conninfo is broken in recovery.conf, postmaster won't
> start up and keeps on saying "the database system is starting
> up". What did you make a mistake in recovery.conf exactly?
I know that the problem is difficult to reproduce, especially if every
administration task is well done.

What you can try is to reconstruct Node 0 with a valid recovery.conf,
then status of the node will be 2, PgPool will know it's a standby
server, node 1 is the master host and all it's ok at this point.

Next step, just modify the recovery.conf of node 0 with a wrong ip
address/hostname and then restart PostgreSQL. Node 0 in pool_status
should stay in state 2 as secondary node which is still right.

Then run the the SQL query"SELECT pg_is_in_recovery() AND
pgpool_walrecrunning();" in psql in node 0, you will see that it detect
that this node is a master (by returning false) and not a standby
server. So restarting PgPool without the pgpool_status here will break
everything.

Note that you don't need to to a failover, you can simply change the
recovery.conf on any secondary node, restart the PostgreSQL daemon and
the run the SQL query on it. The final case problem can be easily deduce
from here.

I'm agree that this is something that will not happen every day in real
world with the actual way of master promoting scheme (but it can happen).

Using "SELECT pg_is_in_recovery() = 'f' AND pgpool_walrecrunning() =
'f';" fix that problem because it really check if the node is not in
standby mode.

As I'm working on a patch to promote any node in the PgPool line as
master node, this problem can happen more often when PgPool restart has
it will not stop at the first node not in standby mode (which the actual
query can falsely introduce) but at any node in the line that is really
not in recovery mode.

The final goal is to promote the master that has the less replication
lags, so it can be any node in the line.

Hope it's clear enough now.
>> Then if a failover from node 1 occurs, PgPool will try to find a new
>> master by querying "SELECT pg_is_in_recovery() AND
>> pgpool_walrecrunning()" to all nodes, but as node 0 will return false
>> because the wal receiver is not running, PgPool will take it as the new
>> master, and very bad annoying things will happen from this point.
>>
>> What I suggest is to try to find the master by finding a node that is
>> not in recovery mode and where wal receiver is not running. This will
>> allow a simple fix by changing the SQL query :
>>
>> SELECT  pg_is_in_recovery() = 'f' AND pgpool_walrecrunning() = 'f';
>>   or
>> SELECT not pg_is_in_recovery() AND not pgpool_walrecrunning();
>>
>> What's your opinion, if every one is ok with that fix I already have a
>> patch for that.
>>
>> Regards,
>>
>> -- 
>> Gilles Darold
>> http://dalibo.com - http://dalibo.org
>>
>> _______________________________________________
>> Pgpool-hackers mailing list
>> Pgpool-hackers at pgfoundry.org
>> http://pgfoundry.org/mailman/listinfo/pgpool-hackers


-- 
Gilles Darold
http://dalibo.com - http://dalibo.org



More information about the Pgpool-hackers mailing list