[Pgpool-hackers] Major problem in find_primary_node()

Tatsuo Ishii ishii at sraoss.co.jp
Fri Jan 21 00:10:52 UTC 2011


> Le 20/01/2011 09:42, Tatsuo Ishii a écrit :
>>> Hi,
>>>
>>> I've found something critical into the way that find_primary_node()
>>> detect if a node is a standby host and by deduction the primary node in
>>> streaming replication mode.
>>>
>>> When it call "SELECT pg_is_in_recovery() AND pgpool_walrecrunning()" it
>>> always return false if pg_is_in_recovery = 'true' and
>>> pgpool_walrecrunning = 'false'. That mean that the host can be promoted
>>> as master, the situation will going worse if a failover occurs and
>>> PgPool takes this fake host as the master.
>>>
>>> This is not a problem that can occurs easily but let check the following:
>>>
>>> - Master has failover to node 1.
>>> - Node 0 is reconstructed but the primary_conninfo host in recovery.conf
>>> is wrong/broken for some reason.
>>>
>>> At this stage, both nodes are running with PgPool status 2 and currently
>>> the only problem is that node 0 is not replicated from the master node.
>> This is not what observe. In this case I see node 0 is in status
>> 3(down). If conninfo is broken in recovery.conf, postmaster won't
>> start up and keeps on saying "the database system is starting
>> up". What did you make a mistake in recovery.conf exactly?
> I know that the problem is difficult to reproduce, especially if every
> administration task is well done.
> 
> What you can try is to reconstruct Node 0 with a valid recovery.conf,
> then status of the node will be 2, PgPool will know it's a standby
> server, node 1 is the master host and all it's ok at this point.
> 
> Next step, just modify the recovery.conf of node 0 with a wrong ip
> address/hostname and then restart PostgreSQL. Node 0 in pool_status
> should stay in state 2 as secondary node which is still right.

Wrong IP/hostname, ok, I will try it.

> Then run the the SQL query"SELECT pg_is_in_recovery() AND
> pgpool_walrecrunning();" in psql in node 0, you will see that it detect
> that this node is a master (by returning false) and not a standby
> server. So restarting PgPool without the pgpool_status here will break
> everything.
> 
> Note that you don't need to to a failover, you can simply change the
> recovery.conf on any secondary node, restart the PostgreSQL daemon and
> the run the SQL query on it. The final case problem can be easily deduce
> from here.
> 
> I'm agree that this is something that will not happen every day in real
> world with the actual way of master promoting scheme (but it can happen).
> 
> Using "SELECT pg_is_in_recovery() = 'f' AND pgpool_walrecrunning() =
> 'f';" fix that problem because it really check if the node is not in
> standby mode.
> 
> As I'm working on a patch to promote any node in the PgPool line as
> master node, this problem can happen more often when PgPool restart has
> it will not stop at the first node not in standby mode (which the actual
> query can falsely introduce) but at any node in the line that is really
> not in recovery mode.
> 
> The final goal is to promote the master that has the less replication
> lags, so it can be any node in the line.
> 
> Hope it's clear enough now.
>>> Then if a failover from node 1 occurs, PgPool will try to find a new
>>> master by querying "SELECT pg_is_in_recovery() AND
>>> pgpool_walrecrunning()" to all nodes, but as node 0 will return false
>>> because the wal receiver is not running, PgPool will take it as the new
>>> master, and very bad annoying things will happen from this point.
>>>
>>> What I suggest is to try to find the master by finding a node that is
>>> not in recovery mode and where wal receiver is not running. This will
>>> allow a simple fix by changing the SQL query :
>>>
>>> SELECT  pg_is_in_recovery() = 'f' AND pgpool_walrecrunning() = 'f';
>>>   or
>>> SELECT not pg_is_in_recovery() AND not pgpool_walrecrunning();
>>>
>>> What's your opinion, if every one is ok with that fix I already have a
>>> patch for that.
>>>
>>> Regards,
>>>
>>> -- 
>>> Gilles Darold
>>> http://dalibo.com - http://dalibo.org
>>>
>>> _______________________________________________
>>> Pgpool-hackers mailing list
>>> Pgpool-hackers at pgfoundry.org
>>> http://pgfoundry.org/mailman/listinfo/pgpool-hackers
> 
> 
> -- 
> Gilles Darold
> http://dalibo.com - http://dalibo.org
> 


More information about the Pgpool-hackers mailing list