[Pgpool-hackers] Major problem in find_primary_node()

Fri Jan 21 02:02:10 UTC 2011

>> Le 20/01/2011 09:42, Tatsuo Ishii a écrit :
>>>> Hi,
>>>>
>>>> I've found something critical into the way that find_primary_node()
>>>> detect if a node is a standby host and by deduction the primary node in
>>>> streaming replication mode.
>>>>
>>>> When it call "SELECT pg_is_in_recovery() AND pgpool_walrecrunning()" it
>>>> always return false if pg_is_in_recovery = 'true' and
>>>> pgpool_walrecrunning = 'false'. That mean that the host can be promoted
>>>> as master, the situation will going worse if a failover occurs and
>>>> PgPool takes this fake host as the master.
>>>>
>>>> This is not a problem that can occurs easily but let check the following:
>>>>
>>>> - Master has failover to node 1.
>>>> - Node 0 is reconstructed but the primary_conninfo host in recovery.conf
>>>> is wrong/broken for some reason.
>>>>
>>>> At this stage, both nodes are running with PgPool status 2 and currently
>>>> the only problem is that node 0 is not replicated from the master node.
>>> This is not what observe. In this case I see node 0 is in status
>>> 3(down). If conninfo is broken in recovery.conf, postmaster won't
>>> start up and keeps on saying "the database system is starting
>>> up". What did you make a mistake in recovery.conf exactly?
>> I know that the problem is difficult to reproduce, especially if every
>> administration task is well done.
>> 
>> What you can try is to reconstruct Node 0 with a valid recovery.conf,
>> then status of the node will be 2, PgPool will know it's a standby
>> server, node 1 is the master host and all it's ok at this point.
>> 
>> Next step, just modify the recovery.conf of node 0 with a wrong ip
>> address/hostname and then restart PostgreSQL. Node 0 in pool_status
>> should stay in state 2 as secondary node which is still right.
> 
> Wrong IP/hostname, ok, I will try it.

I tried but failed to succeed in reproducing your problem.

Node 0 continues to print following errors(this is expected):

4491 2011-01-21 10:14:12 JST FATAL:  could not connect to the primary server: could not translate host name "foo" to address: Name or service not known

And failed to connect to node 0 postmaster. So pgpool regards it as down.

test=# show pool_nodes;
 node_id | hostname | port | status | lb_weight 
---------+----------+------+--------+-----------
 0       |          | 5433 | 3      | 0.500000
 1       |          | 5434 | 2      | 0.500000
(2 rows)

Is there anything I have to do more to reproduce your problem?

>> Then run the the SQL query"SELECT pg_is_in_recovery() AND
>> pgpool_walrecrunning();" in psql in node 0, you will see that it detect
>> that this node is a master (by returning false) and not a standby
>> server. So restarting PgPool without the pgpool_status here will break
>> everything.
>> 
>> Note that you don't need to to a failover, you can simply change the
>> recovery.conf on any secondary node, restart the PostgreSQL daemon and
>> the run the SQL query on it. The final case problem can be easily deduce
>> from here.
>> 
>> I'm agree that this is something that will not happen every day in real
>> world with the actual way of master promoting scheme (but it can happen).
>> 
>> Using "SELECT pg_is_in_recovery() = 'f' AND pgpool_walrecrunning() =
>> 'f';" fix that problem because it really check if the node is not in
>> standby mode.
>> 
>> As I'm working on a patch to promote any node in the PgPool line as
>> master node, this problem can happen more often when PgPool restart has
>> it will not stop at the first node not in standby mode (which the actual
>> query can falsely introduce) but at any node in the line that is really
>> not in recovery mode.
>> 
>> The final goal is to promote the master that has the less replication
>> lags, so it can be any node in the line.
>> 
>> Hope it's clear enough now.
>>>> Then if a failover from node 1 occurs, PgPool will try to find a new
>>>> master by querying "SELECT pg_is_in_recovery() AND
>>>> pgpool_walrecrunning()" to all nodes, but as node 0 will return false
>>>> because the wal receiver is not running, PgPool will take it as the new
>>>> master, and very bad annoying things will happen from this point.
>>>>
>>>> What I suggest is to try to find the master by finding a node that is
>>>> not in recovery mode and where wal receiver is not running. This will
>>>> allow a simple fix by changing the SQL query :
>>>>
>>>> SELECT  pg_is_in_recovery() = 'f' AND pgpool_walrecrunning() = 'f';
>>>>   or
>>>> SELECT not pg_is_in_recovery() AND not pgpool_walrecrunning();
>>>>
>>>> What's your opinion, if every one is ok with that fix I already have a
>>>> patch for that.
>>>>
>>>> Regards,
>>>>
>>>> -- 
>>>> Gilles Darold
>>>> http://dalibo.com - http://dalibo.org
>>>>
>>>> _______________________________________________
>>>> Pgpool-hackers mailing list
>>>> Pgpool-hackers at pgfoundry.org
>>>> http://pgfoundry.org/mailman/listinfo/pgpool-hackers
>> 
>> 
>> -- 
>> Gilles Darold
>> http://dalibo.com - http://dalibo.org
>> 
> _______________________________________________
> Pgpool-hackers mailing list
> Pgpool-hackers at pgfoundry.org
> http://pgfoundry.org/mailman/listinfo/pgpool-hackers