[pgpool-general: 3086] Re: Failover - new primary selection process?

Wed Jul 30 21:16:46 JST 2014

Using the same term, i.e., "master", to refer to different things can be extremely confusing. I downloaded the sources for 3.3.3 yesterday and found this log line in the wd_lifecheck.c file. It appears at the beginning of the escalation process. On my system, at least, I altered the message so as not to confuse my operations people.
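
The two senses of "master" discussed in the quoted thread below can be sketched in a few lines of Python. This is only an illustration and not pgpool-II code: the class and function names are hypothetical, and it assumes that "oldest" means the pgpool instance with the earliest start time and that the backend "master node" is the live backend with the lowest node ID (the value the thread describes for %m).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Backend:
    """A PostgreSQL backend node (hypothetical model, not a pgpool-II type)."""
    node_id: int
    alive: bool


@dataclass
class PgpoolInstance:
    """A pgpool-II process taking part in watchdog (hypothetical model)."""
    name: str
    started_at: float  # assumption: start time, smaller value means older
    alive: bool


def backend_master_node(backends: List[Backend]) -> Optional[int]:
    """Return the lowest node ID among live backends, or None if none is live."""
    live_ids = [b.node_id for b in backends if b.alive]
    return min(live_ids) if live_ids else None


def watchdog_master(pgpools: List[PgpoolInstance]) -> Optional[str]:
    """Return the name of the oldest live pgpool, or None if none is live."""
    live = [p for p in pgpools if p.alive]
    if not live:
        return None
    return min(live, key=lambda p: p.started_at).name


if __name__ == "__main__":
    # Backend sense: node 0 (the primary) is down, nodes 1 and 2 are standbys.
    backends = [Backend(0, False), Backend(1, True), Backend(2, True)]
    print("backend master node:", backend_master_node(backends))

    # Watchdog sense: the older pgpool went down, so the survivor takes over.
    pgpools = [
        PgpoolInstance("pgpool-a", 100.0, False),
        PgpoolInstance("pgpool-b", 250.0, True),
    ]
    print("watchdog master:", watchdog_master(pgpools))
```

The two functions never look at each other's data, which is the point: the watchdog message says nothing about which PostgreSQL node gets promoted.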

Sent from my iPad

> On Jul 29, 2014, at 6:57 PM, Yugo Nagata <nagata at sraoss.co.jp> wrote:
> 
> On Tue, 29 Jul 2014 07:49:52 +0900 (JST)
> Tatsuo Ishii <ishii at postgresql.org> wrote:
> 
>>> Sorry for the delay in getting this to you, Ishii-san. All of my VMs
>>> were powered down after the chassis received a weekend upgrade.
>>> 
>>> The log message I was referring to is (from a failover test performed
>>> last week):
>>> 
>>> 2014-07-23 24:54:56 LOG: pid 70166: pgpool_down: I'm *oldest* so
>>> standing for master
>> 
>> I think "watchdog" subsystem of pgpool-II emits the message and I
>> believe "master" does not mean PostgreSQL master or primary node,
>> rather pgpool-II master of watchdog.
>> 
>> Yugo should be familiar with this. Yugo?
> 
> Yes, this message comes from watchdog and means exactly what Ishii-san said.
> When watchdog is used, the oldest pgpool becomes the "active pgpool", that
> is, the VIP holder. However, this is called the "master pgpool" internally.
> 
>> 
>>> This is written almost immediately before ifconfig brings up the IP
>>> alias address and before the failover_command is executed. That's what
>>> led me to my comments, although as I admitted, I had not read the code
>>> in this area. If selection is just by sequential node ID, that's a
>>> little misleading.
>>> 
>>>> On 7/28/2014 7:05 AM, Tatsuo Ishii wrote:
>>>> Which error message are you referring to? I do not see any message
>>>> like "I'm the oldest, so assuming the master role." anywhere in the
>>>> pgpool-II code.
>>>> 
>>>> Best regards,
>>>> --
>>>> Tatsuo Ishii
>>>> SRA OSS, Inc. Japan
>>>> English: http://www.sraoss.co.jp/index_en.php
>>>> Japanese:http://www.sraoss.co.jp
>>>> 
>>>>> Then, I hate to say it, Ishii-san, but your log message is a bit
>>>>> misleading. For instance, I always see a message during a failover
>>>>> where the standby claims, "I'm the oldest, so assuming the master
>>>>> role." So, maybe it should just say "I have the lowest remaining node
>>>>> number, so I'm assuming the master role."  That wouldn't imply logic
>>>>> based on time, only sequences.
>>>>> 
>>>>> Sent from my iPad
>>>>> 
>>>>>> On Jul 28, 2014, at 3:20 AM, Tatsuo Ishii <ishii at postgresql.org>
>>>>>> wrote:
>>>>>> 
>>>>>> No. Pgpool-II has no idea which server has been running longer.
>>>>>> 
>>>>>> Pgpool-II has no built-in logic to choose the next primary
>>>>>> candidate. Which slave node gets promoted depends entirely on your
>>>>>> failover script. In general, most people specify the master node as
>>>>>> the candidate, that is, the live slave node with the youngest (lowest)
>>>>>> node number (%m in the failover script).
>>>>>> Example: node 0 is the primary, and nodes 1 and 2 are standbys.
>>>>>> 
>>>>>> 1) node 0 is down.
>>>>>> 
>>>>>> 2) node 1 is chosen as the next primary because it's the youngest live
>>>>>> node (master).
>>>>>> 
>>>>>> 3) node 1 is down.
>>>>>> 
>>>>>> 4) node 2 is chosen as the next primary because it's the youngest live
>>>>>> node (master).
>>>>>> 
>>>>>> Best regards,
>>>>>> --
>>>>>> Tatsuo Ishii
>>>>>> SRA OSS, Inc. Japan
>>>>>> English: http://www.sraoss.co.jp/index_en.php
>>>>>> Japanese:http://www.sraoss.co.jp
>>>>>> 
>>>>>>> Hey Jay, thank you for the reply.
>>>>>>> Is this a formal conclusion? Maybe the oldest is the one with the
>>>>>>> smaller ID?
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>> On Sun, Jul 27, 2014 at 2:50 PM, <jayknowsunix at gmail.com> wrote:
>>>>>>>> 
>>>>>>>> Guy,
>>>>>>>> 
>>>>>>>> I run a master with 2 slaves. During a failover, I always see the
>>>>>>>> oldest slave promoted to master. So, the selection is based on the
>>>>>>>> server which has been running longer.
>>>>>>>> 
>>>>>>>> --
>>>>>>>> Jay
>>>>>>>> 
>>>>>>>> Sent from my iPad
>>>>>>>> 
>>>>>>>>> On Jul 27, 2014, at 3:48 AM, Guy Meler <melguy at gmail.com> wrote:
>>>>>>>>> 
>>>>>>>>> Hey :)
>>>>>>>>> In master/slave mode, what is the algorithm for choosing the next
>>>>>>>>> available primary backend?
>>>>>>>>> I want to make sure that if the primary in site A goes down, then
>>>>>>>>> the next available primary is in the same site.
>>>>>>>>> Thank you
>>>>>>>>> 
>>>>>>>>> _______________________________________________
>>>>>>>>> pgpool-general mailing list
>>>>>>>>> pgpool-general at pgpool.net
>>>>>>>>> http://www.pgpool.net/mailman/listinfo/pgpool-general
> 
> 
> -- 
> Yugo Nagata <nagata at sraoss.co.jp>