[pgpool-general: 1377] Re: Failover problem when slave is de-attached

張 福民 cho at jtrustsystem.co.jp
Thu Feb 7 09:38:22 JST 2013


Hi

 > Our setup contains two database servers, one running as primary and one as slave.
 > We installed automatic failover and it seems to be working very well when the primary node goes down.
 > The standby server is promoted to primary and our java application continues working with no problems.
 > The problem that we have occurs when a slave node goes down.

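Did you turn synchronous replication on between the master and the slave?
It's possible that you have set [synchronous_commit = on] in
postgresql.conf of the master. In that case, commits on the master wait
for the slave to acknowledge them, so they hang while the slave is down.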

Once the slave is down, please set [synchronous_commit = local] in
postgresql.conf of the master and reload the new value with [pg_ctl reload].
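
As a rough sketch of that change, the helper below comments out any active synchronous_commit line in a given postgresql.conf and appends the new value. The function name and the data directory path in the usage comment are assumptions for illustration, not part of your setup.

```shell
#!/bin/sh
# Sketch only: switch the master to local-only commit waits after the slave fails.
# set_sync_commit_local and the paths in the usage note are hypothetical names.

set_sync_commit_local() {
    conf="$1"    # path to the master's postgresql.conf
    # Comment out any active synchronous_commit line (a .bak copy is kept),
    # then append the new value at the end of the file.
    sed -i.bak -e 's/^[[:space:]]*synchronous_commit[[:space:]]*=/#&/' "$conf"
    echo "synchronous_commit = local" >> "$conf"
}

# Usage on the master (adjust the data directory to your installation):
#   set_sync_commit_local /var/lib/pgsql/data/postgresql.conf
#   pg_ctl reload -D /var/lib/pgsql/data
```

When the slave is back and replication has caught up, the old value can be restored from the .bak copy and reloaded the same way.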


(2013/02/06 18:34), Stelios Limnaios wrote:
> Hi all,
>
> I'm kindly asking for your expertise on a problem we have with PGPool and failover.
>
> Our setup contains two database servers, one running as primary and one as slave.
> We installed automatic failover and it seems to be working very well when the primary node goes down.
> The standby server is promoted to primary and our java application continues working with no problems.
>
> The problem that we have occurs when a slave node goes down.
> PGPool becomes busy, and PGPoolAdmin does not open any pages; it keeps waiting for server responses until it times out.
> When I try to stop pgpool from the command line on the server, the stop command never completes.
> If I restart the failed slave, I can see in the postgres logs that it connects to the primary node for replication, but I'm not able to open the PGPoolAdmin status page to click the Return button.
> PGPool seems to be waiting for something, but I'm not able to understand what it is.
>
> We use pgpool2-V3_2_STABLE, checked out from the repository (4-Oct-2012).
>
> In pgpool.conf we have set
> fail_over_on_backend_error:    /usr/local/etc/failover.sh %d "%h" %p %D %m %M "%H" %P
>
> and in failover.sh:
> if [ "$failed_node_id" = "$old_primary_node_id" ]; then      # master failed
>      touch "$trigger"   # let standby take over
>      echo "Primary database $failed_host_name failed, please check the status of your replication system. Trigger used: $trigger" | mail -s "Ditaweb primary database failed" "$admin_email"
> else                                                          # slave failed: notify only
>      echo "Slave database $failed_host_name failed, please check the status of your replication system." | mail -s "Ditaweb slave database failed" "$admin_email"
> fi
>
> The above script works fine as the 'slave failed' email is sent, and in the case when the primary node goes down failover is executed successfully.
> I've also attached pgpool.conf in case you need it.
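
For reference, here is a minimal sketch of how such a script might map the arguments to variables, assuming they arrive in the order given on the command line above (%d "%h" %p %D %m %M "%H" %P). The function names are hypothetical, and the real script would go on to touch the trigger file and send mail instead of only printing.

```shell
#!/bin/sh
# Sketch only: map pgpool's placeholders to variables and decide which node failed.
# Assumed argument order: %d "%h" %p %D %m %M "%H" %P
# failed_role and handle_failover are hypothetical helper names.

failed_role() {
    # $1 = failed node id (%d), $2 = old primary node id (%P)
    if [ "$1" = "$2" ]; then
        echo primary
    else
        echo slave
    fi
}

handle_failover() {
    failed_node_id="$1"        # %d
    failed_host_name="$2"      # %h
    old_primary_node_id="$8"   # %P
    role=$(failed_role "$failed_node_id" "$old_primary_node_id")
    echo "$role database $failed_host_name failed"
}

# In the real script this would be called as: handle_failover "$@"
```

Keeping the decision in its own function makes it easy to check from a shell, without pgpool running, that a slave failure never reaches the trigger branch.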
>
> So, I guess the question is: what makes PGPool behave this way?
> Is it something that we need to set up in pgpool.conf, such as some kind of timeout?
>
> Thank you in advance for your time,
>
> Regards,
> Stelios Limnaios
>
>
>
> _______________________________________________
> pgpool-general mailing list
> pgpool-general at pgpool.net
> http://www.pgpool.net/mailman/listinfo/pgpool-general


-- 
┏┏┏┏━━━━━━━━━━━━━━━━━━━━━━━━
┏╋┏ J TrustSystem Co.,Ltd. 張 福民
┏╋  URL  http://jtrustsystem.co.jp/
┏■  〒600-8862
┏    京都市下京区七条御所ノ内中町50番地の5
┃       Jトラスト京都ビル  4階
┃    TEL/FAX 075-366-0375 / 075-322-1544
┃    E-Mail  cho at jtrustsystem.co.jp
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.sraoss.jp/pipermail/pgpool-general/attachments/20130207/051e3fad/attachment.html>

