[pgpool-general: 2583] Re: pgpool failure

Gonzalo Gil gonxalo2000 at gmail.com
Sat Feb 15 05:39:10 JST 2014


I changed it and it works fine.
Just one more thing: when one node goes down (the pgpool process, not the
database), it takes the other node a minute and a half to make itself
primary... I changed some parameters, but it still takes 1.5 minutes to
elect the master pgpool node.


Is it possible to shorten this? How?
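My guess is that it is related to the heartbeat timing quoted below
(wd_heartbeat_deadtime = 30). Would lowering those values help? For
example something like this (just a guess on my side, untested values):

wd_heartbeat_keepalive = 1
wd_heartbeat_deadtime = 10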

Thanks again.


On Thu, Feb 13, 2014 at 11:26 AM, Gonzalo Gil <gonxalo2000 at gmail.com> wrote:

> Great!
>
>
> On Thu, Feb 13, 2014 at 12:23 PM, Yugo Nagata <nagata at sraoss.co.jp> wrote:
>
>> On Thu, 13 Feb 2014 11:59:20 -0200
>> Gonzalo Gil <gonxalo2000 at gmail.com> wrote:
>>
>> > YES! it works!
>>
>> I'm glad to hear that.
>>
>> >
>> > I will install heartbeat... but I'm testing the installation and I
>> > took the easy way...
>> > I'll let you know when I get it running.
>>
>> You don't need to install heartbeat (Pacemaker). Watchdog's heartbeat
>> mode is pgpool-II's built-in functionality. For the simplest
>> configuration, all you need to do is set:
>>
>> wd_lifecheck_method = 'heartbeat'
>>
>> wd_heartbeat_port = 9694
>> wd_heartbeat_keepalive = 2
>> wd_heartbeat_deadtime = 30
>>
>> heartbeat_destination0 = 'tad2'       <= use 'tad1' on the tad2 server
>> heartbeat_destination_port0 = 9694
>> heartbeat_device0 = ''
>>
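>> In other words, the mirrored block in tad2's pgpool.conf would look
>> roughly like this (same settings, destination swapped, as noted above):
>>
>> heartbeat_destination0 = 'tad1'
>> heartbeat_destination_port0 = 9694
>> heartbeat_device0 = ''
>>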
>>
>> >
>> > Thanks a lot!!
>> >
>> >
>> > On Thu, Feb 13, 2014 at 8:47 AM, Yugo Nagata <nagata at sraoss.co.jp> wrote:
>> >
>> > > Hi,
>> > >
>> > > On Wed, 12 Feb 2014 12:05:56 -0200
>> > > Gonzalo Gil <gonxalo2000 at gmail.com> wrote:
>> > >
>> > > > I think it does not work...
>> > >
>> > > I'm sorry for jumping to a wrong conclusion. load_balance_mode is
>> > > irrelevant. The problem is that pgpool-II considers itself down
>> > > before failover is done completely. Until failover has completed,
>> > > pgpool-II's child process doesn't know the backend server is down,
>> > > so the lifecheck query 'SELECT 1' fails and pgpool-II puts itself
>> > > into down status.
>> > >
>> > > To avoid this, the health check should be done more frequently, or
>> > > the lifecheck interval should be larger. In your configuration,
>> > > health_check_max_retries = 3 and health_check_retry_delay = 10, so
>> > > it takes more than 30 seconds to detect that the backend DB is down
>> > > and start failover. However, wd_interval = 5 and wd_life_point = 3,
>> > > so it takes only about 15 to 20 seconds before pgpool-II decides to
>> > > go into down status.
>> > >
>> > > Could you please try editing pgpool.conf? For example:
>> > >
>> > > health_check_max_retries = 2
>> > > health_check_retry_delay = 5
>> > > wd_interval = 10
>> > > wd_life_point = 3
>> > >
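>> > > With those values, a backend failure is detected and failover
>> > > starts after roughly health_check_max_retries *
>> > > health_check_retry_delay = 2 * 5 = 10 seconds (ignoring the health
>> > > check period itself), while the watchdog declares pgpool-II itself
>> > > down only after about wd_interval * wd_life_point = 10 * 3 = 30
>> > > seconds. So failover can complete well before the lifecheck gives
>> > > up.
>> > >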
>> > > In fact, I recommend using heartbeat mode instead of query mode.
>> > > Heartbeat mode doesn't issue a query like 'SELECT 1' to check pgpool
>> > > status, so it avoids this kind of problem.
>> > >
>> > > >
>> > > >
>> > > > http://172.16.62.141/status.php
>> > > >
>> > > >          IP Address   Port   Status                                                  Weight
>> > > > node 0   tad1         5432   Up. Connected. Running as primary server. postgres: Up   0.500
>> > > > node 1   tad2         5432   Up. Connected. Running as standby server. postgres: Up   0.500
>> > > >
>> > > > http://172.16.62.142/status.php
>> > > >
>> > > >          IP Address   Port   Status                                                  Weight
>> > > > node 0   tad1         5432   Up. Connected. Running as primary server. postgres: Up   0.500
>> > > > node 1   tad2         5432   Up. Connected. Running as standby server. postgres: Up   0.500
>> > > >
>> > > > Then I shut down 141 (node 0, tad1)...
>> > > >
>> > > >
>> > > >
>> > > > I attach the logs...
>> > > >
>> > > >
>> > > > This was the final result:
>> > > > --->
>> > > >          IP Address   Port   Status                                                  Weight
>> > > > node 0   tad1         5432   Down. postgres: Down                                     0.500
>> > > > node 1   tad2         5432   Up. Connected. Running as standby server. postgres: Up   0.500
>> > > > <---
>> > > >
>> > > >
>> > > >
>> > > > On Wed, Feb 12, 2014 at 4:11 AM, Yugo Nagata <nagata at sraoss.co.jp> wrote:
>> > > >
>> > > > > Hi,
>> > > > >
>> > > > > Thanks for sending confs & logs.
>> > > > >
>> > > > > I found that this problem occurs when load_balance_mode = off.
>> > > > > Could you please try with load_balance_mode = on?
>> > > > >
>> > > > > I'll continue to analyze the underlying cause.
>> > > > >
>> > > > > On Mon, 10 Feb 2014 11:40:41 -0200
>> > > > > Gonzalo Gil <gonxalo2000 at gmail.com> wrote:
>> > > > >
>> > > > > > I sent the message but it was too long, so I'll attach the
>> > > > > > files...
>> > > > > >
>> > > > > > It happens again, even when node 2 was the postgres standby
>> > > > > > node.
>> > > > > >
>> > > > > > After I captured the logs here, I shut down node 1 (it has the
>> > > > > > primary database) and the same thing happened: node 2 lost the
>> > > > > > IP and no failover fired.
>> > > > > >
>> > > > > > Thanks!
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > On Mon, Feb 10, 2014 at 5:23 AM, Yugo Nagata <nagata at sraoss.co.jp> wrote:
>> > > > > >
>> > > > > > > Hi,
>> > > > > > >
>> > > > > > > It is odd that pgpool1 loses the VIP when server2 goes
>> > > > > > > down. For analysis, could you please send pgpool.conf and
>> > > > > > > the log output (of both pgpool1 and pgpool2)?
>> > > > > > >
>> > > > > > > On Tue, 4 Feb 2014 13:38:16 -0200
>> > > > > > > Gonzalo Gil <gonxalo2000 at gmail.com> wrote:
>> > > > > > >
>> > > > > > > > Hello Tatsuo Ishii. I sent some question mails to
>> > > > > > > > pgpool-general at pgpool.net but I don't get my own
>> > > > > > > > messages, although I do receive other mails from the list.
>> > > > > > > >
>> > > > > > > > Can you answer some questions or forward them to the list!?
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > I'm running pgpool with streaming replication: pgpool1 -
>> > > > > > > > DB postgres1 (server 1) and pgpool2 - DB postgres2
>> > > > > > > > (server 2). I'm using watchdog with a virtual IP and a
>> > > > > > > > lifecheck query.
>> > > > > > > >
>> > > > > > > > It's all configured and working... more or less...
>> > > > > > > >
>> > > > > > > > INIT: I start my system: postgres1 is the standby database
>> > > > > > > > and postgres2 is the master (streaming replication).
>> > > > > > > > pgpool1 has the virtual IP (and pgpool2 does not,
>> > > > > > > > obviously).
>> > > > > > > >
>> > > > > > > > I connect to the database via pgpool and everything is OK.
>> > > > > > > > I stop postgres1 and nothing happens, because I check
>> > > > > > > > new_master <> old_master (no master failure).
>> > > > > > > > I start postgres1 again (returning it with pgpoolAdmin) or
>> > > > > > > > run a recovery, and it works great.
>> > > > > > > >
>> > > > > > > > I stop postgres2 and failover fires... and I get postgres1
>> > > > > > > > as the new primary.
>> > > > > > > > And so on...
>> > > > > > > >
>> > > > > > > > This works fine.
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > I go back to INIT again... and on server2 I run:
>> > > > > > > > reboot -h now
>> > > > > > > >
>> > > > > > > > In the server1 (pgpool1) log I see that pgpool2 is
>> > > > > > > > down... OK.
>> > > > > > > > Watching the log, I see that pgpool1 lost the virtual IP
>> > > > > > > > address (!?)... and it tells me to restart pgpool... (!?)
>> > > > > > > >
>> > > > > > > > I restart it and failover fires... but in the failover
>> > > > > > > > script I get new_master_node = old_master_node, and thus I
>> > > > > > > > do not touch the trigger file, and postgres1 stays a
>> > > > > > > > standby...
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > I changed failover.sh (and the command in pgpool.conf) to
>> > > > > > > > include all the parameters, so I can see their values when
>> > > > > > > > failover.sh starts; see the sketch below.
>> > > > > > > >
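>> > > > > > > > Roughly like this (only a sketch; the log path, trigger
>> > > > > > > > file path, and ssh user here are examples, not my exact
>> > > > > > > > setup). In pgpool.conf:
>> > > > > > > >
>> > > > > > > >   failover_command = '/etc/pgpool-II/failover.sh %d %h %m %H %M %P'
>> > > > > > > >
>> > > > > > > > And failover.sh:
>> > > > > > > >
>> > > > > > > >   #!/bin/sh
>> > > > > > > >   # Arguments as expanded by pgpool-II in failover_command:
>> > > > > > > >   # %d failed node id, %h failed host, %m new master node id,
>> > > > > > > >   # %H new master host, %M old master node id, %P old primary node id
>> > > > > > > >   failed_node=$1
>> > > > > > > >   failed_host=$2
>> > > > > > > >   new_master=$3
>> > > > > > > >   new_master_host=$4
>> > > > > > > >   old_master=$5
>> > > > > > > >   old_primary=$6
>> > > > > > > >
>> > > > > > > >   # Log every value so I can see what pgpool passes in.
>> > > > > > > >   echo "$(date) failover args: $*" >> /var/log/pgpool/failover.log
>> > > > > > > >
>> > > > > > > >   # Touch the trigger file only on a real master failure,
>> > > > > > > >   # i.e. when the master actually changed.
>> > > > > > > >   if [ "$new_master" != "$old_master" ]; then
>> > > > > > > >       ssh postgres@"$new_master_host" "touch /tmp/pgsql_trigger"  # example trigger path
>> > > > > > > >   fi
>> > > > > > > >   exit 0
>> > > > > > > >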
>> > > > > > > > Then I restarted server2 and "returned" the database to pgpool...
>> > > > > > > >
>> > > > > > > > Again, pgpool1 has the virtual IP.
>> > > > > > > > I stop the database on node 2 and failover fires... but
>> > > > > > > > pgpool2 runs it... and pgpool1 too (!?)
>> > > > > > > > I checked network activity and saw that pgpool2 connects
>> > > > > > > > to server1 and touches the trigger file, and I also saw in
>> > > > > > > > the pgpool1 log that it fired the failover command too...
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > Questions...
>> > > > > > > > 1. Why did pgpool1 lose the virtual IP and ask me to
>> > > > > > > > restart!?
>> > > > > > > > 2. Why does pgpool2 fire failover? I thought only the
>> > > > > > > > "primary" pgpool (the one with the virtual IP) fires it.
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > I hope you understand me.
>> > > > > > > > Thanks a lot for your time.
>> > > > > > > > Sorry for my English.
>> > > > > > >
>> > > > > > >
>> > > > > > > --
>> > > > > > > Yugo Nagata <nagata at sraoss.co.jp>
>> > > > > > >
>> > > > >
>> > > > >
>> > > > > --
>> > > > > Yugo Nagata <nagata at sraoss.co.jp>
>> > > > >
>> > >
>> > >
>> > > --
>> > > Yugo Nagata <nagata at sraoss.co.jp>
>> > >
>>
>>
>> --
>> Yugo Nagata <nagata at sraoss.co.jp>
>>
>
>

