[pgpool-general: 2627] Re: pgpool failure

Yugo Nagata nagata at sraoss.co.jp
Thu Mar 13 10:33:36 JST 2014


On Wed, 12 Mar 2014 16:16:41 -0300
Gonzalo Gil <gonxalo2000 at gmail.com> wrote:

> OK, I used -D and it works fine!
> 
> Now I have a master pgpool, a slave pgpool, and a virtual IP they share.
> 
> pool_hba.conf (on both servers):
> 
> host    all                  all         MYIP       trust
> host    postgres, template1  postgres    ipmaster   trust
> host    postgres, template1  postgres    ipslave    trust
> 
> pool_passwd:
> postgres:xxxxxxxxxxxxxx
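> 
> A pool_passwd entry like the one above can be generated with pgpool's
> pg_md5 utility; a minimal sketch, run on each pgpool host (where
> pool_passwd lives depends on the installation):
> 
>     # -m writes the md5 hash into pool_passwd,
>     # -p prompts for the password, -u sets the user name
>     pg_md5 -m -p -u postgres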
> 
> From host MYIP I do:
> psql -p 5434 -h virtualip -U postgres
> 
> When I look at pg_stat_activity I see my connection coming from masterip,
> 
> and it works fine. I'm not asked for a password.
> 
> I change it (removing the MYIP trust line);
> pool_hba.conf (on both servers):
> 
> host    postgres, template1  postgres    ipmaster   trust
> host    postgres, template1  postgres    ipslave    trust
> 
> 
> From host MYIP I do:
> psql -p 5434 -h virtualip -U postgres
> 
> and I'm not allowed to log in.
> 
> Finally, I change it (md5 for MYIP):
> pool_hba.conf (on both servers):
> 
> host    all                  all         MYIP       md5
> host    postgres, template1  postgres    ipmaster   trust
> host    postgres, template1  postgres    ipslave    trust
> 
> 
> 
> From host MYIP I do:
> psql -p 5434 -h virtualip -U postgres
> 
> and I can log in without being asked for a password.
> 
> If I use:
> 
> psql -p 5434 -h virtualip -U postgres -W
> 
> and don't type any password, I can still connect! (?)
> 
> What am I doing wrong!?

Please confirm that pg_hba.conf on the backends is configured for md5.
If the backends are set to trust, md5 authentication doesn't work.

This FAQ entry may be helpful:

http://www.pgpool.net/mediawiki/index.php/FAQ#I_created_pool_hba.conf_and_pool_passwd_to_enable_md5_authentication_through_pgpool-II_but_it_does_not_work._Why.3F
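
For example, a minimal sketch of what pg_hba.conf on each backend might
contain (the pgpool host addresses are placeholders, not your actual
values):

    # pgpool connects to the backends from the pgpool servers,
    # so md5 must apply to those addresses
    host    all    all    PGPOOL1-IP/32    md5
    host    all    all    PGPOOL2-IP/32    md5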


> 
> 
> Tks!
> 
> 
> On Tue, Mar 11, 2014 at 11:58 PM, Yugo Nagata <nagata at sraoss.co.jp> wrote:
> 
> > On Tue, 11 Mar 2014 12:02:26 -0300
> > Gonzalo Gil <gonxalo2000 at gmail.com> wrote:
> >
> > > It's done now.
> > >
> > > My problem now is that I am in a production environment... so I can't
> > > experiment very much...
> > >
> > > I have:
> > > node1: primary pgpool (with virtual IP) and primary database
> > > node2: standby node (database and pgpool)
> > >
> > > I added a node (a 2nd standby on the standby server, /datastb) in
> > > pgpool.conf (both nodes).
> > >
> > > from node1:
> > > # pcp_node_info 10 localhost 9898 postgres postgres 0
> > > prod 5432 *3* 1.000000
> > > # pcp_node_info 10 localhost 9898 postgres postgres 1
> > > replica 5432 *1* 0.000000
> > > # pcp_node_info 10 localhost 9898 postgres postgres 2
> > > replica 5444 3 0.000000
> > >
> > > Why is node 0 (local) down!?
> >
> > pgpool reads a file containing the backend status at startup, and pgpool
> > on node #0 might have read an old status file. To avoid this, start pgpool
> > with the -D option, which makes pgpool discard the old status file.
> > (In pcp_node_info output, status 3 means the node is regarded as down,
> > while 1 and 2 mean it is up.)
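> >
> > For example, a minimal sketch (the pgpool.conf path is only an example;
> > adjust it for your installation):
> >
> >     # -D discards the saved pgpool_status file at startup
> >     pgpool -D -f /etc/pgpool-II/pgpool.conf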
> >
> > Given the above situation, you can re-attach node #0's backend using the
> > pcp_attach_node command. However, you should first confirm that PostgreSQL
> > #0 actually is the primary and #1 the standby.
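> >
> > A sketch, using the same argument order as the pcp_node_info calls above
> > (timeout, host, PCP port, PCP user, PCP password, node id):
> >
> >     pcp_attach_node 10 localhost 9898 postgres postgres 0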
> >
> > >
> > > It is not. I can connect directly (psql) or via pgpool (psql -h
> > > virtual_ip).
> > >
> > > I can't recover the node either.
> > >
> > > # pcp_recovery_node 10 localhost 9898 postgres postgres 2
> > > BackendError
> >
> > I think the reason is that pgpool regards backend #0 as down,
> > as shown by pcp_node_info. Some clues about the failure should be
> > left in the backends' logs (#0 or #1), since the recovery command is
> > executed on the backend server by postgres.
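> >
> > One way to see the error directly is to run the recovery call from the
> > pgpool log below by hand via psql, connected to backend #0 as a
> > superuser (usually in the template1 database, where the pgpool_recovery
> > function is installed):
> >
> >     SELECT pgpool_recovery('basebackup.sh', 'replica', '/var/lib/pgsql/9.0/datastb');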
> >
> > >
> > >
> > > In pgpool.log:
> > >
> > > Mar 11 11:39:30 postgresql pgpool[14723]: starting recovering node 2
> > > Mar 11 11:39:30 postgresql pgpool[14723]: starting recovery command: "SELECT pgpool_recovery('basebackup.sh', 'replica', '/var/lib/pgsql/9.0/datastb')"
> > > Mar 11 11:39:30 postgresql pgpool[14723]: exec_recovery: basebackup.sh command failed at 1st stage
> > >
> > >
> > > On the standby node:
> > >
> > > # pcp_node_info 10 localhost 9898 postgres postgres 0
> > > prod 5432 *1* 1.000000
> > > # pcp_node_info 10 localhost 9898 postgres postgres 1
> > > replica 5432 *1* 0.000000
> > > # pcp_node_info 10 localhost 9898 postgres postgres 2
> > > replica 5444 3 0.000000
> > >
> > > # pcp_recovery_node 10 localhost 9898 postgres postgres 2
> > >
> > > and the recovery starts....
> > > But the database is quite big, so I just see
> > >         the new structure in the standby server's /datastb and subdirectories,
> > >         the rsync command running on the master node...
> > >
> > > Any clue why the pcp commands are not working on the master node!!?
> > >
> > >
> > >
> > > tks!
> > >
> > >
> > >
> > >
> > > On Wed, Feb 19, 2014 at 7:50 AM, Yugo Nagata <nagata at sraoss.co.jp> wrote:
> > >
> > > > On Fri, 14 Feb 2014 18:39:10 -0200
> > > > Gonzalo Gil <gonxalo2000 at gmail.com> wrote:
> > > >
> > > > > I changed it and it works fine.
> > > > > Just one more thing:
> > > > > when one node goes down (the pgpool process, not the database), it
> > > > > takes a minute and a half for the other to make itself primary...
> > > > > I changed some parameters but it still takes 1.5 minutes to set the
> > > > > master pgpool node.
> > > >
> > > > It is possible, depending on the parameter configuration, but I can't
> > > > identify the cause from this alone.
> > > > Could you please send your pgpool.conf and logs?
> > > >
> > > > >
> > > > >
> > > > > is it possible?
> > > > > how so?
> > > > >
> > > > > Thanks again
> > > > >
> > > > >
> > > > > On Thu, Feb 13, 2014 at 11:26 AM, Gonzalo Gil <gonxalo2000 at gmail.com> wrote:
> > > > >
> > > > > > Great!
> > > > > >
> > > > > >
> > > > > > On Thu, Feb 13, 2014 at 12:23 PM, Yugo Nagata <nagata at sraoss.co.jp> wrote:
> > > > > >
> > > > > >> On Thu, 13 Feb 2014 11:59:20 -0200
> > > > > >> Gonzalo Gil <gonxalo2000 at gmail.com> wrote:
> > > > > >>
> > > > > >> > YES! it works!
> > > > > >>
> > > > > >> I'm glad to hear that.
> > > > > >>
> > > > > >> >
> > > > > >> > I will install heartbeat.... but I'm testing the installation
> > > > > >> > and I took the easy way...
> > > > > >> > I'll let you know when I get it running.
> > > > > >>
> > > > > >> You don't need to install heartbeat (Pacemaker). Watchdog's
> > > > > >> heartbeat mode is pgpool-II's built-in functionality. For the
> > > > > >> simplest configuration, all you need is:
> > > > > >>
> > > > > >> wd_lifecheck_method = 'heartbeat'
> > > > > >>
> > > > > >> wd_heartbeat_port = 9694
> > > > > >> wd_heartbeat_keepalive = 2
> > > > > >> wd_heartbeat_deadtime = 30
> > > > > >>
> > > > > >> heartbeat_destination0 = 'tad2'       # use 'tad1' on the tad2 server
> > > > > >> heartbeat_destination_port0 = 9694
> > > > > >> heartbeat_device0 = ''
> > > > > >>
> > > > > >>
> > > > > >> >
> > > > > >> > tks a lot!!
> > > > > >> >
> > > > > >> >
> > > > > >> > On Thu, Feb 13, 2014 at 8:47 AM, Yugo Nagata <nagata at sraoss.co.jp> wrote:
> > > > > >> >
> > > > > >> > > Hi,
> > > > > >> > >
> > > > > >> > > On Wed, 12 Feb 2014 12:05:56 -0200
> > > > > >> > > Gonzalo Gil <gonxalo2000 at gmail.com> wrote:
> > > > > >> > >
> > > > > >> > > I think it does not work...
> > > > > >> > >
> > > > > >> > > I'm sorry for jumping to a wrong conclusion. load_balance_mode
> > > > > >> > > is irrelevant.
> > > > > >> > > The problem is that pgpool-II considers itself down before
> > > > > >> > > failover is done completely. Until failover has completed,
> > > > > >> > > pgpool-II's child processes don't know the backend server is
> > > > > >> > > down, hence the lifecheck query 'SELECT 1' fails, and
> > > > > >> > > pgpool-II puts itself in down status.
> > > > > >> > >
> > > > > >> > > To avoid this, the health check should be done more
> > > > > >> > > frequently, or the lifecheck interval should be larger. In
> > > > > >> > > your configuration, health_check_max_retries = 3 and
> > > > > >> > > health_check_retry_delay = 10, so it takes more than 30
> > > > > >> > > seconds (3 retries x 10 seconds) to detect that the backend
> > > > > >> > > DB is down and start failover. However, wd_interval = 5 and
> > > > > >> > > wd_life_point = 3, so it is only about 15 to 20 seconds
> > > > > >> > > before pgpool-II decides to go to down status.
> > > > > >> > >
> > > > > >> > > Could you please try editing pgpool.conf? For example:
> > > > > >> > >
> > > > > >> > > health_check_max_retries = 2
> > > > > >> > > health_check_retry_delay = 5
> > > > > >> > > wd_interval = 10
> > > > > >> > > wd_life_point = 3
> > > > > >> > >
> > > > > >> > > With these values, a backend failure is detected in roughly
> > > > > >> > > 10 seconds (2 retries x 5 seconds), while the watchdog needs
> > > > > >> > > about 30 seconds (wd_interval x wd_life_point) to judge
> > > > > >> > > itself down, so failover can finish first.
> > > > > >> > >
> > > > > >> > > In fact, I recommend using heartbeat mode instead of query
> > > > > >> > > mode. Heartbeat mode doesn't issue a query like 'SELECT 1'
> > > > > >> > > to check pgpool's status, so it avoids this kind of problem.
> > > > > >> > >
> > > > > >> > > >
> > > > > >> > > >
> > > > > >> > > > http://172.16.62.141/status.php
> > > > > >> > > >
> > > > > >> > > >          IP Address   Port   Status                                     Weight
> > > > > >> > > > node 0   tad1         5432   Up. Connected. Running as primary server   postgres: Up   0.500
> > > > > >> > > > node 1   tad2         5432   Up. Connected. Running as standby server   postgres: Up   0.500
> > > > > >> > > >
> > > > > >> > > > http://172.16.62.142/status.php
> > > > > >> > > >
> > > > > >> > > >          IP Address   Port   Status                                     Weight
> > > > > >> > > > node 0   tad1         5432   Up. Connected. Running as primary server   postgres: Up   0.500
> > > > > >> > > > node 1   tad2         5432   Up. Connected. Running as standby server   postgres: Up   0.500
> > > > > >> > > >
> > > > > >> > > > I shut down 141 (node 0, tad1)...
> > > > > >> > > >
> > > > > >> > > >
> > > > > >> > > >
> > > > > >> > > > I attach the logs....
> > > > > >> > > >
> > > > > >> > > >
> > > > > >> > > > This was the final result....
> > > > > >> > > > --->
> > > > > >> > > >          IP Address   Port   Status                                     Weight
> > > > > >> > > > node 0   tad1         5432   Down                                       postgres: Down   0.500
> > > > > >> > > > node 1   tad2         5432   Up. Connected. Running as standby server   postgres: Up     0.500
> > > > > >> > > > <---
> > > > > >> > > >
> > > > > >> > > >
> > > > > >> > > >
> > > > > >> > > > On Wed, Feb 12, 2014 at 4:11 AM, Yugo Nagata <nagata at sraoss.co.jp> wrote:
> > > > > >> > > >
> > > > > >> > > > > Hi,
> > > > > >> > > > >
> > > > > >> > > > > Thanks for sending confs & logs.
> > > > > >> > > > >
> > > > > >> > > > > I found that this problem occurs when load_balance_mode = off.
> > > > > >> > > > > Could you please try with load_balance_mode = on?
> > > > > >> > > > >
> > > > > >> > > > > I'll continue to analyze the detailed reason.
> > > > > >> > > > >
> > > > > >> > > > > On Mon, 10 Feb 2014 11:40:41 -0200
> > > > > >> > > > > Gonzalo Gil <gonxalo2000 at gmail.com> wrote:
> > > > > >> > > > >
> > > > > >> > > > > > I sent the message but it was too long.
> > > > > >> > > > > > I'll attach the files....
> > > > > >> > > > > >
> > > > > >> > > > > > It happened again, even when node 2 was the postgres
> > > > > >> > > > > > standby node.
> > > > > >> > > > > >
> > > > > >> > > > > > After I captured the logs here, I shut down node 1 (it
> > > > > >> > > > > > has the primary database) and the same thing happened:
> > > > > >> > > > > > node 2 lost the IP and no failover happened.
> > > > > >> > > > > >
> > > > > >> > > > > >
> > > > > >> > > > > > TKS!
> > > > > >> > > > > >
> > > > > >> > > > > >
> > > > > >> > > > > >
> > > > > >> > > > > >
> > > > > >> > > > > > On Mon, Feb 10, 2014 at 5:23 AM, Yugo Nagata <nagata at sraoss.co.jp> wrote:
> > > > > >> > > > > >
> > > > > >> > > > > > > Hi,
> > > > > >> > > > > > >
> > > > > >> > > > > > > It is odd that pgpool1 loses the VIP when server2
> > > > > >> > > > > > > goes down. For analysis, could you please send
> > > > > >> > > > > > > pgpool.conf and the log output (of both pgpool1 and
> > > > > >> > > > > > > pgpool2)?
> > > > > >> > > > > > >
> > > > > >> > > > > > > On Tue, 4 Feb 2014 13:38:16 -0200
> > > > > >> > > > > > > Gonzalo Gil <gonxalo2000 at gmail.com> wrote:
> > > > > >> > > > > > >
> > > > > >> > > > > > > > Hello Tatsuo Ishii. I sent some query mails to
> > > > > >> > > > > > > > pgpool-general at pgpool.net but I don't get my own
> > > > > >> > > > > > > > messages, although I do receive other mails from
> > > > > >> > > > > > > > the forum.
> > > > > >> > > > > > > >
> > > > > >> > > > > > > > Can you answer some questions for me or forward
> > > > > >> > > > > > > > them to the forum!?
> > > > > >> > > > > > > >
> > > > > >> > > > > > > >
> > > > > >> > > > > > > > I'm running pgpool with streaming replication:
> > > > > >> > > > > > > > pgpool1 - db postgres1 (server 1) and pgpool2 - db
> > > > > >> > > > > > > > postgres2 (server 2).
> > > > > >> > > > > > > > I'm using watchdog with a virtual IP and
> > > > > >> > > > > > > > life_check_query.
> > > > > >> > > > > > > >
> > > > > >> > > > > > > > It's all configured and working.... more or less....
> > > > > >> > > > > > > >
> > > > > >> > > > > > > > INIT: I start my system: postgres1 is the standby
> > > > > >> > > > > > > > database and postgres2 is the master (streaming
> > > > > >> > > > > > > > replication).
> > > > > >> > > > > > > > pgpool1 has the virtual IP (and pgpool2 does not,
> > > > > >> > > > > > > > obviously).
> > > > > >> > > > > > > >
> > > > > >> > > > > > > > I connect to the database via pgpool and everything
> > > > > >> > > > > > > > is OK.
> > > > > >> > > > > > > > I stop postgres1 and nothing happens, because I check
> > > > > >> > > > > > > > new_master <> old_master (no master failure).
> > > > > >> > > > > > > > I start postgres1 again (returning it with
> > > > > >> > > > > > > > pgpoolAdmin) or call a recovery, and it works great.
> > > > > >> > > > > > > >
> > > > > >> > > > > > > > I stop postgres2 and failover fires... and I get
> > > > > >> > > > > > > > postgres1 as the new primary.
> > > > > >> > > > > > > > And so on...
> > > > > >> > > > > > > >
> > > > > >> > > > > > > > This works fine.
> > > > > >> > > > > > > >
> > > > > >> > > > > > > >
> > > > > >> > > > > > > > I go back to INIT again....
> > > > > >> > > > > > > > and on server2 I do:
> > > > > >> > > > > > > > reboot -h now
> > > > > >> > > > > > > >
> > > > > >> > > > > > > > I see in the server1 (pgpool1) log that pgpool2 is
> > > > > >> > > > > > > > down... OK.
> > > > > >> > > > > > > > Watching the log, I see pgpool1 lost the virtual IP
> > > > > >> > > > > > > > address (!?)... and it tells me to restart
> > > > > >> > > > > > > > pgpool....(!?)
> > > > > >> > > > > > > >
> > > > > >> > > > > > > > I restart it and I see that failover fires... but in
> > > > > >> > > > > > > > the failover script I get new_master_node =
> > > > > >> > > > > > > > old_master_node... and thus I do not run the touch,
> > > > > >> > > > > > > > and postgres1 stays a standby...
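> > > > > >> > > > > > > >
> > > > > >> > > > > > > > For reference, a minimal sketch of the common
> > > > > >> > > > > > > > failover.sh pattern (not my actual script); it
> > > > > >> > > > > > > > assumes failover_command passes %d %P %H (failed
> > > > > >> > > > > > > > node id, old primary node id, new master host),
> > > > > >> > > > > > > > and the trigger file path is only an example:
> > > > > >> > > > > > > >
> > > > > >> > > > > > > > #!/bin/bash
> > > > > >> > > > > > > > # failover.sh FAILED_NODE_ID OLD_PRIMARY_ID NEW_MASTER_HOST
> > > > > >> > > > > > > > failed_node_id=$1
> > > > > >> > > > > > > > old_primary_id=$2
> > > > > >> > > > > > > > new_master_host=$3
> > > > > >> > > > > > > > # assumed trigger_file location from recovery.conf
> > > > > >> > > > > > > > trigger=/var/lib/pgsql/9.0/data/trigger
> > > > > >> > > > > > > > # touch the trigger only if the failed node was the primary
> > > > > >> > > > > > > > if [ "$failed_node_id" = "$old_primary_id" ]; then
> > > > > >> > > > > > > >     ssh postgres@"$new_master_host" "touch $trigger"
> > > > > >> > > > > > > > fi
> > > > > >> > > > > > > > exit 0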
> > > > > >> > > > > > > >
> > > > > >> > > > > > > >
> > > > > >> > > > > > > > I changed failover.sh (and the command in
> > > > > >> > > > > > > > pgpool.conf). I included all the parameters so I can
> > > > > >> > > > > > > > see their values when failover.sh starts....
> > > > > >> > > > > > > >
> > > > > >> > > > > > > > Then I restart server2 and "return" the database to
> > > > > >> > > > > > > > pgpool....
> > > > > >> > > > > > > >
> > > > > >> > > > > > > > Again, pgpool1 has the virtual IP.
> > > > > >> > > > > > > > I stop the database on node 2 and failover fires....
> > > > > >> > > > > > > > but pgpool2 runs it.... and pgpool1 too (!?)
> > > > > >> > > > > > > > I checked the network activity and saw that pgpool2
> > > > > >> > > > > > > > connects to server1 and makes the touch, and I also
> > > > > >> > > > > > > > saw pgpool1's log firing the failover command....
> > > > > >> > > > > > > >
> > > > > >> > > > > > > >
> > > > > >> > > > > > > >
> > > > > >> > > > > > > > Questions....
> > > > > >> > > > > > > > 1. Why did pgpool1 lose the virtual IP and ask me to
> > > > > >> > > > > > > >    restart!?
> > > > > >> > > > > > > > 2. Why does pgpool2 fire failover? I thought just
> > > > > >> > > > > > > >    the "primary" pgpool (the one with the virtual
> > > > > >> > > > > > > >    IP) fires it.
> > > > > >> > > > > > > >
> > > > > >> > > > > > > >
> > > > > >> > > > > > > > I hope you understand me.
> > > > > >> > > > > > > > Thanks a lot for your time..
> > > > > >> > > > > > > > Sorry for my English.
> > > > > >> > > > > > >
> > > > > >> > > > > > >
> > > > > >> > > > > > > --
> > > > > >> > > > > > > Yugo Nagata <nagata at sraoss.co.jp>
> > > > > >> > > > > > >
> > > > > >> > > > >
> > > > > >> > > > >
> > > > > >> > > > > --
> > > > > >> > > > > Yugo Nagata <nagata at sraoss.co.jp>
> > > > > >> > > > >
> > > > > >> > >
> > > > > >> > >
> > > > > >> > > --
> > > > > >> > > Yugo Nagata <nagata at sraoss.co.jp>
> > > > > >> > >
> > > > > >>
> > > > > >>
> > > > > >> --
> > > > > >> Yugo Nagata <nagata at sraoss.co.jp>
> > > > > >>
> > > > > >
> > > > > >
> > > >
> > > >
> > > > --
> > > > Yugo Nagata <nagata at sraoss.co.jp>
> > > >
> >
> >
> > --
> > Yugo Nagata <nagata at sraoss.co.jp>
> >


-- 
Yugo Nagata <nagata at sraoss.co.jp>

