[Pgpool-general] Online Recovery/Dynamic Addition of Node
Harold Lim
rold_50 at yahoo.com
Wed Apr 22 18:09:48 UTC 2009
I ran pgbench with the -C option (establish a new connection for each transaction).
When I try to start online recovery while pgbench is running, it gets stuck in the 2nd stage:
2009-04-22 13:46:40 LOG: pid 28699: starting recovering node 1
2009-04-22 13:46:42 LOG: pid 28699: CHECKPOINT in the 1st stage done
2009-04-22 13:46:42 LOG: pid 28699: starting recovery command: "SELECT pgpool_recovery('copy_base_backup', '172.16.63.11', '/opt/PostgreSQL/8.3/data')"
2009-04-22 13:49:00 LOG: pid 28699: 1st stage is done
2009-04-22 13:49:00 LOG: pid 28699: starting 2nd stage
2009-04-22 13:59:03 ERROR: pid 28699: wait_connection_closed: existing connections did not close in 600 sec.
2009-04-22 13:59:03 ERROR: pid 28699: start_recovery: timeover for waiting connection closed
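To make the timing in the log above easier to read, here is a minimal sketch that computes the seconds elapsed between two log lines. It assumes every line starts with a 19-character "YYYY-MM-DD HH:MM:SS" timestamp, as in the excerpt; the helper name seconds_between is made up for illustration and is not part of pgpool.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"

def seconds_between(start_line, end_line):
    """Return whole seconds elapsed between two pgpool log lines.

    Assumes each line begins with a 19-character timestamp.
    """
    start = datetime.strptime(start_line[:19], TIMESTAMP_FORMAT)
    end = datetime.strptime(end_line[:19], TIMESTAMP_FORMAT)
    return int((end - start).total_seconds())

# Lines copied from the log excerpt above.
copy_started = '2009-04-22 13:46:42 LOG: pid 28699: starting recovery command'
stage2_started = "2009-04-22 13:49:00 LOG: pid 28699: starting 2nd stage"
wait_failed = "2009-04-22 13:59:03 ERROR: pid 28699: wait_connection_closed"

print(seconds_between(copy_started, stage2_started))  # base backup copy in the 1st stage
print(seconds_between(stage2_started, wait_failed))   # time spent waiting in the 2nd stage
```

For this excerpt the second figure is just over the 600 seconds named in the error message, so the whole 2nd stage was spent waiting for connections to close.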
But when I look at my pgpool processes, all of the transactions appear to be idle:
postgres 28667 0.0 0.0 5616 1288 pts/1 S+ 13:45 0:00 pgpool: wait for connection request
postgres 28668 0.0 0.0 5616 1244 pts/1 S+ 13:45 0:00 pgpool: wait for connection request
postgres 28669 0.0 0.0 5616 1244 pts/1 S+ 13:45 0:00 pgpool: wait for connection request
postgres 28670 0.1 0.0 5616 1300 pts/1 S+ 13:45 0:01 pgpool: postgres postgres 127.0.0.1(43342) idle in transaction
postgres 28671 0.0 0.0 5616 1244 pts/1 S+ 13:45 0:00 pgpool: wait for connection request
postgres 28672 0.0 0.0 5616 1244 pts/1 S+ 13:45 0:00 pgpool: wait for connection request
postgres 28673 0.0 0.0 5616 1244 pts/1 S+ 13:45 0:00 pgpool: postgres postgres 127.0.0.1(43344) idle in transaction
postgres 28674 0.0 0.0 5616 1260 pts/1 S+ 13:45 0:00 pgpool: wait for connection request
postgres 28675 0.0 0.0 5616 1248 pts/1 S+ 13:45 0:00 pgpool: wait for connection request
postgres 28676 0.0 0.0 5616 1244 pts/1 S+ 13:45 0:00 pgpool: wait for connection request
postgres 28677 0.0 0.0 5616 1244 pts/1 S+ 13:45 0:00 pgpool: wait for connection request
postgres 28678 0.0 0.0 5616 1244 pts/1 S+ 13:45 0:00 pgpool: wait for connection request
postgres 28679 0.0 0.0 5616 1244 pts/1 S+ 13:45 0:00 pgpool: wait for connection request
postgres 28680 0.0 0.0 5616 1244 pts/1 S+ 13:45 0:00 pgpool: postgres postgres 127.0.0.1(43346) idle in transaction
postgres 28681 0.0 0.0 5616 1244 pts/1 S+ 13:45 0:00 pgpool: wait for connection request
postgres 28682 0.0 0.0 5616 1244 pts/1 S+ 13:45 0:00 pgpool: postgres postgres 127.0.0.1(43348) idle in transaction
postgres 28683 0.0 0.0 5616 1248 pts/1 S+ 13:45 0:00 pgpool: wait for connection request
postgres 28684 0.0 0.0 5616 1244 pts/1 S+ 13:45 0:00 pgpool: postgres postgres 127.0.0.1(43354) idle
postgres 28685 0.0 0.0 5616 1244 pts/1 S+ 13:45 0:00 pgpool: wait for connection request
postgres 28686 0.0 0.0 5616 1244 pts/1 S+ 13:45 0:00 pgpool: wait for connection request
postgres 28687 0.0 0.0 5616 1244 pts/1 S+ 13:45 0:00 pgpool: wait for connection request
postgres 28688 0.0 0.0 5616 1244 pts/1 S+ 13:45 0:00 pgpool: wait for connection request
postgres 28689 0.0 0.0 5616 1244 pts/1 S+ 13:45 0:00 pgpool: postgres postgres 127.0.0.1(43352) idle in transaction
postgres 28690 0.0 0.0 5616 1244 pts/1 S+ 13:45 0:00 pgpool: wait for connection request
postgres 28691 0.0 0.0 5616 1292 pts/1 S+ 13:45 0:00 pgpool: wait for connection request
postgres 28692 0.0 0.0 5616 1244 pts/1 S+ 13:45 0:00 pgpool: wait for connection request
postgres 28693 0.0 0.0 5616 1244 pts/1 S+ 13:45 0:00 pgpool: postgres postgres 127.0.0.1(43350) idle in transaction
postgres 28694 0.0 0.0 5616 1244 pts/1 S+ 13:45 0:00 pgpool: wait for connection request
postgres 28695 0.0 0.0 5616 1244 pts/1 S+ 13:45 0:00 pgpool: wait for connection request
postgres 28696 0.0 0.0 5616 1244 pts/1 S+ 13:45 0:00 pgpool: wait for connection request
postgres 28697 0.0 0.0 5616 1244 pts/1 S+ 13:45 0:00 pgpool: wait for connection request
postgres 28698 0.0 0.0 5616 1244 pts/1 S+ 13:45 0:00 pgpool: wait for connection request
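A listing like the one above is easier to judge when the states are tallied. The following is a minimal sketch that assumes the process title follows a "pgpool: " marker on each ps line, as it does here; the function name tally_pgpool_states is hypothetical. Counted by hand, the listing above has 25 children in "wait for connection request", 6 in "idle in transaction" and 1 in "idle".

```python
from collections import Counter

def tally_pgpool_states(ps_lines):
    """Count pgpool child processes by the state shown in their process title."""
    counts = Counter()
    for line in ps_lines:
        # The process title is whatever follows the "pgpool: " marker.
        _, marker, title = line.partition("pgpool: ")
        if not marker:
            continue  # not a pgpool child line
        title = title.strip()
        # Check the longer suffix first: "idle in transaction" does not end
        # with "idle", but testing in this order keeps the intent obvious.
        if title.endswith("idle in transaction"):
            counts["idle in transaction"] += 1
        elif title.endswith("idle"):
            counts["idle"] += 1
        elif title == "wait for connection request":
            counts["wait for connection request"] += 1
        else:
            counts["other"] += 1
    return counts

# Three lines copied from the listing above, one per state.
sample = [
    "postgres 28667 0.0 0.0 5616 1288 pts/1 S+ 13:45 0:00 pgpool: wait for connection request",
    "postgres 28670 0.1 0.0 5616 1300 pts/1 S+ 13:45 0:01 pgpool: postgres postgres 127.0.0.1(43342) idle in transaction",
    "postgres 28684 0.0 0.0 5616 1244 pts/1 S+ 13:45 0:00 pgpool: postgres postgres 127.0.0.1(43354) idle",
]

for state, count in tally_pgpool_states(sample).items():
    print(f"{count:3d}  {state}")
```

Separating "idle in transaction" from plain "idle" matters because the former is a session that still has a transaction open, not one that has finished its work.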
Is this a problem with pgpool or with pgbench?
Is it because pgbench's connections never disconnect?
As I understand it, by the time pgpool gets to the 2nd stage, all of the connections have already been disconnected.
Thanks,
Harold
--- On Tue, 4/21/09, Jaume Sabater <jsabater at gmail.com> wrote:
> From: Jaume Sabater <jsabater at gmail.com>
> Subject: Re: [Pgpool-general] Online Recovery/Dynamic Addition of Node
> To: rold_50 at yahoo.com
> Cc: pgpool-general at pgfoundry.org
> Date: Tuesday, April 21, 2009, 3:15 AM
> On Tue, Apr 21, 2009 at 1:10 AM, Harold Lim
> <rold_50 at yahoo.com> wrote:
>
> > Is it correct to say that in pgpool, if the current running workload
> > is heavy, the time it takes to finish online recovery or dynamic
> > addition of a node will also be longer?
>
> Due to the nature of online recovery, it will take additional time if
> the server is under load. The higher the load, the longer it will take.
> Personally, I have not played much with online recovery under (heavy)
> load, so maybe Tatsuo or someone else will be able to add more
> information about it.
>
> P.S. Fine-tuning the segment parameters in PostgreSQL and making sure
> that online recovery does not happen while autovacuum is active will
> help.
>
> --
> Jaume Sabater
> http://linuxsilo.net/
>
> "Ubi sapientas ibi libertas"