[Pgpool-general] Performance in load balance mode

Tatsuo Ishii ishii at sraoss.co.jp
Fri May 22 00:27:48 UTC 2009


> I am doing some experimentation with pgpool-II.  I have three servers,
> one frontend running pgpool-II, and two backends running Postgres 8.3.
> They are configured in replication and load balance mode.  I've been
> using pgbench to estimate the overhead of the replication, and I have
> found some interesting results.
> 
> With both backends attached, I get about 500 transactions per second.
> Simply disabling one of the backends (via pcp_detach_node) brings that
> up to about 775 transactions per second.  And when I benchmark the
> backend directly, without going through pgpool, I get around 1000
> transactions per second.
> 
> We are intending to use a replicated database to replace a standalone
> server, so ideally we would like to have roughly equivalent performance.
> 
> Is this sort of behavior expected with pgpool, or have I misconfigured it?

pgpool sends a query to the second node only after the first node has
completed it. If you have a third or a fourth node, the query is then
sent concurrently to the second, third and fourth nodes. Pure SELECT
(read) queries are sent to just one of the nodes if load balance mode
is enabled.  So you would theoretically expect:

1) the write queries take 2x time
2) the read queries take 1/(number of nodes) time
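
To make the arithmetic above concrete, here is a minimal Python sketch
of that model. It is only an illustration: the function name and the
write_fraction parameter are made up for this example, a single node is
assumed to cost 1x for both reads and writes, and pgpool's own overhead
is ignored completely.

```python
def relative_throughput(nodes: int, write_fraction: float) -> float:
    """Estimate throughput relative to a single plain PostgreSQL server.

    Assumptions (hypothetical, taken from the rough model in this thread):
      - a write costs 2x once there are two or more nodes, because the
        first node finishes before the remaining nodes start concurrently
      - a load-balanced read costs 1/nodes
      - pgpool overhead is not modelled at all
    """
    if nodes < 1:
        raise ValueError("nodes must be >= 1")
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")

    write_cost = 1.0 if nodes == 1 else 2.0
    read_cost = 1.0 / nodes
    # Average cost per query, weighted by the read/write mix.
    average_cost = write_fraction * write_cost + (1.0 - write_fraction) * read_cost
    return 1.0 / average_cost


if __name__ == "__main__":
    for n in (2, 3, 4):
        for w in (1.0, 0.5, 0.1):
            print(f"nodes={n} write_fraction={w}: "
                  f"{relative_throughput(n, w):.2f}x")
```

With two nodes and a write-only workload the sketch gives 0.5x, which
is in line with the roughly 500 vs. 1000 transactions per second
reported in the question.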

This suggests that if you have only two database nodes, the performance
compared with plain PostgreSQL will be 1/2 in the worst case.  I
don't know what options you gave to pgbench, but the default queries
sent by pgbench are very write intensive, so it is likely you are
getting the worst-case performance. Unfortunately pgbench also sends
all queries inside a transaction, and in that case pgpool does not load
balance SELECTs. So if you want to test the performance of pgpool, I
would suggest writing a custom query file with an appropriate
read/write query mix, *not* using transactions.

If you have more than two nodes, you have a better chance of getting
good performance. Write queries will take only the same time as in the
two-node case, while read queries will gain performance proportional to
the number of nodes. In my experience, the best performance is obtained
with 3-4 nodes. Too many nodes is not good because of the overhead of
pgpool. Another thing I would like to point out: let your DB
applications and pgpool live on the same machine. This makes it
possible to use UNIX domain sockets between the apps and pgpool, which
will provide far better performance than TCP/IP.
--
Tatsuo Ishii
SRA OSS, Inc. Japan

