[pgpool-general: 4909] Re: Performance with pgpool is lower than without pgpool

Sat Aug 13 11:28:21 JST 2016

> Hello,
> 
> I was analyzing my environment with pgpool and postgres. For example: My
> select queries are taking about 2 times more that direct connections with
> the postgres.
> 
> The pgpool replication sends all write requests to all servers in the
> cluster.

No, because "master_slave_mode = on" and "master_slave_sub_mode =
'stream'" in your configuration.  In this case replication is done by
PostgreSQL streaming replication.

> For this reason the performance decreases?

By using pgpool, all queries must be sent through pgpool process. If
you directly connect to PostgreSQL total time to run a query is sum
of:

1) time to send a query string to PostgreSQL
2) time to process the query on PostgreSQL
3) time to return the query result

In the case of Pgpool-II:

1) time to send a query string to Pgpool-II
2) time to parse the query on Pgpool-II
3) time to send a query string to PostgreSQL
4) time to process the query on PostgreSQL
5) time to return the query result to Pgpool-II
6) time to return the query result to client

There are lot of factors which increase or decrease the overhead:

- for 1 and 6 the network connection speed between client and
  Pgpool-II matters

- for 2 the complexity of the query matters

- for 3 and 5 the network connection speed between PostgreSQL and
  Pgpool-II matters

So when you try to analyze the performance of Pgpool-II, be careful
about the those factors.

For your information, here is my result on my small laptop.

- A Linux laptop with 2 physical cores (HT: 4) with 16GB memory and
  SSD storage

- Pgpool-II 3.5 + PostgreSQL 9.5.3 x 2 with streaming replication
  configuration (same as yours)

- Using pgbench to major performance (scaling factor is 1).

Case 1: very light weight query

pgbench -S runs a SELECT fetching one row using index scan in each
transaction.

$ pgbench -S -T 30 -c 1 -p 11002 test [direct PostgreSQL]
starting vacuum...end.
transaction type: SELECT only
scaling factor: 100
query mode: simple
number of clients: 1
number of threads: 1
duration: 30 s
number of transactions actually processed: 360816
latency average: 0.083 ms
tps = 12027.166725 (including connections establishing)
tps = 12027.659858 (excluding connections establishing)

$ pgbench -S -T 30 -c 1 -p 11000 test [via Pgpool-II]
starting vacuum...end.
transaction type: SELECT only
scaling factor: 100
query mode: simple
number of clients: 1
number of threads: 1
duration: 30 s
number of transactions actually processed: 79816
latency average: 0.376 ms
tps = 2660.511517 (including connections establishing)
tps = 2661.028018 (excluding connections establishing)

Case 2: very heavy weight query

$ pgbench -T 30 -c 1 -p 11002 -f /tmp/a.sql test [direct PostgreSQL]
starting vacuum...end.
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 30 s
number of transactions actually processed: 30
latency average: 1000.000 ms
tps = 0.996381 (including connections establishing)
tps = 0.996419 (excluding connections establishing)

$ pgbench -T 30 -c 1 -p 11000 -f /tmp/a.sql test [via Pgpool-II]
starting vacuum...end.
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 30 s
number of transactions actually processed: 30
latency average: 1000.000 ms
tps = 0.976569 (including connections establishing)
tps = 0.976780 (excluding connections establishing)

Summary:

As you can see, in the case 1 (very light weight query), Pgpool-II is
4.5 times slower than direct connection to PostgreSQL. Here, the query
is so light, we just measure the network latency and time to parse the
query.

In the case 2 (very heavy weight query), Pgpool-II showed almost same
performance as direct connection to PostgreSQL. Here, because the
query is so heavy, the network overhead and query parsing time was
hidden.

Those two cases are both extreme. Probably actual result is between 1
and 2 depending on how a query is heavy in the real world.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

> The parametes are:
> 
> num_init_children = 200
> 
> max_pool = 2
> 
> child_life_time = 300
> 
> connection_life_time = 300
> 
> client_idle_limit = 300
> 
> load_balance_mode = on
> 
> master_slave_mode = on
> 
> master_slave_sub_mode = 'stream'
> 
> My pgpool version is pgpool-II version 3.5.3 (ekieboshi)
> 
> My postgresql version is 9.4.4
> 
> My SO is CentOS release 6.7 (Final)
> 
> Thank you
> 
> Best regards
> 
> 
> -- 
> JotaComm
> http://jotacomm.wordpress.com