[pgpool-committers: 2702] pgpool: Mega patch to enhance performance in extended protocol mode.
ishii at postgresql.org
Thu Sep 24 11:13:16 JST 2015
Mega patch to enhance performance in extended protocol mode.
For now it only affects streaming replication mode, that is, the
performance of other modes remains the same. In the future we may be
able to improve this.
The basic idea is to remove the "flush" message which used to be
sent at each step of the extended protocol (parse, bind, describe
and execute). This caused significant communication overhead. To
achieve this, the following modifications have been made.
- New data structure called "pending message queue" is created to
track which messages still have a pending response. When a
Parse/Bind/Close message is received, it is enqueued.
The information is used to process the response messages when
Parse complete, Bind complete and Close complete are received,
because they don't carry any information about the statement/portal.
- New state variable to represent whether any "pending response"
exists. If an extended protocol message, for example parse, has been
sent and the response to it (i.e. parse complete) has not yet been
received, the variable is set to true.
- If there's any pending response, do_query() sends a flush message to
retrieve any pending data from the primary backend and saves it. After
finishing the job, the saved data is restored. This keeps the
original response message sequence.
- Deal with a special case in read_kind_from_backend(). Certain clients
(at least the JDBC driver has this habit) do not send a sync message
after an execute message. See the comment in
read_kind_from_backend() for more details.
- New data structure called "sync map" is added to manage sync message
responses in streaming replication mode. In streaming replication
mode, there are at most two nodes involved. One is always the primary,
and the other (if any) is a standby. If the load balancing manager
chooses the primary as the load balancing node, only the primary node
is involved. Every time an extended protocol message is sent to a
backend, we set the corresponding member of the sync map to true. It
is cleared after the sync message is processed and the command
complete or error response message is processed.
- New parameter "bool nowait" is added to
pool_extended_send_and_wait() to control whether a flush message is
sent after sending an extended protocol command.
- Move CommandComplete() to new file src/protocol/CommandComplete.c to
make the maintainer's life easier.
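The pending message queue and the "pending response" state described in
the list above can be illustrated with a minimal C sketch. This is not
the code from this commit: the type and function names (pending_queue,
queue_push, queue_pop, has_pending_response) and the fixed capacity are
hypothetical and chosen only for illustration. It assumes replies arrive
in the same order as the messages were sent.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_PENDING 16   /* hypothetical capacity, for illustration only */
#define NAME_LEN    64   /* hypothetical limit on statement/portal names */

/* Frontend messages whose replies carry no statement/portal name. */
typedef enum { MSG_PARSE, MSG_BIND, MSG_CLOSE } msg_kind;

typedef struct {
    msg_kind kind;
    char     name[NAME_LEN];   /* statement or portal name */
} pending_msg;

/* Fixed-size FIFO implemented as a ring buffer. */
typedef struct {
    pending_msg items[MAX_PENDING];
    int head;    /* index of the oldest entry */
    int count;   /* number of queued entries */
} pending_queue;

void queue_init(pending_queue *q)
{
    assert(q != NULL);
    q->head = 0;
    q->count = 0;
}

/* Remember a message that was forwarded to the backend.
 * Returns 0 on success, -1 if the queue is full. */
int queue_push(pending_queue *q, msg_kind kind, const char *name)
{
    assert(q != NULL && name != NULL);
    if (q->count == MAX_PENDING)
        return -1;
    pending_msg *m = &q->items[(q->head + q->count) % MAX_PENDING];
    m->kind = kind;
    snprintf(m->name, NAME_LEN, "%s", name);   /* truncates long names */
    q->count++;
    return 0;
}

/* A reply arrived: fetch the oldest pending message so the caller knows
 * which statement/portal the reply belongs to.
 * Returns 0 on success, -1 if nothing is pending. */
int queue_pop(pending_queue *q, pending_msg *out)
{
    assert(q != NULL && out != NULL);
    if (q->count == 0)
        return -1;
    *out = q->items[q->head];
    q->head = (q->head + 1) % MAX_PENDING;
    q->count--;
    return 0;
}

/* "Pending response" state: non-zero while any reply is outstanding. */
int has_pending_response(const pending_queue *q)
{
    assert(q != NULL);
    return q->count > 0;
}
```

In this sketch, a caller would push an entry when Parse, Bind or Close
is forwarded and pop one when the matching completion message arrives;
has_pending_response() stands in for the state variable that tells
whether a flush is needed before an internal query.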
Initial performance evaluation:
- Hardware: Let's note CF-SX3, mem 16GB, SSD 512GB, Core i7-4600U (2 cores)
- OS: Ubuntu 14.04 LTS
- 2 PostgreSQL servers in a streaming replication setup created using pgpool_setup
- pgbench options: pgbench -M extended -c 16 -j 8 -S -T 30 test (scale=1)
- Results (excluding connections establishing):
pgpool-II 3.4.4: 7274.926104 TPS, this commit: 11290.026468 TPS
So the committed version is 55.2% faster than the 3.4 baseline measured above.
- parse_before_bind() in pool_proto_modules is ifdef'ed out. Even
without it, all regression tests pass. But I am not sure if
it's really harmless.
- When a parse message is sent, a deadlock can occur according to
the comment. I'm not sure if that is still true. If it is, we need
to fix it.
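The 55.2% figure quoted in the results above can be checked from the two
TPS numbers. A minimal sketch in C; the function name speedup_percent is
hypothetical and not part of pgpool-II or pgbench:

```c
#include <assert.h>

/* Relative gain of after_tps over before_tps, in percent. */
double speedup_percent(double before_tps, double after_tps)
{
    assert(before_tps > 0.0);
    return (after_tps / before_tps - 1.0) * 100.0;
}
```

Calling speedup_percent(7274.926104, 11290.026468) yields roughly 55.2.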
src/Makefile.am | 3 +-
src/Makefile.in | 7 +-
src/context/pool_query_context.c | 82 +++--
src/context/pool_session_context.c | 295 +++++++++++++++-
src/include/context/pool_query_context.h | 2 +-
src/include/context/pool_session_context.h | 63 ++++
src/include/pool.h | 1 +
src/include/protocol/pool_proto_modules.h | 5 +
src/protocol/pool_process_query.c | 282 +++++++++++++--
src/protocol/pool_proto_modules.c | 372 +++++++-------------
src/query_cache/pool_memqcache.c | 192 +++++++---
.../regression/tests/006.memqcache/jdbctest.java | 2 +-
src/test/regression/tests/051.bug60/test.sh | 6 +-
src/utils/pool_stream.c | 54 ++-
14 files changed, 983 insertions(+), 383 deletions(-)