View Issue Details
| ID | Project | Category | View Status | Date Submitted | Last Update |
|---|---|---|---|---|---|
| 0000238 | Pgpool-II | Enhancement | public | 2016-08-29 18:52 | 2016-09-06 09:14 |
| Reporter | supp_k | Assigned To | t-ishii | ||
| Priority | high | Severity | feature | Reproducibility | always |
| Status | resolved | Resolution | open | ||
| Platform | x86_64 | OS | CentOS | OS Version | 6.x, 7.x |
| Product Version | 3.5.3 | ||||
| Target Version | 3.6.0 | ||||
| Summary | 0000238: Performance of SELECT queries | ||||
| Description | After integrating Pgpool-II into our product, we found that the performance of a standalone single-threaded SELECT is slowed down by 50%. It scales well as the number of clients grows, but a single-threaded SELECT now runs slowly. After quickly profiling pgpool's activity from the network perspective and counting the number of flush operations it performs, we found that every single record (tuple) returned by PostgreSQL is processed by pgpool separately. Although PostgreSQL returns the data in one single array of bytes, pgpool forwards the response to the client as separate (from the network perspective) pieces of data, with separate write/flush actions. From the OS's perspective, every flush to a network socket looks like a separate interrupt, which directly impacts pgpool's performance. We detected the described behavior with the help of the strace tool (https://en.wikipedia.org/wiki/Strace) and standalone internal tests. Please see the output below. From the log one can see that PostgreSQL returned 1024 bytes, but pgpool chunks it into separate packets forwarded to the client, which executes a simple SQL statement: "select id from table" (the table contains 16 records). | ||||
| Steps To Reproduce | We have two backends behind Pgpool-II. Please see the attached quick patch (not sure if it is correct). A chart reflecting the performance testing results is also attached: the blue series shows performance without pgpool, the orange series shows the result with pgpool; the X axis shows the number of concurrent clients, and the Y axis shows the number of requests per second. It can be seen that SELECT performance is significantly lower. | ||||
| Additional Information | 1472135729.734737 select(10, [9], NULL, [9], NULL) = 1 (in [9]) 1472135729.734792 read(9, "D\0\0\0008\0\n\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0010\0\0\0\0010\0\0\0\0011\0\0\0\0010\0\0\0\0010\0\0\0\0010D\0\0\0008\0\n\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0010\0\0\0\0010\0\0\0\0011\0\0\0\0010\0\0\0\0010\0\0\0\0011D\0\0\0008\0\n\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0010\0\0\0\0010\0\0\0\0011\0\0\0\0010\0\0\0\0011\0\0\0\0010D\0\0\0008\0\n\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0"..., 1024) = 1024 1472135729.734828 write(8, "D\0\0\0008\0\n\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0010\0\0\0\0010\0\0\0\0011\0\0\0\0010\0\0\0\0010\0\0\0\0010", 57) = 57 1472135729.734864 write(8, "D\0\0\0008\0\n\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0010\0\0\0\0010\0\0\0\0011\0\0\0\0010\0\0\0\0010\0\0\0\0011", 57) = 57 1472135729.734898 write(8, "D\0\0\0008\0\n\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0010\0\0\0\0010\0\0\0\0011\0\0\0\0010\0\0\0\0011\0\0\0\0010", 57) = 57 1472135729.734931 write(8, "D\0\0\0008\0\n\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0010\0\0\0\0010\0\0\0\0011\0\0\0\0010\0\0\0\0011\0\0\0\0011", 57) = 57 1472135729.734971 write(8, "D\0\0\0008\0\n\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0010\0\0\0\0010\0\0\0\0011\0\0\0\0011\0\0\0\0010\0\0\0\0010", 57) = 57 1472135729.735003 write(8, "D\0\0\0008\0\n\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0010\0\0\0\0010\0\0\0\0011\0\0\0\0011\0\0\0\0010\0\0\0\0011", 57) = 57 1472135729.735036 write(8, "D\0\0\0008\0\n\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0010\0\0\0\0010\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0010", 57) = 57 1472135729.735073 write(8, "D\0\0\0008\0\n\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0010\0\0\0\0010\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0011", 57) = 57 1472135729.735106 write(8, 
"D\0\0\0008\0\n\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0010\0\0\0\0011\0\0\0\0010\0\0\0\0010\0\0\0\0010\0\0\0\0010", 57) = 57 1472135729.735138 write(8, "D\0\0\0008\0\n\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0010\0\0\0\0011\0\0\0\0010\0\0\0\0010\0\0\0\0010\0\0\0\0011", 57) = 57 1472135729.735172 write(8, "D\0\0\0008\0\n\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0010\0\0\0\0011\0\0\0\0010\0\0\0\0010\0\0\0\0011\0\0\0\0010", 57) = 57 1472135729.735204 write(8, "D\0\0\0008\0\n\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0010\0\0\0\0011\0\0\0\0010\0\0\0\0010\0\0\0\0011\0\0\0\0011", 57) = 57 1472135729.735238 write(8, "D\0\0\0008\0\n\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0010\0\0\0\0011\0\0\0\0010\0\0\0\0011\0\0\0\0010\0\0\0\0010", 57) = 57 1472135729.735270 write(8, "D\0\0\0008\0\n\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0010\0\0\0\0011\0\0\0\0010\0\0\0\0011\0\0\0\0010\0\0\0\0011", 57) = 57 1472135729.735303 write(8, "D\0\0\0008\0\n\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0010\0\0\0\0011\0\0\0\0010\0\0\0\0011\0\0\0\0011\0\0\0\0010", 57) = 57 1472135729.735335 write(8, "D\0\0\0008\0\n\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0011\0\0\0\0010\0\0\0\0011\0\0\0\0010\0\0\0\0011\0\0\0\0011\0\0\0\0011", 57) = 57 1472135729.735368 write(8, "D\0\0\0008\0\n\0\0\ We have examined the code and implemented a fix (attached). It boosts pgpool's performance by a factor of about 1.5, though not consistently, since we admittedly do not understand the whole logic of the code. It seems pgpool's performance can be improved if the series of write/flush operations is replaced with one single write/flush. We are not using caching, and it seems parsing the whole PostgreSQL response is unnecessary here. We are ready to assist you with performance and stability testing and to provide real data about our test environment. Yours sincerely, Sergey HA architect http://www.odin.com | ||||
| Tags | No tags attached. | ||||
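The strace output above shows the PostgreSQL v3 wire-protocol framing at work: each DataRow is a 1-byte type (`D`) followed by a big-endian int32 length that counts itself but not the type byte, so the 57-byte writes carry a length field of 56. As an illustrative sketch (Python, not pgpool's actual C code; the function name `split_messages` is an assumption), the message boundaries inside a single 1024-byte read() buffer can be recovered like this:

```python
import struct

def split_messages(buf):
    """Split a PostgreSQL v3 wire buffer into (type, payload) tuples.

    Each message is a 1-byte type followed by a big-endian int32 length
    that includes itself but not the type byte."""
    msgs = []
    pos = 0
    while pos + 5 <= len(buf):
        mtype = buf[pos:pos + 1]
        (length,) = struct.unpack_from("!I", buf, pos + 1)
        if pos + 1 + length > len(buf):
            break  # partial message left at the end of the read
        msgs.append((mtype, buf[pos + 5:pos + 1 + length]))
        pos += 1 + length
    return msgs

# A fake 57-byte DataRow shaped like those in the strace: length field 56
# (4 length bytes + 52 bytes of row data).
row = b"D" + struct.pack("!I", 56) + b"x" * 52
buf = row * 3  # three rows delivered by one read()
```

This is why one read() from the backend can safely be forwarded as one write() to the frontend: the row boundaries are fully determined by the length fields.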
**supp_k (note 0001038):** We believe that modifying the following code would significantly boost the performance of SELECTs. In the V3_5_STABLE branch (https://git.postgresql.org/gitweb/?p=pgpool2.git;a=shortlog;h=refs/heads/V3_5_STABLE), source file pool_proto_modules.c, line 2378:

```c
default:
    status = SimpleForwardToFrontend(kind, frontend, backend);
    if (pool_flush(frontend))
        return POOL_END;
    break;
```

We think the "pool_flush" call here can be replaced with an additional check like the one already performed, for instance, in the `int pool_write_noerror(POOL_CONNECTION *cp, void *buf, int len)` function, which executes the actual write only when the end of the buffer is reached. By reducing the number of flushes we can reduce the number of operations performed by the OS and CPU. Of course we don't understand the whole logic of the code, but what we observe is the performance of simple SELECTs that return several records. PostgreSQL itself does the same: it returns one byte array instead of several chunked pieces of the whole result set when several records (tuples) are involved. Again, we are ready and willing to assist you with performance and stability testing of the patch. Yours sincerely, Sergey, HA architect, http://www.odin.com
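The buffer-until-full behavior the reporter attributes to pool_write_noerror can be modeled as follows. This is a hypothetical Python sketch (the class name `WriteBuffer` and its interface are assumptions; pgpool's real implementation is C and differs):

```python
class WriteBuffer:
    """Illustrative buffered writer: data is appended to an internal
    buffer, and an OS-level write happens only when the buffer fills
    or flush() is called explicitly."""

    def __init__(self, sink, bufsize=8192):
        self.sink = sink        # one call to sink == one write() syscall
        self.bufsize = bufsize
        self.buf = bytearray()
        self.syscalls = 0

    def write(self, data):
        self.buf += data
        if len(self.buf) >= self.bufsize:
            self.flush()

    def flush(self):
        if self.buf:
            self.sink(bytes(self.buf))
            self.syscalls += 1
            self.buf.clear()
```

With this model, the run of 57-byte DataRow writes visible in the strace collapses into a single OS-level write issued by the final flush.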
**t-ishii (note 0001039):** Since SimpleForwardToFrontend is called for each row of data received from the backend, the code you referred to causes a system call per row. Yes, that is pretty expensive. However, using pool_write_noerror instead is pointless, since writing to the socket is already done inside SimpleForwardToFrontend. The attached diff (performance.diff) avoids flushing in SimpleForwardToFrontend when a DataRow message is handled, and removes the pool_flush call after SimpleForwardToFrontend. For message types other than DataRow, flushing is still done in SimpleForwardToFrontend. I got a 40 to 50% performance boost using pgbench and a custom query (SELECT * FROM pgbench_accounts) on my laptop with the attached diff (performance.diff). Can you please test it?
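The deferred-flush idea described in the note above, buffer DataRow messages and let the next non-DataRow message carry them out in a single write, can be sketched like this. This is an illustrative Python model (the function name `forward` and the message representation are assumptions, not pgpool code):

```python
def forward(messages, sink):
    """Batch DataRow ('D') messages and issue one OS-level write when
    any other message type (e.g. CommandComplete 'C') arrives.

    `messages` is a list of (type, raw_bytes) pairs already framed for
    the wire; `sink` is called once per OS-level write."""
    pending = bytearray()
    writes = 0
    for mtype, raw in messages:
        pending += raw
        if mtype != b"D":       # RowDescription, CommandComplete, etc.
            sink(bytes(pending))
            pending.clear()
            writes += 1
    if pending:                 # trailing rows with no terminator yet
        sink(bytes(pending))
        writes += 1
    return writes
```

For the 16-row result in the report, the 16 DataRows ride along with the CommandComplete message in one write instead of 16 separate ones.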
**supp_k (note 0001040):** I see the performance boost of 50%, but not always; the performance increase fluctuates for unknown reasons. We have dedicated hardware for these tests. Also, I see a macro in the attached diff file (+#ifdef NOT_USED ...). Should I activate it, or does it have a different purpose?
**t-ishii (note 0001041):** My patch does not improve performance if the query in question returns a small number of rows. For example, "pgbench -S" does not show an improvement, because the SELECT query in that pattern returns only 1 row at a time. I also don't see an improvement when resources are saturated, because the effect of the patch is masked: for example, when a large number of concurrent queries are running.

> Also I see the macros in the attached diff file (+#ifdef NOT_USED ... ).
> Should I activate it? Or it has a different purpose?

It's a kind of custom in C coding :-) It means that the programmer wants to remove that part of the code temporarily. I will remove it before commit.
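Why the gain depends on the row count can be seen with a toy flush-count model. This is illustrative arithmetic only, not measured data: assume a result set consists of one RowDescription, N DataRows, one CommandComplete, and one ReadyForQuery, with unpatched pgpool flushing after every message and the patch skipping the flush only for DataRows.

```python
def flushes(nrows, patched):
    """Toy flush count for forwarding one SELECT result set.

    Unpatched: every message (RowDescription, each DataRow,
    CommandComplete, ReadyForQuery) triggers its own flush.
    Patched: DataRows are not flushed individually; they ride along
    with the next non-DataRow flush."""
    non_data = 3  # RowDescription + CommandComplete + ReadyForQuery
    return non_data if patched else nrows + non_data
```

For a 16-row result this drops from 19 flushes to 3, while for a 1-row result (the "pgbench -S" pattern) it only drops from 4 to 3, which matches the observation that the patch helps little there.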
**supp_k (note 0001042):** We have eliminated all possible obstacles that could consume system resources, but we still see that the performance boost for one thread is not stable. Sometimes it gives a 50% performance boost; it is unpredictable. Can you please recommend what factors may influence the performance of your patch?
**supp_k (note 0001043):** Is it possible to implement a similar optimization for data passed to the backends? The strace utility shows that a simple "select 1" is transferred to the backend in separate flushes.
**supp_k (note 0001044):** The maximum stable performance growth we could reach is 50%, for the query (select 1). Is it possible to flush the whole request to the backend, and the whole response from it, each in one operation? In any case the network peers (the DB client and PostgreSQL in this case) wait for the whole array of data to be received, so we believe the performance can be boosted even further.
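For the frontend-to-backend direction discussed above, a simple-protocol query is already a single Query ('Q') message, so in principle it can be forwarded with one write. A hypothetical Python sketch of the v3 framing (the helper names `msg` and `simple_query` are assumptions for illustration):

```python
import struct

def msg(mtype, payload):
    """Frame one v3 protocol message: type byte + big-endian int32
    length (which includes itself but not the type byte) + payload."""
    return mtype + struct.pack("!I", len(payload) + 4) + payload

def simple_query(sql):
    """Build a simple-protocol Query ('Q') message: the SQL text,
    NUL-terminated, in a single message a proxy can forward with
    one write()."""
    return msg(b"Q", sql.encode() + b"\x00")
```

So "select 1" becomes one 14-byte message; any per-field flushing on the backend side would be pure syscall overhead.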
**t-ishii (note 0001045):** That depends on the message types. Please define "whole message"; I don't think it is possible in general. Anyway, I got an 80% performance increase in the same test case by tweaking Pgpool-II further. Please test it if you like (performance-v2.diff). BTW, the bug tracker is not the perfect place for this kind of discussion. Please move it to the mailing list.
**supp_k (note 0001047):** Yes, the fix gives an improvement in performance; we see a 25-50% increase. Would it be possible to include the patch in the main codebase? Thank you!
**t-ishii (note 0001048):** Yes, that's my plan.
| Date Modified | Username | Field | Change |
|---|---|---|---|
| 2016-08-29 18:52 | supp_k | New Issue | |
| 2016-08-29 18:52 | supp_k | File Added: flush_move.patch | |
| 2016-08-29 18:52 | supp_k | File Added: chart_performance.png | |
| 2016-08-30 07:37 | t-ishii | Assigned To | => t-ishii |
| 2016-08-30 07:37 | t-ishii | Status | new => assigned |
| 2016-08-30 07:37 | t-ishii | Additional Information Updated | |
| 2016-08-31 05:44 | supp_k | Note Added: 0001038 | |
| 2016-08-31 13:56 | t-ishii | Note Added: 0001039 | |
| 2016-08-31 13:56 | t-ishii | File Added: performance.diff | |
| 2016-08-31 13:57 | t-ishii | Note Edited: 0001039 | |
| 2016-08-31 13:57 | t-ishii | Status | assigned => feedback |
| 2016-08-31 17:30 | supp_k | Note Added: 0001040 | |
| 2016-08-31 17:30 | supp_k | Status | feedback => assigned |
| 2016-08-31 17:44 | t-ishii | Note Added: 0001041 | |
| 2016-08-31 21:36 | supp_k | Note Added: 0001042 | |
| 2016-08-31 22:15 | supp_k | Note Added: 0001043 | |
| 2016-09-01 04:50 | supp_k | Note Added: 0001044 | |
| 2016-09-02 07:57 | t-ishii | Note Added: 0001045 | |
| 2016-09-02 07:58 | t-ishii | File Added: performance-v2.diff | |
| 2016-09-06 06:15 | supp_k | Note Added: 0001047 | |
| 2016-09-06 09:12 | t-ishii | Note Added: 0001048 | |
| 2016-09-06 09:13 | t-ishii | Target Version | => 3.6.0 |
| 2016-09-06 09:14 | t-ishii | Status | assigned => resolved |