View Issue Details
| ID | Project | Category | View Status | Date Submitted | Last Update |
|---|---|---|---|---|---|
| 0000315 | Pgpool-II | Enhancement | public | 2017-06-20 01:47 | 2017-08-29 09:25 |
| Reporter | aarevalo | Assigned To | Muhammad Usama | | |
| Priority | normal | Severity | major | Reproducibility | always |
| Status | closed | Resolution | open | | |
| Product Version | 3.6.4 | | | | |
| Summary | 0000315: High CPU usage when committing large transactions and using the in-memory (shared memory) cache | | | | |
Description

Pgpool version: 3.6.4
OS: Ubuntu Server 16

In-memory cache settings:

```
memory_cache_enabled = on
memqcache_method = 'shmem'
memqcache_total_size = 2147483648        # 2 GB
memqcache_max_num_cache = 1000000
memqcache_auto_cache_invalidation = on
memqcache_maxcache = 409600              # default, 400 KB
```

We have detected that some transactions, which involve a large number of queries and data rows, make Pgpool child processes get stuck for up to several minutes just after they receive the "COMMIT" statement. During this time the processes consume 100% of their CPU, until they are ready to accept the next connection.

We have tracked down the point where they get "stuck"; it is always the same place:

```
#0  AllocSetFree (context=0x12e3b50, pointer=0x53e5bb0) at ../../src/utils/mmgr/aset.c:965
#1  0x0000000000437cf7 in pool_discard_buffer (buffer=0x53d4a80) at query_cache/pool_memqcache.c:2928
#2  0x0000000000439a25 in pool_discard_temp_query_cache (temp_cache=0x53d4620) at query_cache/pool_memqcache.c:2817
#3  0x0000000000439aa5 in pool_discard_query_cache_array (cache_array=0xeeb5a40) at query_cache/pool_memqcache.c:2750
#4  0x0000000000439b9f in pool_reset_memqcache_buffer () at query_cache/pool_memqcache.c:1677
#5  0x000000000043cf15 in pool_handle_query_cache (backend=backend@entry=0x131e490, query=query@entry=0x14f6dc0 "COMMIT", node=node@entry=0x134e4a0, state=<optimized out>) at query_cache/pool_memqcache.c:3285
#6  0x0000000000435fbc in ReadyForQuery (frontend=frontend@entry=0x131ff10, backend=backend@entry=0x131e490, send_ready=send_ready@entry=1 '\001', cache_commit=cache_commit@entry=1 '\001') at protocol/pool_proto_modules.c:1942
#7  0x000000000043662c in ProcessBackendResponse (frontend=frontend@entry=0x131ff10, backend=backend@entry=0x131e490, state=state@entry=0x7fffae4523fc, num_fields=num_fields@entry=0x7fffae4523fa) at protocol/pool_proto_modules.c:2567
#8  0x000000000042ae9e in pool_process_query (frontend=0x131ff10, backend=0x131e490, reset_request=reset_request@entry=0) at protocol/pool_process_query.c:303
#9  0x0000000000425781 in do_child (fds=fds@entry=0x11e64c0) at protocol/child.c:377
#10 0x0000000000409a45 in fork_a_child (fds=0x11e64c0, id=89) at main/pgpool_main.c:755
#11 0x000000000040ac4e in reaper () at main/pgpool_main.c:2525
#12 0x000000000040d9ad in pool_sleep (second=<optimized out>) at main/pgpool_main.c:2741
#13 0x000000000040f9ef in PgpoolMain (discard_status=discard_status@entry=0 '\000', clear_memcache_oidmaps=clear_memcache_oidmaps@entry=0 '\000') at main/pgpool_main.c:533
#14 0x00000000004081ec in main (argc=<optimized out>, argv=<optimized out>) at main/main.c:300
```

After some investigation, it seems that the processes consume huge amounts of CPU trying to free previously allocated chunks of memory, which they do by looping over a singly linked list. As this piece of code comes from the PostgreSQL project, we have found that PostgreSQL has applied some optimizations in this area (https://github.com/postgres/postgres/commit/ff97741bc810390db6dd4da0f31ee1e93c8d3abb) that could be back-ported to Pgpool.

The in-memory cache works really well most of the time, but at peak times it makes our system completely unresponsive, as it becomes a bottleneck even with dedicated hardware (16x Xeon E5-2689 v4 @ 3.10 GHz cores, 64 GB RAM).
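To make the cost pattern described above concrete, the following is a minimal, hypothetical C sketch of freeing blocks that are tracked only by a singly linked list. It is not the actual aset.c or pool_memqcache.c code, and the identifiers (Block, alloc_block, free_block) are invented for illustration. Each free has to walk the list from the head to unlink its block, so releasing N blocks in a row degenerates to roughly O(N^2) pointer chasing, the same class of busy loop the backtrace shows inside AllocSetFree().

```c
/*
 * Hypothetical illustration only -- NOT the actual Pgpool/PostgreSQL code.
 * Blocks are kept on a singly linked list, so unlinking one block requires
 * scanning from the head.  Freeing all N blocks therefore costs O(N^2).
 */
#include <stdio.h>
#include <stdlib.h>

typedef struct Block
{
	struct Block *next;			/* singly linked: no back-pointer */
	size_t		size;
} Block;

static Block *blocks = NULL;	/* head of the allocated-block list */

/* O(1): new blocks are pushed onto the front of the list */
static Block *
alloc_block(size_t size)
{
	Block	   *b = malloc(sizeof(Block));

	b->size = size;
	b->next = blocks;
	blocks = b;
	return b;
}

/*
 * O(N): to unlink 'target' we must walk the list until we find the link
 * that points at it.  This is the scan that burns CPU when many blocks
 * are discarded back to back.
 */
static void
free_block(Block *target)
{
	Block	  **link = &blocks;

	while (*link != NULL && *link != target)
		link = &(*link)->next;
	if (*link == target)
	{
		*link = target->next;	/* unlink */
		free(target);
	}
}

int
main(void)
{
	enum {N = 100000};
	Block	  **handles = malloc(N * sizeof(Block *));

	for (int i = 0; i < N; i++)
		handles[i] = alloc_block(64);

	/*
	 * Freeing in allocation order is the worst case: each free scans
	 * almost the whole remaining list.  On typical hardware this loop
	 * takes several seconds purely because of the quadratic scan.
	 */
	for (int i = 0; i < N; i++)
		free_block(handles[i]);

	free(handles);
	printf("done\n");
	return 0;
}
```

Keeping a back-pointer in each block (a doubly linked list) would make every unlink O(1) and remove the scan entirely; whether the referenced PostgreSQL commit takes exactly that approach is not spelled out here, but the reporter's point is that the upstream memory-manager optimizations target this kind of overhead and could be back-ported.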
Tags: No tags attached.
Notes

Muhammad Usama (Note 0001562, 2017-06-30 00:56)

Hi,

I have imported all the changes made to PostgreSQL's memory manager API since it was first brought into Pgpool-II:
https://git.postgresql.org/gitweb?p=pgpool2.git;a=commitdiff;h=85392b89b5791cb3dceb59c6567f47911758467e

Can you please check whether it improves performance for the case you describe? You will need to build Pgpool-II from source code to perform the test.

Thanks,
Best Regards
aarevalo (Note 0001578, 2017-07-07 16:52)

We compiled Pgpool and it has been running in our production environment under heavy load (a sustained 20K+ TPS) for the past two days. We have not seen the previous behaviour, so it is now working as expected. Moreover, we see an improvement in CPU usage: it now uses about 25% less processor resources (measuring load average and user CPU time), probably related to the improvements in memory management.

Thanks for the great work!
Issue History

| Date Modified | Username | Field | Change |
|---|---|---|---|
| 2017-06-20 01:47 | aarevalo | New Issue | |
| 2017-06-20 11:20 | t-ishii | Assigned To | => Muhammad Usama |
| 2017-06-20 11:20 | t-ishii | Status | new => assigned |
| 2017-06-30 00:56 | Muhammad Usama | Note Added: 0001562 | |
| 2017-07-07 16:52 | aarevalo | Note Added: 0001578 | |
| 2017-08-29 09:25 | pengbo | Status | assigned => closed |