[pgpool-general: 959] Re: Query Memory Cache goes OOM
Sebastian Kornehl
webmaster at sourcebase.org
Wed Sep 5 20:22:14 JST 2012
I've waited ~30 minutes after the machine was unresponsive to check if
it came back - and it came. pgpool log says:
2012-09-05 12:37:25 LOG: pid 8812: statement: SELECT value FROM data
WHERE id =16239;
2012-09-05 13:03:53 ERROR: pid 12575: pool_read: EOF encountered with
backend
2012-09-05 13:03:55 ERROR: pid 12575: pool_read_kind: error while
reading message kind
2012-09-05 13:04:05 ERROR: pid 13114: SimpleForwardToFrontend:
pool_write_and_flush failed
2012-09-05 13:04:05 LOG: pid 13114: do_child: exits with status 1 due
to error
[.. this comes a few times, then ..]
2012-09-05 13:04:08 LOG: pid 8730: do_child: exits with status 1 due
to error
2012-09-05 13:04:09 ERROR: pid 8812: SimpleForwardToFrontend:
pool_write_and_flush failed
2012-09-05 13:04:15 LOG: pid 8812: do_child: exits with status 1 due
to error
2012-09-05 13:04:47 LOG: pid 9338: authentication is timeout
2012-09-05 13:04:54 LOG: pid 9691: connection received: host=[...]
port=42784
2012-09-05 13:04:54 LOG: pid 960: PCP child 1596 exits with status 9
by signal 9
2012-09-05 13:04:56 LOG: pid 960: fork a new PCP child pid 14474
2012-09-05 13:04:56 LOG: pid 960: worker child 1597 exits with status
9 by signal 9
2012-09-05 13:04:56 LOG: pid 960: fork a new worker child pid 14475
2012-09-05 13:05:01 LOG: pid 3927: statement: show pool_cache;
syslog says:
2012 Sep 5 13:04:11 pgpool-1 [err]: kernel: Out of memory: Killed
process 1596, UID 0, (pgpool).
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: pgpool invoked
oom-killer: gfp_mask=0x200d2, order=0, oomkilladj=0
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel:
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: Call Trace:
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: [<ffffffff802c1b64>]
out_of_memory+0x8b/0x203
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: [<ffffffff8020fa5d>]
__alloc_pages+0x27f/0x308
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: [<ffffffff80211c32>]
do_wp_page+0x3e9/0x8e8
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: [<ffffffff802336b6>]
read_swap_cache_async+0x42/0xd1
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: [<ffffffff802c6c25>]
swapin_readahead+0x4e/0x77
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: [<ffffffff8020996a>]
__handle_mm_fault+0xf11/0x144f
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: [<ffffffff80231f76>]
release_sock+0x13/0xbe
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: [<ffffffff80207116>]
kmem_cache_free+0x84/0xd7
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: [<ffffffff80207116>]
kmem_cache_free+0x84/0xd7
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: [<ffffffff80263929>]
_spin_lock_irqsave+0x9/0x14
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: [<ffffffff80266d94>]
do_page_fault+0xf72/0x131b
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: [<ffffffff8041e973>]
sys_accept+0x1b6/0x1d2
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: [<ffffffff80207116>]
kmem_cache_free+0x84/0xd7
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: [<ffffffff80263929>]
_spin_lock_irqsave+0x9/0x14
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: [<ffffffff802956b6>]
recalc_sigpending+0xe/0x25
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: [<ffffffff8025f82b>]
error_exit+0x0/0x6e
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel:
2012 Sep 5 13:04:11 pgpool-1 [info]: kernel: Mem-info:
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: DMA per-cpu:
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: cpu 0 hot: high 0,
batch 1 used:0
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: cpu 0 cold: high 0,
batch 1 used:0
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: cpu 1 hot: high 0,
batch 1 used:0
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: cpu 1 cold: high 0,
batch 1 used:0
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: cpu 2 hot: high 0,
batch 1 used:0
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: cpu 2 cold: high 0,
batch 1 used:0
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: cpu 3 hot: high 0,
batch 1 used:0
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: cpu 3 cold: high 0,
batch 1 used:0
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: DMA32 per-cpu:
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: cpu 0 hot: high 186,
batch 31 used:74
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: cpu 0 cold: high 62,
batch 15 used:16
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: cpu 1 hot: high 186,
batch 31 used:46
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: cpu 1 cold: high 62,
batch 15 used:42
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: cpu 2 hot: high 186,
batch 31 used:10
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: cpu 2 cold: high 62,
batch 15 used:46
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: cpu 3 hot: high 186,
batch 31 used:29
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: cpu 3 cold: high 62,
batch 15 used:52
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: Normal per-cpu: empty
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: HighMem per-cpu: empty
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: Free pages: 10124kB
(0kB HighMem)
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: Active:3296
inactive:947386 dirty:0 writeback:0 unstable:0 free:2531 slab:8265
mapped-file:2016 mapped-anon:950927 pagetables:33607
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: DMA free:2040kB
min:16kB low:20kB high:24kB active:0kB inactive:0kB present:9076kB
pages_scanned:0 all_unreclaimable? yes
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: lowmem_reserve[]: 0
4024 4024 4024
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: DMA32 free:8084kB
min:8108kB low:10132kB high:12160kB active:13184kB inactive:3789544kB
present:4120800kB pages_scanned:11144945 all_unreclaimable? yes
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: lowmem_reserve[]: 0 0 0 0
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: Normal free:0kB min:0kB
low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0
all_unreclaimable? no
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: lowmem_reserve[]: 0 0 0 0
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: HighMem free:0kB
min:128kB low:128kB high:128kB active:0kB inactive:0kB present:0kB
pages_scanned:0 all_unreclaimable? no
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: lowmem_reserve[]: 0 0 0 0
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: DMA: 0*4kB 1*8kB 1*16kB
1*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 2040kB
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: DMA32: 17*4kB 4*8kB
3*16kB 0*32kB 0*64kB 2*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 1*4096kB
= 8084kB
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: Normal: empty
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: HighMem: empty
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: 26511 pagecache pages
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: Swap cache: add
3128627, delete 3120367, find 1948467/2124994, race 35+122
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: Free swap = 0kB
2012 Sep 5 13:04:11 pgpool-1 [warning]: kernel: Total swap = 1048568kB
2012 Sep 5 13:04:11 pgpool-1 [info]: kernel: Free swap: 0kB
2012 Sep 5 13:04:11 pgpool-1 [info]: kernel: 1048576 pages of RAM
2012 Sep 5 13:04:11 pgpool-1 [info]: kernel: 23824 reserved pages
2012 Sep 5 13:04:11 pgpool-1 [info]: kernel: 159440 pages shared
2012 Sep 5 13:04:11 pgpool-1 [info]: kernel: 8394 pages swap cached
On 09/05/2012 12:24 PM, Sebastian Kornehl wrote:
> It's a fresh set up just for playing around with pgpool, the Database
> and pgpool have their own machines with no other services.
>
> -Sebastian
>
> On 09/05/2012 12:16 PM, Tatsuo Ishii wrote:
>>> I changed this, actual I test this with 20 loops. This seems to hold,
>>> but still uses lot of memory (and swap).
>>>
>>> Maybe I just misunderstand the config values, is
>>> "memqcache_total_size" a per connection configuration or a total
>>> value?
>> No. Because it sits on shared memory, it is shared by all connections.
>> Are there any servers running on the machine other than pgpool?
>> --
>> Tatsuo Ishii
>> SRA OSS, Inc. Japan
>> English: http://www.sraoss.co.jp/index_en.php
>> Japanese: http://www.sraoss.co.jp
>>
>>> show pool_cache gives now:
>>>
>>> num_cache_hits | num_selects | cache_hit_ratio | num_hash_entries |
>>> used_hash_entries | num_cache_entries | used_cache_enrties_size |
>>> free_cache_entries_size | fragment_cache_entries_size
>>> ----------------+-------------+-----------------+------------------+-------------------+-------------------+-------------------------+----------------------
>>>
>>> ---+-----------------------------
>>> 137 | 17411 | 0.01 | 1048576 | 13026 | 13026 | 1512533 | 260121555
>>> | 299816
>>>
>>> With 20 selects per second (and 15 seconds until memqcache_expire) I
>>> would say there should be ~300 entries in the cache and not 13026.
>>>
>>> With 31 selects per second 'top' looks like this:
>>>
>>> Mem: 4194304k total, 4180556k used, 13748k free, 1144k buffers
>>> Swap: 1048568k total, 1048560k used, 8k free, 160984k cached
>>>
>>> Thanks,
>>>
>>> Sebastian
>>>
>>> On 09/05/2012 10:56 AM, Tatsuo Ishii wrote:
>>>> Thanks, but I didn't find any evidence of actual memory leak.
>>>> BTW,
>>>>>>> memqcache_max_num_cache = 10
>>>> looks way too small.
>>>>
>>>> Can you set to default and try again?
>>>>
>>>> memqcache_max_num_cache = 1000000
>>>> --
>>>> Tatsuo Ishii
>>>> SRA OSS, Inc. Japan
>>>> English: http://www.sraoss.co.jp/index_en.php
>>>> Japanese: http://www.sraoss.co.jp
>>>>
>>>>> Hi,
>>>>>
>>>>> I attached the log file.
>>>>>
>>>>> Thanks for your Help!
>>>>>
>>>>> Sebastian
>>>>>
>>>>> On 09/05/2012 04:23 AM, Tatsuo Ishii wrote:
>>>>>> I couldn't reproduce your problem here.
>>>>>> Can you try to run with valgrind to detect memory leak?
>>>>>> Just in case how to use valgrind:
>>>>>>
>>>>>> 1) start pgpool by following command
>>>>>> valgrind --trace-children=yes --leak-check=full pgpool
>>>>>> [your pgpool
>>>>>> options here...]
>>>>>> Note that you need to -n option of pgpool and redirect
>>>>>> pgpool log to
>>>>>> somewhere.
>>>>>>
>>>>>> 2) stop pgpool.
>>>>>>
>>>>>> 3) send pgpool log.
>>>>>> --
>>>>>> Tatsuo Ishii
>>>>>> SRA OSS, Inc. Japan
>>>>>> English: http://www.sraoss.co.jp/index_en.php
>>>>>> Japanese: http://www.sraoss.co.jp
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I'm testing the new query cache (pgpool-II version 3.2.0) on
>>>>>>> CentOS 5.
>>>>>>>
>>>>>>> System Specs:
>>>>>>> 4 cores
>>>>>>> 4 GB Ram
>>>>>>>
>>>>>>> shmmax= 2147483648 (=2 GB)
>>>>>>>
>>>>>>> When I start pgpool, my memory usage looks like this:
>>>>>>>
>>>>>>> Mem: 4194304k total, 660016k used, 3534288k free, 62116k buffers
>>>>>>>
>>>>>>> My test scenario is the following:
>>>>>>>
>>>>>>> # Config:
>>>>>>>
>>>>>>> memory_cache_enabled = on
>>>>>>> memqcache_method = 'shmem'
>>>>>>> memqcache_total_size = 262144000 #250MB
>>>>>>> memqcache_max_num_cache = 10
>>>>>>> memqcache_expire = 15
>>>>>>> memqcache_auto_cache_invalidation = off
>>>>>>> memqcache_maxcache = 409600 #400kb
>>>>>>> memqcache_cache_block_size = 1048576 #1MB
>>>>>>> memqcache_oiddir = '/var/log/pgpool/oiddir'
>>>>>>>
>>>>>>> Table: data
>>>>>>> id integer pkey
>>>>>>> value character varying(255)
>>>>>>> [some more columns]
>>>>>>>
>>>>>>> SELECT value FROM data WHERE id =${randomNumber};
>>>>>>>
>>>>>>> so when I start the first
>>>>>>> while true; do psql -c "foobar ..." ; sleep 1; done
>>>>>>>
>>>>>>> everything looks good. When I start 9 more loops (= 10 selects per
>>>>>>> second), the used memory increases fast. After ~2 minutes it looks
>>>>>>> like this:
>>>>>>>
>>>>>>> Mem: 4194304k total, 2439012k used, 1755292k free, 62588k buffers
>>>>>>>
>>>>>>> after 2 more minutes:
>>>>>>>
>>>>>>> Mem: 4194304k total, 4053796k used, 140508k free, 62752k buffers
>>>>>>>
>>>>>>> When I add more clients, the machine begins to swap and dies.
>>>>>>>
>>>>>>> show pool_cache gives:
>>>>>>>
>>>>>>> num_cache_hits | num_selects | cache_hit_ratio |
>>>>>>> num_hash_entries |
>>>>>>> used_hash_entries | num_cache_entries |
>>>>>>> used_cache_enrties_size |
>>>>>>> free_cache_entries_size | fragment_cache_entries_size
>>>>>>> ----------------+-------------+-----------------+------------------+-------------------+-------------------+-------------------------+----------------------
>>>>>>>
>>>>>>> ---+-----------------------------
>>>>>>> 0 | 1527 | 0.00 | 16 | 15 | 15 | 1746 | 262142254 | 0
>>>>>>> (1 row)
>>>>>>>
>>>>>>> So my question:
>>>>>>>
>>>>>>> What I'm doing wrong? The plan is to connect more Frontends with
>>>>>>> 300+
>>>>>>> statements per second.
>>>>>>> When I turn memory_cache_enabled off the problem does not occur.
>>>>>>>
>>>>>>> Thanks in advance!
>>>>>>>
>>>>>>> Sebastian
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> pgpool-general mailing list
>>>>>>> pgpool-general at pgpool.net
>>>>>>> http://www.pgpool.net/mailman/listinfo/pgpool-general
>
> _______________________________________________
> pgpool-general mailing list
> pgpool-general at pgpool.net
> http://www.pgpool.net/mailman/listinfo/pgpool-general
More information about the pgpool-general
mailing list