[pgpool-general: 959] Re: Query Memory Cache goes OOM

Sebastian Kornehl webmaster at sourcebase.org
Wed Sep 5 20:22:14 JST 2012


I waited ~30 minutes after the machine became unresponsive to check whether it 
would come back - and it did. The pgpool log says:

2012-09-05 12:37:25 LOG:   pid 8812: statement: SELECT value FROM data WHERE id =16239;
2012-09-05 13:03:53 ERROR: pid 12575: pool_read: EOF encountered with backend
2012-09-05 13:03:55 ERROR: pid 12575: pool_read_kind: error while reading message kind
2012-09-05 13:04:05 ERROR: pid 13114: SimpleForwardToFrontend: pool_write_and_flush failed
2012-09-05 13:04:05 LOG:   pid 13114: do_child: exits with status 1 due to error
[.. this comes a few times, then ..]
2012-09-05 13:04:08 LOG:   pid 8730: do_child: exits with status 1 due to error
2012-09-05 13:04:09 ERROR: pid 8812: SimpleForwardToFrontend: pool_write_and_flush failed
2012-09-05 13:04:15 LOG:   pid 8812: do_child: exits with status 1 due to error
2012-09-05 13:04:47 LOG:   pid 9338: authentication is timeout
2012-09-05 13:04:54 LOG:   pid 9691: connection received: host=[...] port=42784
2012-09-05 13:04:54 LOG:   pid 960: PCP child 1596 exits with status 9 by signal 9
2012-09-05 13:04:56 LOG:   pid 960: fork a new PCP child pid 14474
2012-09-05 13:04:56 LOG:   pid 960: worker child 1597 exits with status 9 by signal 9
2012-09-05 13:04:56 LOG:   pid 960: fork a new worker child pid 14475
2012-09-05 13:05:01 LOG:   pid 3927: statement: show pool_cache;

syslog says:

2012 Sep  5 13:04:11 pgpool-1 [err]: kernel: Out of memory: Killed 
process 1596, UID 0, (pgpool).
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: pgpool invoked 
oom-killer: gfp_mask=0x200d2, order=0, oomkilladj=0
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel:
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: Call Trace:
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: [<ffffffff802c1b64>] 
out_of_memory+0x8b/0x203
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: [<ffffffff8020fa5d>] 
__alloc_pages+0x27f/0x308
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: [<ffffffff80211c32>] 
do_wp_page+0x3e9/0x8e8
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: [<ffffffff802336b6>] 
read_swap_cache_async+0x42/0xd1
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: [<ffffffff802c6c25>] 
swapin_readahead+0x4e/0x77
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: [<ffffffff8020996a>] 
__handle_mm_fault+0xf11/0x144f
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: [<ffffffff80231f76>] 
release_sock+0x13/0xbe
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: [<ffffffff80207116>] 
kmem_cache_free+0x84/0xd7
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: [<ffffffff80207116>] 
kmem_cache_free+0x84/0xd7
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: [<ffffffff80263929>] 
_spin_lock_irqsave+0x9/0x14
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: [<ffffffff80266d94>] 
do_page_fault+0xf72/0x131b
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: [<ffffffff8041e973>] 
sys_accept+0x1b6/0x1d2
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: [<ffffffff80207116>] 
kmem_cache_free+0x84/0xd7
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: [<ffffffff80263929>] 
_spin_lock_irqsave+0x9/0x14
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: [<ffffffff802956b6>] 
recalc_sigpending+0xe/0x25
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: [<ffffffff8025f82b>] 
error_exit+0x0/0x6e
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel:
2012 Sep  5 13:04:11 pgpool-1 [info]: kernel: Mem-info:
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: DMA per-cpu:
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: cpu 0 hot: high 0, 
batch 1 used:0
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: cpu 0 cold: high 0, 
batch 1 used:0
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: cpu 1 hot: high 0, 
batch 1 used:0
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: cpu 1 cold: high 0, 
batch 1 used:0
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: cpu 2 hot: high 0, 
batch 1 used:0
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: cpu 2 cold: high 0, 
batch 1 used:0
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: cpu 3 hot: high 0, 
batch 1 used:0
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: cpu 3 cold: high 0, 
batch 1 used:0
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: DMA32 per-cpu:
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: cpu 0 hot: high 186, 
batch 31 used:74
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: cpu 0 cold: high 62, 
batch 15 used:16
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: cpu 1 hot: high 186, 
batch 31 used:46
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: cpu 1 cold: high 62, 
batch 15 used:42
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: cpu 2 hot: high 186, 
batch 31 used:10
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: cpu 2 cold: high 62, 
batch 15 used:46
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: cpu 3 hot: high 186, 
batch 31 used:29
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: cpu 3 cold: high 62, 
batch 15 used:52
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: Normal per-cpu: empty
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: HighMem per-cpu: empty
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: Free pages: 10124kB 
(0kB HighMem)
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: Active:3296 
inactive:947386 dirty:0 writeback:0 unstable:0 free:2531 slab:8265 
mapped-file:2016 mapped-anon:950927 pagetables:33607
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: DMA free:2040kB 
min:16kB low:20kB high:24kB active:0kB inactive:0kB present:9076kB 
pages_scanned:0 all_unreclaimable? yes
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: lowmem_reserve[]: 0 
4024 4024 4024
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: DMA32 free:8084kB 
min:8108kB low:10132kB high:12160kB active:13184kB inactive:3789544kB 
present:4120800kB pages_scanned:11144945 all_unreclaimable? yes
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: lowmem_reserve[]: 0 0 0 0
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: Normal free:0kB min:0kB 
low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 
all_unreclaimable? no
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: lowmem_reserve[]: 0 0 0 0
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: HighMem free:0kB 
min:128kB low:128kB high:128kB active:0kB inactive:0kB present:0kB 
pages_scanned:0 all_unreclaimable? no
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: lowmem_reserve[]: 0 0 0 0
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: DMA: 0*4kB 1*8kB 1*16kB 
1*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 2040kB
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: DMA32: 17*4kB 4*8kB 
3*16kB 0*32kB 0*64kB 2*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 1*4096kB 
= 8084kB
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: Normal: empty
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: HighMem: empty
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: 26511 pagecache pages
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: Swap cache: add 
3128627, delete 3120367, find 1948467/2124994, race 35+122
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: Free swap  = 0kB
2012 Sep  5 13:04:11 pgpool-1 [warning]: kernel: Total swap = 1048568kB
2012 Sep  5 13:04:11 pgpool-1 [info]: kernel: Free swap:            0kB
2012 Sep  5 13:04:11 pgpool-1 [info]: kernel: 1048576 pages of RAM
2012 Sep  5 13:04:11 pgpool-1 [info]: kernel: 23824 reserved pages
2012 Sep  5 13:04:11 pgpool-1 [info]: kernel: 159440 pages shared
2012 Sep  5 13:04:11 pgpool-1 [info]: kernel: 8394 pages swap cached



On 09/05/2012 12:24 PM, Sebastian Kornehl wrote:
> It's a fresh setup just for playing around with pgpool; the database and 
> pgpool each have their own machine with no other services running.
>
> -Sebastian
>
> On 09/05/2012 12:16 PM, Tatsuo Ishii wrote:
>>> I changed this; actually I tested it with 20 loops. This seems to hold,
>>> but it still uses a lot of memory (and swap).
>>>
>>> Maybe I just misunderstand the config values: is
>>> "memqcache_total_size" a per-connection setting or a total
>>> value?
>> No. Because it sits in shared memory, it is shared by all connections.
>> Are there any servers running on the machine other than pgpool?
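>>
>> (A quick way to sanity-check this, assuming memqcache_method = 'shmem' keeps
>> the whole cache in one System V shared memory segment, is something like:
>>
>>     # list shared memory segments; one segment of roughly memqcache_total_size
>>     # (250 MB in your config) should belong to the user running pgpool
>>     ipcs -m
>>
>> That segment is allocated once at the configured size and should not grow.)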
>> -- 
>> Tatsuo Ishii
>> SRA OSS, Inc. Japan
>> English: http://www.sraoss.co.jp/index_en.php
>> Japanese: http://www.sraoss.co.jp
>>
>>> show pool_cache gives now:
>>>
>>>   num_cache_hits               | 137
>>>   num_selects                  | 17411
>>>   cache_hit_ratio              | 0.01
>>>   num_hash_entries             | 1048576
>>>   used_hash_entries            | 13026
>>>   num_cache_entries            | 13026
>>>   used_cache_enrties_size      | 1512533
>>>   free_cache_entries_size      | 260121555
>>>   fragment_cache_entries_size  | 299816
>>>
>>> With 20 selects per second (and 15 seconds until memqcache_expire) I
>>> would say there should be ~300 entries in the cache and not 13026.
>>>
>>> With 31 selects per second 'top' looks like this:
>>>
>>> Mem: 4194304k total, 4180556k used, 13748k free, 1144k buffers
>>> Swap: 1048568k total, 1048560k used, 8k free, 160984k cached
>>>
>>> Thanks,
>>>
>>> Sebastian
>>>
>>> On 09/05/2012 10:56 AM, Tatsuo Ishii wrote:
>>>> Thanks, but I didn't find any evidence of an actual memory leak.
>>>> BTW,
>>>>>>> memqcache_max_num_cache = 10
>>>> looks way too small.
>>>>
>>>> Can you set it to the default and try again?
>>>>
>>>> memqcache_max_num_cache = 1000000
>>>> -- 
>>>> Tatsuo Ishii
>>>> SRA OSS, Inc. Japan
>>>> English: http://www.sraoss.co.jp/index_en.php
>>>> Japanese: http://www.sraoss.co.jp
>>>>
>>>>> Hi,
>>>>>
>>>>> I attached the log file.
>>>>>
>>>>> Thanks for your Help!
>>>>>
>>>>> Sebastian
>>>>>
>>>>> On 09/05/2012 04:23 AM, Tatsuo Ishii wrote:
>>>>>> I couldn't reproduce your problem here.
>>>>>> Can you try running with valgrind to detect a memory leak?
>>>>>> Just in case, here is how to use valgrind:
>>>>>>
>>>>>> 1) start pgpool with the following command:
>>>>>>       valgrind --trace-children=yes --leak-check=full pgpool
>>>>>>       [your pgpool options here...]
>>>>>>       Note that you need the -n option of pgpool and should redirect the
>>>>>>       pgpool log somewhere (a combined example follows after step 3).
>>>>>>
>>>>>> 2) stop pgpool.
>>>>>>
>>>>>> 3) send pgpool log.
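>>>>>>
>>>>>> For example, a combined invocation could look like this (config and log
>>>>>> paths are only examples; adjust them to your setup):
>>>>>>
>>>>>>       valgrind --trace-children=yes --leak-check=full \
>>>>>>           pgpool -n -f /etc/pgpool-II/pgpool.conf > /tmp/pgpool-valgrind.log 2>&1
>>>>>>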
>>>>>> -- 
>>>>>> Tatsuo Ishii
>>>>>> SRA OSS, Inc. Japan
>>>>>> English: http://www.sraoss.co.jp/index_en.php
>>>>>> Japanese: http://www.sraoss.co.jp
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I'm testing the new query cache (pgpool-II version 3.2.0) on 
>>>>>>> CentOS 5.
>>>>>>>
>>>>>>> System Specs:
>>>>>>> 4 cores
>>>>>>> 4 GB RAM
>>>>>>>
>>>>>>> shmmax= 2147483648 (=2 GB)
>>>>>>>
>>>>>>> When I start pgpool, my memory usage looks like this:
>>>>>>>
>>>>>>> Mem: 4194304k total, 660016k used, 3534288k free, 62116k buffers
>>>>>>>
>>>>>>> My test scenario is the following:
>>>>>>>
>>>>>>> # Config:
>>>>>>>
>>>>>>> memory_cache_enabled = on
>>>>>>> memqcache_method = 'shmem'
>>>>>>> memqcache_total_size =  262144000 #250MB
>>>>>>> memqcache_max_num_cache = 10
>>>>>>> memqcache_expire = 15
>>>>>>> memqcache_auto_cache_invalidation = off
>>>>>>> memqcache_maxcache = 409600 #400kb
>>>>>>> memqcache_cache_block_size = 1048576 #1MB
>>>>>>> memqcache_oiddir = '/var/log/pgpool/oiddir'
>>>>>>>
>>>>>>> Table: data
>>>>>>> id integer pkey
>>>>>>> value character varying(255)
>>>>>>> [some more columns]
>>>>>>>
>>>>>>> SELECT value FROM data WHERE id =${randomNumber};
>>>>>>>
>>>>>>> So when I start the first loop (a fuller sketch follows below),
>>>>>>>     while true; do psql -c "foobar ..." ; sleep 1; done
>>>>>>>
>>>>>>> everything looks good. When I start 9 more loops (= 10 selects per
>>>>>>> second), the used memory increases fast. After ~2 minutes it looks
>>>>>>> like this:
>>>>>>>
>>>>>>> Mem: 4194304k total, 2439012k used, 1755292k free, 62588k buffers
>>>>>>>
>>>>>>> after 2 more minutes:
>>>>>>>
>>>>>>> Mem: 4194304k total, 4053796k used, 140508k free, 62752k buffers
>>>>>>>
>>>>>>> When I add more clients, the machine begins to swap and dies.
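>>>>>>>
>>>>>>> For reference, each loop is roughly of this shape (host, database name
>>>>>>> and the id range via $RANDOM are placeholders for my actual values):
>>>>>>>
>>>>>>>      while true; do
>>>>>>>          # pgpool listens on port 9999 by default; $RANDOM stands in for ${randomNumber}
>>>>>>>          psql -h pgpool-host -p 9999 -c "SELECT value FROM data WHERE id = $RANDOM;" testdb
>>>>>>>          sleep 1
>>>>>>>      done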
>>>>>>>
>>>>>>> show pool_cache gives:
>>>>>>>
>>>>>>>     num_cache_hits               | 0
>>>>>>>     num_selects                  | 1527
>>>>>>>     cache_hit_ratio              | 0.00
>>>>>>>     num_hash_entries             | 16
>>>>>>>     used_hash_entries            | 15
>>>>>>>     num_cache_entries            | 15
>>>>>>>     used_cache_enrties_size      | 1746
>>>>>>>     free_cache_entries_size      | 262142254
>>>>>>>     fragment_cache_entries_size  | 0
>>>>>>> (1 row)
>>>>>>>
>>>>>>> So my question:
>>>>>>>
>>>>>>> What am I doing wrong? The plan is to connect more frontends with
>>>>>>> 300+ statements per second.
>>>>>>> When I turn memory_cache_enabled off, the problem does not occur.
>>>>>>>
>>>>>>> Thanks in advance!
>>>>>>>
>>>>>>> Sebastian
>>>>>>>


