[pgpool-general: 7722] Re: Out of memory

Luiz Pasqual luiz at pasquall.com
Mon Sep 27 21:50:12 JST 2021


I see.

The problem is that the OOM happens when all of the RAM is already consumed by
pgpool (all 16 GB of the machine are dedicated to pgpool), so any ordinary
request will die anyway.

It happens on versions 4.2.5 and 4.2.1. I'm now running the same
application on 3.7.3 and it's running great.

I don't know if it helps, but to show how much RAM is needed to run just a
few connections, I created another VM for comparison:

62 active connections running on 4.2.1 (same behaviour on 4.2.5):
$ vmstat -a -SM
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free  inact active   si   so    bi    bo   in   cs us sy id wa st
 0  0      0   4829    173  10772    0    0     2     1   68   56  1  1 99  0  0

31 active connections running on 3.7.3:
$ vmstat -a -SM
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free  inact active   si   so    bi    bo   in   cs us sy id wa st
 0  0      0  15174     73    635    0    0     1     1   76    3  0  0 100  0  0

Both are running basically the same application.

Looking at the RAM consumption on 3.7.3, it does not seem to be receiving
such big requests. Is there a way to track this down?
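
In case it helps anyone reproduce the comparison, a rough way to sum the
memory of the pgpool processes themselves (only an approximation: RSS counts
shared pages once per process, so the total is overstated) is something like:

$ ps -C pgpool -o rss= | awk '{sum+=$1} END {printf "%.0f MB\n", sum/1024}'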



On Mon, Sep 27, 2021 at 9:07 AM Tatsuo Ishii <ishii at sraoss.co.jp> wrote:

> At this point I cannot judge whether the problem is caused by a pgpool
> bug or by a client resource request that is simply too large.
>
> A typical memory-allocation bug does not look like this, because the
> request of 33947648 bytes (about 32 MB) is not in itself insane:
>
> > Sep 24 09:14:10 pgpool pgpool[12650]: [426-2] 2021-09-24 09:14:10: pid 12650: DETAIL:  Failed on request of size 33947648.
>
> (Please let us know what version of Pgpool-II you are using, because
> that is important information for identifying any known bugs.)
>
> In the meantime, however, I think a 32 MB memory request is not very
> common in pgpool. One thing I wonder is whether your application
> issues SQL that requires a lot of memory: e.g. a very long SQL
> statement, or COPY of a large data set. Those will request large
> read/write buffers in pgpool.
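>
> For example (just a sketch, not verified; the LIMIT is arbitrary), the
> longest statements currently running could be checked on the PostgreSQL
> side with something like the command below. Note that pg_stat_activity
> truncates query text at track_activity_query_size, so that setting may
> need to be raised for the lengths to be meaningful:
>
> $ psql -c "SELECT pid, state, length(query) AS query_len FROM pg_stat_activity ORDER BY length(query) DESC LIMIT 10;"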
>
> > We saw both, but pgpool aborting is way more common:
> > Sep 24 09:14:10 pgpool pgpool[12650]: [426-1] 2021-09-24 09:14:10: pid 12650: ERROR:  out of memory
> > Sep 24 09:14:10 pgpool pgpool[12650]: [426-2] 2021-09-24 09:14:10: pid 12650: DETAIL:  Failed on request of size 33947648.
> > Sep 24 09:14:10 pgpool pgpool[12650]: [426-3] 2021-09-24 09:14:10: pid 12650: LOCATION:  mcxt.c:900
> >
> > Here are two other errors we saw in the logs, each of which occurred only once:
> > Sep 24 07:33:14 pgpool pgpool[5874]: [434-1] 2021-09-24 07:33:14: pid 5874: FATAL:  failed to fork a child
> > Sep 24 07:33:14 pgpool pgpool[5874]: [434-2] 2021-09-24 07:33:14: pid 5874: DETAIL:  system call fork() failed with reason: Cannot allocate memory
> > Sep 24 07:33:14 pgpool pgpool[5874]: [434-3] 2021-09-24 07:33:14: pid 5874: LOCATION:  pgpool_main.c:681
> >
> > And:
> > Sep 23 17:07:40 pgpool kernel: [157160.691518] pgpool invoked oom-killer: gfp_mask=0x24200ca, order=0, oom_score_adj=0
> > Sep 23 17:07:40 pgpool kernel: [157160.691525] CPU: 1 PID: 1194 Comm: pgpool Not tainted 4.4.276 #1
> > Sep 23 17:07:40 pgpool kernel: [157160.691527] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 12/12/2018
> > Sep 23 17:07:40 pgpool kernel: [157160.691528]  0000000000000000 ffff8803281cbae8 ffffffff81c930e7 ffff880420efe2c0
> > Sep 23 17:07:40 pgpool kernel: [157160.691530]  ffff880420efe2c0 ffff8803281cbb50 ffffffff81c8d9da ffff8803281cbb08
> > Sep 23 17:07:40 pgpool kernel: [157160.691532]  ffffffff81133f1a ffff8803281cbb80 ffffffff81182eb0 ffff8800bba393c0
> > Sep 23 17:07:40 pgpool kernel: [157160.691533] Call Trace:
> > Sep 23 17:07:40 pgpool kernel: [157160.691540]  [<ffffffff81c930e7>] dump_stack+0x57/0x6d
> > Sep 23 17:07:40 pgpool kernel: [157160.691542]  [<ffffffff81c8d9da>] dump_header.isra.9+0x54/0x1ae
> > Sep 23 17:07:40 pgpool kernel: [157160.691547]  [<ffffffff81133f1a>] ? __delayacct_freepages_end+0x2a/0x30
> > Sep 23 17:07:40 pgpool kernel: [157160.691553]  [<ffffffff81182eb0>] ? do_try_to_free_pages+0x350/0x3d0
> > Sep 23 17:07:40 pgpool kernel: [157160.691556]  [<ffffffff811709f9>] oom_kill_process+0x209/0x3c0
> > Sep 23 17:07:40 pgpool kernel: [157160.691558]  [<ffffffff81170eeb>] out_of_memory+0x2db/0x2f0
> > Sep 23 17:07:40 pgpool kernel: [157160.691561]  [<ffffffff81176111>] __alloc_pages_nodemask+0xa81/0xae0
> > Sep 23 17:07:40 pgpool kernel: [157160.691565]  [<ffffffff811ad2cd>] __read_swap_cache_async+0xdd/0x130
> > Sep 23 17:07:40 pgpool kernel: [157160.691567]  [<ffffffff811ad337>] read_swap_cache_async+0x17/0x40
> > Sep 23 17:07:40 pgpool kernel: [157160.691569]  [<ffffffff811ad455>] swapin_readahead+0xf5/0x190
> > Sep 23 17:07:40 pgpool kernel: [157160.691571]  [<ffffffff8119ce3f>] handle_mm_fault+0xf3f/0x15e0
> > Sep 23 17:07:40 pgpool kernel: [157160.691574]  [<ffffffff81c9d4e2>] ? __schedule+0x272/0x770
> > Sep 23 17:07:40 pgpool kernel: [157160.691576]  [<ffffffff8104e241>] __do_page_fault+0x161/0x370
> > Sep 23 17:07:40 pgpool kernel: [157160.691577]  [<ffffffff8104e49c>] do_page_fault+0xc/0x10
> > Sep 23 17:07:40 pgpool kernel: [157160.691579]  [<ffffffff81ca3782>] page_fault+0x22/0x30
> > Sep 23 17:07:40 pgpool kernel: [157160.691581] Mem-Info:
> > Sep 23 17:07:40 pgpool kernel: [157160.691584] active_anon:3564689 inactive_anon:445592 isolated_anon:0
> > Sep 23 17:07:40 pgpool kernel: [157160.691584]  active_file:462 inactive_file:44 isolated_file:0
> > Sep 23 17:07:40 pgpool kernel: [157160.691584]  unevictable:0 dirty:2 writeback:2212 unstable:0
> > Sep 23 17:07:40 pgpool kernel: [157160.691584]  slab_reclaimable:3433 slab_unreclaimable:5859
> > Sep 23 17:07:40 pgpool kernel: [157160.691584]  mapped:989 shmem:2607 pagetables:16367 bounce:0
> > Sep 23 17:07:40 pgpool kernel: [157160.691584]  free:51773 free_pcp:189 free_cma:0
> > Sep 23 17:07:40 pgpool kernel: [157160.691589] DMA free:15904kB min:128kB low:160kB high:192kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
> > Sep 23 17:07:40 pgpool kernel: [157160.691590] lowmem_reserve[]: 0 2960 15991 15991
> > Sep 23 17:07:40 pgpool kernel: [157160.691596] DMA32 free:77080kB min:25000kB low:31248kB high:37500kB active_anon:2352676kB inactive_anon:590536kB active_file:472kB inactive_file:140kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3129216kB managed:3043556kB mlocked:0kB dirty:0kB writeback:4144kB mapped:1140kB shmem:3568kB slab_reclaimable:1028kB slab_unreclaimable:3004kB kernel_stack:816kB pagetables:10988kB unstable:0kB bounce:0kB free_pcp:312kB local_pcp:196kB free_cma:0kB writeback_tmp:0kB pages_scanned:4292 all_unreclaimable? yes
> > Sep 23 17:07:40 pgpool kernel: [157160.691597] lowmem_reserve[]: 0 0 13031 13031
> > Sep 23 17:07:40 pgpool kernel: [157160.691601] Normal free:114108kB min:110032kB low:137540kB high:165048kB active_anon:11906080kB inactive_anon:1191832kB active_file:1376kB inactive_file:36kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:13631488kB managed:13343784kB mlocked:0kB dirty:8kB writeback:4704kB mapped:2816kB shmem:6860kB slab_reclaimable:12704kB slab_unreclaimable:20432kB kernel_stack:4848kB pagetables:54480kB unstable:0kB bounce:0kB free_pcp:444kB local_pcp:196kB free_cma:0kB writeback_tmp:0kB pages_scanned:105664 all_unreclaimable? yes
> > Sep 23 17:07:40 pgpool kernel: [157160.691602] lowmem_reserve[]: 0 0 0 0
> > Sep 23 17:07:40 pgpool kernel: [157160.691603] DMA: 0*4kB 0*8kB 0*16kB 1*32kB (U) 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (U) 3*4096kB (M) = 15904kB
> > Sep 23 17:07:40 pgpool kernel: [157160.691610] DMA32: 48*4kB (ME) 63*8kB (ME) 62*16kB (E) 46*32kB (UME) 35*64kB (UME) 26*128kB (UME) 25*256kB (UME) 61*512kB (UME) 28*1024kB (UME) 1*2048kB (M) 0*4096kB = 77080kB
> > Sep 23 17:07:40 pgpool kernel: [157160.691616] Normal: 165*4kB (MEH) 317*8kB (UMEH) 442*16kB (UMEH) 289*32kB (UMEH) 162*64kB (UMEH) 121*128kB (UMEH) 74*256kB (UMEH) 33*512kB (UMEH) 24*1024kB (ME) 0*2048kB 2*4096kB (M) = 113980kB
> > Sep 23 17:07:40 pgpool kernel: [157160.691623] 5552 total pagecache pages
> > Sep 23 17:07:40 pgpool kernel: [157160.691624] 2355 pages in swap cache
> > Sep 23 17:07:40 pgpool kernel: [157160.691625] Swap cache stats: add 5385308, delete 5382953, find 1159094/1325033
> > Sep 23 17:07:40 pgpool kernel: [157160.691626] Free swap  = 0kB
> > Sep 23 17:07:40 pgpool kernel: [157160.691626] Total swap = 4194300kB
> > Sep 23 17:07:40 pgpool kernel: [157160.691627] 4194174 pages RAM
> > Sep 23 17:07:40 pgpool kernel: [157160.691628] 0 pages HighMem/MovableOnly
> > Sep 23 17:07:40 pgpool kernel: [157160.691628] 93363 pages reserved
> > Sep 23 17:07:40 pgpool kernel: [157160.691989] Out of memory: Kill process 8975 (pgpool) score 7 or sacrifice child
> > Sep 23 17:07:40 pgpool kernel: [157160.691995] Killed process 8975 (pgpool) total-vm:337504kB, anon-rss:166824kB, file-rss:1920kB
> >
> >
> >
> > On Mon, Sep 27, 2021 at 1:22 AM Tatsuo Ishii <ishii at sraoss.co.jp> wrote:
> >
> >> Hi,
> >>
> >> > Hello,
> >> >
> >> > Our pgpool is consuming A LOT of memory and frequently dies with an
> >> > "out of memory" error.
> >> >
> >> > We have 2 backends, 1 master and 1 slave. Here is some of the config:
> >> > num_init_children = 150
> >> > max_pool = 1
> >> > child_life_time = 300
> >> > child_max_connections = 1
> >> > connection_life_time = 0
> >> > client_idle_limit = 0
> >> > connection_cache = on
> >> > load_balance_mode = on
> >> > memory_cache_enabled = off
> >> >
> >> > RAM: 16Gb
> >> >
> >> > Does anyone have a clue what's going on?
> >> >
> >> > Thank you.
> >>
> >> Is that the OOM killer, or did pgpool itself abort with an out of
> >> memory error? If the latter, can you share the pgpool log?
> >> --
> >> Tatsuo Ishii
> >> SRA OSS, Inc. Japan
> >> English: http://www.sraoss.co.jp/index_en.php
> >> Japanese:http://www.sraoss.co.jp
> >>
> >
> >
> > --
> > Luiz Fernando Pasqual S. Souza
> > mail: luiz at pasquall.com
>


-- 
Luiz Fernando Pasqual S. Souza
mail: luiz at pasquall.com

