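[Bounty: 20 yuan] Suspected off-heap memory leak; looking for ideas and suggestions
- OS: Linux
- JDK version: JDK 8
- Memory: 8 GB
- CPU cores: null
- OS word size: 64-bit
Problem description
A production service: after it has been running for a while, RES keeps growing to more than 3 GB or 4 GB, until the process is killed by the system.
Environment
Startup parameters: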
-Xms1024m -Xmx1024m -XX:NativeMemoryTracking=summary -XX:MetaspaceSize=64m -XX:MaxMetaspaceSize=256m
Memory records
5.19
RES
2736.08 MB
Thread Nums
605
Native Memory:
jcmd $pid VM.native_memory summary scale=MB
Total: reserved=3298MB, committed=2140MB
- Java Heap (reserved=1024MB, committed=1024MB)
(mmap: reserved=1024MB, committed=1024MB)
- Class (reserved=1149MB, committed=139MB)
(classes #19913)
(malloc=9MB #53910)
(mmap: reserved=1140MB, committed=130MB)
- Thread (reserved=610MB, committed=610MB)
(thread #605)
(stack: reserved=606MB, committed=606MB)
(malloc=2MB #3026)
(arena=2MB #1208)
- Code (reserved=264MB, committed=116MB)
(malloc=21MB #27562)
(mmap: reserved=244MB, committed=96MB)
- GC (reserved=43MB, committed=43MB)
(malloc=6MB #972)
(mmap: reserved=37MB, committed=37MB)
- Compiler (reserved=1MB, committed=1MB)
- Internal (reserved=177MB, committed=177MB)
(malloc=177MB #36227)
- Symbol (reserved=24MB, committed=24MB)
(malloc=21MB #222179)
(arena=4MB #1)
- Native Memory Tracking (reserved=5MB, committed=5MB)
(tracking overhead=5MB)
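The Thread section above shows 610 MB committed for 605 threads, roughly 1 MB of stack per thread. To see which pools those threads belong to, something like the following could be run inside the service (or the same grouping applied to thread names taken from a jstack dump). This is only a sketch: the class name `ThreadGroups` and the digit-collapsing rule are my own assumptions, not anything the service already has.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ThreadGroups {
    // Collapses digit runs so "pool-3-thread-17" and "pool-4-thread-2" share one key.
    static String groupKey(String threadName) {
        return threadName.replaceAll("\\d+", "#");
    }

    // Counts thread names per collapsed key.
    static Map<String, Integer> countByGroup(Iterable<String> names) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String name : names) {
            counts.merge(groupKey(name), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Only sees threads of the JVM this code runs in.
        List<String> names = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            names.add(t.getName());
        }
        countByGroup(names).forEach((key, count) -> System.out.println(count + "\t" + key));
    }
}
```

A group with an unexpectedly large count would point at a thread pool that is created repeatedly or never shut down.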
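Suspicions
Suspected: DirectBuffer memory is not being released and reclaimed
On JDK 8, when -XX:MaxDirectMemorySize is not specified, the default limit on direct memory is roughly the same as -Xmx. So if the problem were DirectBuffer memory not being reclaimed, usage should not be able to exceed the -Xmx configured above (1024 MB) without the JVM throwing java.lang.OutOfMemoryError: Direct buffer memory.
RES has grown well past that, yet this error does not appear anywhere in the logs.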
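One way to check direct buffer usage directly, instead of inferring it from the absence of the error, is the standard `BufferPoolMXBean`. The following is a minimal sketch; the class name `DirectBufferUsage` is made up for illustration, and in the running service the same values can be read over JMX from the `java.nio:type=BufferPool,name=direct` bean.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

public class DirectBufferUsage {
    // Returns the bytes currently used by the "direct" buffer pool, or -1 if no such pool exists.
    static long directMemoryUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        // Allocate 1 MB of direct memory so the pool is visibly non-empty in this demo.
        ByteBuffer keepAlive = ByteBuffer.allocateDirect(1024 * 1024);
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            System.out.println(pool.getName()
                    + ": count=" + pool.getCount()
                    + " used=" + pool.getMemoryUsed()
                    + " capacity=" + pool.getTotalCapacity());
        }
        System.out.println("demo buffer capacity=" + keepAlive.capacity());
    }
}
```

If the "direct" pool stays small while RES keeps growing, DirectBuffer can probably be ruled out.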
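Suspected: off-heap memory allocated by native code (C code)
With NativeMemoryTracking enabled, the committed total it reports (2140 MB) is smaller than the physical memory the process actually occupies (RES 2736.08 MB). The memory shown by jcmd covers the Java heap, the Code area, and memory allocated through unsafe.allocateMemory and DirectByteBuffer, but it does not cover off-heap memory allocated by other native code (C code).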
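To make the gap concrete, the sketch below subtracts the NMT committed total from RES, using the figures recorded above. The class and method names are hypothetical. Committed memory is not necessarily resident, so the result is only a rough estimate of what NMT does not account for.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RssGap {
    // Matches the "Total:" line of `jcmd <pid> VM.native_memory summary scale=MB`.
    private static final Pattern NMT_TOTAL =
            Pattern.compile("Total: reserved=(\\d+)MB, committed=(\\d+)MB");

    // Extracts the committed figure (in MB) from NMT summary text.
    static long nmtCommittedMb(String nmtSummary) {
        Matcher m = NMT_TOTAL.matcher(nmtSummary);
        if (!m.find()) {
            throw new IllegalArgumentException("no NMT Total line found");
        }
        return Long.parseLong(m.group(2));
    }

    // RES minus NMT committed: memory the JVM's own tracking does not explain.
    static double untrackedMb(double resMb, String nmtSummary) {
        return resMb - nmtCommittedMb(nmtSummary);
    }

    public static void main(String[] args) {
        String total = "Total: reserved=3298MB, committed=2140MB"; // from the record above
        double resMb = 2736.08;                                     // RES from the record above
        System.out.println("untracked MB (approx): " + untrackedMb(resMb, total));
    }
}
```

For the record above this comes to about 596 MB. Tracking this number over time would show whether the growth is happening outside the areas NMT covers.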
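What I have tried
Heap dump analysis
A dump of live objects comes to only a little over 300 MB. With the 1024 MB heap allocated at startup, heap usage stays normal right up until the service dies.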
using thread-local object allocation.
Parallel GC with 4 thread(s)
Heap Configuration:
MinHeapFreeRatio = 0
MaxHeapFreeRatio = 100
MaxHeapSize = 1073741824 (1024.0MB)
NewSize = 357564416 (341.0MB)
MaxNewSize = 357564416 (341.0MB)
OldSize = 716177408 (683.0MB)
NewRatio = 2
SurvivorRatio = 8
MetaspaceSize = 67108864 (64.0MB)
CompressedClassSpaceSize = 1073741824 (1024.0MB)
MaxMetaspaceSize = 268435456 (256.0MB)
G1HeapRegionSize = 0 (0.0MB)
Heap Usage:
PS Young Generation
Eden Space:
capacity = 353370112 (337.0MB)
used = 137123824 (130.77146911621094MB)
free = 216246288 (206.22853088378906MB)
38.80459024219909% used
From Space:
capacity = 2097152 (2.0MB)
used = 1261680 (1.2032318115234375MB)
free = 835472 (0.7967681884765625MB)
60.161590576171875% used
To Space:
capacity = 2097152 (2.0MB)
used = 0 (0.0MB)
free = 2097152 (2.0MB)
0.0% used
PS Old Generation
capacity = 716177408 (683.0MB)
used = 338325072 (322.6519317626953MB)
free = 377852336 (360.3480682373047MB)
47.240399965255534% used
51997 interned Strings occupying 5600608 bytes.
GC analysis
The GC output also looks normal; nothing stands out.
jstat -gc -h5 14837 5000
S0C S1C S0U S1U EC EU OC OU MC MU CCSC CCSU YGC YGCT FGC FGCT GCT
2048.0 2048.0 1376.1 0.0 345088.0 216691.7 699392.0 330307.6 133248.0 124961.8 14720.0 13360.8 53010 497.625 92 24.405 522.031
2048.0 2048.0 1376.1 0.0 345088.0 326280.2 699392.0 330307.6 133248.0 124961.8 14720.0 13360.8 53010 497.625 92 24.405 522.031
2048.0 2048.0 0.0 1376.1 345088.0 71113.9 699392.0 330331.6 133248.0 124961.8 14720.0 13360.8 53011 497.634 92 24.405 522.039
2048.0 2048.0 0.0 1376.1 345088.0 169572.3 699392.0 330331.6 133248.0 124961.8 14720.0 13360.8 53011 497.634 92 24.405 522.039
2048.0 2048.0 0.0 1376.1 345088.0 268892.8 699392.0 330331.6 133248.0 124961.8 14720.0 13360.8 53011 497.634 92 24.405 522.039
S0C S1C S0U S1U EC EU OC OU MC MU CCSC CCSU YGC YGCT FGC FGCT GCT
2048.0 2048.0 1296.1 0.0 345088.0 40181.7 699392.0 330363.6 133248.0 124961.8 14720.0 13360.8 53012 497.644 92 24.405 522.050
2048.0 2048.0 1296.1 0.0 345088.0 131806.8 699392.0 330363.6 133248.0 124961.8 14720.0 13360.8 53012 497.644 92 24.405 522.050
2048.0 2048.0 1296.1 0.0 345088.0 220248.8 699392.0 330363.6 133248.0 124961.8 14720.0 13360.8 53012 497.644 92 24.405 522.050
2048.0 2048.0 1296.1 0.0 345088.0 299581.8 699392.0 330363.6 133248.0 124961.8 14720.0 13360.8 53012 497.644 92 24.405 522.050
2048.0 2048.0 0.0 1232.1 345088.0 53573.3 699392.0 330395.6 133248.0 124961.8 14720.0 13360.8 53013 497.653 92 24.405 522.058
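Dumping and printing memory regions
I tried pmap.
I looked for large contiguous memory regions and did not find a set of regions with the same size (64 MB). I then dumped some of the larger regions (a bit over 64 MB, 80 MB, and 40 MB) and ran strings on them.
Most of the output is unreadable, and I could not spot any clue in it.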
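The region search can also be scripted against /proc/<pid>/maps rather than read off pmap output by eye. Below is a minimal sketch that lists anonymous mappings above a size threshold, largest first. The class name, the 40 MB threshold and the "five fields means anonymous" rule are my assumptions.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class LargeAnonRegions {
    static final class Region {
        final long start;
        final long end;
        final String perms;

        Region(long start, long end, String perms) {
            this.start = start;
            this.end = end;
            this.perms = perms;
        }

        long sizeKb() {
            return (end - start) / 1024;
        }
    }

    // Parses /proc/<pid>/maps lines and keeps anonymous mappings of at least minKb.
    static List<Region> largeAnonymous(List<String> mapsLines, long minKb) {
        List<Region> result = new ArrayList<>();
        for (String line : mapsLines) {
            String[] fields = line.trim().split("\\s+");
            if (fields.length != 5) {
                continue; // a 6th field is a path or a name such as [heap] or [stack]
            }
            String[] range = fields[0].split("-");
            if (range.length != 2) {
                continue;
            }
            Region region = new Region(
                    Long.parseUnsignedLong(range[0], 16),
                    Long.parseUnsignedLong(range[1], 16),
                    fields[1]);
            if (region.sizeKb() >= minKb) {
                result.add(region);
            }
        }
        result.sort((a, b) -> Long.compare(b.sizeKb(), a.sizeKb()));
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Linux only. Pass a pid as the first argument, or inspect this JVM itself.
        String pid = args.length > 0 ? args[0] : "self";
        List<String> lines = Files.readAllLines(Paths.get("/proc/" + pid + "/maps"));
        for (Region r : largeAnonymous(lines, 40 * 1024)) {
            System.out.println(Long.toHexString(r.start) + "-" + Long.toHexString(r.end)
                    + " " + r.perms + " " + r.sizeKb() + " KB");
        }
    }
}
```

Running this periodically and comparing the lists would show which address ranges appear or grow as RES climbs.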
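Question
This service does some compression and decompression, and a fair amount of its business logic involves file I/O. At the moment there is no way to reproduce the problem in the test environment.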
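Since compression is involved, one thing that could be audited is whether every java.util.zip Deflater and Inflater is released with end(). Both hold native zlib memory that NMT does not report, and without end() it is only freed when the object is finalized. Whether this is the cause here is not confirmed. The sketch below only shows the release pattern; the class and method names are illustrative.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class ZipNativeRelease {
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // frees the native zlib state now instead of waiting for finalization
        }
    }

    static byte[] decompress(byte[] compressed) throws DataFormatException {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new DataFormatException("truncated input or preset dictionary required");
                }
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            inflater.end(); // same for the inflater
        }
    }
}
```

The same applies to streams built on top of them, such as GZIPInputStream and GZIPOutputStream, which need close() on every code path.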
I would appreciate it if anyone could take a look; I am out of ideas for now.