1回复
4年前
G1 to-space exhausted 的问题
问题:线上出现告警,堆内存使用率超过95%
应用说明:该系统存储了很多账户-银行的配置信息,供业务系统查询使用,业务系统调用是全量获取的,所以会产生些大对象(将list转化为json返回)
JVM参数:
-Xms6144m-Xmx6144m-XX:+UseG1GC-XX:MaxGCPauseMillis=500-Xloggc:/home/finance/Logs/paycommon.msxf.lo/paycommon-gc.log-XX:+PrintGCDetails-XX:+PrintGCDateStamps-XX:+UseGCLogFileRotation-XX:NumberOfGCLogFiles=10-XX:GCLogFileSize=50M
检查GC日志,发现告警时间段附近的GC日志为:
2020-09-16T19:29:36.138+0800: 1113411.729: [GC pause (G1 Humongous Allocation) (young) (initial-mark), 0.0486270 secs]
[Parallel Time: 44.3 ms, GC Workers: 4]
[GC Worker Start (ms): Min: 1113411731.0, Avg: 1113411731.0, Max: 1113411731.0, Diff: 0.1]
[Ext Root Scanning (ms): Min: 3.8, Avg: 4.2, Max: 4.5, Diff: 0.6, Sum: 16.6]
[Update RS (ms): Min: 34.6, Avg: 34.7, Max: 34.7, Diff: 0.1, Sum: 138.7]
[Processed Buffers: Min: 184, Avg: 205.5, Max: 229, Diff: 45, Sum: 822]
[Scan RS (ms): Min: 1.0, Avg: 1.1, Max: 1.2, Diff: 0.2, Sum: 4.5]
[Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
[Object Copy (ms): Min: 3.6, Avg: 4.0, Max: 4.3, Diff: 0.7, Sum: 15.9]
[Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
[Termination Attempts: Min: 1, Avg: 1.2, Max: 2, Diff: 1, Sum: 5]
[GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
[GC Worker Total (ms): Min: 43.9, Avg: 44.0, Max: 44.0, Diff: 0.1, Sum: 175.8]
[GC Worker End (ms): Min: 1113411775.0, Avg: 1113411775.0, Max: 1113411775.0, Diff: 0.0]
[Code Root Fixup: 0.0 ms]
[Code Root Purge: 0.0 ms]
[Clear CT: 0.1 ms]
[Other: 4.3 ms]
[Choose CSet: 0.0 ms]
[Ref Proc: 0.2 ms]
[Ref Enq: 0.0 ms]
[Redirty Cards: 0.1 ms]
[Humongous Register: 0.1 ms]
[Humongous Reclaim: 1.2 ms]
[Free CSet: 0.2 ms]
[Eden: 14.0M(3682.0M)->0.0B(3680.0M) Survivors: 4096.0K->6144.0K Heap: 2634.4M(6144.0M)->1491.6M(6144.0M)]
[Times: user=0.18 sys=0.00, real=0.04 secs]
2020-09-16T19:29:36.187+0800: 1113411.778: [GC concurrent-root-region-scan-start]
2020-09-16T19:29:36.203+0800: 1113411.793: [GC concurrent-root-region-scan-end, 0.0154486 secs]
2020-09-16T19:29:36.203+0800: 1113411.793: [GC concurrent-mark-start]
2020-09-16T19:29:36.637+0800: 1113412.227: [GC concurrent-mark-end, 0.4340025 secs]
2020-09-16T19:29:36.651+0800: 1113412.241: [GC remark 2020-09-16T19:29:36.651+0800: 1113412.241: [Finalize Marking, 0.0014060 secs] 2020-09-16T19:29:366
.652+0800: 1113412.243: [GC ref-proc, 0.0011151 secs] 2020-09-16T19:29:36.653+0800: 1113412.244: [Unloading, 0.0640499 secs], 0.0700085 secs]
[Times: user=0.24 sys=0.00, real=0.07 secs]
2020-09-16T19:29:36.723+0800: 1113412.313: [GC cleanup 2085M->1645M(6144M), 0.0041280 secs]
[Times: user=0.02 sys=0.00, real=0.00 secs]
2020-09-16T19:29:36.727+0800: 1113412.318: [GC concurrent-cleanup-start]
2020-09-16T19:29:36.728+0800: 1113412.318: [GC concurrent-cleanup-end, 0.0003667 secs]
2020-09-16T21:34:35.527+0800: 1120911.117: [GC pause (G1 Evacuation Pause) (young) (to-space exhausted), 1.2081112 secs] [Parallel Time: 226.4 ms, GC Workers: 4] [GC Worker Start (ms): Min: 1120911125.5, Avg: 1120911125.6, Max: 1120911125.7, Diff: 0.1] [Ext Root Scanning (ms): Min: 4.0, Avg: 4.6, Max: 5.7, Diff: 1.6, Sum: 18.4] [Update RS (ms): Min: 23.9, Avg: 24.0, Max: 24.2, Diff: 0.3, Sum: 96.1] [Processed Buffers: Min: 98, Avg: 109.5, Max: 119, Diff: 21, Sum: 438] [Scan RS (ms): Min: 2.4, Avg: 2.5, Max: 2.6, Diff: 0.2, Sum: 10.1] [Code Root Scanning (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.3] [Object Copy (ms): Min: 190.3, Avg: 193.9, Max: 195.4, Diff: 5.1, Sum: 775.5] [Termination (ms): Min: 0.0, Avg: 0.9, Max: 3.7, Diff: 3.7, Sum: 3.8] [Termination Attempts: Min: 1, Avg: 1.2, Max: 2, Diff: 1, Sum: 5] [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.0, Sum: 0.2] [GC Worker Total (ms): Min: 226.0, Avg: 226.1, Max: 226.3, Diff: 0.3, Sum: 904.3] [GC Worker End (ms): Min: 1120911351.6, Avg: 1120911351.7, Max: 1120911351.8, Diff: 0.2] [Code Root Fixup: 0.0 ms] [Code Root Purge: 0.0 ms] [Clear CT: 0.8 ms] [Other: 980.8 ms] [Evacuation Failure: 968.8 ms] [Choose CSet: 0.0 ms] [Ref Proc: 0.2 ms] [Ref Enq: 0.0 ms] [Redirty Cards: 0.2 ms] [Humongous Register: 7.7 ms] [Humongous Reclaim: 2.2 ms] [Free CSet: 1.2 ms] [Eden: 2942.0M(3680.0M)->0.0B(306.0M) Survivors: 6144.0K->0.0B Heap: 5889.8M(6144.0M)->2864.0M(6144.0M)] [Times: user=4.17 sys=0.03, real=1.21 secs]
2020-09-16T21:56:31.801+0800: 1122227.392: [GC pause (G1 Evacuation Pause) (mixed), 0.0330733 secs] [Parallel Time: 29.8 ms, GC Workers: 4] [GC Worker Start (ms): Min: 1122227392.3, Avg: 1122227392.4, Max: 1122227392.4, Diff: 0.1] [Ext Root Scanning (ms): Min: 3.4, Avg: 3.8, Max: 4.5, Diff: 1.2, Sum: 15.2] [Update RS (ms): Min: 21.0, Avg: 21.4, Max: 21.5, Diff: 0.5, Sum: 85.6] [Processed Buffers: Min: 35, Avg: 50.8, Max: 59, Diff: 24, Sum: 203] [Scan RS (ms): Min: 1.0, Avg: 1.2, Max: 1.5, Diff: 0.5, Sum: 4.9] [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] [Object Copy (ms): Min: 2.6, Avg: 3.2, Max: 3.7, Diff: 1.1, Sum: 12.7] [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 4] [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1] [GC Worker Total (ms): Min: 29.6, Avg: 29.6, Max: 29.7, Diff: 0.1, Sum: 118.5] [GC Worker End (ms): Min: 1122227422.0, Avg: 1122227422.0, Max: 1122227422.0, Diff: 0.0] [Code Root Fixup: 0.0 ms] [Code Root Purge: 0.0 ms] [Clear CT: 0.3 ms] [Other: 3.0 ms] [Choose CSet: 0.1 ms] [Ref Proc: 1.4 ms] [Ref Enq: 0.0 ms] [Redirty Cards: 0.2 ms] [Humongous Register: 0.1 ms] [Humongous Reclaim: 0.0 ms] [Free CSet: 0.7 ms] [Eden: 306.0M(306.0M)->0.0B(2744.0M) Survivors: 0.0B->2048.0K Heap: 3170.0M(6144.0M)->2602.0M(6144.0M)] [Times: user=0.11 sys=0.01, real=0.03 secs]
在19:29分有一次并发标记(初始标记的YGC为大对象分配的原因)的日志,然后一直到21:34分,发生了一次to-space exhausted
可以看到21:34分GC之前的堆使用率已经是5889m/6114m=96%,超过告警阈值。
问题:
为啥在19:29~21:34分内存增长到5889M 96%之前没有发生过一次并发标记和MIX GC? 并发标记的触发不是heap使用率达到45%就触发么?
我的理解是是不是由于并发标记中的初始标记是依赖YGC的,然后此时YGC一直没触发(因为Young区还没满),导致并发标记没办法进行,堆一直往上涨,一直到to-space exhausted?
求教社区的各位大大,感谢
4000 阅读