G1 GC: the number of regions fluctuates widely, and the fluctuation is periodic
This is a production environment. The key JVM options are:
-XX:+UseG1GC -Xmx32g -Xms32g -Xss256k
-XX:+UnlockExperimentalVMOptions
-XX:MaxGCPauseMillis=200
-XX:MetaspaceSize=320m -XX:MaxMetaspaceSize=320m
-XX:InitiatingHeapOccupancyPercent=70
-XX:+PrintAdaptiveSizePolicy
-XX:+ParallelRefProcEnabled
-XX:ConcGCThreads=20
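While the service runs in production, traffic and workload stay essentially constant, yet the region count fluctuates widely, from a peak of 1100+ down to a low of 100+ (the region size stays at 16M throughout). The change is also periodic, with a period of roughly 55 minutes to 1 hour: the region count shrinks gradually and then jumps back up suddenly. Part of the GC log follows: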
{Heap before GC invocations=176604 (full 0):
garbage-first heap total 33554432K, used 11745581K [0x00007f71b0000000, 0x00007f71b1004000, 0x00007f79b0000000)
region size 16384K, 309 young (5062656K), 13 survivors (212992K)
Metaspace used 44235K, capacity 45164K, committed 45824K, reserved 47104K
2019-12-24T18:23:06.365+0800: 1656728.104: [GC pause (G1 Evacuation Pause) (young)
Desired survivor size 327155712 bytes, new threshold 15 (max 15)
- age 1: 45895088 bytes, 45895088 total
- age 2: 9512504 bytes, 55407592 total
- age 3: 3381480 bytes, 58789072 total
- age 4: 15717320 bytes, 74506392 total
- age 5: 8075840 bytes, 82582232 total
- age 6: 7113560 bytes, 89695792 total
- age 7: 6970456 bytes, 96666248 total
- age 8: 7290024 bytes, 103956272 total
1656728.105: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 112516, predicted base time: 31.41 ms, remaining time: 168.59 ms, target pause time: 200.00 ms]
1656728.105: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 296 regions, survivors: 13 regions, predicted young region time: 14.47 ms]
1656728.105: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 296 regions, survivors: 13 regions, old: 0 regions, predicted pause time: 45.89 ms, target pause time: 200.00 ms]
, 0.0419178 secs]
[Parallel Time: 30.4 ms, GC Workers: 28]
[GC Worker Start (ms): Min: 1656728106.1, Avg: 1656728106.3, Max: 1656728106.5, Diff: 0.3]
[Ext Root Scanning (ms): Min: 2.9, Avg: 3.7, Max: 20.9, Diff: 18.0, Sum: 105.0]
[Update RS (ms): Min: 0.0, Avg: 16.0, Max: 16.9, Diff: 16.9, Sum: 449.4]
[Processed Buffers: Min: 0, Avg: 87.6, Max: 123, Diff: 123, Sum: 2454]
[Scan RS (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 1.5]
[Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
[Object Copy (ms): Min: 8.8, Avg: 9.9, Max: 10.0, Diff: 1.2, Sum: 277.5]
[Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2]
[Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 28]
[GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.3, Diff: 0.2, Sum: 4.2]
[GC Worker Total (ms): Min: 29.7, Avg: 29.9, Max: 30.2, Diff: 0.5, Sum: 837.7]
[GC Worker End (ms): Min: 1656728136.1, Avg: 1656728136.2, Max: 1656728136.3, Diff: 0.2]
[Code Root Fixup: 0.2 ms]
[Code Root Purge: 0.0 ms]
[Clear CT: 0.7 ms]
[Other: 10.7 ms]
[Choose CSet: 0.0 ms]
[Ref Proc: 5.6 ms]
[Ref Enq: 0.3 ms]
[Redirty Cards: 0.4 ms]
[Humongous Register: 0.1 ms]
[Humongous Reclaim: 0.0 ms]
[Free CSet: 0.7 ms]
[Eden: 4736.0M(4736.0M)->0.0B(19.0G) Survivors: 208.0M->192.0M Heap: 11.2G(32.0G)->6706.6M(32.0G)]
Heap after GC invocations=176605 (full 0):
garbage-first heap total 33554432K, used 6867574K [0x00007f71b0000000, 0x00007f71b1004000, 0x00007f79b0000000)
region size 16384K, 12 young (196608K), 12 survivors (196608K)
Metaspace used 44235K, capacity 45164K, committed 45824K, reserved 47104K
}
[Times: user=0.61 sys=0.16, real=0.05 secs]
After this GC, the eden capacity went from 4736M to 19G.
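As a rough check of those numbers, the eden capacities in the `[Eden: ...]` summary line can be converted into region counts. The following is only a minimal sketch: the class and method names (`EdenSummary`, `edenRegions`) are made up for illustration, the regular expression assumes the log format shown above, and because the log prints `19.0G` rounded to one decimal, the second count is approximate.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: read the eden capacity before/after a GC from a G1 "[Eden: ...]" line. */
final class EdenSummary {
    // Matches "[Eden: used(capacity)->used(capacity)" and captures both capacities,
    // e.g. "[Eden: 4736.0M(4736.0M)->0.0B(19.0G)".
    private static final Pattern EDEN = Pattern.compile(
        "\\[Eden: [0-9.]+[BKMG]\\(([0-9.]+)([BKMG])\\)->[0-9.]+[BKMG]\\(([0-9.]+)([BKMG])\\)");

    /** Converts a value with a B/K/M/G suffix to megabytes. */
    static double toMegabytes(double value, char unit) {
        switch (unit) {
            case 'B': return value / (1024.0 * 1024.0);
            case 'K': return value / 1024.0;
            case 'M': return value;
            case 'G': return value * 1024.0;
            default: throw new IllegalArgumentException("unknown unit: " + unit);
        }
    }

    /** Returns {edenRegionsBefore, edenRegionsAfter} for the given summary line. */
    static long[] edenRegions(String line, int regionSizeMb) {
        Matcher m = EDEN.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("no Eden summary in line");
        }
        double beforeMb = toMegabytes(Double.parseDouble(m.group(1)), m.group(2).charAt(0));
        double afterMb = toMegabytes(Double.parseDouble(m.group(3)), m.group(4).charAt(0));
        return new long[] {
            Math.round(beforeMb / regionSizeMb),
            Math.round(afterMb / regionSizeMb)
        };
    }

    public static void main(String[] args) {
        // Summary line copied from the GC log above; region size is 16M.
        String line = "[Eden: 4736.0M(4736.0M)->0.0B(19.0G) Survivors: 208.0M->192.0M "
            + "Heap: 11.2G(32.0G)->6706.6M(32.0G)]";
        long[] regions = edenRegions(line, 16);
        System.out.println("eden regions: " + regions[0] + " -> " + regions[1]);
    }
}
```

For the line above this prints `eden regions: 296 -> 1216`. The 296 matches the `eden: 296 regions` entry in the CSet construction lines of the same log.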
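How I noticed the problem:
It shows up mainly in how frequent young GCs are: as the region count gradually shrinks, eden shrinks with it, and young GCs become more and more frequent. Monitoring screenshot:
The curve shows the young GC count per 10 s.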
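For anyone who wants to reproduce a similar curve directly from the GC log rather than from monitoring, a young-GC count per 10 s window can be derived from the uptime stamps on the `GC pause ... (young)` lines. This is a sketch under assumptions: `YoungGcRate` and `countPerWindow` are hypothetical names, and the pattern only covers the pause-line format shown in the log above.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: count young GC pauses per fixed-size window of JVM uptime. */
final class YoungGcRate {
    // Captures the uptime in seconds from lines such as
    // "2019-12-24T18:23:06.365+0800: 1656728.104: [GC pause (G1 Evacuation Pause) (young)".
    private static final Pattern YOUNG_PAUSE = Pattern.compile(
        ": ([0-9]+\\.[0-9]+): \\[GC pause \\(G1 Evacuation Pause\\) \\(young\\)");

    /** Maps each window start (uptime seconds) to the number of young pauses in it. */
    static Map<Long, Integer> countPerWindow(List<String> logLines, long windowSeconds) {
        Map<Long, Integer> counts = new TreeMap<>();
        for (String line : logLines) {
            Matcher m = YOUNG_PAUSE.matcher(line);
            if (m.find()) {
                long uptime = (long) Double.parseDouble(m.group(1));
                // Round down to the start of the window this pause falls into.
                long windowStart = (uptime / windowSeconds) * windowSeconds;
                counts.merge(windowStart, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

Calling `YoungGcRate.countPerWindow(lines, 10)` on the lines of a log file gives one entry per 10 s window that contains at least one young pause; windows without pauses are simply absent from the map.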
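My questions:
- It makes sense that the region count is adjusted dynamically, but why does it keep decreasing when my traffic is essentially unchanged?
- What drives the changes in region count, and which factors influence it? In other words, why did eden suddenly jump to 19G?
- Can I set a minimum young generation size to put a floor under the region count, so that it does not fluctuate over such a wide range?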
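One possibly relevant data point for the last two questions: in JDK 8-era G1, the young generation size is bounded by the experimental options `-XX:G1NewSizePercent` and `-XX:G1MaxNewSizePercent`, which need `-XX:+UnlockExperimentalVMOptions` (already in the configuration above). The sketch below assumes that this JVM uses the commonly documented defaults of 5 and 60 and that the bounds are integer percentages of the total region count. Both points are assumptions about this deployment, and `YoungGenBounds` is a made-up name.

```java
/** Sketch: young-generation bounds in regions, under assumed G1 percentages. */
final class YoungGenBounds {
    /** Regions for a percentage of the heap, using integer division, at least 1. */
    static long regionsForPercent(long heapRegions, int percent) {
        return Math.max(1, heapRegions * percent / 100);
    }

    public static void main(String[] args) {
        long heapMb = 32L * 1024;  // -Xmx32g / -Xms32g from the configuration
        long regionMb = 16;        // "region size 16384K" in the GC log
        long heapRegions = heapMb / regionMb;

        int assumedMinPercent = 5;   // assumed default of G1NewSizePercent
        int assumedMaxPercent = 60;  // assumed default of G1MaxNewSizePercent

        long minYoung = regionsForPercent(heapRegions, assumedMinPercent);
        long maxYoung = regionsForPercent(heapRegions, assumedMaxPercent);
        System.out.println("heap regions: " + heapRegions);
        System.out.println("min young: " + minYoung + " regions (" + minYoung * regionMb + "M)");
        System.out.println("max young: " + maxYoung + " regions (" + maxYoung * regionMb + "M)");
    }
}
```

Under these assumptions the heap has 2048 regions and the young generation is bounded by 102 regions (1632M) and 1228 regions (19648M). That is close to the 100+ and 1100+ extremes described above, and the 19.0G eden plus 12 survivor regions after the logged GC also comes to roughly 1228 regions. If that reading is right, raising `G1NewSizePercent` would be one way to put a floor under the young generation. Whether that suits this workload is not established here, and fixing the young size outright (for example with `-Xmn`) is generally said to interfere with G1's pause-time goal.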