
G1 garbage collection: Region count fluctuates widely and periodically



Production environment. The key configuration is as follows:

-XX:+UseG1GC -Xmx32g -Xms32g -Xss256k 
-XX:+UnlockExperimentalVMOptions 
-XX:MaxGCPauseMillis=200 
-XX:MetaspaceSize=320m -XX:MaxMetaspaceSize=320m 
-XX:InitiatingHeapOccupancyPercent=70 
-XX:+PrintAdaptiveSizePolicy 
-XX:+ParallelRefProcEnabled 
-XX:ConcGCThreads=20

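While running in production, traffic and workload stay essentially constant, yet the number of young-generation Regions fluctuates widely, from a peak of 1100+ down to a low of 100+ (the Region size stays at 16M throughout). The change is periodic, with a period of roughly 55 minutes to 1 hour: the Region count shrinks gradually and then jumps back up abruptly. Part of the GC log follows: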

{Heap before GC invocations=176604 (full 0):
 garbage-first heap   total 33554432K, used 11745581K [0x00007f71b0000000, 0x00007f71b1004000, 0x00007f79b0000000)
  region size 16384K, 309 young (5062656K), 13 survivors (212992K)
 Metaspace       used 44235K, capacity 45164K, committed 45824K, reserved 47104K
2019-12-24T18:23:06.365+0800: 1656728.104: [GC pause (G1 Evacuation Pause) (young)
Desired survivor size 327155712 bytes, new threshold 15 (max 15)
- age   1:   45895088 bytes,   45895088 total
- age   2:    9512504 bytes,   55407592 total
- age   3:    3381480 bytes,   58789072 total
- age   4:   15717320 bytes,   74506392 total
- age   5:    8075840 bytes,   82582232 total
- age   6:    7113560 bytes,   89695792 total
- age   7:    6970456 bytes,   96666248 total
- age   8:    7290024 bytes,  103956272 total
 1656728.105: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 112516, predicted base time: 31.41 ms, remaining time: 168.59 ms, target pause time: 200.00 ms]
 1656728.105: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 296 regions, survivors: 13 regions, predicted young region time: 14.47 ms]
 1656728.105: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 296 regions, survivors: 13 regions, old: 0 regions, predicted pause time: 45.89 ms, target pause time: 200.00 ms]
, 0.0419178 secs]
   [Parallel Time: 30.4 ms, GC Workers: 28]
      [GC Worker Start (ms): Min: 1656728106.1, Avg: 1656728106.3, Max: 1656728106.5, Diff: 0.3]
      [Ext Root Scanning (ms): Min: 2.9, Avg: 3.7, Max: 20.9, Diff: 18.0, Sum: 105.0]
      [Update RS (ms): Min: 0.0, Avg: 16.0, Max: 16.9, Diff: 16.9, Sum: 449.4]
         [Processed Buffers: Min: 0, Avg: 87.6, Max: 123, Diff: 123, Sum: 2454]
      [Scan RS (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 1.5]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [Object Copy (ms): Min: 8.8, Avg: 9.9, Max: 10.0, Diff: 1.2, Sum: 277.5]
      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2]
         [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 28]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.3, Diff: 0.2, Sum: 4.2]
      [GC Worker Total (ms): Min: 29.7, Avg: 29.9, Max: 30.2, Diff: 0.5, Sum: 837.7]
      [GC Worker End (ms): Min: 1656728136.1, Avg: 1656728136.2, Max: 1656728136.3, Diff: 0.2]
   [Code Root Fixup: 0.2 ms]
   [Code Root Purge: 0.0 ms]
   [Clear CT: 0.7 ms]
   [Other: 10.7 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 5.6 ms]
      [Ref Enq: 0.3 ms]
      [Redirty Cards: 0.4 ms]
      [Humongous Register: 0.1 ms]
      [Humongous Reclaim: 0.0 ms]
      [Free CSet: 0.7 ms]
   [Eden: 4736.0M(4736.0M)->0.0B(19.0G) Survivors: 208.0M->192.0M Heap: 11.2G(32.0G)->6706.6M(32.0G)]
Heap after GC invocations=176605 (full 0):
 garbage-first heap   total 33554432K, used 6867574K [0x00007f71b0000000, 0x00007f71b1004000, 0x00007f79b0000000)
  region size 16384K, 12 young (196608K), 12 survivors (196608K)
 Metaspace       used 44235K, capacity 45164K, committed 45824K, reserved 47104K
}
 [Times: user=0.61 sys=0.16, real=0.05 secs] 

After this GC, the eden capacity went from 4736M to 19G.
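
To follow this change across many pauses, the eden capacity can be pulled out of each `[Eden: ...]` summary line and converted to a Region count. The following is only a minimal sketch: the function names are made up for illustration, the 16M Region size is taken from the log above, and because the log rounds sizes to one decimal place the resulting counts are approximate.

```python
import re

REGION_MB = 16  # region size from the log above (16384K)
UNIT_MB = {"B": 1 / (1024 * 1024), "K": 1 / 1024, "M": 1, "G": 1024}

# Captures the two eden capacities (the values in parentheses), e.g.
# "[Eden: 4736.0M(4736.0M)->0.0B(19.0G) ..."
EDEN_RE = re.compile(
    r"\[Eden: [\d.]+[BKMG]\(([\d.]+)([BKMG])\)->[\d.]+[BKMG]\(([\d.]+)([BKMG])\)"
)


def to_mb(value, unit):
    """Convert a size such as ("19.0", "G") to megabytes."""
    return float(value) * UNIT_MB[unit]


def eden_capacity_regions(line):
    """Return (regions_before, regions_after) for eden capacity, or None."""
    m = EDEN_RE.search(line)
    if m is None:
        return None
    before_mb = to_mb(m.group(1), m.group(2))
    after_mb = to_mb(m.group(3), m.group(4))
    return round(before_mb / REGION_MB), round(after_mb / REGION_MB)


# Summary line copied from the log above.
sample = ("   [Eden: 4736.0M(4736.0M)->0.0B(19.0G) Survivors: 208.0M->192.0M "
          "Heap: 11.2G(32.0G)->6706.6M(32.0G)]")
print(eden_capacity_regions(sample))  # (296, 1216)
```

For the pause shown above this gives 296 eden Regions before the collection, matching the `eden: 296 regions` line in the log, and about 1216 afterwards.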
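How I noticed the problem:
It shows up mainly in the young GC frequency: as the Region count gradually shrinks, eden gets smaller and smaller, and young GCs become more and more frequent. Monitoring screenshot:
[Image: image.png]
The curve shows the young GC count per 10 s.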
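
The same curve can be approximated straight from the GC log, without the monitoring system, by bucketing young pauses into 10 s windows of JVM uptime. This is a rough sketch: the function name is hypothetical, and in the sample only the first line comes from the log above, while the other two timestamps are invented purely to exercise the code.

```python
import re
from collections import Counter

# Captures the uptime stamp (seconds) on a young pause line, e.g.
# "2019-12-24T18:23:06.365+0800: 1656728.104: [GC pause (G1 Evacuation Pause) (young)"
PAUSE_RE = re.compile(
    r": (\d+\.\d+): \[GC pause \(G1 Evacuation Pause\) \(young\)"
)


def young_gc_per_window(lines, window_s=10):
    """Count young pauses per window; keys are window start times in seconds."""
    counts = Counter()
    for line in lines:
        m = PAUSE_RE.search(line)
        if m:
            uptime = float(m.group(1))
            counts[int(uptime // window_s) * window_s] += 1
    return dict(counts)


sample_lines = [
    # Real line from the log above.
    "2019-12-24T18:23:06.365+0800: 1656728.104: [GC pause (G1 Evacuation Pause) (young)",
    # Synthetic lines, for illustration only.
    "2019-12-24T18:23:09.365+0800: 1656731.104: [GC pause (G1 Evacuation Pause) (young)",
    "2019-12-24T18:23:16.365+0800: 1656738.104: [GC pause (G1 Evacuation Pause) (young)",
]
print(young_gc_per_window(sample_lines))  # {1656720: 1, 1656730: 2}
```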
My questions are:

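  • Dynamic adjustment of the Region count is understandable, but why does the count keep decreasing when my traffic is essentially unchanged?
  • What drives changes in the Region count? Which factors affect it? In other words, why does it jump abruptly to 19G?
  • Can I set a minimum young generation size to bound the minimum Region count, so that it does not fluctuate over such a wide range?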
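
On the last question, a quick sanity check may help. G1 has two experimental flags, `-XX:G1NewSizePercent` and `-XX:G1MaxNewSizePercent`, that bound the young generation as a percentage of the heap. The sketch below assumes their JDK 8 defaults of 5 and 60 (the configuration above does not set them) and assumes simple integer rounding, which may differ from what the JVM does internally. The helper name is made up for illustration.

```python
HEAP_MB = 32 * 1024  # -Xmx32g / -Xms32g from the configuration above
REGION_MB = 16       # region size reported in the GC log

# ASSUMPTION: JDK 8 default values; these flags are not set in the post.
G1_NEW_SIZE_PERCENT = 5
G1_MAX_NEW_SIZE_PERCENT = 60


def young_bounds(heap_mb, region_mb, min_pct, max_pct):
    """Return (min_regions, max_regions) for the young generation."""
    total_regions = heap_mb // region_mb
    # Integer rounding is an assumption about how the bounds are derived.
    min_regions = max(1, total_regions * min_pct // 100)
    max_regions = max(min_regions, total_regions * max_pct // 100)
    return min_regions, max_regions


lo, hi = young_bounds(HEAP_MB, REGION_MB,
                      G1_NEW_SIZE_PERCENT, G1_MAX_NEW_SIZE_PERCENT)
print(lo, lo * REGION_MB)  # 102 1632
print(hi, hi * REGION_MB)  # 1228 19648
```

Under these assumptions the young generation would range from about 102 Regions (1632M) to 1228 Regions (about 19.2G). That is consistent with the observed 100+ to 1100+ range and with the log above, where eden 19.0G plus 12 survivor Regions comes to roughly 1228 Regions. It does not by itself explain why the size drifts down over each cycle. If the defaults are indeed in effect, raising `-XX:G1NewSizePercent` would be one way to raise the floor. `-XX:+UnlockExperimentalVMOptions` is already in the configuration, and the effect on pause times would need to be verified.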