
G1 Garbage Collection Source Code Analysis (Part 3)

2 years ago

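Young Generation

The previous parts covered G1's regions and the RSet. This part looks at how G1 handles the young generation when a GC occurs. The number of young-generation regions depends on the region size that G1 computed earlier. If MaxNewSize and NewSize are set, dividing them by the region size G1 derived gives the maximum and minimum number of young regions. If MaxNewSize/NewSize and NewRatio are both set, NewRatio is ignored. If only NewRatio is set, the young generation size is heap size / (NewRatio + 1), and dividing that by the region size gives the number of young regions.

If neither MaxNewSize/NewSize nor NewRatio is set, G1 falls back on its own parameters, G1MaxNewSizePercent=60 and G1NewSizePercent=5, which are percentages of the whole heap, to compute the maximum and minimum region counts.

When the maximum and minimum young region counts come out equal, the young generation cannot resize dynamically, which means pause-time prediction may fail to meet the pause target.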

void G1YoungGenSizer::recalculate_min_max_young_length(uint number_of_heap_regions, uint* min_young_length, uint* max_young_length) {
  assert(number_of_heap_regions > 0, "Heap must be initialized");

  switch (_sizer_kind) {
    case SizerDefaults:
      // No flags set: both bounds come from the percentage defaults.
      *min_young_length = calculate_default_min_length(number_of_heap_regions);
      *max_young_length = calculate_default_max_length(number_of_heap_regions);
      break;
    case SizerNewSizeOnly:
      // Only NewSize set: the max is the default, but never below the min.
      *max_young_length = calculate_default_max_length(number_of_heap_regions);
      *max_young_length = MAX2(*min_young_length, *max_young_length);
      break;
    case SizerMaxNewSizeOnly:
      // Only MaxNewSize set: the min is the default, but never above the max.
      *min_young_length = calculate_default_min_length(number_of_heap_regions);
      *min_young_length = MIN2(*min_young_length, *max_young_length);
      break;
    case SizerMaxAndNewSize:
      // Do nothing. Values set on the command line, don't update them at runtime.
      break;
    case SizerNewRatio:
      // Only NewRatio set: min and max are equal, so the size is fixed.
      *min_young_length = number_of_heap_regions / (NewRatio + 1);
      *max_young_length = *min_young_length;
      break;
    default:
      ShouldNotReachHere();
  }
}
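To make the flag precedence concrete, here is a minimal sketch of the rules described above, written as standalone C++ rather than HotSpot code. The names SizingFlags, YoungBounds, percent_of, bytes_to_regions and young_bounds are made up for this illustration, and treating 0 as "flag not set" is an assumption of the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified model of the sizing rules; not HotSpot code.
struct SizingFlags {
  std::size_t new_size_bytes = 0;           // NewSize, 0 = not set
  std::size_t max_new_size_bytes = 0;       // MaxNewSize, 0 = not set
  std::uint32_t new_ratio = 0;              // NewRatio, 0 = not set
  std::uint32_t new_size_percent = 5;       // G1NewSizePercent
  std::uint32_t max_new_size_percent = 60;  // G1MaxNewSizePercent
};

struct YoungBounds {
  std::uint32_t min_regions;
  std::uint32_t max_regions;
};

// Percentage of the heap's regions, never fewer than one region.
inline std::uint32_t percent_of(std::uint32_t regions, std::uint32_t pct) {
  std::uint64_t n = static_cast<std::uint64_t>(regions) * pct / 100;
  return std::max<std::uint32_t>(1, static_cast<std::uint32_t>(n));
}

// Bytes converted to whole regions, never fewer than one region.
inline std::uint32_t bytes_to_regions(std::size_t bytes, std::size_t region_bytes) {
  return std::max<std::uint32_t>(1, static_cast<std::uint32_t>(bytes / region_bytes));
}

inline YoungBounds young_bounds(std::uint32_t heap_regions,
                                std::size_t region_bytes,
                                const SizingFlags& f) {
  assert(heap_regions > 0 && region_bytes > 0);
  const std::uint32_t def_min = percent_of(heap_regions, f.new_size_percent);
  const std::uint32_t def_max = percent_of(heap_regions, f.max_new_size_percent);
  const bool has_min = f.new_size_bytes != 0;
  const bool has_max = f.max_new_size_bytes != 0;

  if (has_min || has_max) {  // explicit sizes win; NewRatio is ignored
    std::uint32_t lo = has_min ? bytes_to_regions(f.new_size_bytes, region_bytes) : def_min;
    std::uint32_t hi = has_max ? bytes_to_regions(f.max_new_size_bytes, region_bytes) : def_max;
    if (!has_max) hi = std::max(lo, hi);  // only NewSize set
    if (!has_min) lo = std::min(lo, hi);  // only MaxNewSize set
    return {lo, hi};
  }
  if (f.new_ratio != 0) {  // fixed-size young generation
    const std::uint32_t n = heap_regions / (f.new_ratio + 1);
    return {n, n};
  }
  return {def_min, def_max};  // percentage defaults
}
```

For a heap of 2048 regions of 1M each, the defaults give 102 to 1228 young regions, while NewRatio=2 alone pins the young generation at 682 regions.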

G1 adjusts the young generation size dynamically and grows memory adaptively. The amount to expand by is computed in g1CollectorPolicy.cpp#expansion_amount, shown below.

size_t G1CollectorPolicy::expansion_amount() {
  double recent_gc_overhead = recent_avg_pause_time_ratio() * 100.0;
  double threshold = _gc_overhead_perc;
  if (recent_gc_overhead > threshold) {
    const size_t min_expand_bytes = 1*M;
    size_t reserved_bytes = _g1->max_capacity();
    size_t committed_bytes = _g1->capacity();
    size_t uncommitted_bytes = reserved_bytes - committed_bytes;
    size_t expand_bytes;
    size_t expand_bytes_via_pct =
      uncommitted_bytes * G1ExpandByPercentOfAvailable / 100;
    expand_bytes = MIN2(expand_bytes_via_pct, committed_bytes);
    expand_bytes = MAX2(expand_bytes, min_expand_bytes);
    expand_bytes = MIN2(expand_bytes, uncommitted_bytes);
    return expand_bytes;
  } else {
    return 0;
  }
}

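The threshold _gc_overhead_perc is derived from the GCTimeRatio parameter (GCTimeRatio=9):

[Figure: GCOverheadPerc]

With GCTimeRatio=9 the threshold is 100 / (1 + 9) = 10%. As long as the time spent in GC stays within 10% of total time, no expansion is needed; above that, memory is expanded. The amount is governed by G1ExpandByPercentOfAvailable=20: G1 takes the smaller of (a) doubling the currently committed space and (b) G1ExpandByPercentOfAvailable percent of the space still available for expansion. The result is therefore at most the currently committed size and at least 1M; if even the minimum cannot be satisfied, all the remaining space is committed. Expansion is triggered from g1CollectedHeap.cpp#do_collection_pause_at_safepoint while a stop-the-world collection pause is running, which ends up calling the expand method to grow the heap.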
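The arithmetic is easier to follow with numbers plugged in. The following is a standalone sketch that mirrors the min/max chain of expansion_amount; the function names gc_overhead_threshold_percent and expansion_bytes are invented for the example, and it assumes a 64-bit size_t.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Threshold in percent for a given GCTimeRatio (9 gives 10%).
inline double gc_overhead_threshold_percent(unsigned gc_time_ratio) {
  return 100.0 * (1.0 / (1.0 + gc_time_ratio));
}

// Illustrative stand-in for expansion_amount(); all inputs are passed in
// explicitly instead of being read from the heap and the policy.
inline std::size_t expansion_bytes(double recent_gc_overhead_percent,
                                   double threshold_percent,
                                   std::size_t reserved_bytes,
                                   std::size_t committed_bytes,
                                   unsigned expand_by_percent_of_available) {
  assert(committed_bytes <= reserved_bytes);
  if (recent_gc_overhead_percent <= threshold_percent) {
    return 0;  // GC overhead is acceptable, no expansion
  }
  const std::size_t min_expand_bytes = 1024 * 1024;  // 1M
  const std::size_t uncommitted = reserved_bytes - committed_bytes;
  const std::size_t via_pct = uncommitted * expand_by_percent_of_available / 100;
  std::size_t expand = std::min(via_pct, committed_bytes);  // at most double
  expand = std::max(expand, min_expand_bytes);              // at least 1M
  expand = std::min(expand, uncommitted);                   // unless less is left
  return expand;
}
```

With 4G reserved, 1G committed and a recent overhead of 12%, the sketch returns 20% of the 3G still uncommitted. If only 512K were left uncommitted, it would return exactly those 512K.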

bool G1CollectedHeap::expand(size_t expand_bytes) {
  size_t aligned_expand_bytes = ReservedSpace::page_align_size_up(expand_bytes);
  aligned_expand_bytes = align_size_up(aligned_expand_bytes,
                                       HeapRegion::GrainBytes);
  ergo_verbose2(ErgoHeapSizing,
                "expand the heap",
                ergo_format_byte("requested expansion amount")
                ergo_format_byte("attempted expansion amount"),
                expand_bytes, aligned_expand_bytes);

  if (is_maximal_no_gc()) {
    ergo_verbose0(ErgoHeapSizing,
                  "did not expand the heap",
                  ergo_format_reason("heap already fully expanded"));
    return false;
  }

  uint regions_to_expand = (uint)(aligned_expand_bytes / HeapRegion::GrainBytes);
  assert(regions_to_expand > 0, "Must expand by at least one region");

  uint expanded_by = _hrm.expand_by(regions_to_expand);

  if (expanded_by > 0) {
    size_t actual_expand_bytes = expanded_by * HeapRegion::GrainBytes;
    assert(actual_expand_bytes <= aligned_expand_bytes, "post-condition");
    g1_policy()->record_new_heap_size(num_regions());
  } else {
    ergo_verbose0(ErgoHeapSizing,
                  "did not expand the heap",
                  ergo_format_reason("heap expansion operation failed"));
    // The expansion of the virtual storage space was unsuccessful.
    // Let's see if it was because we ran out of swap.
    if (G1ExitOnExpansionFailure &&
        _hrm.available() >= regions_to_expand) {
      // We had head room...
      vm_exit_out_of_memory(aligned_expand_bytes, OOM_MMAP_ERROR, "G1 heap expansion");
    }
  }
  return regions_to_expand > 0;
}
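The first lines of expand round the request up twice, first to a page boundary and then to a whole region, which is why any non-zero request becomes at least one region. Below is a small sketch of that rounding; align_up and regions_to_expand are illustrative helpers, and both alignments are assumed to be powers of two.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Rounds value up to the next multiple of alignment (a power of two).
inline std::size_t align_up(std::size_t value, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (value + alignment - 1) & ~(alignment - 1);
}

// Number of whole regions needed to cover expand_bytes.
inline std::uint32_t regions_to_expand(std::size_t expand_bytes,
                                       std::size_t page_bytes,
                                       std::size_t region_bytes) {
  std::size_t aligned = align_up(expand_bytes, page_bytes);
  aligned = align_up(aligned, region_bytes);
  return static_cast<std::uint32_t>(aligned / region_bytes);
}
```

With 4K pages and 2M regions, a 1M request becomes one region and a 5M request becomes three.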

G1-YGC

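As we know, a GC is triggered when the young generation does not have enough space left for an allocation. A young GC collects only part of the heap, so it takes relatively little time, and because the G1 heap is split into regions, the amount of memory collected for the young generation is not fixed. A YGC first stops the world (STW). It then chooses the collection set (CSet) to collect, which for the young generation means all young regions, and adds it to the collection tasks, which process references in parallel. Once the reference search is complete, G1 processes and reclaims object references, handles object promotion, restores the object headers of objects whose promotion failed, tries to expand memory, and so on. The G1 YGC workflow is as follows.

[Figure: do_collection_pause_at_safepoint workflow]
We go straight into g1CollectedHeap.cpp#evacuate_collection_set to see what happens. The figure below shows the workflow of the parallel CSet evacuation.

[Figure: EvacuateCollectionSet]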

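  1. Root scanning is performed by the G1RootProcessor class, which scans direct strong references, mainly the JVM roots and the Java roots. Objects are copied with G1ParCopyHelper.

    • Java roots

      • Class loaders

        Traverses all live Klass objects loaded by each class loader and copies the objects found to a Survivor region or promotes them to the old generation.

      • Thread stacks

        Searches the Java thread stacks and the native method stacks. StackFrameStream's next moves to the sender frame, that is, the caller, and from each frame the live heap objects it references are found and copied to a Survivor region or promoted to the old generation.

      Now that we know the two main directions in which G1RootProcessor looks for live objects, let's go to the code, g1RootProcessor.cpp#evacuate_roots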

      void G1RootProcessor::process_java_roots(OopClosure* strong_roots,
                                               CLDClosure* thread_stack_clds,
                                               CLDClosure* strong_clds,
                                               CLDClosure* weak_clds,
                                               CodeBlobClosure* strong_code,
                                               G1GCPhaseTimes* phase_times,
                                               uint worker_i) {
        assert(thread_stack_clds == NULL || weak_clds == NULL, "There is overlap between those, only one may be set");
        // Iterating over the CLDG and the Threads are done early to allow us to
        // first process the strong CLDs and nmethods and then, after a barrier,
        // let the thread process the weak CLDs and nmethods.
        {
          G1GCParPhaseTimesTracker x(phase_times, G1GCPhaseTimes::CLDGRoots, worker_i);
          if (!_process_strong_tasks->is_task_claimed(G1RP_PS_ClassLoaderDataGraph_oops_do)) {
            ClassLoaderDataGraph::roots_cld_do(strong_clds, weak_clds);
          }
        }

        {
          G1GCParPhaseTimesTracker x(phase_times, G1GCPhaseTimes::ThreadRoots, worker_i);
          Threads::possibly_parallel_oops_do(strong_roots, thread_stack_clds, strong_code);
        }
      }

      void ClassLoaderDataGraph::roots_cld_do(CLDClosure* strong, CLDClosure* weak) {
        for (ClassLoaderData* cld = _head; cld != NULL; cld = cld->_next) {
          CLDClosure* closure = cld->keep_alive() ? strong : weak;
          if (closure != NULL) {
            closure->do_cld(cld);
          }
        }
      }

      void ClassLoaderData::oops_do(OopClosure* f, KlassClosure* klass_closure, bool must_claim) {
        if (must_claim && !claim()) {
          return;
        }

        f->do_oop(&_class_loader);
        _dependencies.oops_do(f);
        _handles->oops_do(f);
        if (klass_closure != NULL) {
          classes_do(klass_closure);
        }
      }

      void ClassLoaderData::classes_do(KlassClosure* klass_closure) {
        for (Klass* k = _klasses; k != NULL; k = k->next_link()) {
          klass_closure->do_klass(k);
          assert(k != k->next_link(), "no loops!");
        }
      }
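The dispatch in roots_cld_do is worth isolating: each class loader data entry goes to the strong or the weak closure depending on keep_alive(), and a NULL closure simply means "skip this kind". The toy model below shows only that selection; ToyCld, ToyCldClosure and toy_roots_cld_do are hypothetical names, not HotSpot types.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Toy stand-in for ClassLoaderData: a name plus the keep_alive() answer.
struct ToyCld {
  std::string name;
  bool keep_alive;
};

using ToyCldClosure = std::function<void(const ToyCld&)>;

// Entries that must be kept alive go to the strong closure, the rest to the
// weak one. A null closure means that kind of entry is not visited.
inline void toy_roots_cld_do(const std::vector<ToyCld>& graph,
                             const ToyCldClosure* strong,
                             const ToyCldClosure* weak) {
  for (const ToyCld& cld : graph) {
    const ToyCldClosure* closure = cld.keep_alive ? strong : weak;
    if (closure != nullptr) {
      (*closure)(cld);
    }
  }
}
```

Passing a null weak closure visits only the entries that must be kept alive; supplying both closures visits every entry exactly once.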

      In the end, what gets called is do_klass in G1KlassScanClosure.

      class G1KlassScanClosure : public KlassClosure {
        G1ParCopyHelper* _closure;
        bool _process_only_dirty;
        int _count;
       public:
        G1KlassScanClosure(G1ParCopyHelper* closure, bool process_only_dirty)
            : _process_only_dirty(process_only_dirty), _closure(closure), _count(0) {}
        void do_klass(Klass* klass) {
          if (!_process_only_dirty || klass->has_modified_oops()) {
            klass->clear_modified_oops();
            _closure->set_scanned_klass(klass);
            klass->oops_do(_closure);
            _closure->set_scanned_klass(NULL);
          }
          _count++;
        }
      };
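The filter in do_klass can be read in isolation: a klass is scanned if the closure processes everything, or if the klass has modified oops, and scanning clears that flag. A reduced sketch follows, with ToyKlass and ToyKlassScan as made-up stand-ins and a counter in place of the real klass->oops_do call.

```cpp
#include <cassert>

// Toy stand-in for Klass: a dirty flag and a counter instead of real oops.
struct ToyKlass {
  bool has_modified_oops;
  int times_scanned;
};

// Toy stand-in for G1KlassScanClosure.
struct ToyKlassScan {
  bool process_only_dirty;
  int count;

  void do_klass(ToyKlass& k) {
    if (!process_only_dirty || k.has_modified_oops) {
      k.has_modified_oops = false;  // mirrors clear_modified_oops()
      ++k.times_scanned;            // stands in for klass->oops_do(closure)
    }
    ++count;  // every klass is counted, scanned or not
  }
};
```

With process_only_dirty set, a clean klass is counted but not scanned, and a dirty one is scanned once and comes out clean.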

      The key call is klass->oops_do(_closure);. The closure here is a G1ParCopyHelper object, so the call ends up in g1CollectedHeap.cpp@G1ParCopyClosure#do_oop_work: G1ParCopyHelper's do_oop calls do_oop_work to copy live objects to a new region.

      Threads are handled in thread.cpp#possibly_parallel_oops_do. The call Threads::possibly_parallel_oops_do(strong_roots, thread_stack_clds, strong_code); actually invokes JavaThread::oops_do to walk the stack frames.

      void Thread::oops_do(OopClosure* f, CLDClosure* cld_f, CodeBlobClosure* cf) {
        active_handles()->oops_do(f);
        // Do oop for ThreadShadow
        f->do_oop((oop*)&_pending_exception);
        handle_area()->oops_do(f);
      }

      void JavaThread::oops_do(OopClosure* f, CLDClosure* cld_f, CodeBlobClosure* cf) {
        Thread::oops_do(f, cld_f, cf);
        assert( (!has_last_Java_frame() && java_call_counter() == 0) ||
                (has_last_Java_frame() && java_call_counter() > 0), "wrong java_sp info!");

        if (has_last_Java_frame()) {
          RememberProcessedThread rpt(this);
          if (_privileged_stack_top != NULL) {
            _privileged_stack_top->oops_do(f);
          }
          if (_array_for_gc != NULL) {
            for (int index = 0; index < _array_for_gc->length(); index++) {
              f->do_oop(_array_for_gc->adr_at(index));
            }
          }
          // Monitor chunks
          for (MonitorChunk* chunk = monitor_chunks(); chunk != NULL; chunk = chunk->next()) {
            chunk->oops_do(f);
          }
          // Walk the stack frames, youngest first
          for (StackFrameStream fst(this); !fst.is_done(); fst.next()) {
            fst.current()->oops_do(f, cld_f, cf, fst.register_map());
          }
        }
        set_callee_target(NULL);
        assert(vframe_array_head() == NULL, "deopt in progress at a safepoint!");
        GrowableArray<jvmtiDeferredLocalVariableSet*>* list = deferred_locals();
        if (list != NULL) {
          for (int i = 0; i < list->length(); i++) {
            list->at(i)->oops_do(f);
          }
        }
        f->do_oop((oop*) &_threadObj);
        f->do_oop((oop*) &_vm_result);
        f->do_oop((oop*) &_exception_oop);
        f->do_oop((oop*) &_pending_async_exception);

        if (jvmti_thread_state() != NULL) {
          jvmti_thread_state()->oops_do(f);
        }
      }

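      Live objects are found from the JNI native code stack and the JVM-internal method stack, from the Java stack, by traversing the monitor chunks, and by traversing JVMTI (JVM Tool Interface) state, which is mainly used by Java agents. Finally, G1ParCopyHelper's do_oop calls do_oop_work to copy the live objects to a new region.

    • JVM roots

      Global JVM objects such as Universe, JNIHandles, SystemDictionary, StringTable and so on.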
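      The frame walk itself is a plain linked traversal from the youngest frame to its senders. The toy version below assumes each frame just lists the ids of the heap objects it references; ToyFrame and toy_frames_do are invented for the illustration and leave out register maps, code blobs and everything else the real StackFrameStream handles.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for a stack frame.
struct ToyFrame {
  std::vector<int> oop_slots;  // ids of heap objects referenced by this frame
  const ToyFrame* sender;      // the caller's frame, nullptr at the stack bottom
};

// Visits every object reference, starting at the youngest frame and moving
// to the sender (the caller) until the bottom of the stack is reached.
template <typename Visit>
void toy_frames_do(const ToyFrame* youngest, Visit visit) {
  for (const ToyFrame* f = youngest; f != nullptr; f = f->sender) {
    for (int id : f->oop_slots) {
      visit(id);
    }
  }
}
```

      For a callee frame holding objects 2 and 3 whose sender holds object 1, the visit order is 2, 3, 1.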

      void G1RootProcessor::process_vm_roots(OopClosure* strong_roots,
                                             OopClosure* weak_roots,
                                             G1GCPhaseTimes* phase_times,
                                             uint worker_i) {
        {
          G1GCParPhaseTimesTracker x(phase_times, G1GCPhaseTimes::UniverseRoots, worker_i);
          if (!_process_strong_tasks->is_task_claimed(G1RP_PS_Universe_oops_do)) {
            Universe::oops_do(strong_roots);
          }
        }
        ....
      void Universe::oops_do(OopClosure* f, bool do_all) {

        f->do_oop((oop*) &_int_mirror);
        f->do_oop((oop*) &_float_mirror);
        f->do_oop((oop*) &_double_mirror);
        ........
      }

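      The JVM roots also go through G1ParCopyHelper's do_oop; the only difference is that the roots here are various global objects, for example Universe.

    The workflow of g1CollectedHeap.cpp@G1ParCopyClosure#do_oop_work is as follows.
    [Figure: do_oop_work]
    The object copy itself is done in G1ParScanThreadState#copy_to_survivor_space. It proceeds as follows.
    [Figure: CopyAndSurvivorSpace]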

  2. Processing the RSet

  • The entry point for RSet processing is in the work method of G1ParTask.

    void G1RootProcessor::scan_remembered_sets(G1ParPushHeapRSClosure* scan_rs,
                                               OopClosure* scan_non_heap_weak_roots,
                                               uint worker_i) {
      ...
      _g1h->g1_rem_set()->oops_into_collection_set_do(scan_rs, &scavenge_cs_nmethods, worker_i);
    }

    This mainly calls oops_into_collection_set_do in G1RemSet, whose job is to update the RSet and then scan the RSet.

    void G1RemSet::oops_into_collection_set_do(G1ParPushHeapRSClosure* oc,
                                               CodeBlobClosure* code_root_cl,
                                               uint worker_i) {
      DirtyCardQueue into_cset_dcq(&_g1->into_cset_dirty_card_queue_set());
      updateRS(&into_cset_dcq, worker_i);
      scanRS(oc, code_root_cl, worker_i);
      _cset_rs_update_cl[worker_i] = NULL;
    }

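    Note the DCQ here. We met this kind of queue when studying the RSet, where it was handed to mutators to record the references created by application threads at run time. Here it records the references that must be kept after a failed copy, and the data in this queue is passed on to the DirtyCardQueueSet that manages RSet updates.

    • Updating the RSet

      Mainly stores the contents of the DCQ above into the PRT of the RSet.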

      void G1RemSet::updateRS(DirtyCardQueue* into_cset_dcq, uint worker_i) {
        G1GCParPhaseTimesTracker x(_g1p->phase_times(), G1GCPhaseTimes::UpdateRS, worker_i);
        // Apply the given closure to all remaining log entries.
        RefineRecordRefsIntoCSCardTableEntryClosure into_cset_update_rs_cl(_g1, into_cset_dcq);

        _g1->iterate_dirty_card_closure(&into_cset_update_rs_cl, into_cset_dcq, false, worker_i);
      }

      void G1CollectedHeap::iterate_dirty_card_closure(CardTableEntryClosure* cl,
                                                       DirtyCardQueue* into_cset_dcq,
                                                       bool concurrent,
                                                       uint worker_i) {
        // Clean cards in the hot card cache
        G1HotCardCache* hot_card_cache = _cg1r->hot_card_cache();
        hot_card_cache->drain(worker_i, g1_rem_set(), into_cset_dcq);

        DirtyCardQueueSet& dcqs = JavaThread::dirty_card_queue_set();
        size_t n_completed_buffers = 0;
        while (dcqs.apply_closure_to_completed_buffer(cl, worker_i, 0, true)) {
          n_completed_buffers++;
        }
        g1_policy()->phase_times()->record_thread_work_item(G1GCPhaseTimes::UpdateRS, worker_i, n_completed_buffers);
        dcqs.clear_n_completed_buffers();
        assert(!dcqs.completed_buffers_exist_dirty(), "Completed buffers exist!");
      }
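      The drain loop has a simple shape: keep applying the closure to one completed buffer at a time until none are left, and count how many were processed. The sketch below models only that loop. ToyDcqs, ToyRecordIntoCSet and toy_drain are hypothetical, cards are plain indices, and whether a card points into the collection set is looked up in a table instead of being computed by refining the card.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <utility>
#include <vector>

using ToyCard = std::size_t;
using ToyBuffer = std::vector<ToyCard>;

// Toy stand-in for a DirtyCardQueueSet holding completed buffers.
struct ToyDcqs {
  std::deque<ToyBuffer> completed;

  // Applies cl to every card of one completed buffer.
  // Returns false when no completed buffer was left.
  template <typename Closure>
  bool apply_closure_to_completed_buffer(Closure& cl) {
    if (completed.empty()) return false;
    ToyBuffer buf = std::move(completed.front());
    completed.pop_front();
    for (ToyCard c : buf) cl(c);
    return true;
  }
};

// Closure that records the cards whose references point into the CSet.
struct ToyRecordIntoCSet {
  const std::vector<bool>& card_points_into_cset;  // assumed lookup table
  std::vector<ToyCard>& into_cset_dcq;
  void operator()(ToyCard c) {
    if (card_points_into_cset[c]) into_cset_dcq.push_back(c);
  }
};

// Drains all completed buffers and reports how many were processed.
template <typename Closure>
std::size_t toy_drain(ToyDcqs& dcqs, Closure& cl) {
  std::size_t n_completed_buffers = 0;
  while (dcqs.apply_closure_to_completed_buffer(cl)) {
    ++n_completed_buffers;
  }
  return n_completed_buffers;
}
```

      Draining two buffers holding cards 0 to 3, of which cards 1 and 3 point into the CSet, reports two buffers and leaves exactly those two cards in the into-CSet queue.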

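      First the RefineRecordRefsIntoCSCardTableEntryClosure closure is applied to each card: if the card contains references that point into the collection set, it is treated as dirty and enqueued on into_cset_dcq for later processing.

      iterate_dirty_card_closure then processes the DCQs remaining in the DCQS, in the same way the Java threads' queues are handled.

    • Scanning the RSet

      Uses the information in the RSet to find the referrers.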

      void G1RemSet::scanRS(G1ParPushHeapRSClosure* oc,
                            CodeBlobClosure* code_root_cl,
                            uint worker_i) {
        double rs_time_start = os::elapsedTime();
        HeapRegion *startRegion = _g1->start_cset_region_for_worker(worker_i);

        ScanRSClosure scanRScl(oc, code_root_cl, worker_i);

        _g1->collection_set_iterate_from(startRegion, &scanRScl);
        scanRScl.set_try_claimed();
        _g1->collection_set_iterate_from(startRegion, &scanRScl);

        double scan_rs_time_sec = (os::elapsedTime() - rs_time_start)
                                  - scanRScl.strong_code_root_scan_time_sec();

        assert(_cards_scanned != NULL, "invariant");
        _cards_scanned[worker_i] = scanRScl.cards_done();

        _g1p->phase_times()->record_time_secs(G1GCPhaseTimes::ScanRS, worker_i, scan_rs_time_sec);
        _g1p->phase_times()->record_time_secs(G1GCPhaseTimes::CodeRoots, worker_i, scanRScl.strong_code_root_scan_time_sec());
      }

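      The regions are split among the GC threads by worker id, and the flow consists mainly of two passes over the regions. It handles ordinary objects as well as code objects, the latter being mainly the objects referenced from code after inlining optimization. The main flow is as follows.
      [Figure: scanRS workflow]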
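      One way to picture the two passes: in the first pass a worker only scans regions it manages to claim, and in the second pass (after set_try_claimed) it also helps with regions another worker claimed but has not finished. The sketch below is a loose model of that idea and not the real ScanRSClosure. ToyRegion, toy_scan_cards and toy_scan_rs are invented names, and "not finished" is modeled by a shared per-region card counter.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a CSet region and the cards recorded in its RSet.
struct ToyRegion {
  std::vector<int> cards;
  std::atomic<bool> iter_claimed{false};   // has some worker claimed the region?
  std::atomic<std::size_t> next_card{0};   // next card to hand out
};

// Scans whatever cards are still unscanned; each card is handed out once.
inline std::size_t toy_scan_cards(ToyRegion& r) {
  std::size_t done = 0;
  while (r.next_card.fetch_add(1) < r.cards.size()) {
    ++done;  // a real closure would scan the card here
  }
  return done;
}

// Two passes starting at this worker's own offset into the CSet.
inline std::size_t toy_scan_rs(std::vector<ToyRegion>& cset, std::size_t start) {
  std::size_t cards_done = 0;
  for (int pass = 0; pass < 2; ++pass) {
    const bool try_claimed = (pass == 1);
    for (std::size_t k = 0; k < cset.size(); ++k) {
      ToyRegion& r = cset[(start + k) % cset.size()];
      // Pass one: skip regions that another worker already owns.
      if (!try_claimed && r.iter_claimed.exchange(true)) continue;
      cards_done += toy_scan_cards(r);
    }
  }
  return cards_done;
}
```

      Starting each worker at a different offset spreads the first-pass claims across the collection set, while the second pass lets a worker that finishes early pick up leftover cards.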

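  3. Object copying
  • The objects found by root scanning and the child objects found through the RSet are all copied to new regions. All of these objects are placed on the ParScanState queue (G1ParScanThreadState). Copying consists of taking entries off that queue and handling the different object types, ending in a call to deal_with_reference. Every live object in the CSet is copied to a new Survivor region or to the old generation.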
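The queue-driven copy can be sketched as a small toy evacuation. This is an illustration under simplifying assumptions, not G1's implementation: it runs single-threaded, objects live in a std::deque instead of regions, and the names ToyObj, ToyHeap and toy_evacuate are made up. What it does keep is the overall shape: reference slots go on a queue, each live CSet object is copied once, a forwarding pointer is left behind, and the age decides between Survivor and old generation.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

enum class Space { Eden, Survivor, Old };

struct ToyObj {
  bool in_cset;                 // does the object live in a CSet region?
  Space space;
  unsigned age;
  std::vector<ToyObj*> fields;  // outgoing references
  ToyObj* forwardee;            // set once the object has been copied
};

struct ToyHeap {
  std::deque<ToyObj> arena;  // deque keeps element addresses stable on push_back
  ToyObj* alloc(bool in_cset, Space space, unsigned age) {
    arena.push_back(ToyObj{in_cset, space, age, {}, nullptr});
    return &arena.back();
  }
};

// Copies everything reachable from roots out of the CSet and returns the
// number of objects copied. Root and field slots are updated in place.
inline std::size_t toy_evacuate(ToyHeap& heap, std::vector<ToyObj*>& roots,
                                unsigned tenuring_threshold) {
  std::deque<ToyObj**> queue;  // reference slots still to be processed
  for (ToyObj*& r : roots) queue.push_back(&r);
  std::size_t copied = 0;
  while (!queue.empty()) {
    ToyObj** slot = queue.front();
    queue.pop_front();
    ToyObj* obj = *slot;
    if (obj == nullptr || !obj->in_cset) continue;  // nothing to evacuate
    if (obj->forwardee == nullptr) {                // first visit: copy it
      const bool promote = obj->age >= tenuring_threshold;
      ToyObj* copy = heap.alloc(false, promote ? Space::Old : Space::Survivor,
                                promote ? obj->age : obj->age + 1);
      copy->fields = obj->fields;
      obj->forwardee = copy;
      for (ToyObj*& f : copy->fields) queue.push_back(&f);  // scan the copy later
      ++copied;
    }
    *slot = obj->forwardee;  // point the slot at the new location
  }
  return copied;
}
```

Objects that are never reached keep a null forwardee, which is what makes reclaiming the whole CSet afterwards safe in this model.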

     

Author: 小蓝鲸
