Source Code Analysis of the HotSpot GC Process (Part 1)

Last updated: 2015-12-01. Source: Internet

Uploaded by: User


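This series looks at how the HotSpot virtual machine reclaims memory. It analyzes garbage collection in DefNewGeneration and TenuredGeneration under the default MarkSweepPolicy configuration, and introduces the ideas behind the other GC policies and generation implementations. For convenience, the GC process is split into two parts: the part that is independent of any generation implementation, and the GC performed by each generation. This article covers the generation-independent part first; generation-level GC is analyzed in a later article.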

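We start from do_collection() in GenCollectedHeap:
1. Before the GC itself there are a number of necessary checks and bookkeeping tasks, such as statistics on the generations to be collected and on the heap size. This article does not analyze the performance-statistics code; interested readers can study it on their own.
(1). Check whether the GC locker is active. If it is, the needs-GC flag is set to true and do_collection() returns. From then on, is_active_and_needs_gc() reports that some thread has already requested a GC.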

if (GC_locker::check_active_before_gc()) {
  return; // GC is disabled (e.g. JNI GetXXXCritical operation)
}
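To make the role of the lock concrete, here is a minimal sketch of a GC locker, not HotSpot's actual implementation. ToyGCLocker, toy_do_collection() and the counter-based bookkeeping are hypothetical names introduced for illustration; the idea that the last thread leaving a critical region triggers the deferred GC is an assumption of this sketch and is not stated in the article.

```cpp
#include <cassert>

// Toy stand-in for a GC locker. All names here are illustrative.
class ToyGCLocker {
 public:
  // A thread enters a JNI-style critical region.
  void enter_critical() { ++critical_count_; }

  // A thread leaves its critical region. Returns true when this was the
  // last thread out and a GC was requested in the meantime, meaning the
  // caller should now trigger the deferred collection (sketch assumption).
  bool exit_critical() {
    assert(critical_count_ > 0);
    --critical_count_;
    if (critical_count_ == 0 && needs_gc_) {
      needs_gc_ = false;
      return true;
    }
    return false;
  }

  bool is_active() const { return critical_count_ > 0; }
  bool is_active_and_needs_gc() const { return is_active() && needs_gc_; }

  // Called at the start of a collection: if any thread is inside a critical
  // region, remember that a GC is wanted and tell the caller to give up.
  bool check_active_before_gc() {
    if (is_active()) {
      needs_gc_ = true;
    }
    return is_active();
  }

 private:
  int critical_count_ = 0;
  bool needs_gc_ = false;
};

// Toy do_collection(): returns true if the collection actually ran.
inline bool toy_do_collection(ToyGCLocker& locker) {
  if (locker.check_active_before_gc()) {
    return false;  // GC is disabled for now
  }
  return true;     // the real collection work would happen here
}
```

In this model a collection requested while a thread sits in a critical region is skipped, and the request stays visible through is_active_and_needs_gc() until the region is left.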

(2). Determine whether all soft references should be cleared.

const bool do_clear_all_soft_refs = clear_all_soft_refs ||
                        collector_policy()->should_clear_all_soft_refs();

(3). Record how much of the permanent generation is currently in use.

const size_t perm_prev_used = perm_gen()->used();

(4). Determine whether this collection is a full GC, and choose the cause label used in log output (GC / Full GC (System) / Full GC).

bool complete = full && (max_level == (n_gens()-1));
const char* gc_cause_str = "GC ";
if (complete) {
  GCCause::Cause cause = gc_cause();
  if (cause == GCCause::_java_lang_system_gc) {
    gc_cause_str = "Full GC (System) ";
  } else {
    gc_cause_str = "Full GC ";
  }
}

(5). Increment the GC counters (the total GC count and, for a full GC, the full GC count).

increment_total_collections(complete);

(6). Record how much of the heap is currently in use.

size_t gch_prev_used = used();

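(7). If this is a full GC, search from the oldest generation down to the youngest. The first generation that does not want the generations younger than it to be collected separately becomes the starting generation for this GC. A word on what "collected separately" means: a young generation such as DefNewGeneration reclaims its own memory with a copying algorithm, whereas the old generation TenuredGeneration reclaims memory with its mark-sweep-compact algorithm, which also processes the young generation. So a DefNewGeneration collection collects the young generation on its own, while a TenuredGeneration collection covers the old generation and every younger generation.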

int starting_level = 0;
if (full) {
  // Search for the oldest generation which will collect all younger
  // generations, and start collection loop there.
  for (int i = max_level; i >= 0; i--) {
    if (_gens[i]->full_collects_younger_generations()) {
      starting_level = i;
      break;
    }
  }
}
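The search for the starting generation can be tried out in isolation. The following is a minimal sketch with hypothetical types (ToyGeneration, toy_starting_level, toy_default_heap); it assumes a two-generation heap in which only the old generation reports that it collects younger generations itself, which matches the description of TenuredGeneration above but is hard-coded here for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified generation descriptor.
struct ToyGeneration {
  const char* name;
  bool full_collects_younger_generations;
};

// Mirrors the search in do_collection(): for a full GC, pick the oldest
// generation (index <= max_level) that collects all younger ones itself.
// For a non-full GC the collection always starts at level 0.
inline int toy_starting_level(const std::vector<ToyGeneration>& gens,
                              bool full, int max_level) {
  int starting_level = 0;
  if (full) {
    for (int i = max_level; i >= 0; i--) {
      if (gens[i].full_collects_younger_generations) {
        starting_level = i;
        break;
      }
    }
  }
  return starting_level;
}

// Two-generation heap: index 0 is the young generation, index 1 the old one.
// The flag values are assumptions made for this sketch.
inline std::vector<ToyGeneration> toy_default_heap() {
  return {{"DefNew", false}, {"Tenured", true}};
}
```

With this heap, a full GC with max_level 1 starts directly at the old generation, so the young generation is not collected on its own first; a non-full GC starts at level 0.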

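2. Next, collection proceeds from the starting generation toward the oldest generation.
(1). should_collect() reports, based on the generation's own GC conditions, whether that generation should be collected. If the generation about to be collected is the oldest one and this GC was not classified as a full GC earlier, increment_total_full_collections() is called to correct the full GC count that was not incremented in step 1.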

int max_level_collected = starting_level;
for (int i = starting_level; i <= max_level; i++) {
  if (_gens[i]->should_collect(full, size, is_tlab)) {
    if (i == n_gens() - 1) {  // a major collection is to happen
      if (!complete) {
        // The full_collections increment was missed above.
        increment_total_full_collections();
      }

(2). Record the space used by this generation before the GC, along with other bookkeeping.
(3). Verification.

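First prepare_for_verify() is called so that each generation can prepare for verification (normally nothing needs to be done), then Universe::verify() performs the pre-GC verification.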

if (VerifyBeforeGC && i >= VerifyGCLevel &&
    total_collections() >= VerifyGCStartAt) {
  HandleMark hm;  // Discard invalid handles created during verification
  if (!prepared_for_verification) {
    prepare_for_verify();
    prepared_for_verification = true;
  }
  gclog_or_tty->print(" VerifyBeforeGC:");
  Universe::verify(true);
}

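Verification covers the threads, the heap (every generation), the symbol table, the string table, the code cache, the system dictionary, and so on. Heap verification, for example, checks the Klass of every oop in the heap: whether the object is an oop, whether its klass lives in the permanent generation, and whether the oop's klass field really is a klass. Why verify at this point, and what are pre-GC and post-GC verification each for? Both VerifyBeforeGC and VerifyAfterGC must be combined with UnlockDiagnosticVMOptions and are meant for diagnosing JVM problems. Verification is very time-consuming, so normal builds do not output the verification content.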
(4). Save the bump (allocation) pointer of each space in the generation into that space's _saved_mark_word field.

save_marks();
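As an illustration of what saving a mark means for a bump-pointer space, here is a minimal sketch. ToySpace and its members are hypothetical and use offsets instead of real addresses. The article does not say what the saved mark is used for; the query for "bytes allocated since the mark" is an assumption about one plausible use.

```cpp
#include <cassert>
#include <cstddef>

// Toy contiguous space with bump-pointer allocation. Offsets stand in for
// addresses; all names are illustrative.
class ToySpace {
 public:
  explicit ToySpace(std::size_t capacity) : capacity_(capacity) {}

  std::size_t capacity() const { return capacity_; }

  // Bump-pointer allocation: returns the offset of the new block, or
  // capacity() when the request does not fit.
  std::size_t allocate(std::size_t size) {
    if (size > capacity_ - top_) return capacity_;
    std::size_t result = top_;
    top_ += size;
    return result;
  }

  // Remember where allocation currently ends.
  void save_mark() { saved_mark_ = top_; }

  // True when nothing has been allocated since the last save_mark().
  bool no_allocs_since_save_mark() const { return saved_mark_ == top_; }

  // Bytes allocated since the last save_mark().
  std::size_t allocated_since_save_mark() const { return top_ - saved_mark_; }

 private:
  std::size_t capacity_;
  std::size_t top_ = 0;         // the bump pointer
  std::size_t saved_mark_ = 0;  // value of top_ at the last save_mark()
};
```

Everything between the saved mark and the current bump pointer was allocated after the mark was taken, so a collector can locate exactly those objects without scanning the whole space.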

(5). Initialize the reference processor.

ReferenceProcessor* rp = _gens[i]->ref_processor();
if (rp->discovery_is_atomic()) {
  rp->verify_no_references_recorded();
  rp->enable_discovery();
  rp->setup_policy(do_clear_all_soft_refs);
} else {
  // collect() below will enable discovery as appropriate
}

(6). Let the generation perform its GC.

_gens[i]->collect(full, do_clear_all_soft_refs, size, is_tlab);

(7). Add the reference objects that have become unreachable to the pending list of Reference.

  if (!rp->enqueuing_is_done()) {
    rp->enqueue_discovered_references();
  } else {
    rp->set_enqueuing_is_done(false);
  }
  rp->verify_no_references_recorded();
}

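enqueue_discovered_references() picks an instantiation of the template function enqueue_discovered_ref_helper() depending on whether compressed pointers are in use. enqueue_discovered_ref_helper() is implemented as follows: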

template <class T>
bool enqueue_discovered_ref_helper(ReferenceProcessor* ref,
                                   AbstractRefProcTaskExecutor* task_executor) {
  T* pending_list_addr = (T*)java_lang_ref_Reference::pending_list_addr();
  T old_pending_list_value = *pending_list_addr;
  ref->enqueue_discovered_reflists((HeapWord*)pending_list_addr, task_executor);
  oop_store(pending_list_addr, oopDesc::load_decode_heap_oop(pending_list_addr));
  ref->disable_discovery();
  return old_pending_list_value != *pending_list_addr;
}

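pending_list_addr is the address of the head of the pending list, a private static (class) member of Reference. During GC, when the reachability of a reference object changes, the reference is added to the pending list. The ReferenceHandler thread, a private static nested class of Reference, keeps taking references off the pending list and adding them to their ReferenceQueue.
enqueue_discovered_reflists() behaves differently depending on whether multithreading is used. With multiple threads it creates a RefProcEnqueueTask and hands it to the AbstractRefProcTaskExecutor. Here we analyze the single-threaded, serial case.
In the serial case, the DiscoveredList array that starts at _discoveredSoftRefs holds up to _max_num_q * subclasses_of_ref discovered-reference lists. After a list has been processed, its head is reset to the sentinel reference and its length is set to 0, which marks the list as empty.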

void ReferenceProcessor::enqueue_discovered_reflists(HeapWord* pending_list_addr,
  AbstractRefProcTaskExecutor* task_executor) {
  if (_processing_is_mt && task_executor != NULL) {
    // Parallel code
    RefProcEnqueueTask tsk(*this, _discoveredSoftRefs,
                           pending_list_addr, sentinel_ref(), _max_num_q);
    task_executor->execute(tsk);
  } else {
    // Serial code: call the parent class's implementation
    for (int i = 0; i < _max_num_q * subclasses_of_ref; i++) {
      enqueue_discovered_reflist(_discoveredSoftRefs[i], pending_list_addr);
      _discoveredSoftRefs[i].set_head(sentinel_ref());
      _discoveredSoftRefs[i].set_length(0);
    }
  }
}

enqueue_discovered_reflist() works as follows:

Take the first element of refs_list; next is the following element in the list that is linked through the discovered field.

oop obj = refs_list.head();
while (obj != sentinel_ref()) {
  assert(obj->is_instanceRef(), "should be reference object");
  oop next = java_lang_ref_Reference::discovered(obj);

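If next is the terminating sentinel reference, obj is the last element of the list. The head of refs_list is atomically exchanged with the value stored at pending_list_addr, which puts the whole list at the front of the pending list. The tail element obj is then linked to whatever was there before: if the pending list was empty, obj's next field points to obj itself; otherwise obj's next field points to the old head of the pending list, so the new list sits in front of the old one. That is: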

  if (next == sentinel_ref()) {  // obj is last
    // Swap refs_list into pendling_list_addr and
    // set obj's next to what we read from pending_list_addr.
    oop old = oopDesc::atomic_exchange_oop(refs_list.head(), pending_list_addr);
    // Need oop_check on pending_list_addr above;
    // see special oop-check code at the end of
    // enqueue_discovered_reflists() further below.
    if (old == NULL) {
      // obj should be made to point to itself, since
      // pending list was empty.
      java_lang_ref_Reference::set_next(obj, obj);
    } else {
      java_lang_ref_Reference::set_next(obj, old);
    }

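Otherwise, when next is not the sentinel, the reference object's next field is set to next. In other words, starting from the head of the list, the list linked through the VM-internal discovered field is converted into a pending list linked through the next field, which the Java layer can use.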

  } else {
    java_lang_ref_Reference::set_next(obj, next);
  }

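Finally, the reference object's discovered field is set to NULL, which cuts the current reference out of the discovered-linked list, and traversal continues with the next element.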

  java_lang_ref_Reference::set_discovered(obj, (oop) NULL);
  obj = next;
}

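In summary, enqueueing traverses the list through the original discovered fields, relinks it through the next fields, severs the discovered links, and attaches the new list at the head of the pending list.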
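The relinking can be reproduced with a small single-threaded model. This is a sketch under simplifying assumptions, not the VM code: ToyRef, toy_enqueue and toy_pending_ids are hypothetical names, plain pointers replace oops, and an ordinary assignment replaces the atomic exchange.

```cpp
#include <cassert>
#include <vector>

// Toy reference object with the two link fields discussed above.
struct ToyRef {
  explicit ToyRef(int i) : id(i) {}
  int id;
  ToyRef* discovered = nullptr;  // VM-side link
  ToyRef* next = nullptr;        // Java-side link
};

// Single-threaded model of enqueueing one discovered list. `sentinel`
// terminates the discovered list; `*pending` is the head of the pending
// list (nullptr when it is empty).
inline void toy_enqueue(ToyRef* head, ToyRef* sentinel, ToyRef** pending) {
  ToyRef* obj = head;
  while (obj != sentinel) {
    ToyRef* next = obj->discovered;
    if (next == sentinel) {    // obj is the last element
      ToyRef* old = *pending;  // plain exchange; the VM does this atomically
      *pending = head;
      obj->next = (old == nullptr) ? obj : old;
    } else {
      obj->next = next;
    }
    obj->discovered = nullptr;  // sever the discovered link
    obj = next;
  }
}

// Walks the pending list through `next` and returns the ids in order.
// The list ends at the element whose next field points to itself.
inline std::vector<int> toy_pending_ids(ToyRef* pending) {
  std::vector<int> ids;
  ToyRef* r = pending;
  while (r != nullptr) {
    ids.push_back(r->id);
    r = (r->next == r) ? nullptr : r->next;
  }
  return ids;
}
```

Enqueueing the list 1 -> 2 into an empty pending list and then the list 3 -> 4 leaves the pending list as 3, 4, 1, 2: each newly enqueued list lands in front of the earlier one, and the last element of the oldest list points to itself.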

(8). Back to the handling after the GC has finished: update the statistics and perform post-GC verification.

3. Output some GC log information.

complete = complete || (max_level_collected == n_gens() - 1);

if (complete) { // We did a "major" collection
  post_full_gc_dump();   // do any post full gc dumps
}
if (PrintGCDetails) {
  print_heap_change(gch_prev_used);
  // Print perm gen info for full GC with PrintGCDetails flag.
  if (complete) {
    print_perm_heap_change(perm_prev_used);
  }
}

4. Update the size of each generation.

for (int j = max_level_collected; j >= 0; j -= 1) {
  // Adjust generation sizes.
  _gens[j]->compute_new_size();
}

5. After a full GC, update and adjust the size of the permanent generation.

if (complete) {
  // Ask the permanent generation to adjust size for full collections
  perm()->compute_new_size();
  update_full_collections_completed();
}

6. If ExitAfterGCNum is configured, exit the VM once the number of GCs reaches the user-configured maximum.

if (ExitAfterGCNum > 0 && total_collections() == ExitAfterGCNum) {
  tty->print_cr("Stopping after GC #%d", ExitAfterGCNum);
  vm_exit(-1);
}

[Figure: flowchart of the generation-independent GC process]

