Garbage collection mechanism and memory allocation policy

Source: Internet
Author: User

We often say that garbage collection (GC) has to settle three questions:

Which memory needs to be reclaimed? When should it be reclaimed? How is it reclaimed?

Before turning to these three questions, let's see why garbage collection is needed at all.
Object instances, variables, and other data keep accumulating as a program runs. If they were never reclaimed, the program's performance would inevitably degrade, leading to stutter, slowness, or even system failures.

    • Which memory needs to be reclaimed?

As discussed earlier, the program counter, the virtual machine stack, and the native method stack are thread-private data areas: they are created with a thread and destroyed with it. Because memory allocation and reclamation in these areas are deterministic, there is little need to think about collecting them. The Java heap and the method area are different: allocation and reclamation there are dynamic, and this is the memory that garbage collection is concerned with.

    • When is it reclaimed?

Before reclaiming anything, the collector has to know which objects are still alive and which are dead.
Put simply, an object is dead, and can be reclaimed, once nothing references it any more.

How can I tell if an object survives or dies?
Reference counting algorithm

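Add a reference counter to each object. Every time something references the object, the counter is incremented by 1; when a reference becomes invalid, the counter is decremented by 1. When the counter reaches 0, the object can no longer be used.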

Reference counting is simple and efficient. Microsoft's COM, the Python language, and Squirrel (widely used in game scripting) all use it for memory management.
Java, however, does not manage memory with reference counting, mainly because the approach struggles with circular references. When two objects are no longer useful they should be reclaimed, but if they refer to each other, neither counter drops to 0, so their memory cannot be reclaimed.
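
The circular-reference problem can be reproduced with a toy counter. The following is a minimal sketch, not how any real runtime implements reference counting; the classes RefCountCycleDemo and RefCounted and the method setField are hypothetical names made up for this illustration.

```java
// Toy model of reference counting. All names here are hypothetical.
class RefCountCycleDemo {
    // Builds a <-> b, drops the external references, and returns the remaining counts.
    static int[] countsAfterDroppingRoots() {
        RefCounted a = new RefCounted("a");
        RefCounted b = new RefCounted("b");
        a.refCount++;   // the program's variable 'a' counts as one reference
        b.refCount++;   // the program's variable 'b' counts as one reference
        a.setField(b);  // a -> b
        b.setField(a);  // b -> a
        a.refCount--;   // simulate "a = null" in the program
        b.refCount--;   // simulate "b = null" in the program
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = countsAfterDroppingRoots();
        // Neither counter reaches 0, so a pure reference counter never frees the pair.
        System.out.println("a=" + counts[0] + ", b=" + counts[1]);
    }
}

class RefCounted {
    final String name;
    int refCount = 0;        // number of incoming references
    RefCounted field = null; // a single outgoing reference

    RefCounted(String name) { this.name = name; }

    // Simulates "this.field = target" while keeping both counters up to date.
    void setField(RefCounted target) {
        if (target != null) target.refCount++;
        if (field != null) field.refCount--;
        field = target;
    }
}
```

Both counters end at 1 even though the program can no longer reach either object, which is exactly the case a plain reference counter cannot reclaim.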

Java uses another algorithm to determine whether an object is alive: reachability analysis.
The algorithm takes a set of objects called GC Roots as starting points and searches downward along references. If no path from any GC Root reaches an object, the object is unusable and is judged reclaimable. (Such an object is not necessarily reclaimed; as we will see, it has one chance to save itself.)

For example, even though object5, object6, and object7 reference one another, they are not reachable from the GC Roots, so they are judged reclaimable.
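
Reachability analysis is essentially a graph search. The sketch below uses a hypothetical Node class as a stand-in for heap objects and a breadth-first search from a single root; a real collector works on raw memory and many kinds of roots, so this only illustrates the idea.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ReachabilityDemo {
    // Hypothetical stand-in for a heap object: a name plus outgoing references.
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Breadth-first search from the roots; returns the names of all reachable nodes.
    static Set<String> reachable(List<Node> roots) {
        Set<Node> seen = new HashSet<>(roots);
        Deque<Node> queue = new ArrayDeque<>(roots);
        while (!queue.isEmpty()) {
            Node n = queue.poll();
            for (Node next : n.refs) {
                if (seen.add(next)) queue.add(next); // visit each node once
            }
        }
        Set<String> names = new HashSet<>();
        for (Node n : seen) names.add(n.name);
        return names;
    }

    // Graph: root -> object1 -> object2, plus an island object5 -> object6 -> object7 -> object5.
    static Set<String> demo() {
        Node root = new Node("root");
        Node o1 = new Node("object1");
        Node o2 = new Node("object2");
        Node o5 = new Node("object5");
        Node o6 = new Node("object6");
        Node o7 = new Node("object7");
        root.refs.add(o1);
        o1.refs.add(o2);
        o5.refs.add(o6);
        o6.refs.add(o7);
        o7.refs.add(o5);
        List<Node> roots = new ArrayList<>();
        roots.add(root);
        return reachable(roots);
    }

    public static void main(String[] args) {
        Set<String> live = demo();
        System.out.println("object2 reachable: " + live.contains("object2"));
        System.out.println("object5 reachable: " + live.contains("object5"));
    }
}
```

The cycle among object5, object6, and object7 does not keep them alive here, because the search never enters the island from the root.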

So what exactly can serve as GC Roots? They include the following:
1. Objects referenced in the virtual machine stack (the local variable table in a stack frame)
2. Objects referenced by class static fields in the method area
3. Objects referenced by constants in the method area
4. Objects referenced from the native method stack (native methods, i.e. JNI)

All of the above are references. In the traditional definition, a value is a reference if the data it stores is the start address of another block of memory.
This definition is too narrow, though. Since JDK 1.2, Java has divided references into strong, soft, weak, and phantom references (the last is sometimes translated as "virtual references").
Strong references : Ordinary references, such as those created with new. As long as a strong reference exists, the referenced object will never be reclaimed.
Soft references : Describe objects that are useful but not essential, such as objects in a cache. Such an object is optional: it can be reclaimed when memory runs low, while keeping it in memory improves the program's performance.
Weak references : Also describe non-essential objects, but are weaker than soft references. Objects referenced only by weak references are reclaimed at the next garbage collection, regardless of whether memory is sufficient.
Phantom references : The weakest reference relationship. A phantom reference has no effect on an object's lifetime, and an instance of the object cannot be obtained through it. Its only purpose is to receive a system notification when the object is reclaimed by the garbage collector.
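
The four kinds of reference map onto classes in the java.lang.ref package. The sketch below only shows the parts that behave deterministically: while a strong reference still exists, soft and weak references return the object, and a phantom reference never returns it. The class name ReferenceKindsDemo and the method observe are made up for this example; when the collector actually clears soft or weak references depends on the JVM and is not demonstrated here.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

class ReferenceKindsDemo {
    // Reports what each reference type returns while the referent is still strongly held.
    static boolean[] observe() {
        Object strong = new Object(); // strong reference
        SoftReference<Object> soft = new SoftReference<>(strong);
        WeakReference<Object> weak = new WeakReference<>(strong);
        // A phantom reference must be registered with a queue, which receives
        // the reference after the referent has been collected.
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(strong, queue);
        return new boolean[] {
            soft.get() == strong,   // still strongly reachable, so not cleared
            weak.get() == strong,   // still strongly reachable, so not cleared
            phantom.get() == null   // get() on a phantom reference always returns null
        };
    }

    public static void main(String[] args) {
        boolean[] r = observe();
        System.out.println("soft returns referent:  " + r[0]);
        System.out.println("weak returns referent:  " + r[1]);
        System.out.println("phantom returns null:   " + r[2]);
    }
}
```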

Can an object save itself?
As mentioned earlier, an object that has been judged reclaimable is not reclaimed right away.
If the object does not override the finalize() method, or if the virtual machine has already called its finalize() method, it is reclaimed directly.
If finalize() is overridden and has not yet been called by the virtual machine, the method is the object's last chance to survive: it only needs to re-associate itself with any object on a reference chain (a path reachable from the GC Roots). The object is then removed from the "about to be reclaimed" set at the second marking (the first marking being the one that found it unreachable from the GC Roots).

The following code demonstrates this:

```java
/**
 * An object saving itself, once.
 * If reachability analysis finds no reference chain from the GC Roots to an object,
 * the object is marked for the first time and then filtered.
 * The filter criterion is whether the object needs to run its finalize() method.
 * finalize() is executed at most once per object.
 */
public class FinalizeEscapeGC {

    public static FinalizeEscapeGC SAVE_HOOK = null;

    public void isAlive() {
        System.out.println("I am Alive");
    }

    @Override
    protected void finalize() throws Throwable {
        super.finalize();
        System.out.println("Finalize method executed");
        // Save yourself: re-attach this object to a static field
        FinalizeEscapeGC.SAVE_HOOK = this;
    }

    public static void main(String[] args) throws InterruptedException {
        SAVE_HOOK = new FinalizeEscapeGC();

        // First attempt: finalize() has not run yet, so the object can rescue itself
        SAVE_HOOK = null;
        System.gc();
        Thread.sleep(500); // 0.5 sec, give the low-priority finalizer thread time to run
        if (SAVE_HOOK != null) {
            SAVE_HOOK.isAlive();
        } else {
            System.out.println("I am Dead");
        }

        // Second attempt: finalize() is not called again, so the rescue fails
        SAVE_HOOK = null;
        System.gc();
        Thread.sleep(500); // 0.5 sec
        if (SAVE_HOOK != null) {
            SAVE_HOOK.isAlive();
        } else {
            System.out.println("I am Dead");
        }
    }
}
```

Reclaiming the method area, which is the permanent generation in HotSpot.
Garbage collection in the permanent generation mainly reclaims two things: obsolete constants and useless classes.
Reclaiming obsolete constants is similar to reclaiming objects in the Java heap.
A class must meet all three of the following conditions to be considered useless:

1. All instances of the class have been reclaimed; that is, no instance of the class exists in the Java heap.
2. The ClassLoader that loaded the class has been reclaimed.
3. The java.lang.Class object for the class is not referenced anywhere, so the class's methods cannot be accessed through reflection from anywhere.

    • How is it reclaimed?

Memory is reclaimed by the garbage collector, which can be built on a number of algorithms. Let's look at the GC algorithms first.

1. Mark-sweep algorithm (Mark-Sweep):
This is the most basic collection algorithm. It has two phases, marking and sweeping: first mark all objects that need to be reclaimed, then reclaim all marked objects in one pass. The marking process is the reachability analysis described above.
The algorithm has two drawbacks. Neither marking nor sweeping is very efficient, and sweeping leaves behind a large number of discontiguous memory fragments. When a large object later has to be allocated and no sufficiently large contiguous block can be found, another garbage collection has to be triggered early.

2. Copying algorithm (Copying):
Divide memory into two blocks of equal size and use only one at a time. When the block in use fills up, copy the surviving objects to the other block, then clear the used block in one go. The cost is that usable memory is cut in half. Because most young-generation objects in Java die quickly, this algorithm is commonly used to collect the young generation.
During a collection, the objects still alive in Eden and in the Survivor space in use are copied in one pass to the other Survivor space, and then Eden and the just-used Survivor space are cleared. When the Survivor space is too small to hold the survivors, the old generation serves as an allocation guarantee.

3. Mark-compact algorithm (Mark-Compact):
Objects in the old generation have already survived many collections. If the copying algorithm were used there, nearly all objects might survive each time, and copying would become very inefficient.
The mark-compact algorithm was proposed to suit the old generation: it marks objects first, then moves all surviving objects toward one end and directly clears the memory beyond the end boundary.
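
The difference between sweeping and compacting can be seen on a toy heap. This is a sketch under simplifying assumptions: the heap is an int array, every object occupies exactly one slot, 0 means a free slot, and the set of marked (live) ids is given rather than computed. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SweepVsCompactDemo {
    static final int FREE = 0;

    // Mark-sweep: free every slot whose object was not marked; survivors stay where they are.
    static int[] sweep(int[] heap, Set<Integer> marked) {
        int[] out = heap.clone();
        for (int i = 0; i < out.length; i++) {
            if (out[i] != FREE && !marked.contains(out[i])) out[i] = FREE;
        }
        return out;
    }

    // Mark-compact: slide marked objects toward index 0; everything past the boundary is free.
    static int[] compact(int[] heap, Set<Integer> marked) {
        int[] out = new int[heap.length]; // all slots start as FREE
        int next = 0;
        for (int id : heap) {
            if (id != FREE && marked.contains(id)) out[next++] = id;
        }
        return out;
    }

    // Size of the largest contiguous run of free slots, i.e. the biggest object that still fits.
    static int largestFreeRun(int[] heap) {
        int best = 0;
        int run = 0;
        for (int id : heap) {
            run = (id == FREE) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        int[] heap = {1, 2, 3, 4, 5, 6, 7, 8};
        Set<Integer> marked = new HashSet<>(Arrays.asList(1, 3, 5, 7)); // live objects
        int[] swept = sweep(heap, marked);
        int[] compacted = compact(heap, marked);
        System.out.println("swept:     " + Arrays.toString(swept)
                + " largest free run = " + largestFreeRun(swept));
        System.out.println("compacted: " + Arrays.toString(compacted)
                + " largest free run = " + largestFreeRun(compacted));
    }
}
```

Both strategies free the same four slots, but after sweeping the free space is split into runs of one slot, while after compacting it forms a single run of four.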
  

    • Garbage collector:

        
      Garbage collection algorithms are the methodology of memory reclamation; garbage collectors are its concrete implementation.
      Because the Java Virtual Machine specification does not prescribe how a garbage collector should be implemented, the collectors provided by different vendors and different versions of virtual machines can differ greatly.
      The collectors of the HotSpot virtual machine are described below (in the original figure, a line connecting two collectors indicates that they can be used together):

Serial collector (serial)

The Serial collector is the most basic and longest-established collector, and it is single-threaded.
Single-threaded here means not only that it uses one thread to collect garbage but, more importantly, that all other working threads must be suspended while it collects, until the collection finishes. This is "Stop The World".
Even so, the Serial collector is the default young-generation collector for virtual machines running in client mode. In client mode, a pause of a few dozen milliseconds is acceptable as long as it does not happen often.

ParNew collector (parallel)
The ParNew collector can be understood as the multithreaded version of the Serial collector. It is often the preferred young-generation collector for virtual machines running in server mode, because it can work together with the CMS collector.

Parallel Scavenge collector (parallel collection)
The Parallel Scavenge collector is also a multithreaded young-generation collector, similar to ParNew.
What sets Parallel Scavenge apart is its focus. Collectors such as CMS aim to shorten the time user threads are paused during garbage collection, whereas Parallel Scavenge aims to reach a controllable throughput, where throughput = time running user code / (time running user code + garbage collection time). It is therefore also known as the throughput-first collector.
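
As a quick worked example of the throughput ratio, the sketch below plugs in made-up numbers (99 minutes of user code and 1 minute of garbage collection); the class and method names are hypothetical.

```java
class ThroughputDemo {
    // throughput = time running user code / (time running user code + garbage collection time)
    static double throughput(double userTime, double gcTime) {
        return userTime / (userTime + gcTime);
    }

    public static void main(String[] args) {
        // Hypothetical run: 100 minutes in total, 1 of which is spent in garbage collection.
        double t = throughput(99.0, 1.0);
        System.out.printf("throughput = %.0f%%%n", t * 100); // 99 / 100
    }
}
```

Only the ratio matters, so the two arguments can be in any unit as long as it is the same for both.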
In addition, Parallel Scavenge has an adaptive tuning strategy: when -XX:+UseAdaptiveSizePolicy is turned on, the virtual machine collects performance monitoring information as the system runs and adjusts its parameters dynamically.

Serial Old collector
Serial Old is the old-generation version of the Serial collector. It is also single-threaded and uses the mark-compact algorithm, whereas the three young-generation collectors above all use the copying algorithm.

Parallel Old collector
Parallel Old is the old-generation version of the Parallel Scavenge collector, using multiple threads and the mark-compact algorithm. Where throughput is the priority and CPU resources are a concern, consider the combination Parallel Scavenge + Parallel Old.

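CMS collector (concurrent)
The CMS (Concurrent Mark Sweep) collector aims for the shortest possible collection pause.
CMS is based on the mark-sweep algorithm, and its operation is divided into four phases:

1. Initial mark: mark the objects that the GC Roots can reach directly.
2. Concurrent mark: trace from the GC Roots (reachability analysis) to determine which objects are alive.
3. Remark: correct the marks of objects whose state changed because user threads kept running during concurrent marking; this pause is longer than the initial-mark pause.
4. Concurrent sweep: reclaim the objects found to be dead.

The most time-consuming phases are concurrent marking and concurrent sweeping, but both run alongside the user threads.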

CMS was close to an epoch-making garbage collector, but it has three drawbacks:

1. It is very sensitive to CPU resources. The concurrent phases do not pause user threads, but they occupy some threads (CPU resources), which slows the application and lowers total throughput. By default CMS starts (number of CPUs + 3) / 4 collector threads, so with four or more CPUs the collector takes roughly a quarter of the CPU resources, and with fewer CPUs its share, and its impact on the user program, is considerably larger.
2. It cannot handle floating garbage. Because user threads are still running during the concurrent sweep phase, new garbage keeps being produced. This garbage appears after marking, so CMS cannot deal with it in the current collection and has to leave it for the next GC. It is called floating garbage. For the same reason, CMS must reserve enough memory for the user threads to use during collection: unlike other collectors, it cannot wait until the old generation is almost full before collecting, and has to set aside part of the space for the program to use while the concurrent collection runs.
3. Because CMS is based on the mark-sweep algorithm, a collection leaves a large amount of memory fragmentation. When no contiguous space large enough can be found, a Full GC is triggered. CMS provides the -XX:+UseCMSCompactAtFullCollection switch, which turns on a defragmentation (compaction) step when the CMS collector can no longer hold out and has to perform a Full GC. Compaction cannot run concurrently, so the pause inevitably becomes longer.
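
To get a feel for the first drawback, the sketch below evaluates the (number of CPUs + 3) / 4 formula quoted above for a few CPU counts. It assumes integer division, and the class and method names are hypothetical; it does not query a real JVM.

```java
class CmsThreadsDemo {
    // Default number of CMS collector threads as quoted in the text, assuming integer division.
    static int cmsThreads(int cpus) {
        return (cpus + 3) / 4;
    }

    public static void main(String[] args) {
        int[] cpuCounts = {1, 2, 4, 8, 16};
        for (int cpus : cpuCounts) {
            int threads = cmsThreads(cpus);
            // Share of the machine's CPUs that the collector threads can occupy.
            double share = 100.0 * threads / cpus;
            System.out.printf("%2d CPUs -> %d GC thread(s), %.0f%% of CPUs%n", cpus, threads, share);
        }
    }
}
```

On one or two CPUs a single collector thread already takes 100% or 50% of the machine, which is why the slowdown is most noticeable on small machines.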

G1 Collector
G1 can use multiple CPUs to shorten Stop-The-World pauses. Unlike CMS, G1 is based on the mark-compact algorithm when viewed as a whole and on the copying algorithm when viewed locally. Either way it produces no memory fragmentation, so a large allocation will not trigger a GC early for lack of contiguous space. Besides lowering pauses, G1 can build a predictable pause-time model that lets the user specify that, within any time slice of M milliseconds, no more than N milliseconds may be spent on garbage collection.
G1 also changes the heap layout considerably. It divides the Java heap into many independent regions of equal size (Region). The young and old generations are no longer physically separated; each is a set of regions that need not be contiguous.
G1 collection is divided into four phases:

Initial mark, concurrent mark, final mark, and screening and reclamation.

Concurrency and parallelism in the garbage collector:

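Parallel: multiple garbage collection threads work in parallel while the user threads wait.
Concurrent: user threads and garbage collection threads execute at the same time; the user program keeps running while the garbage collector runs on another CPU.

    • Memory allocation and reclamation policy: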

Broadly speaking, objects are allocated on the heap, mainly in the Eden space of the young generation. If thread-local allocation buffers (TLAB) are enabled, allocation is attempted in the thread's own TLAB first.

1. Objects are allocated in Eden first
2. Large objects go directly into the old generation
The -XX:PretenureSizeThreshold parameter makes objects larger than the given size be allocated directly in the old generation, which avoids a large amount of memory copying between Eden and the two Survivor spaces.
3. Long-lived objects enter the old generation: The virtual machine gives each object an age counter. If an object is born in Eden, survives its first Minor GC, and fits into a Survivor space, it is moved there and its age is set to 1. Each further Minor GC the object survives in the Survivor space increases its age by 1. When the age reaches a certain threshold, the object is promoted to the old generation. The threshold is set with the -XX:MaxTenuringThreshold parameter.
4. Dynamic age determination
If the total size of all objects of the same age in the Survivor space is greater than half of the Survivor space, objects of that age or older can enter the old generation directly.

5. Space allocation guarantee
Before a Minor GC, the virtual machine first checks whether the largest available contiguous space in the old generation is greater than the total size of all objects in the young generation. If it is, the Minor GC is safe. If not, the virtual machine checks whether the HandlePromotionFailure setting allows the guarantee to fail. If it does, the virtual machine goes on to check whether the largest available contiguous space in the old generation is greater than the average size of objects promoted in previous collections; if so, it attempts a Minor GC, even though that is risky. If the space is smaller, or HandlePromotionFailure does not allow the risk, a Full GC is performed instead.

After JDK 6 Update 24 the rule becomes:

If the contiguous space in the old generation is greater than the total size of the young-generation objects or the average size of previous promotions, a Minor GC is performed; otherwise a Full GC is performed.
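
Rules 4 and 5 above are small decision procedures, so they can be written down directly. The sketch below follows the rules as worded in this article (not HotSpot's source code); the class name, method names, and parameters are hypothetical, and sizes are plain byte counts supplied by the caller.

```java
class PromotionRulesDemo {
    // Rule 4 (dynamic age determination), as worded in the text.
    // bytesByAge[a] is the total size of Survivor objects of age a (index 0 is unused).
    // Returns the age from which objects are promoted: the first age whose same-age
    // total exceeds half the Survivor space, or maxTenuringThreshold if there is none.
    static int tenuringThreshold(long[] bytesByAge, long survivorCapacity, int maxTenuringThreshold) {
        for (int age = 1; age < bytesByAge.length && age < maxTenuringThreshold; age++) {
            if (bytesByAge[age] * 2 > survivorCapacity) return age;
        }
        return maxTenuringThreshold;
    }

    // Rule 5 (space allocation guarantee) in its form after JDK 6 Update 24.
    static String chooseCollection(long oldGenContiguousFree, long youngGenObjectBytes, long avgPromotedBytes) {
        boolean enoughRoom = oldGenContiguousFree > youngGenObjectBytes
                || oldGenContiguousFree > avgPromotedBytes;
        return enoughRoom ? "Minor GC" : "Full GC";
    }

    public static void main(String[] args) {
        // Survivor space of 100 bytes: age-2 objects occupy 60 bytes, more than half.
        long[] bytesByAge = {0, 10, 60, 5};
        System.out.println("promote from age " + tenuringThreshold(bytesByAge, 100, 15));

        // Old generation has 100 bytes free; the young generation holds 200 bytes of objects.
        System.out.println(chooseCollection(100, 200, 50));   // average promotion fits
        System.out.println(chooseCollection(100, 200, 150));  // neither condition holds
    }
}
```

With these made-up numbers, objects of age 2 and older are promoted even though MaxTenuringThreshold is 15, and the second call falls back to a Full GC because the old generation has room for neither the whole young generation nor an average promotion.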

Why are two Survivor spaces needed?
An answer found online:
In the original copying collection algorithm, the space is divided into two halves, called semispaces. Allocation happens in one half, the "from" space; when from is nearly full or some other condition is met, the reachable objects are copied to the "to" space, and the logical roles of from and to are swapped. Plain copying collection copes poorly with long-lived objects: they are still alive at every collection, so they are copied again and again. As a trade-off, all HotSpot GCs other than G1 currently use a similar generational approach, with the copying algorithm used in the young generation. The variant used differs from the original algorithm: a larger area is set aside as Eden, and semispaces are placed between Eden and the old generation as a buffer, so that "medium-lived" objects are collected as far as possible before they reach the old generation. These are HotSpot's survivor spaces. The approach is similar across collectors, although Parallel Scavenge (PS) is implemented a little differently from the other GCs in HotSpot: the others share a generational GC framework introduced in JDK 1.4, while PS provides similar functionality without using that framework.

In short, the from and to Survivor spaces are designed as a pair so that only objects that really belong in the old generation (objects that are referenced for a long time) are filtered out and moved there.
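
The interplay of Eden, the two Survivor spaces, and the age counter can be sketched as a small simulation. This is a simplified model, not HotSpot's implementation: spaces are unbounded lists, liveness is supplied by the caller as a predicate, and an object is promoted once its age reaches a hypothetical tenuringThreshold. All names are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class SurvivorDemo {
    // Hypothetical stand-in for a heap object with an age counter.
    static final class Obj {
        final String name;
        int age = 0;
        Obj(String name) { this.name = name; }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();   // Survivor space currently holding objects
    List<Obj> to = new ArrayList<>();     // empty Survivor space, target of the next copy
    final List<Obj> old = new ArrayList<>();
    final int tenuringThreshold;

    SurvivorDemo(int tenuringThreshold) { this.tenuringThreshold = tenuringThreshold; }

    // One Minor GC: copy live objects out of Eden and 'from', age them, promote the
    // ones that are old enough, clear the collected spaces, then swap from and to.
    void minorGc(Predicate<Obj> isLive) {
        List<Obj> candidates = new ArrayList<>(eden);
        candidates.addAll(from);
        for (Obj o : candidates) {
            if (!isLive.test(o)) continue;       // dead objects are simply not copied
            o.age++;
            if (o.age >= tenuringThreshold) old.add(o);
            else to.add(o);
        }
        eden.clear();
        from.clear();
        List<Obj> tmp = from;                    // swap roles: 'to' becomes the new 'from'
        from = to;
        to = tmp;
    }

    public static void main(String[] args) {
        SurvivorDemo heap = new SurvivorDemo(2);
        heap.eden.add(new Obj("long-lived"));
        heap.eden.add(new Obj("short-lived"));

        heap.minorGc(o -> o.name.equals("long-lived"));
        System.out.println("after GC 1: survivor=" + heap.from.size() + " old=" + heap.old.size());

        heap.minorGc(o -> true);
        System.out.println("after GC 2: survivor=" + heap.from.size() + " old=" + heap.old.size());
    }
}
```

The short-lived object dies in Eden and is never copied; the long-lived one spends one collection in a Survivor space before being promoted, which is the filtering role described above.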

Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
