Deep understanding of Java Virtual Machine-Reading Notes (2): common garbage collection algorithms

Source: Internet
Author: User
Document directory
  • Garbage memory determination algorithms
  • Garbage collection algorithms

This series is a set of reading notes on Deep Understanding of Java Virtual Machine (by Zhou Zhiming).

Garbage collection has three core questions: (1) which memory to reclaim, (2) when to reclaim it, and (3) how to reclaim it.

In Java, the memory areas subject to collection are the heap and the method area. In HotSpot the method area is also called the "permanent generation". Collection there covers two kinds of content: discarded constants and useless classes. Discarded constants are easy to understand. For example, suppose the string constant "ABC" sits in the constant pool; if no String in the system refers to "ABC", the constant can be reclaimed. Deciding that a class is useless is more involved, and all three of the following conditions must hold:

  1. The Java heap contains no instances of the class.
  2. The ClassLoader that loaded the class has been reclaimed.
  3. The java.lang.Class object for the class is not referenced anywhere, and the class cannot be accessed anywhere through reflection.

Garbage memory determination algorithms

Reference counting

Reference counting is the best-known algorithm for deciding whether memory is garbage, and it appears in many textbooks. The principle is to add a reference counter to each object. Whenever the object is referenced from somewhere, the counter increases by 1; whenever a reference is dropped, the counter decreases by 1; when the counter reaches 0, the object can be reclaimed. The algorithm is simple and efficient, but it cannot handle objects that reference each other. For example, in the following code, objects A and B both have the field instance, with A.instance = B and B.instance = A. Nothing else references either object, so both could in fact be reclaimed. But because they reference each other, their reference counts never drop to 0, and reference counting cannot reclaim this kind of garbage.
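
    public class RCGC {
        public Object instance = null;

        private static final int _1MB = 1024 * 1024;

        public static void test() {
            RCGC A = new RCGC();
            RCGC B = new RCGC();

            // Make the two objects reference each other.
            A.instance = B;
            B.instance = A;

            // Drop the only outside references; just the cycle remains.
            A = null;
            B = null;

            // Request a collection (a hint only; the JVM may ignore it).
            System.gc();
        }
    }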
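
To make the counter arithmetic behind the RCGC example concrete, here is a minimal sketch of a hand-rolled reference counter. It is a toy model only: the class RefCountDemo, its Node type, and the setInstance helper are hypothetical names invented for this illustration, and it is not how HotSpot manages memory.

```java
/** Toy model of reference counting; hypothetical, not how HotSpot works. */
final class RefCountDemo {

    /** A stand-in for a heap object that carries an explicit counter. */
    static final class Node {
        int refCount = 0;
        Node instance;   // the single outgoing reference, as in RCGC
    }

    /** Points holder.instance at target and keeps both counters in sync. */
    static void setInstance(Node holder, Node target) {
        if (holder.instance != null) {
            holder.instance.refCount--;
        }
        holder.instance = target;
        if (target != null) {
            target.refCount++;
        }
    }

    /** Replays RCGC.test() and returns the final counters of A and B. */
    static int[] cycleCounts() {
        Node a = new Node();
        Node b = new Node();
        a.refCount++;          // local variable A refers to a
        b.refCount++;          // local variable B refers to b
        setInstance(a, b);     // A.instance = B
        setInstance(b, a);     // B.instance = A
        a.refCount--;          // A = null
        b.refCount--;          // B = null
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = cycleCounts();
        System.out.println("A=" + counts[0] + ", B=" + counts[1]); // prints "A=1, B=1"
    }
}
```

Both counters end at 1 rather than 0, so a collector that relied on counters alone would never reclaim the pair.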

Root search

Mainstream programming languages currently use the root search algorithm to decide whether an object is alive. The idea is to start from a set of objects called "GC roots" and search downward from them. The paths traversed are called "reference chains". When no reference chain connects an object to the GC roots, the object is unreachable. For example, if objects 5, 6, and 7 have no reference chain to the GC roots, they can be reclaimed.

The finalize method

Objects that the root search finds unreachable are not reclaimed immediately, and they can even be "revived" through the finalize method. To sentence an object to "death", it must go through at least two marking passes. If the root search finds no reference chain connecting the object to the GC roots, the object is marked for the first time and then filtered. The filter checks whether the object needs to run its finalize method. If the object does not override finalize, or finalize has already been called, the virtual machine treats both cases as "no need to run finalize". If the JVM decides that finalize does need to run, the object is placed in a queue called the F-Queue, and a low-priority Finalizer thread created automatically by the JVM runs it later. "Run" here means that the JVM triggers the method but does not promise to wait for it to finish. The reason is that if an object's finalize method is slow or loops forever, the other objects in the F-Queue could wait indefinitely and never be reclaimed, which could bring down the entire memory reclamation system. The second marking pass is then applied to the objects in the F-Queue: an object that re-attaches itself to a reference chain inside finalize escapes collection, and the rest are reclaimed.

Garbage collection algorithms

Once the garbage memory has been identified, the next question is how to reclaim it. The following sections describe the ideas behind several common garbage collection algorithms, starting with mark-sweep, whose name describes how it works.
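
All of these algorithms first need to know which objects are alive, which is exactly what the root search described earlier provides. A minimal sketch of that search follows, assuming the heap is modeled as a map from object id to the ids it references. The graph, the choice of object 1 as the only GC root, and the class name RootSearchDemo are made up for illustration; the sketch needs Java 9 or later for Map.of and List.of.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy root search over a hypothetical object graph. */
final class RootSearchDemo {

    /** Returns every object id reachable from the roots via reference chains. */
    static Set<Integer> reachable(Map<Integer, List<Integer>> refs, List<Integer> roots) {
        Set<Integer> marked = new HashSet<>();
        Deque<Integer> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Integer current = pending.pop();
            if (!marked.add(current)) {
                continue;   // already visited; this also keeps cycles from looping forever
            }
            for (Integer next : refs.getOrDefault(current, List.of())) {
                pending.push(next);
            }
        }
        return marked;
    }

    /** Returns the ids with no reference chain to any root, in sorted order. */
    static Set<Integer> garbage(Map<Integer, List<Integer>> refs, List<Integer> roots) {
        Set<Integer> dead = new TreeSet<>(refs.keySet());
        dead.removeAll(reachable(refs, roots));
        return dead;
    }

    public static void main(String[] args) {
        // Objects 5, 6 and 7 reference each other but hang off no root.
        Map<Integer, List<Integer>> refs = Map.of(
                1, List.of(2, 3),
                2, List.of(4),
                3, List.of(),
                4, List.of(),
                5, List.of(6, 7),
                6, List.of(5),
                7, List.of());
        System.out.println(garbage(refs, List.of(1))); // prints "[5, 6, 7]"
    }
}
```

Unlike reference counting, the mutual reference between objects 5 and 6 does not keep them alive here, because liveness is decided by reachability from the roots and not by counters.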

Mark-sweep algorithm

The mark-sweep algorithm runs in two stages: it first marks the objects to be reclaimed, and once marking is complete it reclaims them. A major drawback is that it tends to leave a large number of non-contiguous memory fragments.

[Figure: memory fragmentation after mark-sweep]

Copying algorithm

The copying collection algorithm uses two blocks of memory. When one block runs out, the surviving objects are copied to the other block, which avoids fragmentation. Commercial virtual machines currently use this algorithm to collect the new generation. Because the vast majority of new-generation objects die in each collection, the memory does not need to be split 1:1. Instead, it is divided into one larger Eden space and two smaller Survivor spaces, and only Eden and one Survivor space are in use at a time. At collection time, the objects still alive in Eden and the in-use Survivor space are copied to the other Survivor space, and then Eden and the used Survivor space are cleared. By default the size ratio of Eden to each Survivor space is 8:1, so the usable memory is 90% of the new generation's capacity and only 10% is "wasted". What happens if that 10% Survivor space cannot hold the objects that survive a collection? In that case the collector relies on other memory (here, the old generation) to provide a space allocation guarantee (handle promotion).

[Figure: the heap before and after a collection]
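
The Eden/Survivor arithmetic and the copy step can be sketched as follows. This is a simplified model under stated assumptions: objects are represented only by their sizes, survivors are copied in order until the target Survivor space is full, and whatever does not fit is handed to the old generation. The class CopyingDemo and its methods are hypothetical names for this illustration, not HotSpot code.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a new-generation copying collection. */
final class CopyingDemo {

    /**
     * Fraction of the new generation usable for allocation when Eden and
     * one of the two Survivor spaces are in use, for a given Eden:Survivor ratio.
     */
    static double usableFraction(int edenToSurvivorRatio) {
        return (edenToSurvivorRatio + 1) / (double) (edenToSurvivorRatio + 2);
    }

    /** Outcome of one simulated minor GC. */
    static final class Result {
        final List<Integer> toSurvivor = new ArrayList<>();
        final List<Integer> promoted = new ArrayList<>();
    }

    /**
     * Copies the sizes of surviving objects into the empty Survivor space;
     * objects that no longer fit are promoted to the old generation.
     */
    static Result minorGc(List<Integer> liveSizes, int survivorCapacity) {
        Result result = new Result();
        int used = 0;
        for (int size : liveSizes) {
            if (used + size <= survivorCapacity) {
                result.toSurvivor.add(size);
                used += size;
            } else {
                result.promoted.add(size);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(usableFraction(8));           // prints "0.9"
        Result result = minorGc(List.of(3, 4, 5), 10);
        System.out.println(result.toSurvivor);           // prints "[3, 4]"
        System.out.println(result.promoted);             // prints "[5]"
    }
}
```

With an 8:1 ratio the new generation has ten equal parts, nine of which (Eden plus one Survivor space) are usable, which is where the 90% figure comes from.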

Space allocation guarantee

Before going further, two terms need introducing: minor GC and full GC. Minor GC: garbage collection in the new generation. Full GC: garbage collection in both the old and new generations. A full GC generally takes far longer than a minor GC.

An ideal copying algorithm would use two equal-sized blocks of memory and copy between them, but that wastes a great deal of space. Most objects in the new generation are short-lived, so one Eden and two Survivor spaces are enough in most cases. To cover the case where many new-generation objects are still alive after a minor GC, the old generation generally provides a space allocation guarantee: objects that do not fit in the Survivor space go straight into the old generation. The guarantee therefore presupposes that the old generation has enough free space to hold them. Because the JVM cannot know, at the time it makes the guarantee, how much will survive the next collection, it takes the average size of the objects promoted to the old generation in previous collections as an empirical value, compares it with the free space in the old generation, and decides whether to run a full GC to free up more old-generation space.

The guarantee itself can fail. If the volume of objects surviving a minor GC spikes well above the average, the guarantee still fails (handle promotion failure), and a full GC has to be started immediately.

The copying algorithm also performs more copying when the object survival rate is high, which lowers its efficiency. For that reason the old generation generally does not use the copying algorithm directly.
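
The space allocation guarantee check described above can be sketched roughly as follows. This only encodes the rule as these notes state it (compare the historical average promoted size with the old generation's free space); the class PromotionGuaranteeDemo, its method names, and the byte figures are assumptions made for illustration, and the real HotSpot logic has more conditions.

```java
import java.util.Arrays;

/** Toy model of the space allocation guarantee decision. */
final class PromotionGuaranteeDemo {

    enum Decision { MINOR_GC, FULL_GC }

    /**
     * Before a minor GC: if the average amount promoted in past collections
     * fits into the old generation's free space, risk a minor GC;
     * otherwise run a full GC first to free up old-generation space.
     */
    static Decision beforeMinorGc(long[] promotedBytesHistory, long oldGenFreeBytes) {
        double average = Arrays.stream(promotedBytesHistory).average().orElse(0.0);
        return average <= oldGenFreeBytes ? Decision.MINOR_GC : Decision.FULL_GC;
    }

    /**
     * After a minor GC: the guarantee fails when the objects that actually
     * need promotion exceed the old generation's free space.
     */
    static boolean promotionFails(long bytesToPromote, long oldGenFreeBytes) {
        return bytesToPromote > oldGenFreeBytes;
    }

    public static void main(String[] args) {
        long[] history = {10, 20, 30};                        // average promoted size: 20
        System.out.println(beforeMinorGc(history, 25));       // prints "MINOR_GC"
        System.out.println(beforeMinorGc(history, 15));       // prints "FULL_GC"
        System.out.println(promotionFails(40, 25));           // prints "true"
    }
}
```

The last call shows the failure case from the text: the average (20) suggested a minor GC was safe with 25 free, but a spike to 40 surviving bytes cannot be absorbed, so a full GC would have to follow.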

Mark-compact algorithm

Given the high object survival rate in the old generation, an improved version of mark-sweep was proposed: the mark-compact algorithm. Marking works the same way as in mark-sweep, but instead of reclaiming the dead objects in place, the collector moves all surviving objects toward one end and then clears the memory beyond the boundary.

[Figure: memory layout after mark-compact]

Generational collection

Commercial JVMs today basically all use the "generational collection algorithm", which places objects in different memory spaces according to their lifetimes. The Java heap is generally divided into the new generation and the old generation so that each can use the collection algorithm best suited to it. In the new generation few objects survive each collection, so the copying algorithm fits: only the cost of copying a small number of surviving objects is paid. In the old generation the survival rate is high and there is no extra space to provide an allocation guarantee, so the mark-sweep or mark-compact algorithm is the better choice.
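
The compaction step of mark-compact described above can be illustrated with a small sketch. It assumes the heap is an array of slots and that marking has already produced a boolean flag per slot; the class MarkCompactDemo and the slot contents are hypothetical, and a real collector would also have to update every reference to a moved object, which this sketch leaves out.

```java
import java.util.Arrays;

/** Toy model of the compaction step of mark-compact. */
final class MarkCompactDemo {

    /**
     * Slides every marked (live) slot toward index 0, clears everything
     * past the boundary, and returns the boundary index.
     */
    static int compact(String[] heap, boolean[] marked) {
        int free = 0;   // next slot a live object will be moved into
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && marked[i]) {
                heap[free] = heap[i];
                free++;
            }
        }
        // Everything from the boundary onward is now reclaimable in one step.
        Arrays.fill(heap, free, heap.length, null);
        return free;
    }

    public static void main(String[] args) {
        String[] heap = {"a", "b", null, "c", "d", "e"};
        boolean[] marked = {true, false, false, true, false, true};
        int boundary = compact(heap, marked);
        System.out.println(boundary);                 // prints "3"
        System.out.println(Arrays.toString(heap));    // prints "[a, c, e, null, null, null]"
    }
}
```

After compaction the free space is one contiguous region starting at the boundary, which is the property mark-sweep lacks.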
