JVM Garbage Collection Algorithms Illustrated

Source: Internet
Author: User

1 Brief introduction -----> Garbage collection concepts

In the context of GC, garbage is an object that still exists in memory but will never be used again. Garbage collection finds those objects and reclaims the memory they occupy. If garbage is not cleaned up in time, the memory it occupies stays reserved until the application exits and cannot be used by other objects. If a large number of unused objects keep occupying space, then when the application needs memory and none is left, the result is a memory overflow (an OutOfMemoryError in Java). Identifying and cleaning up garbage objects is therefore central to memory management.

But how do we decide whether an object is still alive, that is, whether it is reachable? What strategy is used to judge an object's reachability?

Java uses the root search algorithm (GC Roots tracing) to determine whether an object is reachable. The basic idea is to take a set of root objects, the "GC Roots", as starting points and search downward from them. The paths traversed by the search are called reference chains. When no reference chain connects an object to any GC Root, the object is unreachable and is judged to be reclaimable.

What kinds of objects can serve as GC Roots?

In Java, the following objects can serve as GC Roots:

1 Objects referenced from the virtual machine stack (the local variable table in a stack frame).

2 Objects referenced by static class fields in the method area.

3 Objects referenced by constants in the method area.

4 Objects referenced through JNI (that is, by native methods) in the native method stack.
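
The root search described above can be sketched as a plain graph traversal. This is only an illustration, not how a real JVM walks its heap: the `Node` class and the `reachableFrom` method are hypothetical names, and the roots are passed in as an ordinary list instead of being discovered from stacks, static fields and JNI handles.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class ReachabilitySketch {
    /** Hypothetical stand-in for a heap object: a name plus its outgoing references. */
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    /** Returns every node that some reference chain connects to one of the roots. */
    static Set<Node> reachableFrom(List<Node> roots) {
        // Identity-based set: objects are compared by reference, as a collector would.
        Set<Node> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (seen.add(n)) {              // first visit: follow its references
                pending.addAll(n.refs);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c"), d = new Node("d");
        a.refs.add(b);                      // root -> a -> b
        c.refs.add(d);                      // c and d reference each other,
        d.refs.add(c);                      // but no root reaches them
        Set<Node> live = reachableFrom(List.of(a));
        System.out.println(live.contains(b)); // true
        System.out.println(live.contains(c)); // false
    }
}
```

Anything missing from the returned set has no reference chain to a root and would be treated as reclaimable.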

2 Illustrated introduction -----> Garbage collection algorithms

We first introduce the main idea of each algorithm and then walk through how it works with a diagram.

The focus is on the idea behind each algorithm, its advantages and disadvantages, and the scenarios it suits.

(1) Reference counting

Reference counting is the most classic garbage collection algorithm. Its implementation is simple: for an object A, whenever any object takes a reference to A, A's reference counter is incremented by 1, and whenever a reference becomes invalid, the counter is decremented by 1. Once A's reference counter reaches 0, A can no longer be used.

Although the idea is simple to implement (one integer counter per object), the algorithm has two serious problems:

1 It cannot handle circular references, which is why the Java garbage collector does not use it.

2 Every time a reference is created or destroyed, the counter requires an addition or a subtraction, which has some impact on system performance.

A simple description of the circular reference problem:

Take two objects, A and B. A holds a reference to B, and B holds a reference to A. The reference counters of both A and B are therefore not 0, yet no third object in the system references A or B. In other words, A and B are garbage that should be reclaimed, but because the garbage objects reference each other, the collector cannot recognize them, which causes a memory leak (the memory occupied by the garbage objects can never be reclaimed).

As the following figure shows, unreachable objects that form a circular reference have reference counters that are not 0.

Note: Because reference counting suffers from circular references and performance overhead, the Java virtual machine does not use it as its garbage collection algorithm.
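
The circular reference problem can be reproduced with a hand-rolled counter. The sketch below is an assumption-laden toy: `Counted`, `refCount` and `setField` are hypothetical names, and references held by roots are simulated by adjusting the counter directly.

```java
class RefCountSketch {
    /** Hypothetical object with a manually maintained reference counter. */
    static final class Counted {
        int refCount = 0;
        Counted field;                      // the single outgoing reference it may hold

        void setField(Counted target) {
            if (target != null) target.refCount++;   // a new reference is created
            if (field != null) field.refCount--;     // the old reference becomes invalid
            field = target;
        }
    }

    public static void main(String[] args) {
        Counted a = new Counted();
        Counted b = new Counted();
        a.refCount++;                       // simulate a root (e.g. a local variable) pointing at a
        b.refCount++;                       // and another root pointing at b
        a.setField(b);                      // a -> b
        b.setField(a);                      // b -> a
        a.refCount--;                       // both roots go away
        b.refCount--;
        // Neither object is reachable any more, yet neither counter is 0.
        System.out.println(a.refCount + " " + b.refCount); // 1 1
    }
}
```

Each `setField` call also shows the second problem: every reference update costs an extra increment and decrement.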

"Reachable object": an object that can eventually be reached by following references from a root object.

"Unreachable object": an object that cannot be reached by following references from any root object.

(2) Mark-sweep algorithm

Mark-sweep is the conceptual foundation of modern garbage collection algorithms.

Mark-sweep divides garbage collection into two phases: the mark phase and the sweep phase.

In the mark phase, every object reachable from the root nodes is marked, so any unmarked object is unreferenced garbage. Then, in the sweep phase, all unmarked objects are cleared. This approach solves the circular reference problem: two objects that cannot be reached from the roots are still identified as unreachable even if they reference each other.

The biggest problem with mark-sweep is memory fragmentation.

The following figure gives a simple illustration of using mark-sweep to reclaim a contiguous memory space.

Starting from the root nodes (only two are shown here), every object connected by a chain of references is marked as live (arrows represent references). Objects that cannot be reached from the root nodes are garbage. After marking completes, the system reclaims all unreachable objects.

As the figure shows, the free memory is no longer contiguous after collection. Allocating objects, especially large objects, in discontiguous memory is less efficient than in contiguous memory, which is the drawback of this algorithm.

Note: Mark-sweep first marks all reachable objects starting from the root nodes, then clears all unreachable objects to complete the collection. Mark-compact is discussed later; note the difference between the two.
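
The two phases can be sketched over a toy heap modeled as an array of slots. The names `Obj`, `mark` and `sweep` are hypothetical, and a `null` slot stands for free memory; this is a minimal illustration under those assumptions, not a real collector.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class MarkSweepSketch {
    /** Hypothetical heap object with a mark bit. */
    static final class Obj {
        final List<Obj> refs = new ArrayList<>();
        boolean marked;
    }

    /** Mark phase: flag everything reachable from the roots. */
    static void mark(List<Obj> roots) {
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj o = pending.pop();
            if (!o.marked) {
                o.marked = true;
                pending.addAll(o.refs);
            }
        }
    }

    /** Sweep phase: free unmarked slots in place; returns how many were freed. */
    static int sweep(Obj[] heap) {
        int freed = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] == null) continue;
            if (heap[i].marked) {
                heap[i].marked = false;     // reset the mark for the next cycle
            } else {
                heap[i] = null;             // the slot becomes a hole; survivors do not move
                freed++;
            }
        }
        return freed;
    }

    public static void main(String[] args) {
        Obj[] heap = new Obj[5];
        for (int i = 0; i < heap.length; i++) heap[i] = new Obj();
        heap[0].refs.add(heap[2]);          // root -> 0 -> 2
        heap[1].refs.add(heap[3]);          // 1 and 3 form an unreachable cycle
        heap[3].refs.add(heap[1]);
        mark(List.of(heap[0], heap[4]));    // slots 0 and 4 are referenced by roots
        int freed = sweep(heap);
        StringBuilder layout = new StringBuilder();
        for (Obj o : heap) layout.append(o == null ? '.' : 'X');
        System.out.println(freed + " freed, layout " + layout); // 2 freed, layout X.X.X
    }
}
```

The printed layout shows both properties discussed above: the cycle is collected, and the free slots end up scattered between live objects.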


(3) Copying algorithm

Algorithm idea: The memory space is divided into two equal halves, only one of which is in use at a time. During garbage collection, the live objects in the half in use are copied to the unused half, then everything in the half that was in use is cleared, which completes the collection.

If the system contains many garbage objects, the copying algorithm only needs to copy a relatively small number of live objects (this is the scenario it suits). At the moment a collection is actually needed, the copying algorithm is therefore very efficient. In addition, because live objects are placed one after another in the new memory space during collection, the reclaimed memory is free of fragmentation. These are its advantages.

The price of the copying algorithm is that memory is split in half and only one half can be used at a time. Moreover, if few of the objects in the space are garbage, copying the many live objects is time-consuming, so a plain copying algorithm is not ideal either. These are its shortcomings.

Diagram of the collection process:

A and B are two memory blocks of the same size (obtained by splitting the original memory space in half). During garbage collection, the live objects in A are copied to B, where they are laid out contiguously. After copying completes, A is cleared and B becomes the memory space currently in use.
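
A minimal sketch of the A-to-B copy, assuming each half is modeled as an array of slots. `Obj` and `collect` are hypothetical names. Because the slots hold ordinary Java references, the sketch sidesteps a step a real collector cannot skip: updating every pointer to the moved object's new address.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class CopyingSketch {
    /** Hypothetical heap object. */
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        Obj(String name) { this.name = name; }
    }

    /**
     * Copies every object reachable from the roots into toSpace, packed from
     * slot 0, then clears fromSpace. Returns the number of live objects.
     * Assumes toSpace is at least as large as fromSpace.
     */
    static int collect(Obj[] fromSpace, Obj[] toSpace, List<Obj> roots) {
        Set<Obj> copied = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Obj> pending = new ArrayDeque<>(roots);
        int next = 0;
        while (!pending.isEmpty()) {
            Obj o = pending.pop();
            if (copied.add(o)) {            // only live objects are ever touched
                toSpace[next++] = o;        // survivors end up contiguous
                pending.addAll(o.refs);
            }
        }
        Arrays.fill(fromSpace, null);       // the old half is wiped in one step
        return next;
    }

    public static void main(String[] args) {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c"), d = new Obj("d");
        a.refs.add(d);                      // only a and d are reachable
        Obj[] spaceA = {a, b, c, d};
        Obj[] spaceB = new Obj[4];
        int live = collect(spaceA, spaceB, List.of(a));
        System.out.println(live + " " + spaceB[0].name + spaceB[1].name); // 2 ad
    }
}
```

The work done is proportional to the number of live objects, not to the size of the space, which is why the algorithm pays off when most objects are garbage.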

The young-generation serial garbage collector in Java applies the idea of the copying algorithm. The young generation is divided into three parts: the Eden space, the From space and the To space. The From and To spaces can be regarded as the two equal-sized blocks used for copying; they swap roles (at any time one serves as the space in use and the other as the copy target during collection). The From and To spaces are also called survivor spaces, because they hold objects that have not been reclaimed.

"Young generation": the heap space that stores young objects, which are objects that have just been created or have survived only a few garbage collections.

"Old generation": the heap space that stores old objects, which are objects that are still alive after many garbage collections.

During garbage collection, the live objects in the Eden space are copied into the unused survivor space (To), and the young objects in the survivor space in use (From in the figure) are also copied into the To space. Large objects and older objects go straight into the old generation, and if the To space is full, objects are promoted to the old generation as well. At that point everything remaining in Eden and From is garbage and is simply cleared, while the To space holds the objects that survived the collection.

Advantage: This variant of the copying algorithm keeps memory contiguous while avoiding a large waste of space.

Note: The copying algorithm suits the young generation well. Because garbage objects usually outnumber live objects there, the algorithm performs better.
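
The Eden/From/To flow can be sketched as follows. Everything here is a simplification under stated assumptions: `Obj`, `minorGc`, `toCapacity` and `tenureThreshold` are hypothetical names, liveness is supplied as a flag instead of being computed from GC Roots, capacity is counted in objects instead of bytes, and large-object handling is left out.

```java
import java.util.ArrayList;
import java.util.List;

class MinorGcSketch {
    /** Hypothetical young object: age counts the collections it has survived. */
    static final class Obj {
        final String name;
        final boolean live;
        int age;
        Obj(String name, boolean live, int age) {
            this.name = name;
            this.live = live;
            this.age = age;
        }
    }

    /** One young-generation collection: survivors go to To, or to the old generation. */
    static void minorGc(List<Obj> eden, List<Obj> from, List<Obj> to, List<Obj> old,
                        int toCapacity, int tenureThreshold) {
        for (List<Obj> space : List.of(eden, from)) {
            for (Obj o : space) {
                if (!o.live) continue;      // garbage is simply left behind
                o.age++;
                if (o.age >= tenureThreshold || to.size() >= toCapacity) {
                    old.add(o);             // old enough, or To is full: promote
                } else {
                    to.add(o);
                }
            }
            space.clear();                  // Eden and From are emptied wholesale
        }
    }

    public static void main(String[] args) {
        List<Obj> eden = new ArrayList<>(List.of(
                new Obj("e1", true, 0), new Obj("e2", false, 0), new Obj("e3", true, 0)));
        List<Obj> from = new ArrayList<>(List.of(
                new Obj("s1", true, 2), new Obj("s2", true, 0)));
        List<Obj> to = new ArrayList<>();
        List<Obj> old = new ArrayList<>();
        minorGc(eden, from, to, old, 2, 3);
        System.out.println(to.size() + " in To, " + old.size() + " promoted"); // 2 in To, 2 promoted
    }
}
```

In the example, `s1` is promoted because of its age and `s2` because the To space is already full. After the call, the caller would swap the roles of From and To before the next collection.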

(4) Mark-compact algorithm

The efficiency of the copying algorithm depends on there being few live objects and many garbage objects, a situation that is common in the young generation.

In the old generation, however, most objects are live, so copying would be relatively expensive. Given this characteristic of the old generation, a different collection algorithm should be used.

Mark-compact is an old-generation collection algorithm that improves on mark-sweep. (Recall the drawback of mark-sweep: after collection the memory is no longer contiguous, which makes memory use less efficient.)

Like mark-sweep, mark-compact starts from the root nodes and marks every reachable object.

But then, instead of simply clearing unmarked objects, it compacts all live objects toward one end of the memory space and then clears everything beyond the boundary.

This avoids fragmentation and does not require two memory blocks of the same size, so it is cost-effective.

Diagram of how the algorithm works:

After all reachable objects have been marked from the root nodes, the objects are moved along the dotted lines so that all reachable objects end up at one end, with the references between them preserved. Finally, the space beyond the boundary is cleared.

The final result of mark-compact is equivalent to running mark-sweep and then defragmenting memory, so it is also called the mark-sweep-compact algorithm.
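
The compaction step can be sketched on a toy heap of named slots, assuming the mark phase has already filled in a `marked` array. `compact` is a hypothetical name, and the sketch omits the part where a real collector rewrites references to point at the objects' new positions.

```java
import java.util.Arrays;

class MarkCompactSketch {
    /**
     * Slides marked objects toward slot 0, keeping their relative order,
     * then clears everything beyond the boundary. Returns the boundary index.
     */
    static int compact(String[] heap, boolean[] marked) {
        int free = 0;                       // next slot to fill at the low end
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && marked[i]) {
                heap[free++] = heap[i];     // move the live object down
            }
        }
        for (int i = free; i < heap.length; i++) {
            heap[i] = null;                 // clear the space outside the boundary
        }
        Arrays.fill(marked, false);         // marks are consumed by this cycle
        return free;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "B", "C", "D", "E"};
        boolean[] marked = {true, false, true, false, true};
        int boundary = compact(heap, marked);
        System.out.println(boundary + " " + Arrays.toString(heap)); // 3 [A, C, E, null, null]
    }
}
```

Unlike the mark-sweep result, the free slots form one contiguous run after the boundary, and no second memory block is needed.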

(5) Generational algorithm

None of the algorithms described so far can completely replace the others. Each has its own characteristics and strengths, so the appropriate algorithm has to be chosen according to the characteristics of the objects being collected.

The idea of the generational algorithm: divide the memory space according to object characteristics and choose a suitable collection algorithm for each part, improving the efficiency of garbage collection.

Typically, the Java virtual machine places all newly created objects into a memory space called the young generation.

The young generation is characterized by short-lived objects: about 90% of them are reclaimed quickly, so the copying algorithm is a good fit.

When an object is still alive after several garbage collections, it is moved into the memory space called the old generation. Almost every object there has survived several collections, so these objects can be regarded as resident in memory for a period of time, possibly for the entire lifetime of the program.

The survival rate in the old generation is very high. Collecting it with the copying algorithm would require copying a large number of objects, which is not advisable. Following the generational idea, collecting the old generation with mark-sweep or mark-compact improves collection efficiency.

Note: The generational idea is widely adopted by existing virtual machines, and almost every garbage collector distinguishes between the young generation and the old generation.

Young-generation collections are usually very frequent but each one is short, whereas old-generation collections are relatively infrequent but take much longer. To support frequent young-generation collections, a virtual machine may use a data structure called a card table. The card table is a bit set in which each bit indicates whether any object in one region of the old generation holds a reference to a young-generation object.

During a young-generation GC, then, there is no need to spend a lot of time scanning every old-generation object to work out its references. The collector scans the card table first: only where a card table bit is 1 does it scan the old-generation objects in the corresponding region, and a region whose bit is 0 is guaranteed to contain no references to young-generation objects.

As the following figure shows:

Each bit in the card table represents 4 KB of old-generation space. Regions whose bit is 0 contain no object that points into the young generation; only regions whose bit is 1 contain objects that reference young-generation objects. A young-generation GC therefore only needs to scan the old-generation regions whose card table bit is 1, which can greatly speed up young-generation collection.
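
A card table along these lines can be sketched with `java.util.BitSet`. The class and method names (`CardTableSketch`, `recordOldToYoungStore`, `cardsToScan`) are hypothetical, old-generation addresses are modeled as byte offsets, and the 4 KB card size simply follows the text; production JVMs may use a different granularity and encoding.

```java
import java.util.Arrays;
import java.util.BitSet;

class CardTableSketch {
    static final int CARD_SIZE = 4096;      // 4 KB of old-generation space per bit, as in the text

    private final BitSet dirty;

    CardTableSketch(long oldGenBytes) {
        // One bit per card, rounding up so a partial last card is covered.
        dirty = new BitSet((int) ((oldGenBytes + CARD_SIZE - 1) / CARD_SIZE));
    }

    /** To be called whenever an old object at this offset stores a reference to a young object. */
    void recordOldToYoungStore(long oldGenOffset) {
        dirty.set((int) (oldGenOffset / CARD_SIZE));
    }

    /** Indexes of the cards a young-generation GC has to scan; all others can be skipped. */
    int[] cardsToScan() {
        return dirty.stream().toArray();
    }

    public static void main(String[] args) {
        CardTableSketch table = new CardTableSketch(64 * 1024);  // 16 cards
        table.recordOldToYoungStore(5000);   // falls in card 1
        table.recordOldToYoungStore(40960);  // falls in card 10
        System.out.println(Arrays.toString(table.cardsToScan())); // [1, 10]
    }
}
```

In this example a young-generation GC would scan 2 of the 16 old-generation regions instead of all of them.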

(6) Partitioning algorithm

Algorithm idea: The partitioning algorithm divides the whole heap into a series of contiguous small regions.

As shown in the figure:

Each region is used independently and collected independently.

The advantage of this algorithm is that it can control how many regions are collected at a time.

Generally, the larger the heap, the longer a single GC takes and the longer the resulting pause. To control the pause time caused by GC, a large memory area is divided into many small regions, and each collection reclaims a reasonable number of regions, chosen according to the target pause time, rather than the whole heap, which shortens each GC pause.
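
One way to picture "a reasonable number of regions per collection" is a greedy selection under a pause budget. This is purely an illustrative heuristic and not taken from any particular collector: `Region`, `select`, the per-region cost estimate and the garbage-per-millisecond ordering are all assumptions made for the sketch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class RegionSelectionSketch {
    /** Hypothetical region summary: how much garbage it holds and the estimated cost to collect it. */
    static final class Region {
        final int id;
        final long garbageMb;
        final double estimatedMillis;
        Region(int id, long garbageMb, double estimatedMillis) {
            this.id = id;
            this.garbageMb = garbageMb;
            this.estimatedMillis = estimatedMillis;
        }
    }

    /** Picks regions with the best garbage-per-millisecond payoff until the pause budget is spent. */
    static List<Region> select(List<Region> regions, double pauseTargetMillis) {
        List<Region> sorted = new ArrayList<>(regions);
        sorted.sort(Comparator.comparingDouble(
                (Region r) -> r.garbageMb / r.estimatedMillis).reversed());
        List<Region> chosen = new ArrayList<>();
        double budget = pauseTargetMillis;
        for (Region r : sorted) {
            if (r.estimatedMillis <= budget) {   // skip regions that would overrun the target
                chosen.add(r);
                budget -= r.estimatedMillis;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> regions = List.of(
                new Region(0, 8, 4.0), new Region(1, 1, 2.0),
                new Region(2, 6, 2.0), new Region(3, 5, 5.0));
        for (Region r : select(regions, 7.0)) {
            System.out.println("collect region " + r.id);  // region 2, then region 0
        }
    }
}
```

With a 7 ms target, regions 2 and 0 are collected (6 ms in total) and the rest wait for a later cycle, so the pause stays bounded however large the heap is.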


from:http://blog.csdn.net/wen7280/article/details/54428387
