JVM garbage collection mechanism and algorithm

Source: Internet
Author: User

JVM memory consists of several areas: the heap, the method area, the virtual machine stack, the program counter, and the native method stack.
JVM garbage collection applies only to the memory areas shared by all threads, that is, the heap and the method area. Only in these two areas is it unknown until run time which objects will be created, so their memory is allocated and reclaimed dynamically.
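
The split between heap and non-heap areas can be observed from inside a running JVM. The following is a minimal sketch that uses the standard java.lang.management API; the class name MemoryPools and the "heap"/"non-heap" labels are chosen here for illustration, and the pool names that are printed depend on the JVM version and the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

class MemoryPools {
    // Returns one line per memory pool reported by the running JVM,
    // prefixed with "heap" or "non-heap" (labels chosen for this sketch).
    static List<String> describe() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String kind = pool.getType() == MemoryType.HEAP ? "heap" : "non-heap";
            lines.add(kind + ": " + pool.getName());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describe()) {
            System.out.println(line);
        }
    }
}
```

On a generational collector the heap pools typically correspond to Eden, Survivor, and the old generation, but the exact names are not fixed.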

I. Garbage collection strategy
1.1 Generational management
The heap and the method area are divided according to object age:
Objects are created frequently in the heap, which holds the young generation and the old generation. Following the generational idea, the heap is divided by object lifetime into these two parts. An object that survives a single collection of the young generation is not moved to the old generation right away;
only objects that survive several GCs are promoted. For this purpose the young generation is further divided into an Eden area and two Survivor areas: objects are created in Eden and are then copied back and forth between the two Survivor areas.
Objects that are still alive at the end are copied to the old generation.

The method area (permanent generation) stores constants, information from loaded bytecode files, and similar data, all of which is relatively stable. Because objects are not created there frequently, it does not need to be split into generations and is collected directly.

Young generation:
1. All new objects are created in the Eden area. When Eden is full, a minor GC of the young generation is triggered, and the surviving objects in Eden and in the occupied Survivor area are copied to the other, free Survivor area.
2. One Survivor area is always kept empty. Each minor GC copies the surviving objects from one Survivor area to the other, and this continues until the Survivor area is full.

Old generation:
1. When Eden is full, a minor GC copies the surviving objects to a Survivor area. When the Survivor area is full as well, the minor GC copies the surviving objects to the old generation.
2. Objects that are still alive after being copied several times between the two Survivor areas are considered relatively old and can be moved to the old generation. Over time the old generation fills up.
If the old generation is full, a full GC is triggered that collects the whole heap, including the young generation, the old generation, and the permanent generation.
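
The flow from Eden through the Survivor areas into the old generation can be sketched as a toy simulation. This is not how HotSpot is implemented: the class ToyGenerations, the "live" flag that stands in for reachability, and the promotion age of 3 are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class ToyGenerations {
    // Assumed promotion threshold for this sketch, not a value from the text.
    static final int PROMOTION_AGE = 3;

    static class Obj {
        final String name;
        int age = 0;          // number of minor GCs survived
        boolean live = true;  // stands in for "still reachable"
        Obj(String name) { this.name = name; }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> fromSurvivor = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();

    // New objects are always created in Eden.
    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);
        return o;
    }

    // Copies survivors of Eden and the occupied Survivor area into the
    // empty Survivor area, promoting objects that are old enough.
    void minorGc() {
        List<Obj> toSurvivor = new ArrayList<>();
        List<Obj> candidates = new ArrayList<>(eden);
        candidates.addAll(fromSurvivor);
        for (Obj o : candidates) {
            if (!o.live) continue;  // garbage is simply not copied
            o.age++;
            if (o.age >= PROMOTION_AGE) old.add(o);
            else toSurvivor.add(o);
        }
        eden.clear();
        fromSurvivor = toSurvivor;  // the two Survivor areas swap roles
    }

    public static void main(String[] args) {
        ToyGenerations heap = new ToyGenerations();
        heap.allocate("longLived");
        heap.allocate("temporary").live = false;
        for (int i = 0; i < PROMOTION_AGE; i++) heap.minorGc();
        System.out.println("old generation holds: " + heap.old.get(0).name);
    }
}
```

In this sketch the temporary object disappears at the first minor GC, while the long-lived object is copied between Survivor areas twice and promoted on the third collection.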

Permanent generation:
Stores class metadata, constants, method descriptions, and so on. If it fills up, a full GC is triggered to reclaim obsolete constants and useless classes.

1.2 Garbage collection
Garbage collection comes down to two tasks: detecting garbage objects, and releasing the space they occupy.
1. Detecting garbage objects
1) Reference counting:
For an object A, whenever any object references A, the reference counter of A is incremented by 1, and when that reference becomes invalid, the counter is decremented by 1.
Once the reference counter of object A reaches 0, object A can no longer be used.
To implement this, each object only needs an integer counter.
However, reference counting has a serious problem: it cannot handle circular references.
Suppose object A holds a reference to object B, and object B holds a reference to object A. The reference counters of A and B are then both nonzero, even though no third object in the system references A or B.
A and B are therefore garbage objects that should be reclaimed, but because they reference each other the garbage collector cannot recognize them as garbage, which causes a memory leak.
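
The circular reference problem can be reproduced with hand-maintained counters. The sketch below is only an illustration: Java itself does not expose reference counts, so the classes RefCounted and RefCountDemo and the manual increments for local variables are assumptions of this example.

```java
class RefCounted {
    final String name;
    int refCount = 0;   // the integer counter attached to each object
    RefCounted field;   // a single outgoing reference, enough for the demo

    RefCounted(String name) { this.name = name; }

    // Re-points the field and keeps both counters up to date.
    void setField(RefCounted target) {
        if (field != null) field.refCount--;
        field = target;
        if (target != null) target.refCount++;
    }
}

class RefCountDemo {
    // Returns the counters of A and B after every outside reference is gone.
    static int[] cycleCounts() {
        RefCounted a = new RefCounted("A");
        RefCounted b = new RefCounted("B");
        a.refCount++;   // simulated reference from a local variable
        b.refCount++;   // simulated reference from a local variable
        a.setField(b);  // A references B
        b.setField(a);  // B references A
        a.refCount--;   // the local variable for A is dropped
        b.refCount--;   // the local variable for B is dropped
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = cycleCounts();
        // Both counters stay at 1, so neither object would ever be freed.
        System.out.println("A=" + counts[0] + " B=" + counts[1]);
    }
}
```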

2) Reachability analysis:
Because reference counting cannot detect circular references between objects, it is not used. In garbage collection, garbage objects are detected mainly by reachability analysis.

Reachability analysis algorithm:
The search starts from a set of objects called GC Roots and follows references downward from these nodes. The path traversed by the search is called a reference chain.
When no reference chain connects an object to any GC Root, the object is unreachable and unusable, and the garbage collector reclaims the memory it occupies.
So the JVM's rule for deciding whether an object is alive is whether it can be reached from a root object.
Reachable: if object A references object B, then B is reachable from A.

The set of GC Root objects:
A: Objects referenced from the Java virtual machine stack (the local variable table in a stack frame), that is, references held by the current stack frames.
B: Objects referenced by static class fields in the method area (static references).
C: Objects referenced by constants in the method area (final references).
D: Objects referenced by JNI (native methods) in the native method stack.
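
Reachability analysis is a graph traversal that starts at the roots. The following is a minimal sketch in which objects are represented by names and references by a map; the class Reachability and the object names "root", "X", "Y", "A", and "B" are hypothetical. It shows that a cycle between A and B does not keep them alive when no root leads to them.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class Reachability {
    // refs maps an object name to the names of the objects it references.
    // Returns every object connected to a root by some reference chain.
    static Set<String> reachable(Map<String, List<String>> refs, Collection<String> roots) {
        Set<String> seen = new HashSet<>();
        Deque<String> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            String current = work.pop();
            if (!seen.add(current)) continue;  // already visited
            List<String> targets = refs.getOrDefault(current, Collections.emptyList());
            for (String next : targets) {
                work.push(next);
            }
        }
        return seen;
    }

    static Set<String> demo() {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("root", Arrays.asList("X"));  // root -> X -> Y
        refs.put("X", Arrays.asList("Y"));
        refs.put("A", Arrays.asList("B"));     // A <-> B, not linked to any root
        refs.put("B", Arrays.asList("A"));
        return reachable(refs, Arrays.asList("root"));
    }

    public static void main(String[] args) {
        // A and B are absent from the result, so they would be collected.
        System.out.println(demo());
    }
}
```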

Besides the heap, "obsolete constants" and "useless classes" in the method area also need to be reclaimed so that the permanent generation does not run out of memory. Garbage in the method area is detected as follows:
1. Obsolete constants (constants that are no longer needed): if a constant in the constant pool is not referenced from anywhere, it is an obsolete constant.
2. Useless classes (classes that are no longer needed) must meet all of the following conditions:
1) All instances of the class have been reclaimed, that is, no instance of the class exists in the Java heap.
2) The class loader that loaded the class has been reclaimed.
3) The java.lang.Class object of the class is not referenced anywhere, so the methods of the class cannot be accessed from anywhere through reflection.

When the permanent generation (method area) is full, a full GC is triggered to remove obsolete constants and useless classes according to the criteria above.

2. Releasing space
Garbage collection algorithms
1) Mark-Sweep: the most basic garbage collection algorithm
The mark-sweep algorithm has two phases: the mark phase and the sweep phase.
The mark phase marks all objects that need to be reclaimed, and the sweep phase reclaims the space occupied by the marked objects.
The mark-sweep algorithm is easy to implement, but it tends to fragment memory. With too many fragments, a later allocation of a large object may fail to find enough contiguous space, which triggers another garbage collection earlier than necessary.
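
The fragmentation effect can be seen on a toy heap modeled as an array of slots. This is a sketch under simplifying assumptions: every object occupies exactly one slot, the set of reachable objects is given as input, and the sketch marks the reachable objects so that everything left unmarked is what gets swept. The class name MarkSweep is hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class MarkSweep {
    // heap[i] holds an object name, or null for a free slot.
    static void collect(String[] heap, Set<String> reachable) {
        boolean[] marked = new boolean[heap.length];
        // Mark phase: flag every slot that holds a reachable object.
        for (int i = 0; i < heap.length; i++) {
            marked[i] = heap[i] != null && reachable.contains(heap[i]);
        }
        // Sweep phase: free unmarked slots in place; survivors do not move.
        for (int i = 0; i < heap.length; i++) {
            if (!marked[i]) heap[i] = null;
        }
    }

    // Length of the longest run of adjacent free slots.
    static int largestFreeRun(String[] heap) {
        int best = 0;
        int run = 0;
        for (String slot : heap) {
            run = (slot == null) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        String[] heap = {"a", "b", "c", "d", "e", "f"};
        collect(heap, new HashSet<>(Arrays.asList("a", "c", "e")));
        // Three slots are free, but no two of them are adjacent.
        System.out.println(Arrays.toString(heap) + " largest free run: " + largestFreeRun(heap));
    }
}
```

After the collection half of the heap is free, yet an object that needs two adjacent slots still cannot be placed.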

2) Copying:
The available memory is divided into two blocks of equal size, and only one block is used at a time. When that block is used up, the surviving objects are copied to the other block and the used block is then cleaned up in one pass, so memory fragmentation is unlikely.
The copying algorithm is simple and efficient and rarely fragments memory, but it cuts the usable memory to half of the original.
Its efficiency depends heavily on the number of surviving objects: if many objects survive, the efficiency of the copying algorithm drops sharply.
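
A toy version of the copying algorithm, again with one slot per object and the reachable set supplied by the caller (both are assumptions of this sketch, as is the class name SemiSpace):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SemiSpace {
    // Copies the reachable objects from the used block to the front of the
    // free block, then clears the used block in one pass.
    // Returns the number of objects copied, which is the cost of the collection.
    static int collect(String[] from, String[] to, Set<String> reachable) {
        int next = 0;
        for (String obj : from) {
            if (obj != null && reachable.contains(obj)) {
                to[next++] = obj;
            }
        }
        Arrays.fill(from, null);
        return next;
    }

    public static void main(String[] args) {
        String[] from = {"a", "b", "c", "d"};
        String[] to = new String[from.length];
        int copied = collect(from, to, new HashSet<>(Arrays.asList("a", "c")));
        // Survivors end up packed together, so the free space is contiguous.
        System.out.println(Arrays.toString(to) + " copied: " + copied);
    }
}
```

The work done is proportional to the number of survivors, which is why the algorithm suits regions where few objects survive.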

3) Mark-Compact:
The mark phase of this algorithm is the same as in mark-sweep, but after marking it does not clean up the reclaimable objects directly. Instead, it moves the surviving objects toward one end and then cleans up the memory beyond the end boundary.
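
The compaction step can be sketched on the same kind of toy heap. As before, one slot per object and a caller-supplied reachable set are simplifying assumptions, and the class name MarkCompact is hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class MarkCompact {
    // Slides the reachable objects toward the start of the heap, then clears
    // everything beyond the boundary. Returns the boundary index, which is
    // the first free slot after compaction.
    static int collect(String[] heap, Set<String> reachable) {
        int boundary = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && reachable.contains(heap[i])) {
                heap[boundary++] = heap[i];  // boundary never passes i
            }
        }
        Arrays.fill(heap, boundary, heap.length, null);
        return boundary;
    }

    public static void main(String[] args) {
        String[] heap = {"a", "b", "c", "d", "e", "f"};
        int boundary = collect(heap, new HashSet<>(Arrays.asList("a", "c", "e")));
        // All free slots now form one contiguous run after the boundary.
        System.out.println(Arrays.toString(heap) + " boundary: " + boundary);
    }
}
```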

4) Generational collection, built from the previous three algorithms:
Generational collection is the approach used by most JVM garbage collectors today.
The core idea is to divide memory into several regions according to object lifetime. Usually the heap is divided into the old generation (Tenured Generation) and the young generation (Young Generation).
In the old generation only a small number of objects need to be reclaimed in each collection, while in the young generation a large number of objects are reclaimed each time, so the most suitable collection algorithm can be chosen for each generation.

At present, most garbage collectors use the copying algorithm for the young generation, because each collection there reclaims most of the objects, so only a few surviving objects need to be copied. In practice, however, the young generation is not divided in a 1:1 ratio.
In general, the young generation is divided into a larger Eden area and two smaller Survivor areas. Eden and one of the Survivor areas are in use at any time, and a collection copies the objects still alive in Eden and that Survivor area to the other Survivor area.
Eden and the Survivor area that was just in use are then cleaned up.
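
The benefit of this layout over a plain 1:1 split can be worked out with a little arithmetic. The sketch below assumes a hypothetical parameter ratio, meaning the size of Eden divided by the size of one Survivor area. The value 8 used in the example is the usual HotSpot default for -XX:SurvivorRatio; it is not stated in the text above.

```java
class YoungGenLayout {
    // The young generation consists of Eden (ratio parts) plus two Survivor
    // areas (one part each). Eden and one Survivor area are usable at a time.
    static double usableFraction(int ratio) {
        return (ratio + 1) / (double) (ratio + 2);
    }

    public static void main(String[] args) {
        // With Eden:Survivor = 8:1, 9 of 10 parts are usable, compared with
        // half of the memory for the plain copying algorithm.
        System.out.println(usableFraction(8));
    }
}
```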

Because only a small number of objects are reclaimed in each collection of the old generation, it generally uses the mark-compact algorithm.

II. Typical garbage collectors
The HotSpot virtual machine (JDK 7) provides several garbage collectors, which users can combine according to the needs of each generation.
1. Serial / Serial Old (serial collectors)
Serial / Serial Old is the most basic and oldest collector. It is single-threaded and must suspend all user threads while it collects garbage.
The Serial collector targets the young generation and uses the copying algorithm; the Serial Old collector targets the old generation and uses the mark-compact algorithm.
Its advantage is that it is simple and efficient; its disadvantage is that it imposes pauses on the user.

2. ParNew
The ParNew collector is a multithreaded version of the Serial collector that uses multiple threads for garbage collection.

3. Parallel Scavenge
The Parallel Scavenge collector is a multithreaded (parallel) collector for the young generation that uses the copying algorithm. Like the other parallel collectors, it still suspends user threads while it collects.
It differs from the first two collectors in that it is primarily designed to achieve a controllable throughput.

4. Parallel Old
Parallel Old is the old-generation version of the Parallel Scavenge collector (a parallel collector). It uses multiple threads and the mark-compact algorithm.

5. CMS
The CMS (Concurrent Mark Sweep) collector aims for the shortest collection pauses. It is a concurrent collector that uses the mark-sweep algorithm.
(Parallel collector: multiple garbage collection threads work in parallel while the user threads wait.)
(Concurrent collector: the garbage collection threads and the user threads run during the same period, not necessarily in parallel; they may alternate.)

6. G1
The G1 collector is aimed at server-side applications and takes full advantage of multi-CPU, multi-core environments.
G1 is both a parallel and a concurrent collector, and it can establish a predictable pause-time model.
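
Which collectors a running JVM is actually using can be checked with the standard java.lang.management API. The following is a minimal sketch; the class name ActiveCollectors is hypothetical, and the names that are reported depend on the JVM version and on the collector selected at startup (on HotSpot, for example, with a flag such as -XX:+UseSerialGC or -XX:+UseG1GC).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class ActiveCollectors {
    // Returns the names of the collectors registered in the running JVM,
    // usually one for the young generation and one for the old generation.
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : names()) {
            System.out.println(name);
        }
    }
}
```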





