Garbage collection algorithms and JVM garbage collectors (I)

Source: Internet
Author: User

Bibliography: In-depth Understanding of the Java Virtual Machine, Zhou Zhiming

Thinking in Java, Bruce Eckel

Why do we still need to understand GC when garbage collection is automated? When we have to troubleshoot memory overflows and memory leaks, or when garbage collection becomes the bottleneck that keeps a system from reaching higher concurrency, we need to monitor and tune these "automated" mechanisms.

Garbage collection mainly targets two areas: the Java heap and the method area.

So, how can you tell if an object is already recyclable? There are two main ways of doing this:

First, the reference counting algorithm: each object has a reference counter. When a new reference to the object is created, the counter is incremented by 1; when a reference goes out of scope or is set to null, the counter is decremented by 1. The garbage collector iterates over the list of all objects, finds those whose count is 0, and frees the space they occupy. COM, Flash Player, and Python all use reference counting for memory management. The approach has a flaw, however: it cannot handle groups of objects that refer to each other (circular references), because their counts never drop to 0 even when nothing else can reach them.
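The circular reference problem can be illustrated with a small toy model. The sketch below is only an illustration: the class RefCountDemo, its Obj type, and the explicit counter updates are hypothetical and simulate by hand what a reference-counting runtime would do automatically. It is not how the JVM manages memory.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of reference counting (hypothetical names, not a JVM mechanism).
class RefCountDemo {
    static class Obj {
        final String name;
        int refCount = 0;   // number of references currently pointing at this object
        Obj field;          // a single outgoing reference, to keep the model small
        Obj(String name) { this.name = name; }
    }

    // Point holder.field at target and adjust the counters of the old and new targets.
    static void setField(Obj holder, Obj target) {
        if (holder.field != null) holder.field.refCount--;
        holder.field = target;
        if (target != null) target.refCount++;
    }

    // Build two objects that refer to each other, then drop the external references.
    static List<Obj> buildCycleAndRelease() {
        Obj a = new Obj("a");
        Obj b = new Obj("b");
        a.refCount++;       // simulated local variable referring to a
        b.refCount++;       // simulated local variable referring to b
        setField(a, b);     // a -> b
        setField(b, a);     // b -> a
        a.refCount--;       // simulate both local variables going out of scope
        b.refCount--;
        List<Obj> heap = new ArrayList<>();
        heap.add(a);
        heap.add(b);
        return heap;
    }

    // A pure reference-counting collector can only free objects whose count is 0.
    static long countCollectable(List<Obj> heap) {
        return heap.stream().filter(o -> o.refCount == 0).count();
    }

    public static void main(String[] args) {
        List<Obj> heap = buildCycleAndRelease();
        for (Obj o : heap) {
            System.out.println(o.name + " refCount=" + o.refCount);
        }
        System.out.println("collectable objects: " + countCollectable(heap));
    }
}
```

After the external references are dropped, each object is still counted once by its partner, so neither counter reaches 0 and the pair is never freed in this model.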

Second, the reachability analysis algorithm: the basic idea is to take a set of objects called "GC Roots" as starting points and search downward from these nodes. The path traversed by the search is called a reference chain. When no reference chain connects an object to any GC Root, the object is proven to be unusable.
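As a rough sketch of the idea, the traversal can be modeled on a small hand-built object graph. The names ReachabilityDemo, Node, and reachableFrom are assumptions made for this illustration; a real collector walks actual heap objects rather than a list of nodes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Toy object graph (hypothetical names) used to illustrate reachability analysis.
class ReachabilityDemo {
    static class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();   // outgoing references
        Node(String name) { this.name = name; }
    }

    // Returns every node that some reference chain connects to a GC Root.
    static Set<Node> reachableFrom(List<Node> gcRoots) {
        Set<Node> marked = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        Deque<Node> pending = new ArrayDeque<>(gcRoots);
        while (!pending.isEmpty()) {
            Node current = pending.pop();
            if (marked.add(current)) {          // first visit: follow its references
                pending.addAll(current.refs);
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Node root = new Node("root");
        Node a = new Node("a");
        Node c = new Node("c");
        Node d = new Node("d");
        root.refs.add(a);
        c.refs.add(d);   // c and d reference each other,
        d.refs.add(c);   // but no chain leads to them from the root
        Set<Node> live = reachableFrom(Collections.singletonList(root));
        for (Node n : new Node[] {root, a, c, d}) {
            System.out.println(n.name + " reachable: " + live.contains(n));
        }
    }
}
```

Note that the mutually referencing pair c and d is reported as unreachable, which is exactly the case that reference counting cannot handle.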

In the Java language, the objects that can serve as GC Roots include the following:

a) Objects referenced from the virtual machine stack (the local variable table in a stack frame).

b) Objects referenced by class static fields in the method area.

c) Objects referenced by constants in the method area.

d) Objects referenced by JNI (native methods) in the native method stack.

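How are references themselves defined? Since JDK 1.2, references are divided into four kinds: strong references (the ordinary references found throughout program code; an object is never reclaimed while a strong reference to it exists), soft references (for objects that are useful but not essential; before an out-of-memory error is thrown, these objects are included in a second round of collection), weak references (once the garbage collector runs, the object is reclaimed whether or not heap memory is sufficient), and phantom references, sometimes translated as virtual references (their usual purpose is to receive a system notification when the object is reclaimed by the collector).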
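The four kinds map onto ordinary variables plus the classes in java.lang.ref. The following is a minimal sketch; the class name ReferenceKindsDemo and its helper methods are made up for this example. Because System.gc() is only a request, the lines printed by main can differ between JVMs and runs, so no output is shown.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

class ReferenceKindsDemo {
    // A weakly referenced object stays available while a strong reference to it exists.
    static boolean weakSurvivesWhileStronglyHeld() {
        Object strong = new Object();                        // strong reference
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.gc();                                         // a request only; the JVM may ignore it
        return weak.get() == strong;                         // strong is still in use here
    }

    // PhantomReference.get() always returns null; the reference exists for notification only.
    static boolean phantomGetIsNull() {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(new Object(), queue);
        return phantom.get() == null;
    }

    public static void main(String[] args) {
        SoftReference<byte[]> soft = new SoftReference<>(new byte[1024]);
        WeakReference<Object> weak = new WeakReference<>(new Object());
        System.gc();
        // Typically the soft referent survives while memory is plentiful and the
        // weak referent is cleared, but neither outcome is guaranteed after one request.
        System.out.println("soft cleared: " + (soft.get() == null));
        System.out.println("weak cleared: " + (weak.get() == null));
        System.out.println("weak kept while strongly held: " + weakSurvivesWhileStronglyHeld());
        System.out.println("phantom get() is null: " + phantomGetIsNull());
    }
}
```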

Even objects found unreachable by reachability analysis are not immediately "dead"; they are temporarily on "probation". To truly declare an object dead, it must go through at least two marking passes. If reachability analysis finds that the object has no reference chain connecting it to the GC Roots, it is marked for the first time and then filtered on whether it is necessary to execute its finalize() method. When the object does not override finalize(), or its finalize() has already been called by the virtual machine, the virtual machine treats both cases as "no need to execute".
If the object is judged to need its finalize() method executed, it is placed in a queue called the F-Queue and is later executed by a low-priority Finalizer thread that the virtual machine creates automatically. "Executed" here means that the virtual machine will trigger the method but does not promise to wait for it to finish. The reason is that if an object's finalize() runs slowly or enters an infinite loop, other objects in the F-Queue could wait forever, and the whole memory reclamation system could even crash. The finalize() method is an object's last chance to escape death. Later the GC performs a second, small-scale marking of the objects in the F-Queue. If an object saves itself in finalize(), which only requires re-associating itself with any object on a reference chain, it is removed from the "about to be reclaimed" set during the second marking. If the object has not escaped by then, it is reclaimed.
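The two marking passes can be walked through with a small simulation. This is a toy model under stated assumptions: FinalizeMarkingDemo, Obj, and collect are hypothetical names, the F-Queue is an ordinary in-memory queue processed on the calling thread, and "re-associating with a reference chain" is reduced to setting a boolean flag. It does not call the real finalize() mechanism.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Toy simulation of the two marking passes (hypothetical names, no real finalization).
class FinalizeMarkingDemo {
    static class Obj {
        final String name;
        final boolean overridesFinalize;  // does the class override finalize()?
        final boolean rescuesItself;      // does finalize() re-link the object?
        boolean finalizeCalled = false;   // finalize() is triggered at most once
        boolean reachable = false;
        Obj(String name, boolean overridesFinalize, boolean rescuesItself) {
            this.name = name;
            this.overridesFinalize = overridesFinalize;
            this.rescuesItself = rescuesItself;
        }
        void runFinalize() {
            finalizeCalled = true;
            if (rescuesItself) reachable = true;   // stands in for re-joining a reference chain
        }
    }

    // Takes the objects found unreachable and returns the names of those reclaimed.
    static List<String> collect(List<Obj> unreachable) {
        List<String> reclaimed = new ArrayList<>();
        Deque<Obj> fQueue = new ArrayDeque<>();
        // First mark: filter on whether finalize() needs to be executed.
        for (Obj o : unreachable) {
            if (!o.overridesFinalize || o.finalizeCalled) {
                reclaimed.add(o.name);             // "no need to execute"
            } else {
                fQueue.add(o);
            }
        }
        // Stand-in for the Finalizer thread.
        for (Obj o : fQueue) o.runFinalize();
        // Second mark: only objects that did not rescue themselves are reclaimed.
        for (Obj o : fQueue) {
            if (!o.reachable) reclaimed.add(o.name);
        }
        return reclaimed;
    }

    public static void main(String[] args) {
        Obj plain = new Obj("plain", false, false);
        Obj dies = new Obj("dies", true, false);
        Obj escapes = new Obj("escapes", true, true);
        System.out.println("first GC reclaims: " + collect(Arrays.asList(plain, dies, escapes)));
        escapes.reachable = false;                 // it becomes unreachable again later
        System.out.println("second GC reclaims: " + collect(Arrays.asList(escapes)));
    }
}
```

In this model the object that rescues itself survives the first collection but is reclaimed in the second one, because its finalize() has already been called and is not triggered again.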

Everything above concerns how to decide whether objects in the heap can be collected. Next comes reclaiming the method area. Garbage collection in the method area mainly reclaims two things: obsolete constants and useless classes. Reclaiming obsolete constants is very similar to reclaiming objects in the Java heap. Suppose a string literal is in the constant pool, but no String object currently references it and nothing else refers to this literal. If memory reclamation happens at this point and it is necessary, the constant is cleared out of the constant pool. Symbolic references to other classes, methods, and fields in the constant pool are handled similarly.

For a class to be judged useless, it must satisfy 3 conditions:

a) All instances of the class have been reclaimed.

b) The ClassLoader that loaded the class has been reclaimed.

c) The java.lang.Class object corresponding to the class is not referenced anywhere, so the methods of the class cannot be accessed anywhere through reflection.

Scenarios that make heavy use of reflection, dynamic proxies, CGLib and other bytecode frameworks, dynamically generated JSPs, or frameworks such as OSGi that frequently define custom ClassLoaders require the virtual machine to support class unloading, to ensure that the permanent generation does not overflow.

The following are the core garbage collection algorithms:

1. Mark-Sweep algorithm
This is the most basic collection algorithm. It has two phases, "mark" and "sweep": first mark all objects that need to be reclaimed, then, once marking is complete, reclaim all marked objects together. It is called the most basic algorithm because the later collection algorithms are all built on this idea and improve on its shortcomings. It has two main shortcomings. One is efficiency: neither the marking nor the sweeping process is very efficient. The other is space: sweeping leaves behind a large number of discontiguous memory fragments, and with too much fragmentation the program may be unable to find enough contiguous memory when it later needs to allocate a large object, which forces another garbage collection to be triggered early.
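The fragmentation problem is easy to see in a toy heap. The sketch below is an illustration under assumptions: MarkSweepDemo and its methods are hypothetical, the heap is an int array in which each cell stores the id of the object occupying it (0 means free), and the mark phase records the reachable objects and sweeps everything else, which selects the same set of dead objects as marking them directly.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy mark-sweep over an int[] heap (hypothetical model, not JVM internals).
class MarkSweepDemo {
    // Mark phase: collect the ids of all objects reachable from the roots.
    static Set<Integer> mark(List<Integer> roots, Map<Integer, List<Integer>> refs) {
        Set<Integer> marked = new HashSet<>();
        Deque<Integer> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            int id = pending.pop();
            if (marked.add(id)) {
                pending.addAll(refs.getOrDefault(id, Collections.<Integer>emptyList()));
            }
        }
        return marked;
    }

    // Sweep phase: free every occupied cell whose object was not marked.
    static void sweep(int[] heap, Set<Integer> marked) {
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != 0 && !marked.contains(heap[i])) heap[i] = 0;
        }
    }

    static int totalFree(int[] heap) {
        int free = 0;
        for (int cell : heap) if (cell == 0) free++;
        return free;
    }

    // Size of the largest contiguous block of free cells.
    static int largestFreeRun(int[] heap) {
        int best = 0, run = 0;
        for (int cell : heap) {
            run = (cell == 0) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        int[] heap = {1, 1, 2, 2, 3, 3, 4, 4};      // four objects, two cells each
        Map<Integer, List<Integer>> refs = new HashMap<>();
        refs.put(1, Arrays.asList(3));              // object 1 references object 3
        Set<Integer> marked = mark(Arrays.asList(1), refs);
        sweep(heap, marked);
        System.out.println(Arrays.toString(heap));
        System.out.println("free cells: " + totalFree(heap)
                + ", largest contiguous run: " + largestFreeRun(heap));
    }
}
```

In this example four cells are free after the sweep, but the largest contiguous run is only two cells, so a three-cell object could not be allocated without another collection.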

2. Copying algorithm
The copying algorithm appeared to solve the efficiency problem. It divides the available memory into two blocks of equal size and uses only one of them at a time. When that block is used up, the surviving objects are copied to the other block and the used block is then cleaned up in one pass. Each collection therefore reclaims an entire half of memory, and allocation does not have to consider fragmentation: moving the heap-top pointer and allocating in order is enough, which is simple to implement and efficient to run. The cost is that usable memory shrinks to half of the original, which is rather high.

Modern virtual machines divide the new generation into three parts: a larger Eden space, a From space, and a To space. The From and To spaces are two blocks of the same size and equal status whose roles are interchangeable, and they are used for copying. They are also called Survivor spaces and hold objects that were not reclaimed. During garbage collection, surviving objects in Eden are copied into the unused Survivor space (To), and the younger objects in the Survivor space in use (From) are also copied into To. Large objects and older objects go directly into the old generation, and if the To space is full, objects also go straight into the old generation. The default ratio of Eden to one Survivor space in the HotSpot virtual machine is 8:1, so only 10% of the new generation is left idle at any time. This ratio is based on the observation that most objects in the new generation die soon after they are created.
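A minimal sketch of the basic two-block scheme follows. It deliberately models only two equal halves, not the Eden and Survivor layout, and the names CopyingDemo, copyLive, and allocate are hypothetical. Each half is an int array whose cells hold object ids (0 means free), and the set of live ids is assumed to come from an earlier marking step.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy semispace copying collector (hypothetical model, not HotSpot's implementation).
class CopyingDemo {
    // Copy live cells from 'from' into 'to', packed from index 0, then clear 'from'.
    // Returns the new allocation pointer (the first free index in 'to').
    static int copyLive(int[] from, Set<Integer> live, int[] to) {
        int top = 0;
        for (int cell : from) {
            if (cell != 0 && live.contains(cell)) {
                to[top++] = cell;
            }
        }
        Arrays.fill(from, 0);    // the whole used half is reclaimed in one pass
        return top;
    }

    // Bump-pointer allocation: place the object at 'top' and advance the pointer.
    // Returns the new pointer, or -1 if the space cannot hold the object.
    static int allocate(int[] space, int top, int id, int size) {
        if (top + size > space.length) return -1;
        Arrays.fill(space, top, top + size, id);
        return top + size;
    }

    public static void main(String[] args) {
        int[] from = {1, 1, 2, 2, 3, 3, 4, 4};
        int[] to = new int[from.length];
        Set<Integer> live = new HashSet<>(Arrays.asList(1, 3));
        int top = copyLive(from, live, to);
        top = allocate(to, top, 5, 3);           // a three-cell object now fits
        System.out.println("to-space:   " + Arrays.toString(to));
        System.out.println("from-space: " + Arrays.toString(from));
        System.out.println("allocation pointer: " + top);
    }
}
```

Because the survivors are packed at the start of the target block, the free memory is one contiguous region and allocation only has to advance a pointer.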

3. Mark-Compact algorithm
When the object survival rate is high, the copying algorithm has to perform many copy operations and its efficiency drops. More importantly, if you do not want to waste 50% of the space, extra space is needed as an allocation guarantee for the extreme case in which all objects in the used memory survive. For this reason the copying algorithm generally cannot be used directly in the old generation. Based on the characteristics of the old generation, the "mark-compact" algorithm was proposed. Its marking phase is the same as in the "mark-sweep" algorithm, but instead of reclaiming the dead objects in place, the next step moves all surviving objects toward one end and then directly clears the memory beyond the end boundary.
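The compaction step can be sketched on a toy heap as well. The names MarkCompactDemo and compact are hypothetical, the heap is an int array whose cells hold object ids (0 means free), and the set of marked (surviving) ids is assumed to come from the marking phase. A real collector would also have to update every reference to a moved object, which this sketch leaves out.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy compaction step of mark-compact (hypothetical model; reference fix-up is omitted).
class MarkCompactDemo {
    // Slide all marked cells toward index 0 and clear everything past the boundary.
    // Returns the boundary, i.e. the index of the first free cell.
    static int compact(int[] heap, Set<Integer> marked) {
        int free = 0;                                // next position for a surviving cell
        for (int scan = 0; scan < heap.length; scan++) {
            int id = heap[scan];
            if (id != 0 && marked.contains(id)) {
                heap[free++] = id;                   // safe because free <= scan
            }
        }
        Arrays.fill(heap, free, heap.length, 0);     // clear memory beyond the boundary
        return free;
    }

    public static void main(String[] args) {
        int[] heap = {1, 1, 2, 2, 3, 3, 4, 4};
        Set<Integer> marked = new HashSet<>(Arrays.asList(1, 3));
        int boundary = compact(heap, marked);
        System.out.println(Arrays.toString(heap) + " boundary=" + boundary);
    }
}
```

Unlike the copying approach, the survivors are moved within the same region, so no second block has to be kept in reserve.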
