Overview
- Garbage collection has to answer three questions:
  - Which memory needs to be reclaimed?
  - When should it be reclaimed?
  - How should it be reclaimed?
- The focus here is on memory in the Java heap, which is allocated and reclaimed dynamically.
Is the object alive?
Reference counting method
Reachability analysis algorithm
- In mainstream commercial programming languages (Java, C#, Lisp), the main implementations use reachability analysis to decide whether an object is alive.
- The basic idea is to take a set of objects called "GC Roots" as starting points and search downward from them. The paths traversed by the search are called reference chains.
- When no reference chain connects an object to any GC Root (that is, the object is unreachable from the GC Roots), the object is unusable. Unreachable objects are eligible for reclamation.
- In Java, the objects that can serve as GC Roots include the following:
  - Objects referenced from the virtual machine stack (the local variable table in a stack frame);
  - Objects referenced by static class fields in the method area;
  - Objects referenced by constants in the method area;
  - Objects referenced by JNI (native methods) in the native method stack.
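As a rough illustration of the search described above, the following sketch models heap objects as nodes and the GC Roots as a list of starting nodes. The `Node` class, the `reachable` method, and the object names are hypothetical and exist only for this example; a real JVM walks its own internal structures. The sketch also shows that two objects referencing each other in a cycle are still unreachable if no root leads to them.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Toy model only: Node stands in for a heap object (hypothetical type).
class ReachabilityDemo {

    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();   // outgoing references
        Node(String name) { this.name = name; }
    }

    // Returns every node that can be reached from the given roots
    // by following reference chains.
    static Set<Node> reachable(List<Node> roots) {
        Set<Node> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (seen.add(n)) {            // first visit: follow its references
                pending.addAll(n.refs);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Node a = new Node("a"), b = new Node("b");
        Node c = new Node("c"), d = new Node("d");
        a.refs.add(b);    // root -> a -> b
        c.refs.add(d);    // c and d reference each other ...
        d.refs.add(c);    // ... but no root reaches them

        Set<Node> live = reachable(Arrays.asList(a));
        System.out.println(live.contains(b));   // true
        System.out.println(live.contains(c));   // false: recyclable despite the cycle
    }
}
```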
References revisited
- Both reference counting and reachability analysis rely on references to decide whether an object is alive.
- Before JDK 1.2, Java defined a reference in the traditional way: if the value stored in a piece of reference-type data is the starting address of another block of memory, that data is said to represent a reference.
- But consider this scenario: we want to describe a class of objects that may stay in memory while space is sufficient and may be discarded otherwise. Many caching features in real systems fit this scenario.
- So starting with JDK 1.2, Java expanded the concept of a reference and divided references into four kinds, in decreasing order of strength: strong references (Strong Reference), soft references (Soft Reference), weak references (Weak Reference), and phantom references (Phantom Reference).
  - Strong references: the most common kind, as in Object obj = new Object().
  - Soft references: describe objects that are useful but not essential. The system reclaims these objects before it throws an out-of-memory error. The SoftReference class has been available since JDK 1.2.
  - Weak references: also describe non-essential objects, but are weaker than soft references. An object that is only weakly referenced survives only until the next GC, regardless of whether memory is sufficient.
  - Phantom references: implemented by the PhantomReference class.
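The four kinds of reference map onto the classes in `java.lang.ref`. The sketch below shows one way to create each kind; the class name `ReferenceKindsDemo` and the variable names are made up for illustration. `System.gc()` is only a hint to the JVM, so the values printed after it can differ between runs and no expected output is given for them.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

class ReferenceKindsDemo {
    public static void main(String[] args) {
        // Strong reference: the referent cannot be collected while it exists.
        Object strong = new Object();

        // Soft reference: the referent may be reclaimed when memory runs low.
        SoftReference<byte[]> soft = new SoftReference<>(new byte[1024]);

        // Weak reference: does not keep the referent alive by itself.
        WeakReference<Object> weak = new WeakReference<>(strong);

        // Phantom reference: must be created together with a ReferenceQueue.
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(strong, queue);

        // While the strong reference exists, the weak referent is still there.
        System.out.println(weak.get() == strong);    // true
        // A phantom reference never hands out its referent.
        System.out.println(phantom.get() == null);   // true

        strong = null;   // drop the only strong reference
        System.gc();     // a request, not a guarantee

        // Results below depend on whether a collection actually ran.
        System.out.println("weak referent after gc: " + weak.get());
        System.out.println("soft referent still present: " + (soft.get() != null));
    }
}
```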
To survive or to die?
- Even an object found unreachable by reachability analysis is not necessarily "dead" yet.
- To really declare an object dead, it has to go through at least two marking passes: TBD ...
Reclaiming the method area
- Generally speaking, garbage collection in the method area offers relatively low "value for money": in the heap, especially in the young generation, a single collection in a typical application usually reclaims 70% to 95% of the space, whereas collection of the permanent generation is far less efficient.
- Garbage collection of the permanent generation mainly reclaims two things: obsolete constants and useless classes.
- Reclaiming obsolete constants is very similar to reclaiming objects in the Java heap. Suppose the string "ABC" has entered the constant pool, but no String object in the system currently references that constant; the system then cleans the "ABC" constant out of the constant pool.
- In contrast, the criteria for deciding that a class is "useless" are much stricter. All of the following must hold:
  - All instances of the class have been reclaimed;
  - The ClassLoader that loaded the class has been reclaimed;
  - The java.lang.Class object for the class is not referenced anywhere, so the class's methods cannot be accessed through reflection from any place.
Garbage collection algorithms
Mark-sweep algorithm
- The algorithm has two phases: "mark" and "sweep".
- It is the basis of the collection algorithms that follow.
- Drawbacks:
  - Efficiency: neither the marking phase nor the sweeping phase is very efficient;
  - Space: sweeping leaves many discontinuous memory fragments. With too much fragmentation, a later allocation of a large object while the program is running may fail to find enough contiguous memory and force another GC to be triggered early.
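The fragmentation problem can be made concrete with a toy model. In the sketch below the heap is an array of equally sized slots, and the marking phase is assumed to have already produced the `marked` array. The class `MarkSweepDemo` and its methods are hypothetical helpers written for this example, not part of any JVM.

```java
// Toy model: '#' is an occupied slot, '.' is a free slot.
class MarkSweepDemo {

    // Sweep phase: free every occupied slot whose object was not marked
    // as reachable. Marking is assumed to have happened already.
    static char[] sweep(char[] heap, boolean[] marked) {
        char[] after = heap.clone();
        for (int i = 0; i < after.length; i++) {
            if (after[i] == '#' && !marked[i]) {
                after[i] = '.';
            }
        }
        return after;
    }

    // Counts all free slots.
    static int totalFree(char[] heap) {
        int free = 0;
        for (char slot : heap) {
            if (slot == '.') free++;
        }
        return free;
    }

    // Length of the longest run of adjacent free slots.
    static int largestFreeRun(char[] heap) {
        int best = 0, run = 0;
        for (char slot : heap) {
            run = (slot == '.') ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        char[] heap = "########".toCharArray();
        boolean[] marked = {true, false, true, false, true, false, true, false};
        char[] after = sweep(heap, marked);
        System.out.println(new String(after));       // #.#.#.#.
        System.out.println(totalFree(after));        // 4
        System.out.println(largestFreeRun(after));   // 1
    }
}
```

Half of this toy heap is free after the sweep, yet an object needing two adjacent slots cannot be placed, which is the situation that would trigger another collection early.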
Copying algorithm
- It divides the available memory into two blocks of equal size and uses only one of them at a time.
- When the block in use is full, the surviving objects are copied to the other block, and the used block is then cleaned up in one go.
- This way an entire half-region is collected each time, and allocation does not have to deal with fragmentation: it only needs to move the heap-top pointer and allocate memory sequentially.
- The cost is that usable memory shrinks to half of the original.
- Commercial JVMs today use this algorithm to collect the young generation. IBM's research has shown that 98% of objects in the young generation die soon after they are created, so there is no need to split the memory in a 1:1 ratio. Instead, the memory is divided into one larger Eden space and two smaller Survivor spaces.
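A minimal sketch of the basic two-block scheme follows, assuming objects are modeled as strings in fixed-size arrays and that liveness is already known. `CopyingDemo` and `copySurvivors` are illustrative names only; the Eden and Survivor layout is not modeled.

```java
import java.util.Arrays;

class CopyingDemo {

    // Copies the surviving objects from `from` into `to`, packed from index 0,
    // then clears the whole from-space at once.
    // Returns the new allocation pointer (index of the first free slot in `to`).
    static int copySurvivors(String[] from, boolean[] alive, String[] to) {
        int top = 0;
        for (int i = 0; i < from.length; i++) {
            if (from[i] != null && alive[i]) {
                to[top++] = from[i];
            }
        }
        Arrays.fill(from, null);   // the used block is reclaimed in one step
        return top;
    }

    public static void main(String[] args) {
        String[] from = {"a", "b", "c", "d"};
        boolean[] alive = {true, false, false, true};
        String[] to = new String[from.length];

        int top = copySurvivors(from, alive, to);

        // Allocation after the collection is just a pointer bump.
        to[top++] = "e";

        System.out.println(Arrays.toString(to));     // [a, d, e, null]
        System.out.println(top);                     // 3
        System.out.println(Arrays.toString(from));   // [null, null, null, null]
    }
}
```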
Mark-compact algorithm
- The copying algorithm has to perform more copy operations when the object survival rate is high, so it becomes less efficient. More importantly, if you do not want to waste half of the space, extra space is needed as an allocation guarantee to handle the extreme case in which 100% of the objects in the used memory survive. For this reason the old generation generally cannot use the copying algorithm.
- Mark-compact algorithm: the marking phase is the same as in mark-sweep. The next step, however, does not reclaim the recyclable objects in place; instead, all surviving objects are moved toward one end, and the memory beyond the boundary of the survivors is then reclaimed directly.
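The compaction step can be sketched on the same kind of toy heap, again assuming marking has already been done. `MarkCompactDemo` and `compact` are hypothetical names, and real collectors must also update every reference to a moved object, which this sketch leaves out.

```java
import java.util.Arrays;

class MarkCompactDemo {

    // Slides every marked object toward index 0, preserving order, then
    // clears everything past the last survivor.
    // Returns the number of survivors (the boundary of the live area).
    static int compact(String[] heap, boolean[] marked) {
        int next = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && marked[i]) {
                heap[next++] = heap[i];   // next <= i, so nothing live is overwritten
            }
        }
        for (int i = next; i < heap.length; i++) {
            heap[i] = null;               // memory beyond the boundary is free
        }
        return next;
    }

    public static void main(String[] args) {
        String[] heap = {"a", "b", "c", "d", "e"};
        boolean[] marked = {true, false, true, false, true};
        int boundary = compact(heap, marked);
        System.out.println(Arrays.toString(heap));   // [a, c, e, null, null]
        System.out.println(boundary);                // 3
    }
}
```

Unlike the mark-sweep result, the free slots end up contiguous, at the price of moving objects.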
Generational collection algorithm
- The garbage collectors of current commercial JVMs all use the "generational collection" (Generational Collection) algorithm.
- This algorithm divides memory into several blocks according to how long objects live.
- In general, the Java heap is divided into a young generation and an old generation, so that each generation can use the collection algorithm best suited to its characteristics.
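To give a feel for the split, here is a toy model in which objects start in a young region and move to an old region once they have survived a few collections. Everything here is an assumption made for illustration: the class `GenerationalDemo`, the explicit `alive` flag, and the promotion age of 2 are not taken from any real JVM configuration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class GenerationalDemo {
    // Hypothetical threshold chosen for this sketch only.
    static final int PROMOTION_AGE = 2;

    static final class Obj {
        final String name;
        int age;                 // number of young collections survived
        boolean alive = true;    // stands in for "still reachable"
        Obj(String name) { this.name = name; }
    }

    final List<Obj> young = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();

    // New objects always start in the young generation.
    Obj allocate(String name) {
        Obj o = new Obj(name);
        young.add(o);
        return o;
    }

    // Young-generation collection: dead objects are dropped, survivors age,
    // and survivors that reach PROMOTION_AGE move to the old generation.
    void minorGc() {
        Iterator<Obj> it = young.iterator();
        while (it.hasNext()) {
            Obj o = it.next();
            if (!o.alive) {
                it.remove();
                continue;
            }
            o.age++;
            if (o.age >= PROMOTION_AGE) {
                it.remove();
                old.add(o);
            }
        }
    }

    public static void main(String[] args) {
        GenerationalDemo heap = new GenerationalDemo();
        heap.allocate("longLived");
        heap.allocate("shortLived").alive = false;

        heap.minorGc();
        System.out.println(heap.young.size() + " young, " + heap.old.size() + " old");   // 1 young, 0 old
        heap.minorGc();
        System.out.println(heap.young.size() + " young, " + heap.old.size() + " old");   // 0 young, 1 old
    }
}
```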
Summary
- This article covers two major GC questions:
  - Which objects need to be reclaimed: the most basic approach is reference counting, which is simple but cannot handle circular references. The more commonly used approach is reachability analysis.
  - How to reclaim them: the most basic approach is mark-sweep, which marks the objects to be reclaimed and then clears them one by one. Its drawback is heavy memory fragmentation, which can cause additional GCs to be triggered later. The copying algorithm is more efficient and keeps memory contiguous, but it cuts usable memory to half of the original and requires many copy operations when the object survival rate is high. There is also the mark-compact algorithm. Finally, commercial JVMs use generational collection: memory is divided into several blocks, and each generation uses the collection algorithm that suits it.
[In-Depth Understanding of the Java Virtual Machine] <Garbage Collector and Memory Allocation Strategy>