Garbage collection and several common garbage collection algorithms
Preface:
First, consider the three questions that garbage collection (GC) has to answer:
1) Which memory needs to be reclaimed?
2) When should it be reclaimed?
3) How should it be reclaimed?
The previous post described the regions of the Java runtime memory. The program counter, the virtual machine stack, and the native method stack are created and destroyed together with their thread, and the stack frames in a stack are pushed and popped in an orderly way as methods are entered and exited. How much memory each stack frame needs is essentially known once the class structure is determined. Memory allocation and reclamation in these regions are therefore deterministic, and there is little need to worry about reclaiming them: the memory is released when the method or the thread ends.
The Java heap and the method area are different. The implementation classes of one interface may need different amounts of memory, and so may the different branches of one method; only while the program is running do we know which objects will be created. Allocation and reclamation of this memory are dynamic, and this is the memory the garbage collector is concerned with.
I. Is the object dead?
1. Before reclaiming the heap, the garbage collector has to determine which objects are still "alive" and which are already "dead".
2. Algorithms for determining liveness
1) Reference counting: many textbooks use this algorithm to decide whether an object is alive. A reference counter is added to each object. Whenever a new reference to the object is created, the counter is incremented by 1; whenever a reference becomes invalid, the counter is decremented by 1; an object whose counter is 0 can no longer be used. Mainstream Java virtual machines do not manage memory this way, mainly because the algorithm has trouble with objects that reference each other in a cycle.
2) Reachability analysis: the mainstream implementations of mainstream commercial programming languages use reachability analysis to decide whether an object is alive. The basic idea is to take a set of objects called "GC Roots" as starting points and search downward from these nodes. The path travelled by the search is called a reference chain. When no reference chain connects an object to any GC Root, the object is unreachable and is considered unusable.
In Java, the objects that can serve as GC Roots include:
Objects referenced from the virtual machine stack (the local variable table in a stack frame)
Objects referenced by constants in the method area
Objects referenced by static class fields in the method area
Objects referenced by JNI (native methods) in the native method stack
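The difference between the two algorithms can be sketched with a toy model. This is only an illustration under stated assumptions: ToyObject, pointTo, and reachableFrom are hypothetical names invented here, and the "root" object merely stands in for something referenced from a stack frame or a static field. The real JVM exposes nothing like this.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical toy object model for illustration; not a JVM API.
class ToyObject {
    final String name;
    final List<ToyObject> refs = new ArrayList<>(); // outgoing references
    int refCount = 0;                                // counter used by reference counting

    ToyObject(String name) { this.name = name; }

    // Store a reference to target in this object; target's counter goes up by 1.
    void pointTo(ToyObject target) {
        refs.add(target);
        target.refCount++;
    }
}

class ReachabilityDemo {
    // Reachability analysis: follow reference chains starting from the GC roots.
    static Set<ToyObject> reachableFrom(List<ToyObject> gcRoots) {
        Set<ToyObject> reachable = new HashSet<>();
        Deque<ToyObject> pending = new ArrayDeque<>(gcRoots);
        while (!pending.isEmpty()) {
            ToyObject current = pending.pop();
            if (reachable.add(current)) {      // first visit: queue everything it references
                pending.addAll(current.refs);
            }
        }
        return reachable;
    }

    public static void main(String[] args) {
        ToyObject root = new ToyObject("root"); // stands in for an object referenced from a GC root
        ToyObject x = new ToyObject("x");
        root.pointTo(x);

        ToyObject a = new ToyObject("a");
        ToyObject b = new ToyObject("b");
        a.pointTo(b); // a and b reference each other,
        b.pointTo(a); // but nothing reachable from a root references them

        Set<ToyObject> reachable = reachableFrom(Collections.singletonList(root));
        System.out.println("a.refCount = " + a.refCount);
        System.out.println("a reachable: " + reachable.contains(a));
        System.out.println("x reachable: " + reachable.contains(x));
    }
}
```

In this sketch the counters of a and b never drop to 0 because each still holds a reference to the other, so reference counting alone would keep both alive, while reachability analysis finds no reference chain to them and treats them as dead.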
3. References
Whichever algorithm is used to decide whether an object is dead, the decision always comes down to references.
References in Java are divided into four kinds: strong references, soft references, weak references, and phantom references. Their strength decreases in that order.
1) Strong reference: the kind of reference that is everywhere in program code, such as "Object obj = new Object()". As long as a strong reference still exists, the garbage collector will never reclaim the referenced object.
2) Soft reference: describes objects that are useful but not essential. Before the system throws an out-of-memory error, objects that are only softly referenced are included in the reclamation scope for a second collection. If that collection still does not free enough memory, the out-of-memory error is thrown.
3) Weak reference: also describes non-essential objects, but it is weaker than a soft reference. An object that is only weakly referenced survives only until the next garbage collection. When the garbage collector runs, such objects are reclaimed whether or not memory is currently sufficient.
4) Phantom reference: also called a ghost reference. It is the weakest kind of reference. Whether an object has a phantom reference does not affect its lifetime at all, and an object instance cannot be obtained through a phantom reference.
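The four kinds map onto classes in the java.lang.ref package. The following is a minimal sketch that only probes what get() returns while a strong reference is still held; the class and method names ReferenceKindsDemo and probe are made up for this example. It deliberately does not call System.gc(), because when soft and weak references are actually cleared is up to the collector.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

class ReferenceKindsDemo {
    // Reports what get() yields for each reference kind while a strong reference exists.
    static boolean[] probe() {
        Object referent = new Object();                               // strong reference
        SoftReference<Object> soft = new SoftReference<>(referent);   // soft reference
        WeakReference<Object> weak = new WeakReference<>(referent);   // weak reference
        ReferenceQueue<Object> queue = new ReferenceQueue<>();        // a phantom reference requires a queue
        PhantomReference<Object> phantom = new PhantomReference<>(referent, queue);

        return new boolean[] {
            soft.get() == referent,  // object is still strongly reachable, so not cleared
            weak.get() == referent,  // same reasoning
            phantom.get() == null    // a phantom reference never hands out its referent
        };
    }

    public static void main(String[] args) {
        boolean[] result = probe();
        System.out.println("soft returns the object:  " + result[0]);
        System.out.println("weak returns the object:  " + result[1]);
        System.out.println("phantom returns null:     " + result[2]);
    }
}
```

Once the last strong reference is gone, weak.get() may start returning null after a later collection, and soft.get() may do so when memory runs short; neither moment can be predicted from program code.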
II. Garbage collection algorithms
1. Mark-sweep algorithm
The most basic collection algorithm is mark-sweep. It has two phases, mark and sweep: first, every object to be reclaimed is marked; once marking is complete, all marked objects are reclaimed together. It is called the most basic algorithm because the later collection algorithms build on this idea and improve on its shortcomings. It has two main shortcomings:
First, efficiency: neither marking nor sweeping is very efficient. Second, space: sweeping leaves behind a large number of non-contiguous memory fragments. With too much fragmentation, when the program later needs to allocate a large object it may be unable to find enough contiguous memory and has to trigger another garbage collection early.
[Figure: execution of the mark-sweep algorithm]
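As a rough illustration of the fragmentation problem, here is a toy simulation. It is a sketch only: the heap is modelled as an int array of object ids with 0 meaning "free", the set of live ids is assumed to be given, and the names MarkSweepDemo, markSweep, and largestFreeRun are invented for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class MarkSweepDemo {
    static final int FREE = 0; // hypothetical marker for an empty slot

    // heap[i] holds an object id or FREE; liveIds are the ids found to be alive.
    static void markSweep(int[] heap, Set<Integer> liveIds) {
        // Mark phase: flag every occupied slot whose object is not alive.
        boolean[] reclaim = new boolean[heap.length];
        for (int i = 0; i < heap.length; i++) {
            reclaim[i] = heap[i] != FREE && !liveIds.contains(heap[i]);
        }
        // Sweep phase: free all flagged slots in one pass; survivors do not move.
        for (int i = 0; i < heap.length; i++) {
            if (reclaim[i]) {
                heap[i] = FREE;
            }
        }
    }

    // Length of the longest run of adjacent free slots.
    static int largestFreeRun(int[] heap) {
        int best = 0;
        int run = 0;
        for (int slot : heap) {
            run = (slot == FREE) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        int[] heap = {1, 2, 3, 4, 5, 6};
        markSweep(heap, new HashSet<>(Arrays.asList(1, 3, 5)));
        System.out.println(Arrays.toString(heap));
        System.out.println("largest contiguous free run: " + largestFreeRun(heap));
    }
}
```

Three of the six slots are free after the sweep, yet no two of them are adjacent, so in this model an object needing two contiguous slots could not be allocated.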
2. Copying algorithm
To solve the efficiency problem, the copying collection algorithm appeared. It divides the available memory into two blocks of equal size and uses only one of them at a time. When that block is used up, the surviving objects are copied to the other block, and the used block is then cleared in one go. Each collection therefore reclaims an entire half, and allocation never has to deal with complications such as fragmentation: memory is handed out in order simply by moving a heap-top pointer, which is simple to implement and efficient to run. The price is that usable memory shrinks to half of its original size, which is rather high.
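The two halves and the heap-top pointer can be sketched as follows. This is a simplified model, not how any particular collector is written: objects are single int ids, the live set is assumed to be known, and SemispaceHeap with its methods is a hypothetical name chosen for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SemispaceHeap {
    private int[] fromSpace; // the half currently used for allocation
    private int[] toSpace;   // the reserved half
    private int top = 0;     // heap-top (bump) pointer into fromSpace

    SemispaceHeap(int halfSize) {
        fromSpace = new int[halfSize];
        toSpace = new int[halfSize];
    }

    // Allocation only moves the top pointer; returns false when the half is full.
    boolean allocate(int id) {
        if (top == fromSpace.length) {
            return false;
        }
        fromSpace[top++] = id;
        return true;
    }

    // Copy survivors to the other half, clear the used half, then swap roles.
    void collect(Set<Integer> liveIds) {
        int newTop = 0;
        for (int i = 0; i < top; i++) {
            if (liveIds.contains(fromSpace[i])) {
                toSpace[newTop++] = fromSpace[i];
            }
        }
        Arrays.fill(fromSpace, 0);
        int[] tmp = fromSpace;
        fromSpace = toSpace;
        toSpace = tmp;
        top = newTop;
    }

    int used() { return top; }

    int[] snapshot() { return fromSpace.clone(); }

    public static void main(String[] args) {
        SemispaceHeap heap = new SemispaceHeap(4);
        for (int id = 1; id <= 4; id++) {
            heap.allocate(id);
        }
        heap.collect(new HashSet<>(Arrays.asList(2, 4)));
        System.out.println(Arrays.toString(heap.snapshot()) + " used=" + heap.used());
    }
}
```

The survivors end up packed at the start of the new half, so the free space is contiguous again, at the cost of keeping the other half empty.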
3. Mark-compact algorithm
The copying algorithm has to perform many copy operations when the object survival rate is high, and its efficiency drops. More importantly, if you do not want to waste 50% of the space, you need extra space as an allocation guarantee (other memory to fall back on when there is not enough room), to cope with the extreme case in which 100% of the objects in the used memory survive.
In the mark-compact algorithm, the marking phase is the same as in mark-sweep, but the next step does not sweep the reclaimable objects in place. Instead, all surviving objects are moved toward one end, and the memory beyond the end boundary is then cleared directly.
[Figure: the mark-compact algorithm]
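A toy version of the compaction step, under the same simplifying assumptions as a slot-array heap (object ids in an int array, 0 for a free slot, live ids given); MarkCompactDemo and compact are illustrative names, not an existing API.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class MarkCompactDemo {
    // Slides every live object toward the start of the heap and clears the rest.
    // Returns the index of the first free slot after compaction.
    static int compact(int[] heap, Set<Integer> liveIds) {
        int free = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != 0 && liveIds.contains(heap[i])) {
                heap[free++] = heap[i]; // free <= i, so no live object is overwritten
            }
        }
        Arrays.fill(heap, free, heap.length, 0); // clear everything past the boundary
        return free;
    }

    public static void main(String[] args) {
        int[] heap = {1, 2, 3, 4, 5, 6};
        int boundary = compact(heap, new HashSet<>(Arrays.asList(1, 3, 5)));
        System.out.println(Arrays.toString(heap) + " boundary=" + boundary);
    }
}
```

With the same input that left a mark-sweep heap fragmented, the three free slots now form one contiguous block, and no second half of memory has to be reserved.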
4. Generational collection algorithm
Current commercial virtual machines use generational collection for garbage collection. It introduces no new idea; it simply divides memory into several regions according to object lifetimes. Typically the heap is divided into a young generation and an old generation, so that each can use the collection algorithm that suits it best. In the young generation, a large number of objects die at every collection and only a few survive, so the copying algorithm is used, and a collection costs only the copying of the few survivors. In the old generation, objects have a high survival rate and there is no extra space to act as an allocation guarantee, so mark-sweep or mark-compact must be used.
III. Memory allocation and reclamation policies
Broadly speaking, object memory is allocated on the heap, mainly in the Eden area of the young generation. (In this generational layout, the young generation is divided into one large Eden space and two smaller Survivor spaces. I will not go into detail here; a basic understanding is enough.) If the Thread Local Allocation Buffer (TLAB) is enabled, each thread allocates in its own TLAB first. In a few cases an object may also be allocated directly in the old generation. The allocation rules are not fixed; the details depend on which combination of garbage collectors is in use and on the memory-related parameters of the virtual machine.
1. Objects are allocated in Eden first
In most cases, objects are allocated in Eden in the young generation. When Eden does not have enough space for an allocation, the virtual machine starts a Minor GC.
Minor GC is a garbage collection that takes place in the young generation. Because most Java objects die young, Minor GCs are very frequent and usually fast. Major GC is a garbage collection in the old generation; it is often accompanied by at least one Minor GC and is generally on the order of 10 times slower than a Minor GC.
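One way to observe how often collections run is the standard java.lang.management API. The sketch below lists each collector the running JVM reports, with its collection count and accumulated time. The collector names, and which of them correspond to young-generation or old-generation collections, depend on the JVM and the collector in use, so no particular output is assumed; GcCountersDemo and describeCollectors are names made up for this example.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class GcCountersDemo {
    // One line per collector reported by the running JVM.
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Create many short-lived objects; this makes young-generation
        // collections likely, but the JVM gives no guarantee that any occur.
        long checksum = 0;
        for (int i = 0; i < 1_000_000; i++) {
            byte[] garbage = new byte[1024];
            checksum += garbage.length;
        }
        System.out.println("allocated bytes: " + checksum);
        for (String line : describeCollectors()) {
            System.out.println(line);
        }
    }
}
```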
2. Large objects go directly into the old generation
A large object is a Java object that needs a large amount of contiguous memory; the most typical examples are very long strings and arrays. Large objects are bad news for the virtual machine's memory allocation: they often cause garbage collection to be triggered early just to obtain enough contiguous space for them.
3. Long-lived objects enter the old generation
Since the virtual machine manages memory with generational collection, it must be able to tell which objects belong in the young generation and which belong in the old generation. To do this, it keeps an age counter for every object. If an object is born in Eden, is still alive after its first Minor GC, and fits in a Survivor space, it is moved to the Survivor space and its age becomes 1. Each Minor GC it survives after that adds 1 to its age, and when the age reaches a threshold (15 by default) the object is promoted to the old generation.
4. Dynamic object age determination
To adapt better to the memory behaviour of different programs, the virtual machine does not always require an object to reach the maximum age before it is promoted. If the total size of all objects of the same age in the Survivor space is greater than half of the Survivor space, objects of that age or older can enter the old generation directly, without waiting for the maximum age.
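The rule as stated above can be written down as a small calculation. This is a sketch of the rule in the text, not of any particular virtual machine's implementation; the input map (total bytes of surviving objects per age) and the names TenuringThresholdDemo, effectiveThreshold, and shouldPromote are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class TenuringThresholdDemo {
    // sizeByAge: total bytes of surviving objects for each age.
    // Returns the age from which objects are promoted in this collection.
    static int effectiveThreshold(Map<Integer, Long> sizeByAge,
                                  long survivorCapacity,
                                  int maxThreshold) {
        // Visit ages in ascending order so the youngest qualifying age wins.
        for (Map.Entry<Integer, Long> entry : new TreeMap<Integer, Long>(sizeByAge).entrySet()) {
            boolean moreThanHalf = entry.getValue() > survivorCapacity / 2;
            if (entry.getKey() < maxThreshold && moreThanHalf) {
                return entry.getKey();
            }
        }
        return maxThreshold; // no age dominates: fall back to the maximum age
    }

    static boolean shouldPromote(int age, int threshold) {
        return age >= threshold;
    }

    public static void main(String[] args) {
        Map<Integer, Long> sizeByAge = new HashMap<>();
        sizeByAge.put(1, 100L);
        sizeByAge.put(3, 600L); // more than half of a 1000-byte Survivor space
        sizeByAge.put(5, 50L);
        int threshold = effectiveThreshold(sizeByAge, 1000L, 15);
        System.out.println("effective threshold: " + threshold);
        System.out.println("age 2 promoted: " + shouldPromote(2, threshold));
        System.out.println("age 5 promoted: " + shouldPromote(5, threshold));
    }
}
```

In the example, objects of age 3 occupy 600 of the 1000 bytes, so objects aged 3 or more are promoted even though the maximum age of 15 has not been reached.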
5. Space allocation guarantee
Before a Minor GC, the virtual machine first checks whether the largest contiguous free space in the old generation is greater than the total size of all objects in the young generation. If it is, the Minor GC is guaranteed to be safe. If it is not, the virtual machine checks whether the HandlePromotionFailure setting allows the guarantee to fail. If it does, the virtual machine goes on to check whether the largest contiguous free space in the old generation is greater than the average size of the objects promoted to the old generation in previous collections. If it is, a Minor GC is attempted even though it carries a risk; if it is smaller, or if HandlePromotionFailure does not allow the risk, a collection that includes the old generation (a Full GC) is performed instead.
To explain the risk: the young generation uses the copying algorithm, but to improve memory utilization only one of the Survivor spaces is kept as the rotating backup. So when a large number of objects are still alive after a Minor GC (in the most extreme case, every object in the young generation survives), the old generation has to provide an allocation guarantee, and the objects that the Survivor space cannot hold go directly into the old generation.
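The check sequence described in this section can be summarized as a decision function. This is a sketch of the logic as described in the text only; the parameter names and the PromotionGuaranteeDemo class are hypothetical, and the sizes would in reality come from the virtual machine's own bookkeeping.

```java
class PromotionGuaranteeDemo {
    enum Decision { MINOR_GC_SAFE, MINOR_GC_RISKY, FULL_GC }

    static Decision decide(long oldGenMaxContiguousFree,
                           long youngGenTotalObjectSize,
                           long averagePromotedSize,
                           boolean handlePromotionFailure) {
        // Even if every young object survived, the old generation could take them all.
        if (oldGenMaxContiguousFree > youngGenTotalObjectSize) {
            return Decision.MINOR_GC_SAFE;
        }
        // Otherwise gamble on history, but only if the setting allows it.
        if (handlePromotionFailure && oldGenMaxContiguousFree > averagePromotedSize) {
            return Decision.MINOR_GC_RISKY;
        }
        // Not enough room, or taking the risk is not allowed.
        return Decision.FULL_GC;
    }

    public static void main(String[] args) {
        System.out.println(decide(500, 300, 100, false)); // enough room for everything
        System.out.println(decide(200, 300, 100, true));  // relies on the historical average
        System.out.println(decide(200, 300, 100, false)); // risk not allowed
        System.out.println(decide(50, 300, 100, true));   // below the historical average
    }
}
```

The average promoted size is only an estimate drawn from earlier collections, which is why the middle case is described as risky: if far more objects survive this time than usual, the guarantee can still fail.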
Supplement:
Reference: the book "Go Deep into Java Virtual Machine"
This post is only a brief summary of the garbage collection and memory allocation chapters, meant as a quick introduction. If you want to learn more, read the book.