A Deep Understanding of the Java Garbage Collection Mechanism
I. Significance of the Mechanism
A notable feature of Java is its garbage collection mechanism, which solves the memory management problems that C++ programmers find most troublesome: Java programmers no longer need to manage memory by hand when writing programs. Because of garbage collection, objects in Java have no concept of "scope"; only references to objects have a scope. Garbage collection helps prevent memory leaks and makes effective use of free memory.
PS: In general, a memory leak means that memory is not reclaimed after it has been used. Setting aside complex data structures, a memory leak in Java typically means that an object stays alive longer than the program actually needs it. This is sometimes called "object loitering".
II. Garbage Collection Algorithms
The Java language specification does not state which garbage collection algorithm the JVM uses, but any garbage collection algorithm generally has to do two basic things: (1) find useless objects; (2) reclaim the memory occupied by useless objects so that the program can use it again.
1. Reference counting (Reference Counting Collector)
1.1 Algorithm analysis
Reference counting is an early garbage collection strategy. In this approach, each object instance in the heap has a reference count. When an object is created and assigned to a variable, its count is set to 1. When any other variable is assigned a reference to the object, the count is incremented by 1 (for a = b, the counter of the object instance referenced by b is incremented). When a reference to an object instance goes out of scope or is set to a new value, the object's reference counter is decremented by 1. Any object instance whose reference counter is 0 can be collected as garbage. When an object instance is garbage collected, the reference counter of every object instance it references is decremented by 1.
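The counting rules above can be sketched as a toy model. This is only an illustration under stated assumptions: the RefCounted class and its retain, release, and setField methods are hypothetical names, and a real JVM does not expose reference counts to Java code.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: a real JVM does not expose reference counts.
final class RefCounted {
    final String name;
    int refCount = 0;
    boolean collected = false;
    // Objects referenced by this object's fields.
    final List<RefCounted> fields = new ArrayList<>();

    RefCounted(String name) {
        this.name = name;
    }

    // A variable or field starts referring to this object.
    void retain() {
        refCount++;
    }

    // A reference goes out of scope or is overwritten.
    void release() {
        refCount--;
        if (refCount == 0 && !collected) {
            collected = true;
            // Collecting an object releases everything it references.
            for (RefCounted child : fields) {
                child.release();
            }
        }
    }

    // Models "this.someField = target".
    void setField(RefCounted target) {
        fields.add(target);
        target.retain();
    }
}
```

In this model, if two objects call setField on each other and the external references are then released, both counts stay at 1 and neither object is ever marked as collected.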
1.2 Advantages and disadvantages
Advantages:
A reference counting collector runs quickly and is interleaved with normal program execution. This suits real-time environments where the program must not be interrupted for long.
Disadvantages:
Circular references cannot be detected. If a parent object references a child object and the child in turn references the parent, their reference counts can never reach 0.
1.3 The reference counting algorithm cannot solve the circular reference problem. For example:
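class MyObject {
    MyObject object; // reference to another MyObject
}

public class Main {
    public static void main(String[] args) {
        MyObject object1 = new MyObject();
        MyObject object2 = new MyObject();
        // The two objects reference each other.
        object1.object = object2;
        object2.object = object1;
        // Drop the only external references.
        object1 = null;
        object2 = null;
    }
}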
In the last two statements, object1 and object2 are set to null, so the objects they pointed to can no longer be accessed. Because the two objects still reference each other, however, their reference counters are not 0, and a reference counting collector would never reclaim them.
2. Tracing algorithm (Tracing Collector), or mark-sweep algorithm (mark and sweep)
2.1 Root search algorithm
The root search algorithm comes from graph theory in discrete mathematics. The program treats all reference relationships as a graph. Starting from a GC Root node, it looks for the nodes that node references, then continues with the nodes those reference, and so on. Once all referenced nodes have been visited, the remaining nodes are considered unreferenced, that is, unused.
In Java, the following objects can serve as GC Roots:
1. Objects referenced in the VM stack (the local variable table)
2. Objects referenced by static fields in the method area
3. Objects referenced by constants in the method area
4. Objects referenced by native methods (JNI) in the native method stack
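Root search amounts to a graph traversal from the root set. The following is a minimal sketch, assuming a hypothetical Node class whose refs list stands in for an object's reference fields; it is not how any particular JVM implements the search.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class RootSearch {
    // Hypothetical stand-in for a heap object and its reference fields.
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    // Returns every node reachable from the given roots.
    // Anything not in the result would be treated as garbage.
    static Set<Node> reachable(List<Node> roots) {
        Set<Node> seen = new HashSet<>();
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (seen.add(n)) {
                // First visit: queue the nodes this one references.
                pending.addAll(n.refs);
            }
        }
        return seen;
    }
}
```

Unlike reference counting, two nodes that reference only each other are never reached from a root, so they do not appear in the result.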
2.2 Tracing algorithm
2.3 Mark-sweep algorithm analysis
The mark-sweep algorithm scans from the root set and marks the surviving objects. After marking, it scans the whole space and reclaims the unmarked objects. Mark-sweep does not need to move objects and only processes non-surviving objects, so it is very efficient when many objects survive. However, because it reclaims non-surviving objects in place, it can leave memory fragmented.
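The two phases can be sketched on a toy heap. In this illustration the heap is an array of slots, and the Obj class, mark, sweep, and collect are hypothetical names chosen for the example. Freed slots are set to null in place, which is where the fragmentation comes from.

```java
import java.util.ArrayList;
import java.util.List;

final class MarkSweep {
    // Hypothetical heap object with a mark bit.
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        boolean marked = false;

        Obj(String name) {
            this.name = name;
        }
    }

    // Mark phase: flag everything reachable from a root.
    static void mark(Obj o) {
        if (o == null || o.marked) {
            return;
        }
        o.marked = true;
        for (Obj r : o.refs) {
            mark(r);
        }
    }

    // Sweep phase: free unmarked slots in place and reset the mark bits.
    // Returns the number of slots freed.
    static int sweep(Obj[] heap) {
        int freed = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] == null) {
                continue;
            }
            if (heap[i].marked) {
                heap[i].marked = false; // ready for the next cycle
            } else {
                heap[i] = null; // survivors are not moved, so a hole remains
                freed++;
            }
        }
        return freed;
    }

    static int collect(Obj[] heap, List<Obj> roots) {
        for (Obj r : roots) {
            mark(r);
        }
        return sweep(heap);
    }
}
```

With survivors in slots 0 and 2 and garbage in slots 1 and 3, the free slots end up separated by a live object.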
3. Compacting algorithm, or mark-compact algorithm
The mark-compact algorithm marks objects in the same way as mark-sweep. When clearing, however, it moves all surviving objects toward the free space at the left end and updates the corresponding pointers. Because mark-compact moves objects on top of what mark-sweep does, it costs more, but it solves the memory fragmentation problem. Collectors based on the compacting algorithm generally add handles and a handle table.
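The compaction step can be sketched as sliding live objects toward index 0 of a slot array. This is a simplified illustration: the Obj class is hypothetical, the marked flag is assumed to have been set by an earlier mark phase, and the address field stands in for the pointer that a real collector would update.

```java
final class MarkCompact {
    // Hypothetical heap object.
    static final class Obj {
        final String name;
        boolean marked; // assumed to be set by a preceding mark phase
        int address;    // slot index, standing in for a memory address

        Obj(String name, boolean marked) {
            this.name = name;
            this.marked = marked;
        }
    }

    // Slides marked objects toward index 0 and records their new address.
    // Returns the index of the first free slot after compaction.
    static int compact(Obj[] heap) {
        int free = 0;
        for (int i = 0; i < heap.length; i++) {
            Obj o = heap[i];
            if (o != null && o.marked) {
                o.marked = false;
                o.address = free; // the "pointer update"
                heap[free++] = o;
            }
        }
        // Everything from 'free' onward is now one contiguous free block.
        for (int i = free; i < heap.length; i++) {
            heap[i] = null;
        }
        return free;
    }
}
```

After compaction the live objects are adjacent and the free space is a single block, at the cost of moving objects.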
4. Copying algorithm (Copying Collector)
This algorithm was proposed to avoid the overhead of handles and to deal with heap fragmentation. At the start, it divides the heap into one object space and several free spaces, and the program allocates objects from the object space. When the object space is full, the copying collector scans the live objects from the root set and copies each one to a free space, leaving no gaps between the memory occupied by live objects. The free space then becomes the object space, the original object space becomes a free space, and the program allocates memory in the new object space. A typical collector based on the copying algorithm is stop-and-copy, which divides the heap into an object space and a free space; the program is suspended while the two spaces switch roles.
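A two-space version of this scheme can be sketched as follows. The SemiSpace and Obj classes are hypothetical, and as a simplification "copying" an object here means moving its Java reference into the other list rather than duplicating its bytes, so no forwarding pointers are needed. The scan loop loosely follows the breadth-first style often used by copying collectors.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class SemiSpace {
    // Hypothetical heap object.
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();

        Obj(String name) {
            this.name = name;
        }
    }

    List<Obj> fromSpace = new ArrayList<>(); // object space: allocations land here
    List<Obj> toSpace = new ArrayList<>();   // free space: empty between collections

    Obj allocate(String name) {
        Obj o = new Obj(name);
        fromSpace.add(o);
        return o;
    }

    // Copies everything reachable from the roots into toSpace, then swaps the spaces.
    void collect(List<Obj> roots) {
        Set<Obj> copied = new HashSet<>();
        for (Obj r : roots) {
            if (copied.add(r)) {
                toSpace.add(r);
            }
        }
        // toSpace doubles as the work queue: scan each copied object once.
        for (int scan = 0; scan < toSpace.size(); scan++) {
            for (Obj child : toSpace.get(scan).refs) {
                if (copied.add(child)) {
                    toSpace.add(child);
                }
            }
        }
        fromSpace.clear(); // whatever was left behind is garbage
        List<Obj> tmp = fromSpace;
        fromSpace = toSpace;
        toSpace = tmp;
    }
}
```

Only live objects are touched during collection, and they end up packed together in the new object space.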
5. Generational algorithm (Generational Collector)
The generational garbage collection strategy is based on one fact: different objects have different lifecycles. Objects with different lifecycles can therefore be collected with different algorithms to improve collection efficiency.
Young Generation
1. All newly created objects are first placed in the young generation. The goal of the young generation is to collect short-lived objects as quickly as possible.
2. The young generation memory is generally divided by a set ratio into one Eden zone and two Survivor zones (Survivor0 and Survivor1). Most objects are created in Eden. During collection, the surviving objects in Eden are copied to Survivor0 and Eden is cleared. When Survivor0 is full, the surviving objects in Eden and Survivor0 are copied to Survivor1, and Eden and Survivor0 are cleared. Survivor0 is now empty, so Survivor0 and Survivor1 swap roles, which keeps Survivor1 empty.
3. When Survivor1 is not large enough to hold the surviving objects from Eden and Survivor0, the surviving objects are stored directly in the old generation. If the old generation is also full, a Full GC is triggered, that is, both the young and old generations are collected.
4. GC in the young generation is also called Minor GC. Minor GC occurs relatively frequently (it is not necessarily triggered only when the Eden zone is full).
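The Eden and Survivor bookkeeping described in this list can be sketched as a small simulation. Everything here is an assumption made for illustration: the YoungGen class, the alive flag (a stand-in for reachability), and the survivorCapacity and maxAge parameters are hypothetical, and sizes are counted in objects rather than bytes. Promotion by age is included as well, with maxAge playing the role of the number of collections an object must survive.

```java
import java.util.ArrayList;
import java.util.List;

final class YoungGen {
    // Hypothetical heap object.
    static final class Obj {
        boolean alive = true; // stand-in for "still reachable"
        int age = 0;          // number of minor GCs survived
    }

    final int survivorCapacity; // assumed capacity of one Survivor zone, in objects
    final int maxAge;           // assumed promotion threshold
    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>(); // Survivor zone currently holding objects
    List<Obj> to = new ArrayList<>();   // Survivor zone kept empty
    final List<Obj> old = new ArrayList<>();

    YoungGen(int survivorCapacity, int maxAge) {
        this.survivorCapacity = survivorCapacity;
        this.maxAge = maxAge;
    }

    Obj allocate() {
        Obj o = new Obj();
        eden.add(o);
        return o;
    }

    void minorGc() {
        List<Obj> candidates = new ArrayList<>(eden);
        candidates.addAll(from);
        for (Obj o : candidates) {
            if (!o.alive) {
                continue; // dead objects are simply not copied
            }
            o.age++;
            if (o.age >= maxAge || to.size() >= survivorCapacity) {
                old.add(o); // promoted by age, or the Survivor zone overflowed
            } else {
                to.add(o);
            }
        }
        eden.clear();
        from.clear();
        // Swap roles so that one Survivor zone is always empty.
        List<Obj> tmp = from;
        from = to;
        to = tmp;
    }
}
```

For example, with a Survivor capacity of 2, a third surviving object in the same collection goes straight to the old generation.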
Old Generation
1. Objects that are still alive after N garbage collections in the young generation are moved to the old generation. The old generation can therefore be regarded as holding long-lived objects.
2. The old generation is much larger than the young generation. When the old generation is full, a Major GC, that is, a Full GC, is triggered. Full GC happens relatively rarely; objects in the old generation live a long time and have a high survival rate.
Permanent Generation
It stores static data such as Java classes and methods. The permanent generation has no significant effect on garbage collection, but some applications, such as those using Hibernate, may dynamically generate or load classes. In that case a larger permanent generation is needed to hold the classes added at run time.
III. GC (Garbage Collectors)
Young generation collectors: Serial, ParNew, Parallel Scavenge
Old generation collectors: Serial Old, Parallel Old, CMS
Serial collector (copying algorithm)
The single-threaded young generation collector: both marking and cleanup run on a single thread. Its advantage is that it is simple and efficient.
Serial Old collector (mark-compact algorithm)
The single-threaded old generation collector; the old generation version of the Serial collector.
ParNew collector (stop-and-copy algorithm)
The young generation collector that can be regarded as the multi-threaded version of the Serial collector. It performs better than Serial on multi-core CPUs.
Parallel Scavenge collector (stop-and-copy algorithm)
A parallel collector that aims for high throughput and efficient use of the CPU. Throughput is typically around 99%, where throughput = user thread time / (user thread time + GC thread time). It suits background applications and other scenarios with low interactivity requirements.
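As a worked example of the throughput formula, the small helper below computes it directly. The Throughput class and its parameter names are hypothetical and exist only to show the arithmetic.

```java
final class Throughput {
    // throughput = user thread time / (user thread time + GC thread time)
    static double throughput(double userMillis, double gcMillis) {
        return userMillis / (userMillis + gcMillis);
    }

    public static void main(String[] args) {
        // 99 ms of application work for every 1 ms of GC work.
        System.out.println(throughput(99.0, 1.0));
    }
}
```

So a 99% throughput corresponds to roughly 1 unit of GC time for every 99 units of application time. On HotSpot, the -XX:GCTimeRatio flag expresses the same kind of goal as a ratio, though its exact behavior depends on the JVM version.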
Parallel Old collector (mark-compact algorithm)
The old generation version of the Parallel Scavenge collector; a parallel collector that prioritizes throughput.
CMS (Concurrent Mark Sweep) collector (mark-sweep algorithm)
A highly concurrent, low-pause collector that aims for the shortest possible GC pause time. It uses a relatively large share of CPU time in exchange for fast response and short pauses, which suits multi-core CPUs and applications that demand quick response times.
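To see which collectors a running JVM is actually using, the standard java.lang.management API can be queried. The sketch below is only a diagnostic aid: the ActiveCollectors class is a hypothetical name, and the bean names it prints vary by JVM vendor, version, and the collector selected at startup.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class ActiveCollectors {
    // Returns the names the JVM reports for its active collectors.
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(bean.getName()
                    + ": collections=" + bean.getCollectionCount()
                    + ", timeMs=" + bean.getCollectionTime());
        }
    }
}
```

On HotSpot, collectors are chosen with startup flags such as -XX:+UseSerialGC, -XX:+UseParallelGC, or -XX:+UseConcMarkSweepGC. Which flags are available depends on the JDK release; CMS, for example, has been removed from recent versions.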
IV. GC execution mechanism
Because objects are divided into generations, the area and timing of garbage collection differ by generation. There are two types of GC: Scavenge GC and Full GC.
Scavenge GC
Generally, when a new object is created and the request for space in Eden fails, a Scavenge GC is triggered. It collects the Eden area, clearing non-surviving objects and moving surviving objects to a Survivor area, and then tidies up the two Survivor areas. This kind of GC works on the young generation only and does not affect the old generation. Because most objects are first allocated in Eden and Eden is not very large, GC in Eden happens frequently. Fast, efficient algorithms are therefore needed so that Eden is freed up as soon as possible.
Full GC
Collects the entire heap, including Young, Tenured, and Perm. Full GC is slower than Scavenge GC because it must collect the whole heap, so the number of Full GCs should be kept as low as possible. A large part of JVM tuning is tuning Full GC. Full GC may occur for the following reasons:
1. The old generation (Tenured) is full.
2. The permanent generation (Perm) is full.
3. System.gc() is called explicitly.
4. The heap allocation policy for each region changed dynamically after the previous GC.
V. Memory Leaks Can Still Occur in Java Despite GC
1. Static collection classes such as HashMap and Vector are the most prone to memory leaks. A static variable lives as long as the application, so none of the objects it holds can be released, because the Vector (or other collection) still references them.
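import java.util.Vector;

public class Main {
    // A static collection lives as long as the application.
    static Vector<Object> v = new Vector<>();

    public static void main(String[] args) {
        for (int i = 1; i < 100; i++) {
            Object o = new Object();
            v.add(o);
            o = null; // the Vector still holds a reference to the object
        }
    }
}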
In this example, v is a reference to the Vector object and o is a reference to an Object. In the for loop we keep creating new objects, adding them to the Vector, and then setting o to null. The question is: if a GC runs after o has been set to null, can the Object we created be collected? The answer is no. When GC traces references from the roots, it finds v, follows it, and discovers that the memory v points to still contains a reference to the Object. In other words, although o has been set to null, another reference to the Object still exists and the Object can still be reached, so GC cannot release it. If the Object is of no further use to the program after the loop, we consider the Java program to have a memory leak.
2. Connections of all kinds (database connections, network connections, IO connections, and so on) that are not explicitly closed by calling close() are not collected by GC, which causes memory leaks.
3. Listeners: if a listener is not removed when the object that registered it is released, the listener can also cause a memory leak.
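The three leak patterns above each have a common remedy, sketched below under stated assumptions. The LeakFixes class and everything in it (BUFFER, FakeConnection, Listener, EventSource) are hypothetical names used only for illustration; a real application would use its own collection, connection type, and event source.

```java
import java.util.ArrayList;
import java.util.List;

final class LeakFixes {
    // 1. Static collection: drop entries once the program no longer needs them.
    static final List<Object> BUFFER = new ArrayList<>();

    static void fillThenRelease() {
        for (int i = 1; i < 100; i++) {
            BUFFER.add(new Object());
        }
        BUFFER.clear(); // the objects are now unreachable and eligible for GC
    }

    // 2. Connections: a hypothetical resource closed by try-with-resources.
    static final class FakeConnection implements AutoCloseable {
        boolean closed = false;

        @Override
        public void close() {
            closed = true;
        }
    }

    static FakeConnection useConnection() {
        FakeConnection conn = new FakeConnection();
        try (FakeConnection c = conn) {
            // ... work with the connection here ...
        } // close() runs here, even if the body throws
        return conn;
    }

    // 3. Listeners: unregister before the owning object is discarded.
    interface Listener {
        void onEvent(String event);
    }

    static final class EventSource {
        private final List<Listener> listeners = new ArrayList<>();

        void addListener(Listener l) {
            listeners.add(l);
        }

        void removeListener(Listener l) {
            listeners.remove(l);
        }

        int listenerCount() {
            return listeners.size();
        }
    }
}
```

In each case the fix is the same idea: remove the last reachable reference, or release the underlying resource explicitly, rather than relying on GC to do it.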