.NET Garbage Collection (GC) Principles
Garbage collection (GC) is an essential part of advanced .NET material. Aiming to keep things easy to understand, this article explains how the garbage collector works in the CLR.
Managed Heap
Let's start with the MSDN explanation: when a new process is initialized, the runtime reserves a contiguous region of address space for the process. This reserved address space is called the managed heap.
"The managed heap is still a heap." Why say so? So that the term does not confuse you. This topic builds on the difference between value types and reference types. We assume here that the reader knows the usual simplification: value types are stored on the stack and reference types are stored on the heap (the reference itself is stored on the stack). By that reasoning, apart from value types, the CLR requires all resources to be allocated from the managed heap.
The managed heap maintains a pointer named NextObjPtr, pointing to the allocation location of the next object in the heap.
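The role of NextObjPtr can be sketched with a toy model. This is not CLR code: the `ManagedHeap` class, its `allocate` method, and the sizes are made up for illustration, and Python is used only because it keeps the sketch short. Allocation amounts to returning the current pointer and advancing it past the new object.

```python
class ManagedHeap:
    """Toy model of a contiguous heap with a bump pointer (hypothetical)."""

    def __init__(self, size):
        self.size = size            # bytes reserved for the heap
        self.next_obj_ptr = 0       # offset where the next object will go
        self.objects = []           # (address, size) of every allocation

    def allocate(self, nbytes):
        # If the object does not fit, a real collector would run a GC here.
        if self.next_obj_ptr + nbytes > self.size:
            raise MemoryError("heap full: a collection would be triggered")
        address = self.next_obj_ptr
        self.next_obj_ptr += nbytes  # advance past the new object
        self.objects.append((address, nbytes))
        return address


heap = ManagedHeap(size=64)
a = heap.allocate(16)
b = heap.allocate(24)
print(a, b, heap.next_obj_ptr)
```

Because each allocation is just a bounds check and an addition, objects created one after another end up next to each other in memory.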
CPU Registers
This is basic computer knowledge. We review it here because it helps with the "root" concept that follows.
CPU registers are the CPU's own "temporary memory", and they are faster to access than main memory. Ordered by distance from the CPU, the registers come first, then the caches (L1, L2, and L3), and finally main memory.
Roots
All static fields defined in a class, method parameters, and local variables (reference-type variables only) are roots, and object pointers held in CPU registers are roots as well. Roots are the entry points outside the heap from which the CLR can find objects in the heap.
Reachable and Unreachable Objects
If a root references an object in the heap, directly or through other reachable objects, the object is "reachable"; otherwise it is "unreachable".
Reasons for garbage collection
From the perspective of computer organization, every program must reside in memory to run, and memory is a limited resource. The managed heap is limited in size as well. If the managed heap had no size limit, C# would in principle run faster than C, because the structure of the managed heap gives it faster object allocation than the C runtime heap. Because address space and storage are limited, the managed heap relies on garbage collection to keep operating normally and to make sure that allocating objects does not run out of memory.
Basic Principles of Garbage Collection
Collection is divided into two phases: mark, then compact.
The marking process is actually the process of determining whether an object is reachable. After all the roots are checked, the heap will contain reachable (marked) and unreachable (unmarked) objects.
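The marking process described above is a graph traversal starting at the roots. The following is a minimal sketch under stated assumptions: the object names, the `references` table, and the `mark` function are hypothetical, and a real collector walks actual object fields rather than a dictionary.

```python
def mark(roots, references):
    """Return the set of objects reachable from the roots.

    references maps each object name to the names it refers to.
    """
    marked = set()
    pending = list(roots)
    while pending:
        obj = pending.pop()
        if obj in marked:
            continue          # already visited; also guards against cycles
        marked.add(obj)
        pending.extend(references.get(obj, []))
    return marked


# Hypothetical heap: A..F are objects, list entries are references.
references = {
    "A": ["B"],
    "B": ["D"],
    "C": ["E"],   # C and E refer to each other but no root reaches them
    "D": [],
    "E": ["C"],
    "F": [],
}
roots = ["A", "F"]

reachable = mark(roots, references)
unreachable = set(references) - reachable
print(sorted(reachable), sorted(unreachable))
```

Note that C and E reference each other and are still unreachable: what matters is whether a path exists from some root, not whether an object is referenced at all.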
After marking is complete, the compaction phase begins. In this phase, the garbage collector traverses the heap linearly, looking for contiguous blocks of memory occupied by unreachable objects, and moves reachable objects into that space to compact the heap. The process is similar to defragmenting a disk.
In the figure, green boxes indicate reachable objects and yellow boxes indicate unreachable objects. After the unreachable objects are cleared, the reachable objects are moved so that memory becomes more compact.
After compaction, the variables and CPU registers that held pointers to the moved objects are no longer valid. The garbage collector must revisit all the roots and update them to point to each object's new memory location. This causes a significant performance loss, which is also the main drawback of the managed heap.
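Compaction and the pointer fix-up that follows it can be sketched as below. This is a toy model, not the CLR's implementation: the `compact` function, the object sizes, and the root names `local_a` and `static_f` are all invented for illustration. For brevity it only rewrites roots; a real collector also has to update references stored inside the surviving objects.

```python
def compact(heap, marked, roots):
    """Slide marked objects to the start of the heap and fix up roots.

    heap   : list of (name, size) in address order
    marked : set of names that survived the mark phase
    roots  : dict mapping a root variable to the address it holds
    """
    # Addresses before compaction.
    old_address = {}
    offset = 0
    for name, size in heap:
        old_address[name] = offset
        offset += size

    # Slide survivors down and record where each one moved.
    new_heap, forwarding = [], {}
    next_obj_ptr = 0
    for name, size in heap:
        if name in marked:
            forwarding[old_address[name]] = next_obj_ptr
            new_heap.append((name, size))
            next_obj_ptr += size

    # Every root must be rewritten to the object's new address.
    new_roots = {var: forwarding[old] for var, old in roots.items()}
    return new_heap, new_roots, next_obj_ptr


heap = [("A", 16), ("B", 8), ("C", 24), ("D", 8), ("E", 16), ("F", 8)]
marked = {"A", "B", "D", "F"}
roots = {"local_a": 0, "static_f": 72}   # old addresses of A and F

new_heap, new_roots, next_obj_ptr = compact(heap, marked, roots)
print(new_heap, new_roots, next_obj_ptr)
```

In this run F moves from address 72 to 32, so the root `static_f` has to be rewritten, and the allocation pointer drops back to 40, leaving the freed space as one contiguous region at the end of the heap.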
Because of this cost, the algorithm the collector uses is itself a subject of study: waiting until the managed heap is completely full and then collecting all of it would be too slow.
Garbage Collection Algorithm: Generations
Generations are a mechanism used by the CLR garbage collector, and their sole purpose is to improve application performance. Collecting a single generation is clearly faster than collecting the whole heap.
The CLR managed heap supports three generations: generation 0, generation 1, and generation 2. Generation 0 has the smallest budget, measured in kilobytes; generation 1's budget is about 2 MB, and generation 2's is about 10 MB. Newly constructed objects are allocated in generation 0.
As the figure shows, when generation 0 fills up, the garbage collector starts a collection. Unreachable objects (C, E) are reclaimed, and the surviving objects are promoted to generation 1.
When generation 0 is full and generation 1 has also accumulated many unreachable objects and is close to full, both generations are collected. Among the survivors (reachable objects), those in generation 0 are promoted to generation 1, and those in generation 1 are promoted to generation 2.
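The promotion rules can be illustrated with a small simulation. It is a sketch under stated assumptions: the `collect` function, the object names, and the choice of which objects are reachable are all hypothetical, and generations are modeled as plain lists rather than regions of memory.

```python
def collect(generations, up_to, reachable):
    """Collect generations 0..up_to; survivors move up one generation.

    generations : list of three lists of object names [gen0, gen1, gen2]
    up_to       : oldest generation included in this collection
    reachable   : set of names found reachable by the mark phase
    """
    # Work from the oldest collected generation down so that an object
    # is promoted at most once per collection.
    for gen in range(up_to, -1, -1):
        survivors = [o for o in generations[gen] if o in reachable]
        target = min(gen + 1, 2)        # generation 2 is the oldest
        if target == gen:
            generations[gen] = survivors
        else:
            generations[target].extend(survivors)
            generations[gen] = []
    return generations


gens = [["A", "B", "C", "D", "E"], [], []]

# Generation 0 is full: collect it. C and E are unreachable.
collect(gens, up_to=0, reachable={"A", "B", "D"})
print(gens)

# New allocations land in generation 0 again.
gens[0] = ["F", "G", "H"]

# Later, both generation 0 and generation 1 are collected.
collect(gens, up_to=1, reachable={"A", "D", "F", "H"})
print(gens)
```

After the first collection A, B, and D sit in generation 1. After the second, A and D have reached generation 2, F and H are in generation 1, and generation 0 is empty again.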
The actual generational mechanism in the CLR is "smarter" than this. If newly created objects have short lifetimes, a generation 0 collection reclaims them right away, without waiting until the space is completely used up. Conversely, if a generation 0 collection finds that many objects are still reachable and little memory is freed, the collector increases the generation 0 budget. The effect of collection then changes: collections happen less often, but each one reclaims more memory. If still not enough memory has been freed, the garbage collector performs a full collection (all three generations). If that is still not enough, an OutOfMemoryException is thrown.
In other words, the garbage collector dynamically adjusts each generation's allocation budget based on how much memory each collection reclaims, so it tunes itself automatically.
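The idea of a self-adjusting budget can be sketched as follows. The policy here is purely hypothetical: the `next_budget` function, the thresholds, the doubling and halving, the limits, and the starting value are all invented for illustration and are not the CLR's actual heuristics.

```python
def next_budget(budget, bytes_reclaimed, low=0.25, high=0.75,
                min_budget=64 * 1024, max_budget=4 * 1024 * 1024):
    """Return a new generation 0 budget after a collection.

    All thresholds and limits are invented for illustration.
    """
    reclaimed_ratio = bytes_reclaimed / budget
    if reclaimed_ratio < low:
        # Most objects survived: collecting was not worth it, so grow the
        # budget and collect less often.
        budget *= 2
    elif reclaimed_ratio > high:
        # Almost everything died: a smaller budget keeps collections cheap.
        budget //= 2
    return max(min_budget, min(budget, max_budget))


budget = 256 * 1024                                        # arbitrary start
budget = next_budget(budget, bytes_reclaimed=20 * 1024)    # few bytes freed
print(budget)
budget = next_budget(budget, bytes_reclaimed=500 * 1024)   # most bytes freed
print(budget)
```

In this run the budget first doubles, because the collection freed very little, and then halves again once a collection frees almost everything.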
Summary
There is a basic idea behind garbage collection: most programming languages give the impression that unlimited memory is always available. Developers can keep allocating resources, as if by magic.
The .NET garbage collector works by first clearing unreachable objects using the basic mark-and-clear principle, then compacting the remaining memory much like defragmenting a disk, and finally using the generational algorithm to optimize performance.