Garbage Collection (GC): .NET Automatic Memory Management, Part 2: Memory Algorithms
Objective
In .NET, the GC relieves developers of the burden of tracking memory usage and deciding when to release memory. However, you may still want to understand how the GC works under the hood. This series of articles explains how memory resources are allocated and managed, with detailed descriptions of the underlying algorithms. We will also discuss the GC's cleanup process: what gets cleaned up, and how to force a cleanup.
Memory algorithms
The GC checks whether any objects in the heap are no longer being used by the program. If such objects exist, the memory they occupy can be reclaimed. (If no free memory is left in the heap, the new operator throws an OutOfMemoryException.) How does the GC know whether an object is still being used by the program? As you might imagine, this is not an easy question to answer.
Each program has a set of roots. Roots identify storage locations that refer either to objects in the managed heap or to null. For example, all global and static object pointers in a program are considered part of the program's roots. In addition, any local variable or parameter object pointer on a thread's stack is considered part of the roots. Finally, any CPU register that contains a pointer to an object in the managed heap is also part of the roots. The list of active roots is maintained by the JIT compiler and the CLR, and is made accessible to the GC's algorithm.
When the GC runs, it assumes that all objects in the heap are garbage. In other words, it assumes that none of the program's roots refer to any object in the heap. The GC then starts walking the roots and builds a graph of all objects reachable from them. For example, the GC might locate a global variable that points to an object in the heap.
The figure shows a heap with several allocated objects, where the program's roots refer directly to objects A, C, D, and F. All of these objects become part of the graph. When object D is added, the GC notices that it also refers to object H, so object H is added to the graph as well. The GC continues to walk through all reachable objects recursively.
Objects allocated on the managed heap (for background on the heap and the stack, see "Diagrams of C# heaps and stacks"):
Once this part of the graph is complete, the GC checks the next root and walks its objects recursively in the same way, until the graph is finished. The difference is that if, while walking from object to object, the GC reaches an object that is already in the graph, it stops walking down that path; walks along other paths continue, and each stops in the same way, until every reachable object has been added. This serves two purposes. First, it significantly improves performance, because the GC never walks through an object or a group of objects more than once. Second, it prevents infinite loops when objects refer to each other in a cycle.
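The walk described above can be sketched in a few lines of C#. This is only an illustrative model, not the CLR's implementation: the Node class, the Marker.Mark method, and the object names are hypothetical, and an explicit stack is used in place of recursion. The already-visited check is what stops the walk on a path it has seen before, including the D/H cycle added here for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a heap object: a name plus outgoing references.
class Node
{
    public string Name;
    public List<Node> References = new List<Node>();
    public Node(string name) { Name = name; }
}

static class Marker
{
    // Returns the set of nodes reachable, directly or indirectly, from the roots.
    public static HashSet<Node> Mark(IEnumerable<Node> roots)
    {
        var reachable = new HashSet<Node>();
        var pending = new Stack<Node>();
        foreach (var root in roots)
        {
            if (root != null) pending.Push(root); // a root may refer to null
        }
        while (pending.Count > 0)
        {
            var current = pending.Pop();
            // Add returns false when the node is already in the graph:
            // stop walking down this path.
            if (!reachable.Add(current)) continue;
            foreach (var next in current.References) pending.Push(next);
        }
        return reachable;
    }
}

static class Program
{
    static void Main()
    {
        var a = new Node("A"); var b = new Node("B"); var c = new Node("C");
        var d = new Node("D"); var f = new Node("F"); var h = new Node("H");
        d.References.Add(h);
        h.References.Add(d); // a cycle, to show that the walk still terminates

        var live = Marker.Mark(new[] { a, c, d, f });
        foreach (var node in new[] { a, b, c, d, f, h })
        {
            Console.WriteLine(node.Name + ": " + (live.Contains(node) ? "reachable" : "garbage"));
        }
    }
}
```

In this model, B is the only object reported as garbage, because no root and no reachable object refers to it.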
Once all the roots have been checked, the GC's graph contains every object that is reachable, directly or indirectly, from the program's roots. Any object that is not in the graph can never be accessed by the program and is therefore considered garbage. The GC now walks through the heap linearly, looking for contiguous blocks of memory occupied by garbage objects. It then shifts the non-garbage objects down into the memory occupied by the garbage objects (using the standard memcpy function), removing all the gaps in the heap. Of course, moving objects in memory invalidates every pointer to them, because their addresses change. The GC must therefore modify the program's roots so that all affected pointers point to the objects' new addresses. In addition, if any object contains a pointer to another object, the GC is responsible for correcting that pointer as well. You can also refer to another of my articles: "C# Heap(ing) vs Stack(ing)", Part 6, on understanding garbage collection (GC) and improving program performance. The following shows the managed heap after a garbage collection has been performed:
Managed heap after a garbage collection:
After all the garbage objects have been reclaimed, all the non-garbage objects have been compacted, and all the pointers to non-garbage objects have been fixed up, NextObjPtr is positioned just after the last non-garbage object. At this point, the new operator is tried again, and the resource requested by the program is created successfully.
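The compaction step described above can be modeled roughly as follows. This is a simplified sketch under stated assumptions, not the CLR's code: HeapObject, Compactor.Compact, and the use of -1 for a null root are all hypothetical, objects are represented only by an address and a size (no bytes are actually copied), the heap list is assumed to be sorted by address, and a prior marking pass is assumed to have set Marked so that surviving objects only point to other surviving objects.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical heap object: where it lives, how big it is, what it points to.
class HeapObject
{
    public int Address;
    public int Size;
    public bool Marked;                          // set by an earlier marking pass
    public List<int> Pointers = new List<int>(); // addresses of referenced objects
}

static class Compactor
{
    // Slides surviving objects toward address 0, fixes up all pointers,
    // and returns the new NextObjPtr (first free address after the survivors).
    public static int Compact(List<HeapObject> heap, int[] roots)
    {
        // Pass 1: walk the heap linearly and compute each survivor's new address.
        var forwarding = new Dictionary<int, int>();
        int nextObjPtr = 0;
        foreach (var obj in heap)
        {
            if (!obj.Marked) continue;
            forwarding[obj.Address] = nextObjPtr;
            nextObjPtr += obj.Size;
        }

        // Pass 2: drop the garbage, move the survivors, and correct the
        // pointers that objects hold to other objects.
        heap.RemoveAll(o => !o.Marked);
        foreach (var obj in heap)
        {
            obj.Address = forwarding[obj.Address];
            for (int i = 0; i < obj.Pointers.Count; i++)
            {
                obj.Pointers[i] = forwarding[obj.Pointers[i]];
            }
        }

        // Pass 3: correct the roots (-1 stands for a null root in this model).
        for (int i = 0; i < roots.Length; i++)
        {
            if (roots[i] >= 0) roots[i] = forwarding[roots[i]];
        }
        return nextObjPtr;
    }
}
```

For example, with survivors of sizes 10, 10, and 5 separated by garbage, Compact places them at addresses 0, 10, and 20 and returns 25 as the new NextObjPtr.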
As you can see, a garbage collection incurs a noticeable performance hit, which is the major disadvantage of using a GC. However, keep in mind that a collection only occurs when the heap is full, and until then the managed heap is significantly faster than the C runtime heap. The GC also provides some optimizations that greatly improve the performance of garbage collection. Subsequent articles will discuss how the GC optimizes garbage collection.
A few very important points should be made here. You no longer need to write code to manage the lifetime of the resources your program uses, and the two bugs mentioned at the beginning of the article no longer exist. First, it is no longer possible to leak resources, because any resource that is not reachable from the program's roots (that is, garbage) will be reclaimed. Second, it is no longer possible to access a resource that has been freed, because a resource is never freed while it is still reachable, and if a resource is not reachable, the program has no way to access it.
The following code shows how resources are allocated and managed:
using System;
using System.Collections;

class Application {
    public static int Main(string[] args) {
        // Create an ArrayList object in the heap; myArray is now a root
        ArrayList myArray = new ArrayList();

        // Create 10,000 objects in the heap
        for (int x = 0; x < 10000; x++) {
            myArray.Add(new Object());
        }

        // Right now, myArray is a root (on the thread's stack).
        // So myArray and the 10,000 objects it refers to are reachable
        Console.WriteLine(myArray.Count);

        // After the last reference to myArray in the code (the
        // Console.WriteLine call above), myArray is no longer a root.
        // The method does not have to return first: the JIT compiler knows
        // that myArray stops being a root after its last reference.

        // Because myArray is no longer a root, all 10,001 objects are
        // unreachable and are considered garbage.
        // However, they continue to exist until the GC reclaims them.
        return 0;
    }
}
If the GC is so good, why doesn't C++ use it? The reason is that a GC must be able to identify the program's roots and must be able to find all object pointers. C++ allows a pointer to be cast from one type to another, so there is no way to know what a pointer actually points to. In the common language runtime (CLR), the managed heap always knows the exact type of an object, and metadata is used to determine which members of an object refer to other objects.
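As a rough illustration of the kind of information type metadata carries, the reflection API can list which fields of a type hold object references. This is only an analogy, assuming that reflection is an acceptable stand-in: the GC does not go through the reflection API, and the Customer type and ReferenceFields.Of helper are hypothetical. The sketch is also simplified: it ignores references nested inside struct fields and private fields declared on base classes.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical type mixing reference-typed and value-typed fields.
class Customer
{
    public string Name = "";
    public int Age;
    public Customer Referrer;
    public DateTime Joined;
}

static class ReferenceFields
{
    // Returns the names of instance fields whose declared type is a reference type.
    public static List<string> Of(Type type)
    {
        var names = new List<string>();
        var flags = BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;
        foreach (FieldInfo field in type.GetFields(flags))
        {
            // Value types (int, DateTime, ...) are stored inline, not as references.
            if (!field.FieldType.IsValueType) names.Add(field.Name);
        }
        return names;
    }
}
```

Calling ReferenceFields.Of(typeof(Customer)) yields Name and Referrer but not Age or Joined. No comparable, always-available type information exists for an arbitrary C++ pointer.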
Summary
This article described the garbage collection algorithm in the .NET Framework and briefly touched on how it differs from C and C++. Understanding the algorithm shows why the managed heap is fast and why a collection itself is expensive, which gives you a clearer picture of the performance of your own programs. In the next article I will continue this introduction to automatic memory management and the GC with finalization.
Translated from: http://msdn.microsoft.com/en-us/magazine/bb985010.aspx