GC, the path to freedom

This article describes .NET memory management: how the GC works, how memory is released, and the various ways of releasing it. It uses a number of examples to show how to work with and optimize collection, and it discusses some precautions.

Contents

Introduction
Automatic memory management and GC
How the GC works
The decline of the destructor and the birth of the finalizer
Object resurrection
The release of unmanaged resources
Using weak references
Summary
References
About the author

Introduction

As .NET programmers, we know that memory management for managed code is automatic. .NET ensures that everything our managed programs allocate is eventually released, which saves us programmers a lot of trouble. We might even stop thinking about memory management altogether, since .NET takes care of everything anyway. That makes sense, up to a point. The problem is that when we use unmanaged resources, .NET cannot manage them automatically, because unmanaged code is not controlled by the CLR (Common Language Runtime) and lies outside its scope. So how do we deal with these unmanaged resources? And how does .NET manage and release managed resources?

Automatic memory management and GC

In a traditional (native) program, heap memory is allocated as follows: find the first free memory address with enough space, then allocate the memory there. When the program no longer needs the information in that memory, the programmer has to release it manually. The heap is shared: unless something prevents it, one piece of code can overwrite memory that belongs to another. This is why many poorly designed programs could even bring the operating system down. Programs that die inexplicably (at random) are often victims of improper memory management as well, either in our own program or in an external one. Another common example is the game trainers that many people have seen.
These trainers directly modify the game's memory to achieve effects such as invincibility. With this in mind, we can imagine how dangerous it is when memory addresses get mixed up, and we can also see why (some) C++ programmers get a headache at the mention of pointers. In addition, if the programmer does not release memory manually, it cannot be allocated again until the process ends (or, on systems that do not reclaim a process's memory, until the computer restarts). This is what we call a memory leak.

In managed code, the CLR isolates code through AppDomains to avoid these memory management problems: in general, one AppDomain cannot read or write the memory of another. The GC (garbage collector) is responsible for releasing managed memory. We will look at the GC in more detail shortly, but first a word about how memory is allocated in managed code. Allocation on the managed heap is sequential: objects are placed one after another. This makes allocation faster than in a native program, but the time saved is paid back when the GC runs. Why? You will know the answer once you have read how the GC works.

How the GC works

First, we need to accept that we cannot know exactly when objects in managed code will be collected (unless GC.Collect is used to force a collection, which is not recommended; I will explain why later). The GC collects whenever it "feels like it". Many factors play a part, for example memory running low, and the aim is to make allocation and collection more efficient. What if we use a destructor? That does not help either, because the destructor concept does not exist in .NET. It has become the finalizer, which is discussed later. For now, remember that an object can be collected only when there are no references to it. To illustrate, see the following code:
[C#]
object objA = new object();
object objB = objA;
objA = null;
// Force a collection.
GC.Collect();
objB.ToString();

[Visual Basic]
Dim objA As New Object()
Dim objB As Object = objA
objA = Nothing
' Force a collection.
GC.Collect()
objB.ToString()
In the objA/objB example, the object originally referenced by objA is not collected, because it still has another reference, objB. An object can be collected only once nothing references it. When the GC collects, it goes through the following steps:

1. Determine that the object has no references.
2. Check whether the object has a record in the finalizer table.
3. If it has, move the record to another table, which we will call finalizer2 here.
4. If the object has no record in the finalizer2 table, release its memory.
5. The finalizers of the objects in the finalizer2 table are executed on a separate, dedicated thread, and their records are then removed from the table.

(Microsoft's documentation calls these two tables the finalization queue and the freachable queue.)

When an object is created, the runtime checks whether it has a finalizer. If so, a record is added to the finalizer table. The record we are talking about is really just a pointer to the object. Reading these steps carefully, we find that an object with a finalizer is not reclaimed by the first collection that finds it unreferenced: its memory is released only after more than one collection. This is a longer path, so I recommend not writing a finalizer unless you absolutely need one. To prove that the GC really works this way, and that this is not just the author's nonsense, we will give an example in the section on object resurrection! ^_^

The GC uses the concept of generations to make collection more efficient. The principle is as follows: objects created before the first collection belong to generation 0. Each time a generation is collected, its survivors move up one generation. That is, the survivors of the original generation 0 become generation 1 after the first collection, and objects created between the first and second collections belong to the new generation 0. The GC tries generation 0 first, because these are the newest objects and therefore the most likely to be garbage. For example, the local variables of a function are no longer referenced (and can be collected) once the function exits.
If collecting generation 0 frees enough memory, the GC stops there. If not, it tries generation 1, and if that is still not enough, generation 2. The number of generations has an upper limit that depends on the runtime version and can be read from GC.MaxGeneration. After reclaiming memory, the GC compacts the heap so that there are no gaps between the surviving objects. It has to, because the CLR allocates memory sequentially and therefore cannot leave free holes between allocations. Now we know that every collection costs CPU time. This is the reason I mentioned for generally not calling GC.Collect manually (unless you are writing GC examples like me! ^_^).

The decline of the destructor and the birth of the finalizer

For Visual Basic programmers the destructor is an unfamiliar concept, so the first part of this section is aimed mainly at C++ programmers. We know that in C++, when an object is deleted (delete), the code in its destructor runs immediately to release memory (or do other cleanup). In .NET, however, because of the way the GC works, destructors do not really exist. In fact, when we use destructor syntax in C#, the compiler rewrites it as protected override void Finalize(), with the body wrapped so that the base class Finalize is always called. This method is what I call the finalizer. As the name says, it is meant to finish things up (finalize), not to destroy them (destruct). In Visual Basic it is written directly as a Finalize method, so Visual Basic programmers have nothing to worry about. C# programmers must use destructor syntax to write a finalizer, but should not be misled by it: .NET has no destructors. In C++ we know exactly when a destructor runs. In .NET we cannot know when a finalizer runs, because it runs only at some point after the first collection that finds the object unreferenced. Nor can we know the order in which finalizers run. Under identical conditions, the Finalize of A may run before that of B, or after it.
In other words, the code in a finalizer must not depend on timing or ordering. The following example counts the instances of a class to show the difference between a finalizer and a destructor and to expose a timing error in a finalizer. Because Visual Basic has no destructor syntax, the example is in C# only:
[C#]
using System;

public class CountObject
{
    public static int Count = 0;

    public CountObject()
    {
        Count++;
    }

    ~CountObject()
    {
        Count--;
    }
}

public class Program
{
    static void Main()
    {
        CountObject obj;
        for (int i = 0; i < 5; i++)
        {
            obj = null; // Redundant; written only for clarity.
            obj = new CountObject();
        }
        // Count will not be 1 here: a finalizer is not triggered immediately,
        // only after a collection has taken place.
        Console.WriteLine(CountObject.Count);
        Console.ReadLine();
    }
}
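The generation mechanism described earlier can also be watched directly. The following is only a minimal sketch: GenerationDemo and Observe are names made up for this illustration, and GC.Collect is called purely to make the effect visible. It records the generation of one object that stays referenced across several forced collections.

```csharp
using System;

// Hypothetical demo class, not part of the original article.
public static class GenerationDemo
{
    // Returns the generation of one surviving object: first right after
    // allocation, then after each of GC.MaxGeneration forced collections.
    public static int[] Observe()
    {
        object survivor = new object();
        int[] seen = new int[GC.MaxGeneration + 1];
        seen[0] = GC.GetGeneration(survivor);
        for (int i = 1; i <= GC.MaxGeneration; i++)
        {
            GC.Collect();                        // forced only for the demonstration
            seen[i] = GC.GetGeneration(survivor);
        }
        GC.KeepAlive(survivor);                  // keep the object referenced until here
        return seen;
    }

    public static void Main()
    {
        Console.WriteLine("Highest generation: " + GC.MaxGeneration);
        int[] seen = Observe();
        for (int i = 0; i < seen.Length; i++)
        {
            Console.WriteLine("After " + i + " collection(s): generation " + seen[i]);
        }
    }
}
```

On the runtimes I have tried, the object starts in generation 0 and moves up one generation per collection until it reaches GC.MaxGeneration, but the exact numbers are the runtime's decision, so treat the output as an observation rather than a guarantee.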
Note that if the CountObject example were written in C++, it would leak memory, because the objects are never cleaned up with the delete operator. In managed code nothing leaks: the GC automatically detects and reclaims unreferenced objects. I recommend writing a finalizer only when implementing the IDisposable interface, and not otherwise (there may be special cases). The section on the release of unmanaged resources covers the IDisposable interface in more detail.

Object resurrection

Now let's play Jesus! What is object resurrection? Can a collected object be "revived"? Yes, although that description is not quite accurate. Let's look at the code:
[C#]
using System;

public class Resurrection
{
    public int Data;

    public Resurrection(int data)
    {
        this.Data = data;
    }

    ~Resurrection()
    {
        // Store a reference to the dying object in a static field.
        Program.Instance = this;
    }
}

// Named Program because a C# member cannot share its enclosing type's name.
public class Program
{
    public static Resurrection Instance;

    public static void Main()
    {
        Instance = new Resurrection(1);
        Instance = null;
        GC.Collect();
        GC.WaitForPendingFinalizers();
        // See? The object has been "resurrected" here.
        Console.WriteLine(Instance.Data);
        Instance = null;
        GC.Collect();
        Console.ReadLine();
    }
}

[Visual Basic]
Public Class Resurrection
    Public Data As Integer

    Public Sub New(ByVal data As Integer)
        Me.Data = data
    End Sub

    Protected Overrides Sub Finalize()
        ' Store a reference to the dying object in a shared field.
        Program.Instance = Me
        MyBase.Finalize()
    End Sub
End Class

Public Class Program
    Public Shared Instance As Resurrection

    Public Shared Sub Main()
        Instance = New Resurrection(1)
        Instance = Nothing
        GC.Collect()
        GC.WaitForPendingFinalizers()
        ' See? The object has been "resurrected" here.
        Console.WriteLine(Instance.Data)
        Instance = Nothing
        GC.Collect()
        Console.ReadLine()
    End Sub
End Class
You may ask: "Since the object in the resurrection example can be revived, will it ever be reclaimed by the end of the program?" Yes. "Why?" Let's walk through what the GC does and you will see what is going on.

1. The first Collect runs and checks references. No problem: the object is no longer referenced.
2. When the instance was created, a record was added to the finalizer table, so the check finds that the object has a finalizer.
3. Because a finalizer was found, the record is moved to the finalizer2 table.
4. The object has a record in the finalizer2 table, so its memory is not released.
5. Collect finishes. Because we call GC.WaitForPendingFinalizers, we wait until the finalizers of all objects in the finalizer2 table have run.
6. Once the finalizer has run, our static instance field references the object again (it is resurrected).
7. We remove all references again.
8. The second Collect runs and checks references. No problem.
9. The record was removed from the finalizer table last time, so this time no finalizer is found for the object.
10. The object has no record in the finalizer2 table either, so its memory is released.

Now that you understand the mechanism, let me tell you what resurrection is good for. Well, um... OK, I don't know. In fact resurrection is of little use and is quite dangerous. It looks like nothing more than a loophole in the GC mechanism (look up GC.ReRegisterForFinalize and think about why it is a loophole). I suggest you forget about resurrection and avoid using it. You may ask: "Then why tell us about it?" I am telling you to help you better understand how the GC works! ^_^

The release of unmanaged resources

Until now we have talked about the management of managed memory. What if we use unmanaged resources such as database connections and files? For that we use the standard IDisposable interface of the .NET Framework. By convention, every class that needs to release unmanaged resources manually must implement this interface.
The IDisposable interface has only one method, Dispose(), but there is an accompanying set of guidelines on how to implement it, which I will explain here. A class that implements IDisposable should have the following structure:
[C#]
using System;

public class Base : IDisposable
{
    public void Dispose()
    {
        this.Dispose(true);
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposing)
        {
            // "Managed class" part: release managed members here.
        }
        // Release unmanaged resources here.
    }

    ~Base()
    {
        this.Dispose(false);
    }
}

public class Derive : Base
{
    protected override void Dispose(bool disposing)
    {
        if (disposing)
        {
            // "Managed class" part: release managed members here.
        }
        // Release this class's unmanaged resources here, then the base class's.
        base.Dispose(disposing);
    }
}

[Visual Basic]
Public Class Base
    Implements IDisposable

    Public Overloads Sub Dispose() Implements IDisposable.Dispose
        Me.Dispose(True)
        GC.SuppressFinalize(Me)
    End Sub

    Protected Overloads Overridable Sub Dispose(ByVal disposing As Boolean)
        If disposing Then
            ' "Managed class" part: release managed members here.
        End If
        ' Release unmanaged resources here.
    End Sub

    Protected Overrides Sub Finalize()
        Me.Dispose(False)
        MyBase.Finalize()
    End Sub
End Class

Public Class Derive
    Inherits Base

    Protected Overloads Overrides Sub Dispose(ByVal disposing As Boolean)
        If disposing Then
            ' "Managed class" part: release managed members here.
        End If
        ' Release this class's unmanaged resources here, then the base class's.
        MyBase.Dispose(disposing)
    End Sub
End Class
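To show how a class with this structure is typically consumed, here is a minimal sketch. LogResource, CleanupCount and DisposeDemo are hypothetical names invented for the illustration. The sketch adds a simple flag so that a second Dispose call does nothing, and it uses the C# using statement, which calls Dispose automatically when the block is left.

```csharp
using System;

// Hypothetical resource holder, used only for illustration.
public class LogResource : IDisposable
{
    private bool disposed;      // set once cleanup has run
    public int CleanupCount;    // counts how often cleanup really ran

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposed) return;   // second and later calls do nothing
        if (disposing)
        {
            // release managed members here
        }
        // release unmanaged resources here
        CleanupCount++;
        disposed = true;
    }

    ~LogResource()
    {
        Dispose(false);
    }
}

public static class DisposeDemo
{
    public static void Main()
    {
        LogResource kept;
        using (LogResource res = new LogResource())
        {
            kept = res;
            // work with res
        }                       // Dispose() is called here, even if an exception was thrown
        kept.Dispose();         // a repeated call is harmless
        Console.WriteLine(kept.CleanupCount);   // 1
    }
}
```

The flag is just one way to make repeated calls safe; several separate if checks around each resource would work as well.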
Why is the Base/Derive structure designed like this? I will explain in a moment. First, here are the principles for implementing the Dispose method:

1. Dispose must not throw any errors, and repeated calls must not throw errors either. That is, if I have already called Dispose on an object, the program must not fail when I call Dispose a second time. Put simply, the second call should do nothing. This can be achieved with a flag or with several if checks.
2. The Dispose of an object must release all the resources of that object. Take a derived class as an example. The derived class uses unmanaged resources, so it implements IDisposable. If its base class also uses unmanaged resources, those must be released as well. How can the derived class release the base class's resources? With a virtual/Overridable method, of course, which ensures that every Dispose in the hierarchy is called. This is why we have the virtual/Overridable Dispose method. Note that the derived class's resources must be released first and the base class's resources afterwards.
3. Because unmanaged resources must be guaranteed to be released correctly, we define a finalizer in case the programmer forgets to call Dispose. The design above does exactly this. If we do call Dispose manually, there is no need to keep the finalizer, so in Dispose we use GC.SuppressFinalize to remove the object from the finalizer table, which makes collection faster.

So what are disposing and the "Managed class" part for? The "Managed class" part is where you write the cleanup of all managed members that should become releasable when Dispose is called. Do you remember that we do not know when managed memory will be released? Here we only remove the references to the member objects so that they can be collected; we do not release their memory directly. The "Managed class" part is also where we deal with all member objects that implement IDisposable.
Because those members have a Dispose of their own, we must call it from our object's Dispose; this is how the second principle is satisfied. The disposing parameter distinguishes how Dispose was called. If we call it manually, the "Managed class" part must run in order to satisfy the second principle. If Dispose is called by the finalizer, however, the object no longer has any references, so its members are no longer reachable through it either, and there is no need to run the "Managed class" part, because the members are already eligible for collection. That is all about the IDisposable interface. Looking back, we may have thought that Dispose releases memory immediately. That is wrong. Only unmanaged memory is released immediately. Managed memory that we let go of is handled by the GC, so we do not need to worry about it.

Using weak references

When we write A = B, we call this a strong reference. The GC checks strong references to decide whether an object can be collected. There is another kind of reference, the weak reference (WeakReference), which does not prevent the GC from collecting the object. That is exactly what makes it useful. How useful, you ask? Suppose we have a very fat object, one that takes up a lot of memory. We have finished using it and want to drop our reference so that the GC can reclaim the memory, but before long we need the object again. There is nothing for it but to create a new instance. And why is creating it so slow? Is there a way around this? Yes: just leave the object in memory, and getting it back is fast! But we do not want this fat object to occupy memory all the time, and we do not want to keep creating new fat instances either, because that is very time-consuming. So what should we do? Clever readers will already have guessed that the solution is weak references. That's it.
We can create a weak reference to the fat object. The GC can then reclaim the object whenever it needs to (a weak reference does not keep the object alive, so it does not tie up memory), and until the GC has actually reclaimed it we can still use the object again. Here is an example:
[C#]
using System;

public class Fat
{
    public int Data;

    public Fat(int data)
    {
        this.Data = data;
    }
}

public class Program
{
    public static void Main()
    {
        Fat oFat = new Fat(1);
        WeakReference oFatRef = new WeakReference(oFat);
        // From here on, the Fat object can be collected.
        oFat = null;
        if (oFatRef.IsAlive)
        {
            Console.WriteLine(((Fat)oFatRef.Target).Data); // 1
        }
        // Force a collection.
        GC.Collect();
        Console.WriteLine(oFatRef.IsAlive); // False
        Console.ReadLine();
    }
}

[Visual Basic]
Public Class Fat
    Public Data As Integer

    Public Sub New(ByVal data As Integer)
        Me.Data = data
    End Sub
End Class

Public Module Program
    Sub Main()
        Dim oFat As New Fat(1)
        Dim oFatRef As New WeakReference(oFat)
        ' From here on, the Fat object can be collected.
        oFat = Nothing
        If oFatRef.IsAlive Then
            Console.WriteLine(DirectCast(oFatRef.Target, Fat).Data) ' 1
        End If
        ' Force a collection.
        GC.Collect()
        Console.WriteLine(oFatRef.IsAlive) ' False
        Console.ReadLine()
    End Sub
End Module
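The "use it again if it is still there, otherwise recreate it" idea described above can be sketched as a small cache. This is only an illustration under my own assumptions: FatCache, Get and the Created counter are hypothetical names, and the Fat class here is a stand-in with an extra counter so that we can see how often the slow path runs.

```csharp
using System;

// Hypothetical expensive object, standing in for the "fat" object.
public class Fat
{
    public static int Created;      // how many instances have been built
    public readonly int Data;

    public Fat(int data)
    {
        Data = data;
        Created++;
    }
}

// Hypothetical helper: hands out the cached Fat while it is still alive
// and rebuilds it otherwise.
public class FatCache
{
    private readonly WeakReference cached = new WeakReference(null);

    public Fat Get()
    {
        // Copy Target into a strong reference first. Checking IsAlive and
        // reading Target separately could race with a collection in between.
        Fat fat = cached.Target as Fat;
        if (fat == null)
        {
            fat = new Fat(1);       // slow path: recreate the object
            cached.Target = fat;
        }
        return fat;
    }
}

public static class CacheDemo
{
    public static void Main()
    {
        FatCache cache = new FatCache();
        Fat first = cache.Get();
        Fat second = cache.Get();
        Console.WriteLine(ReferenceEquals(first, second)); // True
        Console.WriteLine(Fat.Created);                    // 1
        GC.KeepAlive(first);        // first stays strongly referenced until here
    }
}
```

While the caller holds a strong reference, every Get returns the same instance. Once all strong references are gone, the GC is free to reclaim the object, and the next Get simply pays the creation cost again.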
The Fat class in the weak reference example is not really fat, but it serves the purpose of the example: showing how to use weak references. What if Fat has a finalizer? In that case we may want the other WeakReference constructor, which takes a parameter called trackResurrection. If it is true, we can get at Fat for as long as its memory has not been released. That is, even after Fat's finalizer has run we can still recover the Fat object (in effect, recover it after the first collection). If trackResurrection is false, the Fat object cannot be recovered after the first collection.

Summary

That is the main content of this article. Key points:

1. An object is collected only when there are no references to it.
2. The memory of an object is not released immediately. The GC reclaims it at a time of its own choosing.
3. In general, do not force collections.
4. Unless there is a special need, do not write a finalizer.
5. Do not write code in a finalizer that depends on timing or ordering.
6. Implement the IDisposable interface in any class that holds unmanaged resources or has members with a Dispose method.
7. Write your own Dispose code according to the Dispose design given above.
8. When working with a fat object, consider using weak references.

That is all for now. I hope this knowledge of the GC makes your code more stable, more concise, and faster! More importantly, may you have no more memory management problems, whether managed or unmanaged!

References

1. GC class: ms-help://MS.VSCC/MS.MSDNVS.2052/cpref/html/frlrfsystemgcclasstopic.htm
2. Programming for garbage collection: ms-help://MS.VSCC/MS.MSDNVS.2052/cpguide/html/cpconprogrammingessentialsforgarbagecollection.htm
3. Curso de Erik, septima Entrega (Spanish): http://guille.costasol.net/NET/cursoCSharpErik/Entrega7/Entrega7.htm

Author: kefroth