Garbage collection mechanism in .NET

Source: Internet
Author: User


1. Automatic Memory Management and GC
In a traditional (unmanaged) program, heap allocation works like this: find the first free block of memory that is large enough, and allocate it. When the program no longer needs the data, the programmer must release the memory manually. On systems without memory protection the heap is effectively shared, so one process can overwrite the memory of another. This is why a poorly designed program could bring down the whole operating system. Seemingly random crashes are often caused by improper memory management as well, whether in our own program or in an external one. Another familiar example is the game trainer, which directly modifies a game's memory to achieve an "invincible" effect. With this in mind, it is easy to imagine how dangerous corrupted memory addresses are, and why some C++ programmers get a headache whenever pointers are mentioned. In addition, if the programmer never releases a block of memory, it cannot be reused for as long as the program keeps running (on systems that do not reclaim a process's memory at exit, not until the computer restarts). This is what we call a memory leak.
In managed code, the CLR achieves code isolation through AppDomains to avoid these memory management problems: one AppDomain generally cannot read or write the memory of another AppDomain. The GC (garbage collector) is responsible for releasing managed memory. We will discuss the GC in detail, but first we should look at how memory is allocated in managed code. Allocation on the managed heap is sequential: each new object is placed directly after the previous one. This makes allocation faster than in a traditional program, but the GC takes part of that gain back. Why? The answer will be clear once you have read how the GC works.
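To make "sequential allocation" concrete, here is a minimal sketch of a bump-pointer allocator. It is a toy model only: the BumpAllocator class and its members are hypothetical names invented for this illustration, and the real CLR heap is far more elaborate.

```csharp
using System;

// Toy model of sequential ("bump pointer") allocation.
// Illustration only; this is not how the CLR is implemented.
public sealed class BumpAllocator
{
    private readonly byte[] _heap;
    private int _next; // offset of the first free byte

    public BumpAllocator(int capacity)
    {
        _heap = new byte[capacity];
    }

    // Returns the offset of the new block, or -1 when the heap is full
    // (the point at which a real runtime would start a collection).
    public int Allocate(int size)
    {
        if (size < 0 || size > _heap.Length - _next)
        {
            return -1;
        }
        int offset = _next;
        _next += size; // no search for a free block, just advance the pointer
        return offset;
    }
}

public static class BumpAllocatorDemo
{
    public static void Main()
    {
        var heap = new BumpAllocator(64);
        Console.WriteLine(heap.Allocate(16)); // 0
        Console.WriteLine(heap.Allocate(32)); // 16
        Console.WriteLine(heap.Allocate(32)); // -1 (only 16 bytes left)
    }
}
```

Allocation here is a bounds check and an addition, which is why it is fast. The price is that free space must stay in one contiguous run at the end, so something has to close the gaps left by dead objects.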
2. How the GC works
First, we need to know that we cannot predict when an object in managed code will be collected (unless GC.Collect is called to force a collection, which is not recommended; I will explain why later). The GC performs a collection when it "feels like it". Many factors trigger this, such as running low on memory, and the timing is chosen to make allocation and collection more efficient. What if we use a destructor? That does not help either, because the destructor concept does not exist in .NET. It becomes a finalizer, which is discussed later. For now, remember that an object can be collected only when there are no references to it. To illustrate this, see the following code:

    object objA = new object();
    object objB = objA;
    objA = null;
    // Force a collection.
    GC.Collect();
    objB.ToString();

Here, the object originally referenced by objA is not collected, because it still has another reference, objB. An object can be collected only once no references to it remain.
When the GC performs a collection, it goes through the following steps for each object:
1. Confirm that the object has no references.
2. Check whether the object has a record in the Finalizer table (the CLR calls this the finalization queue). If there is no record, the object's memory is released. If there is a record, it is moved to another table, which we will call Finalizer2 here (in the CLR, the freachable queue), and the memory is not released yet. The finalizers of the objects in the Finalizer2 table are executed on a separate finalizer thread, and each record is deleted from the table once its finalizer has run.
When an object is created, the GC checks whether it has a finalizer. If so, a record is added to the Finalizer table. The "record" mentioned here is really just a pointer to the object. If you read these steps carefully, you will see that an object with a finalizer is not reclaimed by the first collection. It is reclaimed only after more than one collection, which takes longer, so I recommend not writing a finalizer unless you absolutely need one.
The GC uses the concept of generations to make collection more efficient. The principle is as follows: objects created before the first collection belong to generation 0. Every time an object survives a collection, it is promoted to the next generation. So the objects that were in generation 0 and survived the first collection are in generation 1 by the time of the second collection, while objects created after the first collection and before the second belong to generation 0. The GC first tries to collect generation 0, because these objects are the newest and therefore the most likely to be garbage. For example, objects referenced only by a function's local variables become unreferenced (collectable) when the function returns. If collecting generation 0 frees enough memory, the GC stops there. If not, it goes on to generation 1, and if that is still not enough, to generation 2, and so on. There is a maximum generation number, which depends on the Framework version and can be read from GC.MaxGeneration. After memory is reclaimed, the GC compacts the heap so that there are no gaps between the surviving objects. This is necessary because the CLR allocates memory sequentially, so there must be no free holes between allocated blocks. Now we can see that every collection costs CPU time, which is why I generally do not call GC.Collect manually.
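Promotion between generations can be observed with GC.GetGeneration. The sketch below forces collections only for demonstration purposes. The exact numbers printed depend on the runtime and its GC settings, so no output is claimed here. GenerationDemo is a hypothetical name chosen for this example.

```csharp
using System;

public static class GenerationDemo
{
    public static void Main()
    {
        object survivor = new object();
        Console.WriteLine("Max generation: " + GC.MaxGeneration);
        Console.WriteLine("Before any collection: " + GC.GetGeneration(survivor));

        GC.Collect(); // survivor is still referenced, so it survives this collection
        Console.WriteLine("After one collection: " + GC.GetGeneration(survivor));

        GC.Collect();
        Console.WriteLine("After two collections: " + GC.GetGeneration(survivor));

        GC.KeepAlive(survivor); // make sure the reference stays live up to this point
    }
}
```

The generation number reported for survivor never exceeds GC.MaxGeneration, however many collections it survives.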
When we use the destructor syntax, the compiler automatically rewrites it as a protected override of Object.Finalize(). This method is what I call the finalizer. As its name says, it is used to finish things up, not to destroy (destruct) things. In Visual Basic you write the Finalize method directly, so Visual Basic programmers need not worry about this. C# programmers must use the destructor syntax to write a finalizer, but do not be confused by it: .NET has no destructors. In C++ we know exactly when a destructor will run, but in .NET we cannot know when a finalizer will run, because it runs only some time after the first collection that finds the object unreferenced. We also cannot know the order in which finalizers run. In the same situation, the finalizer of A may run before that of B, or after it. In other words, the code in a finalizer must not depend on any timing or ordering logic. The following example counts the instances of a class. It shows the difference between a finalizer and a destructor, and the timing error that results from relying on a finalizer:

    using System;

    public class CountObject {
        public static int Count = 0;
        public CountObject() {
            Count++;
        }
        ~CountObject() {
            Count--;
        }
    }
    public class Program {
        static void Main() {
            CountObject obj;
            for (int i = 0; i < 5; i++) {
                obj = null; // This step is redundant; it is written only for clarity.
                obj = new CountObject();
            }
            // Count will not be 1, because finalizers are not triggered immediately. They run only after a collection.
            Console.WriteLine(CountObject.Count);
            Console.ReadLine();
        }
    }

Note that if the code above were written in C++, it would leak memory, because we never use the delete operator to free the objects. In managed code there is no leak: the GC automatically detects and reclaims unreferenced objects. I recommend writing a finalizer only when implementing the IDisposable interface. Otherwise, avoid it (special cases aside).
3. Object resurrection
What? Can a collected object be "revived"? That's right, although the wording is not quite accurate. Let's look at the code:

    using System;

    public class Resurrection {
        public int Data;
        public Resurrection(int data) {
            this.Data = data;
        }
        ~Resurrection() {
            // Storing "this" in a static field makes the object reachable again.
            Program.Instance = this;
        }
    }
    // Named Program because a C# class cannot contain a member with the class's own name.
    public class Program {
        public static Resurrection Instance;
        public static void Main() {
            Instance = new Resurrection(1);
            Instance = null;
            GC.Collect();
            GC.WaitForPendingFinalizers();
            // See? Here it has been "revived".
            Console.WriteLine(Instance.Data);
            Instance = null;
            GC.Collect();
            Console.ReadLine();
        }
    }

You may ask: "Since this object can be revived, will it still be collected in the end?" Yes. "Why?" Let's walk through what the GC does and you will understand.
1. Collect runs and checks references. No problem: the object is no longer referenced.
2. When the instance was created, a record was added to the Finalizer table, so the check finds that the object has a finalizer.
3. Because a finalizer is found, the record is moved to the Finalizer2 table.
4. The object has a record in the Finalizer2 table, so its memory is not released.
5. Collect finishes. Because we call GC.WaitForPendingFinalizers, we wait until the finalizers of all objects in the Finalizer2 table have run.
6. After the finalizer has run, Instance references our object again. (Revived)
7. We remove all references again.
8. Collect runs and checks references. No problem.
9. Because the record was deleted from the Finalizer table last time, the object is not found to have a finalizer this time.
10. The object has no record in the Finalizer2 table, so its memory is released.
Release of unmanaged resources
So far we have talked about the management of managed memory. What if we use unmanaged resources such as database connections and files? In that case we need the standard IDisposable interface of the .NET Framework. By convention, every class that requires manual release of unmanaged resources must implement this interface. The interface has only one method, Dispose(), but there are accompanying guidelines on how to implement it, which I will explain here. A class that implements IDisposable should have the following structure:

    public class Base : IDisposable {
        public void Dispose() {
            this.Dispose(true);
            GC.SuppressFinalize(this);
        }
        protected virtual void Dispose(bool disposing) {
            if (disposing) {
                // "Managed class" part: release managed members here
            }
            // Release unmanaged resources
        }
        ~Base() {
            this.Dispose(false);
        }
    }
    public class Derive : Base {
        protected override void Dispose(bool disposing) {
            if (disposing) {
                // "Managed class" part: release managed members here
            }
            // Release unmanaged resources
            base.Dispose(disposing);
        }
    }

Why is it designed like this? I will explain in a moment. First, here are the rules for implementing the Dispose method:
1. It must not throw exceptions, and calling it more than once must not throw either. That is, if I have already called Dispose on an object, calling it a second time must not cause an error. Put simply, the second call should do nothing. This can be achieved with a flag or with a few if checks.
2. The Dispose of an object must release all resources of that object.
Take a derived class as an example. The derived class uses unmanaged resources, so it implements IDisposable. If its base class also uses unmanaged resources, those must be released too. How can the base class's resources be released from the derived class? With a virtual (Overridable in Visual Basic) method, which ensures that every Dispose in the hierarchy is called. This is why we have the virtual Dispose(bool) method. Note that the derived class must release its own resources first and then those of the base class.
Because unmanaged resources must be guaranteed to be released, we define a finalizer in case the programmer forgets to call Dispose. The design above does exactly that. If Dispose has been called manually, there is no need to keep the finalizer, so Dispose calls GC.SuppressFinalize to remove the object from the Finalizer table, which lets the object be reclaimed sooner.
So what are disposing and the "Managed class" part? In the "Managed class" part, write the code for all managed members that you want to make collectable when Dispose is called. Remember that we do not know when managed memory will be released? Here we only drop the references to member objects so that they can be collected; we do not free their memory directly. The "Managed class" part must also handle all member objects that implement IDisposable. Because they have a Dispose of their own, we must call it from our object's Dispose, which is how the second rule is satisfied. The disposing parameter distinguishes how Dispose was invoked. If we call it manually, the "Managed class" part must run to satisfy the second rule. But if Dispose is called by the finalizer, the object is no longer referenced, so its members are typically unreferenced as well (and may already have been finalized). There is no need to run the "Managed class" part, because those members are already eligible for collection.
That is all about the IDisposable interface. To recap: we may have assumed that Dispose frees memory immediately, which is wrong. Only unmanaged resources are released immediately. Managed memory is still reclaimed by the GC, so we do not need to worry about it.
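The rule that a repeated Dispose call must do nothing is usually implemented with a boolean flag. The following is a minimal sketch under that assumption. LogWriter, its _buffer member and the DisposeCount property are hypothetical names made up for this illustration; DisposeCount exists only to make the effect visible.

```csharp
using System;
using System.IO;

// Hypothetical example class; all names here are illustrative.
public class LogWriter : IDisposable
{
    private MemoryStream _buffer = new MemoryStream(); // managed member that implements IDisposable
    private bool _disposed;                            // guards against repeated calls

    public int DisposeCount { get; private set; }      // counts how often cleanup really ran

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (_disposed)
        {
            return; // second and later calls do nothing
        }
        if (disposing)
        {
            _buffer.Dispose(); // dispose members that implement IDisposable
            _buffer = null;    // drop the reference so the GC can reclaim it
        }
        // Unmanaged resources would be released here.
        DisposeCount++;
        _disposed = true;
    }

    ~LogWriter()
    {
        Dispose(false);
    }
}

public static class DisposeDemo
{
    public static void Main()
    {
        var writer = new LogWriter();
        using (writer)
        {
            // work with writer; Dispose is called when the block ends
        }
        writer.Dispose(); // safe: the flag makes this a no-op
        Console.WriteLine(writer.DisposeCount); // 1
    }
}
```

The using statement is the usual way to guarantee that Dispose is called, even when an exception is thrown inside the block.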
4. Use of weak references
An assignment such as a = b creates what we call a strong reference. The GC checks strong references to determine whether an object can be collected. There is another kind of reference, WeakReference, which does not prevent the GC from collecting the object, and that is exactly what makes it useful. How is that useful, you may ask? Suppose we have a very fat object, one that occupies a lot of memory. We have finished using it and intend to drop its reference so that the GC can reclaim the memory, but shortly afterwards we need the object again. There seems to be no choice but to create a new instance, and creation is slow. Is there a way around this? One option is to simply keep the object in memory, which makes reuse fast. But we do not want the fat object to occupy memory all the time, and we do not want to keep creating new fat instances either, because that is very time-consuming. So what should we do? Clever readers have surely guessed that the solution is a weak reference. We create a weak reference to the fat object. The GC can then reclaim it when memory runs short, so it does not tie up memory, and we can still use the object again as long as the GC has not yet reclaimed it. Here is an example:

    using System;

    public class Fat {
        public int Data;
        public Fat(int data) {
            this.Data = data;
        }
    }
    // Named Program because a C# class cannot contain a member with the class's own name.
    public class Program {
        public static void Main() {
            Fat oFat = new Fat(1);
            WeakReference oFatRef = new WeakReference(oFat);
            // From here on, the Fat object can be collected.
            oFat = null;
            if (oFatRef.IsAlive) {
                Console.WriteLine(((Fat)oFatRef.Target).Data); // 1
            }
            // Force a collection.
            GC.Collect();
            Console.WriteLine(oFatRef.IsAlive); // False once the object has been collected
            Console.ReadLine();
        }
    }

Here our Fat is not really fat, but it shows the point of the example: how to use weak references. What if Fat has a finalizer? In that case we may want the other WeakReference constructor, which takes a parameter called trackResurrection. If it is true, we can use the weak reference for as long as Fat's memory has not been released. That is, we can still get Fat back after its finalizer has run (equivalent to recovering Fat after the first collection). If trackResurrection is false, the Fat object cannot be recovered after the first collection.
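The "reuse it if it is still there, otherwise create it again" idea can be wrapped in a small cache. The sketch below uses the generic WeakReference<T> class and its TryGetTarget method, which hands back a strong reference in one step. FatCache and the Created counter are hypothetical names added for this illustration.

```csharp
using System;

// Hypothetical stand-in for an object that is expensive to create.
public class Fat
{
    public static int Created; // counts constructions, for the demo only
    public int Data;

    public Fat(int data)
    {
        Data = data;
        Created++;
    }
}

// Hands out the cached instance while it is alive; otherwise rebuilds it.
public class FatCache
{
    private readonly WeakReference<Fat> _ref = new WeakReference<Fat>(null);

    public Fat Get()
    {
        Fat fat;
        if (_ref.TryGetTarget(out fat))
        {
            return fat; // still in memory: reuse it
        }
        fat = new Fat(1); // collected or never created: build it again
        _ref.SetTarget(fat);
        return fat;
    }
}

public static class FatCacheDemo
{
    public static void Main()
    {
        var cache = new FatCache();
        Fat first = cache.Get();
        Fat second = cache.Get(); // first is still strongly referenced here
        Console.WriteLine(ReferenceEquals(first, second)); // True
        Console.WriteLine(Fat.Created);                    // 1
        GC.KeepAlive(first);
    }
}
```

Callers hold a strong reference only while they are using the object. Once they drop it, the cache alone does not keep the object alive, so the GC remains free to reclaim the memory.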
5. Summary
The main points of this article are:
An object can be collected only when there are no references to it.
An object's memory is not released immediately. The GC reclaims it at a time of its own choosing. In general, do not force a collection.
Unless you have a special need, do not write a finalizer.
Do not put timing-dependent logic in a finalizer.
Implement the IDisposable interface in any class that holds unmanaged resources or has members that implement IDisposable.
Write your Dispose code according to the Dispose design given above.
When using a fat object, consider using a weak reference.
