C# GC management [Post]

Source: Internet
Author: User

I have always found resource management in C++ painful: you have to read a lot of best-practice guides to manage memory and other resources well. In contrast, .NET (the CLR) introduces a garbage collector (GC) to reclaim managed heap memory, which undoubtedly reduces the burden on developers. However, there is no free lunch. Understanding exactly how the GC works and getting a firm grip on resource management (especially where unmanaged resources are involved) is not easy for someone of ordinary intelligence like me.
Let's take a look at the CLR's resource management. First, memory allocation on the managed heap. Put simply, the managed heap allocates from one contiguous block: each allocation takes its space from the start of the free region and then moves the pointer forward to the start of the remaining free space, much like stack allocation. If allocation were the whole story, the managed heap would be astonishingly fast and .NET programmers would be very happy. Unfortunately, memory is limited, so the GC must run at suitable times to reclaim dead objects (that is, objects nobody uses any more).
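The pointer-bumping allocation described above can be sketched with a toy model. This is only an illustration of the idea, not how the CLR is implemented; the BumpHeap class and its Allocate method are hypothetical names invented for this sketch.

```csharp
using System;

// Toy model of "bump pointer" allocation. NOT the CLR's implementation;
// BumpHeap and Allocate are hypothetical names used for illustration only.
public sealed class BumpHeap
{
    private readonly byte[] memory;   // one contiguous block
    private int next;                 // start of the free region

    public BumpHeap(int size) { memory = new byte[size]; }

    // Returns the offset of the new block, or -1 when the heap is full
    // (the point at which a real runtime would start a collection).
    public int Allocate(int bytes)
    {
        if (bytes < 0 || bytes > memory.Length - next) return -1;
        int start = next;
        next += bytes;   // allocating is just moving the pointer forward
        return start;
    }
}

public static class BumpHeapDemo
{
    public static void Main()
    {
        var heap = new BumpHeap(64);
        Console.WriteLine(heap.Allocate(16)); // prints 0
        Console.WriteLine(heap.Allocate(16)); // prints 16
        Console.WriteLine(heap.Allocate(64)); // prints -1: only 32 bytes remain
    }
}
```

The cost of an allocation here is one comparison and one addition, which is why allocation alone would be so fast; all the difficulty lies in reclaiming space afterwards.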
Reclaiming managed heap memory is a complex process. In one sentence: each time the GC runs, it reclaims some of the rootless objects. A rootless object is one that can no longer be reached through any variable in the currently executing code. For example:
object a = new object();
a = null;
At this point the object created when a was initialized has become rootless, so it can be reclaimed. Does the GC really know which objects are rootless? In most cases we have to trust it, and believe that it will neither miss a rootless object nor mistake a rooted object for a rootless one.
Now for the collection algorithm, which is a bit more complicated. In short, it is based on the assumption that new objects have short lifetimes. (This assumption is quite reliable. Think about the code we write: objects in the outermost scope stay in use for a long time, while variables inside a for loop are often discarded right after use.) The GC therefore reclaims the memory of new objects first. More specifically, it uses a generational algorithm. By default, objects on the heap are divided into three generations: 0, 1, and 2. The older the generation, the more likely its objects are to remain in use; new objects are allocated in generation 0. The total space of each generation has a threshold, which the CLR adjusts dynamically based on how collections go. When the space occupied by generation 0 exceeds its threshold, the GC starts. It first collects generation 0 and promotes the survivors to generation 1; if generation 1 is then over its threshold, generation 1 is collected too (otherwise the collection ends), and so on. (I still have a question here: what happens when generation 2 is full? Is the heap enlarged, or does it spill into virtual memory?) Of course, this is a simplified description. The real algorithm considers many more factors and applies a series of optimizations, but most people do not need those details (and I do not understand them all either).
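To watch an object move through the generations, the System.GC class exposes a few query methods. The following is a small sketch; the class name GenerationDemo is made up for the example, and the generation numbers printed depend on the runtime, so no particular output is promised.

```csharp
using System;

public static class GenerationDemo
{
    public static void Main()
    {
        var obj = new object();
        Console.WriteLine("Max generation: " + GC.MaxGeneration);
        Console.WriteLine("After allocation: gen " + GC.GetGeneration(obj));

        GC.Collect();   // obj is still referenced below, so it survives
        Console.WriteLine("After one collection: gen " + GC.GetGeneration(obj));

        GC.Collect();
        Console.WriteLine("After two collections: gen " + GC.GetGeneration(obj));

        Console.WriteLine("Gen 0 collections so far: " + GC.CollectionCount(0));
        GC.KeepAlive(obj);   // keeps obj rooted up to this point
    }
}
```

On a typical run the surviving object is reported in a higher generation after each forced collection, which matches the promotion step described above.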
From this algorithm we can also see that the GC is particularly well suited to scenarios where large numbers of temporary objects are created and then all discarded, which is why it performs so well in ASP.NET.
The above describes the macro process: how the managed heap as a whole is allocated and collected. Now let's follow the allocation and collection of one specific object. Suppose the object is named faint (^_^). First, faint is allocated on the managed heap, in generation 0. Unfortunately, before faint is promoted to generation 1 it is discarded and becomes rootless. The GC boss then starts a collection, and faint certainly does not escape the boss's sharp eyes. The critical moment arrives. The boss asks it a life-or-death question (not actually evaluated on the spot; the answer is recorded in advance in a data structure): is your Finalize method the original Object.Finalize()? (Finalize is the method called when an object is reclaimed, and the Object class provides a default implementation. If any of faint's ancestors overrides Finalize, the method is no longer the original Object.Finalize.) If yes, faint is killed on the spot and ceases to exist. If not, faint stays on the heap until its Finalize method has run, and is killed at a later GC (it dies sooner or later all the same). The story is not well told, but the point to stress is this: if any class in an object's inheritance chain overrides Finalize, the object's memory is not reclaimed until at least the second GC after it becomes rootless. Understanding this will often lead to a sudden insight later.
Another question: when does the CLR start the GC, and how does it ensure that the state of memory does not change while the GC is working? The CLR's policy is to hijack the running threads at a suitable moment (called a safe point) and run the collection; all other threads are suspended until the collection completes. This undoubtedly costs performance, but Microsoft keeps working to reduce the overhead, and in ordinary use you will not notice it. In addition, Finalize methods are not called only during garbage collection: when the CLR unloads an AppDomain or shuts down, it also goes through the Finalize methods of all remaining objects before the memory is released.
After reading all this, we find that everything is done by the system: it allocates memory, identifies rootless objects, starts the GC, and runs the collection algorithm, all automatically. Can we influence any of it? Yes, of course. You can change the generation thresholds (I have not tried this myself, but it appears to be possible). The most common intervention is to call GC.Collect() to force a collection. (Sometimes GC.Collect() has to be called twice. Think about why; the answer is above ^_^.) In most cases this is not recommended, because it costs performance. The exception is when you really need to reclaim a lot of memory and a very time-consuming task is running anyway, so that the GC time is well hidden.
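The "two calls to GC.Collect()" remark can be sketched as follows. The Faint class, its FinalizedCount counter, and the MakeGarbage helper are hypothetical names for this illustration. What gets printed depends on the runtime and build configuration (for example, whether the JIT keeps temporaries alive), so treat this as an experiment to run rather than a guaranteed result.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

// Hypothetical class with a finalizer, named after the object in the story.
public class Faint
{
    public static int FinalizedCount;
    ~Faint() { Interlocked.Increment(ref FinalizedCount); }
}

public static class ForcedCollectionDemo
{
    // Creates the object in a separate, non-inlined method so that no local
    // variable in Main keeps it rooted.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static WeakReference MakeGarbage()
    {
        // trackResurrection: true keeps tracking the object until its memory is really gone.
        return new WeakReference(new Faint(), trackResurrection: true);
    }

    public static void Main()
    {
        WeakReference weak = MakeGarbage();

        GC.Collect();                    // first pass: the rootless Faint is queued for finalization
        GC.WaitForPendingFinalizers();   // let the finalizer thread run ~Faint()
        Console.WriteLine("Finalizers run: " + Faint.FinalizedCount);

        GC.Collect();                    // second pass: the memory can now be reclaimed
        Console.WriteLine("Still in memory: " + weak.IsAlive);
    }
}
```

The GC.WaitForPendingFinalizers() call between the two collections is what makes the second GC.Collect() useful: an object with a finalizer cannot be freed until that finalizer has actually run.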
Garbage collection is more than this simple process. There is a lot more to it, such as weak references and concurrent garbage collection. If you are interested, read the relevant chapters of the book .NET Framework Programming. It takes some patience, but you will definitely benefit from it.
What use is all this to an ordinary developer? At the very least, you can take comfort in knowing how the things you new up meet their end. Of course there are other benefits too. For example, I now understand the point of code like faint = null: it tells the system explicitly that my object is rootless and can be reclaimed (without waiting for the variable to go out of scope). I also will not sprinkle GC.Collect() calls around just to tease the CLR. Finally, it helps with understanding the management of unmanaged resources, which comes next.

PS: Writing this much has worn me out, so the article is split into two parts. The part on reclaiming unmanaged resources will follow tomorrow...

 

 

The first part described how the GC works. Note that the GC can only reclaim memory on the managed heap. Unmanaged resources, such as file handles, native buffers, and mutexes, cannot be reclaimed automatically by the GC; the developer must release them in code. (This sometimes makes resource management under the CLR feel awkward, because part of it is automatic and part is manual. Compared with C++, though, we should be content ^_^.)
A natural approach is to put the cleanup code in Finalize, so that unmanaged resources are released when the GC reclaims the managed object. (I will sometimes say simply that something is collected by the GC, meaning that the code runs when the GC collects the object.) However, this causes many problems, and the biggest is timing. Developers cannot determine when the GC will run (even calling GC.Collect() does not give you that control). That makes it impossible to know when the unmanaged resources will be released, or in what order the resources held by different objects will be released.
Therefore, unmanaged resources should be managed with what is called the Dispose pattern. First, a typical implementation of the pattern:
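using System;
using System.Runtime.InteropServices;

public class UnsafeSourceHolder : IDisposable
{
    private IntPtr buffer;         // raw unmanaged memory owned by this object
    private SafeHandle resource;   // managed wrapper around an unmanaged handle
    private bool disposed;

    public UnsafeSourceHolder()
    {
        buffer = ...;
        resource = ...;
        disposed = false;
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposed) return;

        // ReleaseBuffer stands for whatever native call frees the buffer.
        ReleaseBuffer(buffer);
        if (disposing)
        {
            // Only reached from the explicit Dispose() call, never from the finalizer.
            if (resource != null) resource.Dispose();
        }

        disposed = true;
    }

    // Finalizer: runs when the GC collects an object that was never disposed.
    ~UnsafeSourceHolder()
    {
        Dispose(false);
    }

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    public void Close()
    {
        Dispose();
    }

    public void DoSomething()
    {
        if (disposed) throw new ObjectDisposedException("released resources cannot be used");
    }
}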
From the code above, let's go through the key points of the Dispose pattern. First, if a class (UnsafeSourceHolder in the example) holds unmanaged resources (both buffer and resource stand for unmanaged resources in the example), it should implement the IDisposable interface, which has a single method, void Dispose(). Of course, if you do not implement the interface (for example, by putting the body of Dispose() into Close()), it still works in theory. However, there are at least two drawbacks:
1. You lose a shared vocabulary with the caller. By convention, a caller who sees that a class implements IDisposable knows that it holds unmanaged resources; without the interface, the caller may not realize this.
2. You lose convenient syntax support (for example, the using statement in C# calls the interface's Dispose method for you automatically).
So implementing the interface is necessary. Why is there also a Close() method? There are two reasons:
1. Callers are used to the name Close.
2. Some CLR classes implement Dispose() as an explicit interface method, so you have to call it as ((IDisposable)obj).Dispose(), which is cumbersome.
So, when a class holds unmanaged resources, it should implement IDisposable and expose the two APIs above for the caller to use.
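The syntax support mentioned above looks like this in practice. This is a minimal sketch: DemoHolder is a hypothetical stand-in that only records whether Dispose() was called, not a class that holds real resources.

```csharp
using System;

// Hypothetical stand-in for a resource holder; it only tracks its own state.
public sealed class DemoHolder : IDisposable
{
    public bool IsDisposed { get; private set; }
    public void Dispose() { IsDisposed = true; }
}

public static class UsingDemo
{
    public static DemoHolder Run()
    {
        DemoHolder seen;
        using (var holder = new DemoHolder())
        {
            seen = holder;
            Console.WriteLine("Inside using: disposed = " + holder.IsDisposed);
        } // the compiler-generated finally block calls holder.Dispose() here
        return seen;
    }

    public static void Main()
    {
        // Prints "Inside using: disposed = False", then "After using: disposed = True".
        Console.WriteLine("After using: disposed = " + Run().IsDisposed);
    }
}
```

Because the call sits in a finally block, Dispose() also runs when the body of the using statement throws, which is hard to get right consistently with a hand-written Close() call.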
Speaking of callers, there are several ways for the user of such an object to trigger cleanup. One is to call Dispose or Close explicitly (or let the using syntax call it behind the scenes), telling the object to release its resources. The other is to let the GC release them automatically. What? Didn't I say the GC cannot reclaim unmanaged resources? Yes, but I have overridden (the equivalent of) the Finalize() method. You didn't see it? Look at ~UnsafeSourceHolder. This is not a C++ destructor. Writing this function in the example above is equivalent to writing the following code:
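// Conceptual equivalent only: the C# compiler generates this from
// ~UnsafeSourceHolder() and rejects a hand-written override of Finalize.
protected override void Finalize()
{
    try
    {
        Dispose(false);
    }
    finally
    {
        base.Finalize();
    }
}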
The reason for the destructor-like syntax is that many developers did not follow the rules when writing Finalize by hand, so that an exception thrown in the finalizer brought the process down or leaked resources. So please write it this way, uncomfortable as it makes many C++ programmers. (Personally I find this design a bit superfluous: if a developer would not write the try-finally code, how can you be sure they will write the destructor form?)
Look at the code carefully, however, and you will see that the two paths differ. The real work is done by Dispose(bool disposing): the public Dispose() method calls Dispose(true), while the finalizer run by the GC calls Dispose(false). What is the difference? To explain it, I divide unmanaged resources into two categories (this is my own way of putting it, and it may be flawed). The first is an unmanaged resource owned exclusively by one object, represented by buffer in the example. The second is an unmanaged resource shared by many objects, represented by resource in the example. An exclusive resource must be released when its owner is released, so it is released even on the GC path. A shared resource, however, cannot be released on the GC path, because the order in which the GC finalizes objects is not defined, and there is no way to confirm that no other object is still using the resource. Only the caller knows that, so only an explicit call to Dispose() or Close() releases it. In other words, an explicit call releases all unmanaged resources used by the object, while the automatic path releases only the ones the object owns exclusively. (Put another way, resource is itself a managed wrapper object with its own finalizer. Because finalization order is not guaranteed, the finalizer path must not touch other managed objects, which is why resource.Dispose() runs only when disposing is true.)
But what if the caller makes a mistake and uses the object again after its unmanaged resources have been released? The object must then be able to tell the caller: are you crazy? In the example, the disposed field records whether the resources have been released, and an ObjectDisposedException is thrown on illegal use (this is also part of what the Dispose pattern recommends).
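The use-after-release check can be tried out in isolation. In this minimal sketch, GuardedResource is a hypothetical class that holds no real resource and only models the disposed flag.

```csharp
using System;

// Hypothetical class: no real resource, just the guard logic.
public sealed class GuardedResource : IDisposable
{
    private bool disposed;

    public void DoSomething()
    {
        if (disposed) throw new ObjectDisposedException(nameof(GuardedResource));
        Console.WriteLine("working");
    }

    public void Dispose() { disposed = true; }
}

public static class GuardDemo
{
    // Returns true if using the object after Dispose() throws as expected.
    public static bool ThrowsAfterDispose()
    {
        var r = new GuardedResource();
        r.DoSomething();   // fine: prints "working"
        r.Dispose();
        try { r.DoSomething(); return false; }
        catch (ObjectDisposedException) { return true; }
    }

    public static void Main()
    {
        Console.WriteLine(ThrowsAfterDispose()); // prints True
    }
}
```

Failing loudly at the point of misuse is far easier to debug than letting the call go through to a handle that has already been freed.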
The code above also contains GC.SuppressFinalize(this), which tells the GC not to call the object's finalizer. I was not very clear about the reason at first. The usual explanation is that the object has already been cleaned up explicitly, so there is no need to clean it up automatically. That seemed wrong to me, because the explicit cleanup covers only unmanaged resources, not the managed memory. After checking MSDN and some books, the better explanation is this: SuppressFinalize only stops the finalizer from being run; the GC still reclaims the object's managed memory as usual once it is rootless. Since Dispose(true) has already released the resources, running the finalizer again would be redundant, and suppressing it also means the object no longer has to survive an extra collection while waiting for finalization. Note that the suppression stays in effect unless GC.ReRegisterForFinalize is called for the object.
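The effect of GC.SuppressFinalize can be checked with a small experiment. The Tracked class and its FinalizerRan flag are hypothetical names used only for this sketch.

```csharp
using System;

// Hypothetical class that records whether its finalizer ever ran.
public sealed class Tracked : IDisposable
{
    public static bool FinalizerRan;

    ~Tracked() { FinalizerRan = true; }

    public void Dispose()
    {
        // A real class would release its resources here first.
        GC.SuppressFinalize(this);
    }
}

public static class SuppressDemo
{
    public static void Main()
    {
        new Tracked().Dispose();         // disposed explicitly, then dropped
        GC.Collect();
        GC.WaitForPendingFinalizers();
        Console.WriteLine("Finalizer ran: " + Tracked.FinalizerRan); // prints "Finalizer ran: False"
    }
}
```

Removing the GC.SuppressFinalize(this) line and running the experiment again is a quick way to see the finalizer path come back.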
That is the basic structure of the Dispose pattern. To summarize:
1. Implement the IDisposable interface.
2. Provide a Close() method that calls Dispose() internally.
3. Implement a Dispose(bool disposing) method that handles the different kinds of unmanaged resources.
4. Implement the destructor (it should really be called a pseudo-destructor ^_^) so that cleanup happens automatically when the GC collects the object.
5. Throw an ObjectDisposedException when released resources are used (and make sure that cleaning up the same object more than once is harmless).
6. Call GC.SuppressFinalize so that the finalizer does not repeat the explicit cleanup.
Of course, you can refine the implementation as needed (for example, by locking the resources during cleanup), but the overall idea and the basic interface should not change. When you write a class that manages unmanaged resources, follow this pattern. If your class holds no unmanaged resources, however, do not add the pattern just to have something to do. ^_^
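For readers who want something that compiles, here is one possible runnable variant of the pattern. It is a sketch under stated assumptions, not the original author's class: NativeBuffer is a hypothetical name, and the unmanaged resource is a block of native memory obtained with Marshal.AllocHGlobal. Since the class holds no other managed objects, both paths through Dispose(bool) do the same work here.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical example class; the unmanaged resource is native memory.
public class NativeBuffer : IDisposable
{
    private IntPtr buffer;    // unmanaged memory owned exclusively by this object
    private bool disposed;

    public NativeBuffer(int size) { buffer = Marshal.AllocHGlobal(size); }

    public void WriteByte(int offset, byte value)
    {
        if (disposed) throw new ObjectDisposedException(nameof(NativeBuffer));
        Marshal.WriteByte(buffer, offset, value);
    }

    public byte ReadByte(int offset)
    {
        if (disposed) throw new ObjectDisposedException(nameof(NativeBuffer));
        return Marshal.ReadByte(buffer, offset);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposed) return;             // a second call is harmless
        Marshal.FreeHGlobal(buffer);      // exclusive resource: freed on both paths
        buffer = IntPtr.Zero;
        disposed = true;
    }

    ~NativeBuffer() { Dispose(false); }   // safety net if the caller forgets Dispose()

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);        // cleanup is done; skip the finalizer
    }

    public void Close() { Dispose(); }
}

public static class NativeBufferDemo
{
    public static void Main()
    {
        using (var buf = new NativeBuffer(16))
        {
            buf.WriteByte(0, 42);
            Console.WriteLine(buf.ReadByte(0)); // prints 42
        }
    }
}
```

The caller never sees the IntPtr, so the only way to leak the memory is to drop the object without disposing it, and in that case the finalizer still frees it eventually.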
