Garbage collection (GC) mechanisms you should understand [C#]

Last Update:2018-12-06 Source: Internet

Author: User

As a C++ and C# programmer, I was initially skeptical about garbage collection (GC) and doubted that it could run stably and efficiently. I have to admit that I have since grown used to letting the GC "run together" with my programs and have come to rely on it; the keyword "delete", once the usual culprit behind my bugs, has gradually become unfamiliar. Practice, however, showed that trusting the GC too much led to many unexpected errors, which pushed me to learn how the GC actually operates. So I read books, looked up references, and ended up with a reasonably complete (though far from deep) understanding of the GC. Some people may ask: "Is there any value in studying the internals of the GC? We develop applications. Our customers' machines are well equipped, and memory is not a problem." That view assumes that "garbage collection = memory release". In fact, the real trouble is seldom the amount of memory: beyond releasing memory, the GC quietly carries out other complicated work on our behalf, such as the cleanup and release of unmanaged resources. If you do not know the basics of how the GC works but have no time to read through a lot of technical material, this article may help you.
The following describes how resources are allocated and released.

1. Allocation of managed resources

At run time, the CLR manages a region of memory address space (virtual address space, which is mapped to physical memory at run time). It is divided into two parts, the "managed heap" and the "stack". The stack holds local variables, which for value types means the data itself. It is reclaimed automatically when the method returns, so it lies outside the scope of garbage collection. The managed heap stores reference-type objects (together with any value-type fields they contain) and is where garbage collection does its work.

The managed heap is a contiguous address space, and the allocated objects are laid out one after another, much like the elements of an array.

NextObjPtr is a pointer maintained by the managed heap. It holds the start address at which the next object will be allocated. It advances with every allocation (and moves again when garbage is collected), so it always points to the next free address.

Here we may as well compare the efficiency of this allocation scheme with C++ (skip this if efficiency does not interest you), which should also dispel some doubts that C++ programmers have about the CLR's allocation speed. To find free memory, the CLR only needs to set aside a block of the requested size at NextObjPtr for the object's data, compute the new free address, and reset NextObjPtr. A typical C/C++ heap allocator, by contrast, has to walk a linked list of memory blocks to find one of a suitable size and then update the list. This easily produces memory fragmentation, which lowers allocation performance. The .NET allocation method is clearly more efficient, but that efficiency comes at a price: the GC has to do the work of keeping the heap compact.
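
The pointer-bumping allocation described above can be sketched with a toy model. This is only an illustration under stated assumptions, not how the CLR is implemented: the heap is simulated with a byte array, and the names BumpHeap and Allocate are hypothetical.

```csharp
using System;

// Toy model of a bump-pointer heap. BumpHeap and Allocate are hypothetical names.
public sealed class BumpHeap
{
    private readonly byte[] _memory;

    // Plays the role of NextObjPtr: the offset of the first free byte.
    public int NextObjPtr { get; private set; }

    public BumpHeap(int capacity)
    {
        _memory = new byte[capacity];
    }

    // Reserves `size` bytes and returns their start offset, or -1 when the
    // heap is full (the point at which a real runtime would start a collection).
    public int Allocate(int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
        if (size > _memory.Length - NextObjPtr) return -1;

        int start = NextObjPtr;
        Array.Clear(_memory, start, size); // zero-initialize the new block
        NextObjPtr += size;                // bump the pointer past the block
        return start;
    }
}

public static class BumpHeapDemo
{
    public static void Main()
    {
        var heap = new BumpHeap(64);
        Console.WriteLine(heap.Allocate(16)); // 0
        Console.WriteLine(heap.Allocate(24)); // 16
        Console.WriteLine(heap.NextObjPtr);   // 40
        Console.WriteLine(heap.Allocate(32)); // -1, only 24 bytes are left
    }
}
```

Each allocation is a bounds check and an addition, with no list to search. The sketch never frees anything, which is exactly the part a garbage collector has to supply.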

2. Identifying garbage

To collect garbage, you must first know what counts as garbage. The GC finds it by traversing the "roots" of the application. We can think of a root as a pointer to the memory address of a reference-type object. If an object cannot be reached from any root, that is, nothing references it any more, it is a candidate for collection.

Note that an object may be classified as garbage, or even collected by the GC, before its scope ends! An object can stop being referenced partway through its lifetime, and if a garbage collection runs at that moment, the unlucky object may be marked as garbage and destroyed. (Why only "may"? Because it is not necessarily within the range the GC is currently examining. The details come later, in the discussion of object generations.)
using System;

class Program
{
    public static void Main()
    {
        // Note: a string literal is interned and stays alive in practice;
        // picture any freshly allocated object in its place.
        string sGarbage = "I'm here";

        // The code below no longer references sGarbage, so the object is already garbage
        // (and of course code like this is garbage too).
        // If a garbage collection runs at this point, sGarbage may already be gone.

        Console.WriteLine("Main() is end");
    }
}
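
One way to watch reachability decide an object's fate is with a weak reference, which observes an object without acting as a root. The following is a minimal sketch; ReachabilityDemo and CreateUnrooted are hypothetical names. The result for the unrooted object depends on the runtime and build settings, so no output is promised for it.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class ReachabilityDemo
{
    // Allocates the object in its own stack frame so that no local variable
    // in the caller can keep it alive.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference CreateUnrooted()
    {
        return new WeakReference(new object());
    }

    public static void Main()
    {
        object rooted = new object();
        var weakRooted = new WeakReference(rooted);
        WeakReference weakUnrooted = CreateUnrooted();

        GC.Collect();
        GC.WaitForPendingFinalizers();

        // Guaranteed True: GC.KeepAlive below keeps `rooted` reachable up to that call.
        Console.WriteLine("rooted alive:   " + weakRooted.IsAlive);
        // Usually False, but the runtime makes no promise about when it collects.
        Console.WriteLine("unrooted alive: " + weakUnrooted.IsAlive);

        GC.KeepAlive(rooted);
    }
}
```

Without the GC.KeepAlive call, an optimizing JIT is free to treat `rooted` as dead right after its last use, which is the early-collection effect described above.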

3. Object generations

Although the GC always works quietly for us, it was built by people, and people are lazy, so it is lazy too. To reduce its workload it tries to narrow the area it has to examine. It assumes that recently created objects tend to be short-lived, so it concentrates on the memory area that holds them and leaves the rest alone for the time being. To rank objects by how long they have survived, the GC introduces the concept of "generations".

The first objects created after the CLR initializes are generation 0 objects. The CLR sets a capacity limit for generation 0. When the total size of the objects created exceeds that limit, the GC starts working, but only on the generation 0 memory area: it searches for garbage objects there and releases their memory. The objects that survive are promoted to generation 1 and kept in the generation 1 region. Objects created after that form a new batch of generation 0 objects until the generation 0 area fills up again and another collection of that area takes place; its survivors also join generation 1. Generation 1 has a capacity limit of its own. Once the size of the generation 1 objects exceeds it, the GC widens the battlefield and collects generation 1 as well, and the survivors of that collection are promoted to generation 2.

It follows that an object may meet every condition for being garbage and yet survive: if it belongs to generation 1 (or even generation 2) and that generation has not yet exceeded its limit, the GC will not examine it during a collection.

In addition, the GC learns from experience as it works and automatically adjusts the capacity of each generation's area to the characteristics of the application, so that it can work more efficiently.
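
Promotion between generations can be observed with GC.GetGeneration. This is a small sketch; GenerationDemo and ObserveGenerations are hypothetical names. Forcing collections with GC.Collect is done here only for demonstration, and the exact numbers printed depend on the runtime.

```csharp
using System;

public static class GenerationDemo
{
    // Records the generation of one object when it is created and after
    // each of two forced collections that it survives.
    public static int[] ObserveGenerations()
    {
        object o = new object();
        var seen = new int[3];

        seen[0] = GC.GetGeneration(o);
        GC.Collect();
        seen[1] = GC.GetGeneration(o);
        GC.Collect();
        seen[2] = GC.GetGeneration(o);

        GC.KeepAlive(o); // keep the object reachable until this point
        return seen;
    }

    public static void Main()
    {
        Console.WriteLine("MaxGeneration: " + GC.MaxGeneration);
        Console.WriteLine(string.Join(" -> ", ObserveGenerations()));
    }
}
```

On common runtimes the second line typically shows the object climbing one generation per collection it survives and stopping at GC.MaxGeneration.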

Garbage collection mechanisms you should understand (2)

For most applications, the main reason to understand garbage collection is not to economize on memory but to control unmanaged resources properly, and those problems usually have nothing to do with how much memory you have. When should a file be closed? What should you watch out for when closing it? What happens if you forget to close it? These questions need careful thought no matter how much memory you have :)

For this kind of cleanup we cannot rely on the GC, because it does not know what we want done at release time, or even that anything needs to be done! We have to write the cleanup code ourselves. Microsoft has, of course, provided a framework for this in the form of two methods, Finalize and Dispose. They stand for the two ways of cleaning up unmanaged resources: automatic and manual.

I. Finalize

Finalize resembles a C++ destructor. In code it is written exactly like a C++ destructor, but the way it gets called is very different.

~ClassName()
{
    // release your unmanaged resources
}

For example, suppose class A implements a Finalize method. When an object a of A is created (strictly speaking, before its constructor is called), a pointer to it is inserted into the finalization list. When the GC runs, it looks through the finalization list for pointers to garbage objects. If a has become garbage, its pointer is moved to the freachable queue, and a dedicated high-priority thread is woken up to walk the freachable queue and call the Finalize method of every object in it. At that point the unmanaged resources held by a have been released (provided you implemented Finalize correctly), but the memory occupied by a is not reclaimed until the next GC. An object that implements Finalize therefore needs two collections before it is completely released.

Finalize is called by the GC, so it is an automatic way of releasing resources. Two problems deserve attention, however. First, because there is no telling when the GC will run, an object's resources may stay unreleased for a long time, which is unacceptable for some critical resources. Second, the thread that calls Finalize does not guarantee the order in which the objects are finalized, which can cause subtle dependency problems. Suppose Finalize of object a references object b, and both a and b implement Finalize. If b is finalized first, calling Finalize on a goes wrong, because it references a resource that has already been released. In a Finalize method, therefore, avoid referencing other objects that implement Finalize.
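
The timing problem can be made visible with a counter that the finalizer increments. This is a minimal sketch; Tracked and FinalizerDemo are hypothetical names, and the result depends on the runtime: the second line usually prints 1, but nothing guarantees that the object is collected by the forced GC.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

public sealed class Tracked
{
    private static int _finalized;

    // Number of Tracked instances whose finalizer has run so far.
    public static int FinalizedCount => Volatile.Read(ref _finalized);

    // Runs on the finalizer thread, at a time chosen by the runtime.
    ~Tracked()
    {
        Interlocked.Increment(ref _finalized);
    }
}

public static class FinalizerDemo
{
    // Creates an object and immediately drops the only reference to it.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void CreateAndDrop()
    {
        new Tracked();
    }

    public static void Main()
    {
        CreateAndDrop();
        Console.WriteLine("before collect: " + Tracked.FinalizedCount);

        GC.Collect();                  // queues the unreachable object for finalization
        GC.WaitForPendingFinalizers(); // blocks until the finalizer thread has drained its queue

        Console.WriteLine("after collect:  " + Tracked.FinalizedCount);
    }
}
```

Without the two GC calls, the finalizer would run whenever the runtime next decides to collect, possibly not until the process exits. That uncertainty is the first problem described above.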

Clearly this "automatic" way of releasing resources does not meet our needs: we cannot call it explicitly (only the GC calls it), and it creates dependency problems. We need more precise control over resource release.

II. Dispose

Dispose is a method provided for us to call explicitly. Because implementing Dispose is error-prone, some books (such as Effective C# and Applied Microsoft .NET Framework Programming) give a specific implementation pattern:

using System;

class DisposePattern : IDisposable
{
    private System.IO.FileStream fs = new System.IO.FileStream("test.txt", System.IO.FileMode.Create);

    ~DisposePattern()
    {
        Dispose(false);
    }

    #region IDisposable Members

    public void Dispose()
    {
        Dispose(true);

        // Tell the GC that the Finalize method no longer needs to be called,
        // because the resources have already been cleaned up.
        GC.SuppressFinalize(this);
    }

    #endregion

    protected virtual void Dispose(bool disposing)
    {
        // Dispose may be called from several threads, so lock to keep it thread-safe.
        lock (this)
        {
            if (disposing)
            {
                // Called from Dispose(), so the Finalize method of this object has not run;
                // it is safe here to reference other objects that implement Finalize.
                if (fs != null)
                {
                    fs.Dispose();
                    fs = null; // mark the resource as released to avoid releasing it twice
                }
            }

            // Release raw unmanaged resources (handles, native memory) here, on both paths.
        }
    }
}

The comments explain the details. One more point: if DisposePattern itself derived from a base class B that already implements Dispose, DisposePattern would only need to override B's Dispose method that takes a parameter; it would not need to re-implement the parameterless Dispose or the Finalize method. The same holds for a class derived from DisposePattern, whose Dispose looks like this:

class DerivedClass : DisposePattern
{
    protected override void Dispose(bool disposing)
    {
        lock (this)
        {
            try
            {
                // Clean up this class's own unmanaged resources;
                // the implementation follows the same pattern as DisposePattern.
            }
            finally
            {
                base.Dispose(disposing);
            }
        }
    }
}
Of course, if DerivedClass has no resources of its own to clean up, there is no need to override Dispose. The dialogs we usually write are an example: they all inherit from System.Windows.Forms.Form, yet we seldom need to override the base class's Dispose method, because they hold no unmanaged resources of their own to release.

Understanding the temperament of the GC is necessary in many situations. At the very least, you will not be at a loss when a resource leak shows up. I once wrote a control that generates Excel reports, and releasing the Excel objects kept me busy for quite a while. If you have done Excel development, you have probably run into the problem of getting the Excel process to exit. Especially when the code is packaged as a library for others to call, releasing the Excel objects at the right time is critical to making sure the process ends. The internals of the GC are, of course, very complicated, and there is much more to explore, but the cost of understanding every detail is too high. Knowing the basics is enough.
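
On the calling side, the usual way to make sure Dispose runs is the using statement, which calls Dispose when the block ends even if an exception is thrown. The following is a minimal sketch; TempResource and UsingDemo are hypothetical names, and the class only simulates holding a resource.

```csharp
using System;

// TempResource is a hypothetical stand-in for any class that owns a resource.
public sealed class TempResource : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Dispose()
    {
        if (IsDisposed) return; // tolerate repeated calls
        IsDisposed = true;
        Console.WriteLine("resource released");
    }
}

public static class UsingDemo
{
    // Returns the resource after the using block has ended so that callers can inspect it.
    public static TempResource Run(bool fail)
    {
        var resource = new TempResource();
        try
        {
            using (resource)
            {
                if (fail) throw new InvalidOperationException("simulated failure");
            }
        }
        catch (InvalidOperationException)
        {
            // Dispose has already run by the time the exception reaches this handler.
        }
        return resource;
    }

    public static void Main()
    {
        Console.WriteLine(Run(false).IsDisposed); // True
        Console.WriteLine(Run(true).IsDisposed);  // True
    }
}
```

The release happens at a known point in the program, on both the normal and the failure path, which is the precise control that Finalize cannot offer.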

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
