.NET Discovery Series, Part 3: An In-Depth Look at the .NET Garbage Collection Mechanism (Part 1)


Objective:

A very important part of the .NET platform is the garbage collector (GC), and that is today's topic. Without a GC, could .NET still be called a platform? Although every language compiles to MSIL, each language would be left to handle resource reclamation at run time on its own. That would make programming harder, complicate memory management (small differences in how languages handle memory are magnified when resources are reclaimed), and hinder porting across platforms.

This article gives a comprehensive overview of how .NET garbage collection works, the algorithms it uses, and the key methods related to garbage collection.

Few people know that garbage collection did not originate with Java. As early as 1958, Lisp, the language invented by Turing Award laureate John McCarthy, already provided GC. This was the first appearance of GC, and a flash of insight. Then, in 1984, David Ungar published the generation scavenging algorithm for a Smalltalk implementation, which introduced the generational approach to collection.

The .NET garbage collection mechanism is a big topic. If you have never worked in a language like C++, it is hard to appreciate why the GC is such an important and exciting thing:

1. It improves the cohesion of the software system.

2. It reduces the complexity of programming, so that programmers are not distracted by object destruction.

3. It does not get in the way of the designer's abstraction of the system.

4. It reduces bugs caused by improper memory usage.

5. Once memory management is removed from the code the programmer writes, previously unpredictable memory management bugs become predictable.

Body:

This article describes the .NET garbage collection mechanism in two sections: "GC algorithms and how they work" and "Analysis of key GC methods".

Section 1. GC algorithms and how they work

1. Algorithms

The essence of the garbage collector is to keep track of all referenced objects, identify the objects that are no longer referenced, and reclaim their memory.

This sounds like the algorithm called reference counting, but that algorithm has to maintain a count for every object and update it whenever a reference changes, so it is less efficient, and it can easily leak memory when circular references occur. So .NET uses an algorithm called "mark and sweep" to accomplish these tasks.

The "mark and sweep" algorithm, as the name implies, has two phases:

"Mark" (identify garbage): starting from the application's roots and following the references between objects, the GC traverses all objects dynamically allocated on the heap. Objects that are not referenced are left unmarked, which makes them garbage. Surviving objects are marked, and the result is maintained as a "root-to-object reachability graph".

In fact, the CLR treats the object relationships as a tree-like graph. Readers who know their data structures will recognize that this speeds up the traversal of objects.

Detecting and marking object references is an interesting problem. There are many ways to do it, but they are not equally efficient. .NET uses a stack and completes detection through repeated pushes and pops: it first selects an object to examine from the graph and pushes every object it references onto the stack, then repeats this until the stack is empty. An empty stack means that every object reachable from that root (or node in the graph) has been visited. The roots include local variables (in fact, local variables are reclaimed quickly, because their scope is clear and easy to control), registers, and static variables, and the operation is repeated for each of them. Once this is done, memory is checked object by object, and the unmarked objects are garbage.
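The stack-based traversal described above can be sketched in a few lines. This is only an illustration of the idea, not the CLR's implementation: the Node class, its Marked flag, and the MarkPhase helper are hypothetical names made up for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a heap object; not a CLR type.
public sealed class Node
{
    public string Name;
    public bool Marked;
    public List<Node> References = new List<Node>();
    public Node(string name) { Name = name; }
}

public static class MarkPhase
{
    // Marks every node reachable from the given roots, using an explicit stack.
    public static void Mark(IEnumerable<Node> roots)
    {
        var stack = new Stack<Node>();
        foreach (var root in roots)
        {
            stack.Push(root);
            while (stack.Count > 0)
            {
                var current = stack.Pop();
                if (current.Marked) continue;   // already visited, so cycles terminate
                current.Marked = true;
                foreach (var child in current.References)
                    if (!child.Marked) stack.Push(child);
            }
        }
    }

    // Walks the whole simulated heap and returns the unmarked nodes (the garbage).
    public static List<Node> FindGarbage(IEnumerable<Node> heap)
    {
        var garbage = new List<Node>();
        foreach (var node in heap)
            if (!node.Marked) garbage.Add(node);
        return garbage;
    }
}
```

Note that two nodes that reference only each other are still reported as garbage when no root reaches them, which is exactly the circular-reference case that reference counting cannot handle.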

"Sweep" (reclaim memory): the GC runs a compaction algorithm. It moves the objects that are still alive and updates the pointers to them so that they sit contiguously in memory, which makes the free memory contiguous as well. This solves the memory fragmentation problem: when memory is next allocated for a new object, the CLR does not have to search fragmented memory for a suitable gap, so allocation becomes much faster. Large objects (the large object heap) are the exception. The GC does not move these in-memory giants, because it knows that CPU time is not cheap. Large objects typically have long lifetimes. When a large object is created in the .NET managed heap, it is allocated in a special part of the heap, and the overhead of moving large objects exceeds the performance that could be gained by compacting that part of the heap.

Besides making later allocations faster, compaction also improves cache performance when newly allocated objects are packed closely together in the heap, because objects that are allocated together are often used together (the principle of locality). This is why it matters to give the program a contiguous region of memory.
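The compaction step can be pictured with a toy heap. The sketch below is an assumption-laden simplification: the heap is an int array in which 0 means a free slot and any other value is a live object's id, and CompactPhase is a hypothetical helper, not a CLR API.

```csharp
using System;
using System.Collections.Generic;

public static class CompactPhase
{
    // Slides live entries toward index 0 so that free slots end up contiguous.
    // Returns a map from old index to new index, which is the information a
    // collector needs in order to fix up pointers to the moved objects.
    public static Dictionary<int, int> Compact(int[] heap)
    {
        var forwarding = new Dictionary<int, int>();
        int next = 0;                       // next slot to fill
        for (int i = 0; i < heap.Length; i++)
        {
            if (heap[i] == 0) continue;     // free slot, nothing to move
            forwarding[i] = next;
            heap[next] = heap[i];
            if (next != i) heap[i] = 0;     // the old location becomes free
            next++;
        }
        return forwarding;
    }
}
```

After compaction, the number of entries in the returned map is also the index of the first free slot, so the next allocation can simply take that slot instead of searching for a gap.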

2. Generations

Generations divide the objects on the heap by how long they have lived: the shortest-lived objects are in generation 0 and the longest-lived in generation 2, and generation 2 objects tend to be relatively large. The number of generations depends on the framework version and can be found by reading GC.MaxGeneration.
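As a small illustration, the following sketch reads GC.MaxGeneration and asks for the generation of a freshly created object with GC.GetGeneration. Both are real members of System.GC; the GenerationInfo class is just a hypothetical wrapper for this example.

```csharp
using System;

public static class GenerationInfo
{
    // Generations are numbered from 0 up to GC.MaxGeneration inclusive.
    public static string Describe()
    {
        var fresh = new object();
        return "Generations: 0 to " + GC.MaxGeneration
             + ", fresh object is in generation " + GC.GetGeneration(fresh);
    }
}
```

A newly allocated small object normally reports generation 0, although the exact value depends on what the runtime has done by the time it is queried.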

Typically, the GC gives priority to the most recently allocated objects (generation 0), which is similar in spirit to "least recently used", the classic page replacement algorithm in operating systems. This does not mean that the GC collects only recently allocated objects. Typically, the .NET GC divides the heap into 3 generations by object lifetime. Newly allocated objects go into generation 0 (whose maximum size is usually 256 KB), are placed in address order, and are usually local variables. Generation 1 (whose maximum size is usually 2 MB) holds objects that are still in memory after a generation 0 collection, usually objects such as forms and buttons. Generation 2 holds objects that are still in memory after several collections, usually application-level objects.

When memory is tight (for example, generation 0 is full), the GC is invoked inside the execution engine, that is, the CLR, and begins to mark and compact generation 0, which typically takes less than 1 millisecond. If memory is still tight after that, the GC goes on to collect generation 1 (typically less than 10 milliseconds) and then generation 2. Of course, the GC does not always collect in the order 0, 1, 2. This depends on the runtime, and you can also call GC.Collect(i) manually to specify the generation to collect. When even a generation 2 collection cannot free enough memory, the system throws an OutOfMemoryException.
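The manual call mentioned above can be tried out as follows. GC.Collect(int) collects generation 0 through the given generation, and GC.CollectionCount(int) reports how many times a generation has been collected so far; the CollectDemo wrapper is a hypothetical name for this sketch.

```csharp
using System;

public static class CollectDemo
{
    // Forces a collection up to the given generation and returns how much the
    // collection counter for that generation grew while this call ran.
    public static int ForceCollect(int generation)
    {
        int before = GC.CollectionCount(generation);
        GC.Collect(generation);
        return GC.CollectionCount(generation) - before;
    }
}
```

Calling CollectDemo.ForceCollect(0) collects only generation 0, while CollectDemo.ForceCollect(GC.MaxGeneration) requests a full collection. Forcing collections like this is useful for experiments, but production code usually leaves the timing to the runtime.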

If an object in generation 0 still exists after garbage collection, it is moved to generation 1. Objects in generation 1 are promoted to generation 2 by the same logic.
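Promotion can be observed by keeping one object alive across forced collections and reading its generation each time. This is a minimal sketch; PromotionDemo is a hypothetical name, and the exact numbers returned depend on the runtime and its GC settings.

```csharp
using System;

public static class PromotionDemo
{
    // Returns the generation of one surviving object before any forced
    // collection, after the first one, and after the second one.
    public static int[] Observe()
    {
        var survivor = new object();
        var seen = new int[3];
        seen[0] = GC.GetGeneration(survivor);
        GC.Collect();
        seen[1] = GC.GetGeneration(survivor);
        GC.Collect();
        seen[2] = GC.GetGeneration(survivor);
        GC.KeepAlive(survivor);   // make sure the object is still referenced here
        return seen;
    }
}
```

On a typical run the values climb toward GC.MaxGeneration, because the object survives every collection it goes through.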

What's more, the number of generations in the GC heap and their capacity are variable (this is controlled by a "policy engine", which is covered in the second section). The following code, together with WinDbg, illustrates this. Each click on the "Button1" button allocates more memory and then reads the generation of the object "a". The generation of "a" is also read when the form loads.

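using System;
using System.Windows.Forms;

public partial class Form1 : Form
{
    // Starts as a one-character string; replaced by a large one on every click.
    private string a = new string('a', 1);

    public Form1()
    {
        InitializeComponent();
    }

    // Allocates a new 900,000-character string and shows its generation.
    private void button1_Click(object sender, EventArgs e)
    {
        a = new string('a', 900000);
        label1.Text = GC.GetGeneration(a).ToString();
    }

    // Shows the generation of "a" when the form loads.
    private void Form1_Load(object sender, EventArgs e)
    {
        label1.Text = GC.GetGeneration(a).ToString();
    }
}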

When the program has just loaded, "a" is in generation 0, and WinDbg gives us the following information:

As you can see, the GC heap is divided into two segments and three generations, and the start addresses of consecutive generations differ by 12 (decimal).

After clicking the "Button1" button several times, "a" is promoted to generation 2, and WinDbg gives us the following information:

One key point to note here is that the decimal difference between the start addresses of the generations ("generation x starts at") is no longer 12. Generations 0 and 1 now differ by 98904, and generations 1 and 2 differ by 107908, which shows that the size of each generation changes as the program runs, and so does the size of the GC heap.
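One caveat is worth keeping in mind when reproducing this experiment. In current .NET runtimes, objects of roughly 85,000 bytes or more are placed on the large object heap, and GC.GetGeneration reports such objects as generation 2. A 900,000-character string is far above that threshold, so it may report generation 2 as soon as it is created rather than after several promotions. The sketch below checks this directly; LargeObjectDemo is a hypothetical helper name.

```csharp
using System;

public static class LargeObjectDemo
{
    // Returns the generation reported for a freshly allocated string of the given length.
    public static int GenerationOfFreshString(int length)
    {
        var s = new string('a', length);
        return GC.GetGeneration(s);
    }
}
```

Comparing LargeObjectDemo.GenerationOfFreshString(1) with LargeObjectDemo.GenerationOfFreshString(900000) shows the difference between a small allocation and one that lands on the large object heap.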
