.NET garbage collection mechanism

Source: Internet
Author: User

This article analyzes the .NET garbage collection mechanism, covering generations and the collection algorithm. In essence, the garbage collector tracks every place where an object is referenced, notices when an object is no longer referenced, and reclaims the corresponding memory. The same applies on the .NET platform, where efficient garbage collection improves program execution efficiency.

In fact, garbage collection did not arrive with Java. As early as 1958, the LISP language invented by Turing Award winner John McCarthy already provided GC. That was the first appearance of GC, a real flash of insight. Then, in 1984, Dave Ungar's work on the Smalltalk language formally introduced generational garbage collection. .NET's garbage collection mechanism is a big topic. If you have never worked with a language like C++, it is hard to appreciate how important and exciting GC is:

1. Improves the cohesion of software systems.

2. Reduces programming complexity, so that programmers do not have to worry about destructors.

3. Does not get in the way of the designer's abstraction of the system.

4. Reduces bugs caused by improper memory usage.

5. Moves memory management from the time the program is written to run time, so that unpredictable management flaws become predictable.

1. Algorithm

The essence of the garbage collector is to track all referenced objects, pick out the objects that are no longer referenced, and reclaim the corresponding memory. This sounds like a reference counting algorithm. However, reference counting must maintain a count for every object and update it whenever a reference changes, so it is less efficient; in addition, it easily leaks memory when circular references occur. Therefore, .NET uses a mark-and-sweep algorithm to do the job. The "mark and sweep" algorithm, as its name implies, has two skills:

"Mark" skill: garbage identification. Starting from the roots of the application, the system follows the references between objects to traverse all the objects dynamically allocated on the heap. Objects that are never reached stay unmarked and become garbage; the surviving objects are marked and together form the graph of objects reachable from the roots. The CLR in fact treats object relationships as a "tree graph", and anyone familiar with data structures knows that such a structure speeds up object traversal.

Detecting and marking object references is an interesting problem. There are many ways to do it, but one is the most efficient: .NET uses a stack, and detection proceeds through repeated pushes and pops. First, select an object to examine in the tree graph and push all of its references onto the stack; then pop and examine each of them in turn, repeating until the stack is empty. An empty stack means that all objects reachable from that local root (or node in the tree graph) have been traversed. The roots of the tree graph include local variables (which in practice are reclaimed quickly, because their scope is obvious and easy to control), registers, and static variables, and the operation is repeated for each of them. Once this is complete, memory is examined object by object, and the unmarked objects become garbage.

"Clear" skill: reclaiming memory. The compact algorithm moves the surviving objects in memory and updates the pointers to them so that they are stored contiguously. As a result, the free memory is contiguous too, which solves the problem of memory fragmentation. When allocating memory for a new object, the CLR no longer has to search fragmented memory for a block that fits, so allocation becomes much faster.

Large objects (the large object heap) are the exception: the GC will not move such heavyweights around in memory, because it knows that CPU time is not free. Large objects generally have long lifetimes. When a large object is created on the .NET managed heap, it is allocated in a special part of the heap, and the overhead of moving large objects exceeds the performance that compacting this part of the heap would gain.
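The two phases described above can be sketched on a toy heap. This is only an illustration of the idea, not the CLR's implementation: `ToyObject`, `MarkCompactDemo`, and the slot array standing in for the heap are hypothetical names invented for this sketch, and a real collector would also rewrite every pointer to a moved object.

```csharp
using System;
using System.Collections.Generic;

// Toy stand-in for a heap object; this is not a CLR type.
class ToyObject
{
    public string Name;
    public bool Marked;
    public List<ToyObject> References = new List<ToyObject>();
    public ToyObject(string name) { Name = name; }
}

static class MarkCompactDemo
{
    // Mark phase: visit everything reachable from the roots with an explicit stack.
    public static void Mark(IEnumerable<ToyObject> roots)
    {
        var stack = new Stack<ToyObject>();
        foreach (ToyObject root in roots) stack.Push(root);

        while (stack.Count > 0)
        {
            ToyObject current = stack.Pop();
            if (current.Marked) continue;   // already visited, so cycles terminate
            current.Marked = true;
            foreach (ToyObject child in current.References) stack.Push(child);
        }
    }

    // Compact phase: slide marked objects to the front of the slot array and
    // return the index of the first free slot.
    public static int Compact(ToyObject[] heap)
    {
        int next = 0;
        for (int i = 0; i < heap.Length; i++)
        {
            ToyObject obj = heap[i];
            heap[i] = null;
            if (obj != null && obj.Marked)
            {
                obj.Marked = false;         // reset the mark for the next collection
                heap[next++] = obj;
            }
        }
        return next;
    }

    public static string Run()
    {
        var a = new ToyObject("a");
        var b = new ToyObject("b");
        var c = new ToyObject("c");
        var d = new ToyObject("d");
        a.References.Add(b);
        b.References.Add(a);                // reachable cycle
        c.References.Add(d);
        d.References.Add(c);                // unreachable cycle: garbage

        ToyObject[] heap = { a, c, b, d };
        Mark(new[] { a });                  // "a" is the only root
        int free = Compact(heap);

        var names = new List<string>();
        for (int i = 0; i < free; i++) names.Add(heap[i].Name);
        return string.Join(",", names.ToArray()) + " free=" + free;
    }

    static void Main()
    {
        Console.WriteLine(Run());           // prints "a,b free=2"
    }
}
```

After compaction the live objects occupy slots 0 and 1, and the next allocation can simply take slot 2 without searching for a hole. Note also that the unreachable pair `c` and `d` is reclaimed even though the two objects still reference each other.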

Besides speeding up memory allocation, the compact algorithm also improves cache performance when newly allocated objects sit close together on the heap, because objects allocated together are often used together (the principle of program locality). Providing the program with a contiguous block of free memory is therefore very important.
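The earlier remark that circular references defeat reference counting but not a tracing collector can be checked against the real runtime. The sketch below is an assumption-laden experiment rather than a guarantee: `Node` and `CycleDemo` are hypothetical names, and whether the weak reference is cleared after `GC.Collect()` can depend on the runtime, build configuration, and how the JIT treats local variables.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical node type used only for this illustration.
class Node
{
    public Node Other;
}

static class CycleDemo
{
    // Builds two objects that reference each other and returns only a weak
    // reference, so no strong root to the pair outlives this method.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference MakeCycle()
    {
        var a = new Node();
        var b = new Node();
        a.Other = b;
        b.Other = a;
        return new WeakReference(a);
    }

    public static bool CycleIsCollected()
    {
        WeakReference weak = MakeCycle();
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
        return !weak.IsAlive;
    }

    static void Main()
    {
        // On the desktop CLR this is expected to report True, because the
        // cycle is unreachable from any root; other runtimes may differ.
        Console.WriteLine("Cycle collected: " + CycleIsCollected());
    }
}
```

Creating the cycle in a separate, non-inlined method is a deliberate choice: it keeps the locals `a` and `b` out of the caller's stack frame, so they cannot act as roots while the collection runs.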

2. Generations

Generations divide the objects on the heap according to how long they have existed. The shortest-lived objects are in generation 0, the longest-lived in generation 2, and the objects in generation 2 are usually relatively large. The number of generations depends on the Framework version and can be found by reading GC.MaxGeneration.
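As a small illustration of the property mentioned above, the following sketch reads `GC.MaxGeneration` and asks for the generation of a freshly allocated object. `MaxGenerationDemo` and `Describe` are names made up for this example; the values printed depend on the runtime in use.

```csharp
using System;

static class MaxGenerationDemo
{
    public static string Describe()
    {
        object fresh = new object();
        return "Highest generation: " + GC.MaxGeneration
             + ", generation of a new object: " + GC.GetGeneration(fresh);
    }

    static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```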

In general, the GC collects the most recently allocated objects (generation 0) first, an idea that echoes the classic "least recently used" paging algorithm in operating systems. This does not mean, however, that the GC only collects recently allocated objects. The .NET GC divides the heap into three generations based on object lifetime. Newly allocated objects are in generation 0 (whose maximum size is usually 256 KB) and are allocated in address order; they are typically local variables. Generation 1 (whose maximum size is usually 2 MB) holds objects that remain in memory after a generation 0 collection; they are typically objects such as forms and buttons. Generation 2 holds objects that still reside in memory after several collections; they are typically application-level objects.

When memory is tight (for example, when generation 0 is full), control passes to the execution engine, that is, the CLR, which marks, compacts, and reclaims the generation 0 space; this usually takes less than 1 ms. If memory is still tight after that, the GC goes on to collect generation 1 (usually less than 10 ms) and then generation 2. Of course, the GC does not always collect in the order 0, 1, 2; it depends on the running situation, and you can also call GC.Collect(i) manually to specify the generation to collect. If there is still not enough memory after generation 2 has been collected, the system throws an OutOfMemoryException. If an object in generation 0 is still alive after a collection, it is promoted to generation 1; generation 1 objects are promoted to generation 2 by the same logic.
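The promotion behavior described above can be observed with `GC.Collect(i)` and `GC.GetGeneration`. This is a minimal sketch; `PromotionDemo` and `ObserveGenerations` are hypothetical names, and the exact sequence of generations is decided by the runtime and its GC configuration, so no particular output is promised here.

```csharp
using System;

static class PromotionDemo
{
    // Records the generation of one object before any collection, after a
    // generation 0 collection, and after a generation 1 collection.
    public static int[] ObserveGenerations()
    {
        object survivor = new object();
        int[] seen = new int[3];

        seen[0] = GC.GetGeneration(survivor);
        GC.Collect(0);                      // collect generation 0 only
        seen[1] = GC.GetGeneration(survivor);
        GC.Collect(1);                      // collect generations 0 and 1
        seen[2] = GC.GetGeneration(survivor);

        GC.KeepAlive(survivor);             // keep the object rooted until here
        return seen;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", ObserveGenerations()));
    }
}
```

On a typical desktop CLR one would expect the values to climb toward `GC.MaxGeneration` as the object survives each collection.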

The number and capacity of the generations in the GC heap are variable (this is controlled by a "policy engine", which is introduced in section 2). The code below, combined with WinDbg, illustrates this. In the following code, clicking the button "button1" repeatedly allocates memory and then displays the generation of the object "a"; the generation of "a" is also displayed when the form loads.

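using System;
using System.Windows.Forms;

public partial class Form1 : Form
{
    private string a = new string('A', 1);

    public Form1()
    {
        InitializeComponent();
    }

    private void button1_Click(object sender, EventArgs e)
    {
        // Allocate a new 900,000-character string on every click
        // and display the generation it is reported in.
        a = new string('A', 900000);
        label1.Text = GC.GetGeneration(a).ToString();
    }

    private void Form1_Load(object sender, EventArgs e)
    {
        // Display the generation of the initial one-character string.
        label1.Text = GC.GetGeneration(a).ToString();
    }
}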

When the program has just loaded, the generation of "a" is 0, and the following information is obtained through WinDbg:

It can be seen that the GC heap is divided into two segments and three generations. The start addresses of adjacent generations differ by 12 (decimal). After clicking "button1" several times, the generation of "a" rises to 2, and WinDbg gives the following information:

Note that the start addresses of the generations ("generation X starts at") no longer differ by 12. The difference between generation 0 and generation 1 is 98904, and that between generation 1 and generation 2 is 107908 (decimal). This shows that the generation sizes change as the program runs, and so does the size of the GC heap.

using System;
using System.Threading;

class Program
{
    static void Main()
    {
        // Call TimerCall immediately and then every 2000 ms, passing the string as state.
        Timer t = new Timer(TimerCall, "The current time is ", 0, 2000);
        Console.ReadKey();
    }

    static void TimerCall(object obj)
    {
        Console.WriteLine(obj.ToString() + DateTime.Now);
        Console.WriteLine(GC.GetGeneration(obj)); // 0, 0, 0 ...
        // GC.Collect(); // With this line enabled, the generation of obj becomes 0, 1, 2, 2 ... Why? You know.
    }
}

 

The CLR garbage collector classifies objects by the space they occupy, and it handles large and small objects quite differently. Take memory compaction as an example: moving large objects in memory is expensive. Let's look at how the garbage collector handles large objects and what potential impact large objects have on a program.

Large Object heap and garbage collection

In .NET 1.0 and 2.0, an object whose size exceeds 85,000 bytes (84,988 in the author's local test) is considered a large object. This number was arrived at through performance tuning experience. When the memory requested for an object reaches this threshold, the object is allocated on the large object heap. What does this mean? To understand it, we need to understand the .NET garbage collection mechanism.

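static void Main()
{
    // In the author's test, 84988 is reported as a large object (generation 2)
    // and 84987 as generation 0. Is it related to machine configuration?
    // The 12-byte gap to 85000 is probably the array's header overhead in a 32-bit process.
    byte[] bytes = new byte[84988];
    Console.WriteLine(GC.GetGeneration(bytes));
    Console.ReadKey();
}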

As most people know, the .NET GC collects by generation. Objects in a program fall into three generations: 0, 1, and 2. Generation 0 holds the youngest objects, and generation 2 holds the objects that have survived the longest. Collecting by generation is also done for performance reasons: normally, objects are reclaimed while still in generation 0. For example, in an ASP.NET program, all the objects related to a request should be reclaimed when the request ends. Objects that are not reclaimed become generation 1 objects. In other words, generation 1 is a buffer between the objects that stay resident in memory and the objects that die quickly.

From the perspective of generations, a large object belongs to generation 2, because large objects are processed only when generation 2 is collected. When a generation is collected, all younger generations are collected at the same time. For example, a generation 1 collection reclaims both generation 1 and generation 0, and a generation 2 collection reclaims generations 1 and 0 as well.
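That a collection of one generation also collects the younger ones can be seen in the counters exposed by `GC.CollectionCount`. The following is a minimal sketch with made-up names (`CollectionCountDemo`, `CountDeltas`); it relies only on the documented behavior that the counter for a generation increases whenever that generation is collected.

```csharp
using System;

static class CollectionCountDemo
{
    // Returns how much the generation 0 and generation 1 collection counters
    // grew across one explicit generation 1 collection.
    public static int[] CountDeltas()
    {
        int gen0Before = GC.CollectionCount(0);
        int gen1Before = GC.CollectionCount(1);

        GC.Collect(1);                      // collect generation 1 (and therefore 0)

        return new[]
        {
            GC.CollectionCount(0) - gen0Before,
            GC.CollectionCount(1) - gen1Before
        };
    }

    static void Main()
    {
        int[] deltas = CountDeltas();
        Console.WriteLine("gen 0 collections: +" + deltas[0]);
        Console.WriteLine("gen 1 collections: +" + deltas[1]);
    }
}
```

Both counters are expected to grow by at least one, since the explicit generation 1 collection includes generation 0.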

Generations are the garbage collector's logical view for partitioning memory. From the perspective of physical storage, objects are allocated on different managed heaps. A managed heap is a memory area that the garbage collector requests from the operating system (by calling the Windows API VirtualAlloc). When the CLR is loaded, it initializes two managed heaps: a large object heap (LOH) and a small object heap (SOH).

A memory allocation request places the managed object on the corresponding managed heap. If the object is smaller than 85,000 bytes, it is placed on the SOH; otherwise it is placed on the LOH.

On the SOH, an object that survives a garbage collection moves to the next generation. That is, an object that survives its first collection enters generation 1; if it is still not reclaimed after a second collection, it becomes a generation 2 object. Generation 2 objects are the oldest and are not promoted any further.

When garbage collection is triggered, the garbage collector compacts the small object heap, moving the surviving objects together. For the large object heap, because moving memory is expensive, the CLR team chose only to sweep it: dead objects are placed on a free list so that their memory can satisfy later large-object allocations, and adjacent garbage objects are merged into a single free block.

Keep in mind that, up to .NET 4.0, the large object heap is not compacted, although this may change in the future. Therefore, if you allocate large objects and do not want them to be moved, you can use the fixed statement.

[Figure: reclaiming the small object heap (SOH)]
