CLR Garbage Collection Algorithms

Source: Internet
Author: User

Compared with C and C++, C# makes memory management far more convenient: programmers no longer deal with raw memory addresses, which improves productivity. In C, for example, heap memory obtained with malloc and never freed causes memory leaks, out-of-bounds array access corrupts memory, and using memory after it has been freed causes errors. These problems are largely absent from C#, mainly because of the safety checks and the garbage collection mechanism provided by the CLR. This article covers common garbage collection algorithms and the algorithms used in the CLR.

The GC usually runs when an object is being allocated and the managed heap does not have enough free space. When a process starts, heap space of a default size is set aside for it, and the size of the heap is limited by the process's virtual address space. The maximum heap size is about 1.5 GB on a 32-bit system and about 8 TB on a 64-bit system.

1. Reference counting:

The idea of reference counting is to maintain an internal reference count for every object created on the heap. Each time a new reference to the object is created, the count is incremented by 1. Each time a reference to the object goes out of scope (for example, when a method returns) or is reassigned to point to another object, the count is decremented by 1. When an object's reference count reaches 0, the object can be reclaimed. The advantages of this algorithm are that it is simple to implement and that collection does not require suspending the program's threads, which makes it suitable for real-time environments. Its flaw is that it cannot handle circular references. For example, if object A references B and B references A, but nothing else references either of them, their counts never reach 0 and neither can ever be reclaimed.
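The counting rules above, and the circular-reference flaw, can be sketched with a small simulation. This is a minimal sketch in Python, not how any particular runtime implements reference counting; the `RefCountedHeap` class, its method names, and the assumption that every new object starts with one root reference are all hypothetical.

```python
class RefCountedHeap:
    """Toy heap that tracks a reference count per object (illustrative only)."""

    def __init__(self):
        self.counts = {}   # object name -> current reference count
        self.fields = {}   # object name -> names of the objects it references
        self.freed = []    # names of reclaimed objects, in reclamation order

    def alloc(self, name):
        # Assumption: a new object is held by exactly one root reference.
        self.counts[name] = 1
        self.fields[name] = []

    def add_ref(self, src, dst):
        # src stores a reference to dst in one of its fields.
        self.fields[src].append(dst)
        self.counts[dst] += 1

    def release(self, name):
        # One reference to `name` went out of scope or was overwritten.
        self.counts[name] -= 1
        if self.counts[name] == 0:
            self.freed.append(name)
            # Reclaiming an object also drops the references it held.
            for child in self.fields.pop(name):
                self.release(child)
            del self.counts[name]


def demo():
    heap = RefCountedHeap()

    # Acyclic case: x references y. Dropping both roots frees both objects.
    heap.alloc("x")
    heap.alloc("y")
    heap.add_ref("x", "y")
    heap.release("y")  # root to y dropped; x still references y
    heap.release("x")  # root to x dropped; x is freed, then y

    # Cyclic case: a references b and b references a.
    heap.alloc("a")
    heap.alloc("b")
    heap.add_ref("a", "b")
    heap.add_ref("b", "a")
    heap.release("a")  # root to a dropped
    heap.release("b")  # root to b dropped

    return heap.freed, dict(heap.counts)


freed, leftover = demo()
print(freed)     # ['x', 'y']
print(leftover)  # {'a': 1, 'b': 1}
```

After both roots are dropped, `a` and `b` each still hold a count of 1 because of the other, so neither is ever reclaimed.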

2. Reference tracking algorithm:

The basic steps of the reference tracking algorithm are as follows. When a GC starts, it first suspends all threads. It then enters the marking phase. First, it traverses all objects in the heap and sets each object's mark bit to 0 (the bit is stored in the object's sync block index field), meaning that every object is initially considered deletable. Second, the CLR examines every root reference. Roots here are reference variables, including static fields, instance fields, local variables of methods, and method parameters. If a root is null, the CLR ignores it and moves on to the next root. If a root is not null, the CLR sets the mark bit of the object it points to to 1 and then examines the references held in that object's fields, marking the objects they point to in the same way. When the CLR encounters an object that is already marked, it does not examine that object's fields again, which prevents an infinite loop caused by circular references.

As an example, Figure 1 shows a heap containing objects A through J. When the CLR runs a GC, the second step of the marking phase examines the roots, finds that they point to A, C, and D, and sets the mark bits of those objects to 1. It then finds that D references H and sets H's mark bit to 1 as well. When the marking phase is complete, the GC starts the compaction phase: it moves the surviving objects into contiguous space and releases the memory of objects that are no longer in use. Figure 2 shows the heap after compaction. Compaction has several advantages. First, the surviving objects sit in contiguous memory, which reduces the application's working set and improves the performance of accessing objects. Second, it prevents memory fragmentation. Finally, because the surviving objects are moved, their addresses change, so the GC must update every root (and every reference held in an object) to point to the object's new address. NextObjPtr in the figures points to the location where the next object will be allocated. When compaction is complete, the CLR resumes all threads so that the program continues to run.
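The marking phase described above can be sketched as a small graph traversal. This is a minimal sketch in Python, not the CLR's actual implementation: the `mark` function and the dictionary-based heap are assumptions for illustration, and the back reference from H to D is added only to exercise the already-marked check.

```python
def mark(heap, roots):
    """Return the set of marked (reachable) object names.

    heap  : dict mapping object name -> list of object names it references
    roots : list of root references; None models a null root
    """
    # Step 1: clear every mark bit, so every object starts out deletable.
    marked = {name: 0 for name in heap}

    # Step 2: start from the non-null roots; null roots are skipped.
    stack = [root for root in roots if root is not None]
    while stack:
        name = stack.pop()
        if marked[name]:
            # Already marked: do not walk its fields again.
            # This is what stops circular references from looping forever.
            continue
        marked[name] = 1
        stack.extend(heap[name])  # follow the references held by this object

    return {name for name, bit in marked.items() if bit}


# Ten objects A..J; the roots point to A, C, and D, and D references H.
heap = {name: [] for name in "ABCDEFGHIJ"}
heap["D"] = ["H"]
heap["H"] = ["D"]  # hypothetical back reference forming a cycle

print(sorted(mark(heap, ["A", None, "C", "D"])))  # ['A', 'C', 'D', 'H']
```

Everything not in the returned set (B, E, F, G, I, J here) keeps a mark bit of 0 and is treated as garbage.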

Figure 1: Distribution of the objects in the heap before the GC runs.

Figure 2: Distribution of the objects in the heap after the GC runs.
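The compaction step can be sketched in the same style. This is a minimal sketch under stated assumptions: the `compact` function, the fixed 16-byte object size, and the root variable names `r1` to `r3` are hypothetical, and the sketch updates only the roots, whereas a real collector also has to fix up references stored inside objects.

```python
def compact(layout, marked, roots):
    """Slide surviving objects together and update the roots.

    layout : list of (name, size) in address order, starting at address 0
    marked : set of names that survived the marking phase
    roots  : dict mapping root variable -> old address of the object it points to
    Returns (new_layout, new_roots, next_obj_ptr).
    """
    forwarding = {}    # old address -> new address for each survivor
    new_layout = []    # list of (name, new address, size)
    old_addr = 0
    next_obj_ptr = 0   # where the next object will be allocated
    for name, size in layout:
        if name in marked:
            forwarding[old_addr] = next_obj_ptr
            new_layout.append((name, next_obj_ptr, size))
            next_obj_ptr += size
        old_addr += size
    # Survivors moved, so every root must point at the new address.
    new_roots = {var: forwarding[addr] for var, addr in roots.items()}
    return new_layout, new_roots, next_obj_ptr


# Objects A..J, 16 bytes each (assumed size); A, C, D, and H survived marking.
layout = [(name, 16) for name in "ABCDEFGHIJ"]
roots = {"r1": 0, "r2": 32, "r3": 48}  # old addresses of A, C, and D

new_layout, new_roots, next_obj_ptr = compact(layout, {"A", "C", "D", "H"}, roots)
print(new_layout)    # [('A', 0, 16), ('C', 16, 16), ('D', 32, 16), ('H', 48, 16)]
print(new_roots)     # {'r1': 0, 'r2': 16, 'r3': 32}
print(next_obj_ptr)  # 64
```

The four survivors end up in one contiguous run, and the allocation pointer moves back from 160 to 64.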

3. Generation-based garbage collection algorithm:

The generation-based garbage collection algorithm rests on three assumptions: 1. The newer an object is, the shorter its lifetime is likely to be. 2. The older an object is, the longer its lifetime is likely to be. 3. Collecting a portion of the heap is faster than collecting the entire heap.

The following example illustrates the idea of generation-based garbage collection:

Initially the heap contains no objects. Newly created objects are always allocated in generation 0, and at first generation 0 is the only generation. When the CLR initializes, it gives generation 0 a budget of some number of kilobytes. Suppose objects A, B, C, D, and E are allocated in generation 0.

After the program has run for a while, C and E become unreachable. When object F is to be allocated, generation 0 has no room left, so a garbage collection is needed. The GC reclaims C and E, compacts A, B, and D into contiguous space, and promotes A, B, and D to generation 1. After this collection, generation 1 holds A, B, and D, and generation 0 is empty.

Because generation 0 is empty after the collection, new objects are again allocated there. After we allocate objects F through K, generation 0 holds F through K and generation 1 still holds A, B, and D.

After the program has run for a while longer, we try to allocate a new object L in generation 0, but generation 0 is full, so a GC is needed. (Note that when the CLR initializes, it also assigns a budget to generation 1, and this budget is larger than the budget of generation 0.) When the GC runs, it first checks whether generation 1 has reached its budget. It has not, so the GC does not examine the objects in generation 1 at all. The collection therefore checks only the objects in generation 0, finds that H and J are unreachable, and reclaims them. After compaction and promotion, generation 1 holds A, B, D, F, G, I, and K, and generation 0 is empty.

Note that B in generation 1 has become unreachable but was not reclaimed, because this collection reclaimed only the unreachable objects in generation 0. We then allocate objects L through O in generation 0.

When we try to allocate object P, generation 0 is full again, so the GC runs as before, and the survivors of generation 0 are promoted to generation 1.

Generation 1 keeps growing, and the number of unreachable objects in it slowly increases. Next we allocate objects P through S in generation 0.

When we try to allocate object T, generation 0 is full, so a GC is needed. Assume that generation 1 has now reached its budget. This time the GC examines the objects in both generation 1 and generation 0, reclaims the unreachable ones, and performs compaction.

After this collection, the surviving objects from generation 1 have been promoted to generation 2, the surviving objects from generation 0 have been promoted to generation 1, and generation 0 is empty.
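The promotion rules in this walkthrough can be sketched as a toy simulation. This is a minimal sketch under stated assumptions: the `GenerationalHeap` class is hypothetical, budgets are counted in objects rather than bytes, reachability is tracked by hand through `drop`, and the allocation sequence is a shortened one rather than the exact sequence in the walkthrough.

```python
class GenerationalHeap:
    """Toy three-generation heap; budgets are object counts (an assumption)."""

    def __init__(self, gen0_budget, gen1_budget):
        self.budgets = [gen0_budget, gen1_budget]
        self.gens = [[], [], []]  # gens[0], gens[1], gens[2]
        self.live = set()         # names of objects that are still reachable
        self.collections = []     # which generations each GC examined

    def alloc(self, name):
        # New objects always go to generation 0; a full gen 0 triggers a GC.
        if len(self.gens[0]) >= self.budgets[0]:
            self.collect()
        self.gens[0].append(name)
        self.live.add(name)

    def drop(self, name):
        # Model the program losing its last reference to `name`.
        self.live.discard(name)

    def collect(self):
        # Generation 1 is examined only once it has reached its budget.
        include_gen1 = len(self.gens[1]) >= self.budgets[1]
        if include_gen1:
            survivors = [o for o in self.gens[1] if o in self.live]
            self.gens[2].extend(survivors)  # gen 1 survivors -> gen 2
            self.gens[1] = []
        survivors = [o for o in self.gens[0] if o in self.live]
        self.gens[1].extend(survivors)      # gen 0 survivors -> gen 1
        self.gens[0] = []
        self.collections.append([0, 1] if include_gen1 else [0])


heap = GenerationalHeap(gen0_budget=3, gen1_budget=4)
for name in "ABC":
    heap.alloc(name)
heap.drop("C")
heap.alloc("D")        # gen 0 full: first GC, examines gen 0 only
heap.alloc("E")
heap.alloc("F")
heap.drop("B")         # B dies while sitting in gen 1
heap.drop("E")
heap.alloc("G")        # second GC, gen 0 only: B stays in gen 1 as garbage
heap.alloc("H")
heap.alloc("I")
heap.drop("H")
heap.alloc("J")        # third GC: gen 1 reached its budget, so it is examined too

print(heap.gens)         # [['J'], ['G', 'I'], ['A', 'D', 'F']]
print(heap.collections)  # [[0], [0], [0, 1]]
```

As in the walkthrough, an unreachable object in generation 1 (B here) is left alone until generation 1 itself reaches its budget.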

As the example shows, the garbage collector collects generation 0 several times before it collects generation 1, primarily to improve program performance: objects that have already survived for a long time tend to keep living. The CLR's garbage collector is also self-tuning. It dynamically adjusts the budget of each generation, and therefore how often generation 1 and generation 2 are collected, based on the lifetimes of the objects in each generation and on how much memory each collection reclaims. For example, when every object in generation 0 turns out to be garbage, the GC can simply reset NextObjPtr to the start of generation 0 without performing any compaction. When a collection of generation 0 reclaims very few objects, the GC can increase the budget of generation 0, and when a collection reclaims most of the objects, it can reduce the budget, and so on.

Reference: CLR via C# (book)
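The budget adjustment described above can be sketched as a simple feedback rule. This is only an illustration of the idea: the `next_gen0_budget` function, the survival-rate thresholds, the doubling and halving steps, and the budget limits are all hypothetical values, not the CLR's actual tuning policy.

```python
def next_gen0_budget(budget, allocated, survived,
                     low=0.1, high=0.5, min_budget=64, max_budget=4096):
    """Pick the next generation 0 budget from the last collection's outcome.

    budget    : current gen 0 budget (units are arbitrary here, e.g. KB)
    allocated : number of objects in gen 0 when the collection started
    survived  : number of those objects that were still reachable
    The thresholds and limits are hypothetical defaults for illustration.
    """
    survival_rate = survived / allocated if allocated else 0.0
    if survival_rate > high:
        # Little memory was reclaimed: collect less often with a bigger budget.
        budget *= 2
    elif survival_rate < low:
        # Almost everything was garbage: a smaller budget is enough.
        budget //= 2
    # Keep the budget within the assumed limits.
    return max(min_budget, min(max_budget, budget))


print(next_gen0_budget(256, allocated=100, survived=80))  # 512
print(next_gen0_budget(256, allocated=100, survived=5))   # 128
print(next_gen0_budget(256, allocated=100, survived=30))  # 256
```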
