[C# advanced series] 20: Managed heap and garbage collection

Last Update:2016-04-15 Source: Internet

Author: User

Managed heap Basics

Generally, creating an object means allocating memory with the IL instruction newobj and then initializing that memory, which is what the instance constructor does.

Once the object is no longer needed, the state of any resource it holds is torn down for cleanup, and the garbage collector releases the memory.

Besides preventing the bug of using memory that has already been freed, the managed heap also reduces memory leaks. Most types need no resource cleanup at all; the garbage collector frees their memory automatically.

Sometimes, though, cleanup has to happen immediately. Types that wrap native resources (such as files, sockets, and database connections) expose a Dispose method that you can call for this purpose. (Some classes wrap this method under another name, for example Close on a database connection.)
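As a minimal sketch of immediate cleanup, the following uses a using statement, which calls Dispose when the block exits. The helper name WriteAndRead and the temporary file are illustrative assumptions, not part of the original article.

```csharp
using System;
using System.IO;

public static class DisposeDemo
{
    // Hypothetical helper: writes text to a temp file and reads it back.
    public static string WriteAndRead(string text)
    {
        string path = Path.GetTempFileName();
        try
        {
            // using calls writer.Dispose() at the end of the block,
            // which flushes the data and closes the file handle right away.
            using (var writer = new StreamWriter(path))
            {
                writer.Write(text);
            }

            // The handle has already been released, so the file can be reopened.
            using (var reader = new StreamReader(path))
            {
                return reader.ReadToEnd();
            }
        }
        finally
        {
            File.Delete(path);
        }
    }

    public static void Main()
    {
        Console.WriteLine(WriteAndRead("hello"));
    }
}
```

Without the first using block, the file handle would stay open until the writer was finalized, and reopening the file could fail in the meantime.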

Allocating resources from the managed heap

CLR requires that all objects be allocated from the managed heap.

During process initialization, the CLR reserves a region of address space for the managed heap. The CLR also maintains a pointer, NextObjPtr, which points to where the next object will be allocated in the heap.

When a region fills up with non-garbage objects, the CLR allocates more regions. This continues until the whole process address space is used up. A 32-bit process can allocate at most about 1.5 GB, and a 64-bit process at most about 8 TB.

When an object is created, the CLR first calculates the number of bytes required for the type's fields (including those inherited from base classes) and then adds the bytes required for the object's overhead (the type object pointer and the sync block index).

The CLR then checks whether the region has enough free bytes for the object. If it does, the object is placed at the address NextObjPtr points to, and that address is returned as the reference to the object. NextObjPtr is then advanced by the number of bytes the object occupies, giving the address where the next object will be placed.
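The pointer arithmetic can be illustrated with a toy model. This is only a sketch of the idea, not the CLR's real allocator: ToyHeap, its offsets, and the 16-byte overhead (two 8-byte fields, as in a 64-bit process) are assumptions made for the example.

```csharp
using System;

// Toy model of bump-pointer allocation; not the CLR's actual code.
public sealed class ToyHeap
{
    private readonly int _size;                    // total bytes in the region

    // Offset at which the next object will be placed.
    public int NextObjPtr { get; private set; }

    public ToyHeap(int size) { _size = size; }

    // Returns the offset of the new object, or -1 when the region is full
    // (the point at which the real CLR would start a garbage collection).
    public int Allocate(int fieldBytes, int overheadBytes)
    {
        int total = fieldBytes + overheadBytes;
        if (NextObjPtr + total > _size) return -1;

        int address = NextObjPtr;   // the object goes where the pointer is now
        NextObjPtr += total;        // advance the pointer past the object
        return address;
    }
}

public static class ToyHeapDemo
{
    public static void Main()
    {
        var heap = new ToyHeap(64);
        Console.WriteLine(heap.Allocate(8, 16));   // prints 0
        Console.WriteLine(heap.Allocate(8, 16));   // prints 24
        Console.WriteLine(heap.Allocate(8, 16));   // prints -1 (48 + 24 > 64)
    }
}
```

Allocation is cheap because it is only a comparison and an addition; no free list has to be searched.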

Reclaim resources through the garbage collector (GC)

If the CLR finds, while creating an object, that there is not enough memory to allocate it, it performs a garbage collection.

During a garbage collection, the CLR first suspends all threads.

Mark phase: The CLR walks all objects in the heap and sets a bit in each object's sync block index field to 0, indicating that every object is presumed to be garbage. It then checks all active roots (that is, fields of reference types, method parameters, and local variables) to see which objects they refer to. If a root refers to an object on the heap, the CLR marks that object by setting the bit in its sync block index field to 1, and then follows the object's own fields to mark the objects they refer to in the same way. If an object is already marked 1, its fields are not examined again. Objects marked 1 are reachable; objects marked 0 are unreachable. The CLR now knows which objects can be deleted and which cannot.

Compact phase: The CLR moves the marked objects in the heap so that they sit next to each other in contiguous memory (and updates every root that refers to a moved object to its new address). This shrinks the application's working set, improves access performance, frees a large block of unused memory, and solves the memory fragmentation problem.

Finally, all threads are resumed.
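The mark phase can be sketched on a toy object graph. Node, its Marked flag (standing in for the bit in the sync block index), and FindGarbage are hypothetical names for this illustration; the real collector works on raw memory, not on a list of nodes.

```csharp
using System;
using System.Collections.Generic;

public sealed class Node
{
    public string Name;
    public bool Marked;                            // stands in for the mark bit
    public List<Node> Fields = new List<Node>();   // references held by this object
    public Node(string name) { Name = name; }
}

public static class ToyMark
{
    // Returns the names of the objects that no root can reach.
    public static List<string> FindGarbage(List<Node> heap, List<Node> roots)
    {
        // Step 1: clear every mark, presuming all objects are garbage.
        foreach (var n in heap) n.Marked = false;

        // Step 2: walk outward from the roots, marking what is reachable.
        var pending = new Stack<Node>(roots);
        while (pending.Count > 0)
        {
            var n = pending.Pop();
            if (n.Marked) continue;                // already visited, skip its fields
            n.Marked = true;
            foreach (var f in n.Fields) pending.Push(f);
        }

        // Step 3: whatever is still unmarked is unreachable.
        var garbage = new List<string>();
        foreach (var n in heap) if (!n.Marked) garbage.Add(n.Name);
        return garbage;
    }

    public static void Main()
    {
        var a = new Node("a"); var b = new Node("b");
        var c = new Node("c"); var d = new Node("d");
        a.Fields.Add(b);                           // a -> b, reachable from the root
        c.Fields.Add(d); d.Fields.Add(c);          // c <-> d, a cycle with no root

        var garbage = FindGarbage(new List<Node> { a, b, c, d }, new List<Node> { a });
        Console.WriteLine(string.Join(", ", garbage));   // prints c, d
    }
}
```

Because reachability is traced from the roots, the c and d cycle is still found to be garbage, which is something reference counting alone would miss.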

Objects referenced by static fields stay alive until the AppDomain that loaded the type is unloaded. A common cause of memory leaks is a static field that references a collection object to which items keep being added. The static field keeps the collection alive, and the collection keeps every item alive. So avoid static fields where you can. (Alternatively, set the static field to null as soon as it is no longer needed, so that the object it referenced can be collected.)

A special case for garbage collection is Timer. A timer is supposed to invoke its callback at intervals, but from the collection rules above we know that once the Timer variable goes out of scope and nothing else references the Timer object, the timer can be reclaimed by the next garbage collection, after which the callback no longer runs. (So use Timer with care; this is a big pitfall.)
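One way to avoid the pitfall is to keep a live reference to the timer until you are done with it. The sketch below does that with a using block around a System.Threading.Timer; the helper name CountTicks and the 50 ms period are assumptions chosen for the example.

```csharp
using System;
using System.Threading;

public static class TimerDemo
{
    // Hypothetical helper: counts how many callbacks fire in the given time.
    public static int CountTicks(int milliseconds)
    {
        int ticks = 0;

        // The using block references the timer until Dispose runs at the end,
        // so the GC cannot reclaim it while we are waiting for callbacks.
        using (var timer = new Timer(_ => Interlocked.Increment(ref ticks), null, 0, 50))
        {
            Thread.Sleep(milliseconds);
        }

        return Volatile.Read(ref ticks);
    }

    public static void Main()
    {
        Console.WriteLine("callbacks fired: " + CountTicks(500));
    }
}
```

If the timer were created and then never referenced again, a collection during the wait would be free to reclaim it and the callbacks would stop.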

Generations: improving performance

The CLR's GC is a generational garbage collector. It makes the following assumptions about your code:

The newer an object is, the shorter its lifetime will be
The older an object is, the longer its lifetime will be
Collecting part of the heap is faster than collecting the whole heap

Generation 0: objects newly added to the heap are generation 0 objects; the garbage collector has never examined them.

Generation 1: a generation 0 object that has been through one collection without being reclaimed as garbage is placed in the generation 1 area during the compact phase.

Generation 2: a generation 1 object that has been through another collection without being reclaimed is placed in the generation 2 area during the compact phase. There is no generation 3; generation 2 holds objects that have survived two or more collections.

When the generation 0 area is full, a garbage collection starts. The collector also checks how much memory generation 1 occupies. If generation 1 is still under its budget, only generation 0 is collected, and generation 1 is left alone even if it contains unreachable objects. Only when generation 1 is also full are both generations collected.
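Promotion between generations can be observed with GC.GetGeneration. This is a minimal sketch; the helper name Observe is an assumption, and the exact generation numbers are up to the runtime, so no particular output is guaranteed.

```csharp
using System;

public static class GenerationDemo
{
    // Returns the object's generation before and after two forced collections.
    public static int[] Observe()
    {
        var obj = new object();
        int first = GC.GetGeneration(obj);    // a new object normally starts in generation 0

        GC.Collect();                         // obj survives because it is still referenced
        int second = GC.GetGeneration(obj);

        GC.Collect();
        int third = GC.GetGeneration(obj);

        GC.KeepAlive(obj);                    // keep obj reachable up to this point
        return new[] { first, second, third };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" -> ", Observe()));
        Console.WriteLine("highest generation: " + GC.MaxGeneration);
    }
}
```

On the desktop CLR this typically shows the object moving from generation 0 to 1 to 2 and then staying there.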

When the CLR initializes, it picks a memory budget for each of the three generations, which determines when each is collected. The CLR's garbage collector is self-tuning, however.

That is:

If the garbage collector finds that very few objects survive a generation 0 collection, it may reduce the generation 0 budget. Collections then happen more often, but each one does less work, which reduces the working set. If no objects survive at all, there is not even anything to compact.

If the garbage collector finds that many objects survive a generation 0 collection, it may increase the generation 0 budget. Collections then happen less often, but each one reclaims more memory. (If a collection does not reclaim enough memory, the garbage collector performs a full collection. If there is still not enough memory, an OutOfMemoryException is thrown.)

The examples above use generation 0; the same applies to generations 1 and 2.

Conditions that trigger garbage collection

The CLR triggers a GC when generation 0 exceeds its budget. This is the most common trigger. The others are:

Code explicitly calls the static Collect method of System.GC
Windows reports low memory
The CLR is unloading an AppDomain
The CLR is shutting down
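The explicit trigger can be sketched with GC.Collect and GC.CollectionCount. The helper name ForceCollection is an assumption for this example; forcing collections is rarely a good idea in production code and is shown here only to make the trigger visible.

```csharp
using System;

public static class CollectDemo
{
    // Forces a collection and returns how many generation 0
    // collections were recorded as a result.
    public static int ForceCollection()
    {
        int before = GC.CollectionCount(0);
        GC.Collect();                          // blocking collection of all generations
        return GC.CollectionCount(0) - before;
    }

    public static void Main()
    {
        Console.WriteLine("collections added: " + ForceCollection());
    }
}
```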

Large objects

The CLR divides objects into large and small objects; objects of 85,000 bytes or more are considered large.

Large objects are not allocated in the same address space as small objects; they are allocated elsewhere in the process address space.

The current version of the GC does not compact large objects, because moving them in memory is too costly. (This can cause address-space fragmentation.)

Large objects are always treated as generation 2, so create them only for resources that need to live a long time. Allocating short-lived large objects causes generation 2 to be collected more often, which hurts performance.
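The generation of a large object can be checked directly. This is a minimal sketch, assuming the Microsoft CLR; GenerationOfArray is a hypothetical helper, and other runtimes may report different values.

```csharp
using System;

public static class LargeObjectDemo
{
    // Allocates a byte array of the given length and reports its generation.
    public static int GenerationOfArray(int length)
    {
        var buffer = new byte[length];
        return GC.GetGeneration(buffer);
    }

    public static void Main()
    {
        // Well under the 85,000-byte limit: normally starts in generation 0.
        Console.WriteLine("small array: generation " + GenerationOfArray(1000));

        // Over the limit: allocated as a large object, reported as generation 2.
        Console.WriteLine("large array: generation " + GenerationOfArray(100000));
    }
}
```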

Garbage collection modes

When the CLR starts, it selects a GC mode, which does not change until the process ends:

Workstation Mode
- This mode tunes the GC for client applications. It is optimized for low latency: application threads are suspended only briefly, so that users are not left waiting. In this mode the GC assumes that other applications are running on the machine and avoids consuming too much CPU.
Server Mode
- This mode tunes the GC for server applications. It is optimized for throughput and resource utilization. The GC assumes that no other applications are running on the machine and that all of its CPUs are available to help with collection. In this mode the managed heap is split into several sections, one per CPU. When a collection starts, the garbage collector runs a dedicated thread on each CPU, and each thread collects its own section in parallel with the others. Parallel collection works well for server applications whose worker threads behave uniformly. The feature requires a machine with multiple CPUs, so that the threads can truly run at the same time and deliver the performance gain.

Applications run in workstation GC mode by default. A server application that hosts the CLR (such as ASP.NET or SQL Server) can ask the CLR to load the server GC, but on a single-processor machine the CLR always uses workstation mode.

A standalone application can tell the CLR to use server mode by adding the following to its configuration file:

<configuration>
  <runtime>
    <gcServer enabled="true"/>
  </runtime>
</configuration>

Besides the two modes, the GC also supports two sub-modes: concurrent (the default) and non-concurrent.

In concurrent mode, the GC has an additional background thread that marks objects concurrently while the application runs.

Once the unreachable objects have been found, the GC decides whether to compact memory. In concurrent mode it leans towards not compacting, which helps performance but increases the application's working set. The concurrent garbage collector therefore consumes more memory than the non-concurrent one.

Add the following to the configuration file to tell the CLR to use non-concurrent mode:

<configuration>
  <runtime>
    <gcConcurrent enabled="false"/>
  </runtime>
</configuration>
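To check which settings actually took effect, a process can query System.Runtime.GCSettings at run time. This is a small sketch; the helper name Describe and the output format are assumptions for the example.

```csharp
using System;
using System.Runtime;

public static class GcModeDemo
{
    // Describes the GC mode the current process is running with.
    public static string Describe()
    {
        string mode = GCSettings.IsServerGC ? "server" : "workstation";

        // LatencyMode is Interactive when concurrent collection is enabled
        // and Batch when it is disabled.
        return mode + ", latency mode: " + GCSettings.LatencyMode;
    }

    public static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```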

Working with types that require special cleanup

Most types need only memory, but some also need native resources. The System.IO.FileStream type, for example, has to open a file (a native resource) and hold on to the file handle.

When the GC collects an object of a type that wraps a native resource, it reclaims the memory the object used in the managed heap. That alone would leak the native resource, which the GC knows nothing about, so the CLR provides finalization: a mechanism that lets an object run some code after it has been determined to be garbage but before its memory is reclaimed.

Any type that wraps a native resource (a file, network connection, socket, or mutex) supports finalization. When the CLR determines that such an object is unreachable, the object finalizes itself, releasing the native resource it wraps, and the GC then reclaims the object from the managed heap.

The C# syntax looks like a C++ destructor, but it means something different.

public class Troy
{
    ~Troy()
    {
        // Code here runs before the object's memory is reclaimed.
        // The compiler places it in a try block and calls base.Finalize in the finally block.
    }
}

The compiler turns this syntax into a Finalize method in the IL code.

Finalize methods are called only after a garbage collection completes. The memory of these objects therefore cannot be reclaimed right away, because the Finalize method may run code that accesses the object's fields.

A finalizable object has to survive the collection that finds it, which promotes it to an older generation, so it lives longer than it otherwise would. This increases memory consumption, so avoid finalization where possible.

How finalization works internally

When a new object is created, memory is allocated for it on the heap. If the object's type defines a Finalize method, a pointer to the object is placed in the finalization list.

The finalization list is an internal data structure controlled by the garbage collector. Each entry points to an object whose Finalize method must be called before the object's memory is reclaimed.

Each time garbage is about to be collected, the finalization list is scanned after the mark phase completes. If it contains a reference to a garbage object, that reference is removed from the finalization list and appended to the freachable queue. (At this point the object is no longer considered garbage and its memory cannot be reclaimed. This is called resurrection.)

The freachable queue is also an internal data structure of the garbage collector. Every object referenced in the queue is ready to have its Finalize method called.

The CLR uses a special, high-priority dedicated thread to call Finalize methods, which avoids deadlocks.

While the freachable queue is empty, this thread sleeps. When entries appear, the thread wakes up, removes each entry from the queue, and calls the Finalize method of each object.

Next comes the compact phase, in which the resurrected objects are promoted to the next generation.

The finalizer thread then empties the freachable queue, executing each object's Finalize method.

At the next garbage collection, neither the finalization list nor the freachable queue points to these objects any more, so they are considered true garbage and are released.

Over the whole process, the memory is freed only after two garbage collections. In practice more than two may be needed, because the object may have been promoted to an older generation.
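The two-collection behaviour can be observed with a weak reference that tracks resurrection. This is a sketch under assumptions: the FinalizedCount counter and the helper names are hypothetical, and exact timing depends on the runtime and build settings, so treat what it prints as typical rather than guaranteed.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

public class Troy
{
    public static int FinalizedCount;          // hypothetical counter for the demo

    ~Troy() { Interlocked.Increment(ref FinalizedCount); }
}

public static class FinalizerDemo
{
    // Created in a separate, non-inlined method so that no local variable
    // in Run keeps the object reachable.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static WeakReference CreateGarbage()
    {
        // trackResurrection: true keeps the weak reference valid until the
        // object's memory is actually reclaimed, not just until it is finalizable.
        return new WeakReference(new Troy(), true);
    }

    // Returns whether the object's memory still existed after each collection.
    public static bool[] Run()
    {
        WeakReference weak = CreateGarbage();

        GC.Collect();                          // object moves to the freachable queue
        bool aliveAfterFirst = weak.IsAlive;

        GC.WaitForPendingFinalizers();         // finalizer thread runs Finalize
        GC.Collect();                          // now the memory can be reclaimed
        bool aliveAfterSecond = weak.IsAlive;

        return new[] { aliveAfterFirst, aliveAfterSecond };
    }

    public static void Main()
    {
        bool[] alive = Run();
        Console.WriteLine("alive after first collection: " + alive[0]);
        Console.WriteLine("alive after second collection: " + alive[1]);
        Console.WriteLine("finalizers run: " + Troy.FinalizedCount);
    }
}
```

Typically the object is still alive after the first collection, because it was resurrected for finalization, and is gone only after the second.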

Manually monitoring and controlling object lifetime

The CLR provides a GC handle table for each AppDomain, which lets an application monitor and manually control the lifetime of objects. There is a lot to this topic and I don't expect to need it soon, so I will come back to it when I do.

PS:

Progress on the last two chapters has been really slow. Partly that is because I did not read over the weekend and some unexpected things came up; partly it is because I have now reached the core mechanisms of the CLR, much of which I had never heard of before, so the difficulty is starting to rise.

Along the way my keyboard inexplicably broke, and half of this post was written across two forced shutdowns. Today I found out that after a forced shutdown and reboot, a half-written post in the browser can be recovered.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
