CLR Inside Out: Large Object Heap Uncovered

Reposted from: http://msdn.microsoft.com/zh-cn/magazine/cc534993.aspx (by Maoni Stephens)

Contents
The Large Object Heap and the GC
When Is a Large Object Collected?
LOH Performance Implications
Collecting Performance Data for the LOH
Using a Debugger

The CLR garbage collector (GC) divides objects into two categories: small and large. When an object is large, some of its attributes become more significant than they would be for a small object. For example, compacting it (copying its memory elsewhere on the heap) is expensive. In this month's column, I take an in-depth look at the large object heap. I discuss what qualifies an object as a large object, how large objects are collected, and what performance implications large objects have.

The Large Object Heap and the GC

In the Microsoft .NET Framework 1.1 and 2.0, an object that is greater than or equal to 85,000 bytes is considered a large object. This number was determined through performance tuning. When an object allocation request comes in and meets that size threshold, the object is allocated on the large object heap. What does this mean exactly? To understand it, it helps to first cover some fundamentals of the .NET GC.

As many of you know, the .NET garbage collector is a generational collector. It has three generations: generation 0, generation 1, and generation 2. The reason for having generations is that in a well-tuned application you can expect most objects to die in generation 0. For example, in a server application, the allocations associated with each request should die after the request is finished. Allocations for requests that are still in flight make it into generation 1 and die there. Essentially, generation 1 acts as a buffer between the young object area and the long-lived object area.

From a generation point of view, large objects belong to generation 2, because they are collected only during a generation 2 collection. When a generation is collected, all younger generations are collected as well. For example, a generation 1 GC collects both generation 1 and generation 0, and a generation 2 GC collects the whole heap. For this reason, a generation 2 GC is also called a full GC. In this column I use the term "generation 2 GC" rather than "full GC", but the two are interchangeable.

Generations provide a logical view of the GC heap. Physically, objects live in managed heap segments. A managed heap segment is a chunk of memory that the GC reserves from the operating system by calling VirtualAlloc.
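The size threshold and the generation view described above can be observed from managed code. The following is a minimal sketch, assuming a runtime that uses the default 85,000-byte threshold; the class and method names are hypothetical, while GC.GetGeneration, GC.KeepAlive, and GC.MaxGeneration are existing .NET APIs.

```csharp
using System;

public static class LohPlacementDemo
{
    // Allocates a byte array of the given length and returns the generation
    // the GC reports for it immediately after allocation.
    public static int GenerationOfNewArray(int lengthInBytes)
    {
        byte[] buffer = new byte[lengthInBytes];
        int generation = GC.GetGeneration(buffer);
        GC.KeepAlive(buffer); // keep the array reachable until after the query
        return generation;
    }
}
```

Calling LohPlacementDemo.GenerationOfNewArray(100000) should report GC.MaxGeneration, because the array is well above the threshold. A call with a small length such as 1000 normally reports generation 0, unless a collection happens to run between the allocation and the query.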
When the CLR is loaded, two initial heap segments are allocated, one for small objects and one for large objects. I refer to them as the small object heap (SOH) and the large object heap (LOH), respectively.

Allocation requests are then satisfied by placing managed objects on one of these managed heap segments. If the object is smaller than 85,000 bytes, it goes on an SOH segment; otherwise, it goes on an LOH segment. Segments are committed (in smaller chunks) as more and more objects are allocated onto them.

For the SOH, objects that survive a GC are promoted to the next generation: objects that survive a generation 0 collection are now considered generation 1 objects, and so on. Objects that survive a collection of the oldest generation, however, stay in the oldest generation. In other words, survivors from generation 2 remain generation 2 objects, and survivors from the LOH remain LOH objects (which are collected together with generation 2).

User code can only allocate in generation 0 (small objects) or on the LOH (large objects). Only the GC can "allocate" objects in generation 1 (by promoting survivors from generation 0) and in generation 2 (by promoting survivors from generations 1 and 2).

When a GC is triggered, the GC traces through the live objects and compacts them. For the LOH, however, because compaction is expensive, the CLR team chose to sweep instead: dead objects are put on a free list that can be reused later to satisfy large object allocation requests. Adjacent dead objects are merged into a single free object.

One important thing to keep in mind is that although the LOH is not compacted today, it might be in the future. So if you allocate large objects and want to make sure they do not move, you should still pin them.

Note that the figures that follow are for illustration only.
I use very few objects, just to show what happens on the heap. In reality there are many more objects.

Figure 1 illustrates a scenario in which generation 1 is formed after the first generation 0 GC, where Obj1 and Obj3 are dead, and generation 2 is formed after the first generation 1 GC, where Obj2 and Obj5 are dead.

Figure 1: SOH allocation and garbage collection

Figure 2 illustrates that after a generation 2 GC that found Obj1 and Obj2 dead, the memory they used to occupy became a single free space, which was then used to satisfy the allocation request for Obj4. The space after the last object, Obj3, up to the end of the segment can still be used for future allocation requests.

Figure 2: LOH allocation and garbage collection

If there is not enough free space to accommodate a large object allocation request, I first try to acquire more segments from the operating system. If that fails, I trigger a generation 2 GC in the hope of freeing some space.

During a generation 2 GC, I take the opportunity to release segments that contain no live objects back to the operating system (by calling VirtualFree). Memory from the end of the last live object to the end of the segment is decommitted. The free spaces remain committed, although they are reset, which means that the operating system does not need to write their data back to disk. Figure 3 illustrates a scenario in which I released one segment (segment 2) back to the operating system and decommitted more space on the remaining segments. If I need the decommitted space at the end of a segment to satisfy new large object allocation requests, I can commit the memory again.

Figure 3: Dead segments released on the LOH during garbage collection
For an explanation of commit and decommit, see the MSDN documentation for VirtualAlloc at go.microsoft.com/fwlink/?LinkId=116041.
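The sweep behavior described for the LOH in this section (dead objects become free space, adjacent free spaces are merged, and the merged space serves later allocations without moving any survivor) can be modeled with a small toy. This is only an illustrative sketch, not the CLR's implementation: ToyLohSegment and its members are hypothetical names, allocation is simple first-fit, and the unused tail of the segment is treated as just another free block.

```csharp
using System;
using System.Collections.Generic;

// Toy model of a non-compacting segment: blocks never move, dead blocks
// become free blocks, and adjacent free blocks are merged into one.
public sealed class ToyLohSegment
{
    private sealed class Block
    {
        public string Name;   // null means free space
        public int Size;
        public bool IsFree => Name == null;
    }

    private readonly List<Block> _blocks = new List<Block>();

    public ToyLohSegment(int capacity)
    {
        _blocks.Add(new Block { Name = null, Size = capacity });
    }

    // First-fit allocation from the free blocks; returns false if nothing fits.
    public bool Allocate(string name, int size)
    {
        for (int i = 0; i < _blocks.Count; i++)
        {
            Block b = _blocks[i];
            if (!b.IsFree || b.Size < size) continue;
            _blocks[i] = new Block { Name = name, Size = size };
            if (b.Size > size)
                _blocks.Insert(i + 1, new Block { Name = null, Size = b.Size - size });
            return true;
        }
        return false;
    }

    // Sweep: turn the named (dead) objects into free space, then merge neighbors.
    public void Sweep(params string[] deadNames)
    {
        var dead = new HashSet<string>(deadNames);
        foreach (Block b in _blocks)
            if (!b.IsFree && dead.Contains(b.Name)) b.Name = null;

        for (int i = _blocks.Count - 1; i > 0; i--)
        {
            if (_blocks[i].IsFree && _blocks[i - 1].IsFree)
            {
                _blocks[i - 1].Size += _blocks[i].Size;
                _blocks.RemoveAt(i);
            }
        }
    }

    // Current layout as text, for example "Obj1:100 free:300".
    public string Layout()
    {
        var parts = new List<string>();
        foreach (Block b in _blocks)
            parts.Add((b.IsFree ? "free" : b.Name) + ":" + b.Size);
        return string.Join(" ", parts);
    }
}
```

To replay the Figure 2 scenario with made-up sizes: create a segment of 400 units, allocate Obj1, Obj2, and Obj3 at 100 units each, sweep Obj1 and Obj2, and then allocate Obj4 at 150 units. Obj4 only fits because the two dead neighbors were merged into one 200-unit free block, and Obj3 stays where it was.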

When Is a Large Object Collected?

To determine when a large object is collected, let's first talk about when a GC happens in general. A GC occurs when one of the following three conditions is met:

Allocation exceeds the generation 0 or large object threshold. Most GCs happen because of allocations on the managed heap (this is the most typical case).

System.GC.Collect is called. If GC.Collect is called on generation 2 (either by passing no argument to GC.Collect or by passing GC.MaxGeneration as the argument), the LOH is collected right away along with the rest of the managed heap.

The system is low on memory. This happens when I receive the high memory notification from the operating system. If I think that doing a generation 2 GC will be productive, I trigger one.

Thresholds are properties of generations. As objects are allocated into a generation, its memory usage grows toward that generation's threshold. When the threshold is exceeded, a GC is triggered on that generation. So when you allocate small or large objects, you consume the generation 0 and LOH thresholds, respectively. When the GC allocates into generation 1 and generation 2, it consumes their thresholds. These thresholds are tuned dynamically as the program runs.
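The second trigger above, an explicit GC.Collect on generation 2, can be checked with a small experiment. This is a sketch under assumptions rather than a definitive test of the collector: the class and method names are hypothetical, the 200,000-byte size is an arbitrary value above the threshold, and the array is allocated in a separate non-inlined method so that no local variable keeps it alive.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class FullCollectionDemo
{
    // Allocates a large array and returns only a weak reference to it,
    // so nothing keeps the array alive once this method has returned.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static WeakReference AllocateUnreferencedLargeArray()
    {
        return new WeakReference(new byte[200000]);
    }

    // Runs the supplied collection request and reports whether the
    // unreferenced large array was reclaimed by it.
    public static bool IsReclaimedBy(Action collect)
    {
        if (collect == null) throw new ArgumentNullException(nameof(collect));
        WeakReference weak = AllocateUnreferencedLargeArray();
        collect();
        return !weak.IsAlive;
    }
}
```

Both FullCollectionDemo.IsReclaimedBy(() => GC.Collect()) and FullCollectionDemo.IsReclaimedBy(() => GC.Collect(GC.MaxGeneration)) request a generation 2 collection, so the large array is expected to be gone afterward. Passing a younger generation, such as GC.Collect(0), would not normally be expected to reclaim it.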

LOH Performance Implications

Let's first look at the allocation cost. The CLR guarantees that the memory for every new object I hand out is cleared. This means that the allocation cost of a large object is completely dominated by memory clearing (unless the allocation triggers a GC). If it takes two cycles to clear 1 byte, it takes 85,000 x 2 = 170,000 cycles to clear the smallest large object. It is not uncommon for people to allocate large objects that are a lot larger than that. For a 16 MB object on a 2 GHz machine, clearing the memory takes approximately 16 ms (about 33.5 million cycles at two cycles per byte). That is a rather large cost.

Now let's look at the collection cost. As mentioned earlier, the LOH and generation 2 are collected together. If the threshold of either one is exceeded, a generation 2 collection is triggered. If a generation 2 collection is triggered because of the LOH, generation 2 itself will not necessarily be much smaller after the GC. If there is not much data in generation 2, this is not a problem. But if generation 2 is big, triggering many generation 2 GCs can cause performance problems. If many large objects are allocated on a very temporary basis and you have a big SOH, you could spend too much time running GCs. On top of that, the allocation cost can really add up if you keep allocating and letting go of really large objects.

Very large objects on the LOH are usually arrays (it is very rare to have an instance object that is really large). If the elements of an array are rich in references, the array incurs a cost that is not present otherwise: if the elements do not contain any references, I do not need to go through the array at all. For example, if you use an array to store the nodes of a binary tree, one way to implement it is to have each node refer to its left and right nodes by the actual nodes:


    // Each node holds two object references that the GC must trace.
    class Node
    {
        Data d;
        Node left;
        Node right;
    }

    Node[] binary_tr = new Node[num_nodes];

If num_nodes is large, the GC has to go through at least two references per element. An alternative approach is to store the indexes of the left and right nodes:


    // Children are addressed by array index instead of by object reference.
    class Node
    {
        Data d;
        uint left_index;
        uint right_index;
    }

This way, you refer to the left node's data as binary_tr[left_index].d instead of left.d, and the GC does not have to look at any references for the left and right nodes.

Of the three collection reasons, the first two are usually hit more often than the third. So if you can allocate a pool of large objects and reuse them instead of allocating temporary ones, performance will be better. Yun Jin has an example of such a buffer pool in his blog entry (go.microsoft.com/fwlink/?LinkId=115870). Of course, you would want to make the buffer size larger.
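The pooling advice above can be sketched as follows. This is not the buffer pool from the blog entry mentioned in the text, just a minimal illustration of the idea under stated assumptions: LargeBufferPool is a hypothetical class, all buffers in a pool share one fixed size, and ConcurrentBag (available since .NET Framework 4) is used for thread safety.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical fixed-size buffer pool: large arrays are handed out and taken
// back instead of being allocated and abandoned on every use.
public sealed class LargeBufferPool
{
    private readonly ConcurrentBag<byte[]> _free = new ConcurrentBag<byte[]>();
    private readonly int _bufferSize;

    public LargeBufferPool(int bufferSize)
    {
        if (bufferSize <= 0) throw new ArgumentOutOfRangeException(nameof(bufferSize));
        _bufferSize = bufferSize;
    }

    // Reuses a pooled buffer when one is available; allocates otherwise.
    // A reused buffer still holds whatever its previous user wrote into it.
    public byte[] Rent()
    {
        return _free.TryTake(out byte[] buffer) ? buffer : new byte[_bufferSize];
    }

    // Gives a buffer back to the pool; buffers of the wrong size are rejected.
    public void Return(byte[] buffer)
    {
        if (buffer == null) throw new ArgumentNullException(nameof(buffer));
        if (buffer.Length != _bufferSize)
            throw new ArgumentException("Buffer does not belong to this pool.", nameof(buffer));
        _free.Add(buffer);
    }
}
```

A caller would create one pool, for example new LargeBufferPool(1 << 20), and wrap each use in Rent and Return. The trade-off is that pooled buffers stay alive for as long as the pool does, and a rented buffer is not cleared, so callers must not assume it contains zeros.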

Collecting Performance Data for the LOH

There are a few ways to collect performance data that is relevant to the LOH. Before I explain them, let's talk about why you would need to collect it.

Before you start collecting performance data for a specific area, ideally you have already found evidence that points to that area, or you have exhausted other known areas without finding anything that explains the performance problem you are seeing. For more background, see my blog entry at go.microsoft.com/fwlink/?LinkId=116467, in which I cover memory and CPU fundamentals. In addition, the November 2006 installment of CLR Inside Out, on investigating memory issues, describes the steps involved in diagnosing performance problems that may be related to the managed heap (see msdn2.microsoft.com/magazine/cc163528).

The .NET CLR Memory performance counters are usually a good first step in investigating performance issues. The counters relevant to the LOH are # Gen 2 Collections and Large Object Heap size.

# Gen 2 Collections displays the number of generation 2 GCs that have occurred since the process started. The counter is incremented at the end of a generation 2 garbage collection (also called a full garbage collection). This counter displays the last observed value.

Large Object Heap size displays the current size of the LOH in bytes, including free space. This counter is updated at the end of a garbage collection, not at each allocation.

A common way to look at performance counters is with Performance Monitor (PerfMon.exe). You can use Add Counters to add the counters you are interested in for the processes you care about, as shown in Figure 4.

Figure 4: Adding counters in Performance Monitor

You can save performance counter data to a log file in Performance Monitor. Performance counters can also be queried programmatically, and many people collect them this way as part of their routine testing. When they spot counters with values that are out of the ordinary, they use other means to get more detailed data for the investigation.
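For routine testing inside a single process, a rough in-process stand-in for the # Gen 2 Collections counter is GC.CollectionCount(2), sampled before and after a workload. The sketch below assumes that approach; it is not the performance counter API itself, and Gen2Probe, CountGen2During, and ChurnLargeArrays are hypothetical names.

```csharp
using System;

public static class Gen2Probe
{
    // Runs the workload and returns the number of generation 2 collections
    // that were counted while it ran.
    public static int CountGen2During(Action workload)
    {
        if (workload == null) throw new ArgumentNullException(nameof(workload));
        int before = GC.CollectionCount(2);
        workload();
        return GC.CollectionCount(2) - before;
    }

    // Example workload: allocates temporary large arrays and drops them at once.
    public static void ChurnLargeArrays(int count, int sizeInBytes)
    {
        for (int i = 0; i < count; i++)
        {
            byte[] temp = new byte[sizeInBytes];
            temp[0] = 1; // write to the array so that it is actually used
        }
    }
}
```

A test could call Gen2Probe.CountGen2During(() => Gen2Probe.ChurnLargeArrays(1000, 1000000)) and flag the run when the result is unexpectedly high. How many collections a given workload causes depends on the runtime's dynamically tuned thresholds, so any limit used in such a check has to be established empirically.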

Using a Debugger

Before starting, note that the debugging commands I mention in this section apply to the Windows debuggers. To look at which objects are actually on the LOH, you can use the SoS debugger extension provided by the CLR, which was explained in the November 2006 installment mentioned earlier. Figure 5 shows example output from analyzing the LOH.

The large object heap part of the output in Figure 5 shows that the LOH size is (16,754,224 + 16,699,288 + 16,284,504 =) 49,738,016 bytes. Between addresses 023e1000 and 033db630, System.Object[] objects occupy 8,008,736 bytes, System.Byte[] objects occupy 6,663,696 bytes, and free space occupies 2,081,792 bytes. Together these account for the 16,754,224 bytes of the first LOH segment.

Figure 5 LOH output


    0:003> .loadby sos mscorwks
    0:003> !eeheap -gc
    Number of GC Heaps: 1
    generation 0 starts at 0x013e35ec
    generation 1 starts at 0x013e1b6c
    generation 2 starts at 0x013e1000
    ephemeral segment allocation context: none
     segment    begin allocated     size
    0018f2d0 790d5588  790f4b38 0x0001f5b0(128432)
    013e0000 013e1000  013e35f8 0x000025f8(9720)
    Large object heap starts at 0x023e1000
     segment    begin allocated     size
    023e0000 023e1000  033db630 0x00ffa630(16754224)
    033e0000 033e1000  043cdf98 0x00fecf98(16699288)
    043e0000 043e1000  05368b58 0x00f87b58(16284504)
    Total Size  0x2f90cc8(49876168)
    ------------------------------
    GC Heap Size  0x2f90cc8(49876168)

    0:003> !dumpheap -stat 023e1000  033db630
    total 133 objects
    Statistics:
          MT    Count    TotalSize Class Name
    001521d0       66      2081792      Free
    7912273c       63      6663696 System.Byte[]
    7912254c        4      8008736 System.Object[]
    Total 133 objects

Sometimes you will see that the total size of the LOH is less than 85,000 bytes. Why is that? It is because the runtime itself uses the LOH to allocate some objects that are smaller than a large object.

Since the LOH is not compacted, people sometimes suspect that the LOH is a source of fragmentation. Before drawing that conclusion, it is worth clarifying what fragmentation means. There is fragmentation of the managed heap, which is indicated by the amount of free space between managed objects (in other words, what you see when you run !dumpheap -type Free in SoS). There is also fragmentation of the virtual memory (VM) address space, which is the memory marked as MEM_FREE and which you can see by using various debugger commands in windbg (see go.microsoft.com/fwlink/?LinkId=116470). Figure 6 shows fragmentation in the VM space.

Figure 6 VM space fragmentation


    0:000> !address
        00000000 : 00000000 - 00010000
                        Type     00000000
                        Protect  00000001 PAGE_NOACCESS
                        State    00010000 MEM_FREE
                        Usage    RegionUsageFree
        00010000 : 00010000 - 00002000
                        Type     00020000 MEM_PRIVATE
                        Protect  00000004 PAGE_READWRITE
                        State    00001000 MEM_COMMIT
                        Usage    RegionUsageEnvironmentBlock
        00012000 : 00012000 - 0000e000
                        Type     00000000
                        Protect  00000001 PAGE_NOACCESS
                        State    00010000 MEM_FREE
                        Usage    RegionUsageFree
    ... [omitted]
    -------------------- Usage SUMMARY --------------------------
        TotSize (      KB)   Pct(Tots) Pct(Busy)   Usage
         701000 (    7172) : 00.34%    20.69%    : RegionUsageIsVAD
       7de15000 ( 2062420) : 98.35%    00.00%    : RegionUsageFree
        1452000 (   20808) : 00.99%    60.02%    : RegionUsageImage
         300000 (    3072) : 00.15%    08.86%    : RegionUsageStack
           3000 (      12) : 00.00%    00.03%    : RegionUsageTeb
         381000 (    3588) : 00.17%    10.35%    : RegionUsageHeap
              0 (       0) : 00.00%    00.00%    : RegionUsagePageHeap
           1000 (       4) : 00.00%    00.01%    : RegionUsagePeb
           1000 (       4) : 00.00%    00.01%    : RegionUsageProcessParametrs
           2000 (       8) : 00.00%    00.02%    : RegionUsageEnvironmentBlock
           Tot: 7fff0000 (2097088 KB) Busy: 021db000 (34668 KB)

    -------------------- Type SUMMARY --------------------------
        TotSize (      KB)   Pct(Tots)  Usage
       7de15000 ( 2062420) : 98.35%   : <free>
        1452000 (   20808) : 00.99%   : MEM_IMAGE
         69f000 (    6780) : 00.32%   : MEM_MAPPED
         6ea000 (    7080) : 00.34%   : MEM_PRIVATE

    -------------------- State SUMMARY --------------------------
        TotSize (      KB)   Pct(Tots)  Usage
        1a58000 (   26976) : 01.29%   : MEM_COMMIT
       7de15000 ( 2062420) : 98.35%   : MEM_FREE
         783000 (    7692) : 00.37%   : MEM_RESERVE

    Largest free region: Base 01432000 - Size 707ee000 (1843128 KB)

As I mentioned earlier, the free space that makes up fragmentation on the managed heap is used to satisfy allocation requests. It is more common to see VM fragmentation caused by temporary large objects, which require the garbage collector to frequently acquire new managed heap segments from the operating system and release empty ones back to it. To verify whether the LOH is causing VM fragmentation, you can set breakpoints on VirtualAlloc and VirtualFree to see who calls them. For example, to see who tried to allocate virtual memory chunks larger than 8 MB from the operating system, you can set a breakpoint like this:


bp kernel32!virtualalloc "j (dwo(@esp+8)>800000) 'kb';'g'"

This tells the debugger to break in and show the call stack if VirtualAlloc is called with an allocation size greater than 8 MB (0x800000), and to continue without breaking otherwise. (In a 32-bit process, @esp+8 is where the size argument of VirtualAlloc sits on entry to the function.)

In CLR 2.0 we added a feature called VM Hoarding, which may be applicable if segments (including those on the large and small object heaps) are frequently acquired and released. To enable VM Hoarding, you specify a startup flag named STARTUP_HOARD_GC_VM through the hosting API (see go.microsoft.com/fwlink/?LinkId=116471). When this flag is specified, empty segments are not released back to the operating system. Instead, the memory on these segments is decommitted and the segments are put on a standby list. Segments on the standby list are used later to satisfy new segment requests, so the next time I need a new segment, I use one from the standby list if I can find one that is big enough.

Note that this is not done for segments that are too large. The feature is also useful for applications that want to hold on to the segments they have already acquired, such as some server applications that are the dominant applications on the machine and want to avoid fragmenting the VM space as much as possible in order to prevent out-of-memory errors. I strongly recommend that you test your application carefully when you use this feature, to make sure its memory usage is fairly stable.

Large objects are expensive. The allocation cost is high because the CLR has to clear the memory of a newly allocated large object in order to honor its guarantee that all newly allocated objects are cleared. The LOH is collected together with the rest of the heap, so analyze carefully how that affects the performance of your scenario. If possible, reuse large objects to avoid fragmentation on the managed heap and in the VM space.

Finally, as of now the LOH is not compacted as part of a collection, but that is an implementation detail that should not be relied on. To make sure something is not moved by the GC, always pin it. Now take your newfound LOH knowledge and go take control of the heap.
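One way to follow the pinning advice is with a pinned GCHandle. The sketch below is a minimal illustration and not the only way to pin (the C# fixed statement is another): PinningDemo and WithPinned are hypothetical names, while GCHandle.Alloc with GCHandleType.Pinned, AddrOfPinnedObject, and Free are existing .NET APIs.

```csharp
using System;
using System.Runtime.InteropServices;

public static class PinningDemo
{
    // Pins the buffer for the duration of the callback and passes its address.
    // The address must not be used after the callback returns.
    public static void WithPinned(byte[] buffer, Action<IntPtr> useAddress)
    {
        if (buffer == null) throw new ArgumentNullException(nameof(buffer));
        if (useAddress == null) throw new ArgumentNullException(nameof(useAddress));

        GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            useAddress(handle.AddrOfPinnedObject());
        }
        finally
        {
            handle.Free(); // unpin as soon as the address is no longer needed
        }
    }
}
```

A typical caller passes the address to native code inside the callback, for example PinningDemo.WithPinned(buffer, p => Marshal.WriteByte(p, 0, 42)). Keeping the pinned region short matters because pinned objects restrict what the GC can move.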

Please send your questions and comments to clrinout@microsoft.com.
