A Complete Introduction to the CLR: Investigating Memory Problems

Contents

Tools used
GC performance counters
Windows performance counters
Confirming OOM exceptions in a managed process
Determining the cause of an OOM exception
Measuring the size of the managed heap
What if objects are being retained?
Does fragmentation on the managed heap cause problems?
Time spent on garbage collection
Investigating high CPU usage

Memory problems in managed applications can be difficult to discover and correct, and they manifest in a variety of forms. For example, you may watch your application's memory usage grow until an out-of-memory (OOM) exception is thrown (your application can even throw OOM exceptions when plenty of physical memory is available). Any of the following conditions can indicate a memory problem:

  • An OutOfMemoryException is thrown.
  • The process is using too much memory and you cannot determine an obvious reason.
  • Garbage collection does not seem to reclaim objects fast enough.
  • The managed heap is excessively fragmented.
  • The application's CPU usage is excessive.

 

This topic walks through the investigation process and shows you how to collect the data you need to determine what kind of memory problem your application has. It does not cover how to actually fix your problems, but it gives you insight into their root causes.

First, we briefly introduce the performance counters that are most useful for investigating managed memory problems. Next we cover the tools commonly used during an investigation, and then we walk through a series of common managed memory problems and how to investigate them.

Before you start, however, you should familiarize yourself with some basic concepts:

  • How the garbage collector works in the Microsoft .NET Framework. For more information, see blogs.msdn.com/156626.aspx and blogs.msdn.com/234273.aspx.
  • How virtual memory works in Windows, including the concepts of reserving and committing memory.
  • How to use the Windows debuggers (WinDbg and CDB).

 

Tools used

Before getting started, let's spend some time discussing the tools that are commonly used to diagnose memory-related problems.

Generally, you will want to look at performance counters first. With them you can collect the data needed to determine exactly where a problem lies. Although other tools are worth mentioning, the most useful counters are the .NET Framework memory performance counters.

Here we will use WinDbg, which ships with the Debugging Tools for Windows. The Son of Strike extension (SOS), provided in SOS.dll, is used to debug managed code in WinDbg. After you start the debugger and attach it to a managed process (or load a crash dump), you can load SOS.dll by typing:

.loadby sos mscorwks

If the application you are debugging uses a different version of mscorwks.dll, this command will fail. In that case, locate the SOS.dll that matches the mscorwks.dll the application uses, and run the following command:

.load <path_to_sos>\sos.dll

SOS.dll is installed with the .NET Framework in the %windir%\Microsoft.NET\Framework\<.NET version> directory. The SOS.dll extension provides a large number of useful commands for examining the managed heap. For documentation on all of these commands, see "SOS Debugging Extension (SOS.dll)".

 

Windows Task Manager (taskmgr.exe) is handy for spotting higher-than-expected memory usage and for watching the trend of a few simple process metrics over time. Note that Task Manager has two frequently misunderstood columns: Mem Usage and VM Size. Mem Usage represents the process working set (like the Process\Working Set performance counter); it does not represent committed bytes. VM Size reflects the bytes committed by the process (like the Process\Private Bytes performance counter). VM Size gives the first clue as to whether you are facing a memory leak: if your application leaks, VM Size grows over time.
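For reference, the same two numbers are also exposed programmatically through the System.Diagnostics.Process class, which can be handy for logging them from inside the application. A minimal sketch (the mapping to Task Manager's columns is as described above):

using System;
using System.Diagnostics;

class TaskMgrMetrics
{
    static void Main()
    {
        Process p = Process.GetCurrentProcess();
        // Working set: what Task Manager shows as Mem Usage.
        Console.WriteLine("Working set:   {0:N0} bytes", p.WorkingSet64);
        // Private bytes: what Task Manager shows as VM Size.
        Console.WriteLine("Private bytes: {0:N0} bytes", p.PrivateMemorySize64);
    }
}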

Most of the investigation techniques described in this column depend on memory dumps. There are two ways to use the debugger: you can attach it to a running process, or use it to analyze a crash dump. The first method gives a direct view of what the application is doing at run time, but this technique is not always feasible.

A memory dump has the advantage of separating the data collection phase from the actual problem analysis. If you need to diagnose a problem on a production server, it is easier to analyze a memory dump offline on a different machine.

The .dump /ma dmpfile.dmp command in the debugger creates a full memory dump. Make sure you always capture a full dump when investigating memory problems, because a minidump does not contain all the information you need.

The ADPlus tool (included in the Debugging Tools for Windows) is very helpful for collecting crash dumps. For more information, see John Robbins's Bugslayer column from March 2005.

In this column, we assume that either a dump file has been loaded in the debugger (a crash dump can be loaded with the File | Open Crash Dump command), or the debugger has been attached to the process with execution stopped at a breakpoint.

GC performance counters

The first step of any investigation is collecting relevant data and forming a hypothesis about where the problem might be. Performance counters are usually the place to start. Through the .NET Framework performance counters you get useful information about the garbage collector (GC) and the collection process. Note that the .NET memory counters are updated only when a collection occurs; they are not updated on the sampling schedule used by the performance monitoring application.

You should first check % Time in GC. It indicates the percentage of elapsed time spent in garbage collection since the last collection ended. If you find this value very high (say, 50% or more), you should look at what is happening inside the managed heap. If % Time in GC does not exceed 10%, it is usually not worth trying to reduce the time the GC spends in collections, because the benefit would be minimal.
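If you prefer to read the counters programmatically rather than in perfmon, the sketch below samples % Time in GC and the allocation-rate counter discussed next. The category and counter names are the actual .NET CLR Memory counters; "MyApp" is a placeholder for your process instance name.

using System;
using System.Diagnostics;
using System.Threading;

class GcCounterSample
{
    static void Main()
    {
        var timeInGc = new PerformanceCounter(".NET CLR Memory", "% Time in GC", "MyApp");
        var allocRate = new PerformanceCounter(".NET CLR Memory", "Allocated Bytes/sec", "MyApp");

        // The first NextValue call only primes a counter; sample again later.
        timeInGc.NextValue();
        allocRate.NextValue();
        Thread.Sleep(1000);

        Console.WriteLine("% Time in GC:        {0:F1}", timeInGc.NextValue());
        Console.WriteLine("Allocated Bytes/sec: {0:F0}", allocRate.NextValue());
    }
}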

If you think your application is spending too much time in garbage collection, the next counter to check is Allocated Bytes/sec, which shows the allocation rate. This counter is not very accurate when the allocation rate is very low: it may display 0 bytes/sec if the sampling frequency is higher than the collection frequency, because the counter is updated only at the beginning of each collection. This does not mean no allocation occurred; the counter simply was not updated because no collection happened during that interval. Since the time spent in garbage collection is an important factor, we will look at % Time in GC more closely later.

If you suspect you are allocating many large objects (85,000 bytes or larger), check the Large Object Heap (LOH) size counter. It is updated at the same time as Allocated Bytes/sec.

A high allocation rate can cause a lot of collection work, so % Time in GC may be relatively high. A mitigating factor is whether objects typically die young, because they will then usually be collected in a gen 0 collection. To determine how object lifetimes affect collections, check the counters for the collection counts of each generation: # Gen 0 Collections, # Gen 1 Collections, and # Gen 2 Collections. They show the number of times each generation has been collected since the process started. Gen 0 and gen 1 collections are usually cheap, so they do not have a big impact on the application's performance. Gen 2 collections, however, can be very expensive.

A rule of thumb is that a healthy ratio between collections of different generations is one gen 2 collection for every ten gen 1 collections. If you find that a lot of time is being spent in garbage collection, it may be because gen 2 collections are happening too often. Check this ratio to make sure the number of gen 2 collections relative to gen 1 collections is not too high.
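You can also check the collection counts from inside the process. The sketch below uses GC.CollectionCount to compute the gen 1 to gen 2 ratio mentioned above; it is an in-process convenience, not a replacement for the performance counters.

using System;

class CollectionRatio
{
    static void Main()
    {
        // ... run the interesting part of the workload first ...

        int gen0 = GC.CollectionCount(0);
        int gen1 = GC.CollectionCount(1);
        int gen2 = GC.CollectionCount(2);

        Console.WriteLine("Gen 0: {0}  Gen 1: {1}  Gen 2: {2}", gen0, gen1, gen2);

        // Rule of thumb from above: about ten gen 1 collections per gen 2.
        if (gen2 > 0 && (double)gen1 / gen2 < 10.0)
            Console.WriteLine("Warning: gen 2 collections are unusually frequent.");
    }
}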

You may find that % Time in GC is very high even though the allocation rate is not. This happens when many of the objects you allocate survive garbage collection and are promoted to the next generation. The promotion counters, Promoted Memory from Gen 0 and Promoted Memory from Gen 1, tell you whether the promotion rate might be a problem. You especially want to avoid a high promotion rate out of gen 1: it means you have many objects that live long enough to be promoted to gen 2, but not long enough to justify staying there. Once promoted to gen 2, those objects are more expensive to collect than if they had died in gen 1. (This phenomenon is called the midlife crisis. For more information, see blogs.msdn.com/41281.aspx.) CLR Profiler can help you find out which objects live too long.

Large gen 1 and gen 2 heap sizes usually go along with high values in the promotion rate counters. You can check the GC heap sizes using the Gen 1 heap size and Gen 2 heap size counters. There is a Gen 0 heap size counter, but it does not measure the size of gen 0; it represents the gen 0 budget, that is, the number of bytes you can allocate in gen 0 before the next gen 0 collection is triggered.

If you use many objects that require finalization, for example objects that rely on COM components for some of their processing, take a look at the Promoted Finalization-Memory from Gen 0 counter. It tells you how much memory cannot be reclaimed because the objects using it must first go onto the finalization queue and therefore cannot be collected immediately. IDisposable and the using statement in C# and Visual Basic can help reduce the number of objects that end up in the finalization queue, and with it the associated cost.
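The sketch below shows the pattern: a type wrapping a resource implements IDisposable, and Dispose calls GC.SuppressFinalize so a properly disposed instance never enters the finalization queue. ResourceHolder is a hypothetical type for illustration.

using System;

class ResourceHolder : IDisposable
{
    bool disposed;

    public void Dispose()
    {
        if (!disposed)
        {
            // Release the underlying resource (COM object, handle, and so on) here.
            disposed = true;
            // The object no longer needs finalization, so it can be
            // collected immediately instead of waiting in the queue.
            GC.SuppressFinalize(this);
        }
    }

    // Fallback only; runs when Dispose was never called.
    ~ResourceHolder() { }
}

class Program
{
    static void Main()
    {
        using (ResourceHolder r = new ResourceHolder())
        {
            // Use the resource.
        } // Dispose runs here, so no finalization cost is incurred.
    }
}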

Use # Total committed Bytes and # Total reserved Bytes to find more detail about the heap size. These counters indicate the total memory currently committed and reserved for the GC heap. (The total committed value is slightly larger than Gen 0 heap size + Gen 1 heap size + Gen 2 heap size + Large Object Heap size.) When the GC allocates a new heap segment, memory is reserved for the segment and committed only when needed. Therefore, the total reserved bytes can be larger than the total committed bytes.

Also check whether the application is inducing too many collections. The # Induced GC counter tells you how many collections have been induced since the process started. In general, inducing GC collections is not recommended; in most cases, a high # Induced GC value should be treated as a bug. Usually collections are induced in an attempt to shrink the heap, but that is rarely the right approach; instead, you should find out why your heap is growing.
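Each explicit call to GC.Collect counts as an induced collection, so when # Induced GC is high, searching the code base (and any libraries you load) for calls like the ones in this small sketch is often the quickest explanation:

using System;

class InducedGcDemo
{
    static void Main()
    {
        GC.Collect();  // induces a full collection; increments # Induced GC
        GC.Collect(0); // induces a gen 0 collection; also counted
    }
}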

Windows performance counters

So far, we have looked at some of the most useful .NET memory counters. However, you should not overlook the value of other counters. There are many Windows performance counters (also viewable through perfmon.exe) that provide useful information for investigating memory problems.

The Available Bytes counter in the Memory category reports the available physical memory. It gives a clear indication of whether physical memory is running low. If the machine is low on physical memory, paging is either happening or about to happen. This data is very useful for diagnosing OOM problems.

The % Committed Bytes in Use counter (also in the Memory category) gives the ratio of committed memory to the commit limit. If this value is very high (say, above 90%), you should start watching for commit failures; this clearly indicates that the system is under memory pressure.

The Private Bytes counter in the Process category indicates the amount of memory in use that cannot be shared with other processes. Monitor this counter if you want to know how much memory your process uses. If you have a memory leak, Private Bytes will grow over time. The counter also clearly shows your application's impact on the whole system: using many private bytes hits the machine hard, because the memory cannot be shared with other processes. This matters in scenarios such as Terminal Services, where you need to maximize the amount of memory shared across user sessions.
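A simple way to watch this trend is to sample the counter over time; a value that keeps climbing without leveling off is the classic leak signature. In this sketch, "MyApp" is again a placeholder instance name.

using System;
using System.Diagnostics;
using System.Threading;

class PrivateBytesTrend
{
    static void Main()
    {
        var privateBytes = new PerformanceCounter("Process", "Private Bytes", "MyApp");
        for (int i = 0; i < 12; i++)
        {
            Console.WriteLine("{0:T}  Private Bytes: {1:N0}",
                DateTime.Now, privateBytes.NextValue());
            Thread.Sleep(5000);
        }
    }
}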

Confirming OOM exceptions in a managed process

Performance counters can indicate clearly whether you are facing memory problems. In most cases, however, a memory problem is only noticed when the application throws an out-of-memory exception. So you first need to find out whether you are actually seeing an OOM exception thrown by managed code.

After you load SOS.dll, type the following command in the debugger:

!pe

This is the abbreviation of !PrintException. Without arguments, it prints the last managed exception on the current thread, if there is one. Figure 1 shows an example of a managed OOM exception.

 

If there is no managed exception on the current thread, the OOM may have come from another thread. To find it, type the following in the debugger:

~*kb

Here, kb is the abbreviation of "display stack backtrace", and ~* runs it on every thread, listing all threads and their call stacks (see Figure 2). In the output, look for the thread whose stack contains a call that raises an exception. The easiest way is to search for mscorwks!RaiseTheException.

 

The argument to the RaiseTheException function in mscorwks is the managed exception object, so you can dump it with !pe. In addition, !pe has a -nested option to dump nested exceptions along with the top-level one.

Another way to find the thread that raised the OOM is the SOS !threads command. The last column of the table it displays contains the most recent managed exception raised on each thread.

If these techniques do not turn up an OOM exception, the OOM is not managed: the exception you are facing was raised by native code. In that case, you need to look at the native code your application uses (a discussion that is beyond the scope of this column).

Determining the cause of an OOM exception

After you confirm an OOM exception, investigate what caused it. A managed OOM happens in one of two situations: either the process has run out of virtual memory, or there is not enough physical memory to commit.

The GC needs to allocate memory for its segments. When the GC decides it needs a new segment, it calls VirtualAlloc to reserve the space. If there is no contiguous free block large enough for the segment, the call fails and the GC cannot satisfy the new memory request.

In the debugger, the !address command displays the largest free region of virtual memory. The output will look similar to this:

0:119> !address -summary
... [omitted]
Largest free region: Base 54000000 - Size 03b60000
0:119> ? 03b60000
Evaluate expression: 62259200 = 03b60000

If the largest free virtual-memory block is smaller than 64 MB on a 32-bit operating system (smaller than 1 GB on 64-bit), the OOM may have been caused by running out of virtual memory. (On a 64-bit operating system it is very unlikely that an application will exhaust the virtual address space.)

 

A process can run out of virtual address space when the virtual memory becomes too fragmented. Managed heaps do not usually fragment virtual memory, but it can happen; for example, an application that creates many temporary large objects can cause the LOH to repeatedly acquire and release virtual-memory segments.
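The sketch below shows a hypothetical allocation pattern of this kind: short-lived objects of 85,000 bytes or more go straight to the LOH, and a steady stream of them can make the LOH acquire and release segments repeatedly.

using System;

class LohChurn
{
    static void Main()
    {
        for (int i = 0; i < 1000; i++)
        {
            // 200 KB is above the 85,000-byte threshold, so this array
            // is allocated on the large object heap.
            byte[] temp = new byte[200 * 1024];
            Consume(temp); // placeholder for real work
        } // each iteration's array becomes garbage immediately
    }

    static void Consume(byte[] data) { }
}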

The !eeheap -gc SOS command displays the starting address of each garbage collection segment. You can correlate this with the !address output to determine whether the virtual memory fragmentation is caused by the managed heap.

The following are other common scenarios that can fragment virtual memory:

  • Frequently loading and unloading many small assemblies.
  • Loading many COM DLLs because of COM interop.
  • Assemblies and COM DLLs being loaded at addresses interleaved with the managed heap segments. A common cause of this is ASP.NET sites compiled with the debug configuration flag enabled: every page is then compiled into its own assembly, which can fragment the virtual address space enough to cause OOM problems.

 

Reserving memory does not require the operating system to provide physical memory; physical memory is needed only when the GC commits memory. If the system is running very low on physical memory, OOM exceptions occur. A simple way to check whether physical memory is too low is to open Windows Task Manager and look at the Commit Charge area on the Performance tab.

Figure 3 shows a system with a total commit charge of 1,981,304 KB against a commit limit of 2,518,760 KB. When the total committed memory approaches the limit, the system is running out of memory to commit.


Figure 3 Viewing available memory in Task Manager

The GC does not commit an entire segment at once; it commits the segment in chunks as needed. (Note: the number of bytes committed for the managed heap is given by # Total committed Bytes, not by # Bytes in all Heaps, because the gen 0 size included in # Bytes in all Heaps is not the memory actually used by gen 0 but its budget.)

You can use a user-mode profiler (such as CLR Profiler) to learn which objects are causing such high memory usage. In some cases, though, the overhead of running a profiler is unacceptable, for example when the problem must be debugged on a production server. In those cases, an alternative is to take a memory dump and analyze it with the debugger. Next we look at how to analyze the managed heap with the debugger.

Measuring the size of the managed heap

The first thing to decide when measuring the managed heap size is when to measure it: before, after, or during a garbage collection? The best time is always at the end of a gen 2 collection, because a gen 2 collection collects the entire heap.

To look at the objects left at the end of a gen 2 collection, you can set the following breakpoint in the debugger (for server garbage collection, replace WKS with SVR):

bp mscorwks!WKS::GCHeap::RestartEE "j (dwo(mscorwks!WKS::GCHeap::GcCondemnedGeneration)==2) 'kb'; 'g'"

 

Now the debugger will stop at the end of every gen 2 collection. The next step is to look at the objects on the managed heap; these objects survived the collection, and you want to find out why they are still alive.

The !dumpheap -stat command dumps statistics for every object on the managed heap. (So if the heap is large, the !dumpheap command can take a while to complete.) The list !dumpheap produces is sorted by type and by memory used, which means you can concentrate on the last few rows, since those types occupy the most space.

In the example in Figure 4, strings occupy most of the space. If strings are the root of the problem, the fix is often easy, because the content of a string usually reveals where it comes from.

You can also look at strings in buckets. For example, you can examine all strings between 150 and 200 bytes in size, as shown in Figure 5. In this example, a large number of strings are very similar. Rather than keeping so many of them, it would be more efficient to separate their common part ("pendingorder-") from the numbers.

We have often seen managed heaps that contain the same string repeated thousands of times, resulting in a large working set consumed by strings. In such cases, it is often better to use string interning.
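The sketch below shows what interning looks like: String.Intern returns the single pooled instance for a given value, so a string that occurs thousands of times is stored only once. Keep in mind that interned strings live for the rest of the process, so intern only values you expect to recur.

using System;

class InternSample
{
    static void Main()
    {
        string a = string.Intern(ReadValue());
        string b = string.Intern(ReadValue());

        // Equal values now share one instance on the heap.
        Console.WriteLine(object.ReferenceEquals(a, b)); // True
    }

    // Placeholder for whatever produces the repeated strings in your app;
    // each call deliberately builds a distinct (but equal) string.
    static string ReadValue()
    {
        return new string('x', 20);
    }
}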

For other types whose origin is not as obvious as strings, you can use !gcroot to see why those objects are alive (see Figure 6). Note: if the object graph is very large, the !gcroot command can take a long time to run.

Besides the objects that survive on the managed heap, the memory committed for your process also includes memory allocated in gen 0. If gen 0 is allowed to grow large before the next garbage collection, you will see memory usage rise for that reason as well; this is more common on 64-bit Windows than on 32-bit. The !eeheap -gc SOS command displays the size of gen 0.
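If attaching a debugger is not an option, a rough in-process approximation is available: GC.GetTotalMemory(true) forces a full collection and waits for it, so the number it returns reflects surviving objects only. This is a sketch, and forcing a collection is not free, so do not call it on a hot path.

using System;

class HeapSizeProbe
{
    static void Main()
    {
        // true = collect first, so the result approximates the live heap
        // at the end of a full collection.
        long liveBytes = GC.GetTotalMemory(true);
        Console.WriteLine("Live managed heap: {0:N0} bytes", liveBytes);
    }
}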

What if objects are being retained?

Sometimes developers believe that certain of their objects should be dead, yet the GC does not seem to clean them up. The most common causes are:

  • Strong references to the objects still exist.
  • The objects were not dead the last time their generation was collected.
  • The objects are dead, but a collection of their generation has not been triggered yet.

 

For the first and second cases, you can use !gcroot to check whether a strong reference is keeping the object alive. One possibility people often overlook is objects being retained because they are still in the finalization queue while the finalizer thread is blocked. This happens when the finalizer thread makes a call into a single-threaded apartment (STA) thread that is not pumping messages, so the finalizers never get a chance to run (for more information, see support.microsoft.com/kb/828988). You can add the following code to check whether that is the problem:

GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();

This code works around the problem because WaitForPendingFinalizers pumps messages. However, once the problem is confirmed, you should use Thread.Join instead, because WaitForPendingFinalizers is very heavyweight.

 

You can also run the following SOS command to check for the problem:

!finalizequeue

Look at the number of objects that are ready for finalization, not the number of finalizable objects. When the finalizer thread is blocked, the ready-for-finalization count keeps growing, and the finalizer thread's stack shows which finalizer, if any, is currently running. (See the finalization queue example in Figure 7.)

 

An easy way to find the finalizer thread is to look at the !threads -special output. The stack in Figure 8 shows the finalizer thread in its usual state: waiting for an event that signals there are finalizers to run. When a finalizer is blocked, you will see that finalizer running on this thread instead.

The third cause is usually not a problem: unless you induce garbage collections yourself, a collection is triggered only when the GC considers it productive. This means an object may already be dead, yet the memory it occupies is not reclaimed immediately. The GC does become more aggressive, however, when the system is low on physical memory.

Does fragmentation on the managed heap cause problems?

Fragmentation is an important thing to check when investigating memory problems, because it tells you how much space is being wasted on the managed heap. The amount of fragmentation in the managed heap is indicated by the space taken up by free objects. You can use the !dumpheap command to find out how much free space there is on the managed heap, as follows:

0:000> !dumpheap -type Free -stat
Statistics:
      MT    Count    TotalSize Class Name
00152b18      230     40958584       Free
Total 230 objects

In this example, the output shows 230 free objects totaling about 39 MB, so the heap has 39 MB of fragmentation.

 

When deciding whether fragmentation is a problem, consider what fragmentation means for each generation. For gen 0, fragmentation is not a problem, because the GC can allocate into the fragmented space. For gen 1 and gen 2, fragmentation can be a problem: to use the free space in those generations, the GC has to collect and promote objects to fill the gaps. But since gen 1 can never be bigger than one segment, it is gen 2 you usually need to worry about.

Excessive pinning is a common cause of heavy fragmentation. The .NET Framework 2.0 made substantial improvements to reduce fragmentation caused by pinning (for more on GC improvements in the .NET Framework 2.0, see the blog post at blogs.msdn.com/476750.aspx). However, if an application still pins too much, you will still see heavy fragmentation. You can use the SOS command !gchandles to check the number of pinned handles (see Figure 9), and !objsize to see which objects are pinned, as shown in Figure 10.
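For reference, the sketch below shows what pinning looks like in code. A pinned object cannot be moved, so a pin that lives long enough for its object to reach gen 1 or gen 2 leaves a hole the compacting phase cannot close; the general advice is to keep the pinned window as short as possible.

using System;
using System.Runtime.InteropServices;

class PinningSample
{
    static void Main()
    {
        byte[] buffer = new byte[4096];

        // Explicit pin: the GC cannot move 'buffer' until Free is called.
        GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            IntPtr address = handle.AddrOfPinnedObject();
            // ... pass 'address' to native code ...
        }
        finally
        {
            handle.Free(); // releasing the pin lets the GC compact again
        }
    }
}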

Fragmentation in the LOH is by design, because we do not compact the LOH. This does not mean that allocating on the LOH is the same as malloc with the NT heap manager! Because of the way the GC works, adjacent free objects are naturally coalesced into one large free space, which can satisfy allocation requests for large objects.

Time spent on garbage collection

Developers often need to know how long the GC takes for each collection. In real-time software this figure is often essential, because such applications must meet constraints on response time. It is an important consideration in general, too, because CPU time spent in garbage collection is CPU time not spent on actual processing.

The easiest way to learn how much time is spent in garbage collection is to look at the % Time in GC performance counter. This counter is updated at the end of each collection, showing the ratio of the time spent in the collection that just finished to the time elapsed since the previous collection ended. If no collection happens during the sampling interval, the counter is not updated and you keep seeing the previous value. Since you know the sampling interval of the performance monitoring application (the default sampling interval in perfmon is 1 second), you can roughly calculate the time.

Figure 11 shows some sample garbage collection data. Gen 0 collections occurred during the second and third intervals. This method is not 100% accurate, because we do not know exactly when in those intervals the collections occurred, but it is very useful for approximating GC time.

Consider the most challenging case this data allows. Suppose the first gen 0 collection finished right at the beginning of the second interval, and the next one finished right at the end of the third interval. The time between the ends of the two collections is then about two sampling intervals, or 2 seconds. The % Time in GC counter shows 3%, so the gen 0 collection took 3% of 2 seconds, or about 60 ms.

Investigating high CPU usage

When a collection occurs, CPU usage should be high so the GC can finish as quickly as possible. Figure 12 shows an example where CPU usage spikes whenever a collection occurs: the peaks in the % Processor Time counter correspond directly to changes in % Time in GC. You obviously cannot expect such a clean picture in practice, since work other than the GC also uses the CPU. To determine what is consuming the CPU cycles, use a CPU profiler to see which functions take the most CPU time.


Figure 12 High CPU usage during collections

If garbage collection does take too much CPU time, it means either collections happen too often or they take too long. Consider the case where collections are triggered by allocations: the allocation rate is then the main factor determining how often collections occur.


Figure 13 Imprecise counter data when collections are far apart

When a collection starts, the Allocated Bytes/sec counter is updated by adding the bytes allocated in gen 0 and the LOH. Because the counter is expressed as a rate, the actual value you see is the difference between the last two readings divided by the interval. In Figure 13, for example, the sampling rate is 1 second, but a collection occurs only every few intervals. When the collection occurs, the counter is updated as follows:

Allocated = 850 - 250 = 600 KB
Alloc/sec = 600 / 3 = 200 KB/sec

 


Figure 14 More accurate data when collections are frequent

Even though the application kept allocating, the counter does not reflect it, because it is not updated until the next collection. If collections happen more frequently, you get a truer picture (see Figure 14).

A common mistake is to use the allocation rate to gauge the time spent in garbage collection. The time a collection takes is actually determined by how much memory the GC must traverse, and since the GC traverses only surviving objects, a long collection means that many objects were retained. If that happens, use the techniques discussed earlier to determine why so many objects are being retained.

 
