Use ETW to Diagnose the Performance of .NET Applications

Last Update:2018-12-08 Source: Internet

Author: User

Tags mscorlib


You write a managed application and take it for a spin, only to find that it is slow. The application is functionally correct, but its performance leaves a lot to be desired. You want to diagnose the performance problems and fix them, but the application is running in a production environment, so you cannot install profilers or interrupt it. Or perhaps the application is not used widely enough to justify purchasing the Visual Studio profiler for CPU profiling.

Fortunately, Event Tracing for Windows (ETW) can alleviate these problems. This powerful logging technology is built into many parts of the Windows infrastructure, and the Microsoft .NET Framework 4 CLR uses it to make profiling managed applications easier. ETW collects system-wide data and profiles all resources (CPU, disk, network, and memory), which makes it useful for obtaining an overall view. In addition, the ETW ecosystem can be tuned to keep its overhead low, which makes it suitable for production diagnostics.

This article aims to give you a sense of the benefits of using ETW to profile managed applications. I will not cover everything: several OS events and CLR ETW events that can be used for diagnostics are not mentioned here. However, you will learn how the ETW ecosystem can be used to significantly improve the performance and functionality of managed applications. To get you started with ETW-based diagnostics for managed code, I will demonstrate a sample investigation using a free ETW tool, PerfMonitor, which can be downloaded from bcl.codeplex.com/releases/view/49601.

PerfMonitor

PerfMonitor lets you collect ETW performance data quickly and easily and generate useful reports. The tool is not intended to replace deep-analysis tools (such as the Visual Studio profiler); instead, it gives you an overview of an application's performance characteristics and lets you perform quick analysis.

Another ETW diagnostic tool is XPerf, which is available free of charge through the Windows Performance Toolkit. Although XPerf is well suited to profiling native code on Windows, it does not offer much support for profiling managed code. PerfMonitor, on the other hand, showcases the range and power of profiling managed code with ETW. PerfMonitor can collect the symbolic information associated with .NET Framework runtime code, which makes it valuable for .NET Framework performance investigations, although it does not support the deep analysis that XPerf can provide.

PerfMonitor is a completely self-contained tool, and it is all you need to start profiling and diagnosing managed applications. The only other requirement is that you run at least Windows Vista or Windows Server 2008. PerfMonitor is a command-line tool; typing PerfMonitor.exe usersGuide from its location displays an overview. If you want to diagnose an application in a production environment (on a production server, for example), you only need to copy the file to that computer, and you are ready to start collecting profiles. You can analyze the profiles offline if necessary.

In any performance investigation, four factors are usually examined: CPU, disk I/O, memory, and scalability. Most investigations start with the CPU, which affects both the startup time and the execution time of the application. Examining disk I/O is most useful when diagnosing long startup times (disk I/O is the main factor in cold startup time, that is, the time it takes to start an application when it is not already in memory, for example after a reboot), while excessive memory consumption (or leaks) can cause the application to slow down over time. Scalability matters if you want your application's throughput to be proportional to the number of processors.

PerfMonitor helps you get a quick snapshot of all these factors except scalability, and gives you enough information to dig deeper with other specialized tools. For example, to diagnose problems with the .NET garbage collection (GC) heap, CLRProfiler is a better choice. However, PerfMonitor quickly tells you whether there is a problem and whether you need to dig deeper with other tools. In some cases, PerfMonitor points out the problem and contains all the information you need to fix the performance bug, as you will see shortly. See the CLR Inside Out column "Memory Usage Auditing for .NET Applications" (msdn.microsoft.com/magazine/dd882521), which discusses the importance of auditing a program's memory usage and planning for performance. Extending that philosophy, you can use PerfMonitor to quickly audit many aspects of a managed program's performance, not just memory.

Example: CsvToXml

The sample program I use for ETW diagnostics converts a CSV file into an XML file. You can obtain the source code and solution (as well as the sample input CSV file, data.csv) from code.msdn.microsoft.com/mag201012ETW. To run the program, execute the command CsvToXml.exe data.csv output.xml.

Like many programs, CsvToXml was cobbled together quickly, and its developer never expected it to be used for large CSV files. When I started using the program in the real world, I found it too slow. It took more than 15 seconds to process a 750 K file! I knew there was a problem, but without a profiling tool I could only guess at the cause of the slowness. (Can you spot the problem just by looking at the source code?) Fortunately, PerfMonitor can help you identify the problem.

Generate and View the Program Trace

The first step is a quick audit of the application, performed by executing the following command in an administrator Command Prompt window (ETW collects computer-wide data and therefore requires administrator privileges):

PerfMonitor runAnalyze CsvToXml.exe data.csv out.xml

This starts ETW logging, launches CsvToXml.exe, waits for CsvToXml to complete, stops logging, and finally displays a Web page showing the analysis of CsvToXml. In one simple step, you have a large amount of data that helps you uncover the performance bottlenecks in CsvToXml.

Figure 1 shows the result of this command. The page contains the process ID, the command line used, and a detailed breakdown of high-level performance data (including CPU statistics, GC statistics, and just-in-time (JIT) statistics). PerfMonitor also provides a first-level analysis of where to start the diagnosis, with useful links to informational articles or to other tools.

Figure 1 Performance Analysis of CsvToXml

The report shows that the format conversion took nearly 14 seconds, of which 13.6 seconds were spent on the CPU at an average utilization of 99%. The scenario is therefore CPU-bound.

The total GC time and the GC pause times are small, which is good; but the maximum GC allocation rate is 105.1 MB/s, which is excessive and requires further investigation.

CPU Analysis

The detailed CPU analysis gives a breakdown of CPU time, as shown in Figure 2, and you can read the CPU profile data in three ways. The bottom-up view quickly tells you which methods consume the most CPU time; these are the methods to diagnose first. The top-down view is useful for determining whether your code needs architectural or structural changes, and it helps you understand the overall performance of your program. The caller-callee view shows the relationships among methods, for example, which methods call which other methods.

Figure 2 Bottom-Up Analysis of CsvToXml.exe

Like other CPU profilers, the PerfMonitor views give you the inclusive time (the time spent in a particular method, including the time spent in its callees) and the exclusive time (the time spent in a particular method, excluding its callees). When the inclusive time and the exclusive time are the same, all the work is done within that method itself. PerfMonitor also provides a CPU utilization graph, which breaks down the CPU usage of a particular method over time. Hovering the cursor over a column header in the report gives you more detail about what the column means.
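The relationship between the two measures can be sketched in a few lines of C#. This is an illustration only: the CallNode type, the InclusiveExclusiveDemo class, and the millisecond figures are hypothetical and do not reflect PerfMonitor's internal data model or the numbers in its report.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical call-tree node; not PerfMonitor's actual data model.
public sealed class CallNode
{
    public string Name { get; private set; }
    public double ExclusiveMs { get; private set; }   // time spent in this method's own code
    public List<CallNode> Callees { get; private set; }

    public CallNode(string name, double exclusiveMs, params CallNode[] callees)
    {
        Name = name;
        ExclusiveMs = exclusiveMs;
        Callees = new List<CallNode>(callees);
    }

    // Inclusive time = own (exclusive) time plus the inclusive time of every callee.
    public double InclusiveMs
    {
        get { return ExclusiveMs + Callees.Sum(c => c.InclusiveMs); }
    }
}

public static class InclusiveExclusiveDemo
{
    // Made-up timings arranged like the call chain discussed in this article.
    public static CallNode BuildSampleTree()
    {
        var openText = new CallNode("File.OpenText", 900);
        var getColumnNames = new CallNode("get_ColumnNames", 50, openText);
        var xmlElementForRow = new CallNode("XmlElementForRow", 30, getColumnNames);
        return new CallNode("Main", 20, xmlElementForRow);
    }

    // Visits the node and all of its descendants.
    public static IEnumerable<CallNode> Flatten(CallNode node)
    {
        yield return node;
        foreach (var callee in node.Callees)
            foreach (var descendant in Flatten(callee))
                yield return descendant;
    }

    public static void Main()
    {
        // A bottom-up style listing: every method, sorted by exclusive time.
        foreach (var n in Flatten(BuildSampleTree()).OrderByDescending(x => x.ExclusiveMs))
            Console.WriteLine("{0,-18} exclusive={1,5} ms  inclusive={2,5} ms",
                              n.Name, n.ExclusiveMs, n.InclusiveMs);
    }
}
```

In this sketch a leaf method has equal inclusive and exclusive times, while a method near the root has a small exclusive time and a large inclusive time, which is the pattern to look for when deciding whether the cost lies in a method or in what it calls.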

Most performance investigations start with the bottom-up view, which is a list of methods sorted by exclusive time (Figure 2). In the bottom-up view, you can see that the mscorlib method System.IO.File.OpenText consumes the most CPU. Clicking that link displays the caller-callee view of the OpenText method, which reveals that the method CsvToXml.CsvFile.get_ColumnNames is calling OpenText from the program, and that get_ColumnNames consumes about 10 seconds of CPU time (Figure 3). In addition, this method is called from CsvToXml.CsvFile.XmlElementForRow in a loop (XmlElementForRow is itself called from the Main method).

Figure 3 Caller-Callee View of get_ColumnNames

Something therefore seems to be wrong in these methods. Pulling up their code reveals the problem, as shown in Figure 4: the file is repeatedly opened and parsed inside a loop!

Figure 4 ColumnNames Called by the XmlElementForRow Method

public string[] ColumnNames
{
    get
    {
        // Opens the file and parses the header row on every access.
        using (var reader = File.OpenText(Filename))
            return Parse(reader.ReadLine());
    }
}
public string XmlElementForRow(string elementName, string[] row)
{
    string ret = "<" + elementName;
    // ColumnNames is read once per column, so the file is reopened on each iteration.
    for (int i = 0; i < row.Length; i++)
        ret += " " + ToValidXmlName(ColumnNames[i]) + "=\"" + EscapeXml(row[i]) + "\"";
    ret += "/>";
    return ret;
}

Situations like this occur far more often than you might think. When the method was first written, the developer may have assumed it would be called only rarely (as with ColumnNames) and so may not have paid much attention to its performance. Later, however, the method ends up being called in a loop, and application performance suffers.

In a CSV file, all rows have the same format, so there is no need to do this work every time. The work done by ColumnNames can be hoisted into the constructor, as in Figure 5, so that the property returns the cached column names. This ensures that the file is read only once.

Figure 5 Caching Column Names for Better Performance

public CsvFile(string csvFileName)
{
    Filename = csvFileName;
    // Read and parse the header row once, when the object is constructed.
    using (var reader = File.OpenText(Filename))
        ColumnNames = Parse(reader.ReadLine());
}
public string Filename { get; private set; }
public string[] ColumnNames { get; private set; }

After rebuilding, we rerun the previous command and find that the application is much faster: the run now takes only 2.5 seconds.

However, when you review the data with the fix in place, CPU time still dominates. Looking again at the CPU time in the bottom-up analysis, you find that Regex.Replace is now the most expensive method, and that it is called from EscapeXml and ToValidXmlName. Because EscapeXml is the more expensive of the two (330 milliseconds of exclusive time), check its source code:

private static string EscapeXml(string str)
{
    str = Regex.Replace(str, "\"", "&quote;");
    str = Regex.Replace(str, "<", "&lt;");
    str = Regex.Replace(str, ">", "&gt;");
    str = Regex.Replace(str, "&", "&amp;");
    return str;
}

Because EscapeXml is also called inside the loop in XmlElementForRow, it can become a bottleneck. Regular expressions are overkill for these replacements, and the string Replace method should be more efficient. Replace EscapeXml with the following:

private static string EscapeXml(string str)
{
    str = str.Replace("\"", "&quote;");
    str = str.Replace("<", "&lt;");
    str = str.Replace(">", "&gt;");
    str = str.Replace("&", "&amp;");
    return str;
}

After this change, the total time drops to about 2 seconds, and CPU time still dominates. This is acceptable performance: compared with the original run of more than 15 seconds, execution is now several times faster.

As an exercise, I have left several performance bugs in the sample program; you can use ETW events to identify them.
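A side note on the EscapeXml listings, independent of performance: as printed, both versions replace & last, so the & in the entities produced by the earlier replacements is escaped a second time (for example, < ends up as &amp;lt;), and &quote; is not one of XML's predefined entities (the predefined one is &quot;). The following is a minimal sketch of an ordering that avoids both problems, assuming the intent is standard XML attribute escaping. The class name XmlEscaping is a placeholder and is not part of the sample program.

```csharp
using System;

public static class XmlEscaping
{
    // Sketch only: escape '&' first so that the entities introduced by the
    // later replacements are not themselves escaped again.
    public static string EscapeXml(string str)
    {
        str = str.Replace("&", "&amp;");
        str = str.Replace("\"", "&quot;");
        str = str.Replace("<", "&lt;");
        str = str.Replace(">", "&gt;");
        return str;
    }

    public static void Main()
    {
        // Prints: a &lt; b &amp; &quot;c&quot;
        Console.WriteLine(EscapeXml("a < b & \"c\""));
    }
}
```

The performance characteristics are the same as the string.Replace version shown in the article; only the order of the calls and the name of one entity differ.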

Explore GC statistics

The PerfMonitor GC statistics provide a quick overview of the memory profile. As you may recall, I strongly recommend auditing memory usage, and the information provided by the GC ETW events gives a snapshot of any issues with the .NET GC heap. The quick summary view gives you the aggregate GC heap size, the allocation rates, and the GC pause times. Selecting the GC time analysis link on the PerfMonitor results tab displays the details of each GC, including when it occurred and how much time it consumed.

You can use this information to decide whether to dig deeper into any memory problems with CLRProfiler or another memory profiler. The article "Profiling the .NET Garbage-Collected Heap" (msdn.microsoft.com/magazine/ee309515) discusses in depth how to debug the .NET GC heap with CLRProfiler.

For this particular program, none of the GC statistics is alarming. The allocation rate is high; a useful rule of thumb is to keep the allocation rate below 10 MB/s. However, the pause times are very short. A high allocation rate shows up in CPU time, and in most cases it means there are CPU gains to be had, as you found. However, the allocation rate is still high after the fixes, which indicates that a lot of allocation is still going on (can you fix this?). GC pauses of only a few milliseconds are a testament to the self-tuning, efficient GC that the .NET Framework runtime provides; the .NET Framework GC takes care of memory management for you automatically.
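The article leaves the remaining allocations as an exercise and does not give the answer. One plausible contributor is the repeated string concatenation in XmlElementForRow, where each += creates a new string. The sketch below shows how that method could be written with StringBuilder instead. It is an assumption about where the allocations come from, not the author's fix, and the effect should be confirmed by rerunning PerfMonitor. The RowFormatter class, the extra columnNames parameter, and the two helper bodies are stand-ins so the example is self-contained; the sample's real helpers are not shown in full in the article.

```csharp
using System;
using System.Text;

public static class RowFormatter
{
    // Stand-in helpers (hypothetical bodies); the sample's own versions may differ.
    private static string ToValidXmlName(string name)
    {
        return name.Replace(' ', '_');
    }

    private static string EscapeXml(string value)
    {
        return value.Replace("&", "&amp;").Replace("\"", "&quot;")
                    .Replace("<", "&lt;").Replace(">", "&gt;");
    }

    // Builds one element per row in a single growing buffer instead of
    // allocating a new intermediate string for every concatenation.
    public static string XmlElementForRow(string elementName, string[] columnNames, string[] row)
    {
        var sb = new StringBuilder();
        sb.Append('<').Append(elementName);
        for (int i = 0; i < row.Length; i++)
        {
            sb.Append(' ').Append(ToValidXmlName(columnNames[i]))
              .Append("=\"").Append(EscapeXml(row[i])).Append('"');
        }
        sb.Append("/>");
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(XmlElementForRow("row",
            new[] { "First Name", "Age" }, new[] { "Ann", "30" }));
    }
}
```

Whether this matters in practice depends on the row width and file size, so the GC allocation rate reported by PerfMonitor before and after the change is the number to compare.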

Explore JIT statistics

To improve startup time, one of the first things to explore is the time it takes to JIT-compile methods. If that time is significant (for example, most of the time needed to start the application is consumed by JIT compilation), the application may benefit from native image generation (NGen), which eliminates JIT compilation time by precompiling the assembly and saving it to disk. That is, the assembly is compiled ahead of time and saved to disk, so that no JIT compilation is required in subsequent executions. Before turning to NGen, you may also want to consider deferring the JIT compilation of some methods to a later point in the program, so that the JIT compilation time does not affect startup. For more information, see "The Performance Benefits of NGen" (msdn.microsoft.com/magazine/cc163610).

The sample application CsvToXml.exe does not have a significant startup cost, so it is fine to let it JIT-compile all of its methods each time it runs. The JIT compilation statistics also show that 17 methods were JIT-compiled (suggesting that every method invoked was JIT-compiled) and that the total JIT compilation time was 23 milliseconds. Neither is a performance issue for this application, but for large applications in which JIT compilation time is a factor, using NGen should eliminate the problem. Typically, JIT compilation time becomes a factor when an application starts to JIT-compile hundreds or thousands of methods. In such cases, NGen is the way to eliminate JIT compilation costs.

Other articles in MSDN Magazine contain more guidance on improving startup, and ETW events can help you identify and fix startup bottlenecks. Several other JIT events are also available, including JIT inlining events, which give you insight into why a method was not inlined.

CLR ETW Events in the .NET Framework 4

The CLR team wrote a blog post about tracking down DLL loads and determining whether a particular DLL needs to be loaded during startup. ETW events make it simpler to determine whether DLL loads during startup are warranted. By using the ETW module load events available in the .NET Framework 4, we can learn which modules are loaded and why. There are also events for module unloads.

The .NET Framework 4 also provides several other events that make it easier to diagnose managed applications. Figure 6 summarizes some of these events. You can use the PerfMonitor runPrint command to dump all the events fired during execution. The CLR team has also released events that let you attach and detach ETW profiling, and the team intends to keep adding ETW events to make debugging managed applications easier in future versions.

Figure 6 ETW Events in the .NET Framework 4

Event Category Name	Description
Runtime Information ETW Event	Captures information about the runtime, including the SKU, version number, the way the runtime was activated, the command-line parameters it was started with, the GUID (if applicable), and other relevant information.
Exception Thrown ETW Event	Captures information about exceptions that are thrown.
Contention ETW Events	Captures information about contention for the monitor locks or native locks that the runtime uses.
Thread Pool ETW Events	Captures information about the worker thread pool and the I/O thread pool.
Loader ETW Events	Captures information about loading and unloading application domains, assemblies, and modules.
Method ETW Events	Captures information about CLR methods, used for symbol resolution.
GC ETW Events	Captures GC information.
JIT Tracing ETW Events	Captures information about JIT inlining and JIT tail calls.
Interop ETW Events	Captures information about the generation and caching of Microsoft intermediate language (MSIL) stubs.
Application Domain Resource Monitoring (ARM) ETW Events	Captures detailed diagnostic information about the state of an application domain.
Security ETW Events	Captures information about strong name and Authenticode verification.
Stack ETW Event	Captures information that is used with other events to generate stack traces after an event is raised.

The directory in which you ran PerfMonitor contains two files with the suffix PerfMonitorOutput. These are the ETW log files. You will also find files with the suffix kernel, which indicates that they contain OS events. PerfMonitor collects the same data that XPerf uses, so you can use PerfMonitor for simple data collection and reporting, and XPerf for more advanced analysis of the same data. The PerfMonitor merge command converts the ETW files to a format that XPerf can read.

Summary

Performance investigation with ETW is simple yet powerful. Several free, low-overhead ETW-based tools are available that let you debug managed code effectively. I have only scratched the surface of the ETW events available in the .NET Framework runtime. My goal was to get you started on debugging the performance of managed applications with ETW events and tools. By downloading PerfMonitor, using the MSDN documentation for the ETW events in the CLR, and reading the CLR Perf blog, you can quickly get started with performance investigations of managed applications.

Original article: http://msdn.microsoft.com/zh-cn/magazine/gg490356.aspx

