Basic Essentials of .NET Program Performance

Source: Internet
Author: User
Bill Chiles, a program manager on the Roslyn compiler, wrote the article "Essential Performance Facts and .NET Framework Tips". A well-known blogger who writes under the name "Cold River alone fishing" selected passages from it. The article shares recommendations and considerations for performance optimization: do not optimize prematurely, good tools matter, memory allocation is the key to performance, and so on. It points out that developers should not optimize blindly or without evidence; the most important step is to first locate the performance problem and find its cause.

The full text reads as follows:

This article offers performance recommendations that come from rewriting the C# and VB compilers in managed code, and illustrates them with real-world examples from the C# compiler. The .NET platform is highly productive for developing applications. Its powerful, safe programming languages and rich class library make application development efficient. But with great power comes great responsibility. We should use the .NET Framework to the full, but we also need to be prepared to tune our code when necessary, for example when it processes large amounts of data such as files or databases.

Why the performance tuning experience from the new compiler also applies to your application

Microsoft rewrote the C# and Visual Basic compilers in managed code to provide a set of new APIs for modeling and analyzing code and for building tools, which gives Visual Studio a much richer, code-aware programming experience. Rewriting the compilers, and building Visual Studio features on top of them, taught us useful lessons about performance that apply to any large .NET app, or to any app that needs to handle a lot of data. You do not need to know anything about compilers to benefit from these insights and from the C# compiler examples.

Visual Studio uses the compiler APIs to implement the IntelliSense features that developers love, such as keyword coloring, syntax completion lists, squiggles for errors, parameter hints, code issues, and suggested fixes. Visual Studio analyzes the code continuously to produce these hints while the developer types or modifies the code.

When users interact with an app, they expect it to be responsive. Typing or running a command should never block the application interface. Help or hints should appear quickly, or give up if the user keeps typing. An app should avoid blocking the UI thread with long computations, which make the program feel sluggish.

To learn more about the new compilers, see the .NET Compiler Platform ("Roslyn").

Basic Essentials

Keep these basic essentials in mind when tuning .NET performance and developing responsive applications:

Essentials One: Don't optimize prematurely

Code that is more complex than it needs to be costs more to maintain, debug, and polish. Experienced programmers usually have an intuitive grasp of how to solve a problem and write efficient code. But sometimes they fall into the trap of optimizing prematurely. For example, a simple array is sometimes enough and a hash table is unnecessary, and sometimes simply recomputing a value is better than a complex cache that could leak memory. When you find a problem, measure the performance first and then analyze the code.

Essentials Two: If you are not measuring, you are guessing

Profiles and measurements do not lie. A profile shows whether the CPU is fully loaded or whether the program is blocked on disk I/O. It tells you what kind of memory and how much of it the application allocates, and whether the CPU spends a lot of time on garbage collection (GC).

You should set performance targets for critical user experiences or scenarios, and write tests to measure performance. Investigate failures to meet those targets with a scientific approach: use profiles to guide you, form a hypothesis about what might be wrong, and write experimental code or modify the code to test the hypothesis. If we establish baseline performance measurements and test regularly, we can catch changes that cause performance regressions and avoid wasting time on unnecessary changes.
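As a rough illustration of what "writing tests to measure performance" can look like in code, here is a minimal sketch of a measurement helper. The names Measure, MeasureResult, and Run are hypothetical and do not come from the article, and the sketch assumes a runtime on which GC.GetAllocatedBytesForCurrentThread is available. It is no substitute for a real profiler such as PerfView; it only records elapsed time, bytes allocated on the calling thread, and the number of Generation 2 collections for a repeated action.

```csharp
using System;
using System.Diagnostics;

// Hypothetical result type: what one measured run observed.
public sealed class MeasureResult
{
    public MeasureResult(TimeSpan elapsed, long allocatedBytes, int gen2Collections)
    {
        Elapsed = elapsed;
        AllocatedBytes = allocatedBytes;
        Gen2Collections = gen2Collections;
    }

    public TimeSpan Elapsed { get; }
    public long AllocatedBytes { get; }   // bytes allocated on the calling thread
    public int Gen2Collections { get; }   // Gen 2 collections during the run
}

public static class Measure
{
    public static MeasureResult Run(Action action, int iterations)
    {
        action(); // warm-up call so JIT compilation is not part of the measurement

        int gen2Before = GC.CollectionCount(2);
        long bytesBefore = GC.GetAllocatedBytesForCurrentThread();
        Stopwatch watch = Stopwatch.StartNew();

        for (int i = 0; i < iterations; i++)
            action();

        watch.Stop();
        long bytesAfter = GC.GetAllocatedBytesForCurrentThread();
        int gen2After = GC.CollectionCount(2);

        return new MeasureResult(watch.Elapsed, bytesAfter - bytesBefore, gen2After - gen2Before);
    }
}
```

A test could call Measure.Run on a key scenario and compare the result against a stored baseline, so that a change which suddenly allocates far more memory shows up as a regression.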

Essentials Three: Good tools are important

Good tools let us drill down quickly to the biggest factors that affect performance (CPU, memory, disk) and help us locate the code that causes these bottlenecks. Microsoft has released a number of performance tools, such as Visual Studio Profiler, Windows Phone Analysis Tool, and PerfView.

PerfView is a free and powerful tool that focuses on deep-seated performance issues (disk I/O, GC events, memory); examples appear later in this article. It can capture performance-related Event Tracing for Windows (ETW) events and show the information per application, per process, per stack, and per thread. PerfView shows how much memory the application allocates and of what kind, and how much each function and call stack contributes to those allocations. For details, see the very thorough help, demos, and video tutorials (such as the tutorials on Channel9) that ship with the tool download.

Essentials Four: It's all about memory allocation

You might think that the key to writing a responsive .NET application is using good algorithms, such as quick sort instead of bubble sort, but that is not the case. The biggest factor in writing a responsive app is memory allocation, especially when the app is very large or processes large amounts of data.

In developing a responsive IDE on top of the new compiler APIs, most of the work went into avoiding allocations and managing caching strategies. PerfView traces show that the performance of the new C# and VB compilers is rarely CPU bound. When the compiler reads hundreds of thousands of lines of code, reads metadata, or emits compiled code, it is actually I/O bound. The latency on the UI thread is almost entirely due to garbage collection. The .NET Framework garbage collector is highly tuned for performance, and it performs most of its work concurrently while the application code executes. However, a single memory allocation may trigger an expensive collection (such as a Generation 2 collection), during which the GC temporarily suspends all threads.

Common memory allocations and examples

The example expressions in this section contain hidden memory allocations that look small. However, if a large application executes these expressions often enough, they can add up to hundreds of megabytes or even gigabytes of allocations. For example, a one-minute test that simulated a developer typing code in the editor allocated several gigabytes of memory, which led the performance test team to focus on the typing scenario.

Boxing

Boxing occurs when a value type, which normally lives on the thread stack or inside a data structure, is wrapped in an object (that is, an object is allocated to hold the data and a pointer to that object is returned). The .NET Framework sometimes boxes value types automatically because of a method's signature or the type of a storage location. Wrapping a value type as a reference type produces a memory allocation. The .NET Framework and the language compilers try to avoid unnecessary boxing, but sometimes boxing happens when we least expect it. Many boxing operations can add megabytes or gigabytes of allocations to an application, which means that garbage collection runs more often and takes longer.

To see boxing in PerfView, open a trace and look at GC Heap Alloc Stacks under the application's process name (remember, PerfView reports allocations for all processes). If you see value types such as System.Int32 and System.Char under allocations, boxing is occurring. Selecting one of these types displays the call stacks and functions in which the boxing happens.

Example 1: string methods and value type arguments

The following sample code demonstrates potentially unnecessary and excessive boxing in a large system.

public class Logger
{
    public static void WriteLine(string s)
    {
        /*...*/
    }
}

public class BoxingExample
{
    public void Log(int id, int size)
    {
        // Both int arguments are boxed to match the object parameters of string.Format.
        var s = string.Format("{0}:{1}", id, size);
        Logger.WriteLine(s);
    }
}

This code provides basic logging, so the app may call the Log function very frequently, perhaps millions of times. The problem is that the call to string.Format resolves to the overload that accepts a string and two objects:

String.Format Method (String, Object, Object)

This overload requires the .NET Framework to box each int into an object before passing it to the method. A partial fix is to call id.ToString() and size.ToString() and pass the resulting strings to string.Format. Calling ToString() does allocate a string, but that allocation would happen inside string.Format anyway.

You might consider that this basic call to string.Format is just string concatenation, so you might write code like this instead:

var s = id.ToString() + ':' + size.ToString();

In fact, this line of code also causes boxing, because the statement compiles to a call to:

string.Concat(object, object, object);

To call this Concat overload, the .NET Framework must box the character constant.

Workaround:

The complete fix is easy: replace the single quotation marks with double quotation marks. This changes the character constant into a string constant, which needs no boxing because string is already a reference type.

var s = id.ToString() + ":" + size.ToString();
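To put the original call and the fix side by side, here is a small sketch. The class name LogFormatting and both method names are hypothetical, chosen only for this illustration. The two methods return the same text; they differ in which overload the call binds to, and therefore in whether the int arguments have to be boxed.

```csharp
using System;

public static class LogFormatting
{
    // Original shape: binds to string.Format(string, object, object),
    // so id and size are each boxed on every call.
    public static string WithFormat(int id, int size)
    {
        return string.Format("{0}:{1}", id, size);
    }

    // Fixed shape: all three operands are strings, so the expression binds to
    // string.Concat(string, string, string) and no value type needs boxing.
    public static string WithConcat(int id, int size)
    {
        return id.ToString() + ":" + size.ToString();
    }
}
```

For example, LogFormatting.WithConcat(3, 7) and LogFormatting.WithFormat(3, 7) both produce the string 3:7. Whether the difference matters in a given app is something only a profile can tell.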

Example 2: boxing of enumeration types

The following example was responsible for a large amount of memory allocation in the new C# and VB compilers, because enumeration types are used so frequently, especially in Dictionary lookup operations.

public enum Color { Red, Green, Blue }

public class BoxingExample
{
    private string name;
    private Color color;

    public override int GetHashCode()
    {
        // Calling GetHashCode() on the enum value boxes it.
        return name.GetHashCode() ^ color.GetHashCode();
    }
}

The problem is very subtle. PerfView reports it as boxing in Enum.GetHashCode(): for implementation reasons, the method boxes the underlying representation of the enumeration type. If you look closely in PerfView, you will see that each call to GetHashCode() produces two boxing operations. The compiler inserts one, and the .NET Framework inserts the other.

Workaround:

Both boxing operations can be avoided by casting the enumeration value to its underlying representation before calling GetHashCode().

((int)color).GetHashCode()
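For context, here is a sketch of how the cast might look inside a complete type that is used as a dictionary key. The class NamedColor and its members are hypothetical and are not taken from the compiler sources. Whether the cast saves anything depends on the runtime; the boxing behavior described here is the one the article reports for the .NET Framework.

```csharp
using System;

public enum Color { Red, Green, Blue }

// Hypothetical key type combining a name and an enum value.
public sealed class NamedColor : IEquatable<NamedColor>
{
    private readonly string name;
    private readonly Color color;

    public NamedColor(string name, Color color)
    {
        if (name == null) throw new ArgumentNullException("name");
        this.name = name;
        this.color = color;
    }

    public bool Equals(NamedColor other)
    {
        return !ReferenceEquals(other, null) && name == other.name && color == other.color;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as NamedColor);
    }

    public override int GetHashCode()
    {
        // Cast to the underlying int first, so the hash code is computed on an
        // Int32 rather than through Enum.GetHashCode().
        return name.GetHashCode() ^ ((int)color).GetHashCode();
    }
}
```

Equal instances still produce equal hash codes, so the type behaves the same in a Dictionary or HashSet as it did before the cast was added.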

Another common source of boxing with enumeration types is Enum.HasFlag. The argument passed to HasFlag must be boxed. In most cases, replacing a call to HasFlag with a bitwise test is simple and allocates no memory.

Keep the first essential in mind and do not optimize prematurely. Do not start rewriting all your code in this way. Be aware of these boxing costs, but change the code only after a tool has located the problem and shown it to be a major one.
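A bitwise replacement for HasFlag could look like the following sketch. The enum AccessRights and the class FlagChecks are hypothetical names invented for this illustration.

```csharp
using System;

[Flags]
public enum AccessRights   // hypothetical flags enum for illustration
{
    None = 0,
    Read = 1,
    Write = 2,
    Execute = 4
}

public static class FlagChecks
{
    // Uses Enum.HasFlag; the article notes that its argument has to be boxed.
    public static bool CanWriteUsingHasFlag(AccessRights rights)
    {
        return rights.HasFlag(AccessRights.Write);
    }

    // Equivalent bitwise test: pure integer arithmetic on the enum value.
    public static bool CanWrite(AccessRights rights)
    {
        return (rights & AccessRights.Write) == AccessRights.Write;
    }
}
```

Comparing the masked value against the flag itself, rather than against zero, keeps the test correct for flags that combine several bits.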

String

String manipulation is one of the biggest culprits in memory allocation, and it usually shows up among the top five allocation sources in PerfView. Applications use strings for serialization, JSON, and REST. Where enumeration types cannot be used, strings can serve as constants for interacting with other systems. When profiling shows that string operations seriously affect performance, look for calls to String methods such as Format(), Concat(), Split(), Join(), and Substring(). Using StringBuilder avoids the overhead of creating many intermediate strings when concatenating, but the creation of the StringBuilder itself may also need to be managed so that it does not become a bottleneck.

Example 3: string operations

The C# compiler had the following method, which outputs the XML documentation comments that precede a method.

public void WriteFormattedDocComment(string text)
{
    string[] lines = text.Split(new[] { "\r\n", "\r", "\n" },
        StringSplitOptions.None);
    int numLines = lines.Length;
    bool skipSpace = true;
    if (lines[0].TrimStart().StartsWith("///"))
    {
        for (int i = 0; i < numLines; i++)
        {
            string trimmed = lines[i].TrimStart();
            if (trimmed.Length < 4 || !char.IsWhiteSpace(trimmed[3]))
            {
                skipSpace = false;
                break;
            }
        }
        int substringStart = skipSpace ? 4 : 3;
        for (int i = 0; i < numLines; i++)
            Console.WriteLine(lines[i].TrimStart().Substring(substringStart));
    }
    else
    {
        /* ... */
    }
}

As you can see, this code does a lot of string manipulation. It uses library methods to split the text into separate lines, to trim white space, to check whether the argument text is an XML documentation comment, and to extract substrings from the lines.

Each time WriteFormattedDocComment is called, the first line allocates a new three-element string array to pass as the argument to Split(). The compiler has to emit code that allocates this array on every call, because it does not know whether Split() stores the array somewhere that other code could modify it, which would affect later calls to WriteFormattedDocComment. The call to Split() also allocates a string for every line in text, and allocates additional memory to perform the split.

WriteFormattedDocComment calls TrimStart() three times, two of them inside loops, which duplicates both work and memory allocations. Worse, the TrimStart() overload that is called without arguments has the following signature:

namespace System
{
    public class String
    {
        public string TrimStart(params char[] trimChars);
    }
}

This signature means that every call to TrimStart() without arguments allocates an empty array for the params parameter, in addition to the string it returns.

Finally, the code calls Substring(), which usually allocates a new string.

Workaround:

Unlike the previous examples, small edits cannot fix these allocations. Here we need to step back, look at the problem as a whole, and solve it differently. For example, notice that the argument of WriteFormattedDocComment() is a string that already contains all the information the method needs, so the code can do more indexing instead of allocating many small string fragments.

The following code is not a complete fix, but it shows the kind of technique that solves the problems in this example. The C# compiler team used methods like these to eliminate the extra memory allocations.

private int IndexOfFirstNonWhiteSpaceChar(string text, int start)
{
    while (start < text.Length && char.IsWhiteSpace(text[start]))
        start++;
    return start;
}

private bool TrimmedStringStartsWith(string text, int start, string prefix)
{
    start = IndexOfFirstNonWhiteSpaceChar(text, start);
    int len = text.Length - start;
    if (len < prefix.Length) return false;
    // Compare only prefix.Length characters; looping up to len would index past the end of prefix.
    for (int i = 0; i < prefix.Length; i++)
    {
        if (prefix[i] != text[start + i])
            return false;
    }
    return true;
}

The first version of WriteFormattedDocComment() allocated an array, several substrings, a trimmed substring, and an empty params array. It also checked for "///". The modified code uses only indexing and allocates nothing. It finds the first non-white-space character and then compares character by character to see whether the string starts with "///". Instead of TrimStart(), the modified code uses IndexOfFirstNonWhiteSpaceChar to return the index of the first non-white-space character at or after a given start index. Applying this approach throughout removes all the extra memory allocations in WriteFormattedDocComment().
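To show how the indexing idea might extend from a single check to a whole block of text, here is a minimal sketch that walks the lines of a string by index instead of calling Split(). The class DocCommentScanner and all of its member names are hypothetical and are not the compiler's actual code. Its helper stops at line breaks so that a scan never runs on into the next line, which is an adjustment made for this sketch.

```csharp
using System;

public static class DocCommentScanner
{
    private static bool IsLineBreak(char c)
    {
        return c == '\r' || c == '\n';
    }

    // Index of the first character at or after 'start' that is not indentation.
    // Stops at a line break so the scan stays on the current line.
    private static int SkipIndentation(string text, int start)
    {
        while (start < text.Length && char.IsWhiteSpace(text[start]) && !IsLineBreak(text[start]))
            start++;
        return start;
    }

    // True if the line beginning at 'lineStart' starts with 'prefix' once its
    // leading white space is ignored. Allocates nothing.
    public static bool LineStartsWith(string text, int lineStart, string prefix)
    {
        int start = SkipIndentation(text, lineStart);
        if (text.Length - start < prefix.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
        {
            if (text[start + i] != prefix[i]) return false;
        }
        return true;
    }

    // Index of the first character of the line that follows the one containing
    // 'start'. Treats "\r\n", "\r", and "\n" as line breaks, like the Split call.
    public static int IndexOfNextLine(string text, int start)
    {
        int i = start;
        while (i < text.Length && !IsLineBreak(text[i]))
            i++;
        if (i + 1 < text.Length && text[i] == '\r' && text[i + 1] == '\n')
            return i + 2;
        return i < text.Length ? i + 1 : i;
    }

    // Counts the lines that begin with "///" without creating any substrings.
    public static int CountDocCommentLines(string text)
    {
        int count = 0;
        for (int lineStart = 0; lineStart < text.Length; lineStart = IndexOfNextLine(text, lineStart))
        {
            if (LineStartsWith(text, lineStart, "///"))
                count++;
        }
        return count;
    }
}
```

The same loop shape could drive the output step as well, by writing the characters of each line from a computed start index instead of calling Substring().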

Example 4: StringBuilder

In this example, StringBuilder is used. The following function is used to produce the full name of a generic type:

using System.Text;

public class Example
{
    // Constructs a name like "SomeType<T1, T2, T3>"
    public string GenerateFullTypeName(string name, int arity)
    {
        StringBuilder sb = new StringBuilder();
        sb.Append(name);
        if (arity != 0)
        {
            sb.Append("<");
            // i is declared outside the loop because it is also used after it.
            int i;
            for (i = 1; i < arity; i++)
            {
                sb.Append("T"); sb.Append(i.ToString()); sb.Append(", ");
            }
            sb.Append("T"); sb.Append(i.ToString()); sb.Append(">");
        }
        return sb.ToString();
    }
}

Focus on the line that creates the StringBuilder instance. The call to sb.ToString() also causes a memory allocation, and the StringBuilder implementation allocates memory internally, but these allocations cannot be avoided if we want a string result.

Workaround:

To eliminate the allocation of the StringBuilder object, cache it. Even caching a single instance that may be discarded at any time can significantly improve performance. The following is the new implementation of the function; apart from the new first and last lines, the code is the same as before:

// Constructs a name like "Foo<T1, T2, T3>"
public string GenerateFullTypeName(string name, int arity)
{
    StringBuilder sb = AcquireBuilder();
    /* Use sb as before */
    return GetStringAndReleaseBuilder(sb);
}

The key part is the new Acquirebuilder () and Getstringandreleasebuilder () methods:

[ThreadStatic]
private static StringBuilder cachedStringBuilder;

private static StringBuilder AcquireBuilder()
{
    StringBuilder result = cachedStringBuilder;
    if (result == null)
    {
        return new StringBuilder();
    }
    result.Clear();
    // Take the builder out of the cache while it is in use.
    cachedStringBuilder = null;
    return result;
}

private static string GetStringAndReleaseBuilder(StringBuilder sb)
{
    string result = sb.ToString();
    // Put the builder back so the next caller on this thread can reuse it.
    cachedStringBuilder = sb;
    return result;
}

The implementation above caches the StringBuilder in a thread-static field because the new compilers are multithreaded. If your code runs on a single thread, you can probably omit the ThreadStatic declaration. A thread-static field holds a separate value for each thread that executes this code.

If a cached instance exists, AcquireBuilder() clears it, sets the field or cache to null, and returns the instance. Otherwise, AcquireBuilder() creates and returns a new instance, leaving the field or cache set to null.

When we have finished with the StringBuilder, we call GetStringAndReleaseBuilder() to get the string result. The method saves the StringBuilder in the field or cache and then returns the result. Execution may re-enter this code and create multiple StringBuilder objects, although that rarely happens. The code keeps only the most recently released StringBuilder for later use. In the new compilers, this simple caching strategy greatly reduced unnecessary memory allocations. Parts of the .NET Framework and MSBuild use a similar technique to improve performance.

This simple caching strategy follows good cache design because it has a size cap. However, using a cache means more code than before and more maintenance work. We should adopt a caching strategy only after we have found a performance problem and PerfView has shown that StringBuilder allocations contribute significantly to it.
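Putting the pieces of Example 4 together, a complete version might look like the following sketch. The class name TypeNames is hypothetical, and the loop that builds the type parameter list is written slightly differently from the original for brevity; the caching logic follows the two helper methods described above.

```csharp
using System;
using System.Text;

public static class TypeNames
{
    // One cached builder per thread; null while the builder is in use.
    [ThreadStatic]
    private static StringBuilder cachedStringBuilder;

    private static StringBuilder AcquireBuilder()
    {
        StringBuilder result = cachedStringBuilder;
        if (result == null)
            return new StringBuilder();
        result.Clear();
        cachedStringBuilder = null;
        return result;
    }

    private static string GetStringAndReleaseBuilder(StringBuilder sb)
    {
        string result = sb.ToString();
        cachedStringBuilder = sb; // keep only the most recently released builder
        return result;
    }

    // Constructs a name like "Foo<T1, T2, T3>"
    public static string GenerateFullTypeName(string name, int arity)
    {
        StringBuilder sb = AcquireBuilder();
        sb.Append(name);
        if (arity > 0)
        {
            sb.Append('<');
            for (int i = 1; i <= arity; i++)
            {
                if (i > 1) sb.Append(", ");
                sb.Append('T').Append(i);
            }
            sb.Append('>');
        }
        return GetStringAndReleaseBuilder(sb);
    }
}
```

After the first call on a thread, later calls reuse the same StringBuilder, so the only remaining allocations are the result string and any internal growth of the builder's buffer.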
