Basic essentials of .NET performance and suggestions for optimization

Source: Internet
Author: User
Tags: foreach, closure, constant, garbage collection, hash, int, size, static class, ToString

Lao Zhao on the essential performance facts of .NET programs





Everyone has probably heard of Roslyn, the next-generation implementation of the C# and VB.NET compilers. Roslyn is written entirely in managed code, yet its performance exceeds that of the previous native implementation written in C++. Bill Chiles, Roslyn's PM (Program Manager), recently wrote an article called "Essential Performance Facts and .NET Framework Tips" that summarizes a few lessons. It is currently available as a PDF on CodePlex and may be posted on MSDN later.





He talked about the following points in the article:





Do not optimize prematurely. Experienced programmers tend to develop intuitions about performance, but they still need to avoid optimizing blindly.


Without measurement, you are only guessing. For example, sometimes repeating a computation is faster than caching the result in a hash table.


Good tools are important. Here he recommends PerfView, a free tool released by Microsoft that I may use myself when analyzing cases in the future.


The key to performance is memory allocation. Intuitively, many people assume a compiler is a CPU-bound workload, but it is actually an I/O-bound program.


Various other details. For example, the memory overhead of a Dictionary, and the difference between class and struct, which I ask about in every interview.





The fourth point deserves a few more words. In a managed environment, GC has a significant impact on performance. If a program is not written in a GC-friendly way, collections happen more often, especially stop-the-world collections, whose performance impact far exceeds micro-level concerns like "a few extra instructions." Moreover, most of the time the "performance" users perceive is the program's responsiveness, and once the GC suspends all threads, the program visibly stutters, something a simple throughput benchmark does not even reflect.





Compared to the Java platform, .NET is already a relatively GC-friendly runtime. One of the most important reasons is its user-defined value type, struct. Structs let programmers keep memory allocation under control and avoid creating objects on the heap. In Java, only a few primitive types are value types, and they cannot contain members. In fact, in Java you cannot use an unboxed int value as a dictionary key, which may be hard for a .NET programmer to imagine, but that is how it is.
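As a small sketch of the difference (using only standard .NET collections): a Dictionary<int, string> in C# stores and compares its int keys without boxing, because the generic instantiation works directly with the value type.

```csharp
using System;
using System.Collections.Generic;

class StructKeyDemo
{
    static void Main()
    {
        // The int keys live inline in the dictionary's entries; no heap object
        // is allocated per key, unlike Java's HashMap<Integer, String>.
        var names = new Dictionary<int, string>();
        names[42] = "answer";
        names[7] = "lucky";

        Console.WriteLine(names[42]); // looks up the key without boxing
    }
}
```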





Java does seem to have plans to improve in this area, but it is far from truly usable. For now, Java can only rely on techniques such as escape analysis: when the runtime discovers that an object is not shared on the heap, it can allocate it on the stack to relieve GC pressure.





But .NET offers more GC-friendly features, which should not be offset by developer misuse. Bill's article presents some common cases, all of which are really basics every .NET developer must understand. The last example is quite interesting: he says that in performance-sensitive places it is time to avoid LINQ and lambdas. Constructing an anonymous function with a lambda makes the compiler produce a closure, and a closure is an object allocated on the heap to hold the captured context. In addition, the List<T> iterator is intentionally implemented as a struct, but when used through the generic LINQ interfaces it is converted to IEnumerable<T> and IEnumerator<T>, which in turn causes boxing.
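A minimal sketch of both costs (all names here are illustrative): the lambda below captures `threshold`, so the compiler allocates a closure object plus a delegate, and enumerating the list through IEnumerable<int> boxes its struct enumerator.

```csharp
using System;
using System.Collections.Generic;

class ClosureAndBoxingDemo
{
    static void Main()
    {
        var numbers = new List<int> { 1, 5, 9 };
        int threshold = 4;

        // Capturing 'threshold' forces a heap-allocated closure object
        // plus a delegate allocation each time this code runs.
        Predicate<int> bigEnough = n => n > threshold;
        Console.WriteLine(numbers.Find(bigEnough)); // 5

        // foreach over List<int> directly uses the struct List<int>.Enumerator: no boxing.
        // foreach over the same list through IEnumerable<int> boxes that enumerator.
        IEnumerable<int> asInterface = numbers;
        foreach (int n in asInterface) { /* the enumerator was boxed once */ }
    }
}
```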





Coincidentally, not long ago @Liancheng 404 said on Sina Weibo:





On Michael's advice, I replaced the FP-style code on the HiveTableScan critical path with a while loop plus a reusable mutable object, which improved table-scan performance by 40%. This is closely related to this topic.





Not sober in the middle of the night, got the arithmetic wrong... by the time the tweet was sent, the improvement was actually more than 100%. Then I kept scraping meat off the mosquito's legs: besides the while loop, the pattern-matching code on the critical path can also be trimmed (for example, by eliminating the overhead of Array.unapplySeq calls). Scan performance on a plain CSV table using LazySimpleSerDe is now up 2.2x, and RCFile plus column pruning can reach 3x. #Spark SQL#





And Alan Perlis also said:





Lisp programmers know the value of everything but the cost of nothing.





Quite interesting. Common FP idioms do carry performance overhead, that much is true, but if you immediately jump to the conclusion "don't use FP" or "don't learn FP," then I can only look at you with compassionate eyes.














.NET Program Performance Essentials and Optimization Suggestions





This article offers performance-tuning recommendations drawn from rewriting the C# and VB compilers in managed code, illustrated with real-world scenarios from the C# compiler. Developing applications on the .NET platform is highly productive: powerful, safe programming languages and a rich class library make building applications very fruitful. But with great power comes great responsibility: we should use the power of the .NET Framework, yet be prepared to tune our code when it needs to process large amounts of data such as files or databases.


Why the performance-tuning lessons from the new compiler also apply to your application

Microsoft rewrote the C# and Visual Basic compilers in managed code and provides a set of new APIs for code modeling and analysis and for building compilation tools, giving Visual Studio a richer code-aware programming experience. The experience of rewriting the compilers and building Visual Studio on top of them yielded very useful performance-tuning lessons that apply to any large-scale .NET application, or any app that processes a lot of data. You do not need to know anything about compilers to draw these insights from the C# compiler examples.

Visual Studio uses the compiler's APIs to implement its powerful IntelliSense features, such as keyword coloring, syntax completion lists, error squiggles, parameter hints, code issues, and suggested fixes, which are well received by developers. As the developer types or modifies code, Visual Studio dynamically compiles it to obtain analysis and hints about the code.

When users interact with an app, they expect it to stay responsive. The interface should never block while they type or issue commands. Help or hints should appear quickly, or stop appearing when the user keeps typing. A modern app must avoid blocking the UI thread with long computations that make the program feel sluggish.

To learn more about the new compiler, visit the .NET Compiler Platform ("Roslyn") site: http://roslyn.codeplex.com/


Basic Essentials

Consider these basic essentials when tuning .NET code and developing good, responsive applications:


Essentials One: Do not optimize prematurely

Writing code is more complex than it looks: code must be maintained, debugged, and tuned for performance. An experienced programmer usually has a natural feel for solving problems and writing efficient code, but it is also easy to fall into premature optimization. For example, sometimes a simple array is enough and need not be "optimized" into a hash table, and sometimes simply recomputing a value beats a complex cache that may leak memory. When you find a problem, first measure it, then analyze the code.


Essentials Two: Without measurement, you are guessing

Profiles and measurements do not lie. They show whether the CPU is saturated or whether you are blocked on disk I/O. They tell you what memory the application allocates, and how much, and whether the CPU spends a lot of time in garbage collection.

Set performance goals for your key user experiences or scenarios, and write tests to measure them. Analyze a missed goal scientifically: use the profiling report as a guide, hypothesize a cause, then write an experiment or modify the code to validate the hypothesis or the fix. If you establish baseline metrics and test often, you can avoid changes that cause performance regressions, and avoid wasting time on changes that do not matter.
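A minimal measurement harness of the kind this advice implies (the workload here is just an illustration) can be built from System.Diagnostics.Stopwatch plus the GC's collection counters:

```csharp
using System;
using System.Diagnostics;
using System.Text;

class MeasureDemo
{
    static void Main()
    {
        // Baseline before the scenario: gen-2 collection count and a timer.
        int gen2Before = GC.CollectionCount(2);
        var sw = Stopwatch.StartNew();

        // The workload under test (illustrative): build a large string.
        var sb = new StringBuilder();
        for (int i = 0; i < 100000; i++)
            sb.Append(i).Append(',');
        string result = sb.ToString();

        sw.Stop();
        int gen2After = GC.CollectionCount(2);

        Console.WriteLine("Elapsed: " + sw.ElapsedMilliseconds + " ms, " +
                          "gen-2 GCs during run: " + (gen2After - gen2Before) + ", " +
                          "result length: " + result.Length);
    }
}
```

Recording these two numbers before and after every change gives the baseline that guards against regressions.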


Essentials Three: Good tools are important

Good tools let us quickly find the biggest factors affecting performance (CPU, memory, or disk) and locate the code that creates those bottlenecks. Microsoft has released a number of profiling tools, such as the Visual Studio Profiler, the Windows Phone Analysis Tool, and PerfView.

PerfView is a free and powerful tool that focuses on the deeper issues affecting performance (disk I/O, GC events, memory), which are shown later. It can capture performance-related Event Tracing for Windows (ETW) events and view the information at the application, process, stack, and thread level. PerfView can show how much and what kind of memory an application allocates, and how the application's functions and call stacks contribute to those allocations. For details, see the very thorough help shipped with the tool download, along with demos and video tutorials (such as those on Channel 9).


Essentials Four: It is all about memory allocation

You might think that writing a responsive .NET application is all about good algorithms, such as using quicksort instead of bubble sort, but that is not the case. The biggest factor in writing a responsive app is memory allocation, especially when the app is very large or processes large amounts of data.

In the practice of building a responsive IDE on the new compiler APIs, much of the work went into avoiding allocations and managing caching strategies. PerfView traces showed that the performance of the new C# and VB compilers is essentially unrelated to CPU bottlenecks. The compilers read hundreds of thousands or even millions of lines of code, read metadata, and emit compiled code, so they are actually I/O-bound. Almost all UI-thread delays were caused by garbage collection. The .NET Framework is highly optimized for garbage collection and can perform most of its work in parallel with application code. However, a single allocation can trigger an expensive collection in which the GC suspends all threads (such as a generation 2 collection).
Common memory allocations and examples

Each example in this part involves only a small allocation behind the scenes. However, if a large application executes such small allocating expressions often enough, they add up to hundreds of megabytes, or even gigabytes. For example, before the performance test team localized the problem to the typing scenario, a one-minute test simulating a developer typing code allocated gigabytes of memory in the compiler.


Boxing

Boxing occurs when a value type, which normally lives on the stack or inline in a data structure, needs to be wrapped into an object: an object is allocated to hold the data, and a pointer to that object is returned. The .NET Framework sometimes boxes values automatically because of a method's signature or the type of a storage location. Wrapping a value type into a reference type triggers a memory allocation. The .NET Framework and the languages try to avoid unnecessary boxing, but sometimes boxing happens when we least notice it. Too many boxing operations allocate megabytes or gigabytes in an application, which means garbage collections are more frequent and take longer.
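The mechanism itself fits in a few lines: assigning an int to a variable of type object forces a heap allocation, and unboxing requires an explicit cast back.

```csharp
using System;

class BoxingDemo
{
    static void Main()
    {
        int i = 42;

        // Boxing: a new object is allocated on the heap to hold a copy of i.
        object boxed = i;

        // Unboxing: the value is copied back out of the heap object.
        int j = (int)boxed;

        Console.WriteLine(j);               // 42
        Console.WriteLine(boxed.GetType()); // System.Int32
    }
}
```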

To see boxing in PerfView, open a trace and look at the GC Heap Alloc items under your application's process name (remember, PerfView reports allocations for all processes). If you see value types such as System.Int32 and System.Char among the allocated types, boxing is occurring. Select a type to see the call stacks and functions in which the boxing happens.




Example 1: string methods and value-type arguments





The following sample code demonstrates potentially unnecessary boxing of a kind that occurs frequently in large systems.





public class Logger
{
    public static void WriteLine(string s)
    {
        /*...*/
    }
}

public class BoxingExample
{
    public void Log(int id, int size)
    {
        var s = string.Format("{0}:{1}", id, size);
        Logger.WriteLine(s);
    }
}





This is a logging base class, so the app calls the Log function very frequently, possibly millions of times. The problem is that the call to string.Format resolves to the overload that accepts a string and two objects:





String.Format(string format, object arg0, object arg1)





This overload requires the .NET Framework to box the ints into objects and pass them to the method call. The fix is to call id.ToString() and size.ToString() and pass the resulting strings to string.Format. Calling ToString() does allocate a string, but a string allocation happens inside string.Format anyway, no matter what.





You might think this call to string.Format is just string concatenation, so you might write the code like this:





var s = id.ToString() + ':' + size.ToString();





In fact, that line also causes boxing, because at compile time the statement becomes a call to:





string.Concat(object, object, object)





and for this overload the .NET Framework must box the character constant to call Concat.





Workaround:





The complete fix is simple: replacing the single quotes with double quotes turns the character constant into a string constant, which avoids the boxing because strings are already reference types.





var s = id.ToString() + ":" + size.ToString();





Example 2: boxing of enumeration types





The following pattern caused the new C# and VB compilers to allocate a lot of memory because of frequent use of enumeration types, especially as Dictionary keys.





public enum Color { Red, Green, Blue }

public class BoxingExample
{
    private string name;
    private Color color;

    public override int GetHashCode()
    {
        return name.GetHashCode() ^ color.GetHashCode();
    }
}





The problem is very well hidden, and PerfView will show it to you: Enum.GetHashCode() boxes for internal-implementation reasons, boxing the underlying representation of the enumeration type. If you look closely in PerfView, you will see two boxing operations per GetHashCode call: one inserted by the compiler and one by the .NET Framework.





Workaround:





This boxing can be avoided by casting to the enum's underlying representation before calling GetHashCode:





((int)color).GetHashCode()





Another enum feature that frequently boxes is Enum.HasFlag. The argument passed to HasFlag must be boxed, and in most cases repeatedly calling HasFlag is simply unnecessary allocation that a bitwise test replaces.
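A sketch of that replacement (the [Flags] enum here is illustrative): the bitwise test performs the same check as HasFlag without boxing the argument.

```csharp
using System;

[Flags]
enum FileAccess { None = 0, Read = 1, Write = 2 }

class HasFlagDemo
{
    static void Main()
    {
        FileAccess access = FileAccess.Read | FileAccess.Write;

        // Enum.HasFlag takes its argument as System.Enum,
        // so the value is boxed on every call.
        bool canWriteBoxed = access.HasFlag(FileAccess.Write);

        // Equivalent bitwise test: no boxing, no allocation.
        bool canWrite = (access & FileAccess.Write) != 0;

        Console.WriteLine(canWriteBoxed); // True
        Console.WriteLine(canWrite);      // True
    }
}
```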





Keep the first essential in mind: do not optimize prematurely, and do not start rewriting all your code at once. You need to be aware of the costs of boxing, but change code only after profiling and locating the main problems.


String





String manipulation is one of the biggest culprits of memory allocation and typically shows up in the top five allocations in PerfView. Applications use strings for serialization, for JSON and REST payloads, and in place of enumeration types when interoperating with systems that do not support them. When a string operation is identified as a serious performance cost, watch for the String class's Format(), Concat(), Split(), Join(), Substring(), and so on. Using StringBuilder avoids creating many intermediate strings when concatenating, but even the creation of StringBuilder must be kept under control to avoid becoming a bottleneck itself.




Example 3: string manipulation





The following method appears in the C# compiler and outputs the XML-format doc comments that precede a method.





public void WriteFormattedDocComment(string text)
{
    string[] lines = text.Split(new[] { "\r\n", "\r", "\n" },
                                StringSplitOptions.None);
    int numLines = lines.Length;
    bool skipSpace = true;
    if (lines[0].TrimStart().StartsWith("///"))
    {
        for (int i = 0; i < numLines; i++)
        {
            string trimmed = lines[i].TrimStart();
            if (trimmed.Length < 4 || !char.IsWhiteSpace(trimmed[3]))
            {
                skipSpace = false;
                break;
            }
        }
        int substringStart = skipSpace ? 4 : 3;
        for (int i = 0; i < numLines; i++)
            Console.WriteLine(lines[i].TrimStart().Substring(substringStart));
    }
    else
    {
        /* ... */
    }
}





As you can see, this code performs a lot of string operations. It uses library methods to split the text into lines, trim whitespace, check whether the argument is an XML-format doc comment, and then extract substrings from each line.





Each time WriteFormattedDocComment is called, the first line of code calls Split(), which allocates a three-element string array as its argument. The compiler has to emit code to allocate that array each time: it cannot know whether Split() stores the array, in which case other code could modify it and affect later calls to WriteFormattedDocComment. Each call to Split() also allocates a string for every line in text, plus additional working memory to perform the split.





WriteFormattedDocComment calls TrimStart() in three places, two of them inside loops, so the work and the allocations are repeated. Worse, the apparently parameterless TrimStart() overload has this signature:





namespace System
{
    public class String
    {
        public string TrimStart(params char[] trimChars);
    }
}





That signature means every TrimStart() call allocates an empty array for the params parameter as well as a string for the result.





Finally, a Substring() call is made, which typically allocates yet another new string.





Workaround:





This allocation problem differs from the earlier ones that needed only small fixes. Here we need to step back, look at the problem as a whole, and solve it differently. For example, notice that the argument to WriteFormattedDocComment() is a single string that already contains all the information the method needs, so the code should do more index computation instead of allocating many small string fragments.





The following methods do not solve every allocation in the example, but they show the technique used. The C# compiler uses methods like these to eliminate all the extra allocations.





private int IndexOfFirstNonWhiteSpaceChar(string text, int start)
{
    while (start < text.Length && char.IsWhiteSpace(text[start]))
        start++;
    return start;
}

private bool TrimmedStringStartsWith(string text, int start, string prefix)
{
    start = IndexOfFirstNonWhiteSpaceChar(text, start);
    int len = text.Length - start;
    if (len < prefix.Length) return false;
    for (int i = 0; i < prefix.Length; i++)
    {
        if (prefix[i] != text[start + i])
            return false;
    }
    return true;
}





The first version of WriteFormattedDocComment() allocated an array, several substrings, a trimmed substring, and an empty params array, and it checked for "///". The revised code uses only indexing and allocates nothing extra. It finds the first non-whitespace character and then compares character by character to see whether the string starts with "///". Instead of TrimStart(), the new code uses IndexOfFirstNonWhiteSpaceChar to return the position of the first non-whitespace character, which removes all the extra allocations in WriteFormattedDocComment().




Example 4: StringBuilder





This example uses StringBuilder. The following function produces the full name of a generic type:





public class Example
{
    // Constructs a name like "SomeType<T1, T2, T3>"
    public string GenerateFullTypeName(string name, int arity)
    {
        StringBuilder sb = new StringBuilder();
        sb.Append(name);
        if (arity != 0)
        {
            sb.Append("<");
            for (int i = 1; i < arity; i++)
            {
                sb.Append("T"); sb.Append(i.ToString()); sb.Append(", ");
            }
            sb.Append("T"); sb.Append(arity.ToString()); sb.Append(">");
        }
        return sb.ToString();
    }
}





Focus on the creation of the StringBuilder instance. Calling sb.ToString() causes one string allocation. StringBuilder's internal implementation also allocates, but if we want a string result, those allocations cannot be avoided.





Workaround:





Caching is used to eliminate the allocation of the StringBuilder object itself. Even caching a single instance that may be discarded at any time can noticeably improve the program. Below is the new implementation of the function; apart from the following two lines of code, everything else is the same.





// Constructs a name like "Foo<T1, T2, T3>"
public string GenerateFullTypeName(string name, int arity)
{
    StringBuilder sb = AcquireBuilder();

    /* Use sb as before */

    return GetStringAndReleaseBuilder(sb);
}





The key part lies in the new AcquireBuilder() and GetStringAndReleaseBuilder() methods:





[ThreadStatic]
private static StringBuilder cachedStringBuilder;

private static StringBuilder AcquireBuilder()
{
    StringBuilder result = cachedStringBuilder;
    if (result == null)
    {
        return new StringBuilder();
    }
    result.Clear();
    cachedStringBuilder = null;
    return result;
}

private static string GetStringAndReleaseBuilder(StringBuilder sb)
{
    string result = sb.ToString();
    cachedStringBuilder = sb;
    return result;
}





The implementation uses a thread-static field to cache the StringBuilder because the new compilers are multithreaded, and the [ThreadStatic] declaration would be easy to forget. Thread-static storage keeps a separate instance for each thread that executes this code.





If a cached instance exists, AcquireBuilder() clears it, sets the field to null, and returns the instance. Otherwise, AcquireBuilder() creates a new instance and returns it.





When we finish with the StringBuilder, we call GetStringAndReleaseBuilder() to get the string result, store the builder back into the field, and return the result. This code can race and create several StringBuilder objects, though that rarely happens, because it only keeps the last released builder for later use. In the new compilers, this simple caching strategy greatly reduced unnecessary allocation. Modules in the .NET Framework and MSBuild use a similar technique to improve performance.





Even a simple caching strategy must follow good cache design, because it needs a size cap. Using a cache can mean more code than before and more maintenance work, so adopt a caching strategy only after profiling shows a problem; here, PerfView had shown that StringBuilder contributed considerably to the allocations.
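One way the size cap could be enforced (a sketch under my own assumptions, not the article's code): refuse to cache builders whose capacity has grown past a limit, so a single huge string does not pin a large buffer for the thread's lifetime.

```csharp
using System;
using System.Text;

static class CachedBuilder
{
    // Illustrative cap: builders that grew beyond this are left to the GC.
    private const int MaxCachedCapacity = 1024;

    [ThreadStatic]
    private static StringBuilder cached;

    public static StringBuilder Acquire()
    {
        StringBuilder sb = cached;
        if (sb == null)
            return new StringBuilder();
        cached = null;
        sb.Clear();
        return sb;
    }

    public static string GetStringAndRelease(StringBuilder sb)
    {
        string result = sb.ToString();
        if (sb.Capacity <= MaxCachedCapacity) // the size cap
            cached = sb;
        return result;
    }
}

class CapDemo
{
    static void Main()
    {
        var sb = CachedBuilder.Acquire();
        sb.Append("T1").Append(", ").Append("T2");
        Console.WriteLine(CachedBuilder.GetStringAndRelease(sb)); // T1, T2
    }
}
```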


LINQ and lambda expressions





Using LINQ and lambda expressions is a great showcase of the productivity of C#, but if your code needs to execute many, many times, you may need to rewrite the LINQ or lambda expressions away.




Example 5: lambda expressions, List&lt;T&gt;, and IEnumerable&lt;T&gt;





This example uses LINQ and functional-style code to find a symbol by name in the compiler's model.





class Symbol
{
    public string Name { get; private set; }
    /*...*/
}

class Compiler
{
    private List<Symbol> symbols;

    public Symbol FindMatchingSymbol(string name)
    {
        return symbols.FirstOrDefault(s => s.Name == name);
    }
}





The new compiler and the IDE experience built on it call FindMatchingSymbol very frequently, and this simple one-line body hides underlying allocation costs. To show them, first split the single line into two:





Func<Symbol, bool> predicate = s => s.Name == name;
return symbols.FirstOrDefault(predicate);





In the first line, the lambda expression s => s.Name == name closes over the local variable name. That means that in addition to allocating a delegate object for predicate, the compiler allocates an instance of a generated class to hold the environment, saving the value of name. The compiler produces code like the following:





// Compiler-generated class to hold the environment state for the lambda
private class Lambda1Environment
{
    public string capturedName;

    public bool Evaluate(Symbol s)
    {
        return s.Name == this.capturedName;
    }
}

// Expanded: Func<Symbol, bool> predicate = s => s.Name == name;
Lambda1Environment l = new Lambda1Environment() { capturedName = name };
var predicate = new Func<Symbol, bool>(l.Evaluate);





The two new operators (the first creating the environment-class instance and the second creating the delegate) make the memory allocations explicit.





Now look at the call to FirstOrDefault. This extension method on IEnumerable&lt;T&gt; also produces an allocation. Because FirstOrDefault takes an IEnumerable&lt;T&gt; as its first argument, you can expand the call to the following code:





// Expanded: return symbols.FirstOrDefault(predicate) ...
IEnumerable<Symbol> enumerable = symbols;
IEnumerator<Symbol> enumerator = enumerable.GetEnumerator();
while (enumerator.MoveNext())
{
    if (predicate(enumerator.Current))
        return enumerator.Current;
}
return default(Symbol);





The symbols variable has type List&lt;T&gt;. List&lt;T&gt; implements IEnumerable&lt;T&gt; and cleverly defines its iterator, List&lt;T&gt;.Enumerator, as a struct rather than a class. Using a struct means a typical foreach loop avoids any heap allocation for the iterator: the enumerator is returned on the call stack, and bumping the stack pointer allocates its space without affecting the GC's work on managed objects.





In the expansion of the FirstOrDefault call above, the code calls GetEnumerator() through the IEnumerable&lt;T&gt; interface. Assigning symbols to a variable of type IEnumerable&lt;Symbol&gt; loses the information that the object is really a List&lt;T&gt;. Therefore, when the code fetches the iterator through enumerable.GetEnumerator(), the .NET Framework must box the returned value (the iterator, implemented as a struct) in order to assign it to the reference-typed IEnumerator&lt;Symbol&gt; variable enumerator.





Workaround:





The solution is to rewrite FindMatchingSymbol, replacing the single statement with six lines of code that are still coherent, easy to read and understand, and easy to maintain.





public Symbol FindMatchingSymbol(string name)
{
    foreach (Symbol s in symbols)
    {
        if (s.Name == name)
            return s;
    }
    return null;
}





This code uses no LINQ extension methods, no lambda expressions, and no interface iterators, and it incurs no extra allocation. No allocation occurs because the compiler sees that symbols is a List&lt;T&gt; and can bind the returned struct enumerator directly to a local variable of the correct type, avoiding the boxing of the struct. The original code showed off the rich expressiveness of C# and the productivity of the .NET Framework; the rewritten code is more efficient yet still simple, and adds no complexity that would hurt maintainability.


Async





The next example shows a common problem that arises when trying to cache the result of an async method:




Example 6: caching in async methods





The Visual Studio IDE features built on the new C# and VB compilers fetch syntax trees constantly, and the compilers use async so that Visual Studio stays responsive. Here is the first version of the code for getting a syntax tree:





class Parser
{
    /*...*/

    public SyntaxTree Syntax { get; }

    public Task ParseSourceCode()
    {
        /*...*/
    }
}

class Compilation
{
    /*...*/

    public async Task<SyntaxTree> GetSyntaxTreeAsync()
    {
        var parser = new Parser(); // allocation
        await parser.ParseSourceCode(); // expensive
        return parser.Syntax;
    }
}





Calling GetSyntaxTreeAsync() instantiates a Parser, parses the code, and returns a Task&lt;SyntaxTree&gt;. The expensive parts are allocating the parser instance and parsing the code. The method returns a Task so that callers can await the parse and keep the UI thread free to respond to user input.





Because several Visual Studio features may ask for the same syntax tree multiple times, the parse result is often cached to save time and allocations, but this first attempt at caching can itself allocate:





class Compilation
{
    /*...*/

    private SyntaxTree cachedResult;

    public async Task<SyntaxTree> GetSyntaxTreeAsync()
    {
        if (this.cachedResult == null)
        {
            var parser = new Parser(); // allocation
            await parser.ParseSourceCode(); // expensive
            this.cachedResult = parser.Syntax;
        }
        return this.cachedResult;
    }
}





The code has a field named cachedResult of type SyntaxTree. When the field is null, GetSyntaxTreeAsync() does the work and saves the result in the cache, then returns the SyntaxTree. The problem is that when an async method with return type Task&lt;SyntaxTree&gt; returns a SyntaxTree value, the compiler emits code to allocate a Task holding the result (via Task&lt;SyntaxTree&gt;.FromResult()), mark it completed, and return it immediately. Allocating a Task to hold an already-completed result, in a method called this frequently, hurts responsiveness, so fixing this allocation can improve it considerably.





Workaround:





To eliminate the allocation of the completed Task, cache the Task object itself, which holds the completed result:





class Compilation
{
    /*...*/
    private Task<SyntaxTree> cachedResult;

    public Task<SyntaxTree> GetSyntaxTreeAsync()
    {
        return this.cachedResult ?? (this.cachedResult = GetSyntaxTreeUncachedAsync());
    }

    private async Task<SyntaxTree> GetSyntaxTreeUncachedAsync()
    {
        var parser = new Parser(); // allocation
        await parser.ParseSourceCode(); // expensive
        return parser.Syntax;
    }
}





This code changes the type of cachedResult to Task<SyntaxTree> and introduces an async helper function holding the body of the original GetSyntaxTreeAsync(). GetSyntaxTreeAsync() now uses the null-coalescing operator (??) to return cachedResult directly when it is not null; otherwise it calls GetSyntaxTreeUncachedAsync() and caches the resulting Task. Note that GetSyntaxTreeAsync does not await the call to GetSyntaxTreeUncachedAsync: since GetSyntaxTreeUncachedAsync already returns a Task<SyntaxTree>, GetSyntaxTreeAsync can return that Task immediately. The Task itself is now the cached object, so returning a cached result incurs no additional memory allocation.
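To make the effect concrete, here is a minimal, self-contained sketch of the same pattern (for illustration, SyntaxTree is replaced with string and the expensive parse is simulated with Task.Delay; CachedCompilation is a made-up name): the first call allocates and caches the Task, and every subsequent call returns that exact same Task instance.

```csharp
using System;
using System.Threading.Tasks;

class CachedCompilation
{
    private Task<string> cachedResult;

    // Returns the cached Task when present; otherwise starts the work once
    // and caches the resulting Task itself (not just its result).
    public Task<string> GetSyntaxTreeAsync() =>
        this.cachedResult ?? (this.cachedResult = GetSyntaxTreeUncachedAsync());

    private async Task<string> GetSyntaxTreeUncachedAsync()
    {
        await Task.Delay(10); // stands in for the expensive parse
        return "tree";
    }
}
```

Calling GetSyntaxTreeAsync() twice yields the same Task<string> object (ReferenceEquals on the two return values is true), so the per-call Task allocation from the earlier version is gone.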


Other miscellaneous things that affect performance





In large apps, or apps that process a lot of data, a few other potential performance problems can arise.


Dictionary





Dictionary is used very widely in many applications; although it is convenient and efficient, it is often used inappropriately. Profiling Visual Studio and the new compilers revealed that many dictionaries contained only one element or were entirely empty. An empty Dictionary has ten internal fields and occupies 48 bytes on the managed heap on an x86 machine. Dictionaries are useful when you need constant-time lookup in a mapping or associative data structure, but when there are only a few elements, a dictionary wastes a lot of memory. Instead, a List<KeyValuePair<K,V>> can be just as convenient and, for small element counts, just as efficient. If you use a dictionary only to load data and then read from it, a sorted array with O(log n) lookup can also be fast enough, depending on the number of elements.
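As a sketch of the List<KeyValuePair<K,V>> alternative (the SmallMap helper below is hypothetical, not a BCL API): a linear scan over a small list avoids the dictionary's up-front bucket and entry arrays, and for a handful of entries it performs comparably.

```csharp
using System;
using System.Collections.Generic;

static class SmallMap
{
    // Linear lookup over a small list; no hash buckets are allocated.
    public static bool TryGetValue<TKey, TValue>(
        List<KeyValuePair<TKey, TValue>> map, TKey key, out TValue value)
        where TKey : IEquatable<TKey>
    {
        foreach (var pair in map)
        {
            if (pair.Key.Equals(key))
            {
                value = pair.Value;
                return true;
            }
        }
        value = default(TValue);
        return false;
    }
}
```

The IEquatable<TKey> constraint keeps the Equals call non-boxing for value-type keys. Past a few dozen elements, the O(n) scan loses to Dictionary's constant-time lookup and you should switch back.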


Classes and structs





Roughly speaking, classes and structs offer a classic space/time trade-off when optimizing applications. On an x86 machine, every class instance costs 8 bytes of overhead even if it has no fields, to hold the type-object pointer and the sync-block index; but passing a class between methods is very cheap, because only a pointer to the instance is copied. A struct incurs no allocation on the managed heap (as long as it is not boxed), but when a larger struct is passed as a method argument or returned, CPU time is spent copying the entire struct; a common remedy is to cache the struct's properties in local variables to avoid excessive data copying.
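A small sketch of the "cache struct properties locally" advice (the Rect struct and View class are made up for illustration): every get of a property that returns a struct copies the whole struct, so reading it once into a local pays for one copy instead of several.

```csharp
struct Rect
{
    public double X, Y, Width, Height, Rotation; // 40 bytes of fields
}

class View
{
    public Rect Bounds { get; set; } // each get returns a copy of the struct

    public double Perimeter()
    {
        Rect b = this.Bounds;            // one copy into a local...
        return 2 * (b.Width + b.Height); // ...then cheap field reads on the local
    }
}
```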


Cache





A common performance-optimization technique is to cache results. However, if the cache has no maximum size and no sound resource-release mechanism, it can effectively leak memory. When processing large amounts of data, caching too much consumes so much memory that the resulting garbage collection can outweigh the benefit of looking results up in the cache.
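One way to bound a cache is a simple size-limited wrapper like the sketch below (a hypothetical illustration, not a library type; a real cache would likely want LRU rather than FIFO eviction and, for disposable values, a release hook):

```csharp
using System.Collections.Generic;

class BoundedCache<TKey, TValue>
{
    private readonly int capacity;
    private readonly Dictionary<TKey, TValue> map = new Dictionary<TKey, TValue>();
    private readonly Queue<TKey> insertionOrder = new Queue<TKey>();

    public BoundedCache(int capacity) { this.capacity = capacity; }

    public void Add(TKey key, TValue value)
    {
        if (!map.ContainsKey(key))
        {
            if (map.Count >= capacity)
                map.Remove(insertionOrder.Dequeue()); // evict the oldest entry
            insertionOrder.Enqueue(key);
        }
        map[key] = value;
    }

    public bool TryGet(TKey key, out TValue value) => map.TryGetValue(key, out value);
}
```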


Conclusion





In large systems, or systems that process large amounts of data, we need to watch for performance problems that compound at scale and hurt the app's responsiveness: boxing operations, string manipulation, LINQ and lambda expressions, caching results of async methods, caches without size limits or sound resource-release policies, inappropriate use of Dictionary, and passing structs around everywhere. While optimizing our applications, keep in mind the four points mentioned earlier:





Do not optimize prematurely - tune only after you have located and diagnosed a problem.


Measurements do not lie - without measurement, it is speculation.


Good tools are important - download PerfView and work through its tutorials.


Memory allocation determines the responsiveness of the app - this is where the new compiler performance team spent the most time.





Resources





If you want to watch a speech on this topic, you can watch it on Channel 9.


VS Profiler basics http://msdn.microsoft.com/en-us/library/ms182372.aspx


.NET performance analysis tools list http://msdn.microsoft.com/en-us/library/hh156536.aspx


Windows Phone performance analysis tool http://msdn.microsoft.com/en-us/magazine/hh781024.aspx


Some C# and VB performance optimization recommendations http://msdn.microsoft.com/en-us/library/ms173196(v=vs.110).aspx (Note: this link had no content at the time of writing; the address should be http://msdn.microsoft.com/en-us/library/ms173196(v=vs.100).aspx)


Some advanced optimization recommendations http://curah.microsoft.com/4604/improving-your-net-apps-startup-performance





--------------------------------------------------------------------------





That is the whole of this article. Much of it is actually quite basic, such as the differences between value types (structs) and reference types (classes) and when to use each, string operations, and boxing/unboxing; all of this is systematically covered in CLR via C#. It is worth emphasizing that we often do not realize a boxing operation is happening. For example, as the article mentions, calling GetHashCode on an enum causes boxing. Similarly, when we use a value type as a Dictionary key, the dictionary's internal implementation calls the key's GetHashCode method to compute its hash, and the default path causes a boxing operation. I was once asked about this in an interview, and long ago Lao Zhao wrote an article on preventing this boxing that only got halfway there. So this deserves particular care when writing code.
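To make the hidden boxing concrete, here is a sketch (the Color enum and BoxingDemo class are made up; the behavior described in the comments applied to older runtimes, and newer JITs optimize some of these cases away):

```csharp
using System;
using System.Collections.Generic;

enum Color { Red, Green, Blue }

static class BoxingDemo
{
    // Hash the underlying integer instead of the enum itself:
    // historically, enum.GetHashCode() boxed the value via System.Enum.
    public static int HashOf(Color c) => ((int)c).GetHashCode();

    // With a value-type key, Dictionary uses EqualityComparer<T>.Default;
    // since enums do not implement IEquatable<T>, older runtimes fell back
    // to Object.Equals and boxed both operands on every comparison.
    // Using the underlying int as the key is one way to sidestep that.
    public static string Lookup(Color c)
    {
        var names = new Dictionary<int, string>
        {
            [(int)Color.Red] = "red",
            [(int)Color.Green] = "green",
            [(int)Color.Blue] = "blue",
        };
        return names[(int)c];
    }
}
```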





Microsoft has rewritten the C# and Visual Basic compilers in managed languages and achieved better performance than the previous compilers. More importantly, the compilers are open source, and many powerful Visual Studio features are built on the compilers' parsing and analysis results. This is the embodiment of "compiler as a service": a traditional compiler is like a black box, where you feed code in at one end and a .NET assembly or object code comes out the other; the box is mysterious, and it is hard to participate in or understand its work along the way. Now that the compiler is open source, we can use the intermediate results it produces to implement features such as C# Interactive (sometimes called a REPL, read-eval-print loop). Taken far enough, you could even build a simple Visual Studio on top of it.





The author uses the performance-optimization practices of the managed C# and Visual Basic compilers to illustrate considerations and suggestions for performance work in many areas; points such as StringBuilder allocation overhead, caching async function return values, and the extra memory allocations generated by LINQ and lambda expressions are especially memorable. Another important lesson is not to optimize blindly without evidence: locating and identifying the cause of a performance problem comes first. I have used CLR Profiler, the VS profilers, and dotTrace to examine and analyze application performance; the PerfView tool mentioned in this article is used by Microsoft's internal .NET runtime team and can surface information that the general-purpose tools cannot provide. It is very powerful, and Channel 9 has a detailed introduction to how the tool is used.





I hope this article will help you to optimize the performance of. NET applications.

