From the Evolution of Delegate Writing in .NET (Part 3): Performance


In the previous article we looked in detail at how lambda expressions (which construct delegates) are used in C# 3.0, their semantic advantages, and how they simplify programming. Those topics already count as "extended content" around delegates. Let's go one step further this time and discuss the performance side of these ways of writing.

Loop Splitting and Its Performance

In the first example of the previous article, we demonstrated how to use lambda expressions together with the extension methods defined in .NET 3.5 to easily process elements in a collection (filtering, conversion, and so on). However, some may point out that this "ordinary way of writing" is not the best implementation. For convenience, and to highlight the "performance" issue, let's simplify the original requirement: output the squares of the even numbers in a sequence as a list. One possible way to write it is:

static List<int> EvenSquare(IEnumerable<int> source)
{
    var evenList = new List<int>();
    foreach (var i in source)
    {
        if (i % 2 == 0) evenList.Add(i);
    }

    var squareList = new List<int>();
    foreach (var i in evenList)
    {
        squareList.Add(i * i);
    }

    return squareList;
}

Theoretically, this method is indeed inferior to the following:

 
static List<int> EvenSquareFast(IEnumerable<int> source)
{
    var result = new List<int>();
    foreach (var i in source)
    {
        if (i % 2 == 0) result.Add(i * i);
    }
    return result;
}

The second method filters and converts the data in a single traversal. The first method divides the work into two steps according to the "functional description": filter, then convert, keeping the intermediate result in a temporary list. While elements are added to that temporary list, List<int> may double its capacity and copy elements whenever the capacity runs out, causing a performance loss. Although we can reach a conclusion through "analysis", it is better to use CodeTimer to measure the actual results:

 
var source = new List<int>();
for (var i = 0; i < 10000; i++) source.Add(i);

// warm up
EvenSquare(source);
EvenSquareFast(source);

CodeTimer.Initialize();
CodeTimer.Time("Normal", 10000, () => EvenSquare(source));
CodeTimer.Time("Fast", 10000, () => EvenSquareFast(source));

We prepared a list of length 10,000 and ran EvenSquare and EvenSquareFast 10,000 times each. The results are as follows:

 
Normal
Time elapsed:   3,506 ms
CPU cycles:     6,713,448,335
Gen 0:          624
Gen 1:          1
Gen 2:          0

Fast
Time elapsed:   2,283 ms
CPU cycles:     4,390,611,247
Gen 0:          312
Gen 1:          0
Gen 2:          0

The results match our expectations: EvenSquareFast is ahead of EvenSquare in both elapsed time and GC. However, in practice, which method should we choose? If it were me, I would prefer EvenSquare, on the grounds that it is "clearer".

Although EvenSquare uses an additional temporary container to hold intermediate results (costing some performance and GC), its logic matches the functional description of what we need, so we can easily see the meaning the code expresses. The performance loss it causes is usually negligible in real projects. In fact, most of the time is consumed by the actual work done at each step. For example, a single Int32.Parse call takes dozens or even hundreds of times longer than a simple multiplication. So although our test shows a performance gap of over 50%, that gap covers only the time of "pure traversal"; once the real per-step work is included, the gap may shrink to 10%, 5%, or even less.

Of course, if the logic is as simple as the code above, there is no problem using the EvenSquareFast approach. In fact, we do not have to insist that all steps be fully merged (only one loop) or completely separated; we can seek a balance between readability and performance. For example, for a five-step task, using two loops might be most appropriate.
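The trade-off between the two styles can be sketched in Python (used here purely for illustration, since its list-building code reads much like the C# above; the function names are my own):

```python
def even_square_split(source):
    """Split-loop style: filter into a temporary list first, then transform."""
    evens = []
    for i in source:
        if i % 2 == 0:
            evens.append(i)
    squares = []
    for i in evens:
        squares.append(i * i)
    return squares

def even_square_merged(source):
    """Merged-loop style: filter and transform in a single pass."""
    result = []
    for i in source:
        if i % 2 == 0:
            result.append(i * i)
    return result

# Both produce the same output; the split version reads step by step.
print(even_square_split(range(10)))   # [0, 4, 16, 36, 64]
```

The split version allocates an extra list, exactly the cost discussed above, while the merged version does everything in one pass.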

Speaking of "splitting loops", this is similar to one of the refactorings listed on Martin Fowler's website: "Split Loop". Although Split Loop differs slightly from our scenario, it likewise avoids putting multiple pieces of logic in one loop, for the sake of code readability. After the loop is split, you can apply "Extract Method" or "Replace Temp with Query" for further refactoring. Naturally, Fowler also mentions the performance impact of splitting:

You often see loops that are doing two different things at once, because they can do that with one pass through a loop. Indeed most programmers would feel very uncomfortable with this refactoring as it forces you to execute the loop twice - which is double the work.

But like so many optimizations, doing two different things in one loop is less clear than doing them separately. It also causes problems for further refactoring as it introduces temps that get in the way of further refactorings. So while refactoring, don't be afraid to get rid of the loop. When you optimize, if the loop is slow that will show up and it would be right to slam the loops back together at that point. You may be surprised at how often the loop isn't a bottleneck, or how the later refactorings open up another, more powerful, optimization.

 

After splitting, you may even discover better optimization opportunities. Donald Knuth also said that "premature optimization is the root of all evil". All these arguments encourage us to write programs that are clearer, rather than programs that merely "look" more efficient.

Deferred Execution of Extension Methods

For the preceding simplified requirement, you can use lambda expressions and the extension methods built into .NET 3.5, like this:

 
static List<int> EvenSquareLambda(IEnumerable<int> source)
{
    return source.Where(i => i % 2 == 0).Select(i => i * i).ToList();
}

Many people know that when processing a collection in .NET 3.5, these extension methods have a "deferred" effect. That is, the delegates (the two lambda expressions) in the Where and Select calls are executed only when the ToList method is called. This is both an advantage and a trap, so when using these methods we still need to understand their effects. However, these methods involve no real "tricks": their behavior is consistent with what ordinary reasoning would predict. If you are willing to understand them, you could write similar methods yourself, or at least explain them convincingly, and your understanding will not go astray. But if you do not want to work out how they are built, you can test them instead. The experimental method is simple: just as we verified the "repeated computation" trap earlier, observe when and in what order the delegates are executed.
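The "deferred" effect can be shown with a minimal sketch. Here it is in Python, whose generators defer execution in the same way as these iterator-based extension methods (the names are mine, purely for illustration):

```python
executed = []

def traced_sequence(source):
    """Yields each element, recording when the work actually happens."""
    for i in source:
        executed.append(i)
        yield i

seq = traced_sequence([1, 2, 3])  # building the sequence runs nothing
assert executed == []             # no element has been processed yet
result = list(seq)                # like calling ToList(): only now does the loop run
print(executed)                   # [1, 2, 3]
```

Constructing the sequence is cheap; all the real work is postponed until something enumerates it.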

Okay, back to our current question. We know about the "deferral" effect: Where and Select do their processing only when ToList runs. But how exactly do they process the data? Do they create a temporary container (such as a List<T>), fill it, and return it, as our "ordinary method" does? Rather than analyzing this in depth, we will again find the answer by observing the timing and order of delegate execution. The key to this approach is to print some information while each delegate runs. For that we need a Wrap method (you can also use it in your own experiments):

 
static Func<T, TResult> Wrap<T, TResult>(Func<T, TResult> func, string messageFormat)
{
    return i =>
    {
        var result = func(i);
        Console.WriteLine(messageFormat, i, result);
        return result;
    };
}

The purpose of the Wrap method is to encapsulate a Func<T, TResult> delegate object and return a delegate object of the same type. Each time the wrapper is executed, the delegate we provided is executed, and a line is printed according to the messageFormat we passed in. For example:

var wrapper = Wrap<int, int>(i => i + 1, "{0} + 1 = {1}");
for (var i = 0; i < 3; i++) wrapper(i);

The output is as follows:

 
0 + 1 = 1
1 + 1 = 2
2 + 1 = 3
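For comparison, here is a Python sketch of the same Wrap idea (my own transliteration of the C# above, not part of any library):

```python
def wrap(func, message_format):
    """Wrap a one-argument function so each call prints a formatted trace line."""
    def wrapper(i):
        result = func(i)
        print(message_format.format(i, result))
        return result
    return wrapper

wrapper = wrap(lambda i: i + 1, "{0} + 1 = {1}")
for i in range(3):
    wrapper(i)   # prints "0 + 1 = 1", "1 + 1 = 2", "2 + 1 = 3"
```

The wrapper changes nothing about the computation itself; it only makes each invocation visible, which is exactly what we need to observe execution order.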

So what will be printed in the following code?

var source = new List<int>();
for (var i = 0; i < 10; i++) source.Add(i);

var finalSource = source
    .Where(Wrap<int, bool>(i => i % 3 == 0, "{0} can be divided by 3? {1}"))
    .Select(Wrap<int, int>(i => i * i, "The square of {0} equals {1}."))
    .Where(Wrap<int, bool>(i => i % 2 == 0, "The result {0} can be divided by 2? {1}"));

Console.WriteLine("==== Start ====");
foreach (var item in finalSource)
{
    Console.WriteLine("==== Print {0} ====", item);
}

We prepare a list containing 10 elements ranging from 0 to 9 and apply Where... Select... Where. Can you guess what appears on the screen after the foreach?

==== Start ====
0 can be divided by 3? True
The square of 0 equals 0.
The result 0 can be divided by 2? True
==== Print 0 ====
1 can be divided by 3? False
2 can be divided by 3? False
3 can be divided by 3? True
The square of 3 equals 9.
The result 9 can be divided by 2? False
4 can be divided by 3? False
5 can be divided by 3? False
6 can be divided by 3? True
The square of 6 equals 36.
The result 36 can be divided by 2? True
==== Print 36 ====
7 can be divided by 3? False
8 can be divided by 3? False
9 can be divided by 3? True
The square of 9 equals 81.
The result 81 can be divided by 2? False

The elements in the list are processed in the following order:

    1. The first element, "0", goes through Where... Select... Where and is printed.
    2. The second element, "1", fails the first Where and is dropped.
    3. The third element, "2", fails the first Where and is dropped.
    4. The fourth element, "3", goes through Where... Select... and is dropped by the second Where.
    5. ......

This shows that when we use the Where and Select methods that come with the .NET Framework, the final effect is similar to the "merged loop" of the previous section. It is not the case that a temporary container is created to hold elements, with all elements first passing through the first Where delegate (i => i % 3 == 0) and the filtered elements then being handed to the Select delegate (i => i * i). Note that this "merged loop" effect is hidden from the outside: our code looks as though it processes the collection step by step. In other words, we write in the clear style of the "split loop" but get the efficient execution of the "merged loop". This is the magic of the .NET Framework extension methods.
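The streaming behavior can be reproduced with Python generators (a sketch; `where` and `select` here are hypothetical helpers of my own that mirror the extension methods, not .NET code):

```python
def where(source, predicate):
    """Deferred filter: yields matching elements one at a time."""
    for i in source:
        if predicate(i):
            yield i

def select(source, projector):
    """Deferred projection: transforms elements one at a time."""
    for i in source:
        yield projector(i)

# Same chain as the C# example: Where -> Select -> Where.
pipeline = where(
    select(where(range(10), lambda i: i % 3 == 0), lambda i: i * i),
    lambda i: i % 2 == 0)

# Each element flows through the whole chain before the next is pulled,
# and no temporary list is ever built.
print(list(pipeline))  # [0, 36]
```

Only 0 and 36 survive the chain, matching the two "Print" lines in the trace above.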

Before we run a concrete performance test, think about this: which of the GoF 23 patterns do all these IEnumerable objects implement? Iterator? When we see IEnumerable, the Iterator pattern is indeed obvious. But in fact the "Decorator" pattern is also at work here: each Where or Select wraps the original object in a new IEnumerable object, so that traversing the new iterator produces the "decorated" effect. So if someone asks you for examples of the Decorator pattern in the .NET Framework, besides the Stream classes everyone knows, you can also mention some of the extension methods in the .NET 3.5 System.Linq.Enumerable class. Cool, right?

Performance Testing of the Extension Methods

After the analysis in the previous section, we know that the Where and Select extension methods combine the outward appearance of the "split loop" with the inner workings of the "merged loop"; that is, they take both "readability" and "performance" into account. Now we use the following code to verify this:

var source = new List<int>();
for (var i = 0; i < 10000; i++) source.Add(i);

// warm up
EvenSquare(source);
EvenSquareFast(source);
EvenSquareLambda(source);

CodeTimer.Initialize();
CodeTimer.Time("Normal", 10000, () => EvenSquare(source));
CodeTimer.Time("Fast", 10000, () => EvenSquareFast(source));
CodeTimer.Time("Lambda", 10000, () => EvenSquareLambda(source));

The result is as follows:

Normal
Time elapsed:   3,127 ms
CPU cycles:     6,362,621,144
Gen 0:          624
Gen 1:          3
Gen 2:          0

Fast
Time elapsed:   2,031 ms
CPU cycles:     4,070,470,778
Gen 0:          312
Gen 1:          0
Gen 2:          0

Lambda
Time elapsed:   2,675 ms
CPU cycles:     5,672,592,948
Gen 0:          312
Gen 1:          156
Gen 2:          0

In terms of time, the performance of the "extension methods" falls between "split loop" and "merged loop". In GC terms, the "extension methods" implementation also beats the "split loop" (do you agree?). So we can draw the following conclusions:

                     Performance    Readability
Split loop           No. 3          No. 2
Merged loop          No. 1          No. 3
Extension methods    No. 2          No. 1

You need to determine which method to choose.

It is worth noting that whether we were discussing the "deferral" effect or the "split loop" appearance, we were only talking about Where and Select. In fact, not all extension methods share these features. For example:

    • Not deferred: ToArray, ToList, Any, All, Count, ......
    • Not element-by-element streaming (they buffer): OrderBy, GroupBy, ToDictionary, ......

Don't worry: as mentioned earlier, whether a method "defers" or "streams" is usually obvious. If you could write a similar method yourself, or at least explain it convincingly, your judgment will not go wrong. For example, why doesn't OrderBy stream? Because before handing anything to the next stage, it must retrieve all the elements of the previous step in order to sort them. If such "reasons" do not come to you "naturally", write a program to verify them.
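We can verify OrderBy's buffering behavior with a small Python experiment (again a sketch with hypothetical helpers of my own, not .NET code):

```python
def select(source, projector):
    """Streams: touches one element per pull."""
    for i in source:
        yield projector(i)

def order_by(source, key):
    """Cannot stream: must consume the whole input before yielding anything."""
    for i in sorted(source, key=key):
        yield i

touched = []
def spy(i):
    touched.append(i)
    return i

next(select([3, 1, 2], spy))    # pull a single element through select...
print(touched)                  # [3] -- only one element was touched

touched.clear()
next(order_by(select([3, 1, 2], spy), key=lambda i: i))
print(touched)                  # [3, 1, 2] -- sorting forced a full pass first
```

Pulling one element through `select` touches one source element, but pulling one element through `order_by` drains the entire upstream sequence, which is exactly the reason given above.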

Other performance problems

In general, these extension methods have no performance problems themselves, but anything can be abused, and abuse is the real performance killer in a program. For example:

 
IEnumerable<int> source = ...;
for (var i = 0; i < source.Count(); i++)
{
    ...
}

The problem with this code is that every iteration recomputes the number of elements in source, which means a full traversal each time, over and over. For "immediately executed" methods such as Count or Any, we must keep one question in mind: will this cause a performance problem? Such problems are indeed easy to spot, yet I have seen exactly this mistake. Even before these extension methods appeared, I saw friends write similar code, such as evaluating str.Split(',').Length inside a for loop.

Let me also emphasize the problem of "repeated computation", which may be even more easily overlooked. If the collection you traverse repeatedly is a list in memory, the impact on performance may be small. But if each traversal fetches the raw data again from an external data source (such as a database), the resulting performance problem cannot be ignored.
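The danger is easy to simulate. In this Python sketch, `fetch_rows` is a hypothetical stand-in for a query against an external data source:

```python
call_count = 0

def fetch_rows():
    """Stand-in for an expensive query; counts how often it runs."""
    global call_count
    call_count += 1
    return [10, 20, 30, 40]

# Bad: the loop condition re-runs the query on every test,
# like calling source.Count() inside a for condition.
i = 0
while i < len(fetch_rows()):
    i += 1
print(call_count)   # 5 -- four passing tests plus the final failing one

# Good: fetch once and reuse the result.
call_count = 0
rows = fetch_rows()
i = 0
while i < len(rows):
    i += 1
print(call_count)   # 1
```

A four-element loop already quintuples the query cost; on a real database the difference would dominate everything else in the program.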

Summary

This series ends here, though it seems there is still more I would like to say. These three articles grew out of my reflections on recent interviews. As I said at the beginning, I hoped to show that "delegates are also a topic worth discussing". Understanding the ways of writing delegates, and understanding these changes, is not just a matter of knowing "several ways to write the same thing". Here I quote a comment from a fellow reader, which I think makes sense:

I think looking at something across different versions makes it easier to judge how well a person understands it. Each new feature added in a new version is not random: why was it added? What was lacking before it was added? What benefits came after it was added? When and in what scenarios should these new features be used, and what are the risks and disadvantages of using them, and so on. If a candidate can answer these fluently, even if not entirely correctly, it at least means he has thought about them carefully, because these things may not be found in books or online.

Therefore, during interviews I also ask about the "evolution of delegate syntax" or similar questions. I even think it is a very good starting point, because such a classic evolution is rare in .NET. If a person can explain all of this clearly, I have reason to believe that his attitude toward technology deserves recognition. On the other hand, many of the comments under the original article were ridicule. Behind that, I do not know whether those readers truly understand delegates and find the content not worth mentioning, or whether they merely "think" they understand everything, not knowing how much depth hides behind the seemingly simple.

I think we should not "despise" anything lightly: for example, despising the questions the other party raises in an interview, valuing frameworks over languages, or valuing the so-called "low level" while despising "applications". Finally, let me end the article by quoting Wayne's self-mocking remark:

Generally speaking, those who claim the C language is more powerful than C# have written nothing in C beyond Hello World.

Related Articles
  • From the Evolution of Delegate Writing in .NET (Part 1): Delegates and anonymous methods
  • From the Evolution of Delegate Writing in .NET (Part 2): Lambda expressions and their advantages
  • From the Evolution of Delegate Writing in .NET (Part 3): Performance

       

Note 1: Although Where and Select have the "deferred" effect, whether the internal implementation buffers like a "split loop" or streams like a "merged loop" is a separate choice. Can you try to provide both a "split loop" (buffering) and a "merged loop" (streaming) implementation of Where or Select, while still preserving "deferral"?
