.NET String Performance

Source: Internet
Author: User
Tags: coding standards

Introduction

The way you process strings in your code can have a surprising impact on performance. In this article, I look at two problems caused by the way strings are used: temporary string variables and string concatenation.

 

Background

Every project requires you to think about its coding standards. Using FxCop is a good start. One of my favorite groups of FxCop rules is "Performance".

So I ran FxCop over my project and found a series of string problems. I must admit one thing: I often run into problems related to the immutability of strings in C#. When I see myString.ToUpper(), I often forget that it does not change the content of myString but returns a whole new string (because strings in C# are immutable).
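The immutability pitfall can be shown in a few lines. This is a minimal sketch; the class and method names (ImmutabilityDemo, Shout) are made up for illustration and do not come from the original project.

```csharp
using System;

static class ImmutabilityDemo
{
    // Returns the upper-cased copy; the argument itself is never modified.
    public static string Shout(string text)
    {
        text.ToUpper();          // result discarded: 'text' is unchanged
        return text.ToUpper();   // the new string has to be captured or returned
    }

    static void Main()
    {
        string myString = "hello";
        string upper = Shout(myString);
        Console.WriteLine(myString); // prints "hello"
        Console.WriteLine(upper);    // prints "HELLO"
    }
}
```

The first call to ToUpper() inside Shout still allocates a new string, which is thrown away immediately. That is exactly the kind of temporary the rest of this article is about.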

I corrected the code to remove the FxCop warnings and then found that it really was faster than before. I decided to investigate, and ended up writing the test code described below.

 

Using the code

The test code is very simple. A console program calls four test methods, each of which executes a string processing routine 1,000 times (so that the total execution time is long enough to show the performance difference).

The four test methods form two groups of two. The first group compares two ways of performing a case-insensitive string comparison.

 

String Comparison and Temporary String Creation

The first test routine is a naive case-insensitive string comparison. The code for the comparison routine is:

static bool BadCompare(string stringA, string stringB)
{
    // Each ToUpper() call allocates a temporary string just for the comparison.
    return (stringA.ToUpper() == stringB.ToUpper());
}

For this code, FxCop gives the following advice:

"StringCompareTest.BadCompare(String, String):Boolean calls String.op_Equality(String, String):Boolean after converting 'stack1', a local, to upper or lowercase. If possible, eliminate the string creation and call the overload of String.Compare that performs a case-insensitive comparison."

This means that each call to ToUpper() creates a temporary string, which has to be allocated and later reclaimed by the garbage collector. That costs extra time and memory. The String.Compare method is, relatively speaking, more efficient.

The second test routine uses String.Compare:

static bool GoodCompare(string stringA, string stringB)
{
    // ignoreCase = true: compares without creating upper-cased copies.
    return (string.Compare(stringA, stringB, true, System.Globalization.CultureInfo.CurrentCulture) == 0);
}

This method prevents redundant temporary strings from being created.

According to the NProf profiling results, GoodCompare takes only 1.69% of the total execution time of the test program, while BadCompare takes 5.50%.

So the String.Compare approach is more than three times faster than the ToUpper approach. If you compare many strings in your code (especially in loops), using String.Compare can noticeably improve performance.
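The article's code uses the String.Compare overload that takes a CultureInfo. As a side note that is not part of the original tests, .NET Framework 2.0 and later also provide String.Equals overloads that take a StringComparison value, which state the intent directly and likewise avoid the temporary strings. The following is a sketch only; the class and method names are hypothetical and no timings are claimed for it.

```csharp
using System;

static class CompareSketch
{
    // Culture-sensitive and case-insensitive: the same intent as GoodCompare.
    public static bool EqualsIgnoreCaseCulture(string a, string b)
    {
        return string.Equals(a, b, StringComparison.CurrentCultureIgnoreCase);
    }

    // Ordinal and case-insensitive: usually the better fit for identifiers,
    // keys and other non-linguistic text.
    public static bool EqualsIgnoreCaseOrdinal(string a, string b)
    {
        return string.Equals(a, b, StringComparison.OrdinalIgnoreCase);
    }

    static void Main()
    {
        Console.WriteLine(EqualsIgnoreCaseOrdinal("Hello", "HELLO")); // prints "True"
        Console.WriteLine(EqualsIgnoreCaseOrdinal("Hello", "World")); // prints "False"
    }
}
```

Whether the culture-sensitive or the ordinal variant is appropriate depends on what the strings represent, so treat the choice as something to decide per call site.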

 

String Concatenation inside a loop

The second pair of test routines performs string concatenation inside a loop.

The code for the "bad" test routine is as follows:

static string BadConcatenate(string[] items)
{
    string strRet = string.Empty;

    foreach (string item in items)
    {
        // Each += builds a brand-new string and discards the previous one.
        strRet += item;
    }

    return strRet;
}

When FxCop sees this code, it gets angry and even marks the broken rule in red! FxCop says:

"Change StringCompareTest.BadConcatenate(String[]):String to use StringBuilder instead of String.Concat or +="

The code for the "good" test routine is as follows:

static string GoodConcatenate(string[] items)
{
    System.Text.StringBuilder builder = new System.Text.StringBuilder();

    foreach (string item in items)
    {
        // Append writes into the builder's internal buffer.
        builder.Append(item);
    }

    return builder.ToString();
}

This is practically the standard first example used to demonstrate System.Text.StringBuilder. The problem with the bad code is that it creates too many temporary strings. Because strings are immutable, the concatenation operator (+=) builds a new string from the two original strings and then points the original variable at that new string.
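The rebinding described above can be observed directly with object.ReferenceEquals. This is a small illustrative sketch; the class and method names are hypothetical, and it assumes a non-empty suffix.

```csharp
using System;

static class ConcatIdentity
{
    // Returns true if 'text += suffix' left the variable pointing at the same
    // string object it started with. Assumes 'suffix' is non-empty.
    public static bool SameInstanceAfterAppend(string text, string suffix)
    {
        string original = text;
        text += suffix;
        return object.ReferenceEquals(original, text);
    }

    static void Main()
    {
        string a = "x";
        string b = a;          // b refers to the same string object as a
        a += "y";              // builds a new string "xy" and rebinds a to it
        Console.WriteLine(b);  // prints "x": the original object is untouched
        Console.WriteLine(SameInstanceAfterAppend("x", "y")); // prints "False"
    }
}
```

Every pass through the loop in BadConcatenate does the same thing: the previous value of strRet becomes garbage as soon as the new string is assigned.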

However, profiling the code with NProf showed that BadConcatenate accounts for only 5.67% of the total execution time, while GoodConcatenate accounts for 22.09%. In other words:

Using StringBuilder takes almost four times as long as simple string concatenation!

Why?

This is partly due to the design of the test: the concatenation routine only joins ten short strings. StringBuilder is a more complex class than the simple, immutable string class, so creating a StringBuilder costs much more than concatenating ten simple strings.

I repeated the tests for different numbers of concatenations and found the following results:

Note: The values shown here are each test routine's share of the total execution time (%). GoodConcatenate does not actually get much faster; it only gets faster relative to BadConcatenate.

So, in these tests, StringBuilder only shows a real speed advantage once the number of strings being concatenated exceeds about 600.
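The crossover point depends on the machine, the runtime version, and the length of the strings, so it is worth re-measuring rather than trusting one number. The sketch below is one possible way to do that with System.Diagnostics.Stopwatch instead of a profiler; it is not the author's test harness, and the class name, the item counts, and the iteration count are arbitrary assumptions.

```csharp
using System;
using System.Diagnostics;
using System.Text;

static class ConcatTiming
{
    public static string WithOperator(string[] items)
    {
        string result = string.Empty;
        foreach (string item in items) result += item;
        return result;
    }

    public static string WithBuilder(string[] items)
    {
        StringBuilder builder = new StringBuilder();
        foreach (string item in items) builder.Append(item);
        return builder.ToString();
    }

    // Runs 'routine' the given number of times and returns elapsed milliseconds.
    public static long Time(Func<string[], string> routine, string[] items, int iterations)
    {
        routine(items); // warm-up call so JIT compilation is not timed
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) routine(items);
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }

    static void Main()
    {
        // Item counts chosen arbitrarily around the threshold reported above.
        foreach (int count in new[] { 10, 100, 600, 1000 })
        {
            string[] items = new string[count];
            for (int i = 0; i < count; i++) items[i] = "item";

            Console.WriteLine("{0,5} strings: += {1} ms, StringBuilder {2} ms",
                count, Time(WithOperator, items, 1000), Time(WithBuilder, items, 1000));
        }
    }
}
```

Wall-clock timings like these are noisy, so run the program several times and in a release build before drawing conclusions.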

Of course, another reason to use StringBuilder is memory allocation. Using CLRProfiler, I generated the following timeline of memory usage while concatenating 100 simple strings:

The area marked "A" shows the effect of BadConcatenate on memory allocation and release. The peak allocated memory grows rapidly and is accompanied by a large number of garbage collections (approximately 215 in this region).

The memory profile of GoodConcatenate appears immediately after area A. The peak allocated memory grows much less and is accompanied by far fewer garbage collections (approximately 60 in this region).

So in some cases, using the StringBuilder class will not make your code run faster, but it is friendlier to the garbage collector.
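If CLRProfiler is not at hand, a rough impression of the garbage collector load can be had from GC.CollectionCount, which reports how many collections of a given generation have occurred so far. The sketch below is an assumption-laden stand-in for the profiler run, not a reproduction of it: the class name, the iteration count, and the choice of generation 0 are illustrative only.

```csharp
using System;
using System.Text;

static class GcPressure
{
    // Returns how many generation-0 collections occurred while 'work' ran.
    public static int Gen0CollectionsDuring(Action work)
    {
        int before = GC.CollectionCount(0);
        work();
        return GC.CollectionCount(0) - before;
    }

    public static string WithOperator(string[] items)
    {
        string result = string.Empty;
        foreach (string item in items) result += item;
        return result;
    }

    public static string WithBuilder(string[] items)
    {
        StringBuilder builder = new StringBuilder();
        foreach (string item in items) builder.Append(item);
        return builder.ToString();
    }

    static void Main()
    {
        // 100 simple strings, as in the CLRProfiler run described above.
        string[] items = new string[100];
        for (int i = 0; i < items.Length; i++) items[i] = "item";

        int plus = Gen0CollectionsDuring(() =>
        {
            for (int i = 0; i < 10000; i++) WithOperator(items);
        });
        int builder = Gen0CollectionsDuring(() =>
        {
            for (int i = 0; i < 10000; i++) WithBuilder(items);
        });

        Console.WriteLine("Gen 0 collections: += {0}, StringBuilder {1}", plus, builder);
    }
}
```

The absolute counts depend on the runtime and its GC settings; only the comparison between the two routines on the same machine is meaningful.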

 

Conclusions

Use the String.Compare method for case-insensitive string comparisons. It is faster, and the code is simple and elegant.

Use StringBuilder for speed only when you perform more than about 600 concatenations in a loop. Note that the length of the strings you are processing also affects the final speed and the load on the garbage collector, so you should profile your actual code.

 

Points of interest

I was surprised by how much difference using the correct string operations makes in real-world code (although we do compare and concatenate a lot of strings in the current project).

The FxCop performance rules are a good starting point for finding potentially slow code, and they can guide you through some simple corrections that improve performance. Both of the issues discussed here are marked "NON-BREAKING" by FxCop, which means that fixing them should not break code that depends on the changed code. I think making a non-breaking change that improves performance is a no-brainer.

 

Further Considerations by Allen Lee

Most .NET developers agree that StringBuilder should be used for string concatenation. But have you ever wondered whether this rule of thumb applies as broadly as you think? After reading this article, you may have realized that it is a matter of degree. For small-scale concatenation, the improvement StringBuilder brings is not enough to cover the overhead of its added complexity. Only once the number of concatenations reaches a critical threshold do the two balance out.

For real code, a usable threshold value may be necessary, especially when developing for resource-constrained systems. You may already understand the factors that affect the threshold and doubt the number the author gives here. Perhaps the test design in this article is a little simple and will not convince everyone, but the article does show that StringBuilder is not the right choice in every situation. Because the factors that affect the threshold are always subject to change, you cannot find one fixed threshold that applies everywhere. If you really care about the performance impact, you should measure your own code and adjust as things change. As a starting point, you can take the number mentioned in this article as a reference and fine-tune it for your situation until you are satisfied.

 
