Comparing methods and performance for counting occurrences of a character in a string

Source: Internet
Author: User
Today a question came up in a "special discussion": how do you count the occurrences of a character in a string in C#, for example the number of times 'a' appears in "adsfgehergasdf"? The first method is to walk the string from the beginning and count the matches:

int c1 = 0;
for (int i = 0; i < str.Length; i++)
{
    // Compare with ==; a single = would be an assignment.
    if (str[i] == 'a')
    {
        c1++;
    }
}

The second method is also easy to think of: remove every occurrence of the target character from the string, then compare the string length before and after the removal. Someone dismissed this method, saying it performs poorly and wastes memory.

c2 = str.Length - str.Replace("a", String.Empty).Length;

Next, someone proposed a third method: use the target character as a separator, split the original string into substrings, and count the pieces. A string containing n occurrences splits into n + 1 pieces, so the count is the number of pieces minus one. In C# this is very short:

c3 = str.Split(new char[] { 'a' }).Length - 1;
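For side-by-side comparison, the three approaches can be wrapped as small helper methods. This is only a minimal sketch: the class name `CharCounter` and the method names are hypothetical and do not come from the original discussion.

```csharp
using System;

// Hypothetical helper class; the names are illustrative only.
public static class CharCounter
{
    // Method 1: walk the string and count matching characters.
    public static int CountByLoop(string str, char target)
    {
        int count = 0;
        for (int i = 0; i < str.Length; i++)
        {
            if (str[i] == target)
            {
                count++;
            }
        }
        return count;
    }

    // Method 2: remove every occurrence, then compare lengths.
    public static int CountByReplace(string str, char target)
    {
        return str.Length - str.Replace(target.ToString(), string.Empty).Length;
    }

    // Method 3: split on the target; n occurrences yield n + 1 pieces.
    public static int CountBySplit(string str, char target)
    {
        return str.Split(new char[] { target }).Length - 1;
    }
}
```

For the example string "adsfgehergasdf" and the character 'a', all three helpers return 2. They also agree on an empty string, where each returns 0.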

We can infer the performance ranking from how each method works, but how large is the gap? That still has to be measured. Here is a classic test harness:

string str = "sadthdgsafsdgtghrdgsadfaddrhdfsgasdaa";

Stopwatch sw = new Stopwatch();

long t;
int c = 0;
// Collect garbage and drain pending UI messages before timing.
GC.Collect();
Application.DoEvents();

sw.Start();

for (int i = 0; i < 100000; i++)
{
    // c = (one of the three algorithms above)
}

sw.Stop();

t = sw.ElapsedMilliseconds;
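The harness above uses `Application.DoEvents()`, so it presumably ran inside a Windows Forms application. Below is a rough sketch of the same measurement without that dependency, using only `System.Diagnostics.Stopwatch`. The class `CountBenchmark` and its `Measure` and `Run` methods are hypothetical names introduced for illustration. No timings are claimed here, since they depend on the machine and runtime.

```csharp
using System;
using System.Diagnostics;

// Hypothetical benchmark helper; not part of the original test code.
public static class CountBenchmark
{
    // Times `iterations` calls of `count` on `str` and returns elapsed milliseconds.
    // The last computed count is passed back through `result` so the work is observable.
    public static long Measure(Func<string, int> count, string str, int iterations, out int result)
    {
        result = 0;
        GC.Collect(); // collect garbage before timing, as in the original harness
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            result = count(str);
        }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    // Runs the three methods on the article's test string and prints the timings.
    public static void Run()
    {
        string str = "sadthdgsafsdgtghrdgsadfaddrhdfsgasdaa";
        const int iterations = 100000;
        int c;

        long t1 = Measure(s =>
        {
            int n = 0;
            for (int i = 0; i < s.Length; i++)
            {
                if (s[i] == 'a') n++;
            }
            return n;
        }, str, iterations, out c);
        Console.WriteLine("Traversal: {0} ms (count = {1})", t1, c);

        long t2 = Measure(s => s.Length - s.Replace("a", string.Empty).Length, str, iterations, out c);
        Console.WriteLine("Replace:   {0} ms (count = {1})", t2, c);

        long t3 = Measure(s => s.Split(new char[] { 'a' }).Length - 1, str, iterations, out c);
        Console.WriteLine("Split:     {0} ms (count = {1})", t3, c);
    }
}
```

Calling `CountBenchmark.Run()` from a console program's `Main` prints one line per method. Each call goes through a delegate, which adds a small per-call overhead, so the absolute numbers will not match an inlined loop. The relative ordering is what matters.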

First, we make sure the methods are correct. All three handle the various cases properly: the character appearing at the start or end of the string, appearing consecutively, not appearing at all, or the string being empty. The string I used is a very ordinary one. Compiled in Release mode, after ten warm-up runs, the results were:

Traversal and count: 13 ms
Replace, then compare lengths: 112 ms
Split, then count pieces: 233 ms

The timings differ considerably. Traversal is roughly nine times faster than replacement (13 ms versus 112 ms), and splitting is about twice as slow again. Next, I ran the following two tests:

1. Increase or decrease the number of occurrences of the target character without changing the length of the string.
2. Keep the frequency of the target character unchanged, but increase the length of the string.

The results show that all three methods slow down linearly as the string gets longer, and that the latter two also slow down as the target character occurs more often. The split method is additionally affected by how the target characters are distributed within the string.

The implementations of the Replace and Split functions would fully explain these results, but I am not in the mood to study them closely. I decided to go with the second method: replace, then compare lengths. Although it is slower than the first method, it is easy to adapt to counting occurrences of a substring longer than one character. Adapting the first method to substrings longer than one character means taking more factors into account (although that is not necessarily very troublesome).
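To make that last point concrete, here is a sketch of how the replacement method might extend to substrings. Each removed occurrence shortens the string by the substring's length, so the length difference is divided by that length. An `IndexOf`-based loop is included as one possible way to adapt the traversal approach. It counts non-overlapping matches, which matches the behavior of `Replace`. The class and method names are hypothetical.

```csharp
using System;

// Hypothetical helper class; the names are illustrative only.
public static class SubstringCounter
{
    // Replacement method: divide the length difference by the pattern length.
    public static int CountByReplace(string str, string pattern)
    {
        if (string.IsNullOrEmpty(pattern))
        {
            throw new ArgumentException("pattern must not be empty", "pattern");
        }
        return (str.Length - str.Replace(pattern, string.Empty).Length) / pattern.Length;
    }

    // Traversal-style alternative: search again from just past the previous match.
    public static int CountByIndexOf(string str, string pattern)
    {
        if (string.IsNullOrEmpty(pattern))
        {
            throw new ArgumentException("pattern must not be empty", "pattern");
        }
        int count = 0;
        int index = 0;
        while ((index = str.IndexOf(pattern, index, StringComparison.Ordinal)) >= 0)
        {
            count++;
            index += pattern.Length; // skip the match, so overlapping matches are not counted
        }
        return count;
    }
}
```

With the input "abcabcab" and the pattern "abc", both methods return 2. Neither counts overlapping matches, so "aaaa" with the pattern "aa" yields 2 rather than 3.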
