How to Make Your Program Run Faster: Try the Performance Analysis Tool in Visual Studio (by Jun Guo)


Performance? We are back to this eternal topic. Yes, most programmers are persistent in their pursuit of performance. As the old slogan goes, the country wants things done "more, faster, better, and cheaper", and making our programs run faster and better is one of our greatest pleasures. When the same task takes someone else's program a minute but your own program only a few seconds, that is a cool thing (you beat 99% of programmers across the country...)!

However, efficiency optimization is not easy in practice. Time complexity is where the biggest efficiency gaps open up, but it is also the hardest place to gain an edge over other people: the solutions to many problems are mature, and finding an algorithm with better time complexity is rarely easy. Yet even two programs with the same complexity often differ greatly in running speed because of different constant factors. Try qsort from stdlib.h against sort from the STL and you will see how far apart two O(N log N) sorting implementations can be. In addition, real programs usually involve I/O, thread communication, and other operations whose speed is not captured by complexity analysis at all.
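The same constant-factor gap is easy to observe in C# itself: Array.Sort and LINQ's OrderBy are both O(N log N), yet their constants differ noticeably. Below is a minimal timing sketch; the array size is arbitrary and the two sorts compared are only an illustration:

using System;
using System.Diagnostics;
using System.Linq;

class ConstantFactorDemo
{
    static void Main()
    {
        Random rand = new Random(42);
        int[] a = Enumerable.Range(0, 5000000).Select(i => rand.Next()).ToArray();
        int[] b = (int[])a.Clone();

        Stopwatch sw = Stopwatch.StartNew();
        Array.Sort(a); // in-place, unstable sort
        Console.WriteLine("Array.Sort: " + sw.ElapsedMilliseconds + " ms");

        sw.Restart();
        int[] c = b.OrderBy(x => x).ToArray(); // stable LINQ sort: same O(N log N), bigger constant
        Console.WriteLine("OrderBy:    " + sw.ElapsedMilliseconds + " ms (" + c.Length + " items)");
    }
}

On most machines Array.Sort wins by a clear margin even though both sorts have the same complexity, which is exactly the kind of gap complexity analysis alone cannot predict.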

Therefore, squeezing as much efficiency as possible out of a program within limited time is both important and complicated. Visual Studio provides us with a powerful performance analysis tool that quickly identifies a program's performance bottlenecks, so we can improve its constant factors in a targeted way.


Let's look at a simple example. The following program is very simple: it reads a text file, counts how often each word occurs, and outputs the 100 words with the highest frequency. A word is a maximal run of consecutive uppercase and lowercase letters; for example, "I'm" is treated as the two words "I" and "m".
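To see this rule in action, here is a quick standalone check of Regex.Split (the input string is just an example):

using System;
using System.Text.RegularExpressions;

class SplitDemo
{
    static void Main()
    {
        // "I'm" breaks at the apostrophe into "I" and "m".
        string[] parts = Regex.Split("I'm a programmer.", "[^a-zA-Z]");
        Console.WriteLine(string.Join("|", parts));
        // Prints: I|m|a|programmer|
        // The trailing "." leaves an empty entry; consecutive delimiters
        // do the same, which is why the program below skips word == "".
    }
}

With that in mind, here is the full program: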

using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

static void Main(string[] args)
{
    const int MAX_WORD_NUM = 1000000;
    const int BUFFER_SIZE = 100000;
    const int OUTPUT_NUM = 100;

    // Read in the file
    StreamReader sr = new StreamReader(new BufferedStream(new FileStream(
                "different.txt", FileMode.Open), BUFFER_SIZE));
    string data = sr.ReadToEnd();

    // Cut out words
    string[] words = Regex.Split(data, "[^a-zA-Z]");

    // Count word frequencies
    Dictionary<string, int> dict = new Dictionary<string, int>((int)(MAX_WORD_NUM * 1.5));
    foreach (var word in words)
    {
        if (word == "")
            continue;

        if (dict.ContainsKey(word))
            dict[word]++;
        else
            dict.Add(word, 1);
    }

    List<Tuple<int, string>> list = new List<Tuple<int, string>>(MAX_WORD_NUM);
    foreach (var item in dict)
    {
        list.Add(Tuple.Create(item.Value, item.Key));
    }

    // Output the 100 words with the highest frequency
    list.Sort();
    int count = 0;
    for (int i = list.Count - 1; i >= 0; i--)
    {
        Console.WriteLine(list[i].Item2 + " " + list[i].Item1);
        count++;
        if (count >= OUTPUT_NUM)
            break;
    }

    sr.Close();
}

The test file different.txt is about 100 MB. OK, now let's run the program! After about 10 s, the program prints its results. The results are correct, but the efficiency is far from great (a similar word-frequency program mentioned in a previous blog post of mine handled its text file in about 4 s, i.e., roughly 8 times the speed of the program above). So what is wrong with the program? Opinions differ. Some blame the file reading, because I/O is slow. Some say the Hash table's lookups and insertions are the costly part (ideally O(1), but collisions can degrade that). Some suspect the final sort, since it has the highest complexity (ignoring string lengths, the other operations are O(N) while the sort is O(N log N), which is indeed a bit higher).


Rather than keep arguing, let's try the performance analysis tool.

Make sure the program is compiled in Release mode. Then, in Visual Studio 2012, select "Analyze" --> "Launch Performance Wizard", and you can see:


We can see two performance analysis methods:

  • CPU sampling
  • Instrumentation

To put it simply, with CPU sampling, Visual Studio periodically checks which function the running program is currently executing and records it. After the program finishes, Visual Studio gives a general picture of how the program's running time was distributed. The advantage of this method is that the program needs no modification and runs at full speed, so performance bottlenecks can be found quickly. The downside is that the numbers are not exact and can carry some error.

With instrumentation, Visual Studio injects probing code into every function, so every action of the program is recorded and all of its performance data can be measured precisely. However, this method greatly increases the program's running time, and analyzing the collected data also takes much longer.

 

In general, we first use CPU sampling to locate the performance bottleneck, and then examine specific modules in detail with instrumentation. Since the two methods are used similarly, the following only shows CPU sampling. After sampling the program above, we get the following report:


Click the Main function in the report to view a more detailed breakdown, like this:


In the "called function" on the rightmost side, click the corresponding function.You can also jump to the time consumption statistics of each line of code in the function. Since the time-consuming statistics of the above functions are enough for performance optimization, it is not necessary for me to go into the code line for the time being, so I will not go into details here.

As you can see, sorting really does occupy a lot of the running time. But there is an unexpected entry: the Regex.Split function takes nearly 30% of the running time. By contrast, the Hash table's lookups and insertions take only about 1% of the time, and the I/O overhead is barely visible at all. Some people may have guessed from the start that Split would be time-consuming, but hardly anyone would expect it to take 30%.

 

We may as well implement this Split function ourselves (don't ask why I am not improving Sort first; I just want to demonstrate the process here). The code is as follows:

// Cut out words, scanning backward from the end of the buffer
List<string> words = new List<string>(MAX_WORD_NUM * 10);
int curPos = data.Length - 1, lastPos = curPos;
while (curPos >= 0)
{
    // Skip non-letter characters
    while (curPos >= 0 && !char.IsLetter(data[curPos]))
        curPos--;
    lastPos = curPos;

    // Skip the letters of the current word
    while (curPos >= 0 && char.IsLetter(data[curPos]))
        curPos--;

    // May add one empty string when the buffer starts with non-letters;
    // the counting loop skips "" anyway.
    words.Add(data.Substring(curPos + 1, lastPos - curPos));
}

Then run the analysis task again.


Well, this looks more reasonable. The program's time is now basically spent in Sort (after all, we have not improved Sort yet), the overhead of the I/O operations has started to show, and the huge overhead of the Split function has shrunk dramatically. (Why is the manual split faster? Because testing a character against [^a-zA-Z] in Regex.Split costs much more than a !char.IsLetter() call: on the order of 52 comparisons vs. 4.)
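If you want to see that gap directly, you can time the two checks over the same buffer. A rough sketch follows; the buffer size is arbitrary, and the per-call string allocation exaggerates the regex cost somewhat, but the direction matches the profiler's finding:

using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

class CharCheckDemo
{
    static void Main()
    {
        string data = new string('x', 1000000); // 1 M characters; content is arbitrary
        Regex nonLetter = new Regex("[^a-zA-Z]");
        int hits = 0;

        Stopwatch sw = Stopwatch.StartNew();
        foreach (char c in data)
            if (nonLetter.IsMatch(c.ToString()))
                hits++;
        Console.WriteLine("Regex check:   " + sw.ElapsedMilliseconds + " ms");

        sw.Restart();
        foreach (char c in data)
            if (!char.IsLetter(c))
                hits++;
        Console.WriteLine("char.IsLetter: " + sw.ElapsedMilliseconds + " ms");

        Console.WriteLine(hits); // keep the loops from being optimized away
    }
}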

 

OK, the next optimization target is Sort. I believe the algorithm experts already know exactly how to optimize it: since we only need the 100 most frequent words, a complete sort is unnecessary; a min-heap holding 100 elements is enough. I will not go into details here.
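For readers who want it spelled out, here is a minimal sketch of that idea, with a SortedSet standing in for the min-heap. It reuses dict and OUTPUT_NUM from the program above and brings the cost down from O(N log N) to O(N log K), with K = 100:

// Keep only the OUTPUT_NUM most frequent words.
// The smallest element (top.Min) plays the role of the heap root
// and is evicted whenever the set grows past OUTPUT_NUM.
// (Word keys are distinct, so no two tuples are equal and the set drops nothing.)
SortedSet<Tuple<int, string>> top = new SortedSet<Tuple<int, string>>();
foreach (var item in dict)
{
    top.Add(Tuple.Create(item.Value, item.Key));
    if (top.Count > OUTPUT_NUM)
        top.Remove(top.Min); // drop the current lowest frequency
}

// Output in descending order of frequency.
foreach (var t in top.Reverse())
    Console.WriteLine(t.Item2 + " " + t.Item1);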

 

In this way, by iterating the "performance analysis -> improvement -> performance analysis" loop, we can gradually improve both the program's performance and our own programming skills.

Note that when writing a program, it is best not to do "performance optimization" too early, before any analysis. As shown above, although people suspected the Hash table and the I/O operations of hurting performance, the profiling results say otherwise: both cost very little time. Optimizing blindly without analysis may take twice the effort for half the result.

In addition, although the performance analysis tool tells you how much time each part of the program consumes, that does not mean the most time-consuming part should always be improved first. Improving the most expensive part usually gives the most visible payoff, but it is not necessarily the easiest to improve. The Split function above was not the most time-consuming part, yet its improvement was very simple, and sometimes that is exactly what makes something the first target. In real projects, we must weigh the achievable effect against the effort required, and give priority to optimizations that are both within our ability and clearly effective.
