Summary: Performance analysis and optimization reports under Windows

Source: Internet
Author: User

  

Performance Analysis and Optimization

Visual Studio 2017 (VS2017) comes with a performance analysis tool.

This report mainly analyzes the performance bottlenecks I encountered and the optimization methods I thought of. Some of them were verified, and some I did not have time to test.

First, look at the overall run time and CPU usage.

The final run time on my machine (i5-5200U, Samsung 860 EVO SSD) is about 27.3 s. CPU usage is relatively stable throughout the run.

CPU usage is low for the first 0.5 seconds, presumably because the program has only just started reading the file and the CPU has little to process. After that the program reads and computes at the same time, and CPU usage rises to about 25%, which is still not high.

I also noticed a very strange phenomenon here.

Until the end of the run, Readbychar's share stays high, close to 100%, which is a little hard to understand. The last thing the CPU does is traverse the more than 16 million words to find the maximum values, and I expected that to take a noticeable amount of time. In other words, the top() function that finds the top ten should show a high share in the last few seconds, but that does not appear in the report. This is probably an artifact of how the performance analysis presents its results.

Let's look at the performance bottlenecks first:

Readbychar is a function I wrote to read characters. It also implements the word-detection logic and calls many other functions, so it is expected to be highlighted in red.

Inside the function, the statement with the highest share is the call to get() on Openbychar.

Openbychar is an fstream object, and get() is the method that reads a character. Although this statement takes a lot of time, it is necessary: the processing logic requires characters to be handled one at a time, so there is basically nothing to optimize. However, this approach creates an object and calls a method on it, whereas the C file-reading functions need neither, so I expect they would be faster. I ran out of time at the end, so I did not make that change.
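To make the comparison above concrete, here is a minimal sketch of the two reading styles. The function names `count_letters_get` and `count_letters_fread`, the letter-counting task, and the 64 KiB buffer size are all assumptions chosen for illustration; they are not taken from the original program. Whether the block-read version is actually faster on a given machine would still need to be confirmed with the profiler.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: counts letters by calling get() once per character,
// which mirrors the per-character style described in the text.
std::size_t count_letters_get(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    assert(in.is_open());  // sketch only: a real program should report the error
    std::size_t letters = 0;
    char c;
    while (in.get(c)) {
        if (std::isalpha(static_cast<unsigned char>(c))) ++letters;
    }
    return letters;
}

// Hypothetical alternative: reads large blocks with C stdio and scans them
// in memory, so there is one library call per block instead of per character.
std::size_t count_letters_fread(const std::string& path) {
    std::FILE* f = std::fopen(path.c_str(), "rb");
    assert(f != nullptr);  // sketch only: a real program should report the error
    std::vector<char> buf(64 * 1024);  // assumed buffer size
    std::size_t letters = 0;
    std::size_t n;
    while ((n = std::fread(buf.data(), 1, buf.size(), f)) > 0) {
        for (std::size_t i = 0; i < n; ++i) {
            if (std::isalpha(static_cast<unsigned char>(buf[i]))) ++letters;
        }
    }
    std::fclose(f);
    return letters;
}
```

Both functions return the same count for the same file; only the way the bytes are fetched differs, which is the part the profiler flagged.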

From this we can also see that file reading (I/O) is a bottleneck, and I have been thinking about how to improve it significantly. I am not sure multithreading would help, because in this program the next character should not be read until the current one has been processed. With multiple threads, however, the file could be divided into several parts that are processed separately and then merged. The code hot lines show that hash table access is very fast and takes almost no time, so merging the final results should not cost much either. Unfortunately, I was short on time and was still fixing bugs before the deadline, so I could not verify this. I will try it when I get the chance.
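The split-and-merge idea can be sketched as follows. This is an untested-in-the-original idea, so the code is only an illustration under several assumptions: the whole text is already in memory, a word is a run of letters, and the names `WordCount`, `count_range`, and `count_parallel` are hypothetical. It also ignores the phrase counting that the original program performs, which would need extra care at chunk boundaries.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <future>
#include <string>
#include <unordered_map>
#include <vector>

using WordCount = std::unordered_map<std::string, std::size_t>;

static bool is_letter(char c) {
    return std::isalpha(static_cast<unsigned char>(c)) != 0;
}

// Counts lowercase words (runs of letters) in text[begin, end).
WordCount count_range(const std::string& text, std::size_t begin, std::size_t end) {
    WordCount counts;
    std::string word;
    for (std::size_t i = begin; i < end; ++i) {
        if (is_letter(text[i])) {
            word.push_back(static_cast<char>(
                std::tolower(static_cast<unsigned char>(text[i]))));
        } else if (!word.empty()) {
            ++counts[word];
            word.clear();
        }
    }
    if (!word.empty()) ++counts[word];
    return counts;
}

// Splits the text into `parts` chunks, counts each chunk on its own thread,
// and merges the per-chunk hash tables into one result.
WordCount count_parallel(const std::string& text, std::size_t parts) {
    assert(parts > 0);
    // Pick evenly spaced cut points, then push each one forward past any
    // letters so that no word is split between two chunks.
    std::vector<std::size_t> bounds{0};
    for (std::size_t p = 1; p < parts; ++p) {
        std::size_t cut = std::max(bounds.back(), text.size() * p / parts);
        while (cut < text.size() && is_letter(text[cut])) ++cut;
        bounds.push_back(cut);
    }
    bounds.push_back(text.size());

    std::vector<std::future<WordCount>> jobs;
    for (std::size_t p = 0; p < parts; ++p) {
        const std::size_t b = bounds[p];
        const std::size_t e = bounds[p + 1];
        jobs.push_back(std::async(std::launch::async,
                                  [&text, b, e] { return count_range(text, b, e); }));
    }

    WordCount total;
    for (auto& job : jobs) {
        for (const auto& kv : job.get()) total[kv.first] += kv.second;
    }
    return total;
}
```

The merge step is just one hash table insertion per distinct word per chunk, which is consistent with the observation that hash table access is cheap. Whether the threads actually shorten the run depends on whether the disk or the CPU is the limiting factor, so this would have to be measured.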

The lookups Word[sword] and Wordend[sword] also take a larger share, but this cannot be avoided: the time spent on these lookups is necessary, and a hash table is the best approach I could think of. To optimize further, I would have to analyze the characteristics of the data and replace the default hash function so that words collide as little as possible. That would take too much time and would not necessarily be faster than the default hash, so I did not attempt it.

  

This is the only place where optimization produced a clear improvement. These two statements perform the phrase concatenation, and they run once for every successful match; judging from the final word count, that is more than 16 million times. For simplicity I originally wrote sphrase=sphrase+ "" +sword directly, and its total time share was as high as 10%. The right-hand side of that expression first creates a temporary object to hold the result and only then assigns it to sphrase, which is more expensive. To be faster, the string's own append method should be called instead. After I made that change, the share dropped below 1%.
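The difference between the two ways of writing the concatenation can be sketched like this. The function names `join_with_plus` and `join_with_append` are hypothetical, and the single-space separator is an assumption made for the example; only the variable names sphrase and sword come from the text above.

```cpp
#include <cassert>
#include <string>

// Original style: the right-hand side builds a temporary string that starts
// as a copy of sphrase, and the result is then assigned back to sphrase.
std::string join_with_plus(std::string sphrase, const std::string& sword) {
    assert(!sword.empty());  // a word is expected to be non-empty
    sphrase = sphrase + " " + sword;
    return sphrase;
}

// In-place style: the characters are appended to sphrase's existing buffer,
// so no temporary copy of the phrase is created.
std::string join_with_append(std::string sphrase, const std::string& sword) {
    assert(!sword.empty());  // a word is expected to be non-empty
    sphrase += ' ';
    sphrase.append(sword);
    return sphrase;
}
```

Both functions produce the same string, for example "hello world" from "hello" and "world". The saving comes only from avoiding the temporary object, which adds up when the statement runs millions of times.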

The rest of the code takes hardly any time over the whole run.

Still, in pursuit of the best possible performance, I think a few details could be changed.

1

Some variables in frequently called functions can be declared static, so that the object does not have to be recreated on every call. This applies to the Getmax() and getmin() functions, which are called many times when writing the output file and when searching for the top ten. For a task that runs about 30 s, however, such optimizations have no noticeable effect.

  

2

To find the top ten, I make a single traversal and keep the candidates in an array, replacing the minimum each time. This means that for every word read, the minimum among the 10 stored words has to be found. The analysis shows this is not a performance bottleneck, because 10 is so small. But if we were searching for the top 10,000 or 100,000, it could not be written this way: we would still use a single traversal, but what should be maintained is a min-heap of size 10,000.
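The min-heap variant described in item 2 can be sketched with std::priority_queue. The names `Entry` and `top_k` are hypothetical, and the sketch assumes the word counts are stored in a std::unordered_map, which may differ from the original program's tables.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using Entry = std::pair<std::size_t, std::string>;  // (count, word)

// Returns the k most frequent words in descending order of count,
// using one traversal and a min-heap that never holds more than k entries.
std::vector<Entry> top_k(const std::unordered_map<std::string, std::size_t>& counts,
                         std::size_t k) {
    assert(k > 0);
    // std::greater makes this a min-heap: top() is the smallest kept entry.
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;
    for (const auto& kv : counts) {
        if (heap.size() < k) {
            heap.emplace(kv.second, kv.first);
        } else if (kv.second > heap.top().first) {
            heap.pop();  // drop the current minimum
            heap.emplace(kv.second, kv.first);
        }
    }
    // Popping a min-heap yields ascending order, so reverse at the end.
    std::vector<Entry> result;
    while (!heap.empty()) {
        result.push_back(heap.top());
        heap.pop();
    }
    std::reverse(result.begin(), result.end());
    return result;
}
```

With an array, each replacement costs a scan over all k stored entries; with the heap, it costs on the order of log k steps. For k = 10 the difference is negligible, as the profile shows, but for k = 10,000 it becomes significant.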

That concludes the performance analysis and the optimization ideas.

  

  

  
