Use gprof to optimize your C/C++ program


Summary:

When optimizing a program, optimize only where it is worth it! There is no point in spending hours on a part of the program that runs for only 0.04 seconds.

gprof offers a surprisingly simple yet effective way to profile C/C++ programs and to quickly find the code that is worth optimizing. A short case study shows how gprof helped cut an application's running time from more than 3 minutes to under 5 seconds by identifying two key data structures and optimizing them.

gprof dates back to the 1982 SIGPLAN Symposium on Compiler Construction. It has since become a standard tool on various UNIX platforms.


Profiling in a nutshell

The concept of profiling is very simple: by recording when each function is entered and when it returns, we can work out which parts of the program take the most time. This sounds like a lot of work, but fortunately the tools do it for us. We only need to add one extra option ('-pg') when compiling with GCC, run the compiled program (which collects the profiling data), and then run 'gprof' to view the results in a convenient form.
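The idea of recording when each function is entered and left can be sketched by hand. The following is only an illustration of the concept, not how gprof works internally (gprof counts calls through the code that '-pg' inserts and estimates time by periodic sampling). ScopedTimer, ProfileEntry, profile_table and sum_to are hypothetical names introduced for this sketch.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>

// Hypothetical helper, not part of gprof: call count and accumulated
// wall-clock time for one function name.
struct ProfileEntry {
    long calls = 0;
    std::chrono::steady_clock::duration total{};
};

// One global table, keyed by function name.
inline std::map<std::string, ProfileEntry>& profile_table() {
    static std::map<std::string, ProfileEntry> table;
    return table;
}

// Notes the entry time when constructed and adds the elapsed time to the
// table when destroyed, i.e. when the enclosing function returns.
class ScopedTimer {
public:
    explicit ScopedTimer(const char* name)
        : name_(name), start_(std::chrono::steady_clock::now()) {
        assert(name != nullptr);
    }
    ~ScopedTimer() {
        ProfileEntry& entry = profile_table()[name_];
        entry.calls += 1;
        entry.total += std::chrono::steady_clock::now() - start_;
    }
    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    const char* name_;
    std::chrono::steady_clock::time_point start_;
};

// Example of a function instrumented by hand.
long sum_to(long n) {
    ScopedTimer timer("sum_to");
    long total = 0;
    for (long i = 1; i <= n; ++i) total += i;
    return total;
}
```

Every function to be measured would need such a timer line added by hand, which is exactly the effort that compiling with '-pg' saves.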

Case study: pathalizer

As an example I use a real program: event2dot, which is part of pathalizer. It is an executable that translates a pathalizer 'events' file into a graphviz 'dot' file.

Simply put, it reads events from a file and stores them as graphs (with pages as nodes and transitions between pages as edges). These graphs are then merged into one large graph, which is saved as a graphviz 'dot' file.

Program timing

First, let's see how long the unoptimized program takes to run. On my computer, running event2dot with the example from the source code as input (about 55000 entries) takes more than three minutes:

 

real    3m36.316s
user    0m55.590s
sys     0m1.070s

 

Program Analysis

To profile with gprof, you must add the '-pg' option when compiling. We recompile the source code as follows:

g++ -pg dotgen.cpp readfile.cpp main.cpp graph.cpp config.cpp -o event2dot

Now we can run event2dot again on the same test data. This time, profiling data is collected during the run and saved in the file 'gmon.out'. To view the results, run 'gprof event2dot | less'.

gprof shows that the following functions matter:

  %   cumulative   self                 self     total
 time   seconds   seconds      calls   s/call   s/call  name
43.32     46.03     46.03  339952989     0.00     0.00  CompareNodes(Node *,Node *)
25.06     72.66     26.63      55000     0.00     0.00  getNode(char *,NodeListNode *&)
16.80     90.51     17.85  339433374     0.00     0.00  CompareEdges(Edge *,AnnotatedEdge *)
12.70    104.01     13.50      51987     0.00     0.00  addAnnotatedEdge(AnnotatedGraph *,Edge *)
 1.98    106.11      2.10      51987     0.00     0.00  addEdge(Graph *,Node *,Node *)
 0.07    106.18      0.07          1     0.07     0.07  FindTreshold(AnnotatedEdge *,int)
 0.06    106.24      0.06          1     0.06    28.79  getGraphFromFile(char *,NodeListNode *&,Config *)
 0.02    106.26      0.02          1     0.02    77.40  summarize(GraphListNode *,Config *)
 0.00    106.26      0.00      55000     0.00     0.00  FixName(char *)

The first function clearly matters most: on its own it accounts for about 43% of the program's running time.

Optimization

The results above show that the program spends most of its time in the CompareNodes function. A quick grep shows that CompareNodes is called only by CompareEdges, which in turn is called only by addAnnotatedEdge, and all three appear in the list above. This is where we should optimize!
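The figures in the flat profile can be checked with a little arithmetic. The sketch below uses two hypothetical helper functions (percent_of_total and calls_per_caller are not part of gprof) and assumes, as the grep suggests, that CompareEdges is reached only through addAnnotatedEdge.

```cpp
#include <cassert>

// Share of the total running time, in percent, spent inside one function.
double percent_of_total(double self_seconds, double total_seconds) {
    assert(total_seconds > 0.0);
    return 100.0 * self_seconds / total_seconds;
}

// Average number of calls to a callee for each call of its caller.
double calls_per_caller(long long callee_calls, long long caller_calls) {
    assert(caller_calls > 0);
    return static_cast<double>(callee_calls) /
           static_cast<double>(caller_calls);
}

// Values taken from the flat profile above:
//   percent_of_total(46.03, 106.26)        is about 43.3 (CompareNodes)
//   calls_per_caller(339433374LL, 51987LL) is about 6529
```

Under that assumption, each call to addAnnotatedEdge triggers roughly 6500 edge comparisons on average, which is the signature of a linear search through a long list.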

We notice that addAnnotatedEdge traverses a linked list. A linked list is easy to implement, but it is not the best data structure for lookups. We decide to replace the linked list G->edges with a binary tree, which makes searching much faster.
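The difference between the two data structures can be sketched as follows. This is not pathalizer's actual code: EdgeKey, ListGraph, TreeGraph and check_graphs_agree are hypothetical stand-ins, and std::map (which guarantees logarithmic lookup and is typically implemented as a balanced search tree) is used here in place of a hand-written binary tree.

```cpp
#include <cassert>
#include <list>
#include <map>
#include <utility>

// Hypothetical edge key: (source node id, target node id).
using EdgeKey = std::pair<int, int>;

// Linked-list version: every insertion first walks the whole list,
// so one lookup costs O(n) comparisons.
struct ListGraph {
    std::list<std::pair<EdgeKey, int>> edges;  // edge and its hit count

    void add_edge(int from, int to) {
        const EdgeKey key(from, to);
        for (auto& entry : edges) {
            if (entry.first == key) { ++entry.second; return; }
        }
        edges.emplace_back(key, 1);
    }
};

// Tree version: lookup and insertion cost O(log n) comparisons.
struct TreeGraph {
    std::map<EdgeKey, int> edges;  // edge and its hit count

    void add_edge(int from, int to) {
        ++edges[EdgeKey(from, to)];  // a missing edge starts at count 0
    }
};

// Feeds the same edges to both versions and checks that they agree.
void check_graphs_agree() {
    ListGraph list_graph;
    TreeGraph tree_graph;
    const int input[5][2] = {{1, 2}, {2, 3}, {1, 2}, {3, 1}, {1, 2}};
    for (const auto& e : input) {
        list_graph.add_edge(e[0], e[1]);
        tree_graph.add_edge(e[0], e[1]);
    }
    assert(list_graph.edges.size() == tree_graph.edges.size());
    assert(tree_graph.edges.at(std::make_pair(1, 2)) == 3);
}
```

Both versions produce the same counts; only the cost of finding an existing edge changes.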

Result

Now let's take a look at the optimized running results:

real    2m19.314s
user    0m36.370s
sys     0m0.940s

 

Second time

Run gprof again to profile the optimized program:

  %   cumulative   self               self     total
 time   seconds   seconds   calls   s/call   s/call  name
87.01     25.25     25.25   55000     0.00     0.00  getNode(char *,NodeListNode *&)
10.65     28.34      3.09   51987     0.00     0.00  addEdge(Graph *,Node *,Node *)

The functions that used to take up most of the running time no longer appear at the top of the list! getNode now dominates, so let's optimize once more: we replace the node container that getNode searches with a node hash table.
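A name-to-node lookup backed by a hash table might look like the sketch below. Node, NodeTable, get_node and check_node_table are hypothetical names; the article does not show pathalizer's own types, and std::unordered_map stands in for whatever hash table the author wrote.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical node type: a page name and a visit counter.
struct Node {
    std::string name;
    int visits;
};

// Maps a page name to its node. The average lookup cost is O(1),
// independent of how many nodes the table already holds.
class NodeTable {
public:
    // Returns the node for `name`, creating it on first use.
    Node& get_node(const std::string& name) {
        auto it = nodes_.find(name);
        if (it == nodes_.end()) {
            it = nodes_.emplace(name, Node{name, 0}).first;
        }
        return it->second;
    }

    std::size_t size() const { return nodes_.size(); }

private:
    std::unordered_map<std::string, Node> nodes_;
};

// Looks up the same page twice and checks that only one node exists for it.
void check_node_table() {
    NodeTable table;
    table.get_node("/index.html").visits += 1;
    table.get_node("/about.html").visits += 1;
    table.get_node("/index.html").visits += 1;
    assert(table.size() == 2);
    assert(table.get_node("/index.html").visits == 2);
}
```

With 55000 lookups, replacing a linear search by a constant-time lookup removes almost all of the work getNode was doing.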

This is a huge improvement:

real    0m3.269s
user    0m0.830s
sys     0m0.090s

 

Other C/C++ profilers

Several other tools can work with gprof data, such as kprof and cgprof. Although a graphical interface looks nicer, I personally find gprof on the command line more convenient.
 

Profiling programs in other languages

So far we have covered how to use gprof to profile C/C++ programs. Similar tools exist for other languages. For Perl, we can use the Devel::DProf module: run your program with 'perl -d:DProf mycode.pl' and use 'dprofpp' to view and analyze the results. If you can compile your Java program with gcj, you can also use gprof, although currently only single-threaded Java code is supported.

Conclusion

As we have seen, profiling lets us quickly find the parts of a program that are worth optimizing. In the example above, it helped cut the running time from 3 minutes 36 seconds to less than 5 seconds.

References
  • Pathalizer: http://pathalizer.sf.net
  • Kprof: http://kprof.sf.net
  • Cgprof: http://mvertes.free.fr
  • Devel::DProf: http://www.perldoc.com/perl5.8.0/lib/Devel/DProf.html
  • Gcj: http://gcc.gnu.org/java
  • Pathalizer example files: Download For article371