Using gprof to Optimize Your C/C++ Programs
Summary:
When optimizing a program, optimize where it matters! There is no point in spending hours tuning code that accounts for only 0.04 seconds of the run.
gprof offers a surprisingly simple but very effective way to profile a C/C++ program and quickly identify the code worth optimizing. A small case study shows how gprof helped reduce an application's running time from more than 3 minutes to under 5 seconds by identifying and optimizing two key data structures.
gprof dates back to the 1982 SIGPLAN Symposium on Compiler Construction. It is now a standard tool on various UNIX platforms.
___________________________________________________
Profiling in a nutshell
The concept of profiling is very simple: by recording when each function is entered and when it returns, we can work out which parts of the program account for most of its running time. This sounds like a lot of work, but fortunately it is not. We only need to compile with gcc using one extra option ('-pg'), run the compiled program (to collect the profiling data), and then run 'gprof' to view the results in a convenient form.
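To make the idea concrete, here is a minimal hand-written sketch of recording entry and exit times per function. The names ProfileTable, ScopedTimer, cheap and expensive are hypothetical and exist only for this illustration; with '-pg' the compiler adds the instrumentation for you, so nothing like this has to be written by hand.

```cpp
#include <chrono>
#include <map>
#include <string>
#include <utility>

// Hypothetical helper (not part of gprof): accumulated time and call count per label.
struct ProfileTable {
    struct Entry {
        long calls = 0;
        std::chrono::steady_clock::duration total{};
    };
    std::map<std::string, Entry> entries;
};

// Records the time on construction and adds the elapsed time when the scope ends.
class ScopedTimer {
public:
    ScopedTimer(ProfileTable& table, std::string label)
        : table_(table), label_(std::move(label)),
          start_(std::chrono::steady_clock::now()) {}
    ~ScopedTimer() {
        ProfileTable::Entry& e = table_.entries[label_];
        e.calls += 1;
        e.total += std::chrono::steady_clock::now() - start_;
    }
    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    ProfileTable& table_;
    std::string label_;
    std::chrono::steady_clock::time_point start_;
};

// Two toy functions, instrumented by hand.
long cheap(ProfileTable& t) {
    ScopedTimer timer(t, "cheap");
    return 1;
}

long expensive(ProfileTable& t) {
    ScopedTimer timer(t, "expensive");
    volatile long sum = 0;  // volatile keeps the loop from being optimized away
    for (long i = 0; i < 2000000; ++i) sum = sum + i;
    return sum;
}
```

After calling both functions, the entry with the largest total in ProfileTable::entries marks the code that deserves attention first.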
Case study: pathalizer
I use a real-world program as the example: event2dot, which is part of pathalizer.
It is an executable that translates a pathalizer 'events' file into a graphviz 'dot' file.
Simply put, it reads events from a file and stores them as graphs (with pages as nodes and transitions between pages as edges); these graphs are then merged into one large graph, which is saved as a graphviz 'dot' file.
Program timing
First, let's see how long the unoptimized program takes to run. On my machine, running event2dot with the example from the source distribution as input (about 55,000 entries) takes about three and a half minutes:
real    3m36.316s
user    0m55.590s
sys     0m1.070s
Program Analysis
To profile with gprof, the program must be compiled with the '-pg' option. We recompile the source code as follows:
g++ -pg dotgen.cpp readfile.cpp main.cpp graph.cpp config.cpp -o event2dot
Now we run event2dot again on the same test data as before. This time, profiling data is collected during the run and written to the file 'gmon.out'. Run 'gprof event2dot | less' to view the results.
gprof shows that the following functions matter most:
  %   cumulative    self                self     total
 time    seconds  seconds      calls  s/call    s/call  name
43.32      46.03    46.03  339952989    0.00      0.00  CompareNodes(Node *,Node *)
25.06      72.66    26.63      55000    0.00      0.00  getNode(char *,NodeListNode *&)
16.80      90.51    17.85  339433374    0.00      0.00  CompareEdges(Edge *,AnnotatedEdge *)
12.70     104.01    13.50      51987    0.00      0.00  addAnnotatedEdge(AnnotatedGraph *,Edge *)
 1.98     106.11     2.10      51987    0.00      0.00  addEdge(Graph *,Node *,Node *)
 0.07     106.18     0.07          1    0.07      0.07  FindTreshold(AnnotatedEdge *,int)
 0.06     106.24     0.06          1    0.06     28.79  getGraphFromFile(char *,NodeListNode *&,Config *)
 0.02     106.26     0.02          1    0.02     77.40  summarize(GraphListNode *,Config *)
 0.00     106.26     0.00      55000    0.00      0.00  FixName(char *)
The first column is the most important one: it gives the percentage of the program's total running time spent in each function. CompareNodes tops the list with 43.32%.
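The columns of the listing are related by simple arithmetic, which the following sketch reproduces. The last cumulative value, 106.26 seconds, is the total; each percentage is the function's self time divided by that total, and each cumulative value is the running sum of the self column. The helper names percent_of_total and round2 are made up for this example.

```cpp
#include <cassert>
#include <cmath>

// Share of the total running time spent in a function's own code, in percent.
double percent_of_total(double self_seconds, double total_seconds) {
    assert(total_seconds > 0.0);
    return 100.0 * self_seconds / total_seconds;
}

// Round to two decimals, the precision used in the listing.
double round2(double x) {
    return std::round(x * 100.0) / 100.0;
}
```

For CompareNodes, round2(percent_of_total(46.03, 106.26)) gives 43.32, and adding getNode's self time (46.03 + 26.63) gives the 72.66 shown in the second row of the cumulative column.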
Optimization
The results above show that the program spends most of its time in CompareNodes. A quick grep shows that CompareNodes is called only from CompareEdges, which in turn is called only from addAnnotatedEdge; both of these also appear in the list above. This is where we should optimize!
We notice that addAnnotatedEdge traverses a linked list. Although a linked list is easy to implement, it is not the best data structure for lookups. We decide to replace the linked list g->edges with a binary tree, which makes searching much faster.
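The profile above records 339,433,374 calls to CompareEdges against 51,987 calls to addAnnotatedEdge, roughly 6,500 comparisons per insertion, which is what a linear scan looks like. The sketch below contrasts a linked list with a tree by counting comparisons. It is not pathalizer's code: EdgeKey, ListGraph, TreeGraph and comparisons_for are hypothetical names, and std::map (commonly implemented as a balanced binary tree) stands in for whatever tree the real program uses.

```cpp
#include <cassert>
#include <cstddef>
#include <forward_list>
#include <map>
#include <utility>

// Hypothetical stand-in for an edge: a pair of node ids.
using EdgeKey = std::pair<int, int>;

// Linked list: every lookup scans the list from the front.
struct ListGraph {
    std::forward_list<std::pair<EdgeKey, int>> edges;
    std::size_t comparisons = 0;

    void add(const EdgeKey& key) {
        for (auto& e : edges) {
            ++comparisons;
            if (e.first == key) { ++e.second; return; }  // known edge: bump its count
        }
        edges.emplace_front(key, 1);                      // new edge
    }
};

// Ordering predicate that counts how often it is invoked.
struct CountingLess {
    std::size_t* counter;
    bool operator()(const EdgeKey& a, const EdgeKey& b) const {
        ++*counter;
        return a < b;
    }
};

// Tree: a lookup only walks one path from the root to a leaf.
struct TreeGraph {
    std::size_t comparisons = 0;
    std::map<EdgeKey, int, CountingLess> edges;

    TreeGraph() : edges(CountingLess{&comparisons}) {}
    TreeGraph(const TreeGraph&) = delete;             // the comparator points into this object
    TreeGraph& operator=(const TreeGraph&) = delete;

    void add(const EdgeKey& key) { ++edges[key]; }
};

// Insert n distinct edges and report how many comparisons that took.
template <typename Graph>
std::size_t comparisons_for(int n) {
    assert(n >= 0);
    Graph g;
    for (int i = 0; i < n; ++i) g.add({i, i + 1});
    return g.comparisons;
}
```

Inserting n distinct edges costs n(n-1)/2 comparisons with the list, while the tree needs on the order of n log n, so the gap widens quickly as the input grows.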
Result
Now let's look at the running time after this optimization:
real    2m19.314s
user    0m36.370s
sys     0m0.940s
Second pass
Running gprof again gives:
  %   cumulative    self                self     total
 time    seconds  seconds      calls  s/call    s/call  name
87.01      25.25    25.25      55000    0.00      0.00  getNode(char *,NodeListNode *&)
10.65      28.34     3.09      51987    0.00      0.00  addEdge(Graph *,Node *,Node *)
The functions that used to take most of the running time no longer appear at the top of the list! getNode now dominates with 87.01%, so let's repeat the process: we replace the node list that getNode searches with a node hash table.
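A hash-based lookup for getNode could look roughly like the sketch below. The Node layout, the NodeTable class and its interface are assumptions made for illustration, since the article does not show pathalizer's actual types; std::unordered_map plays the role of the node hash table.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical node type; the real Node in pathalizer is not shown in the article.
struct Node {
    std::string name;
    int hits;
};

// Hypothetical replacement for a list-based lookup: find a node by name and
// create it on first use. A lookup takes constant time on average instead of
// a scan over all nodes seen so far.
class NodeTable {
public:
    Node& getNode(const std::string& name) {
        auto it = nodes_.find(name);
        if (it == nodes_.end()) {
            it = nodes_.emplace(name, Node{name, 0}).first;
        }
        return it->second;  // references to elements stay valid when the table grows
    }

    std::size_t size() const { return nodes_.size(); }

private:
    std::unordered_map<std::string, Node> nodes_;
};
```

With 55,000 calls to getNode in the test run, turning each lookup from a scan into a single hash probe removes almost all of the time previously spent there.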
This is a huge improvement:
real    0m3.269s
user    0m0.830s
sys     0m0.090s
Other C/C++ profilers
Several other profilers can use gprof data, such as KProf and cgprof. Although the graphical front ends look nicer, I personally find command-line gprof more convenient to use.
Profiling programs in other languages
This article covered profiling C/C++ programs with gprof, but the same can be done for other languages. For Perl, use the Devel::DProf module: start your program with perl -d:DProf mycode.pl and view the results with dprofpp. If you can compile your Java program with gcj, you can also use gprof, although currently only single-threaded Java code is supported.
Conclusion
As we have seen, profiling quickly shows which parts of a program are worth optimizing. By optimizing where it matters, we reduced the running time of the example program from 3 minutes 36 seconds to less than 5 seconds.
References
- Pathalizer: http://pathalizer.sf.net
- Kprof: http://kprof.sf.net
- Cgprof: http://mvertes.free.fr
- Devel::DProf: http://www.perldoc.com/perl5.8.0/lib/Devel/DProf.html
- Gcj: http://gcc.gnu.org/java
- Pathalizer example files: download for article371