When writing programs, especially embedded programs, we usually need to analyze their performance so that they run fast enough to meet real-time requirements. For a large program this analysis is hard to do by hand, so a tool that profiles the program automatically is a great help. The following describes a program profiling tool on Linux: the GNU profiler, gprof.
Basic usage of gprof:
1. Compile and link your application with the -pg option
When compiling a program with gcc, add the -pg option, for example:
gcc -pg -o test test.c
This generates the executable file test. For a large project, add -pg to the compile and link options in the makefile.
2. Execute your application to generate data for gprof analysis
Run the program: ./test. A gmon.out file is generated, which contains the profiling data.
3. Use gprof to analyze the data generated by your application
gprof test gmon.out > profile.txt
With the above command, gprof analyzes the data and stores the profiling result in the file profile.txt. By studying the result we can see where the program spends its time and improve it accordingly.
GNU gprof is a useful tool for this kind of work. I currently use gprof to profile my program, find the most time-consuming functions or operations, and move those onto an FPGA chip to achieve real-time performance.
Compiling a program for gprof
When compiling and linking the source program, add the -pg option to the compiler's command line. During compilation, the compiler automatically inserts instrumentation code into the object code. At run time this code records the call relationships and call counts of the functions, and gathers the data from which each function's own execution time and the time spent in the functions it calls are computed. When the program exits, a gmon.out file is written to the directory the program is in at exit. This file holds the recorded monitoring data. You can interpret it with gprof on the command line or with kprof in graphical mode to analyze the program's performance. In addition, to profile C library functions, link with -lc_p in place of the default -lc. The program then links against libc_p.a, the profiling build of the C library (which has to be installed on the system), and profiling information is generated for library functions as well. For line-by-line profiling, also add the -g compile option.
For example, the following command line:
gcc -Wall -g -pg example.c -o example -lc_p
Running gprof
Run gprof with a command of the following form:
gprof options executable-file gmon.out bb-data [yet-more-profile-data-files...] [> outfile]
Information generated by gprof
% time: the percentage of the total running time of the program used by this function.
cumulative seconds: a running sum of the number of seconds accounted for by this function and those listed above it.
self seconds: the number of seconds accounted for by this function alone. This is the major sort for this listing.
calls: the number of times this function was invoked, if this function is profiled, else blank.
self ms/call: the average number of milliseconds spent in this function per call, if this function is profiled, else blank.
total ms/call: the average number of milliseconds spent in this function and its descendants per call, if this function is profiled, else blank.
name: the name of the function. This is the minor sort for this listing. The index shows the location of the function in the gprof listing. If the index is in parentheses, it shows where the function would appear in the gprof listing if it were to be printed.
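The relationship between these columns is simple arithmetic, and a small sketch may make it concrete. The helper names percent_time and ms_per_call below are hypothetical and are not part of gprof; they only restate the column definitions. For instance, a function with 0.5 seconds of self time in a 2.0-second run, called 100 times, would show 25% and 5 ms/call.

```c
#include <assert.h>

/* "% time" column: share of the total run time spent in one function.
 * Hypothetical helper, not a gprof API. */
double percent_time(double self_seconds, double total_seconds)
{
    assert(self_seconds >= 0.0);
    if (total_seconds <= 0.0)
        return 0.0;               /* nothing was accumulated */
    return 100.0 * self_seconds / total_seconds;
}

/* "self ms/call" or "total ms/call" column: average milliseconds per call.
 * Pass self seconds for the former, self plus descendants for the latter. */
double ms_per_call(double seconds, unsigned long calls)
{
    assert(seconds >= 0.0);
    if (calls == 0)
        return 0.0;               /* gprof leaves the column blank here */
    return seconds * 1000.0 / (double)calls;
}
```

Because gprof estimates time by sampling, very short functions can show 0.00 in these columns even though they were called.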
gprof implementation principle:
When you compile and link your program with the -pg option, gcc inserts a call to a function named mcount (or "_mcount" or "__mcount", depending on the compiler and operating system) into every function of your application. In other words, every function in your application calls mcount, and mcount maintains a function call graph in memory, finding the addresses of the called function and its caller through the function call stack. The call graph also holds the call counts. Execution times are estimated separately, by sampling the program counter at regular intervals while the program runs.
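To make the bookkeeping concrete, here is a minimal sketch of the kind of table such a hook could maintain: one entry per caller/callee pair, with a call count. This is an illustration only, not the actual mcount implementation from any C library. The names record_arc and arc_count, the fixed table size, and the linear search are all assumptions chosen for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ARCS 64   /* arbitrary size for this sketch */

/* One caller -> callee edge of the call graph, with its call count. */
struct arc {
    uintptr_t caller_pc;
    uintptr_t callee_pc;
    unsigned long count;
};

static struct arc arcs[MAX_ARCS];
static size_t n_arcs;

/* Toy stand-in for the profiling hook: bump the counter for the
 * (caller, callee) pair, creating the arc on first use.
 * Returns the updated count, or 0 if the table is full. */
unsigned long record_arc(uintptr_t caller_pc, uintptr_t callee_pc)
{
    assert(n_arcs <= MAX_ARCS);
    for (size_t i = 0; i < n_arcs; i++) {
        if (arcs[i].caller_pc == caller_pc && arcs[i].callee_pc == callee_pc)
            return ++arcs[i].count;
    }
    if (n_arcs == MAX_ARCS)
        return 0;
    arcs[n_arcs].caller_pc = caller_pc;
    arcs[n_arcs].callee_pc = callee_pc;
    arcs[n_arcs].count = 1;
    n_arcs++;
    return 1;
}

/* Look up how many times caller invoked callee (0 if never). */
unsigned long arc_count(uintptr_t caller_pc, uintptr_t callee_pc)
{
    for (size_t i = 0; i < n_arcs; i++) {
        if (arcs[i].caller_pc == caller_pc && arcs[i].callee_pc == callee_pc)
            return arcs[i].count;
    }
    return 0;
}
```

The "called" column of the call graph (for example 1/1) corresponds to counts of this kind: calls along one arc over the total calls to the function.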
A simple gprof example:
Let's take a simple example to see how gprof is used.
1. Open a Linux terminal. Create a new file test.c, then compile and link it with the -pg option.
The content of test.c is as follows:
#include <stdio.h>
#include <stdlib.h>

/* Leaf function called by B(). */
void a() {
    printf("\t+--- call a() function\n");
}

/* Leaf function called by B(). */
void c() {
    printf("\t+--- call c() function\n");
}

/* Called by main(); calls a() and then c(). */
int B() {
    printf("\t+--- call B() function\n");
    a();
    c();
    return 0;
}

int main() {
    printf("main() function()\n");
    B();
    return 0;
}
Enter the following command at the command line. Because the -c option is not given, gcc both compiles and links, producing a.out by default:
[linux /home/test]$ gcc -pg test.c
If there are no compilation errors, gcc generates a.out. You can also use the -o option to name the generated file. For example, gcc -pg test.c -o test generates an executable named test, which you run by entering ./test at the prompt. Remember the ./ prefix: without it the shell runs its own built-in test command instead of your program, and that command produces no output.
2. Execute your application to generate data for gprof analysis. At the command line, enter:
[linux /home/test]$ ./a.out
main() function()
    +--- call B() function
    +--- call a() function
    +--- call c() function
[linux /home/test]$
You will now see a gmon.out file in the current directory; gprof uses this file for its analysis.
3. Use the gprof program to analyze the data generated by your application.
At the command line, enter:
[linux /home/test]$ gprof -b a.out gmon.out | less
Because gprof outputs a lot of information, the less command is used here so that we can scroll through the output with the up and down arrow keys. The | character pipes the output of gprof -b a.out gmon.out into less as its input. Below are some excerpts from the gprof output.
Flat profile:

Each sample counts as 0.01 seconds.
 no time accumulated

  %   cumulative   self              self     total
 time   seconds   seconds    calls  Ts/call  Ts/call  name
  0.00      0.00     0.00        1     0.00     0.00  a
  0.00      0.00     0.00        1     0.00     0.00  B
  0.00      0.00     0.00        1     0.00     0.00  c

Call graph

granularity: each sample hit covers 4 byte(s) no time propagated

index % time    self  children    called     name
                0.00    0.00       1/1           B [2]
[1]      0.0    0.00    0.00       1         a [1]
-----------------------------------------------
                0.00    0.00       1/1           main [10]
[2]      0.0    0.00    0.00       1         B [2]
                0.00    0.00       1/1           c [3]
                0.00    0.00       1/1           a [1]
-----------------------------------------------
                0.00    0.00       1/1           B [2]
[3]      0.0    0.00    0.00       1         c [3]
-----------------------------------------------

Index by function name

   [1] a        [2] B        [3] c
From the above output, we can see that main calls function B, and function B in turn calls functions a and c. Since each function simply prints a string, each one finishes well within a single 0.01-second sample, so the reported time for every function is 0 seconds.
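To see non-zero times in the profile, the program needs to do enough work to be hit by the sampler. The following is a minimal sketch of such a workload. The function names (add_step, sum_with_calls, sum_closed_form, run_workload) are made up for this illustration and do not come from the example above.

```c
#include <assert.h>

/* Cheap helper: called once per loop iteration, little work per call. */
unsigned long long add_step(unsigned long long acc, unsigned long long i)
{
    return acc + i;
}

/* Hot function: makes n calls to add_step to compute 1 + 2 + ... + n. */
unsigned long long sum_with_calls(unsigned long long n)
{
    unsigned long long acc = 0;
    for (unsigned long long i = 1; i <= n; i++)
        acc = add_step(acc, i);
    return acc;
}

/* Same result in constant time, for comparison in the profile. */
unsigned long long sum_closed_form(unsigned long long n)
{
    assert(n <= 4000000000ULL);   /* keeps n * (n + 1) within 64 bits */
    return n * (n + 1) / 2;
}

/* Workload entry point: returns 1 if both methods agree, else 0. */
int run_workload(unsigned long long n)
{
    return sum_with_calls(n) == sum_closed_form(n);
}
```

If main calls run_workload with a large n (say, a few hundred million) and the file is compiled with gcc -pg without optimization, the flat profile should list add_step with a call count equal to n and attribute most of the time to sum_with_calls and add_step. The exact figures depend on the machine, and with optimization enabled the compiler may inline add_step so that it no longer appears.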
Analyzing programs with gprof
Introduction to gprof
gprof is the GNU profiler. It can display a "flat profile" of a program run, which lists the number of calls to each function and the processor time each function consumed, and a "call graph", which shows the call relationships between functions and how much time each call path takes. It can also display an "annotated source listing": a copy of the program source code marked with the number of times each line was executed.
Basic usage:
1. Compile and link your application with the -pg option.
2. Run your application. When it finishes, it writes a data file for gprof to analyze (gmon.out by default).
3. Use the gprof program to analyze the data generated by your application, for example: gprof a.out gmon.out.
gprof implementation principle:
gprof is not magic. When a program is compiled and linked with the -pg option, gcc inserts a call to a function named mcount (or "_mcount" or "__mcount") into every function of the application. That is, every function in an application compiled with -pg calls mcount, and mcount maintains a function call graph in memory, finding the addresses of the called function and its caller through the function call stack. The call graph also holds the information about call counts, and it is combined with the sampled timing data when the report is produced.
Common gprof command options:
-b: do not output the detailed explanation of each field in the report.
-p: output only the flat profile, the time consumption list of the functions (the uppercase form -P suppresses it instead).
-q: output only the call graph of the functions (the uppercase form -Q suppresses it instead).
-e Name: do not output the call graph entries for function Name and its subfunctions (unless they have other parent functions that are not excluded). Multiple -e flags can be given. Each -e flag names exactly one function.
-E Name: do not output the call graph entries for function Name and its subfunctions. This flag is similar to -e, but it also excludes the time used by function Name and its subfunctions from the total time and percentage calculations.
-f Name: output the call graph entries for function Name and its subfunctions. Multiple -f flags can be given. Each -f flag names exactly one function.
-F Name: output the call graph entries for function Name and its subfunctions. This flag is similar to -f, but only the time of the printed routines is used in the total time and percentage calculations. Multiple -F flags can be given. Each -F flag names exactly one function. The -F flag overrides the -E flag.
-z: also display routines with zero usage (as determined by call counts and accumulated time).
Note:
1) By default, gprof only shows information for user functions. To see information for C library functions, link with -lc_p in place of -lc. The program then links against libc_p.a, the profiling build of the C library, and profiling information is generated for library functions as well.
2) gprof can produce a report only if the program terminates through a normal exit. The profiling code registers a function with atexit() to write out the results, and an abnormal termination does not run the atexit() handlers, so no gmon.out file is generated. If your program is a service that never exits, the only way to get a profile is to modify the code. If you do not want to change how the program runs, you can add a signal handler, which keeps the code change to a minimum, for example:
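#include <signal.h>
#include <stdlib.h>

/* Exit normally on SIGUSR1 so the atexit() handlers run and gmon.out is written. */
static void sighandler(int sig_no)
{
    exit(0);
}

/* During initialization, for example at the start of main(): */
signal(SIGUSR1, sighandler);
After you send the signal with kill -USR1 pid, the program exits and generates the gmon.out file.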
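One caveat with the handler above is that exit() is not an async-signal-safe function, so calling it directly from a signal handler is not strictly safe. A common alternative, sketched below, is to have the handler only set a flag and let the service loop notice it and return normally, so the atexit() handlers still run. This is a variant of the approach above, not something the original code does. The names install_stop_handler, stop_was_requested, and serve_until_stopped are hypothetical, and do_work stands in for whatever the service does in each iteration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Set by the handler, polled by the service loop. */
static volatile sig_atomic_t stop_requested = 0;

/* Signal handler: only sets a flag, which is async-signal-safe. */
static void on_sigusr1(int sig_no)
{
    (void)sig_no;
    stop_requested = 1;
}

/* Install the SIGUSR1 handler. Returns 0 on success, -1 on failure. */
int install_stop_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Returns nonzero once SIGUSR1 has been received. */
int stop_was_requested(void)
{
    return stop_requested != 0;
}

/* Hypothetical service loop: runs do_work until a stop is requested.
 * The caller then returns from main(), which triggers the normal exit
 * path and lets the profiling code write gmon.out. */
int serve_until_stopped(void (*do_work)(void))
{
    assert(do_work != NULL);
    while (!stop_was_requested())
        do_work();
    return 0;
}
```

The trade-off is that the program only stops between iterations of its loop, so a long-running do_work delays the exit until it returns.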
Using gprof and oprofile to find performance bottlenecks
Sometimes we care a great deal about program performance, especially for low-level software such as drivers and operating systems. To optimize a program effectively, we must first find its performance bottleneck, so that effort goes where it matters most; otherwise the work may be wasted. To find the critical path, we can use profiling. On the Linux platform, the gprof and oprofile tools are available.
gprof is one of the GNU tools. At compile time it adds profiling code at the entry of each function. At run time it collects information about the program's execution in user mode, giving the call count, execution time, and call relationships of each function in a form that is easy to understand. It is well suited to finding performance bottlenecks in user-level programs, but not to programs that spend much of their time executing in kernel mode.
oprofile is also an open-source profiling tool. It uses hardware performance counters to collect statistics, so its profiling overhead is low, and it can profile the kernel. It gathers a wide range of statistics, including cache miss rate, memory access information, and branch misprediction rate, none of which gprof can provide. On the other hand, it cannot report the number of function calls.
In short, gprof is simple and well suited to finding bottlenecks in user-level programs. oprofile is somewhat more complicated but yields more information, which makes it better suited to tuning system software.
Below, compiling and running hello.c is used as an example to show how to use these two tools. The meaning of the specific results is not explained here. To learn what each result means, see the official documentation listed in the references, which explains them in detail.
gprof Quick Start
gprof is part of GNU binutils and is included in most Linux distributions by default.
Compile hello.c with the -pg option. If you want the annotated source listing, also add the -g option. Run: gcc -pg -g -o hello hello.c
Run the application: ./hello generates the gmon.out file in the current directory.
To analyze the gmon.out file with gprof, you need to pass it together with the application that generated it:
gprof hello gmon.out -p: prints the execution time of each function.
gprof hello gmon.out -q: prints the call graph, including the call relationships, call counts, and execution time of each function.
gprof hello gmon.out -A: prints an annotated source listing, which marks up the source code with the number of times each function was executed. This requires the -g option at compile time.
oprofile Quick Start
oprofile is an open-source project hosted on SourceForge. The 2.6 kernel ships with it, although it appears to be included only in SMP builds. On older systems, you need to install it yourself and recompile the kernel.
oprofile is a set of tools, each of which does a different job:
op_help: lists all supported events.
opcontrol: sets the events to be collected.
opreport: outputs the statistical results.
opannotate: generates annotated source or assembly files. Source-level annotation requires that the program was compiled with debugging information.
opstack: generates a call graph profile. It requires an x86 platform with a 2.6 kernel that has the call-graph patch applied.
opgprof: generates results in a format similar to gprof's.
oparchive: collects and packages all raw data files so that they can be analyzed on another machine.
op_import: converts sample database files from a foreign ABI to the native format.
Running oprofile requires root privileges, because it has to load the profiling kernel module and start the oprofiled daemon. Switch to root before running it.
1. opcontrol --init: loads the module and mounts /dev/oprofile, creating the necessary files and directories.
2. opcontrol --no-vmlinux or opcontrol --vmlinux=/boot/vmlinux-`uname -r`: decides whether the kernel is profiled.
3. opcontrol --reset: clears the data of the current session.
4. opcontrol --start: starts profiling.
5. ./hello: runs the application, which oprofile then profiles.
6. opcontrol --dump: writes the collected data to a file.
7. opcontrol --stop: stops profiling.
8. opcontrol -h: shuts down the oprofiled daemon (the short form of --shutdown).
9. opcontrol --shutdown: stops oprofiled.
10. opcontrol --deinit: unloads the module.
The commonly used steps are 3 to 7. After obtaining the performance data, you can analyze it with the opreport, opstack, opgprof, and opannotate tools. I usually use opreport and opannotate.
opreport usage: http://oprofile.sourceforge.net/doc/opreport.html
opannotate usage: http://oprofile.sourceforge.net/doc/opannotate.html
opgprof usage: http://oprofile.sourceforge.net/doc/opgprof.html
The most commonly used tool is opreport, which reports information per image and per symbol. For example, it can give the share of execution time taken by each function, which helps identify system performance bottlenecks. opannotate annotates the source code, marking the places where a large amount of time is spent. Common commands are as follows:
opreport -l /bin/bash --exclude-dependent --threshold 1
Used to find system bottlenecks. It shows the profiling information for /bin/bash, limited to the symbols that take more than 1% of the overall execution time.
opannotate --source --output-dir=annotated /usr/local/oprofile-pp/bin/oprofiled
opannotate --source --base-dirs=/tmp/build/libfoo/ --search-dirs=/home/user/libfoo/ --output-dir=annotated/ /lib/libfoo.so
Network resources
gprof user manual: http://sourceware.org/binutils/docs-2.17/gprof/index.html
oprofile official site: http://oprofile.sourceforge.net/
Use GNU profiler to speed up code: http://www-128.ibm.com/developerworks/cn/linux/l-gnuprof.html
Identifying performance bottlenecks with OProfile for Linux on POWER: http://www-128.ibm.com/developerworks/cn/linux/l-pow-oprofile/
This article from: Development Institute http://edu.codepub.com Source: http://edu.codepub.com/2011/0105/28527.php