Sometimes we care a great deal about program performance, especially for low-level software such as drivers and operating systems. To optimize effectively, we must first find the performance bottleneck. As the saying goes, "good steel belongs on the blade's edge": effort should go where it matters most, otherwise the work may be wasted. To find the critical path, we can use profiling. On Linux, two common profiling tools are gprof and OProfile.
- gprof is one of the GNU tools. At compile time it adds profiling code to each function. At run time it collects statistics while the program executes in user mode, so you can obtain the call count, execution time, and call relationships of each function in a form that is easy to understand. It is well suited to finding bottlenecks in user-level programs, but not to programs that spend much of their time in kernel mode.
- OProfile is also an open-source profiling tool. It collects statistics using hardware performance-counter registers, so the profiling overhead is relatively small, and it can profile the kernel as well. It gathers a wide range of statistics that gprof cannot provide, including cache miss rates, memory access information, and branch misprediction rates. However, it cannot report the number of times each function is called.
In short, gprof is simple and suited to finding bottlenecks in user-level programs. OProfile is somewhat more complicated but yields more information, which makes it better suited to debugging system software.
Below, compiling and running hello.c is used as an example to show how to use the two tools. The meaning of the specific results is not explained here. To learn what each result means, see the documentation on the official sites listed in the reference materials, which explains them in detail.
gprof implementation principle:
When you compile and link your program with the -pg option, GCC inserts a call to a function named mcount (or "_mcount" or "__mcount", depending on the compiler and operating system) into every function of your application. In other words, every function in your application calls mcount, and mcount maintains a call graph in memory, finding the addresses of the child and parent functions through the call stack. The call graph also stores the information related to each function's call time, number of calls, and so on. To profile library functions as well, link with "-lc_p" instead of "-lc"; the program is then linked against libc_p.a, which produces profiling information for the library functions. For line-by-line profiling, also add the "-g" compile option.
gprof Quick Start
gprof is part of GNU binutils and is normally included with Linux by default.
- Compile hello.c with the -pg option. To get an annotated source listing, also add the -g option. Run: gcc -pg -g -o hello hello.c
- Run the application: ./hello. This generates the gmon.out file in the current directory.
- To analyze gmon.out with gprof, you must pass it together with the application that generated it:
- gprof hello gmon.out -p shows the execution time of each function.
- gprof hello gmon.out -q shows the call graph, including the call relationships, number of calls, and execution time of each function.
- gprof hello gmon.out -A prints an annotated source listing, in which the source code is marked with the number of times each function was executed. This requires the -g option at compile time.
OProfile Quick Start
OProfile is an open-source project hosted on SourceForge. The 2.6 kernel ships with support for it, although it seems to be available only on SMP systems. On older systems you need to install it and recompile the kernel.
OProfile is a set of tools, each of which does a different job:
op_help: lists all supported events.
opcontrol: sets the events to be collected.
opreport: outputs the statistical results.
opannotate: generates annotated source/assembly files. Source-level annotation requires that support for it was enabled when the source files were compiled.
opstack: generates a call-graph profile, but requires the x86/2.6 platform with the call-graph patch applied to Linux 2.6.
opgprof: generates results in a format similar to gprof's.
oparchive: collects and packages all raw data files so they can be analyzed on another machine.
op_import: converts sample database files from another ABI to the native format.
Running OProfile requires root privileges, because it needs to load the profiling kernel module and start the oprofiled daemon. Switch to root before running it.
- opcontrol --init: load the module and mount /dev/oprofile, creating the necessary files and directories
- opcontrol --no-vmlinux or opcontrol --vmlinux=/boot/vmlinux-`uname -r`: choose whether to profile the kernel
- opcontrol --reset: clear the data of the current session
- opcontrol --start: start profiling
- ./hello: run the application; OProfile profiles it while it runs
- opcontrol --dump: write the collected data to a file
- opcontrol --stop: stop profiling
- opcontrol -h: shut down the oprofiled daemon
- opcontrol --shutdown: stop oprofiled
- opcontrol --deinit: unload the module
In everyday use, only steps 3 to 7 are needed (from opcontrol --reset to opcontrol --stop). Once the performance data has been collected, it can be analyzed with opreport, opstack, opgprof, and opannotate. I usually use opreport and opannotate.
- opreport usage: http://oprofile.sourceforge.net/doc/opreport.html
- opannotate usage: http://oprofile.sourceforge.net/doc/opannotate.html
- opgprof usage: http://oprofile.sourceforge.net/doc/opgprof.html
The most commonly used tool is opreport, which reports information at the image and symbol level. For example, it can show the share of execution time taken by each function, which helps identify system performance bottlenecks. opannotate annotates the source code to show which locations account for a large share of the time. Common commands are as follows:
- opreport -l /bin/bash --exclude-dependent --threshold 1 (used to look for system bottlenecks)
- opannotate --source --output-dir=annotated /usr/local/oprofile-pp/bin/oprofiled
- opannotate --source --base-dirs=/tmp/build/libfoo/ --search-dirs=/home/user/libfoo/ --output-dir=annotated/ /lib/libfoo.so
Network Resources
- gprof user manual: http://sourceware.org/binutils/docs-2.17/gprof/index.html
- OProfile official site: http://oprofile.sourceforge.net/
- Use GNU profiler to speed up code: http://www-128.ibm.com/developerworks/cn/linux/l-gnuprof.html
- Identifying performance bottlenecks with OProfile for Linux on POWER: http://www-128.ibm.com/developerworks/cn/linux/l-pow-oprofile/