http://www.cnblogs.com/bangerlee/archive/2012/08/30/2659435.html
Introduction
CPU usage inexplicably high? Applications slow to respond? Struggling for lack of an analysis tool?
OProfile uses the CPU's hardware performance counters and sample counting to help us track down the CPU-hogging "culprit" at the process, function, and code levels. The examples below walk through how to use it in practice.
Common Commands
Checking CPU usage with OProfile involves initializing it, starting profiling, dumping the collected data, and viewing the results. The commonly used OProfile commands are listed below.
Initialization:
opcontrol --no-vmlinux : tells OProfile not to record statistics for the kernel or kernel modules once profiling starts
opcontrol --init : loads the OProfile kernel module and driver
Profiling control:
opcontrol --start : starts profiling
opcontrol --dump : writes the collected samples to file
opcontrol --reset : clears previously collected samples
opcontrol -h : shuts down the OProfile daemon
Viewing results:
opreport : shows results per image (processes, dynamic libraries, and kernel modules all count as images)
opreport -l : shows results per function
opreport -l test : shows per-function results for the test process
opannotate -s test : shows results for the test process at the source-code level
opannotate -s /lib64/libc-2.4.so : shows source-level results for the libc-2.4.so library
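The commands above can be strung together into one session. The following is only a sketch under a few assumptions: RUN is a hypothetical dry-run switch introduced here, the 10-second default is arbitrary, `timeout` is taken from GNU coreutils to bound the workload, and running --init before --no-vmlinux is one plausible ordering rather than a requirement stated in this article.

```shell
#!/bin/sh
# Sketch of one profiling session built from the commands listed above.
# RUN is a hypothetical switch: it defaults to "echo", so the script only
# prints the commands. Set RUN= (empty) and run as root to execute them.
RUN=${RUN-echo}

profile_session() {
    name=$1                 # program in the current directory (placeholder)
    secs=${2:-10}           # how long to let it run (arbitrary default)
    $RUN opcontrol --init                  # load the OProfile module and driver
    $RUN opcontrol --no-vmlinux            # do not record kernel statistics
    $RUN opcontrol --reset                 # clear samples from earlier runs
    $RUN opcontrol --start                 # begin sampling
    $RUN timeout "$secs" "./$name"         # run the workload for a bounded time
    $RUN opcontrol --dump                  # write the samples to file
    $RUN opreport -l "$name"               # per-function report for the program
    $RUN opcontrol -h                      # shut the daemon down
}

profile_session op_test
```

In dry-run mode this prints the eight commands in order, which makes it easy to review the sequence before running it for real.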
Interpreting opreport Output
As noted in the command list above, opreport shows CPU usage from the image perspective:
linux # opreport
CPU: Core 2, speed 2128.07 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000
CPU_CLK_UNHALT...|
  samples|      %|
------------------
 31645719 87.6453 no-vmlinux
  4361113 10.3592 libend.so
     7683  0.1367 libpython2.4.so.1.0
     7046  0.1253 op_test
...
The listing above has the following layout:
                    samples|                                  %|
------------------------------------------------------------------------------
samples taken in the image   percentage of the total sample count   image name
Because we ran "opcontrol --no-vmlinux" during initialization, telling OProfile not to profile the kernel and its modules, all kernel and module samples are lumped together under the no-vmlinux image in the results. In the output, libend.so and libpython2.4.so.1.0 are dynamic libraries, and op_test is a process. The sample data shows that during the profiling period the CPU mainly executed kernel and module code, and that functions in libend.so also took a sizable share, about 10%.
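When the image list is long, it can help to filter it down to the heavy hitters. The sketch below assumes the simple three-column layout shown above (samples, percentage, image name). The function name hot_images and the threshold argument are made up for illustration, and other opreport layouts would need a different filter.

```shell
#!/bin/sh
# hot_images MIN_PERCENT: read opreport-style lines on stdin and print
# "image percent" for every image at or above MIN_PERCENT.
# Data rows are recognised by a purely numeric first column, so the
# header lines that opreport prints are skipped.
hot_images() {
    awk -v min="$1" '
        $1 ~ /^[0-9]+$/ && NF >= 3 && $2 + 0 >= min + 0 { print $3, $2 }
    '
}

# Rows copied from the listing above; keep images with at least 1%.
hot_images 1 <<'EOF'
CPU_CLK_UNHALT...|
  samples|      %|
------------------
 31645719 87.6453 no-vmlinux
  4361113 10.3592 libend.so
     7683  0.1367 libpython2.4.so.1.0
     7046  0.1253 op_test
EOF
```

With the sample rows and a 1% threshold, only no-vmlinux and libend.so pass the filter. On a live system the same function could read from a pipe, for example `opreport | hot_images 1`.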
Going one step further, we can see how much CPU each function in a process or dynamic library consumed during the profiling period:
linux # opreport -l
samples   %        image name   app name     symbol name
31645719  87.4472  no-vmlinux   no-vmlinux   /no-vmlinux
4361113   10.3605  libend.so    libend.so    endless
7046       0.1253  op_test      op_test      main
...
The output shows that, apart from the kernel, the functions consuming CPU are endless in the libend.so library and main in the op_test program.
If we instead run opcontrol --vmlinux=vmlinux-`uname -r` during initialization, telling OProfile to profile the kernel and kernel modules, then opreport no longer lumps them together as no-vmlinux. The kernel and each kernel module appear as separate images, each with its own CPU usage.
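Choosing between the two initialization options can be scripted. This is a minimal sketch: kernel_option is a hypothetical helper, and the directory passed to it is an assumption, since the location of an uncompressed vmlinux-<release> image varies between distributions.

```shell
#!/bin/sh
# kernel_option DIR: print the opcontrol option to use at initialization.
# DIR is a directory that may hold an uncompressed kernel image named
# vmlinux-<release>; where that file lives depends on the distribution.
kernel_option() {
    image="$1/vmlinux-$(uname -r)"
    if [ -f "$image" ]; then
        echo "--vmlinux=$image"    # profile the kernel and its modules
    else
        echo "--no-vmlinux"        # fall back to skipping kernel statistics
    fi
}

# /boot is only an example location.
kernel_option /boot
```

The result can then be passed straight through, as in `opcontrol $(kernel_option /boot)`.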
Using opannotate to View CPU Usage at the Code Level
The previous section showed how to use opreport to view CPU usage at the process and function levels. At this point some readers may wonder: with opreport I have found process A, which is consuming CPU, and function B inside it, which is responsible. Is there a way to go further and find the line of code in function B that consumes the most CPU?
OProfile's opannotate command does exactly this. Given a program and dynamic libraries built with debug information, opannotate shows CPU usage statistics at the source-code level. The simple example below shows how to use it.
First, we need a CPU-hungry program, with the following code:
op_test.c
extern void endless();      /* defined in end.c, built into libend.so */

int main()
{
    int i = 0, j = 0;
    for (; i < 10000000; i++)
    {
        j++;
    }
    endless();              /* never returns */
    return 0;
}
The program calls an external function, endless, which is defined as follows:
end.c
void endless()
{
    int i = 0;
    while (1)               /* spin forever to keep the CPU busy */
    {
        i++;
    }
}
The endless function is just as simple. We compile end.c, which defines it, with debug information and build the libend.so dynamic library:
linux # gcc -c -g -fPIC end.c
linux # gcc -shared -fPIC -o libend.so end.o
linux # cp libend.so /usr/lib64/libend.so
Next, compile op_test.c with debug information to produce the op_test executable:
linux # gcc -g -lend -o op_test op_test.c
After that, we start OProfile profiling and launch the op_test process:
linux # opcontrol --reset
linux # opcontrol --start
linux # ./op_test &
After the program has run for a while, dump the profiling data and view the results with opannotate:
linux # opcontrol --dump
linux # opannotate -s op_test
/*
 * Total samples for file : "/tmp/lx/op_test.c"
 *
 * 7046 100.00
 */
               :int main()
               :{ /* main total: 7046 100.000 */
               :    int i = 0, j = 0;
  6447 91.4987 :    for (; i < 10000000; i++)
               :    {
   599  8.5013 :        j++;
               :    }
               :    endless();
               :    return 0;
               :}
The output shows that within the main function of op_test, CPU time goes mainly to the line containing the for loop, because that line both increments the variable i and compares i with 10000000.
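The percentages in the annotated listing are simply each line's sample count divided by the file total. As a quick check, the sketch below recomputes them from the counts shown above. The helper name sample_percent is made up for illustration.

```shell
#!/bin/sh
# sample_percent LINE_SAMPLES TOTAL_SAMPLES
# Prints the share of samples attributed to one source line, with four
# decimals to match the precision of the annotated listing.
sample_percent() {
    awk -v n="$1" -v total="$2" 'BEGIN {
        if (total + 0 <= 0) {
            print "total must be positive" > "/dev/stderr"
            exit 1
        }
        printf "%.4f\n", 100 * n / total
    }'
}

sample_percent 6447 7046    # the for-loop line
sample_percent 599 7046     # the j++ line
```

The two calls print 91.4987 and 8.5013, matching the figures opannotate reported for op_test.c.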
The results for the libend.so dynamic library are as follows:
linux # opannotate -s /usr/lib64/libend.so
/*
 * Total samples for file : "/tmp/lx/end.c"
 *
 * 4361113 100.00
 */
                 :void endless()
                 :{
                 :    int i = 0;
                 :    while (1)
                 :    {
   25661  0.6652 :        i++;
 4335452 99.3348 :    }
                 :}
Viewing CPU Usage of C Library Code
Above we used opannotate to look at CPU usage in application code and in our own dynamic library. Can we do the same for C library code?
Before OProfile can show C library code, the glibc debuginfo package must be installed. Once it is, opannotate can display the C library source. The following shows part of _int_malloc, the function that underlies malloc:
linux # opannotate -s /lib64/libc-2.4.so
/* ----------------malloc--------------------- */
               :Void_t*
               :_int_malloc(mstate av, size_t bytes)
               :{ /* _int_malloc total: 118396 94.9249 */
               :          assert((fwd->size & NON_MAIN_ARENA) == 0);
115460 92.5709 :          while ((unsigned long)(size) < (unsigned long)(fwd->size)) {
  1161  0.9308 :            fwd = fwd->fd;
               :            assert((fwd->size & NON_MAIN_ARENA) == 0);
               :          }
               :}
During performance tuning, OProfile's CPU statistics for C library code let us judge whether a performance bottleneck is caused by the C library. If the results show that the CPU spends most of its time executing C library code, we can address the application's performance problem by modifying the C library code or upgrading glibc.
Summary
This article described how to use OProfile to examine CPU usage at the process, function, and code levels. At the code level, it covered viewing CPU statistics for program code, for a custom dynamic library, and for glibc, using three common OProfile commands along the way: opcontrol, opreport, and opannotate.
When CPU usage on a system is abnormally high, OProfile can tell us not only which process is using the CPU but also which functions and lines of code within that process are responsible. When analyzing performance bottlenecks and tuning an application, we can use OProfile to get the CPU usage of the program's code and find the most CPU-hungry parts, so that the analysis and tuning are well targeted. We should also look beyond the upper-layer code we wrote ourselves and consider how the underlying library functions, and even the kernel, affect application performance.
This article covered only one of the scenarios OProfile can analyze, CPU usage. OProfile can also be used to examine cache utilization, branch mispredictions, and more. The "opcontrol --list-events" command lists all the events OProfile can monitor. For more on using OProfile, see the OProfile manual.
Reference: OProfile manual