Who Moved My CPU: OProfile Usage Notes

http://www.cnblogs.com/bangerlee/archive/2012/08/30/2659435.html

Introduction

CPU usage is inexplicably high, the application responds slowly, and you have no tool to analyze why?

OProfile uses the CPU's hardware performance counters to collect samples and, from those sample counts, helps us identify the CPU-hogging "culprit" at the process, function, and source-line levels. The examples below walk through how to use OProfile in practice.
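
To make "counting by sampling" concrete, here is a toy model in C. It is only an illustrative sketch, not OProfile's implementation: the region structure, the function name sample_regions, and the cycle costs are all hypothetical. A simulated counter advances as each region "runs", and every time it passes a fixed interval one sample is charged to the region that was executing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model: a "program" is a list of regions, each of which
 * burns a fixed number of cycles per pass. Not OProfile's real code. */
struct region {
    const char *name;       /* label, for example a function name */
    unsigned long cycles;   /* cycles spent in this region per pass */
    unsigned long samples;  /* filled in by sample_regions() */
};

/* Run the regions `passes` times. Whenever the simulated cycle counter
 * crosses `interval`, charge one sample to the region executing at that
 * moment. Returns the total number of samples taken. */
unsigned long sample_regions(struct region *r, size_t n,
                             unsigned long passes, unsigned long interval)
{
    unsigned long counter = 0, total = 0;

    assert(interval > 0);
    for (size_t i = 0; i < n; i++)
        r[i].samples = 0;

    for (unsigned long p = 0; p < passes; p++) {
        for (size_t i = 0; i < n; i++) {
            counter += r[i].cycles;
            /* the counter may cross the interval several times */
            unsigned long hits = counter / interval;
            counter %= interval;
            r[i].samples += hits;
            total += hits;
        }
    }
    return total;
}
```

In this model a region that burns nine times as many cycles as another collects nine times as many samples, which is why sample counts can stand in for CPU time.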

Common Commands

Profiling CPU usage with OProfile involves several steps: initialization, starting the profiler, dumping the collected data, and viewing the results. The commonly used OProfile commands are listed below.

Initialization
    opcontrol --no-vmlinux : tells OProfile not to record statistics for the kernel or kernel modules once profiling starts
    opcontrol --init : loads the OProfile kernel module (the OProfile driver)

Profiling control
    opcontrol --start : starts profiling
    opcontrol --dump : writes the samples collected so far to file
    opcontrol --reset : clears previously collected sample data
    opcontrol -h : stops profiling and shuts down the OProfile daemon

Viewing results
    opreport : shows results per image; processes, shared libraries, and kernel modules all count as images
    opreport -l : shows results per function
    opreport -l test : shows per-function results for the test process
    opannotate -s test : shows per-source-line results for the test process
    opannotate -s /lib64/libc-2.4.so : shows per-source-line results for the libc-2.4.so library

Reading opreport Output

As described in the command list above, opreport shows CPU usage per image:

linux # opreport
CPU: Core 2, speed 2128.07 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000
CPU_CLK_UNHALT...|
  samples|      %|
------------------
 31645719 87.6453 no-vmlinux
  4361113 10.3592 libend.so
     7683  0.1367 libpython2.4.so.1.0
     7046  0.1253 op_test
      ...

Each row of the listing has the following form:

     samples |                                 % |
--------------------------------------------------------------
sample count   percentage of total sample count     image name

Because we ran "opcontrol --no-vmlinux" during initialization, OProfile does not break down samples for the kernel and kernel modules, so they are lumped together under the no-vmlinux image in the results. In the output, libend.so and libpython2.4.so.1.0 are shared libraries, and op_test is a process. The sample data shows that during the profiling period the CPU mostly executed kernel and module code, and that functions in libend.so also took a sizable share, about 10%.
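
A sample count can also be turned into a rough CPU time. The header of the listing reports "count 100000", so each sample stands for roughly 100000 unhalted clock cycles, and it gives an estimated clock speed of 2128.07 MHz. The helper below is a minimal sketch under those assumptions. The function name samples_to_seconds is hypothetical, and the result is only an approximation because the clock speed is itself an estimate.

```c
#include <assert.h>

/* Approximate CPU seconds represented by `samples` samples, assuming one
 * sample is taken every `count` events of a cycle-counting event and the
 * CPU runs at a steady `mhz` megahertz. */
double samples_to_seconds(unsigned long samples, unsigned long count,
                          double mhz)
{
    assert(mhz > 0.0);
    return (double)samples * (double)count / (mhz * 1e6);
}
```

For example, samples_to_seconds(7046, 100000, 2128.07) comes to about 0.33 seconds for the op_test image in the listing above.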

Going one step further, we can see how much CPU each function in a process or shared library consumed during the profiling period:

linux # opreport -l
 samples        %  image name  app name    symbol name
31645719  87.4472  no-vmlinux  no-vmlinux  /no-vmlinux
 4361113  10.3605  libend.so   libend.so   endless
    7046   0.1253  op_test     op_test     main
     ...

The output shows that the CPU-consuming functions are endless in the libend.so library and main in the op_test program.

If we instead run opcontrol --vmlinux=vmlinux-`uname -r` during initialization, OProfile also profiles the kernel and kernel modules. opreport then no longer lumps them together as no-vmlinux; the kernel and each kernel module appear as separate images with their own CPU usage.

Using opannotate to View CPU Usage at the Code Level

The sections above showed how to use opreport to view CPU usage at the process and function levels. At this point you may ask: with opreport I found that process A is consuming CPU, and that function B inside it is responsible; is there a way to go further and find which line of code in function B uses the most CPU?

OProfile's opannotate command does exactly this. Given a program built with debug information, or a shared library with debuginfo, opannotate shows CPU usage statistics for each line of source code. The following simple example shows how to use it.

First, we need a CPU-consuming program with the following code:

op_test.c
extern void endless();
int main()
{
    int i = 0, j = 0;
    for (; i < 10000000; i++)
    {
        j++;
    }
    endless();
    return 0;
}

The program calls an external function, endless, which is defined as follows:

end.c
void endless()
{
    int i = 0;
    while (1)
    {
        i++;
    }
}

The endless function is equally simple. We compile end.c, which defines it, with debug information and build the libend.so shared library:

linux # gcc -c -g -fPIC end.c
linux # gcc -shared -fPIC -o libend.so end.o
linux # cp libend.so /usr/lib64/libend.so

Next, compile op_test.c with debug information to produce the op_test executable:

linux # gcc -g -o op_test op_test.c -lend
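
None of the gcc commands above enable optimization, which is what keeps the busy loop in main alive. If you experiment with -O2, the compiler is free to delete a loop whose result is never used, and the profile would then show nothing for that line. The variant below is a small sketch of one way to keep such a loop in place. The function name spin is hypothetical and not part of the original example.

```c
/* Busy-loop for `iterations` rounds. The counter is volatile so that every
 * increment is an observable access the compiler must keep, even when
 * optimization is enabled. Returns the final counter value. */
unsigned long spin(unsigned long iterations)
{
    volatile unsigned long j = 0;

    for (unsigned long i = 0; i < iterations; i++)
        j++;
    return j;
}
```

Calling spin(10000000) from main in place of the hand-written loop should give opannotate a line to attribute samples to at any optimization level.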

Then we start OProfile and launch the op_test process:

linux # opcontrol --reset
linux # opcontrol --start
linux # ./op_test &

After the program has run for a while, dump the collected data and view the results with opannotate:

linux # opcontrol --dump
linux # opannotate -s op_test
/*
 * Total samples for file : "/tmp/lx/op_test.c"
 *
 * 7046  100.00
 */
               :int main()
               :{ /* main total: 7046  100.000 */
               :    int i = 0, j = 0;
  6447 91.4987 :    for (; i < 10000000; i++)
               :    {
   599  8.5013 :          j++;
               :    }
               :    endless();
               :    return 0;
               :}

The output shows that, within main of the op_test program, most CPU time goes to the for-loop line, because that line performs both the increment of i and the comparison of i with 10000000.

The following shows the results for the libend.so shared library:

linux # opannotate -s /usr/lib64/libend.so
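
The percentage that opannotate prints next to a line is that line's sample count divided by the total for the file. The helper below is a minimal sketch of that arithmetic. The function name sample_percent is hypothetical.

```c
#include <assert.h>

/* Percentage of `total` samples that landed on one line or symbol. */
double sample_percent(unsigned long samples, unsigned long total)
{
    assert(total > 0);
    return 100.0 * (double)samples / (double)total;
}
```

With the op_test.c figures, sample_percent(6447, 7046) gives about 91.4987 for the for-loop line and sample_percent(599, 7046) gives about 8.5013 for the increment of j, matching the listing.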

/*
 * Total samples for file : "/tmp/lx/end.c"
 *
 * 4361113  100.00
 */
                 :void endless()
                 :{
                 :    int i = 0;
                 :    while (1)
                 :    {
  25661   0.6652 :          i++;
4335452  99.3348 :    }
                 :}

Viewing CPU Usage of C Library Code

So far we have used opannotate to look at CPU usage in application code and in our own shared library. Can we do the same for C library code?

Before OProfile can show C library source, the glibc debuginfo package must be installed. Once it is, opannotate can annotate the C library code as well. The following shows part of _int_malloc, the function underlying malloc:

linux # opannotate -s /lib64/libc-2.4.so

/* ---------------- malloc --------------------- */
               :Void_t *
               :_int_malloc(mstate av, size_t bytes)
               :{ /* _int_malloc total: 118396  94.9249 */
               :        assert((fwd->size & NON_MAIN_ARENA) == 0);
115460 92.5709 :        while ((unsigned long)(size) < (unsigned long)(fwd->size)) {
  1161  0.9308 :            fwd = fwd->fd;
               :            assert((fwd->size & NON_MAIN_ARENA) == 0);
               :        }
               :}

During performance tuning, the C library CPU statistics collected by OProfile let us determine whether a performance bottleneck is caused by C library code. If the results show that the CPU spends most of its time executing C library code, we can address the resulting application performance problem by modifying the C library code or upgrading glibc.

Summary

This article described how to use OProfile to examine CPU usage at the process, function, and code levels. For the code level, it covered viewing CPU statistics for program code, for a custom shared library, and for glibc code, using three common OProfile commands along the way: opcontrol, opreport, and opannotate.

When CPU usage on a system is abnormally high, OProfile can tell us not only which process is using the CPU but also which functions and lines of code inside that process are responsible. When analyzing application bottlenecks and tuning performance, we can use OProfile to find the part of the code that consumes the most CPU and focus our analysis and tuning there. We should also not look only at the upper-layer code we wrote ourselves; the underlying library functions, and even the kernel, can affect application performance.

This article covered only one of the scenarios OProfile can analyze, CPU usage. OProfile can also report on cache utilization, branch mispredictions, and more. The "opcontrol --list-events" command lists all the events OProfile can monitor. For further usage, see the OProfile Manual.

Reference: OProfile Manual
