Source Code Analysis: Dynamic Analysis of Linux Kernel Function Call Relationships


by Falcon of tinylab.org
2015/04/18

Motivation

Source code analysis is a topic programmers can never get away from.

Whether you are studying open source projects or doing everyday porting and development work, there is no avoiding a deep reading of the source code.

As the saying goes: to do a good job, one must first sharpen one's tools.

The first two articles covered static analysis and the dynamic analysis of the application part. Here we begin to discuss how to dynamically analyze the Linux kernel part.

Preparation

Ftrace

Similar to the user-space gprof, before tracing kernel functions you need some extra kernel configuration: code is inserted into the relevant kernel functions to collect the necessary information, such as call time, call counts, and the parent function.

Early kernel function tracing was supported by KFT, which relies on -finstrument-functions to insert specific calls at the entry and exit of every function and intercept the kinds of information mentioned above. The author maintained KFT early on and successfully ported it to the Loongson/MIPS platform; see the related mail thread: Kernel function Tracing support for linux-mips. What the official Linux community ultimately adopted, however, is Ftrace. Although the idea is similar, Ftrace brings significant innovations:

    • Ftrace only needs to insert one external call, mcount, at each function entry, whereas KFT instruments both entry and exit.
    • Ftrace cleverly intercepts the function's return address so that, at run time, it can jump to a prepared common exit point, record the various kinds of information, and then return to the original address.
    • After linking, Ftrace records all insertion-point addresses in a table and by default replaces every insertion point with a no-op (NOP) instruction, so the default cost of Ftrace is nearly zero.
    • Ftrace can be enabled and used on demand at run time through the debugfs/tracefs interface, even when no third-party tools are available (see the sketch after this list).
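As a quick illustration of that last point, here is a minimal sketch of driving the function tracer by hand from a root shell; it assumes debugfs is mounted at /sys/kernel/debug (newer kernels also expose the same files under /sys/kernel/tracing):

# cd /sys/kernel/debug/tracing
# echo function > current_tracer     # select the function tracer
# echo 1 > tracing_on                # start tracing
# sleep 1                            # let some activity be recorded
# cat trace | head -20               # inspect the recorded calls
# echo 0 > tracing_on                # stop tracing
# echo nop > current_tracer          # restore the default (no-op) tracer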

Therefore, this article only touches on Ftrace briefly. For its detailed usage, the recommended reading is the LWN series written by its author, Steven Rostedt, for example:

    • Debugging the kernel using Ftrace (parts 1 and 2)
    • Secrets of the Ftrace function tracer
    • trace-cmd: A front-end for Ftrace

For the purposes of this article, it is enough that the kernel is configured with Ftrace support; we will not use its low-level interface directly.
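The relevant options look roughly like the following (a sketch only; exact option names and dependencies vary between kernel versions):

CONFIG_FTRACE=y
CONFIG_FUNCTION_TRACER=y
CONFIG_FUNCTION_GRAPH_TRACER=y
CONFIG_DYNAMIC_FTRACE=y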

In addition, the kernel's function symbol table needs to be compiled in:

CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
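Once the kernel is built with these options, a quick way to confirm that kernel symbols are visible (just an illustrative check, not part of the original walkthrough) is:

$ sudo grep -w schedule /proc/kallsyms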

If you want to use Ftrace directly, you can install the following tools, although this article does not cover them:

$ sudo apt-get install trace-cmd kernelshark pytimechart
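For the curious, a minimal trace-cmd session might look like the following sketch (shown only as an illustration; nothing later in this article relies on it):

$ sudo trace-cmd record -p function_graph -F ls
$ sudo trace-cmd report | head -20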
Perf

Perf was originally created to replace Oprofile. It started in 2009 as little more than a new system call, and has since grown so capable that it has all but pushed Oprofile off the stage of history. It supports not only hardware performance counters but also a variety of software counters, giving the Linux world a complete set of profiling tools; of course, the kernel-side function profiling still depends on Ftrace underneath.

For detailed Perf usage, refer to the Perf Wiki.

Ok, you also need to enable the following kernel configuration:

CONFIG_HAVE_PERF_EVENTS=y
CONFIG_PERF_EVENTS=y

Install the user-space tool:

$ sudo apt-get install linux-tools-`uname -r`
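Once installed, a quick sanity check (an illustrative example, not required for the rest of the article) is to let perf count a few default events for an arbitrary command; the statistics are printed to stderr:

$ perf stat ls > /dev/null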
Flamegraph

FlameGraph is a major innovation in how profiling data is displayed. Traditional tree-style output occupies a large visual area and makes it hard to pinpoint hot spots. FlameGraph instead uses a flame-like, cascading layout that takes little page space and quickly shows the proportion of time spent on each path; and because it is based on SVG and JavaScript, it can be freely zoomed and can interactively display the sample count and percentage of each function.

OK, get FlameGraph ready:

$ git clone https://github.com/brendangregg/FlameGraph.git

Before using FlameGraph, let's walk through a small example to better understand what makes it unique.

a;b;c;d 90
e 10

This data conveys three pieces of information:

    • Function-call relationship: a calls b, which calls c, which calls d
    • Counts: the a;b;c;d path is recorded 90 times, the e path 10 times
    • There are two top-level branches: a and e

Rendering this data with the dot description language used earlier would be comparatively complex, and with very many function nodes the result becomes almost unreadable; FlameGraph handles it very well. Save the information above as calls.log and process it as follows:

$ cd FlameGraph
$ cat ../calls.log | ./flamegraph.pl > calls-flame.svg

The effect is as follows:
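One more point worth noting: when several stacks share a common prefix, flamegraph.pl merges the shared frames. For instance, a hypothetical folded file like the one below (not part of the original example) would be drawn as a single a;b frame with two children, c and d, plus a separate e frame:

a;b;c 50
a;b;d 40
e 10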

More preparation

In day-to-day program development we mostly care only about what happens in user space, but for system-level optimization the system libraries and even the kernel must be taken into account: besides its own operations, a running application spends a large part of its time calling into various system libraries, which in turn enter the Linux kernel.

Let's return to fib.c, the example from the previous article. With ltrace and strace you can look at its library-function calls and system calls:

$ ltrace -f -ttt -c ./fib 2>&1 > /dev/null
% time     seconds  usecs/call     calls      function
------ ----------- ----------- --------- --------------------
100.00    0.006063         141           printf
------ ----------- ----------- --------- --------------------
100.00    0.006063                       total

$ strace -f -ttt -c ./fib 2>&1 > /dev/null
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 22.77    0.000051           6         8           mmap
 15.18    0.000034           9         4           mprotect
 11.61    0.000026           9         3         3 access
  9.82    0.000022          22         1           munmap
  9.38    0.000021          21         1           execve
  8.93    0.000020          10         2           open
  7.14    0.000016           5         3           fstat
  4.46    0.000010           5         2           close
  3.12    0.000007           7         1           read
  3.12    0.000007           7         1           brk
  2.68    0.000006           6         1         1 ioctl
  1.79    0.000004           4         1           arch_prctl
  0.00    0.000000           0         1           write
------ ----------- ----------- --------- --------- ----------------
100.00    0.000224                      29         4 total

The earlier article showed that fibonacci() itself accounts for almost 100% of the time, but in reality both library functions and the kernel also incur overhead while an application runs. The ltrace output above reflects the library-function calls, while strace reflects the system calls; the kernel overhead is mostly triggered by system calls, plus some cost from the kernel's own scheduling, context switching, memory allocation, and so on. The rough split of time can be viewed with the time command:

$ time ./fib 2>&1 > /dev/null
real        0m5.887s
user        0m5.881s
sys         0m0.004s

Next, let's cut to the chase: using Perf (which relies on Ftrace underneath), we will take a comprehensive look at both the user-space and kernel-space call behavior of a running application, and draw the result with FlameGraph.

Kernel function calls

Before using Perf, in addition to the kernel configuration above, you also need to allow access to kernel symbol addresses; otherwise the report will be full of raw hexadecimal addresses with no function names:

$ echo 0 > /proc/sys/kernel/kptr_restrict
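Equivalently, the setting can be changed through sysctl (and added to /etc/sysctl.conf to survive reboots); this is just an alternative form of the same command:

$ sudo sysctl -w kernel.kptr_restrict=0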

Let's first look at user space, library functions, and system calls, taking the following command as an example:

find /proc/ -maxdepth 2 -name "vm" 2>&1 >/dev/null
User space
$ valgrind --tool=callgrind find /proc/ -maxdepth 2 -name "vm" 2>&1 >/dev/null
$ gprof2dot -f callgrind ./callgrind.out.24273 | dot -Tsvg -o find-callgrind.svg

The effect is as follows:

Library functions
$ ltrace -f -ttt -c find /proc/ -maxdepth 2 -name "vm" 2>&1 > /dev/null
% time     seconds  usecs/call     calls      function
------ ----------- ----------- --------- --------------------
 30.75    2.939452                 47175 strlen
 16.71    1.597174                 25560 free
 15.38    1.469654                 23589 memmove
  9.18    0.877211          63     13773 malloc
  8.55    0.817158                 12542 readdir
  7.65    0.731476                 11796 fnmatch
  7.56    0.722771                 11793 __strndup
  1.73    0.165002                  1966 __fxstatat
  0.41    0.039644          78       503 fchdir
  0.23    0.022348                   408 memcmp
  0.23    0.022276
  0.23    0.021551                       opendir
  0.22    0.021419
  0.22    0.021144          84       251 open
  0.21    0.019795                   249 __fxstat
  0.21    0.019790       19790         1 qsort
  0.17    0.016417                       dirfd
  0.16    0.015680          98       159 strcmp
  0.16    0.015218                   252 __errno_location
  0.00    0.000417         417         1 dcgettext
  0.00    0.000404         404         1 setlocale
  0.00    0.000266         133         2 isatty
  0.00    0.000213         106         2 getenv
  0.00    0.000158         158         1 __fprintf_chk
  0.00    0.000158                     2 fclose
  0.00    0.000135         135         1 uname
  0.00    0.000120         120         1 strtod
  0.00    0.000113                     2 __fpending
  0.00    0.000110                     1 bindtextdomain
  0.00    0.000107         107         1 gettimeofday
  0.00    0.000107         107         1 textdomain
  0.00    0.000107                     1 fileno
  0.00    0.000106         106         1 strchr
  0.00    0.000106         106         1 memcpy
  0.00    0.000105                     1 __cxa_atexit
  0.00    0.000102                     2 ferror
  0.00    0.000092          92         1 fflush
  0.00    0.000079                     1 realloc
  0.00    0.000076                     1 strspn
  0.00    0.000072          72         1 strtol
  0.00    0.000052                     1 calloc
  0.00    0.000051                     1 strrchr
------ ----------- ----------- --------- --------------------
100.00    9.558436                151045 total
System calls
$ strace -f -ttt -c find /proc/ -maxdepth 2 -name "vm" 2>&1 > /dev/null
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 39.93    0.007072           4      1966           newfstatat
 22.44    0.003974           8                     getdents
 10.53    0.001865           4       508           close
  8.27    0.001464           3       503           fchdir
  6.09    0.001079           4       261         4 open
  5.72    0.001013           4                     openat
  4.68    0.000829           3                     fstat
  0.68    0.000120           9                     mmap
  0.36    0.000064           6                     mprotect
  0.33    0.000058           5                     brk
  0.27    0.000048           3                     munmap
  0.20    0.000036           9         4         4 access
  0.19    0.000034           9         4           read
  0.11    0.000020           7         3         2 ioctl
  0.07    0.000012          12         1           write
  0.05    0.000009           9         1           execve
  0.03    0.000006           6         1           uname
  0.03    0.000006           6         1           arch_prctl
------ ----------- ----------- --------- --------- ----------------
100.00    0.017709                    4286           total

Next, let's use perf to look at the kernel side:

Kernel space
$ perf record -g find /proc -maxdepth 2 -name "vm" 2>&1 >/dev/null
$ perf report -g --stdio
$ perf script | ./stackcollapse-perf.pl > find.perf-outfolded
$ flamegraph.pl find.perf-outfolded > find-flame.svg

The above-mentioned commands have the following meanings:

    • perf record -g records the call graph (function-call relationships) of the command that follows while it executes
    • perf report -g --stdio prints the collected call-graph data on the console (the output looks a bit like a tree)
    • perf script | ./stackcollapse-perf.pl > find.perf-outfolded converts the data into the folded format that FlameGraph expects
    • flamegraph.pl find.perf-outfolded > find-flame.svg generates the flame graph (the last two steps can also be chained; see the note after this list)
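The last two steps are often chained into a single pipeline; assuming both FlameGraph scripts are reachable from the current directory (as in the stackcollapse step above), something like this should produce the same SVG:

$ perf script | ./stackcollapse-perf.pl | ./flamegraph.pl > find-flame.svg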

The effect is as follows:

Summary

Through the steps above, we have demonstrated how to analyze the kernel-space function calls made while an application executes, which nicely complements the previous two articles.

So far this series has focused on function-call relationships, which is still far from enough for either full source-code analysis or performance optimization:

    • On the one hand, this only helps us understand the code at the function level, not at the statement level. To go further we need gcov and kgcov support.
    • On the other hand, for performance analysis, tracing function-call relationships and hot spots is not enough; information such as overall call timing, the current processor frequency, and kernel scheduling behavior cannot be captured by the methods in this series.

Next, we plan to add the following articles to this source-analysis series:

    • An introduction to methods for drawing function-call (flow) graphs, presenting several new approaches and weighing their pros and cons against the existing ones.
    • Statement-level source analysis using gcov and kgcov.

In addition, we will start a separate performance-optimization series introducing various optimization examples, covering both the application and the kernel.

Initiative

Finally, I would like to pay tribute to the developers and contributors of those open source tools!

The Linux world gathers a great many talented people whose creativity, like a fountain, continuously nourishes the IT world. The original authors of the three major tools used in this article are all representative of that kind of genius, and my admiration goes without saying.

I once had the chance to meet Steven in person, and when I submitted Ftrace support for MIPS to the official community back in 2009 he provided a great deal of guidance and help; my gratitude and admiration for his professional dedication are boundless.

Here we invite more front-line engineers to gather around the technology, collaborate, share learning experiences, exchange development experience, build open source tools together, and work to promote exchange and prosperity in the industry. At present 15 front-line engineers are involved, exploring and creating together on Worktile. If you would like to join, contact us for an invitation.
