Dynamic tracking Technology (iv): implementation of GO program dynamic tracking based on Linux BCC/BPF

Source: Internet
Author: User
Tags aliases
This is a creation in Article, where the information may have evolved or changed.


Summary


    • Original: Brendan Gregg ' s Blog: "Golang bcc/bpf Function Tracing", 2017 Jan
    • Intro: GDB, go execution Tracer, Godebug, Gctrace, Schedtrace
    • First, Gccgo Function counting
    • Second, Go GC Function counting
    • Iii. per-event invocations of a function
    • Iv. Interface Arguments
    • V. Function Latency
    • Vi. Summary
    • Vii. Tips: Building LLVM and Clang development tools Library


In this article, I'll quickly investigate a new way to track the Go program: Dynamic tracking based on the Linux 4.x EBPF. If you go to search for go and BPF, you will find the Go language interface using the BPF interface (for example, GOBPF). That's not what I'm exploring: I'll use the BPF tool to implement performance analysis and debugging for Go applications.



There are several ways to debug and track Go programs, including but not limited to:


    • Gdb
    • Go execution tracer: For high-level anomalies and blocking events

Go execution Tracer. (import "Runtime/trace")

    • Godebug (a cross-platform Go Program debugging Tool), Gctrace , and schedtrace


BPF tracks to do a lot of things, but has its own pros and cons, which will be explained in detail next. First I start with a simple Go program (HELLO.GO)


package main

import "fmt"

func main () {
        fmt.Println ("Hello, BPF!")
}
First, gccgo Function Counting
I started compiling with gccgo and then using the Go gc compiler. (Difference: gccgo can generate optimized binaries, but based on older versions of Go.)

## Compile
$ gccgo -o hello hello.go
$ ./hello
Hello, BPF!
Now I will use the funccount of the bcc tool to dynamically track and count all Go library function calls starting with "fmt.". Re-running the Hello program in another terminal has the following effect:

# funccount 'go: fmt. *'
Tracing 160 functions for "go: fmt. *" ... Hit Ctrl-C to end.
^ C
FUNC COUNT
fmt..import 1
fmt.padString.pN7_fmt.fmt 1
fmt.fmt_s.pN7_fmt.fmt 1
fmt.WriteString.pN10_fmt.buffer 1
fmt.free.pN6_fmt.pp 1
fmt.fmtString.pN6_fmt.pp 1
fmt.doPrint.pN6_fmt.pp 1
fmt.init.pN7_fmt.fmt 1
fmt.printArg.pN6_fmt.pp 1
fmt.WriteByte.pN10_fmt.buffer 1
fmt.Println 1
fmt.truncate.pN7_fmt.fmt 1
fmt.Fprintln 1
fmt. $ nested1 1
fmt.newPrinter 1
fmt.clearflags.pN7_fmt.fmt 2
Detaching ...
The Neat! Output contains the program's fmt.Println () function call.

I don't need to enter any special mode to achieve this effect. For a Go application that is already running, I can start measuring directly without restarting the process. So how does it even work? This is thanks to uprobes, a new feature of Linux 3.5, see Linux uprobes: User-Level Dynamic Tracing for details.

It overwrites instructions with a soft interrupt to kernel instrumentation, and reverses the process when tracing has ended.

The gccgo compiled output provides a standard symbol table for function lookups. In this case, I use libgo as a measurement tool (assuming "lib" precedes "go:") as a binary dynamic link library issued by gccgo (libgo includes the fmt package). uprobes can be connected to already running processes, or as a binary library as I am now, catching all processes that call themselves.

For efficiency, I count function calls in the kernel context and only send the count to user space. E.g:

$ file hello
hello: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU / Linux 2.6.32, BuildID [sha1] = 4dc45f1eb023f44ddb32c15bbe0fb4f933e61815, not stripped
$ ls -lh hello
-rwxr-xr-x 1 bgregg root 29K Jan 12 21:18 hello
$ ldd hello
    linux-vdso.so.1 => (0x00007ffc4cb1a000)
    libgo.so.9 => /usr/lib/x86_64-linux-gnu/libgo.so.9 (0x00007f25f2407000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f25f21f1000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f25f1e27000)
    /lib64/ld-linux-x86-64.so.2 (0x0000560b44960000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f25f1c0a000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f25f1901000)
$ objdump -tT /usr/lib/x86_64-linux-gnu/libgo.so.9 | grep fmt.Println
0000000001221070 g O .data.rel.ro 0000000000000008 fmt.Println $ descriptor
0000000000978090 g F .text 0000000000000075 fmt.Println
0000000001221070 g DO .data.rel.ro 0000000000000008 Base fmt.Println $ descriptor
0000000000978090 g DF .text 0000000000000075 Base fmt.Println
This content looks very much like a compiled C binary, so it can be observed using many existing debugging tools and tracers including bcc / BPF. Much simpler than measuring just-in-time runtimes (such as Java and Node.js). So far, the only trouble with this example is that the function name may contain non-standard characters, such as ".".

funccount provides the -p option to match the process ID (PID), and the -i option to control the output frequency. It is currently capable of processing 1000 probe points at the same time, and works fine when it matches "fmt. *", But all functions that match libgo fail. There are still a lot of problems in bcc / BPF, we need to find other ways to deal with them.

# funccount 'go: *'
maximum of 1000 probes allowed, attempted 21178
Go gc function counting
Use the gc compiler in Go to implement the fmt function call count:

$ go build hello.go
$ ./hello
Hello, BPF!
# funccount '/home/bgregg/hello:fmt.*'
Tracing 78 functions for "/ home / bgregg / hello: fmt. *" ... Hit Ctrl-C to end.
^ C
FUNC COUNT
fmt.init.1 1
fmt. (* fmt) .padString 1
fmt. (* fmt) .truncate 1
fmt. (* fmt) .fmt_s 1
fmt.newPrinter 1
fmt. (* pp) .free 1
fmt.Fprintln 1
fmt.Println 1
fmt. (* pp) .fmtString 1
fmt. (* pp) .printArg 1
fmt. (* pp) .doPrint 1
fmt.glob.func1 1
fmt.init 1
Detaching ...
You can still trace fmt.Println (). This binary is different from libgo: it contains a 2M static library (instead of 29K for dynamic libraries). Another difference is that the function name contains more special characters, such as "*", "(", etc., I suspect that if the haul cannot be corrected, it will affect other debuggers (such as the bcc tracker).

$ file hello
hello: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, not stripped
$ ls -lh hello
-rwxr-xr-x 1 bgregg root 2.2M Jan 12 05:16 hello
$ ldd hello
    not a dynamic executable
$ objdump -t hello | grep fmt.Println
000000000045a680 g F .text 00000000000000e0 fmt.Println
Per-event invocations of a function
3.1 gccgo Function Tracing
Now I will try to use Sasha Goldshtein's tracing tool, also based on bcc, to view every function call event. I will return to gccgo, using a very simple example program (from the go tour), functions.go:

package main

import "fmt"

func add (x int, y int) int {
    return x + y
}

func main () {
    fmt.Println (add (42, 13))
}
Trace the add () function. All parameters are output on the right, and trace has other options (help -h), such as output timestamp and stack.

\ # trace '/home/bgregg/functions:main.add'
PID TID COMMFUNC
14424 14424 functions main.add

# ... and with both its arguments:

\ # trace '/home/bgregg/functions:main.add "% d% d" arg1, arg2'
PID TID COMM FUNC-
14390 14390 functions main.add 42 13
3.2 Go gc Function Tracing
The same program, if you use go build, there is no main.add ()? Disabling inlining is sufficient.

$ go build functions.go

# trace '/home/bgregg/functions:main.add "% d% d" arg1, arg2'
could not determine address of symbol main.add

$ objdump -t functions | grep main.add
$
$ go build -gcflags '-l' functions.go
$ objdump -t functions | grep main.add
0000000000401000 g F .text 0000000000000020 main.add

# trace '/home/bgregg/functions:main.add "% d% d" arg1, arg2'
PID TID COMM FUNC-
16061 16061 functions main.add 536912504 16
That's wrong. The parameters should be 42 and 13 instead of 536912504 and 16.
Use gdb to see the results:

$ gdb ./functions
[...]
warning: File "/usr/share/go-1.6/src/runtime/runtime-gdb.py" auto-loading has been declined
 by your 'auto-load safe-path' set to "$ debugdir: $ datadir / auto-load".
[...]
(gdb) b main.add
Breakpoint 1 at 0x401000: file /home/bgregg/functions.go, line 6.
(gdb) r
Starting program: / home / bgregg / functions
[New LWP 16082]
[New LWP 16083]
[New LWP 16084]
Thread 1 "functions" hit Breakpoint 1, main.add (x = 42, y = 13, ~ r2 = 4300314240) at
 /home/bgregg/functions.go:6
6 return x + y
(gdb) i r
rax 0xc820000180 859530330496
rbx 0x584ea0 5787296
rcx 0xc820000180 859530330496
rdx 0xc82005a048 859530698824
rsi 0x10 16
rdi 0xc82000a2a0 859530371744
rbp 0x0 0x0
rsp 0xc82003fed0 0xc82003fed0
r8 0x41 65
r9 0x41 65
r10 0x4d8ba0 5082016
r11 0x0 0
r12 0x10 16
r13 0x52a3c4 5415876
r14 0xa 10
r15 0x8 8
rip 0x401000 0x401000
eflags 0x206 [PF IF]
cs 0xe033 57395
ss 0xe02b 57387
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
The startup message contains a warning about runtime-gdb.py, which is very useful: if I need to dig deeper into the Go context, I want to be able to fix and find out the cause of the alert. Even without this information, gdb can still output the value of the parameter variable as "x = 42, y = 13". I also compare them from the register export with the x86_64 ABI (Application Binary Interface), which is how bcc's trace reads them. From the syscall (2) man page:

       arch / ABI arg1 arg2 arg3 arg4 arg5 arg6 arg7 Notes
       ──────────────────────────────────────────── ────────────────
[...]
       x86_64 rdi rsi rdx r10 r8 r9-
The reason is that Go's gc compiler is not following the standard AMD64 ABI function calling convention, which causes problems with this and other debuggers.

42 and 13 do not appear in rdi, rsi or any other register. The reason is that Go's gc compiler does not follow the standard AMD64 ABI function calling convention, and other debuggers have this problem. This is annoying. I guess the return value of the Go language uses another ABI, because it can return multiple values, so even if the entry parameters are standard, we still encounter differences. I went through the guide (Quick Guide to Go's Assembler and the Plan 9 assembly manual) and it looks like the function is passed on the stack. These are our 42 and 13:

(gdb) x / 3dg $ rsp
0xc82003fed0: 4198477 42
0xc82003fee0: 13
BPF can dig these out too. As a proof of concept, I just hacked in a couple of new aliases, "go1" and "go2" for those entry arguments:

BPF can also mine this information. To test this concept, I declared a new pair of aliases "go1" and "go2" for the entry parameters. Hope you read this article, I (or others) have added it to the bcc tracking tool, preferably "goarg1", "goarg2", etc.

# trace '/home/bgregg/functions:main.add "% d% d" go1, go2'
PID TID COMM FUNC-
17555 17555 functions main.add 42 13
Fourth, Interface Arguments
You can write a custom bcc / BPF program to mine. In order to process the interface parameters we can add multiple aliases to the bcc tracker. The input parameters are examples of interfaces:

func Println (a ... interface ()) (n int, err error) {
    return Fprintln (os.Stdout, a ...)
$ gdb ./hello
[...]
(gdb) b fmt.Println
Breakpoint 1 at 0x401c50
(gdb) r
Starting program: / home / bgregg / hello
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff449c700 (LWP 16836)]
[New Thread 0x7ffff3098700 (LWP 16837)]
[Switching to Thread 0x7ffff3098700 (LWP 16837)]
Thread 3 "hello" hit Breakpoint 1, fmt.Println (a = ...) at ../../../src/libgo/go/fmt/print.go:263
263 ../../../src/libgo/go/fmt/print.go: No such file or directory.
(gdb) p a
$ 1 = {__values = 0xc208000240, __count = 1, __capacity = 1}
(gdb) p a .__ values
$ 18 = (struct (...) *) 0xc208000240
(gdb) p a .__ values [0]
$ 20 = {__type_descriptor = 0x4037c0 <__ go_tdn_string>, __object = 0xc208000210}
(gdb) x / s * 0xc208000210
0x403483: "Hello, BPF!"
V. Function Latency
Example: Delayed Histogram (nanoseconds) of a fmt.Println () Function
WARNING: During a Go function call, if switching from one process (goroutine) to another system process, funcency cannot match the entry-return. This scenario requires a new tool-gofunclatency, which tracks delays based on the TID of Go's built-in GOID replacement system. In some cases, uretprobes may modify Go programs to crash, so you need to be thorough before debugging. plan of.

# funclatency 'go: fmt.Println'
Tracing 1 functions for "go: fmt.Println" ... Hit Ctrl-C to end.
^ C

Function = fmt.Println [3041]
     nsecs: count distribution
         0-> 1: 0 | |
         2-> 3: 0 | |
4-> 7: 0 | |
         8-> 15: 0 | |
        16-> 31: 0 | |
        32-> 63: 0 | |
        64-> 127: 0 | |
       128-> 255: 0 | |
       256-> 511: 0 | |
       512-> 1023: 0 | |
      1024-> 2047: 0 | |
      2048-> 4095: 0 | |
      4096-> 8191: 0 | |
      8192-> 16383: 27 | **************************************** |
     16384-> 32767: 3 | **** |
Detaching ...
Tips
6.1 Install and compile BCC
git clone https://github.com/iovisor/bcc.git
mkdir bcc / build; cd bcc / build
cmake -DCMAKE_INSTALL_PREFIX = / usr \
      -DLUAJIT_INCLUDE_DIR = `pkg-config --variable = includedir luajit` \ # for lua support
      ..
make
sudo make install
cmake -DPYTHON_CMD = python3 .. # build python3 binding
pushd src / python /
make
sudo make install
popd
6.2 Build LLVM and Clang development tool libraries
yum install gcc
yum install gcc-g ++

wget https://cmake.org/files/v3.9/cmake-3.9.0-rc4.tar.gz
tar -xvf cmake-3.9.0-rc4.tar.gz
cd cmake-3.9.0
./bootstrap
gmake
gmake install
export CMAKE_ROOT = / usr / local / share / cmake-3.9.0
export PATH = $ PATH: $ CMAKE_ROOT

git clone http://llvm.org/git/llvm.git
cd llvm / tools;
git clone http://llvm.org/git/clang.git
cd ..; mkdir -p build / install; cd build
cmake -G "Unix Makefiles" -DLLVM_TARGETS_TO_BUILD = "BPF; X86" -DCMAKE_BUILD_TYPE = Release -DCMAKE_INSTALL_PREFIX = $ PWD / install ..
make
make install
export PATH = $ PWD / install / bin: $ PATH
6.3 LLVM and Clang
LLVM (Low Level Virtual Machine) is a compiler infrastructure written in C ++. Originated in 2000 by the study of Vikram Adve and Chris Lattner at the University of Illinois at Urbana-Champaign, they wanted to create dynamic compilations for all static and dynamic languages technology. In 2005, Apple directly hired Chris Ratner and his team to develop applications for Apple Computer. During this period, Chris Ratner designed and invented the Swift language, and LLVM became part of Mac OS X and iOS development tools. The scope of LLVM is not limited to the creation of a virtual machine. It has become a collective name for many compilation tools and low-level tool technologies. It is applicable to all projects under LLVM, including LLVM intermediate code (LLVM IR), LLVM debugging tools, and LLVM C ++ standard. Library, etc.

LLVM currently supports languages including ActionScript, Ada, D, Fortran, GLSL, Haskell, Java bytecode, Objective-C, Swift, Python, Ruby, Rust, Scala1, and C #.

Clang is the front-end of the LLVM compiler toolset, the purpose is to output the Abstract Syntax Tree (AST) corresponding to the code, and compile the code into LLVM Bitcode. Then use LLVM on the back-end to compile into platform-dependent machine language. Clang supports C, C ++, Objective C. Its goal is to provide an alternative to GCC. The author is Chris Lattner, who developed with Apple's sponsorship. The Clang project includes Clang front-end and Clang static analyzer.

6.4 ABI
Application Binary Interface (ABI) describes a low-level interface between an application and the operating system or other applications. The ABI covers various details such as:

The size and layout of the data type;

Calling convention (controls how the parameters of a function are passed and how to accept the return value), for example, whether all parameters are passed through the stack or some parameters are passed through registers; which register is used for which function parameters; Whether the function parameters are pushed onto the stack first or last;

Binary format of object files, libraries, etc.

ABI vs API
The application program interface (API) defines the interface between the source code and the library, so the same code can be compiled in any system that supports this API. However, the ABI allows the compiled object code to be used in ABI-compatible systems without modification. Can run.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.