Golang and system calls

Source: Internet
Author: User

A talk at GopherCon 2017 explains how to implement a simple strace in Go, and this article is based on that presentation.

What is a system call

First, look at the definition from Wikipedia:

In computing, a system call is the programmatic way in which a computer program requests a service from the kernel of the operating system it is executed on. This may include hardware-related services (for example, accessing a hard disk drive), creation and execution of new processes, and communication with integral kernel services such as process scheduling. System calls provide an essential interface between a process and the operating system.

In other words, a system call is the way a program requests a service from the operating system kernel, such as a hardware-related service (for example, accessing a hard disk) or the creation of a new process. System calls are the interface between a process and the operating system.

Syscall everywhere

Any program that runs on an operating system ends up making system calls. Take the most common example: fmt.Println("hello world") ultimately invokes the write system call. Let's look at the source.

    // package fmt
    func Fprintln(w io.Writer, a ...interface{}) (n int, err error) {
        p := newPrinter()
        p.doPrintln(a)
        // w is stdout here
        n, err = w.Write(p.buf)
        p.free()
        return
    }

    // package os
    Stdout = NewFile(uintptr(syscall.Stdout), "/dev/stdout")

    func (f *File) write(b []byte) (n int, err error) {
        if len(b) == 0 {
            return 0, nil
        }
        // The actual write simply calls syscall.Write()
        return fixCount(syscall.Write(f.fd, b))
    }

Zero-copy

Another example: we often hear about zero-copy. Let's see what problem it is meant to solve.

    read(file, tmp_buf, len);
    write(socket, tmp_buf, len);

The following steps (borrowed from a diagram of the process) illustrate the problem:

    1. In the first step, read() causes a context switch from user mode to kernel mode, and the DMA (direct memory access) engine reads the content from the disk into a buffer in the kernel address space.
    2. In the second step, the data is copied from the kernel buffer into the user buffer, read() returns, and the context switches back to user mode.
    3. In the third step, write() causes another context switch into kernel mode, and the data is copied from the user buffer into a kernel buffer again.
    4. In the fourth step, write() returns, which is the fourth context switch. The DMA engine then passes the data from the kernel buffer to the protocol engine, where it usually enters a queue and waits for transmission.

We can see that the data is copied back and forth between user space and kernel space, which is unnecessary.

The solutions are mmap and sendfile. For details, you can refer to this article.

By now we should have a basic understanding of system calls.

Strace

strace is a tool for viewing the system calls made by a process. It is generally used as follows:

    strace <bin>
    strace -p <pid>
    // -c counts the number of system calls
    strace -c <bin>
    // for example
    strace -c echo hello
    hello
    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
      0.00    0.000000           0         1           read
      0.00    0.000000           0         1           write
      0.00    0.000000           0         3           open
      0.00    0.000000           0         5           close
      0.00    0.000000           0         4           fstat
      0.00    0.000000           0         7           mmap
      0.00    0.000000           0         4           mprotect
      0.00    0.000000           0         1           munmap
      0.00    0.000000           0         3           brk
      0.00    0.000000           0         3         3 access
      0.00    0.000000           0         1           execve
      0.00    0.000000           0         1           arch_prctl
    ------ ----------- ----------- --------- --------- ----------------
    100.00    0.000000                    34         3 total

strace is implemented on top of the ptrace system call, so let's see what ptrace is.

Ptrace

The man page is described as follows:

The ptrace() system call provides a means by which one process (the "tracer") may observe and control the execution of another process (the "tracee"), and examine and change the tracee's memory and registers. It is primarily used to implement breakpoint debugging and system call tracing.

In simple terms, it has three main capabilities:

    • Tracing system calls
    • Reading and writing memory and registers
    • Delivering signals to the traced program

Interface

    int ptrace(int request, pid_t pid, caddr_t addr, int data);

    request values include:
        PTRACE_ATTACH
        PTRACE_SYSCALL
        PTRACE_PEEKTEXT, PTRACE_PEEKDATA
        and others

The tracer uses the PTRACE_ATTACH request to specify the PID of the process to be traced, and then issues PTRACE_SYSCALL.
The tracee runs until it reaches a system call, at which point the kernel stops it. The tracer, which is waiting on the tracee, is notified that the tracee has stopped with SIGTRAP, and it can then print the tracee's memory and registers.

Next, the tracer issues PTRACE_SYSCALL again, and the tracee continues executing until it exits the current system call.
It is important to note that the tracer is notified twice per system call: once when the tracee enters it and once when the tracee exits it.

Mystrace

With this background, the presenter implemented a Go version of strace. It needs to be compiled for Linux on amd64.
GitHub

strace.go

    package main

    import (
        "fmt"
        "os"
        "os/exec"
        "runtime"
        "syscall"
    )

    func main() {
        // ptrace requests must come from the OS thread that started the
        // tracee, so pin this goroutine to its thread.
        runtime.LockOSThread()

        var err error
        var regs syscall.PtraceRegs
        var ss syscallCounter
        ss = ss.init()

        fmt.Println("Run:", os.Args[1:])

        cmd := exec.Command(os.Args[1], os.Args[2:]...)
        cmd.Stderr = os.Stderr
        cmd.Stdout = os.Stdout
        cmd.Stdin = os.Stdin
        // Start the child as a tracee; it stops with SIGTRAP at its first exec.
        cmd.SysProcAttr = &syscall.SysProcAttr{
            Ptrace: true,
        }

        cmd.Start()
        // Wait returns an error here because the child has stopped, not exited.
        err = cmd.Wait()
        if err != nil {
            fmt.Printf("Wait err %v\n", err)
        }

        pid := cmd.Process.Pid
        exit := true

        for {
            // PTRACE_SYSCALL pauses the tracee both on entering and on exiting
            // a syscall, so a flag makes sure rax is printed only once per call.
            if exit {
                err = syscall.PtraceGetRegs(pid, &regs)
                if err != nil {
                    break
                }
                // fmt.Printf("%#v\n", regs)
                name := ss.getName(regs.Orig_rax)
                fmt.Printf("name: %s, id: %d\n", name, regs.Orig_rax)
                ss.inc(regs.Orig_rax)
            }

            // The PTRACE_SYSCALL request mentioned above.
            err = syscall.PtraceSyscall(pid, 0)
            if err != nil {
                panic(err)
            }

            // Wait for the tracee to reach its next stop. Without this wait,
            // a large number of repeated syscall names would be printed.
            _, err = syscall.Wait4(pid, nil, 0, nil)
            if err != nil {
                panic(err)
            }

            exit = !exit
        }

        ss.print()
    }

The counter used for the statistics, syscallcounter.go:

    package main

    import (
        "fmt"
        "os"
        "text/tabwriter"

        "github.com/seccomp/libseccomp-golang"
    )

    // syscallCounter holds one call count per syscall number.
    type syscallCounter []int

    const maxSyscalls = 303

    func (s syscallCounter) init() syscallCounter {
        s = make(syscallCounter, maxSyscalls)
        return s
    }

    func (s syscallCounter) inc(syscallID uint64) error {
        // Valid indexes are 0..maxSyscalls-1, so reject maxSyscalls itself too.
        if syscallID >= maxSyscalls {
            return fmt.Errorf("invalid syscall ID (%x)", syscallID)
        }

        s[syscallID]++
        return nil
    }

    // print writes a "count|name" line for every syscall seen at least once.
    func (s syscallCounter) print() {
        w := tabwriter.NewWriter(os.Stdout, 0, 0, 8, ' ', tabwriter.AlignRight|tabwriter.Debug)
        for k, v := range s {
            if v > 0 {
                name, _ := seccomp.ScmpSyscall(k).GetName()
                fmt.Fprintf(w, "%d\t%s\n", v, name)
            }
        }
        w.Flush()
    }

    // getName resolves a syscall number to its name through libseccomp.
    func (s syscallCounter) getName(syscallID uint64) string {
        name, _ := seccomp.ScmpSyscall(syscallID).GetName()
        return name
    }

Final Result:

    Run:  [echo hello]
    Wait err stop signal: trace/breakpoint trap
    name: execve, id: 59
    name: brk, id: 12
    name: access, id: 21
    name: mmap, id: 9
    name: access, id: 21
    name: open, id: 2
    name: fstat, id: 5
    name: mmap, id: 9
    name: close, id: 3
    name: access, id: 21
    name: open, id: 2
    name: read, id: 0
    name: fstat, id: 5
    name: mmap, id: 9
    name: mprotect, id: 10
    name: mmap, id: 9
    name: mmap, id: 9
    name: close, id: 3
    name: mmap, id: 9
    name: arch_prctl, id: 158
    name: mprotect, id: 10
    name: mprotect, id: 10
    name: mprotect, id: 10
    name: munmap, id: 11
    name: brk, id: 12
    name: brk, id: 12
    name: open, id: 2
    name: fstat, id: 5
    name: mmap, id: 9
    name: close, id: 3
    name: fstat, id: 5
    hello
    name: write, id: 1
    name: close, id: 3
    name: close, id: 3
            1|read
            1|write
            3|open
            5|close
            4|fstat
            7|mmap
            4|mprotect
            1|munmap
            3|brk
            3|access
            1|execve
            1|arch_prctl

Comparing the results, we can see that the counts match the output of strace.

Presenter GitHub
YouTube video
