Linux Problem Diagnostic Tool: strace

Source: Internet
Author: User

Introduction

"Oops, the system has hung ..."

"Oops, the program has crashed ..."

"Oops, the command failed with an error ..."

For operations and maintenance staff, such tragedies play out every day. Ideally, the system or application error log provides enough information for the maintainer to locate the cause of a problem quickly by reading the relevant entries. In reality, many error messages are ambiguous: they describe the symptom at the time of the error (such as "could not open file" or "Connect to XXX times out") rather than its cause.

When the error log is not enough to pinpoint a problem, we can analyze it at a deeper level. A program or command interacts with the operating system through system calls, so by examining those calls, their arguments, and their return values, we can narrow down the scope of the error and often find its root cause.

On Linux, strace is such a tool. It traces the system calls a program makes and the signals it receives while running, which helps us analyze abnormal behavior in a program or command.

A Simple Example

How do we trace a program with strace, and how do we read the output? The following example illustrates both.

1. Sample Program to Trace

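main.c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int main()
{
    int fd;
    int i = 0;
    /* Try to open /tmp/foo read-only; this is the program's only explicit system call. */
    fd = open("/tmp/foo", O_RDONLY);
    if (fd < 0)
        i = 5;    /* open failed */
    else
        i = 2;    /* open succeeded */
    return i;     /* becomes the process exit code */
}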

The program above tries to open /tmp/foo read-only and then exits; open is the only system call it makes explicitly. Next we compile it into an executable:

lx@lx:~$ gcc main.c -o main

 

2. strace Output

With the following command we trace the program above with strace and write the results to the file main.strace:

lx@lx:~$ strace -o main.strace ./main

Next, let's look at the contents of main.strace:

lx@lx:~$ cat main.strace
1 execve("./main", ["./main"], [/* vars */]) = 0
2 brk(0) = 0x9ac4000
3 access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
4 mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7739000
5 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
6 open("/etc/ld.so.cache", O_RDONLY) = 3
7 fstat64(3, {st_mode=S_IFREG|0644, st_size=80682, ...}) = 0
8 mmap2(NULL, 80682, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7725000
9 close(3) = 0
10 access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
11 open("/lib/i386-linux-gnu/libc.so.6", O_RDONLY) = 3
12 read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\220o\1\0004\0\0\0"..., 512) = 512
13 fstat64(3, {st_mode=S_IFREG|0755, st_size=1434180, ...}) = 0
14 mmap2(NULL, 1444360, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x56d000
15 mprotect(0x6c7000, 4096, PROT_NONE) = 0
16 mmap2(0x6c8000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x15a) = 0x6c8000
17 mmap2(0x6cb000, 10760, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x6cb000
18 close(3) = 0
19 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7724000
20 set_thread_area({entry_number:-1 -> 6, base_addr:0xb77248d0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
21 mprotect(0x6c8000, 8192, PROT_READ) = 0
22 mprotect(0x8049000, 4096, PROT_READ) = 0
23 mprotect(0x4b0000, 4096, PROT_READ) = 0
24 munmap(0xb7725000, 80682) = 0
25 open("/tmp/foo", O_RDONLY) = -1 ENOENT (No such file or directory)
26 exit_group(5) = ?

The line numbers at the left are added for ease of explanation and are not part of the strace output.

Does this pile of output look intimidating? Don't worry, let's analyze it.

strace records the system calls made when the traced program interacts with the system. Each line above corresponds to one system call, in the form:

system call name(arguments ...) = return value  error flag and description
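
To make the line format above concrete, here is a minimal C sketch that splits one strace line into the system call name and its numeric return value. The struct strace_line and the function parse_strace_line are hypothetical names introduced for illustration; they are not part of strace, and the sketch assumes the simple one-line form shown above (no "<unfinished ...>" lines).

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record for one parsed strace line. */
struct strace_line {
    char name[32];     /* system call name, e.g. "open" */
    long long ret;     /* numeric return value, e.g. -1 or 0x9ac4000 */
};

/* Parse "name(args ...) = ret ..." into *out.
 * Returns 0 on success, -1 if the line does not match or the
 * return value is not numeric (for example "= ?"). */
int parse_strace_line(const char *line, struct strace_line *out)
{
    /* Everything before the first '(' is the system call name. */
    const char *paren = strchr(line, '(');
    if (paren == NULL || paren == line)
        return -1;
    size_t n = (size_t)(paren - line);
    if (n >= sizeof out->name)
        return -1;
    memcpy(out->name, line, n);
    out->name[n] = '\0';

    /* The return value follows the last ") = " on the line. */
    const char *eq = NULL;
    for (const char *p = paren; (p = strstr(p, ") = ")) != NULL; p++)
        eq = p;
    if (eq == NULL)
        return -1;

    /* Base 0 lets strtoll accept both decimal and 0x-prefixed values. */
    char *end;
    out->ret = strtoll(eq + 4, &end, 0);
    return (end == eq + 4) ? -1 : 0;
}
```

Anything after the return value, such as "ENOENT (No such file or directory)", is the error flag and description; this sketch simply ignores it.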

Line 1: execve (or another call in the exec family) is the first system call in the strace output for a program started from the command line. strace first calls fork or clone to create a child process, then calls exec in the child to load the program to be run (here ./main).

Line 2: brk is called with 0 as its argument; the return value is the current program break, i.e. the address where the heap begins (if the child later calls malloc, heap space is allocated starting from 0x9ac4000).

Line 3: access checks whether /etc/ld.so.nohwcap exists.

Line 4: mmap2 creates an anonymous memory mapping to obtain 8192 bytes of memory, starting at address 0xb7739000.

Line 6: open opens the /etc/ld.so.cache file and returns file descriptor 3.

Line 7: fstat64 retrieves information about the /etc/ld.so.cache file.

Line 8: mmap2 maps the /etc/ld.so.cache file into memory.

Line 9: close closes file descriptor 3, which refers to the /etc/ld.so.cache file.

Line 12: read reads 512 bytes from the libc library file /lib/i386-linux-gnu/libc.so.6, i.e. the ELF header information.

Line 15: mprotect sets the protection of the 4096 bytes starting at 0x6c7000 (PROT_NONE means no access is allowed; PROT_READ means the memory can be read).

Line 24: munmap removes the mapping of the /etc/ld.so.cache file from memory, matching the mmap2 on line 8.

Line 25: the open call that tries to open /tmp/foo, the only system call that appears in the source code.

Line 26: the child process ends with exit code 5. (Why 5? Go back to the sample program above and look at the source code.)
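
Lines 1 and 26 of the walk-through above describe a child process that is created, replaced by the target program through exec, and finally exits with a status code. The following minimal C sketch shows that fork, execve, and waitpid sequence. run_child is a hypothetical helper, not strace's actual code, and it leaves out the ptrace calls that strace additionally uses to observe the child.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program at `path` in a child process and
 * return its exit status, or -1 if the child could not be created or did
 * not exit normally. A status of 127 means execve itself failed. */
int run_child(const char *path)
{
    pid_t pid = fork();                 /* create the child process */
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: replace the process image with the target program. */
        char *const argv[] = { (char *)path, NULL };
        char *const envp[] = { NULL };
        execve(path, argv, envp);
        _exit(127);                     /* only reached if execve failed */
    }

    /* Parent: wait for the child and decode its exit status. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Assuming the sample program has been built as ./main and /tmp/foo does not exist, run_child("./main") would be expected to return 5, the same value strace shows in exit_group(5).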

 

3. Output Analysis

Phew! After looking at so many system calls, are you feeling a little lost? Let's step back, look at the big picture, and return to strace itself.

The output above shows that only one system call, open (line 25), corresponds to the source code. Almost all of the others are process initialization: loading the program to be executed, loading the libc library, setting up memory mappings, and so on.
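
The library-loading steps follow a recurring pattern in the trace (lines 6 to 9 and line 24): open a file, query its size with fstat, map it with mmap, close the descriptor, and later unmap it. Below is a minimal C sketch of that pattern. first_byte_via_mmap is a hypothetical helper written for illustration and is not the dynamic linker's code; on the 32-bit system traced above, the same calls appear as fstat64 and mmap2.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: return the first byte of the file at `path` by
 * mapping it read-only, or -1 on any failure (including an empty file). */
int first_byte_via_mmap(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }

    size_t len = (size_t)st.st_size;
    unsigned char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);              /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return -1;

    int first = p[0];       /* read the file contents through memory */
    munmap(p, len);         /* counterpart of munmap on line 24 */
    return first;
}
```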

The if statement and the other code in the source do not appear in the strace output because they do not trigger system calls. strace only sees the interaction between the program and the system, so it is not suited to debugging or analyzing a program's internal logic.
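
As a rough illustration of the boundary between program logic and system calls, the sketch below repeats the sample program's logic in a function. Only the open call (and close, added here so the descriptor is not leaked) would show up in strace; the if/else runs entirely in user space. classify_open is a hypothetical name, and saved_errno is an illustrative out-parameter that exposes the same error code strace prints as ENOENT.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper mirroring the sample program: returns 5 if `path`
 * cannot be opened read-only and 2 if it can. The errno value from a
 * failed open is stored in *saved_errno (0 on success). */
int classify_open(const char *path, int *saved_errno)
{
    int fd = open(path, O_RDONLY);   /* system call: visible to strace */
    if (fd < 0) {                    /* plain user-space logic: invisible */
        *saved_errno = errno;
        return 5;
    }
    *saved_errno = 0;
    close(fd);                       /* system call: visible to strace */
    return 2;
}
```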

Linux has hundreds of system calls, and the handful in the strace output above is only the tip of the iceberg. To learn more about a particular system call, consult the man pages:

man 2 system-call-name
man ld.so    // manual page for the Linux dynamic linker

strace Common Options

This section describes several frequently used strace options and when each is appropriate.

1. Tracking Child Processes

By default, strace traces only the specified process, not the child processes it creates. With the -f option, strace also traces newly created child processes and prints the PID of the corresponding process in the output:

mprotect(0x5b1000, 4096, PROT_READ) = 0
munmap(0xb77fc000, 80682) = 0
clone(Process 13600 attached
child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb77fb938) = 13600
[pid 13599] fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
[pid 13600] fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
[pid 13599] mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
[pid 13600] mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb780f000
...

When tracing multi-process programs, commands, and scripts with strace, the -f option is usually turned on.

 

2. Recording System Call Times

strace can also record timing information for each system call. The relevant options are -r, -t, -tt, -ttt, and -T, which record time as follows:

-T: the time spent in each system call, accurate to microseconds

-r: a relative timestamp on entry to each system call, i.e. the time elapsed since the start of the previous call (the first call, usually execve, shows 0), accurate to microseconds

-t: time of day as hours:minutes:seconds

-tt: hours:minutes:seconds.microseconds

-ttt: seconds.microseconds since the epoch

The most commonly used is -T, because it gives the time each system call takes. The times recorded by the other options include both system call time and user-level code execution time, so they are less useful as a reference. Some of the time options can be combined, for example:

strace -Tr ./main
0.000000 execve("./main", ["main"], [/* vars */]) = 0
0.000931 fcntl64(0, F_GETFD) = 0 <0.000012>
0.000090 fcntl64(1, F_GETFD) = 0 <0.000022>
0.000060 fcntl64(2, F_GETFD) = 0 <0.000012>
0.000054 uname({sys="Linux", node="ion", ...}) = 0 <0.000014>
0.000307 geteuid32() = 7903 <0.000011>
0.000040 getuid32() = 7903 <0.000012>
0.000039 getegid32() = <0.000011>
0.000039 getgid32() = <0.000011>
...

The leftmost column is the output of the -r option, and the rightmost column (in angle brackets) is the output of the -T option.

 
