Diagnosing Software "Intractable Diseases" with truss, strace, and ltrace
Introduction
A process that will not start, software that runs slowly, a program that dies with a "Segmentation fault": these are headaches for every Unix user. This article demonstrates how to use three common debugging tools, truss, strace, and ltrace, to quickly diagnose such "intractable diseases" in software.
truss and strace trace the system calls a process makes and the signals it receives, while ltrace traces the library functions a process calls. truss is the older tool, developed for System V R4; most Unix systems, including AIX and FreeBSD, ship with it. strace was originally written for SunOS, and ltrace first appeared in Debian GNU/Linux. Both have since been ported to most Unix systems. Most Linux distributions ship their own strace and ltrace, and FreeBSD can install them through Ports.
You can debug a new program by starting it from the command line under one of these tools, or attach truss, strace, or ltrace to an existing PID to debug a program that is already running. Basic usage of the three tools is roughly the same. Only the three most commonly used command-line options are described here:
-f: in addition to the current process, also trace its child processes.
-o file: write the output to file instead of stderr.
-p pid: attach to the running process with the given pid. This option is often used to debug background processes.
These three options are enough for most debugging tasks. Here are some command-line examples:
truss -o ls.truss ls -al: traces the run of ls -al and writes the output to the file ls.truss.
strace -f -o vim.strace vim: traces the run of vim and its child processes, and writes the output to the file vim.strace.
ltrace -p 234: traces the running process whose pid is 234.
The output formats of the three debugging tools are similar. Take strace as an example:
- brk(0) = 0x8062aa8
- brk(0x8063000) = 0x8063000
- mmap2(NULL, 4096, PROT_READ, MAP_PRIVATE, 3, 0x92f) = 0x40016000
Each line is one system call. The name of the system call and its arguments are to the left of the equals sign, and the return value of the call is to the right. truss, strace, and ltrace work on similar principles: they all rely on the kernel's process-tracing facilities, typically the ptrace system call, to follow the process being debugged. The details are beyond the scope of this article; if you are interested, refer to their source code.
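To make the "name and arguments on the left, return value on the right" layout concrete, here is a minimal Python sketch that splits one trace line into its parts. The function name parse_call and the regular expression are illustrative assumptions, not part of any of the three tools, and the naive comma split will mis-handle string or structure arguments that themselves contain commas.

```python
import re

# Matches lines such as "brk(0x8063000) = 0x8063000".
# Tolerates a leading "- ", a "pid:" prefix, and a space before "(".
LINE_RE = re.compile(
    r"^(?:-\s*)?(?:\d+:\s*)?(?P<name>\w+)\s*\((?P<args>.*)\)\s*=\s*(?P<ret>\S+)"
)


def parse_call(line):
    """Split one trace line into (name, [args], return value), or None."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None  # not a "call(...) = value" line, e.g. "SIGNAL 11"
    # Naive split: good enough for simple numeric and flag arguments only.
    args = [a.strip() for a in m.group("args").split(",") if a.strip()]
    return m.group("name"), args, m.group("ret")


sample = [
    "brk(0) = 0x8062aa8",
    "brk(0x8063000) = 0x8063000",
    "mmap2(NULL, 4096, PROT_READ, MAP_PRIVATE, 3, 0x92f) = 0x40016000",
]

for line in sample:
    print(parse_call(line))
```

Lines that do not look like a call with a return value, such as signal notifications, come back as None, so a caller can skip them.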
The following two examples demonstrate how to use these debugging tools to diagnose software "intractable diseases":
Case 1: A segmentation fault occurs when running clint
Operating System: FreeBSD 5.2.1-RELEASE
clint is a static source code analysis tool for C++. After installing it through Ports, run:
- # clint foo.cpp
- Segmentation fault (core dumped)
Running into a "Segmentation fault" on Unix is just as annoying as the "illegal operation" dialog box that pops up in MS Windows. So let's use truss to take clint's "pulse":
- # truss -f -o clint.truss clint
- Segmentation fault (core dumped)
- # tail clint.truss
- 739: read(0x6,0x806f000,0x1000) = 4096 (0x1000)
- 739: fstat(6,0xbfbfe4d0) = 0 (0x0)
- 739: fcntl(0x6,0x3,0x0) = 4 (0x4)
- 739: fcntl(0x6,0x4,0x0) = 0 (0x0)
- 739: close(6) = 0 (0x0)
- 739: stat("/root/.clint/plugins",0xbfbfe680) ERR#2 'No such file or directory'
- SIGNAL 11
- SIGNAL 11
- Process stopped because of: 16
- Process exit, rval = 139
We use truss to trace clint's system calls and write the results to the file clint.truss, then use tail to view the last few lines.
Look at the last system call clint executed (the fifth line from the bottom): stat("/root/.clint/plugins",0xbfbfe680) ERR#2 'No such file or directory'. This is where the problem lies: clint cannot find the directory "/root/.clint/plugins", which leads to the segmentation fault. How can this be fixed? Simple: mkdir -p /root/.clint/plugins. However, clint still dies with a "Segmentation fault" after this. Tracing with truss again shows that clint also needs the directory "/root/.clint/plugins/python". Once that directory is created, clint finally runs normally.
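The manual step in this case, finding the last failed system call before the signal, can be sketched in Python as follows. This is a minimal illustration assuming truss-style output where failures are marked with ERR#n and signals with a SIGNAL line; the function name last_error_before_signal is hypothetical. A failed call right before a crash is only a hint, not proof of the cause.

```python
import re

# truss marks a failed call with: ERR#<errno> '<message>'
ERR_RE = re.compile(r"ERR\s*#(\d+)\s*'([^']*)'")


def last_error_before_signal(lines):
    """Return the last line reporting ERR# before the first SIGNAL line.

    Returns None if no call failed before the signal, or if the trace
    contains no SIGNAL line at all.
    """
    last_err = None
    for line in lines:
        text = line.strip().lstrip("- ").strip()
        if text.startswith("SIGNAL"):
            return last_err
        if ERR_RE.search(text):
            last_err = text
    return None


# Tail of the trace from this case, embedded so the sketch is self-contained.
tail = """\
739: close(6) = 0 (0x0)
739: stat("/root/.clint/plugins",0xbfbfe680) ERR#2 'No such file or directory'
SIGNAL 11
SIGNAL 11
Process stopped because of: 16
Process exit, rval = 139
""".splitlines()

print(last_error_before_signal(tail))
```

On a real trace you would read the lines from the file given to -o instead of the embedded sample.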
Case 2: vim starts up noticeably slowly
Operating System: FreeBSD 5.2.1-RELEASE
The vim version is 6.2.154. After running vim from the command line, it takes nearly half a minute to reach the editing interface, with no error output at all. I carefully checked .vimrc and all the vim scripts and found no misconfiguration, and I could not find a solution to a similar problem on the Internet. Do I really have to hack the source code? No need; truss can locate the problem:
- # truss -f -D -o vim.truss vim
Here the -D option adds a relative timestamp in front of each output line: the time elapsed since the previous recorded event, which in effect shows how long each system call took. We only need to look for system calls that take a long time. Going carefully through the output file vim.truss with less, we soon find the culprit:
- 735: 0.000021511 socket(0x0) = 4 (0x4)
- 735: 0.000014248 setsockopt(0x6,0x1,0xbfbfe3c8,0x4) = 0 (0x0)
- 735: 0.000013688 setsockopt(0x4,0xffff,0x8,0xbfbfe2ec,0x4) = 0 (0x0)
- 735: 0.000203657 connect(0x4,{AF_INET 10.57.18.27:6000},16) ERR#61 'Connection refused'
- 735: 0.000017042 close(4) = 0 (0x0)
- 735: 1.009366553 nanosleep(0xbfbfe468,0xbfbfe460) = 0 (0x0)
- 735: 0.000019556 socket(0x0) = 4 (0x4)
- 735: 0.000013409 setsockopt(0x6,0x1,0xbfbfe3c8,0x4) = 0 (0x0)
- 735: 0.000013130 setsockopt(0x4,0xffff,0x8,0xbfbfe2ec,0x4) = 0 (0x0)
- 735: 0.000272102 connect(0x4,{AF_INET 10.57.18.27:6000},16) ERR#61 'Connection refused'
- 735: 0.000015924 close(4) = 0 (0x0)
- 735: 1.009338338 nanosleep(0xbfbfe468,0xbfbfe460) = 0 (0x0)
vim tries to connect to port 6000 of host 10.57.18.27 (the connect() on the fourth line). After the connection fails, it sleeps for about one second (the nanosleep() on the sixth line) and then tries again. The fragment above appears more than a dozen times, and each repetition takes more than one second, which is why vim starts so slowly.
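Instead of paging through the file with less, the search for slow calls can be sketched in a few lines of Python. This is a minimal illustration assuming the "pid: delta syscall(...)" layout shown above; the function name slow_calls and the 0.5 second threshold are arbitrary choices for the example.

```python
import re

# "pid: delta syscall(...)", e.g. "735: 1.009366553 nanosleep(...) = 0 (0x0)"
TIMED_RE = re.compile(r"^(?:-\s*)?(\d+):\s+(\d+\.\d+)\s+(\w+)")


def slow_calls(lines, threshold=0.5):
    """Return (delta_seconds, syscall_name) for lines whose delta exceeds threshold."""
    hits = []
    for line in lines:
        m = TIMED_RE.match(line.strip())
        if m and float(m.group(2)) > threshold:
            hits.append((float(m.group(2)), m.group(3)))
    return hits


# A few lines from the trace above, embedded so the sketch is self-contained.
sample = [
    "735: 0.000203657 connect(0x4,{AF_INET 10.57.18.27:6000},16) ERR#61 'Connection refused'",
    "735: 0.000017042 close(4) = 0 (0x0)",
    "735: 1.009366553 nanosleep(0xbfbfe468,0xbfbfe460) = 0 (0x0)",
]

for delta, name in slow_calls(sample):
    print(f"{delta:.3f}s  {name}")
```

Once the slow calls are listed, the lines just before each one in the trace (here the refused connect()) usually explain why the program is waiting.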
You may be puzzled: "Why would vim connect to port 6000 on another machine for no reason?" Think about which service listens on port 6000: the X server. So vim is trying to send its output to a remote X server, which means the shell must be defining the DISPLAY variable. Looking at .cshrc, there it is: setenv DISPLAY ${REMOTEHOST}:0. Comment that line out and log in again, and the problem is solved.
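The link between the DISPLAY value and port 6000 follows from the X11 convention that display number N is reached over TCP on port 6000 + N. The short Python sketch below illustrates that mapping; the function name x11_tcp_port is hypothetical, and it assumes a DISPLAY value of the simple form host:N or host:N.S.

```python
def x11_tcp_port(display):
    """Return (host, port) for a DISPLAY value of the form host:N or host:N.S.

    An empty host (as in ":0") means the local machine, where X clients
    normally use a local socket rather than TCP.
    """
    host, sep, rest = display.rpartition(":")
    if not sep:
        raise ValueError(f"no display number in {display!r}")
    number = int(rest.split(".")[0])  # drop the optional screen suffix
    return host, 6000 + number


# With REMOTEHOST set to 10.57.18.27, the .cshrc line yields this value.
print(x11_tcp_port("10.57.18.27:0"))
```

This matches the trace: with DISPLAY pointing at display 0 on the remote host, the client tries port 6000 there.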