Introduction
Programs that will not start, software that runs slowly, and programs that die with a "Segmentation fault" are headaches for every Unix user. This article uses three practical cases to show how three commonly used debugging tools, truss, strace, and ltrace, can quickly diagnose such "intractable diseases" in software.
truss and strace trace the system calls made by a process and the signals it receives, while ltrace traces the library functions a process calls. truss is a debugging program originally developed for System V R4; most Unix systems, including AIX and FreeBSD, ship with it. strace was originally written for SunOS, and ltrace first appeared in Debian GNU/Linux. Both have since been ported to most Unix systems: most Linux distributions ship with strace and ltrace, and FreeBSD can install them through ports.
You can use these tools not only to debug a program started from the command line, but also to attach truss, strace, or ltrace to an existing PID in order to debug a running program. The basic usage of the three tools is roughly the same. Only the three most commonly used command-line options, which all three tools share, are described here:
    -f: in addition to the current process, also trace its child processes.
    -o file: write the output to the given file instead of stderr.
    -p pid: attach to the running process with the given PID. This option is often used to debug background processes.
These three options are enough to complete most debugging tasks. Here are a few command-line examples:
    truss -o ls.truss ls -al: traces the run of ls -al and writes the output to the file ls.truss.
    strace -f -o vim.strace vim: traces the run of vim and its child processes and writes the output to the file vim.strace.
    ltrace -p 234: traces the running process whose PID is 234.
The output formats of the three debugging tools are similar. Take strace as an example:
    brk(0) = 0x8062aa8
    brk(0x8063000) = 0x8063000
    mmap2(NULL, 4096, PROT_READ, MAP_PRIVATE, 3, 0x92f) = 0x40016000
Each line is one system call. The name of the system call and its arguments are to the left of the equals sign, and the return value of the call is to the right.
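To make the layout concrete, the following is a minimal sketch of splitting one trace line into its call and its return value. The helper name split_trace_line is hypothetical, and the sketch assumes the simple "call = result" shape shown above; real trace output also contains lines (signals, exit messages) that have no equals sign, which the helper rejects.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: split "call(args) = result" at the LAST " = ",
 * since arguments may themselves contain an equals sign.
 * Returns 0 on success, -1 if the line has no " = " separator. */
int split_trace_line(const char *line, char *call, size_t call_len,
                     char *ret, size_t ret_len)
{
    const char *sep = NULL;
    const char *p = line;

    if (call_len == 0 || ret_len == 0)
        return -1;

    /* Remember the last occurrence of the separator. */
    while ((p = strstr(p, " = ")) != NULL) {
        sep = p;
        p += 3;
    }
    if (sep == NULL)
        return -1;

    /* Left side: the call, with any padding spaces trimmed. */
    size_t n = (size_t)(sep - line);
    while (n > 0 && line[n - 1] == ' ')
        n--;
    if (n >= call_len)
        n = call_len - 1;
    memcpy(call, line, n);
    call[n] = '\0';

    /* Right side: the return value, up to the end of the line. */
    size_t m = strcspn(sep + 3, "\n");
    if (m >= ret_len)
        m = ret_len - 1;
    memcpy(ret, sep + 3, m);
    ret[m] = '\0';
    return 0;
}
```

Called on the second line of the example output, the helper stores brk(0x8063000) in call and 0x8063000 in ret.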
truss, strace, and ltrace work in similar ways: they build on the tracing facilities the kernel provides, typically the ptrace system call, to follow the process being debugged. The detailed principles are beyond the scope of this article; if you are interested, refer to the source code of the tools.
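As a rough illustration of the ptrace mechanism, here is a minimal sketch that runs a command under ptrace and counts the system calls it enters. It assumes Linux; the function name count_syscalls is hypothetical; and it is far simpler than what strace actually does (no argument decoding, no child following). On x86-64 it also prints each system call number to stderr.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] under ptrace and return the number of
 * system calls the child entered, or -1 on error. Linux only. */
long count_syscalls(char *const argv[])
{
    int status;
    pid_t child = fork();

    if (child < 0)
        return -1;
    if (child == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL); /* ask to be traced by the parent */
        raise(SIGSTOP);                        /* pause until the parent is ready */
        execvp(argv[0], argv);
        _exit(127);                            /* only reached if exec failed */
    }

    /* Wait for the initial SIGSTOP, then ask the kernel to flag
     * syscall stops with bit 0x80 so they can be told from signals. */
    if (waitpid(child, &status, 0) < 0 || !WIFSTOPPED(status))
        return -1;
    if (ptrace(PTRACE_SETOPTIONS, child, NULL,
               (void *)(long)PTRACE_O_TRACESYSGOOD) < 0) {
        kill(child, SIGKILL);
        return -1;
    }

    long entered = 0;
    int in_syscall = 0; /* toggles between syscall entry and exit stops */
    int sig = 0;        /* signal to pass on when resuming the child */

    for (;;) {
        /* Resume the child until its next syscall entry or exit. */
        if (ptrace(PTRACE_SYSCALL, child, NULL, (void *)(long)sig) < 0) {
            kill(child, SIGKILL);
            return -1;
        }
        if (waitpid(child, &status, 0) < 0)
            return -1;
        if (WIFEXITED(status) || WIFSIGNALED(status))
            break;

        sig = 0;
        if (WSTOPSIG(status) == (SIGTRAP | 0x80)) {
            in_syscall = !in_syscall;
            if (in_syscall) {
                entered++;
#if defined(__x86_64__)
                struct user_regs_struct regs;
                if (ptrace(PTRACE_GETREGS, child, NULL, &regs) == 0)
                    fprintf(stderr, "syscall %lld\n", (long long)regs.orig_rax);
#endif
            }
        } else if (WSTOPSIG(status) != SIGTRAP) {
            sig = WSTOPSIG(status); /* deliver ordinary signals to the child */
        }
    }
    return entered;
}
```

Passing an argument vector such as { "ls", "-al", NULL } traces ls -al in much the same spirit as the strace command-line examples earlier, except that only the count is returned.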
The following cases demonstrate how to use these three debugging tools to diagnose the "intractable diseases" of software:
Case 1: A segmentation fault occurs when running clint
Operating system: FreeBSD-5.2.1-release
clint is a static source code analysis tool for C++. After installing it through ports, run:
    # clint foo.cpp
    Segmentation fault (core dumped)
Seeing "Segmentation fault" on a Unix system is as annoying as the "illegal operation" dialog box popping up in MS Windows. OK, let's use truss to take clint's pulse:
    # truss -f -o clint.truss clint
    Segmentation fault (core dumped)
    # tail clint.truss
      739: read(0x6,0x806f000,0x1000) = 4096 (0x1000)
      739: fstat(6,0xbfbfe4d0) = 0 (0x0)
      739: fcntl(0x6,0x3,0x0) = 4 (0x4)
      739: fcntl(0x6,0x4,0x0) = 0 (0x0)
      739: close(6) = 0 (0x0)
      739: stat("/root/.clint/plugins",0xbfbfe680) ERR#2 'No such file or directory'
    SIGNAL 11
    SIGNAL 11
    Process stopped because of: 16
    process exit, rval = 139
We use truss to trace the system calls clint makes, write the results to the file clint.truss, and then use tail to view the last few lines. Note the last system call clint executed (the fifth line from the bottom): stat("/root/.clint/plugins",0xbfbfe680) ERR#2 'No such file or directory'. The problem lies here: clint cannot find the directory "/root/.clint/plugins", and this leads to the segmentation fault. How can it be fixed? Simple: mkdir -p /root/.clint/plugins. However, clint still dies with "Segmentation fault" after this. Tracing again with truss shows that clint also needs the directory "/root/.clint/plugins/python". Once this directory is created, clint finally runs properly.
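ERR#2 in the truss output is errno 2, ENOENT. As an illustration only, the sketch below shows how a program can handle that case itself instead of crashing: it checks a directory with stat() and creates it when stat() fails with ENOENT. This is not clint's code; ensure_dir is a hypothetical helper, and unlike mkdir -p it creates a single level only.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: return 0 if path is a directory (creating it if it
 * does not exist yet), or -1 if it cannot be used as a directory. */
int ensure_dir(const char *path)
{
    struct stat st;

    if (stat(path, &st) == 0)
        return S_ISDIR(st.st_mode) ? 0 : -1; /* exists: must be a directory */

    if (errno != ENOENT) {                   /* some other failure */
        fprintf(stderr, "stat(%s): %s\n", path, strerror(errno));
        return -1;
    }

    /* ENOENT: the case truss reported as ERR#2. Create one level. */
    if (mkdir(path, 0755) != 0) {
        fprintf(stderr, "mkdir(%s): %s\n", path, strerror(errno));
        return -1;
    }
    return 0;
}
```

Because only one level is created, a caller that needs a nested path would call the helper for the parent first and then for the child, which mirrors the two rounds of directory creation in this case.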
Case 2: Vim starts up noticeably slowly
Operating system: FreeBSD-5.2.1-release
The Vim version is 6.2.154. After vim is started from the command line, it takes nearly half a minute to reach the editing screen, and there is no error output. A careful check of .vimrc and all the Vim scripts found no misconfiguration, and no solution to a similar problem could be found on the Internet. Do we have to hack the source code? There is no need. truss can locate the problem:
    # truss -f -D -o vim.truss vim
Here, the -D option adds a relative timestamp before each line of output, that is, the time elapsed since the previous line, which is roughly the time consumed by each system call. We only need to look for the system calls that take a long time. Viewing the output file vim.truss carefully with less turns up the following fragment:
    735: 0.000021511 socket(0x2,0x1,0x0) = 4 (0x4)
    735: 0.000014248 setsockopt(0x4,0x6,0x1,0xbfbfe3c8,0x4) = 0 (0x0)
    735: 0.000013688 setsockopt(0x4,0xffff,0x8,0xbfbfe2ec,0x4) = 0 (0x0)
    735: 0.000203657 connect(0x4,{ AF_INET 10.57.18.27:6000 },16) ERR#61 'Connection refused'
    735: 0.000017042 close(4) = 0 (0x0)
    735: 1.009366553 nanosleep(0xbfbfe468,0xbfbfe460) = 0 (0x0)
    735: 0.000019556 socket(0x2,0x1,0x0) = 4 (0x4)
    735: 0.000013409 setsockopt(0x4,0x6,0x1,0xbfbfe3c8,0x4) = 0 (0x0)
    735: 0.000013130 setsockopt(0x4,0xffff,0x8,0xbfbfe2ec,0x4) = 0 (0x0)
    735: 0.000272102 connect(0x4,{ AF_INET 10.57.18.27:6000 },16) ERR#61 'Connection refused'
    735: 0.000015924 close(4) = 0 (0x0)
    735: 1.009338338 nanosleep(0xbfbfe468,0xbfbfe460) = 0 (0x0)
Vim tries to connect to port 6000 of the host 10.57.18.27 (the connect() on the fourth line). After the connection fails, it sleeps for one second and retries (the nanosleep() on the sixth line). The fragment above appears more than a dozen times, and each occurrence takes more than one second, which is why Vim becomes noticeably slow. You may well wonder: "Why would Vim connect to port 6000 of another computer for no reason?" Good question. Recall which service uses port 6000: the X server. It seems that Vim wants to direct its output to a remote X server, so the DISPLAY variable must be defined in the shell. A look at .cshrc confirms it; there is such a line: setenv DISPLAY ${REMOTEHOST}:0. Comment it out and log in again, and the problem is solved.
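Scanning a long trace by eye for slow calls can be automated. The following is a minimal sketch that reads truss -D output and prints only the lines whose relative timestamp reaches a threshold. The function names truss_delta and report_slow_calls are hypothetical, and the sketch assumes the "PID: SECONDS call" line layout seen in this case (the layout produced by -f together with -D).

```c
#include <stdio.h>

/* Hypothetical helper: extract the relative timestamp from one line of the
 * form "PID: SECONDS syscall(...)". Returns -1.0 if the line does not match
 * (for example "SIGNAL 11"). */
double truss_delta(const char *line)
{
    int pid;
    double seconds;

    if (sscanf(line, " %d: %lf", &pid, &seconds) != 2)
        return -1.0;
    return seconds;
}

/* Print every line of the trace whose timestamp is at least `threshold`
 * seconds, and return how many such lines were found. */
int report_slow_calls(FILE *trace, double threshold)
{
    char line[4096];
    int slow = 0;

    while (fgets(line, sizeof(line), trace) != NULL) {
        if (truss_delta(line) >= threshold) {
            fputs(line, stdout);
            slow++;
        }
    }
    return slow;
}
```

Run over vim.truss with a threshold of 1.0, this would single out the nanosleep() lines from the fragment above while skipping the fast socket(), setsockopt(), connect(), and close() calls.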
Case 3: Using debugging tools to understand how software works
Operating System: Red Hat Linux 9.0
Using debugging tools to follow the running state of software in real time is not only an effective way to diagnose its "intractable diseases"; it also helps us sort out the "context" of the software, that is, to grasp its execution flow and working principles quickly. It is a useful aid when studying source code. The following example shows how tracing another program with strace can "trigger inspiration" and solve a difficulty met in software development.
Everyone knows that when a process opens a file, a unique file descriptor (fd) corresponds to that file. While developing a piece of software, I ran into the following problem: given an fd, how can the full path of the corresponding file be obtained? Neither Linux, FreeBSD, nor other Unix systems provide such an API. What can be done? Let's think from another angle: is there any software on Unix that can find out which files a process has opened? With enough experience, lsof comes to mind easily: it can tell you which files a process has opened, as well as which process has opened a given file. Okay, let's use a small program to test lsof and see how it obtains the files opened by the process.
    /* testlsof.c */
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>

    int main(void)
    {
        /* open the file /tmp/foo; O_CREAT requires a mode argument */
        open("/tmp/foo", O_CREAT|O_RDONLY, 0644);
        /* sleep for 1200 seconds to leave time for the following steps */
        sleep(1200);
        return 0;
    }
Run testlsof in the background; its PID is 3125. The command lsof -p 3125 lists the files opened by process 3125. We use strace to trace the run of lsof and save the output to lsof.strace:
    # gcc testlsof.c -o testlsof
    # ./testlsof &
    [1] 3125
    # strace -o lsof.strace lsof -p 3125
We search the output file lsof.strace for the keyword "/tmp/foo". There is only one result:
    # grep '/tmp/foo' lsof.strace
    readlink("/proc/3125/fd/3", "/tmp/foo", 4096) = 8
It turns out that lsof cleverly uses the /proc/nnnn/fd/ directory (nnnn is the PID): the Linux kernel creates a directory named after the PID under /proc/ for each process to hold information about that process, and its subdirectory fd holds the fds of all files the process has opened. The goal is now very close. Okay, let's go into /proc/3125/fd/ and take a look:
    # cd /proc/3125/fd/
    # ls -l
    total 0
    lrwx------ 1 root root 64 Nov 5 09:50 0 -> /dev/pts/0
    lrwx------ 1 root root 64 Nov 5 09:50 1 -> /dev/pts/0
    lrwx------ 1 root root 64 Nov 5 09:50 2 -> /dev/pts/0
    lr-x------ 1 root root 64 Nov 5 09:50 3 -> /tmp/foo
    # readlink /proc/3125/fd/3
    /tmp/foo
The answer is now obvious: every fd entry in the /proc/nnnn/fd/ directory is a symbolic link that points to a file opened by the process. We only need the readlink() system call to obtain the file that a given fd corresponds to. The code is as follows:
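    #include <stdio.h>
    #include <string.h>
    #include <sys/types.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <sys/stat.h>

    /* Resolve fd to a path through /proc/<pid>/fd/<fd>.
       Returns the result of readlink(): the path length, or -1 on error. */
    int get_pathname_from_fd(int fd, char pathname[], int n)
    {
        char buf[1024];
        pid_t pid;

        memset(buf, 0, sizeof(buf));
        pid = getpid();
        snprintf(buf, sizeof(buf), "/proc/%i/fd/%i", (int)pid, fd);
        /* readlink() does not NUL-terminate; leave room for the terminator
           that the caller's zeroed buffer provides */
        return readlink(buf, pathname, n - 1);
    }

    int main(void)
    {
        int fd;
        char pathname[4096];

        memset(pathname, 0, sizeof(pathname));
        /* O_CREAT requires a mode argument */
        fd = open("/tmp/foo", O_CREAT|O_RDONLY, 0644);
        get_pathname_from_fd(fd, pathname, sizeof(pathname));
        printf("fd=%d; pathname=%s\n", fd, pathname);
        return 0;
    }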
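The same /proc/nnnn/fd/ idea extends from a single fd to every descriptor a process holds, which is roughly what the ls -l listing shows. The following is a minimal sketch, assuming Linux with /proc mounted; list_open_fds is a hypothetical name, and this is not lsof's actual implementation.

```c
#include <dirent.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: print "fd -> target" for every descriptor of the
 * calling process, using the symbolic links in /proc/self/fd.
 * Returns the number of entries printed, or -1 if /proc is unavailable. */
int list_open_fds(FILE *out)
{
    DIR *dir = opendir("/proc/self/fd");
    struct dirent *entry;
    int count = 0;

    if (dir == NULL)
        return -1;

    while ((entry = readdir(dir)) != NULL) {
        char link[288];    /* "/proc/self/fd/" plus a name of up to 255 bytes */
        char target[4096];
        ssize_t len;

        if (entry->d_name[0] == '.') /* skip "." and ".." */
            continue;

        snprintf(link, sizeof(link), "/proc/self/fd/%s", entry->d_name);
        len = readlink(link, target, sizeof(target) - 1);
        if (len < 0)
            continue;                /* the fd may have been closed meanwhile */
        target[len] = '\0';          /* readlink() does not NUL-terminate */

        fprintf(out, "%s -> %s\n", entry->d_name, target);
        count++;
    }
    closedir(dir);
    return count;
}
```

Note that the listing includes the descriptor opendir() itself uses to read the directory, so the count is always at least one.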
Note: for security reasons, FreeBSD (the system used in Cases 1 and 2) no longer mounts the proc file system automatically by default. Therefore, to trace programs with truss or strace there, you must first mount the proc file system manually: mount -t procfs proc /proc; or add a line to /etc/fstab: