Tips: Using truss, strace, and ltrace to diagnose stubborn software problems

Source: Internet
Author: User
Introduction

Processes that refuse to start, software that suddenly runs slowly, and programs that die with a "Segmentation fault" are headaches for every UNIX user. This article demonstrates how to use three common debugging tools, truss, strace, and ltrace, to quickly diagnose such stubborn software problems.

truss and strace trace the system calls a process makes and the signals it receives, while ltrace traces a process's calls to library functions. truss is the oldest of the three and was developed for System V R4; most Unix systems, including AIX and FreeBSD, ship with it. strace was originally written for SunOS, and ltrace first appeared in Debian GNU/Linux. Both have since been ported to most Unix systems. Most Linux distributions include strace and ltrace, and FreeBSD can install them through ports.

You can start a new program under one of these tools from the command line, or attach truss, strace, or ltrace to an existing PID to debug a program that is already running. The basic usage of the three tools is roughly the same. Only the three most commonly used command-line options, which all three tools share, are described here:

-f: trace child processes as well as the current process.
-o file: write the output to file instead of stderr.
-p pid: attach to the running process with the given PID. This option is often used to debug background processes.

These three options are enough for most debugging tasks. Some command-line examples:

truss -o ls.truss ls -al: traces a run of ls -al and writes the output to the file ls.truss.
strace -f -o vim.strace vim: traces vim and its child processes and writes the output to the file vim.strace.
ltrace -p 234: traces the running process whose PID is 234.

The output formats of the three debugging tools are similar. Take strace as an example:

brk(0)                                  = 0x8062aa8
brk(0x8063000)                          = 0x8063000
mmap2(NULL, 4096, PROT_READ, MAP_PRIVATE, 3, 0x92f) = 0x40016000

Each line is one system call. The name of the call and its arguments are to the left of the equals sign, and the return value is to the right. truss, strace, and ltrace work in similar ways: they all rely on the kernel's process-tracing facilities, such as the ptrace system call, to follow the process being debugged. The details are beyond the scope of this article; if you are interested, refer to their source code.

The following three cases demonstrate how to use these debugging tools to diagnose stubborn software problems:
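Because every line has the same "name(args) = result" shape, trace output is easy to post-process. The following is a minimal sketch, in C, of splitting one strace-style line into the call name and the return value. The helper name parse_trace_line is hypothetical and is not part of strace; the sketch assumes lines without a PID prefix, as in the strace sample above.

```c
#include <stdio.h>
#include <string.h>

/*
 * Split one strace-style line, "name(args) = result", into the call name
 * and the text after the last " = ". Returns 0 on success, -1 if the line
 * does not have that shape or a buffer is too small.
 * parse_trace_line is a hypothetical helper, not part of strace.
 */
int parse_trace_line(const char *line, char *name, size_t name_len,
                     char *result, size_t result_len)
{
    const char *paren = strchr(line, '(');
    const char *eq = NULL;
    const char *p = line;
    size_t n;

    /* Find the last " = " so that '=' inside string arguments is skipped. */
    while ((p = strstr(p, " = ")) != NULL) {
        eq = p;
        p += 3;
    }
    if (paren == NULL || eq == NULL || paren > eq || result_len == 0)
        return -1;

    n = (size_t)(paren - line);
    if (n == 0 || n >= name_len)
        return -1;
    memcpy(name, line, n);
    name[n] = '\0';

    /* Copy the result and trim a trailing newline, if any. */
    snprintf(result, result_len, "%s", eq + 3);
    result[strcspn(result, "\n")] = '\0';
    return 0;
}
```

Called on the first sample line, the helper stores "brk" in name and "0x8062aa8" in result. truss lines carry a "PID:" prefix and an extra hexadecimal copy of the result, so they would need additional handling.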

 

Case 1: clint crashes with a segmentation fault

Operating System: FreeBSD-5.2.1-release

clint is a static source code analysis tool for C++. After installing it through ports, run:

# clint foo.cpp
Segmentation fault (core dumped)

Running into "Segmentation fault" on a UNIX system is as annoying as the "illegal operation" dialog box in MS Windows. Let us use truss to take clint's pulse:

# truss -f -o clint.truss clint
Segmentation fault (core dumped)
# tail clint.truss
  739: read(0x6,0x806f000,0x1000)               = 4096 (0x1000)
  739: fstat(6,0xbfbfe4d0)                       = 0 (0x0)
  739: fcntl(0x6,0x3,0x0)                        = 4 (0x4)
  739: fcntl(0x6,0x4,0x0)                        = 0 (0x0)
  739: close(6)                                    = 0 (0x0)
  739: stat("/root/.clint/plugins",0xbfbfe680)   ERR#2 'No such file or directory'
SIGNAL 11
SIGNAL 11
Process stopped because of:  16
process exit, rval = 139

We use truss to trace clint's system calls and write the results to the file clint.truss, then use tail to view the last few lines. Note the last system call clint made (the fifth line from the end):

stat("/root/.clint/plugins",0xbfbfe680) ERR#2 'No such file or directory'

The problem lies here: clint cannot find the directory /root/.clint/plugins, and that leads to the segmentation fault. The fix is simple:

# mkdir -p /root/.clint/plugins

However, clint still dies with a segmentation fault after this. Tracing with truss again shows that clint also needs the directory /root/.clint/plugins/python. Once that directory is created, clint finally runs normally.
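The trace suggests that the program went on to use a resource whose lookup had just failed. How clint handles this internally is not shown in the trace, so the following is only a sketch, in C, of the defensive check a program could make at startup: test the result of stat() and create the directory when it is missing. The helper name ensure_dir is hypothetical, and unlike mkdir -p it creates only the last path component.

```c
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Make sure `path` exists and is a directory, creating it if it is missing.
 * Returns 0 on success and -1 on failure (errno is set).
 * ensure_dir is a hypothetical helper; it is not clint's code.
 */
int ensure_dir(const char *path)
{
    struct stat st;

    if (stat(path, &st) == 0) {
        if (S_ISDIR(st.st_mode))
            return 0;
        errno = ENOTDIR;        /* exists, but is not a directory */
        return -1;
    }
    if (errno != ENOENT)
        return -1;              /* some other error, e.g. EACCES */

    /* The ENOENT case seen in the truss output: create the directory. */
    if (mkdir(path, 0755) == 0 || errno == EEXIST)
        return 0;
    return -1;
}
```

A caller would check the return value and report the error with perror() rather than continue with a missing directory.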

 

Case 2: Vim starts up very slowly

Operating System: FreeBSD-5.2.1-release

The Vim version is 6.2.154. After starting vim from the command line, it takes nearly half a minute to reach the editing interface, and there is no error output. I carefully checked .vimrc and all the Vim scripts and found no misconfiguration, and I could not find a solution to this kind of problem on the Internet. Do we really have to hack the source code? There is no need; truss is enough to locate the problem:

# truss -f -D -o vim.truss vim

Here, the -D option adds a relative timestamp at the start of each output line, that is, the time elapsed since the previous line, which approximates how long each system call took. We only need to look for the system calls that take a long time. Browsing the output file vim.truss with less soon reveals the problem:

735: 0.000021511 socket(0x2,0x1,0x0)       = 4 (0x4)
735: 0.000014248 setsockopt(0x4,0x6,0x1,0xbfbfe3c8,0x4) = 0 (0x0)
735: 0.000013688 setsockopt(0x4,0xffff,0x8,0xbfbfe2ec,0x4) = 0 (0x0)
735: 0.000203657 connect(0x4,{ AF_INET 10.57.18.27:6000 },16) ERR#61 'Connection refused'
735: 0.000017042 close(4)          = 0 (0x0)
735: 1.009366553 nanosleep(0xbfbfe468,0xbfbfe460) = 0 (0x0)
735: 0.000019556 socket(0x2,0x1,0x0)       = 4 (0x4)
735: 0.000013409 setsockopt(0x4,0x6,0x1,0xbfbfe3c8,0x4) = 0 (0x0)
735: 0.000013130 setsockopt(0x4,0xffff,0x8,0xbfbfe2ec,0x4) = 0 (0x0)
735: 0.000272102 connect(0x4,{ AF_INET 10.57.18.27:6000 },16) ERR#61 'Connection refused'
735: 0.000015924 close(4)          = 0 (0x0)
735: 1.009338338 nanosleep(0xbfbfe468,0xbfbfe460) = 0 (0x0)

vim tries to connect to port 6000 of the host 10.57.18.27 (the connect() on the fourth line). When the connection fails, it sleeps for one second and retries (the nanosleep() on the sixth line). This fragment appears more than a dozen times in the trace, and each repetition takes just over a second, which is why vim is so slow to start. You may be wondering why vim would connect to port 6000 of another computer for no apparent reason. Consider which service listens on port 6000: the X server. It seems vim wants to send its output to a remote X server, so the shell must have the DISPLAY variable set. A look at .cshrc shows that it does contain such a line: setenv DISPLAY ${REMOTEHOST}:0. Comment the line out and log in again, and the problem is solved.
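For a long trace file, the slow calls can also be picked out mechanically instead of by eye. The following is a minimal sketch, in C, that reads the time delta from one line of truss -f -D output and reports whether it reaches a threshold. The helper name is_slow_call and the threshold parameter are hypothetical; the line layout "PID: delta call(...)" is assumed from the excerpt above.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Parse one line of `truss -f -D` output, for example
 *   "735: 1.009366553 nanosleep(0xbfbfe468,0xbfbfe460) = 0 (0x0)"
 * and store the time delta in *delta. Returns 1 if the delta is at least
 * `threshold` seconds, 0 if it is smaller, and -1 if the line does not
 * match this layout. is_slow_call is a hypothetical helper, not part of truss.
 */
int is_slow_call(const char *line, double threshold, double *delta)
{
    int pid;
    int off = 0;

    /* "%n" records how many characters the "PID: " prefix consumed. */
    if (sscanf(line, "%d: %n", &pid, &off) != 1 || off == 0)
        return -1;
    /* Require a digit, so a call name such as "nanosleep" is not read as NaN. */
    if (!isdigit((unsigned char)line[off]))
        return -1;
    *delta = strtod(line + off, NULL);
    return *delta >= threshold ? 1 : 0;
}
```

With a threshold of 0.5 seconds, the two nanosleep() lines in the excerpt are flagged and the other ten are not.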

 

Case 3: Using debugging tools to understand how software works

Operating System: Red Hat Linux 9.0

Tracing running software with debugging tools is not only an effective way to diagnose stubborn problems; it also helps us see how a program is put together. It complements reading the source code as a way to quickly grasp how software runs and how it works. The following example shows how tracing another program with strace can provide the inspiration needed to solve a problem in our own software development.

As we all know, each file opened in a process has a unique file descriptor (fd) associated with it. While developing a program I ran into this problem: given an fd, how can I obtain the full path of the file it refers to? No standard API for this is provided by Linux, FreeBSD, or other UNIX systems. What can be done? Look at it from another angle: is there any UNIX software that can list the files a process has open? With enough experience, lsof comes to mind. It can tell you which files a process has open, and which processes have a given file open.

Okay, let's use a small program to test lsof and see how it gets the files opened by the process.

/* testlsof.c */
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int main(void)
{
        /* Open /tmp/foo; O_CREAT requires a mode argument. */
        open("/tmp/foo", O_CREAT | O_RDONLY, 0644);
        /* Sleep for 1200 seconds to leave time for the following steps. */
        sleep(1200);
        return 0;
}

Run testlsof in the background; its PID is 3125. The command lsof -p 3125 shows which files process 3125 has open. We use strace to trace lsof and save the output in lsof.strace:

# gcc testlsof.c -o testlsof
# ./testlsof &
[1] 3125
# strace -o lsof.strace lsof -p 3125

Searching the output file lsof.strace for the keyword "/tmp/foo" gives a single result:

# grep '/tmp/foo' lsof.strace
readlink("/proc/3125/fd/3", "/tmp/foo", 4096) = 8

So lsof cleverly uses the /proc/NNNN/fd/ directory (NNNN is the PID). The Linux kernel creates a directory named after the PID under /proc/ for each process to hold information about that process, and its fd subdirectory holds the descriptors of all the files the process has open. We are now very close to the goal. Let us look inside /proc/3125/fd/:

# cd /proc/3125/fd/
# ls -l
total 0
lrwx------    1 root     root           64 Nov  5 09:50 0 -> /dev/pts/0
lrwx------    1 root     root           64 Nov  5 09:50 1 -> /dev/pts/0
lrwx------    1 root     root           64 Nov  5 09:50 2 -> /dev/pts/0
lr-x------    1 root     root           64 Nov  5 09:50 3 -> /tmp/foo
# readlink /proc/3125/fd/3
/tmp/foo

The answer is now obvious: every entry in the /proc/NNNN/fd/ directory is a symbolic link named after an fd, and it points to the file the process has open on that fd. We only need to call the readlink() system call to obtain the file that corresponds to an fd. The code is as follows:

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/* Store the path of the file behind fd in pathname (at most n - 1 bytes
   plus a terminating NUL). Returns the path length, or -1 on error. */
int get_pathname_from_fd(int fd, char pathname[], int n)
{
        char buf[1024];
        ssize_t len;

        if (n <= 0)
                return -1;
        snprintf(buf, sizeof(buf), "/proc/%ld/fd/%d", (long)getpid(), fd);
        len = readlink(buf, pathname, n - 1);   /* readlink() adds no NUL */
        if (len < 0)
                return -1;
        pathname[len] = '\0';
        return (int)len;
}

int main(void)
{
        int fd;
        char pathname[4096];

        /* O_CREAT requires a mode argument. */
        fd = open("/tmp/foo", O_CREAT | O_RDONLY, 0644);
        if (fd < 0 || get_pathname_from_fd(fd, pathname, sizeof(pathname)) < 0) {
                perror("get_pathname_from_fd");
                return 1;
        }
        printf("fd=%d; pathname=%s\n", fd, pathname);
        return 0;
}

[Note] For security reasons, FreeBSD 5 and later no longer mount the proc file system by default. To trace programs with truss or strace, you must therefore mount procfs manually: mount -t procfs proc /proc; or add this line to /etc/fstab:

proc                   /proc           procfs  rw              0       0

Procfs is not required for ltrace.
