Diagnosing software "incurable diseases" with truss, strace, and ltrace

Source: Internet
Author: User


Brief introduction

A process that will not start, software that suddenly runs slowly, a program that dies with "Segmentation fault": these problems give every UNIX user a headache. Through three practical cases, this article demonstrates how to use three common debugging tools, truss, strace, and ltrace, to quickly diagnose such "incurable diseases" in software.

truss and strace trace the system calls a process makes and the signals it receives, while ltrace traces the library functions a process calls. truss was the first of these, originally developed for System V R4, and most UNIX systems, including AIX and FreeBSD, ship with it. strace was originally written for SunOS, and ltrace originated in Debian GNU/Linux. strace and ltrace have since been ported to most UNIX systems: most Linux distributions include them, and on FreeBSD they can be installed through ports.

You can debug a newly started program from the command line, and you can also attach truss, strace, or ltrace to an existing PID to debug a program that is already running. The three tools are used in much the same way; the following three command-line options are common to all of them and are the most frequently used:

-f: trace child processes in addition to the current process.
-o file: write the output to file instead of standard error (stderr).
-p pid: attach to the running process with the given PID. This option is commonly used to debug background processes.

Most debugging tasks can be done with just these three options. Here are a few command-line examples:

truss -o ls.truss ls -al: trace the execution of ls -al and write the output to the file ls.truss.
strace -f -o vim.strace vim: trace vim and its child processes and write the output to the file vim.strace.
ltrace -p 234: trace the already running process whose PID is 234.

The three tools produce output in similar formats. Taking strace as an example:

brk(0)                                  = 0x8062aa8
brk(0x8063000)                          = 0x8063000
mmap2(NULL, 4096, PROT_READ, MAP_PRIVATE, 3, 0x92f) = 0x40016000

Each line is one system call: to the left of the equals sign are the name of the system call and its arguments, and to the right is its return value. truss, strace, and ltrace work on the same principle, using the ptrace system call to trace a running process. The details are beyond the scope of this article; interested readers can consult the tools' source code.
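
The line format described above lends itself to simple post-processing. The following is a minimal sketch in C, not part of any of the three tools, that splits one strace-style output line into the call text and the return value at the last " = ". The function name split_trace_line and the buffer handling are assumptions made for illustration. FreeBSD truss reports failed calls as ERR#n without an equals sign, so for such lines the sketch simply returns -1.

```c
#include <stdio.h>
#include <string.h>

/* Split one trace line of the form "call(args) = result" at its LAST " = ".
 * On success, copies the call text and the result text into the caller's
 * buffers and returns 0. Returns -1 if the line has no " = " separator
 * or if either buffer is too small. (Hypothetical helper for illustration.) */
int split_trace_line(const char *line,
                     char *call, size_t call_size,
                     char *result, size_t result_size)
{
    const char *sep = NULL;
    const char *p = line;

    /* Use the last occurrence, since quoted arguments may contain " = ". */
    while ((p = strstr(p, " = ")) != NULL) {
        sep = p;
        p += 3;
    }
    if (sep == NULL)
        return -1;

    /* Drop the padding spaces that align the "=" column. */
    size_t call_len = (size_t)(sep - line);
    while (call_len > 0 && line[call_len - 1] == ' ')
        call_len--;

    /* The result runs from after " = " to the end of the line. */
    const char *res = sep + 3;
    size_t res_len = strcspn(res, "\r\n");

    if (call_len >= call_size || res_len >= result_size)
        return -1;

    memcpy(call, line, call_len);
    call[call_len] = '\0';
    memcpy(result, res, res_len);
    result[res_len] = '\0';
    return 0;
}
```

Feeding each line of a file written with -o through such a function makes it easy to, for example, collect only the calls whose result indicates an error.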

Here are three examples of how to use these debugging tools to diagnose software "incurable diseases":


Case one: Segmentation fault when running clint

Operating system: FreeBSD 5.2.1-RELEASE

clint is a static source code analysis tool for C++. After installing it through ports, running it gives:

# clint foo.cpp
Segmentation fault (core dumped)

Running into "Segmentation fault" on a UNIX system is as annoying as the "illegal operation" dialog box popping up in MS Windows. OK, let's use truss to take clint's "pulse":

# truss -f -o clint.truss clint
Segmentation fault (core dumped)
# tail clint.truss
  739: read(0x6,0x806f000,0x1000)               = 4096 (0x1000)
  739: fstat(6,0xbfbfe4d0)                      = 0 (0x0)
  739: fcntl(0x6,0x3,0x0)                       = 4 (0x4)
  739: fcntl(0x6,0x4,0x0)                       = 0 (0x0)
  739: close(6)                                 = 0 (0x0)
  739: stat("/root/.clint/plugins",0xbfbfe680)  ERR#2 'No such file or directory'
SIGNAL 11
SIGNAL 11
Process stopped because of:  16
process exit, rval = 139

We use truss to trace clint's system calls, write the results to the file clint.truss, and then view the last few lines with tail. Note the last system call clint executed (fifth line from the bottom): stat("/root/.clint/plugins",0xbfbfe680) ERR#2 'No such file or directory'. This is where the problem lies: clint cannot find the directory "/root/.clint/plugins", and that triggers the segmentation fault. The fix is simple: mkdir -p /root/.clint/plugins. Running clint again, however, still ends in "Segmentation fault". Tracing with truss once more shows that clint also needs the directory "/root/.clint/plugins/python"; once that directory is created, clint finally runs normally.
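
The crash above is consistent with a program that uses a directory without first checking that it exists, although clint's source was not examined here, so that is only a guess. As a hedged illustration of the defensive alternative, the sketch below checks the result of stat() and reports the error instead of carrying on. The names directory_exists and load_plugins are hypothetical and are not taken from clint.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Returns 1 if path exists and is a directory, 0 otherwise.
 * On failure errno gives the reason: whatever stat() reported
 * (ENOENT for a missing path), or ENOTDIR if the path is not a directory. */
int directory_exists(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return 0;                   /* errno was set by stat() */
    if (!S_ISDIR(st.st_mode)) {
        errno = ENOTDIR;
        return 0;
    }
    return 1;
}

/* Hypothetical loader: reports a missing plugin directory and returns -1
 * rather than using the directory blindly. Returns 0 on success. */
int load_plugins(const char *plugin_dir)
{
    if (!directory_exists(plugin_dir)) {
        fprintf(stderr, "cannot use plugin directory %s: %s\n",
                plugin_dir, strerror(errno));
        return -1;
    }
    /* ... a real program would scan the directory here ... */
    return 0;
}
```

With a check like this, the missing "/root/.clint/plugins" directory would have produced a readable error message instead of signal 11.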


Case two: vim starts up noticeably slowly

Operating system: FreeBSD 5.2.1-RELEASE

The vim version is 6.2.154. After vim is started from the command line, it takes nearly half a minute to reach the editing interface, with no error output at all. A careful check of .vimrc and all vim scripts turned up no misconfiguration, and no solution to a similar problem could be found online. Do we really have to hack the source code? There is no need; truss can show where the problem lies:

# truss -f -D -o vim.truss vim

The -D option adds a relative timestamp to the start of each output line, that is, the time spent on each system call. We only need to look for the system calls that take a long time. Viewing the output file vim.truss with less quickly reveals the suspicious spot:

735: 0.000021511 socket(0x2,0x1,0x0)       = 4 (0x4)
735: 0.000014248 setsockopt(0x4,0x6,0x1,0xbfbfe3c8,0x4) = 0 (0x0)
735: 0.000013688 setsockopt(0x4,0xffff,0x8,0xbfbfe2ec,0x4) = 0 (0x0)
735: 0.000203657 connect(0x4,{ AF_INET 10.57.18.27:6000 },16) ERR#61 'Connection refused'
735: 0.000017042 close(4)          = 0 (0x0)
735: 1.009366553 nanosleep(0xbfbfe468,0xbfbfe460) = 0 (0x0)
735: 0.000019556 socket(0x2,0x1,0x0)       = 4 (0x4)
735: 0.000013409 setsockopt(0x4,0x6,0x1,0xbfbfe3c8,0x4) = 0 (0x0)
735: 0.000013130 setsockopt(0x4,0xffff,0x8,0xbfbfe2ec,0x4) = 0 (0x0)
735: 0.000272102 connect(0x4,{ AF_INET 10.57.18.27:6000 },16) ERR#61 'Connection refused'
735: 0.000015924 close(4)          = 0 (0x0)
735: 1.009338338 nanosleep(0xbfbfe468,0xbfbfe460) = 0 (0x0)

vim tries to connect to port 6000 on host 10.57.18.27 (the connect() on line 4). After the connection fails, it sleeps for one second and then retries (the nanosleep() on line 6). This fragment repeats more than ten times, each pass taking a little over a second, which is why vim is so slow to start. You may well wonder: "Why would vim connect to port 6000 on another computer for no apparent reason?" Good question. Think about which service listens on port 6000. Right, it is the X server. It seems vim is trying to send its output to a remote X server, so the shell must have the DISPLAY variable defined. A look at .cshrc does indeed turn up this line: setenv DISPLAY ${REMOTEHOST}:0. Comment it out, log in again, and the problem is solved.
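
The link between DISPLAY and port 6000 follows from the X11 convention that display number N is served over TCP on port 6000 + N. The sketch below, with a hypothetical helper name display_to_tcp_port, parses a DISPLAY value of the form host:display[.screen] and computes that port. It is a simplified illustration that ignores other display-name forms, and a DISPLAY with an empty host normally means a local connection rather than TCP.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Parse a DISPLAY value of the form "host:display[.screen]".
 * Copies the host part into host (empty for a local display) and returns
 * the conventional TCP port 6000 + display, or -1 on a parse error.
 * (Hypothetical helper for illustration only.) */
int display_to_tcp_port(const char *display, char *host, size_t host_size)
{
    const char *colon = strrchr(display, ':');
    if (colon == NULL)
        return -1;

    size_t host_len = (size_t)(colon - display);
    if (host_len >= host_size)
        return -1;
    memcpy(host, display, host_len);
    host[host_len] = '\0';

    /* The display number follows the colon; an optional ".screen" may follow. */
    char *end;
    long number = strtol(colon + 1, &end, 10);
    if (end == colon + 1 || number < 0 || number > 59535)
        return -1;                  /* no digits, or port would exceed 65535 */
    if (*end != '\0' && *end != '.')
        return -1;

    return 6000 + (int)number;
}
```

For the value seen in this case, a DISPLAY of "10.57.18.27:0" yields host 10.57.18.27 and port 6000, which matches the connect() calls in the trace.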


Case three: Using debugging tools to understand how the software works

Operating system: Red Hat Linux 9.0

Tracing software with debugging tools is not only an effective way to diagnose "incurable diseases"; it also helps us trace the "veins" of a program, that is, quickly grasp how it runs and how it works, which makes it a useful aid to reading source code. The following example shows how tracing other software with strace can "spark inspiration" for solving a problem in your own software development.

As you know, every file opened within a process has a unique file descriptor (fd) that corresponds to it. While developing a piece of software I ran into this problem: given an fd, how do you obtain the full path of the corresponding file? Linux, FreeBSD, and other UNIX systems provide no such API, so what can be done? Let's think about it from another angle: is there any software under UNIX that can find out which files a process has open? With enough experience, lsof quickly comes to mind: it can tell you both which files a process has open and which processes have a given file open.

OK, let's use a small program to experiment with lsof and see how it finds the files a process has open.

/* testlsof.c */
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int main(void)
{
        /* Open file /tmp/foo; O_CREAT requires a mode argument */
        open("/tmp/foo", O_CREAT|O_RDONLY, 0644);
        /* Sleep for 1200 seconds to leave time for further action */
        sleep(1200);
        return 0;
}

We run testlsof in the background; its PID is 3125. The command lsof -p 3125 shows which files process 3125 has open. We trace the lsof run with strace and save the output in lsof.strace:

# gcc testlsof.c -o testlsof
# ./testlsof &
[1] 3125
# strace -o lsof.strace lsof -p 3125

Searching the output file lsof.strace for the keyword "/tmp/foo" gives only one result:

# grep '/tmp/foo' lsof.strace
readlink("/proc/3125/fd/3", "/tmp/foo", 4096) = 8

So lsof makes clever use of the /proc/nnnn/fd/ directory (nnnn is the PID): for each process, the Linux kernel creates a directory under /proc/ named after its PID to hold information about that process, and the fd subdirectory holds the descriptors of all files the process has open. We are close to the goal. Let's take a look in /proc/3125/fd/:

# cd /proc/3125/fd/
# ls -l
total 0
lrwx------    1 root     root  5 09:50 0 -> /dev/pts/0
lrwx------    1 root     root  5 09:50 1 -> /dev/pts/0
lrwx------    1 root     root  5 09:50 2 -> /dev/pts/0
lr-x------    1 root     root  5 09:50 3 -> /tmp/foo
# readlink /proc/3125/fd/3
/tmp/foo

The answer is now obvious: each entry in the /proc/nnnn/fd/ directory is a symbolic link that points to a file opened by the process. We only need the readlink() system call to obtain the file corresponding to an fd. The code is as follows:

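#include <stdio.h>
#include <string.h>
#include <strings.h>
#include <sys/types.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/stat.h>

/* Store the path of the file behind fd in pathname (at most n bytes).
 * Returns the path length, or -1 on error. readlink() does not add a
 * terminating NUL, so the caller must provide a zeroed buffer. */
int get_pathname_from_fd(int fd, char pathname[], int n)
{
        char buf[1024];
        pid_t  pid;
        bzero(buf, 1024);
        pid = getpid();
        snprintf(buf, 1024, "/proc/%i/fd/%i", (int)pid, fd);
        return readlink(buf, pathname, n);
}

int main(void)
{
        int fd;
        char pathname[4096];
        bzero(pathname, 4096);
        /* O_CREAT requires a mode argument */
        fd = open("/tmp/foo", O_CREAT|O_RDONLY, 0644);
        /* Pass one byte less than the buffer size so the result stays NUL-terminated */
        get_pathname_from_fd(fd, pathname, 4095);
        printf("fd=%d; pathname=%s\n", fd, pathname);
        return 0;
}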

Note: for security reasons, FreeBSD 5 and later do not mount the proc file system by default. To trace programs with truss or strace, you must therefore mount it manually: mount -t procfs proc /proc, or add the following line to /etc/fstab:

proc                   /proc           procfs  rw              0       0

ltrace does not require procfs.

