Document directory
- Tracing System calls
- Tracking signal transmission
- System Call statistics
- Redirection output
- Timing System calls
- System Call time
- Truncation output
- Trace an existing process
Address: http://www.dbabeta.com/2009/strace.html
What is strace?
Open man strace and we can see that the most concise introduction to strace is "strace-trace system CILS and signals ". In fact, strace is a tool integrating diagnosis, debugging, statistics, and integration. We can use strace to analyze the application's system calls and signal transmission tracking results, to solve the problem or understand the application process. Of course, strace cannot be compared with professional debugging tools such as gdb, because it is not a professional debugger.
The simplest use of strace is to execute a specified command and exit after the specified command is completed. During command execution, strace records and parses all system calls of the command process and all signal values received by the process.
The command is used as follows:
| 1234567 |
Strace [-dffhiqrtttTvxx] [-acolumn] [-eexpr]... [-ofile] [-ppid]... [-sstrsize] [-uusername] [-Evar = val]... [-Evar]... [command [arg...] strace-c [-eexpr]... [-Ooverhead] [-Ssortby] [command [arg...] |
Basic strace usage Tracing System Call
Now we have a simple program to demonstrate the basic usage of strace. The C language code of this program is as follows:
| 12345678910 |
# Filename test. c # include <stdio. h> int main () {inta; scanf ("% d", & a); printf ("% 09d \ n", a); return0 ;} |
Then we compile it with gcc-o test. c to get an executable file test. Then use strace to call and execute
During execution, you are required to enter an integer. We enter 99 and the following result is obtained:
| 12 |
// Directly execute the test result oracle @ orainst [orcl]: ~ $./Test |
| 123 |
// Result 99000000099 |
| 12 |
// Execute the test result using strace. oracle @ orainst [orcl]: ~ $ Strace./test |
| 123456789101112131415161718192021222324252627 |
// Execve (". /test ",[". /test "], [/* 41 vars */]) = 0 uname ({sys =" Linux ", node =" orainst.topics top.mycompany.com ",...}) = 0brk (0) = 0x8078000fstat64 (3, {st_mode = S_IFREG | 0644, st_size = 65900 ,...}) = 0old_mmap (NULL, 65900, PROT_READ, MAP_PRIVATE, 3, 0) = 0xbf5ef000close (3) = 0 open ("/lib/tls/libc. so.6 ", O_RDONLY) = 3 read (3, "\ 177ELF \ 1 \ 1 \ 0 \ 0 \ 0 \ 0 \ 0 \ 0 \ 0 \ 0 \ 0 \ 0 \ 3 \ 0 \ 3 \ 0 \ 1 \ 0 \ 0 \ 0 \ 0 \ 0 \ 200X \ 1 "..., 512) = 5 12fstat64 (3, {st_mode = S_IFREG | 0755, st_size = 1571692 ,...}) = 0old_mmap (NULL, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS,-1, 0) = 0xbf5ee000old_mmap (NULL, 1275340, PROT_READ | PROT_EXEC, MAP_PRIVATE, 3, 0) = bytes (0xb34000, 12288, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_FIXED, 3, 0x132000) = bytes (0xb37000, 9676, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOU S,-1, 0) = 0xb37000close (3) = 0set_thread_area ({entry_number:-1-> 6, base_addr: 0xbf5ee740, limit: 1048575, seg_32bit: 1, contents: 0, read_exec_only: 0, limit_in_pages: 1, seg_not_present: 0, useable: 1}) = 0 munmap (0xbf5ef000, 65900) = 0fstat64 (0, {st_mode = S_IFCHR | 0620, st_rdev = makedev (136, 0 ),...}) = 0mmap2 (NULL, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS,-1, 0) = 0xbf5ff000read (0, 99 "99 \ N ", 1024) = 3fstat64 (1, {st_mode = S_IFCHR | 0620, st_rdev = makedev (136, 0 ),...}) = 0mmap2 (NULL, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS,-1, 0) = 0xbf5fe000write (1, "000000099 \ n", 10000000099) = 10 munmap (0xbf5fe000, 4096) = 0exit_group (0) =? |
From the trace structure, we can see that the system first calls execve to start a new process, then initializes some environments, and finally pauses on "read (0, this means that our scanf function is executed. Wait for us to enter a number. After 99 is entered, call the write function to output the formatted value "000000099" to the screen, finally, call exit_group to exit and complete the entire program execution process.
Tracking signal transmission
We still use the test program above to observe the signal received by the process. Do not enter anything when you wait for the input screen. Then open another window and enter the following command
At this time, we can see that our program has been launched. The final trace result is as follows:
| 12 |
Oracle @ orainst [orcl]: ~ $ Strace./test |
| 1234567891011121314151617181920 |
Execve (". /test ",[". /test "], [/* 41 vars */]) = 0 uname ({sys =" Linux ", node =" orainst.topics top.mycompany.com ",...}) = 0brk (0) = 0x9ae2000old_mmap (NULL, 65900, PROT_READ, MAP_PRIVATE, 3, 0) = 0xbf5ef000close (3) = 0 open ("/lib/tls/libc. so.6 ", O_RDONLY) = 3 read (3, "\ 177ELF \ 1 \ 1 \ 0 \ 0 \ 0 \ 0 \ 0 \ 0 \ 0 \ 0 \ 0 \ 0 \ 3 \ 0 \ 3 \ 0 \ 1 \ 0 \ 0 \ 0 \ 0 \ 0 \ 200X \ 1 "..., 512) = 512fstat64 (3, {st_mode = S_IFREG | 0755, st_size = 1571692 ,...}) = 0old_mmap (NULL, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS,-1, 0) = 0xbf5ee000old_mmap (NULL, 1275340, PROT_READ | PROT_EXEC, MAP_PRIVATE, 3, 0) = bytes (0x41b000, 12288, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_FIXED, 3, 0x132000) = 0x41b000old_mmap (0x41e000, 9676, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS,-1, 0) = 0x41e000close (3) = 0set_thread_area ({entry_number:-1-> 6, base _ Addr: limit, limit: 1048575, seg_32bit: 1, contents: 0, read_exec_only: 0, limit_in_pages: 1, seg_not_present: 0, useable: 1}) = 0 munmap (0xbf5ef000, 65900) = 0fstat64 (0, {st_mode = S_IFCHR | 0620, st_rdev = makedev (136, 0 ),...}) = 0mmap2 (NULL, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS,-1, 0) = 0xbf5ff000read (0, 0xbf5ff000, 1024) =? ERESTARTSYS (To be restarted) --- SIGTERM (Terminated) @ 0 (0) --- ++ killed by SIGTERM ++ |
The trace clearly tells you the test process "++ killed by SIGTERM ++ ".
System Call statistics
Strace does not only track system calls. By using the-c parameter, it can also provide you with a statistical analysis of all system calls of the process. Let's take a look at the strace statistics, this time we execute strace with the-c parameter:
Finally, we can get the trace result as follows:
| 12 |
Oracle @ orainst [orcl]: ~ $ Strace-c./test |
| 12345678910111213141516 |
Execve (". /test ",[". /test "], [/* 41 vars */]) = 0% time seconds usecs/call callerrors syscall ------ ------------- ----------- ------------------ 45.90 0.000140 34.43 5 27 25 open 0.000105 7.54 4 24 21 stat64 0.000023 5 5 old_mmap 2.62 0.000008 1.97 8 1 munmap 0.000006 6 1 uname 1.97 0.000006 2 3 fstat64 1.64 0.000005 3 2 1 read 1.31 0.000004 2 2 close 0.98 0.000003 3 1 brk 0.98 0.000003 3 1 mmap2 0.66 0.000002 2 1 set_thread_area -------------------------------------------------------------- 100.00 0.000305 68 47 total |
This clearly tells you the information about the system functions called, the number of calls, the amount of time consumed, and so on. It is very useful for us to analyze a program.
Common Parameters
In addition to the-c parameter, strace also provides other useful parameters for us to conveniently obtain the information we want. The following describes frequently-used parameters.
Redirection output
The-o parameter is used to output strace results to a file. If the-o parameter is not specified, the default output device is STDERR, that is to say, the results of using "-o filename" and "2> filename" are the same.
| 123 |
# These two commands output straceresults to strace-c-o test.txt./teststrace-c./test 2> test.txt. |
Timing System calls
The strace parameter-T can be used to print the time spent by each system call. The time spent for each call is now in the angle brackets on the rightmost side of the Call Line.
| 12 |
Oracle @ orainst [orcl]: ~ $ Strace-T./test |
| 123456789 |
// Here only extract part of the result read (0, 1 "1 \ n", 1024) = 2 <2.673455> fstat64 (1, {st_mode = S_IFCHR | 0620, st_rdev = makedev (136, 0 ),...}) = 0 <0.000014> mmap2 (NULL, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS,-1, 0) = 0xbf5fe000 <0.000017> write (1, "000000001 \ n ", 10000000001) = 10 <0.000016> munmap (0xbf5fe000, 4096) = 0 <0.000020> exit_group (0) =? |
System Call time
This is a very useful function. strace will record the occurrence time of each system call. You only need to use the-t/tt/ttt parameters to see the effect, for specific examples, try it by yourself.
| Parameter Name |
Output style |
Description |
| -T |
10:33:04 exit_group (0) |
The output result is accurate to seconds. |
| -Tt |
10:33:48. 159682 exit_group (0) |
Output results are accurate to subtle |
| -Ttt |
1262169244.788478 exit_group (0) |
Accurate to subtle, and time expressed as unix Timestamp |
Truncation output
The-s parameter is used to specify the length of the output string for each line of the trace result. Next, let's take a look at the effect of the-s parameter in the test program on the result. Now, the-s parameter is set to 20, then, in read, we enter a string of more than 20 characters.
| 12 |
Read (0, 2222222222222222222222222 // we input a total of 25 "22222222222222222222"..., 1024) = 26 // and we see that 2 only has 20 results |
Trace an existing process
Strace does not only initialize a process to trace the existing process, but also trace the existing process. The parameter-p is used for this purpose. The usage is also very simple, as follows.
Example
After talking about so many functions and parameters, we now have a practical point: to study the lgwr process of Oracle, check if this process does not write a log file every 3 seconds as described in this document. Considering that lgwr has many log writing trigger conditions, we need to find an idle Oracle instance for this experiment.
Run the following command to obtain the pid of the lgwr process:
| 1 |
Oracle 5912 1 0 Nov12? 00:14:56 ora_lgwr_orcl |
The pidr of lgwris 5912. after straceis started, several traceentries are output to the lgwr.txt file, and the following command is executed:
| 1 |
Strace-tt-s 10-o lgwr.txt-p 5912 |
After a while, stop strace and view the result. Due to the large number of output results, we only need to look at the calls of the pwrite function used when Oracle writes log files.
| 1 |
Grep 'pwrite \ (20' lgwr.txt |
Wait, why is "pwrite (2" used in grep ?, Because I know that the handle numbers of the Current log File opened on my machine start from 2. The specific search method is to first use the following statement to find out which log files are currently active:
| 12 |
Select member, v $ log. status from v $ log, v $ logfilewhere v $ log. group # = v $ logfile. group #; |
Get
| 12345678910 |
Member status statuses ----------------/db/databases/orcl/redo-01-a/redo-t01-g03-m1.log INACTIVE/db/databases/orcl/redo-03-a/redo-t01-g03-m2.log INACTIVE/db/databases/orcl/redo-02-a/redo-t01-g02-m1.log CURRENT/db/ databases/orcl/redo-04-a/redo-t01-g02-m2.log CURRENT/db/databases/orcl/redo-01-a/redo-t01-g01-m1.log INACTIVE/db/databases/orcl/redo-03-a/redo-t01-g01-m2.log INACTIVE/db/databases/orcl/redo-02-a/redo-t01-g04-m1.log INACTIVE/db/databases/orcl/redo-04-a/redo-t01-g04-m2.log INACTIVE |
Then go to/proc to find the handle to open the file:
Get
| 12345678 |
Lrwx ------ 1 oracle dba 64 Dec 30 18->/db/databases/orcl/redo-01-a/redo-t01-g01-m1.loglrwx ------ 1 oracle dba 64 Dec 30 19->/db/databases/orcl/ redo-03-a/redo-t01-g01-m2.loglrwx ------ 1 oracle dba 64 Dec 30 20>/db/databases/orcl/redo-02-a/redo-t01-g02-m1.loglrwx ------ 1 oracle dba 64 Dec 30 21->/db/databases/ orcl/redo-04-a/redo-t01-g02-m2.loglrwx ------ 1 oracle dba 64 Dec 30 22->/db/databases/orcl/redo-01-a/redo-t01-g03-m1.loglrwx ------ 1 oracle dba 64 Dec 30->/db/ databases/orcl/redo-03-a/redo-t01-g03-m2.loglrwx ------ 1 oracle dba 64 Dec 30->/db/databases/orcl/redo-02-a/redo-t01-g04-m1.loglrwx ------ 1 oracle dba 64 Dec 30->/ db/databases/orcl/redo-04-a/redo-t01-g04-m2.log |
Now we can see that the handles of the current log files on my machine are 20 and 21.
Now we get the following results:
| 1234567891011121314151617181920212223242526 |
11:13:55. 603245 pwrite (20, "\ 1 \" \ 0 \ 0J! "..., 1536,436 3264) = 153611: 13: 55.603569 pwrite (21," \ 1 \ "\ 0 \ 0J! "..., 1536,436 3264) = 153611: 13: 55.606888 pwrite (20," \ 1 \ "\ 0 \ 0 M! "..., 1536,436 4800) = 153611: 13: 55.607172 pwrite (21," \ 1 \ "\ 0 \ 0 M! "..., 1536,436 4800) = 153611: 13: 55.607934 pwrite (20," \ 1 \ "\ 0 \ 0 P! "..., 1536,436 6336) = 153611: 13: 55.608199 pwrite (21," \ 1 \ "\ 0 \ 0 P! "..., 1536,436 6336) = 153611: 13: 55.610260 pwrite (20," \ 1 \ "\ 0 \ 0 S! "..., 1536,436 7872) = 153611: 13: 55.610530 pwrite (21," \ 1 \ "\ 0 \ 0 S! "..., 1536,436 7872) = 153611: 14: 00.602446 pwrite (20," \ 1 \ "\ 0 \ 0 V! "..., 1536,436 9408) = 153611: 14: 00.602750 pwrite (21," \ 1 \ "\ 0 \ 0 V! "..., 1536,436 9408) = 153611: 14: 00.606386 pwrite (20," \ 1 \ "\ 0 \ 0Y! "..., 1536,437 0944) = 153611: 14: 00.606676 pwrite (21," \ 1 \ "\ 0 \ 0Y! "..., 1536,437 0944) = 153611: 14: 00.607900 pwrite (20, "\ 1 \" \ 0 \\"..., 1024,437 2480) = 102411: 14: 00.608161 pwrite (21, "\ 1 \" \ 0 \\"..., 1024,437 2480) = 102411: 14: 00.608816 pwrite (20, "\ 1 \" \ 0 \ 0 ^! "..., 1024,437 3504) = 102411: 14: 00.609071 pwrite (21," \ 1 \ "\ 0 \ 0 ^! "..., 1024,437 3504) = 102411: 14: 00.611142 pwrite (20," \ 1 \ "\ 0 \ 0 '! "..., 1536,437 4528) = 153611: 14: 00.611454 pwrite (21," \ 1 \ "\ 0 \ 0 '! "..., 1536,437 4528) = 153611: 14: 05.602804 pwrite (20," \ 1 \ "\ 0 \ 0c! "..., 1024,437 6064) = 102411: 14: 05.603119 pwrite (21," \ 1 \ "\ 0 \ 0c! "..., 1024,437 6064) = 102411: 14: 05.607731 pwrite (20," \ 1 \ "\ 0 \ 0e! "..., 1024,437 7088) = 102411: 14: 05.608020 pwrite (21," \ 1 \ "\ 0 \ 0e! "..., 1024,437 7088) = 102411: 14: 05.608690 pwrite (20," \ 1 \ "\ 0 \ 0g! "..., 1024,437 8112) = 102411: 14: 05.608962 pwrite (21," \ 1 \ "\ 0 \ 0g! "..., 1024,437 8112) = 102411: 14: 05.611022 pwrite (20," \ 1 \ "\ 0 \ 0i! "..., 1536,437 9136) = 153611: 14: 05.611283 pwrite (21," \ 1 \ "\ 0 \ 0i! "...., 1536,437 9136) = 1536 |
Unfortunately, we see that the log file write operation is performed every 5s, probably because the system is idle? This requires further research.
Additional reading
- Strace @ sourceforge
- Man strace
- Strace
- Introducing strace
- Troubleshooting of all blocks in Apache processes caused by Linux random number generator
- Http://jarfield.iteye.com/blog/1739834