Lab page for this exercise: MIT 6.828 Lab 1, Exercise 12.
Topic
Exercise 12. Modify your stack backtrace function to display, for each eip, the function name, source file name, and line number corresponding to that eip.
In debuginfo_eip, where do __STAB_* come from? This question has a long answer; to help you discover the answer, here are some things you might want to do:
- Look in the file kern/kernel.ld for __STAB_*
- Run objdump -h obj/kern/kernel
- Run objdump -G obj/kern/kernel
- Run gcc -pipe -nostdinc -O2 -fno-builtin -I. -MD -Wall -Wno-format -DJOS_KERNEL -gstabs -c -S kern/init.c, and look at init.s.
- See if the bootloader loads the symbol table in memory as part of loading the kernel binary
Complete the implementation of debuginfo_eip by inserting the call to stab_binsearch to find the line number for an address.
Add a backtrace command to the kernel monitor, and extend your implementation of mon_backtrace to call debuginfo_eip and print a line for each stack frame of the form:
K> backtrace
Stack backtrace:
  ebp f010ff78  eip f01008ae  args 00000001 f010ff8c 00000000 f0110580 00000000
         kern/monitor.c:143: monitor+106
  ebp f010ffd8  eip f0100193  args 00000000 00001aac 00000660 00000000 00000000
         kern/init.c:49: i386_init+59
  ebp f010fff8  eip f010003d  args 00000000 00000000 0000ffff 10cf9a00 0000ffff
         kern/entry.S:70: <unknown>+0
Each line gives the file name and line within that file of the stack frame's eip, followed by the name of the function and the offset of the eip from the first instruction of the function (e.g., monitor+106 means the return eip is 106 bytes past the beginning of monitor).
Be sure to print the file and function names on a separate line, to avoid confusing the grading script.
Tip: printf format strings provide an easy, albeit obscure, way to print non-null-terminated strings like those in STABS tables. printf("%.*s", length, string) prints at most length characters of string. Take a look at the printf man page to find out why this works.
You may find that some functions are missing from the backtrace. For example, you will probably see a call to monitor() but not to runcmd(). This is because the compiler in-lines some function calls. Other optimizations may cause you to see unexpected line numbers. If you get rid of the -O2 from GNUmakefile, the backtraces may make more sense (but your kernel will run more slowly).
Answer
Problem 1: Where do the __STAB_* symbols in debuginfo_eip come from?
- __STAB_BEGIN__, __STAB_END__, __STABSTR_BEGIN__, and __STABSTR_END__ are defined in the linker script kern/kernel.ld. They mark the addresses where the .stab and .stabstr sections begin and end.
/* Include debugging information in kernel memory */
.stab : {
    PROVIDE(__STAB_BEGIN__ = .);
    *(.stab);
    PROVIDE(__STAB_END__ = .);
    BYTE(0)    /* Force the linker to allocate space for this section */
}
.stabstr : {
    PROVIDE(__STABSTR_BEGIN__ = .);
    *(.stabstr);
    PROVIDE(__STABSTR_END__ = .);
    BYTE(0)    /* Force the linker to allocate space for this section */
}
- Run
objdump -h obj/kern/kernel
The output is shown below. To save space, only five sections are listed: .text, .rodata, .stab, .stabstr, and .data. They are laid out in order starting from the load address. Each END symbol is defined just before the BYTE(0) padding byte, so it sits one byte before the end of its section. From the sizes I infer __STAB_BEGIN__ = 0xf0102204, __STAB_END__ = 0xf0102204 + 0x3cb5 - 1 = 0xf0105eb8, __STABSTR_BEGIN__ = 0xf0105eb9, and __STABSTR_END__ = 0xf0105eb9 + 0x1974 - 1 = 0xf010782c.
along:~/src/6.828/lab$ objdump -h obj/kern/kernel

obj/kern/kernel:     file format elf32-i386

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00001ab9  f0100000  00100000  00001000  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .rodata       00000744  f0101ac0  00101ac0  00002ac0  2**5
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .stab         00003cb5  f0102204  00102204  00003204  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  3 .stabstr      00001974  f0105eb9  00105eb9  00006eb9  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 .data         00009300  f0108000  00108000  00009000  2**12
                  CONTENTS, ALLOC, LOAD, DATA
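The bounds arithmetic can be checked mechanically. The following is a minimal sketch, not part of the lab code: section_end_symbol is a hypothetical helper, and it assumes (as in the kernel.ld excerpt) that exactly one BYTE(0) padding byte follows each END symbol. With the sizes from the listing, 0xf0105eb9 + 0x1974 - 1 works out to 0xf010782c.

```c
#include <stdint.h>

/* Hypothetical helper: value of the END symbol for a section whose
 * linker-script body is { BEGIN = .; *(sec); END = .; BYTE(0) }.
 * The size reported by objdump -h includes the single BYTE(0)
 * padding byte, so END sits one byte before the section's end. */
uint32_t section_end_symbol(uint32_t vma, uint32_t size)
{
    return vma + size - 1;
}
```

For example, section_end_symbol(0xf0102204, 0x3cb5) gives the inferred value of __STAB_END__, and section_end_symbol(0xf0105eb9, 0x1974) gives that of __STABSTR_END__.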
- Run
objdump -G obj/kern/kernel
The output contains 1294 stab entries; to save space, only a small part is shown here.
along:~/src/6.828/lab$ objdump -G obj/kern/kernel

obj/kern/kernel:     file format elf32-i386

Contents of .stab section:

Symnum n_type n_othr n_desc n_value  n_strx String
-1     HdrSym 0      1294   00001973 1
0      SO     0      0      f0100000 1      {standard input}
1      SOL    0      0      f010000c 18     kern/entry.S
2      SLINE  0      44     f010000c 0
15     OPT    0      0      00000000 49     gcc2_compiled.
16     LSYM   0      0      00000000 64     int:t(0,1)=r(0,1);-2147483648;2147483647;
17     LSYM   0      0      00000000 106    char:t(0,2)=r(0,2);0;127;
108    FUN    0      0      f0100040 2946   test_backtrace:F(0,25)
118    FUN    0      0      f01000a6 2987   i386_init:F(0,25)
- Run
gcc -pipe -nostdinc -O2 -fno-builtin -I. -MD -Wall -Wno-format -DJOS_KERNEL -gstabs -c -S kern/init.c
and then look at the generated init.s file. Again, to save space, only a few lines are shown here.
    .file   "init.c"
    .stabs  "kern/init.c",100,0,2,.Ltext0
    .text
.Ltext0:
    .stabs  "gcc2_compiled.",60,0,0,0
    .stabs  "int:t(0,1)=r(0,1);-2147483648;2147483647;",128,0,0,0
    .stabs  "char:t(0,2)=r(0,2);0;127;",128,0,0,0
    .stabs  "long int:t(0,3)=r(0,3);-0;4294967295;",128,0,0,0
    .stabs  "unsigned int:t(0,4)=r(0,4);0;4294967295;",128,0,0,0
    .stabs  "long unsigned int:t(0,5)=r(0,5);0;-1;",128,0,0,0
    .stabs  "long double:t(0,16)=r(0,1);16;0;",128,0,0,0
    .stabs  "_Float32:t(0,17)=r(0,1);4;0;",128,0,0,0
    .stabs  "ssize_t:t(4,17)=(4,8)",128,0,0,0
    .stabs  "off_t:t(4,18)=(4,8)",128,0,0,0
    .stabn  162,0,0,0
    .stabn  162,0,0,0
    .section    .rodata.str1.1,"aMS",@progbits,1
.LC0:
    .string "entering test_backtrace %d\n"
.LC1:
    .string "leaving test_backtrace %d\n"
    .text
    .p2align 4,,15
    .stabs  "test_backtrace:F(0,25)",36,0,0,test_backtrace
    .stabs  "x:p(0,1)",64,0,0,3
    .globl  test_backtrace
    .type   test_backtrace, @function
test_backtrace:
    .stabn  68,0,13,.LM0-.LFBB1
- Verify that the boot loader loads the symbol table into memory when it loads the kernel. One way to confirm this is to use GDB to check whether symbol strings are actually present at the symbol table's address. From the objdump -h output we know that the .stabstr section is linked at virtual address 0xf0105eb9, so run
x/8s 0xf0105eb9
to print the first 8 strings stored there, as shown below. The strings match the stab names, so the symbol table is indeed loaded into memory together with the kernel.
(gdb) x/8s 0xf0105eb9
0xf0105eb9:	""
0xf0105eba:	"{standard input}"
0xf0105ecb:	"kern/entry.S"
0xf0105ed8:	"kern/entrypgdir.c"
0xf0105eea:	"gcc2_compiled."
0xf0105ef9:	"int:t(0,1)=r(0,1);-2147483648;2147483647;"
0xf0105f23:	"char:t(0,2)=r(0,2);0;127;"
0xf0105f3d:	"long int:t(0,3)=r(0,3);-2147483648;2147483647;"
Problem 2: Make debuginfo_eip find the line number for an address
The key to this problem is understanding what each stab entry means; it took me about two hours to work out. First, dump the kernel's symbol table to a file with objdump -G obj/kern/kernel > output.md
The file output.md then contains the following fragment:
Symnum n_type n_othr n_desc n_value  n_strx String
118    FUN    0      0      f01000a6 2987   i386_init:F(0,25)
119    SLINE  0      24     00000000 0
120    SLINE  0      34     00000012 0
121    SLINE  0      36     00000017 0
122    SLINE  0      39     0000002b 0
123    SLINE  0      43     0000003a 0
What does this fragment mean? Start with the columns named in the header row:
- Symnum is the symbol index: if the whole symbol table is viewed as an array, Symnum is the subscript of the entry in that array.
- n_type is the symbol type. FUN marks a function name; SLINE marks a source line number in the text segment.
- n_othr is currently unused and is always 0.
- n_desc holds the line number in the source file (for SLINE entries).
- n_value holds the address. Note that only the FUN entry here carries an absolute address; the n_value of an SLINE entry inside a function is an offset, and its actual address is the function entry address plus that offset. For example, the third row (Symnum 120) says that address 0xf01000b8 (= 0xf01000a6 + 0x00000012) corresponds to line 34 of the file.
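The column meanings above can be written down as a C record. This is a minimal sketch under assumptions: struct stab_entry is a hypothetical mirror of the 12-byte stab layout on 32-bit x86 (the kernel has its own definition in inc/stab.h), and sline_address is an illustrative helper, not a JOS function. The type codes 0x24 and 0x44 match the 36 and 68 that appear in the .stabs and .stabn directives of init.s.

```c
#include <stdint.h>

/* Assumed mirror of one stab record; field names follow the
 * objdump -G columns. */
struct stab_entry {
    uint32_t n_strx;   /* offset of the name in .stabstr, 0 if none */
    uint8_t  n_type;   /* N_FUN, N_SLINE, ... */
    uint8_t  n_other;  /* unused, always 0 */
    uint16_t n_desc;   /* for N_SLINE: source line number */
    uint32_t n_value;  /* N_FUN: absolute address; N_SLINE: offset */
};

enum { N_FUN = 0x24, N_SLINE = 0x44 };

/* Absolute address of a line entry that belongs to function fn. */
uint32_t sline_address(const struct stab_entry *fn,
                       const struct stab_entry *line)
{
    return fn->n_value + line->n_value;
}
```

Feeding it the i386_init row (n_value 0xf01000a6) and the row for line 34 (n_value 0x12) reproduces 0xf01000b8. SLINE entries that do not sit under a FUN entry, such as the kern/entry.S row in the objdump -G listing (n_value f010000c), appear to hold an absolute address already, so the helper only applies inside a function.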
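Once you understand what each stab row means, call stab_binsearch to find the line number for an address. The preceding code in debuginfo_eip has already found the enclosing function and its entry address; subtracting the entry address from the original address gives the offset, which is then looked up among the SLINE entries inside the function's stab range. In the stock skeleton, that subtraction (addr -= info->eip_fn_addr) and the range setup (lline = lfun; rline = rfun) have already been done by this point, so the search can use addr, lline, and rline directly. Searching with lfun and rfun instead would overwrite the function range that the later argument-counting code still needs. The code looks like this:
// addr is already relative to the function entry here (see above).
stab_binsearch(stabs, &lline, &rline, N_SLINE, addr);
if (lline <= rline) {
    info->eip_line = stabs[lline].n_desc;
} else {
    return -1;  // no line-number stab covers this address
}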
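To make the lookup concrete, here is a simplified stand-in for the search, not the kernel's stab_binsearch. It is a sketch under assumptions: struct line_entry and find_line are hypothetical names, the table holds only the SLINE rows of one function, and the rows are sorted by offset. It returns the line of the last entry whose offset does not exceed the target, which is the entry that covers the address.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for an N_SLINE stab: offset from the function
 * entry (n_value) and source line (n_desc). */
struct line_entry {
    uint32_t offset;
    uint16_t line;
};

/* Returns the line of the last entry whose offset is <= target,
 * or -1 if there is none. Entries must be sorted by ascending offset. */
int find_line(const struct line_entry *tab, size_t n, uint32_t target)
{
    size_t lo = 0, hi = n;          /* search window is [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (tab[mid].offset <= target)
            lo = mid + 1;           /* answer is at mid or to its right */
        else
            hi = mid;               /* answer is strictly left of mid */
    }
    return lo == 0 ? -1 : (int)tab[lo - 1].line;
}

/* The SLINE rows of the i386_init fragment shown earlier. */
const struct line_entry i386_init_lines[] = {
    {0x00, 24}, {0x12, 34}, {0x17, 36}, {0x2b, 39}, {0x3a, 43},
};
```

An offset of 0x20 falls between the rows for 0x17 and 0x2b, so find_line(i386_init_lines, 5, 0x20) reports line 36.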
Problem 3: Add a backtrace command to the kernel monitor, and print the file name, function name, and line number in mon_backtrace
- Add the backtrace command to the kernel monitor
This is easy: copy an existing entry in the commands table in kern/monitor.c.
static struct Command commands[] = {
    { "help", "Display this list of commands", mon_help },
    { "kerninfo", "Display information about the kernel", mon_kerninfo },
    { "backtrace", "Display a backtrace of the function stack", mon_backtrace },
};
- Print the file name, function name, and line number in mon_backtrace
After the exploration above, this part is straightforward. Call debuginfo_eip in mon_backtrace to get the file name, function name, and line number. Note that the eip_fn_name field of the returned Eipdebuginfo struct carries a tail after the function name, for example test_backtrace:F(0,25)
The ":F(0,25)" part has to be dropped, which can be done with printf("%.*s", length, string)
The code is as follows:
int
mon_backtrace(int argc, char **argv, struct Trapframe *tf)
{
    uint32_t *ebp;
    struct Eipdebuginfo info;
    int result;

    ebp = (uint32_t *)read_ebp();
    cprintf("Stack backtrace:\r\n");
    // Walk the chain of saved frame pointers; it ends when ebp is 0.
    while (ebp) {
        // ebp[0]: caller's ebp, ebp[1]: return eip, ebp[2..6]: first five args.
        cprintf(" ebp %08x eip %08x args %08x %08x %08x %08x %08x\r\n",
                ebp, ebp[1], ebp[2], ebp[3], ebp[4], ebp[5], ebp[6]);
        memset(&info, 0, sizeof(struct Eipdebuginfo));
        result = debuginfo_eip(ebp[1], &info);
        if (0 != result) {
            cprintf("failed to get debuginfo for eip %x.\r\n", ebp[1]);
        } else {
            // %.*s prints only the function name, without the ":F(0,25)" tail.
            cprintf("\t%s:%d: %.*s+%u\r\n", info.eip_file, info.eip_line,
                    info.eip_fn_namelen, info.eip_fn_name,
                    ebp[1] - info.eip_fn_addr);
        }
        ebp = (uint32_t *)*ebp;
    }
    return 0;
}
The output is as follows:
Stack backtrace:
 ebp f010ff18 eip f0100078 args 00000000 00000000 00000000 f010004a f0111308
	kern/init.c:16: test_backtrace+56
 ebp f010ff38 eip f01000a1 args 00000000 00000001 f010ff78 f010004a f0111308
	kern/init.c:16: test_backtrace+97
 ebp f010ff58 eip f01000a1 args 00000001 00000002 f010ff98 f010004a f0111308
	kern/init.c:16: test_backtrace+97
 ebp f010ff78 eip f01000a1 args 00000002 00000003 f010ffb8 f010004a f0111308
	kern/init.c:16: test_backtrace+97
 ebp f010ff98 eip f01000a1 args 00000003 00000004 00000000 f010004a f0111308
	kern/init.c:16: test_backtrace+97
 ebp f010ffb8 eip f01000a1 args 00000004 00000005 00000000 f010004a f0111308
	kern/init.c:16: test_backtrace+97
 ebp f010ffd8 eip f01000dd args 00000005 00001aac f010fff8 f01000bd 00000000
	kern/init.c:43: i386_init+55
 ebp f010fff8 eip f010003e args 00000003 00001003 00002003 00003003 00004003
	{standard input}:0: <unknown>+0
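The frame walk in mon_backtrace can be tried outside the kernel on a simulated stack. The sketch below is an illustration under assumptions, not JOS code: walk_frames is a hypothetical helper, the stack is an array of 32-bit words, and "addresses" are word indices into that array, with index 0 standing in for the zero ebp that ends the chain.

```c
#include <stddef.h>
#include <stdint.h>

/* Walks a saved-ebp chain inside a simulated stack.
 * Layout per frame, as on x86 with frame pointers:
 *   stack[ebp]     saved ebp of the caller
 *   stack[ebp + 1] return eip
 * Stores up to max return addresses in eips and returns how many
 * frames were visited. */
size_t walk_frames(const uint32_t *stack, uint32_t ebp,
                   uint32_t *eips, size_t max)
{
    size_t n = 0;
    while (ebp != 0 && n < max) {
        eips[n++] = stack[ebp + 1];   /* return address of this frame */
        ebp = stack[ebp];             /* follow the link to the caller */
    }
    return n;
}
```

With three frames linked as 1 -> 4 -> 7 -> 0, the function collects three return addresses in call order, innermost first, which is the same order in which the backtrace prints its lines.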
Note
- printf("%.*s", length, string) prints at most length characters of string.
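As a user-space illustration of this note, the sketch below strips the stab tail from a function name. It is written under assumptions: stab_name_len and format_symbol are hypothetical helpers using the standard C library, whereas in the kernel the length comes from info.eip_fn_namelen and the printing is done by cprintf.

```c
#include <stdio.h>
#include <string.h>

/* Length of the function name in a stab string such as
 * "test_backtrace:F(0,25)": everything before the first ':'.
 * If there is no ':', the whole string is the name. */
int stab_name_len(const char *stab_str)
{
    const char *colon = strchr(stab_str, ':');
    return colon ? (int)(colon - stab_str) : (int)strlen(stab_str);
}

/* Formats "name+offset" into buf; %.*s cuts the name at the colon. */
int format_symbol(char *buf, size_t size, const char *stab_str,
                  unsigned offset)
{
    return snprintf(buf, size, "%.*s+%u",
                    stab_name_len(stab_str), stab_str, offset);
}
```

Called with "test_backtrace:F(0,25)" and offset 56, format_symbol writes test_backtrace+56 into the buffer.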
Stabs
Stabs (Symbol TABle Strings) is a debugging data format for storing information about computer programs for use by symbolic and source-level debuggers.
The assembler creates two sections: a section named .stab, which contains an array of fixed-length structures, one struct per stab, and a section named .stabstr, which contains all the variable-length strings that are referenced by stabs in the .stab section.
- Symbol table format: see below. Note: if the stab has a string, the n_strx field holds the offset in bytes of the string within the string table. The string is terminated by a NUL character. If the stab lacks a string (for example, it was produced by a .stabn or .stabd directive), the n_strx field is zero.
struct internal_nlist {
    unsigned long n_strx;    /* index into string table of name */
    unsigned char n_type;    /* type of symbol */
    unsigned char n_other;   /* misc info (usually empty) */
    unsigned short n_desc;   /* description field */
    bfd_vma n_value;         /* value of symbol */
};
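The n_strx rule can be sketched as a small lookup. This is an illustration under assumptions: stab_string is a hypothetical helper, and strtab and strtab_end are assumed to delimit a loaded copy of .stabstr (in the kernel, the __STABSTR_BEGIN__ and __STABSTR_END__ symbols play that role). The bounds check guards against a corrupt offset.

```c
#include <stddef.h>
#include <stdint.h>

/* Resolves a stab's n_strx to its string, or returns NULL when the
 * offset falls outside the string table. */
const char *stab_string(const char *strtab, const char *strtab_end,
                        uint32_t n_strx)
{
    if (n_strx >= (uint32_t)(strtab_end - strtab))
        return NULL;
    return strtab + n_strx;
}
```

With a table that begins like the kernel's (an empty string, then "{standard input}", then "kern/entry.S"), offset 18 resolves to "kern/entry.S", matching the n_strx of the SOL row in the objdump -G listing, and offset 0 resolves to the empty string.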
- There are three overall formats for stab assembler directives, differentiated by the first word of the stab. The name of the directive describes which combination of four possible data fields follows. It is either .stabs (string), .stabn (number), or .stabd (dot). The overall format of each class of stab is:
.stabs "string",type,other,desc,value
.stabn type,other,desc,value
.stabd type,other,desc
Stabstr
- The .stabstr section always starts with a null byte (so that string offsets of zero reference a null string), followed by random-length strings, each of which is null-byte terminated.
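This layout is what lets GDB's x/8s walk the table one string at a time. A minimal sketch of the same walk, assuming a hypothetical helper nth_string and a table held in ordinary memory:

```c
#include <stddef.h>
#include <string.h>

/* Returns the index-th NUL-terminated string in a table laid out
 * like .stabstr (string 0 is the leading empty string), or NULL if
 * the table holds fewer strings. */
const char *nth_string(const char *tab, size_t size, size_t index)
{
    const char *p = tab, *end = tab + size;
    while (p < end) {
        const char *nul = memchr(p, '\0', (size_t)(end - p));
        if (nul == NULL)
            return NULL;            /* unterminated tail: ignore it */
        if (index-- == 0)
            return p;
        p = nul + 1;                /* next string starts after the NUL */
    }
    return NULL;
}
```

Index 0 always yields the empty string, which is why a stab without a name can use n_strx = 0.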
Questions
How should the characters after the colon in a stab string be interpreted, for example in test_backtrace:F(0,25)
and char:t(0,2)=r(0,2);0;127;
printf("%.*s", length, string)
prints a string up to a specified length; how exactly is this implemented?
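One plausible answer to the second question: in "%.*s" the '*' takes the precision from the argument list, and for %s the precision is an upper bound on the number of bytes written. The sketch below shows how a formatter might honor that bound. It is an assumption about the general technique, not the code of any particular libc or of the kernel's printfmt.c, and emit_string is a hypothetical name.

```c
#include <stddef.h>

/* Hypothetical core of a "%.*s" conversion: copy bytes from s until
 * either precision bytes have been written or a NUL is reached.
 * A negative precision means "no limit", as in standard printf.
 * The limit is tested before s[n] is read, so a buffer of exactly
 * precision bytes without a terminating NUL is never over-read.
 * out must have room for the result plus a terminating NUL.
 * Returns the number of bytes copied. */
size_t emit_string(char *out, const char *s, int precision)
{
    size_t n = 0;
    while ((precision < 0 || n < (size_t)precision) && s[n] != '\0') {
        out[n] = s[n];
        n++;
    }
    out[n] = '\0';
    return n;
}
```

With precision 14, the input "test_backtrace:F(0,25)" is cut to "test_backtrace"; with a precision larger than the string, the NUL ends the copy as usual.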
Resources
- The "stabs" representation of debugging information
"MIT 6.828 Lab 1 Exercise 12" experimental report