In the earlier parts of this series we covered how to write a simple character device driver. Nobody writes perfect code the first time, so debugging is a constant part of development. In an ordinary C application we usually print information with printf or step through the program with GDB, but how do we debug a driver? The bugs we run into most often are wild pointers and out-of-bounds array accesses. In an application these show up as a segmentation fault; in a driver, which runs in kernel space, the same mistake frequently brings the whole system down and makes the kernel print an oops message. So how do we read an oops message, and how do we get from it to the exact line of code at fault? The following simple example shows how to debug a driver this way.
How to locate a line of code based on oops
We reuse the Hello World module from part 2 of this series (Linux Device Drivers, Part 2: Building and Running Modules) and add a deliberate error to it. The Hello World module with the error is as follows:
#include <linux/init.h>
#include <linux/module.h>
MODULE_LICENSE("Dual BSD/GPL");

static int hello_init(void)
{
    char *p = NULL;
    memcpy(p, "test", 4);   /* deliberate bug: writes through a NULL pointer */
    printk(KERN_ALERT "Hello, world\n");
    return 0;
}

static void hello_exit(void)
{
    printk(KERN_ALERT "Goodbye, cruel world\n");
}

module_init(hello_init);
module_exit(hello_exit);
The Makefile is as follows:
ifneq ($(KERNELRELEASE),)
obj-m := helloworld.o
else
KERNELDIR ?= /lib/modules/$(shell uname -r)/build
PWD := $(shell pwd)
default:
	$(MAKE) -C $(KERNELDIR) M=$(PWD) modules
endif
clean:
	rm -rf *.o *~ core .depend .*.cmd *.ko *.mod.c .tmp_versions modules.order Module.symvers
Clearly, line 8 of the code above, the memcpy() call, writes through a NULL pointer. After running insmod, the following oops message appears:
[  459.516441] BUG: unable to handle kernel NULL pointer dereference at (null)
[  459.516445] IP: [<ffffffffc061400d>] hello_init+0xd/0x30 [helloworld]
[  459.516448] PGD 0
[  459.516450] Oops: 0002 [#1] SMP
[  459.516452] Modules linked in: helloworld(OE+) vmw_vsock_vmci_transport vsock coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel vmw_balloon snd_ens1371 aes_x86_64 lrw snd_ac97_codec gf128mul glue_helper ablk_helper cryptd ac97_bus gameport snd_pcm serio_raw snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device snd_timer vmwgfx btusb ttm snd drm_kms_helper drm soundcore shpchp vmw_vmci i2c_piix4 rfcomm bnep bluetooth 6lowpan_iphc parport_pc ppdev mac_hid lp parport hid_generic usbhid hid psmouse ahci libahci floppy e1000 vmw_pvscsi vmxnet3 mptspi mptscsih mptbase scsi_transport_spi pata_acpi [last unloaded: helloworld]
[  459.516476] CPU: 0 PID: 4531 Comm: insmod Tainted: G           OE 3.16.0-33-generic #44~14.04.1-Ubuntu
[  459.516478] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/20/2014
[  459.516479] task: ffff88003821f010 ti: ffff880038fa0000 task.ti: ffff880038fa0000
[  459.516480] RIP: 0010:[<ffffffffc061400d>]  [<ffffffffc061400d>] hello_init+0xd/0x30 [helloworld]
[  459.516483] RSP: 0018:ffff880038fa3d40  EFLAGS: 00010246
[  459.516484] RAX: ffff88000c31d901 RBX: ffffffff81c1a020 RCX: 000000000004b29f
[  459.516485] RDX: 000000000004b29e RSI: 0000000000000017 RDI: ffffffffc0615024
[  459.516485] RBP: ffff880038fa3db8 R08: 0000000000015e80 R09: ffff88003d615e80
[  459.516486] R10: ffffea000030c740 R11: ffffffff81002138 R12: ffff88000c31d0c0
[  459.516487] R13: 0000000000000000 R14: ffffffffc0614000 R15: ffffffffc0616000
[  459.516488] FS:  00007f8a6fa86740(0000) GS:ffff88003d600000(0000) knlGS:0000000000000000
[  459.516489] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  459.516490] CR2: 0000000000000000 CR3: 0000000038760000 CR4: 00000000003407f0
[  459.516522] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  459.516524] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  459.516524] Stack:
[  459.516537]  ffff880038fa3db8 ffffffff81002144 0000000000000001 0000000000000001
[  459.516540]  0000000000000001 ffff880028ab5040 0000000000000001 ffff880038fa3da0
[  459.516541]  ffffffff8119d0b2 ffffffffc0616018 00000000bd1141ac ffffffffc0616018
[  459.516543] Call Trace:
[  459.516548]  [<ffffffff81002144>] ? do_one_initcall+0xd4/0x210
[  459.516550]  [<ffffffff8119d0b2>] ? __vunmap+0xb2/0x100
[  459.516554]  [<ffffffff810ed9b1>] load_module+0x13c1/0x1b80
[  459.516557]  [<ffffffff810e9560>] ? store_uevent+0x40/0x40
[  459.516560]  [<ffffffff810ee2e6>] SyS_finit_module+0x86/0xb0
[  459.516563]  [<ffffffff8176be6d>] system_call_fastpath+0x1a/0x1f
[  459.516564] Code: <c7> ... c0 48 ... e5 e8 a2 86 c1 ...
[  459.516573] RIP  [<ffffffffc061400d>] hello_init+0xd/0x30 [helloworld]
[  459.516575]  RSP <ffff880038fa3d40>
[  459.516576] CR2: 0000000000000000
[  459.516578] ---[ end trace 7c52cc8624b7ea60 ]---
The following is a brief analysis of the oops message.
The first line, "BUG: unable to handle kernel NULL pointer dereference at (null)", gives the cause of the error: a NULL pointer was dereferenced. The IP: line identifies where the error occurred: hello_init+0xd/0x30 [helloworld] means the faulting instruction is at offset 0xd into the function hello_init, which is 0x30 bytes long, and that this function belongs to the helloworld module. The "Modules linked in:" line lists the loaded modules, helloworld among them. The Call Trace section lists the chain of function calls that led to the fault. Of all this, the IP: line is the most useful, because it is enough to find the exact line of code at fault. Here is how to get from it to that line.
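To make the structure of the IP: line concrete, here is a minimal user-space sketch in C that splits a location such as "hello_init+0xd/0x30 [helloworld]" into its parts. The names struct oops_loc and parse_oops_loc are hypothetical helpers invented for this illustration; they are not part of the kernel or of any existing tool.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper type: the parts of "symbol+0xOFFSET/0xSIZE [module]". */
struct oops_loc {
    char symbol[64];        /* function containing the faulting instruction */
    unsigned long offset;   /* bytes from the start of that function */
    unsigned long size;     /* total size of the function in bytes */
    char module[64];        /* left empty when no [module] suffix is present */
};

/*
 * Parse a location string as printed on the IP: line and in the call trace.
 * Returns 0 if at least symbol, offset and size were found, -1 otherwise.
 */
int parse_oops_loc(const char *text, struct oops_loc *out)
{
    int fields;

    memset(out, 0, sizeof(*out));
    fields = sscanf(text, " %63[^+ ]+0x%lx/0x%lx [%63[^]]]",
                    out->symbol, &out->offset, &out->size, out->module);
    return fields >= 3 ? 0 : -1;
}
```

Called on "hello_init+0xd/0x30 [helloworld]", this fills in the symbol hello_init, the offset 0xd, the size 0x30 and the module helloworld. A core-kernel entry such as "load_module+0x13c1/0x1b80" has no module suffix, so the module field stays empty.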
The first step is to disassemble the compiled object file, helloworld.o in our case, with objdump. The following command saves the disassembly to the file err.txt:
objdump -D helloworld.o > err.txt
The content of err.txt is as follows:
helloworld.o:     file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <init_module>:
   0:   e8 00 00 00 00          callq  5 <init_module+0x5>
   5:   55                      push   %rbp
   6:   48 c7 c7 00 00 00 00    mov    $0x0,%rdi
   d:   c7 04 25 00 00 00 00    movl   $0x74736574,0x0
  14:   74 65 73 74
  18:   31 c0                   xor    %eax,%eax
  1a:   48 89 e5                mov    %rsp,%rbp
  1d:   e8 00 00 00 00          callq  22 <init_module+0x22>
  22:   31 c0                   xor    %eax,%eax
  24:   5d                      pop    %rbp
  25:   c3                      retq
  26:   66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
  2d:   00 00 00

0000000000000030 <cleanup_module>:
  30:   e8 00 00 00 00          callq  35 <cleanup_module+0x5>
  35:   55                      push   %rbp
  36:   48 c7 c7 00 00 00 00    mov    $0x0,%rdi
  3d:   31 c0                   xor    %eax,%eax
  3f:   48 89 e5                mov    %rsp,%rbp
  42:   e8 00 00 00 00          callq  47 <cleanup_module+0x17>
  47:   5d                      pop    %rbp
  48:   c3                      retq

[The remaining sections (.rodata.str1.1, .modinfo, .comment, __mcount_loc) hold data rather than instructions, so their disassembly is not meaningful and is omitted here.]
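From the oops message we know that the fault is at offset 0xd into hello_init. In the disassembly, hello_init appears under the name init_module, because hello_init is the module's initialization function and module_init() makes it available under that name; both names refer to the same address. Had the error been in some other function, the disassembly would show that function's own symbol and start address, and we would add the offset to that address. Here init_module starts at address 0, so the faulting address is 0xd, which is the movl instruction that stores "test" (0x74736574) to address 0. Next we use addr2line to find the corresponding line of code:
addr2line -C -f -e helloworld.o d
This command prints the function name and the source line number. Note that addr2line can only report a line number if the object file was built with debugging information. That is how to locate the line of a driver crash from the oops message.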
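The address arithmetic described above can be sketched in a few lines of user-space C. The function names fault_address and format_addr2line_cmd are hypothetical, made up for this illustration; the only real tool involved is addr2line, whose command line is built as a string.

```c
#include <stdio.h>

/*
 * Address to hand to addr2line: the address at which the symbol starts in
 * the object file (taken from the disassembly) plus the offset from the oops.
 */
unsigned long fault_address(unsigned long symbol_start, unsigned long oops_offset)
{
    return symbol_start + oops_offset;
}

/*
 * Write the matching addr2line command line into buf.
 * Returns what snprintf returns: the number of characters the full
 * command needs, not counting the terminating NUL.
 */
int format_addr2line_cmd(char *buf, size_t len, const char *object,
                         unsigned long symbol_start, unsigned long oops_offset)
{
    return snprintf(buf, len, "addr2line -C -f -e %s %lx",
                    object, fault_address(symbol_start, oops_offset));
}
```

For our example the symbol starts at 0 and the offset is 0xd, so the resulting command is "addr2line -C -f -e helloworld.o d". For a hypothetical fault at cleanup_module+0x12, the disassembly shows cleanup_module starting at 0x30, so the address to pass would be 0x42.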
Other Debugging Tools
The method above finds the line of code behind a crash from the oops message, and it is meant for serious errors that bring the kernel down. The more commonly used debugging technique is to print information with printk. printk is used much like printf; the main difference is the log level, which was covered in part 2 of this series (Linux Device Drivers, Part 2: Building and Running Modules). Also keep in mind that heavy use of printk can slow the system down significantly, so use it with care.
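As a reminder of how the log level travels with the message, here is a minimal user-space sketch in C. The macros below are local stand-ins that mirror how recent kernels (including the 3.16 kernel used here) define the level prefixes, as an ASCII SOH byte followed by a digit; the functions message_level and shown_on_console are hypothetical and only imitate the kernel's console filtering rule. Treating an unprefixed message as level 4 is an assumption based on the usual default.

```c
#include <stdio.h>

/* Local stand-ins for the kernel's level prefixes (illustration only). */
#define KERN_SOH     "\001"
#define KERN_ALERT   KERN_SOH "1"
#define KERN_WARNING KERN_SOH "4"
#define KERN_DEBUG   KERN_SOH "7"

/* Assumed default level for messages without a prefix. */
#define DEFAULT_MESSAGE_LEVEL 4

/* Return the level (0 = most urgent, 7 = debug) encoded at the start of msg. */
int message_level(const char *msg)
{
    if (msg[0] == '\001' && msg[1] >= '0' && msg[1] <= '7')
        return msg[1] - '0';
    return DEFAULT_MESSAGE_LEVEL;
}

/*
 * Imitate the console filter: a message reaches the console only if its
 * level is numerically lower than the current console log level.
 */
int shown_on_console(const char *msg, int console_loglevel)
{
    return message_level(msg) < console_loglevel;
}
```

Because the level is just a string prefix, printk(KERN_ALERT "Hello, world\n") has no comma after KERN_ALERT: the two string literals are concatenated by the compiler. With a console log level of 7, a KERN_ALERT message is shown and a KERN_DEBUG message is not, although both still end up in the kernel log buffer that dmesg reads.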
These two techniques are the ones I use most in my own work. There are others, such as the /proc file system, tracing tools and other user-space programs, GDB, and KGDB, but they are generally harder or less convenient to use, so they are not covered here.
With driver debugging covered, the next part will introduce concurrency and race conditions in Linux drivers. Stay tuned.
Linux Device Drivers, Part 4: Locating the Faulting Line of Code from an Oops, and Other Driver Debugging Methods