Debugging with Oops information and stack backtraces in Linux


=============================================================================

Original address: http://blog.micro-studios.com/?p=615#comment-1069

Impression after reading: this is more detailed than the coverage in LDD3.

November 29, 2012 11:24:17: With BUG_ON there is no need to disassemble ...

November 30, 2012 11:14:13: The callback function is missing.

The situation I encountered: http://my.csdn.net/my/code/detail/28858

=============================================================================

Source and format of Oops information
"Oops" is an exclamation of surprise. The information the kernel prints when it hits an error (such as an access to an illegal address) is called Oops information. It contains the following parts.
(1) A one-line text description of the error.
For example, "Unable to handle kernel NULL pointer dereference at virtual address 00000000" indicates what kind of error occurred.
(2) The sequence number of the Oops.
For example, 1st time, 2nd time, and so on. It looks like the following, with the number in brackets being the sequence number.
Internal error: Oops: 805 [#1]
(3) The names of the modules loaded in the kernel. The list may be empty. It begins with the following text.
Modules linked in:
(4) The number of the CPU on which the error occurred. On a single-processor system the number is 0, for example:
CPU: 0    Not tainted  (2.6.22.6 #36)
(5) The values of the CPU registers when the error occurred.
(6) The name and process ID of the current process, such as:
Process swapper (pid: 1, stack limit = 0xc0480258)
This does not mean that the error occurred in this process, only that this was the current process when the error occurred. The error may have occurred in kernel code, in a driver, or possibly in this process.
(7) Stack information.
(8) Stack backtrace information, from which the function call relationship can be seen, in the form:
Backtrace:
[<c001a6f4>] (s3c2410fb_probe+0x0/0x560) from [<c01bf4e8>] (platform_drv_probe+0x20/0x24)
...
(9) The machine code of the instructions near the faulting instruction (the faulting instruction is in parentheses), such as:
Code: e24cb004 e24dd010 e59f34e0 e3a07000 (e5873000)
Configuring the kernel to make the stack backtrace in Oops information more intuitive
Linux 2.6.22 has built-in debugging support that makes the printed Oops information more intuitive. The PC register value in the Oops information gives the address of the faulting instruction, and the stack backtrace gives the function call relationship at the time of the error. With these two pieces of information the error can be located quickly.
For the kernel to print stack backtrace information when an error occurs, the "-fno-omit-frame-pointer" option must be added when compiling the kernel, which is done by enabling CONFIG_FRAME_POINTER. Check the .config file in the kernel directory to make sure CONFIG_FRAME_POINTER is defined. If it is not, run "make menuconfig" to reconfigure the kernel. CONFIG_FRAME_POINTER may also be selected automatically by other configuration options.
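The check of the configuration file can be scripted. The sketch below is only an illustration: the function name config_enabled is invented for this example, and the sample text stands in for a real .config file, whose contents will differ from kernel to kernel.

```python
def config_enabled(config_text, option):
    """Return True if `option` is set to y in .config-style text."""
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as comments: "# CONFIG_FOO is not set"
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if name == option:
            return value == "y"
    return False

# Stand-in for the contents of the kernel's .config file
sample_config = """# CONFIG_DEBUG_INFO is not set
CONFIG_FRAME_POINTER=y
"""
print(config_enabled(sample_config, "CONFIG_FRAME_POINTER"))
```

In practice the text would be read from the .config file in the kernel source directory instead of the inline sample.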
An example of debugging the kernel using Oops information
1. Obtaining Oops information
This section deliberately introduces a bug into the LCD driver drivers/video/s3c2410fb.c by adding the following two lines at the beginning of the s3c2410fb_probe function:
int *ptest = NULL;
*ptest = 0x1234;
Recompile the kernel. It fails after startup and prints the following Oops information:
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = c0004000
[00000000] *pgd=00000000
Internal error: Oops: 805 [#1]
Modules linked in:
CPU: 0    Not tainted  (2.6.22.6 #36)
PC is at s3c2410fb_probe+0x18/0x560
LR is at platform_drv_probe+0x20/0x24
pc : [<c001a70c>]    lr : [<c01bf4e8>]    psr: a0000013
sp : c0481e64  ip : c0481ea0  fp : c0481e9c
r10: 00000000  r9 : c0024864  r8 : c03c420c
r7 : 00000000  r6 : c0389a3c  r5 : 00000000  r4 : c036256c
r3 : 00001234  r2 : 00000001  r1 : c04c0fc4  r0 : c0362564
Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  Segment kernel
Control: c000717f  Table: 30004000  DAC: 00000017
Process swapper (pid: 1, stack limit = 0xc0480258)
Stack: (0xc0481e64 to 0xc0482000)
1e60: c02b1f70 00000020 c03625d4 c036256c c036256c 00000000
1e80: c0389a3c c03c420c c0024864 00000000 c0481eac c0481ea0 c01bf4e8 c001a704
1ea0: c0481ed0 c0481eb0 c01bd5a8 c01bf4d8 c0362644 c036256c c01bd708 c0389a3c
1ec0: 00000000 c0481ee8 c0481ed4 c01bd788 c01bd4d0 00000000 c0481eec c0481f14
1ee0: c0481eec c01bc5a8 c01bd718 c038dac8 c038dac8 c03625b4 00000000 c0389a3c
1f00: c0389a44 c038d9dc c0481f24 c0481f18 c01bd808 c01bc568 c0481f4c c0481f28
1f20: c01bcd78 c01bd7f8 c0389a3c 00000000 00000000 c0480000 c0023ac8 00000000
1f40: c0481f60 c0481f50 c01bdc84 c01bcd0c 00000000 c0481f70 c0481f64 c01bf5fc
1f60: c01bdc14 c0481f80 c0481f74 c019479c c01bf5a0 c0481ff4 c0481f84 c0008c14
1f80: c0194798 e3c338ff e0222423 00000000 00000001 e2844004 00000000 00000000
1fa0: 00000000 c0481fb0 c002bf24 c0041328 00000000 00000000 c0008b40 c00476ec
1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
1fe0: 00000000 00000000 00000000 c0481ff8 c00476ec c0008b50 c03cdf50 c0344178
Backtrace:
[<c001a6f4>] (s3c2410fb_probe+0x0/0x560) from [<c01bf4e8>] (platform_drv_probe+0x20/0x24)
[<c01bf4c8>] (platform_drv_probe+0x0/0x24) from [<c01bd5a8>] (driver_probe_device+0xe8/0x18c)
[<c01bd4c0>] (driver_probe_device+0x0/0x18c) from [<c01bd788>] (__driver_attach+0x80/0xe0)
r8:00000000 r7:c0389a3c r6:c01bd708 r5:c036256c r4:c0362644
[<c01bd708>] (__driver_attach+0x0/0xe0) from [<c01bc5a8>] (bus_for_each_dev+0x50/0x84)
r5:c0481eec r4:00000000
[<c01bc558>] (bus_for_each_dev+0x0/0x84) from [<c01bd808>] (driver_attach+0x20/0x28)
r7:c038d9dc r6:c0389a44 r5:c0389a3c r4:00000000
[<c01bd7e8>] (driver_attach+0x0/0x28) from [<c01bcd78>] (bus_add_driver+0x7c/0x1b4)
[<c01bccfc>] (bus_add_driver+0x0/0x1b4) from [<c01bdc84>] (driver_register+0x80/0x88)
[<c01bdc04>] (driver_register+0x0/0x88) from [<c01bf5fc>] (platform_driver_register+0x6c/0x88)
r4:00000000
[<c01bf590>] (platform_driver_register+0x0/0x88) from [<c019479c>] (s3c2410fb_init+0x14/0x1c)
[<c0194788>] (s3c2410fb_init+0x0/0x1c) from [<c0008c14>] (kernel_init+0xd4/0x28c)
[<c0008b40>] (kernel_init+0x0/0x28c) from [<c00476ec>] (do_exit+0x0/0x760)
Code: e24cb004 e24dd010 e59f34e0 e3a07000 (e5873000)
Kernel panic - not syncing: Attempted to kill init!
2. Analyzing Oops information
(1) Identify the cause of the error.
The message "Unable to handle kernel NULL pointer dereference at virtual address 00000000" shows that the kernel dereferenced a NULL pointer, which caused an illegal address access.
(2) Find the function call relationship from the stack backtrace information.
When the kernel crashes, the PC register tells you which function and which instruction it crashed in. In many cases, however, the error was introduced by the caller, so it is important to find the call relationship of the functions.
Part of the stack backtrace information is as follows:
[<c001a6f4>] (s3c2410fb_probe+0x0/0x560) from [<c01bf4e8>] (platform_drv_probe+0x20/0x24)
This line has two halves. It indicates that the platform_drv_probe function in the second half calls the s3c2410fb_probe function in the first half.
The first half means: "c001a6f4" is the address at offset 0 from the start of the s3c2410fb_probe function, whose size is 0x560.
The second half means: "c01bf4e8" is the address at offset 0x20 from the start of the platform_drv_probe function, whose size is 0x24.
In addition, "[<c01bf4e8>]" in the second half is the return address after s3c2410fb_probe finishes executing.
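The two halves of such a line can be split mechanically. The following sketch, with a helper name and regular expression chosen only for this example, extracts the address, function name, offset and size of each half, then subtracts the offset from the address to recover the start address of each function.

```python
import re

# One "[<address>] (function+0xoffset/0xsize)" group
ENTRY = re.compile(
    r"\[<([0-9a-f]+)>\]\s*\((\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)\)", re.I)

def parse_backtrace_line(line):
    """Return [(address, function, offset, size), ...] for callee and caller."""
    return [(int(addr, 16), name, int(off, 16), int(size, 16))
            for addr, name, off, size in ENTRY.findall(line)]

line = ("[<c001a6f4>] (s3c2410fb_probe+0x0/0x560) from "
        "[<c01bf4e8>] (platform_drv_probe+0x20/0x24)")
callee, caller = parse_backtrace_line(line)

# Start of a function = printed address - offset within the function
print(callee[1], hex(callee[0] - callee[2]))
print(caller[1], hex(caller[0] - caller[2]))
```

For the caller this gives c01bf4c8, which is the address at which platform_drv_probe appears as the callee in the next backtrace line.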
For stack backtrace information like the following, r8 to r4 give the values of some registers at the point where the driver_probe_device function had just been called.
[<c01bd4c0>] (driver_probe_device+0x0/0x18c) from [<c01bd788>] (__driver_attach+0x80/0xe0)
r8:00000000 r7:c0389a3c r6:c01bd708 r5:c036256c r4:c0362644
From the stack backtrace information above, the function call relationship at the time of the kernel error is as follows. The crash finally occurs inside the s3c2410fb_probe function.
do_exit ->
kernel_init ->
s3c2410fb_init ->
platform_driver_register ->
driver_register ->
bus_add_driver ->
driver_attach ->
bus_for_each_dev ->
__driver_attach ->
driver_probe_device ->
platform_drv_probe ->
s3c2410fb_probe
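A call chain like this can be derived from the backtrace lines by reading each "callee from caller" pair and reversing the order. The sketch below does this for the first few lines of the backtrace. The function name call_chain and the regular expression are assumptions for this example, and the register lines between entries are skipped.

```python
import re

# "(callee+0x.../0x...) from [<addr>] (caller+0x..."
PAIR = re.compile(
    r"\((\w+)\+0x[0-9a-f]+/0x[0-9a-f]+\) from \[<[0-9a-f]+>\] \((\w+)\+", re.I)

def call_chain(backtrace_lines):
    """Return function names from the outermost caller to the crashing function."""
    chain = []
    for line in backtrace_lines:
        m = PAIR.search(line)
        if not m:
            continue  # register lines such as "r5:... r4:..."
        callee, caller = m.groups()
        if not chain:
            chain.append(callee)  # the innermost (crashing) function
        chain.append(caller)
    chain.reverse()
    return chain

lines = [
    "[<c001a6f4>] (s3c2410fb_probe+0x0/0x560) from [<c01bf4e8>] (platform_drv_probe+0x20/0x24)",
    "[<c01bf4c8>] (platform_drv_probe+0x0/0x24) from [<c01bd5a8>] (driver_probe_device+0xe8/0x18c)",
    "[<c01bd4c0>] (driver_probe_device+0x0/0x18c) from [<c01bd788>] (__driver_attach+0x80/0xe0)",
    "r8:00000000 r7:c0389a3c r6:c01bd708 r5:c036256c r4:c0362644",
]
print(" -> ".join(call_chain(lines)))
```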
(3) Determine the error location from the value of the PC register.
The register values at the time of the error in the Oops information above are as follows:
PC is at s3c2410fb_probe+0x18/0x560
LR is at platform_drv_probe+0x20/0x24
pc : [<c001a70c>]    lr : [<c01bf4e8>]    psr: a0000013
...
"PC is at s3c2410fb_probe+0x18/0x560" indicates that the faulting instruction is the instruction at offset 0x18 in the s3c2410fb_probe function.
"pc : [<c001a70c>]" indicates that the address of the faulting instruction is c001a70c (hexadecimal).
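These two pieces of information are consistent with each other: subtracting the offset 0x18 from the PC value gives the start address of the function, which is the same address the backtrace prints for s3c2410fb_probe+0x0. A small sketch of that arithmetic follows. The helper name function_start is made up for this example.

```python
import re

def function_start(pc_is_at, pc_value):
    """From 'PC is at sym+0xoff/0xsize' and the pc value, return
    (symbol name, start address of the function, size of the function)."""
    m = re.search(r"(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)", pc_is_at, re.I)
    if m is None:
        raise ValueError("no symbol+offset/size found")
    name = m.group(1)
    offset = int(m.group(2), 16)
    size = int(m.group(3), 16)
    return name, pc_value - offset, size

name, start, size = function_start(
    "PC is at s3c2410fb_probe+0x18/0x560", 0xc001a70c)
# The function occupies addresses from start up to, but not including, start + size
print(name, hex(start), hex(start + size))
```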
(4) Locate the problem by combining the kernel source code with the disassembly.
First generate vmlinux.dis, the disassembly of the kernel, by executing the following commands:
$ cd /work/system/linux-2.6.22.6
$ arm-linux-objdump -d vmlinux > vmlinux.dis
The part of the assembly code near the error address c001a70c is as follows:
c001a6f4 <s3c2410fb_probe>:
c001a6f4: e1a0c00d mov ip, sp
c001a6f8: e92ddff0 stmdb sp!, {r4, r5, r6, r7, r8, r9, sl, fp, ip, lr, pc}
c001a6fc: e24cb004 sub fp, ip, #4 ; 0x4
c001a700: e24dd010 sub sp, sp, #16 ; 0x10
c001a704: e59f34e0 ldr r3, [pc, #1248] ; c001abec <.init+0x1284c>
c001a708: e3a07000 mov r7, #0 ; 0x0
c001a70c: e5873000 str r3, [r7]    <=========== faulting instruction
c001a710: e59030fc ldr r3, [r0, #252]
The faulting instruction is "str r3, [r7]": it stores the value of the r3 register to memory at the address held in the r7 register.
According to the register values in the Oops information, r3 is 0x00001234 and r7 is 0. Address 0 is not accessible, so the error occurred.
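The faulting word in the "Code:" line, e5873000, can also be decoded by hand without objdump. The sketch below is a deliberately minimal decoder for one ARM instruction form only, LDR/STR with a 12-bit immediate offset. It ignores the condition field and the pre/post-indexing and write-back bits, so it is an illustration of the bit layout and not a general disassembler. The function name is invented for this example.

```python
def decode_single_data_transfer(word):
    """Decode an ARM LDR/STR with a 12-bit immediate offset.

    Other instruction forms are rejected. The condition field and the
    P/W (indexing) bits are ignored in this sketch.
    """
    if (word >> 26) & 0x3 != 0b01 or (word >> 25) & 1:
        raise ValueError("not an immediate-offset LDR/STR")
    load = (word >> 20) & 1   # L bit: 1 = load, 0 = store
    byte = (word >> 22) & 1   # B bit: 1 = byte transfer
    up = (word >> 23) & 1     # U bit: 1 = add offset, 0 = subtract
    imm = word & 0xFFF
    return {
        "op": ("ldr" if load else "str") + ("b" if byte else ""),
        "rd": (word >> 12) & 0xF,   # register being loaded or stored
        "rn": (word >> 16) & 0xF,   # base address register
        "offset": imm if up else -imm,
    }

fault = decode_single_data_transfer(0xe5873000)
print(fault)

# With r7 = 0 (from the register dump), the store targets address 0
r7 = 0x00000000
print(hex(r7 + fault["offset"]))
```

The same function decodes e59f34e0, two instructions earlier, as a load into r3 from r15 (pc) with offset 1248, which agrees with the disassembly listing.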
Part of the C code of the s3c2410fb_probe function is as follows:
static int __init s3c2410fb_probe(struct platform_device *pdev)
{
    struct s3c2410fb_info *info;
    struct fb_info *fbinfo;
    struct s3c2410fb_hw *mregs;
    int ret;
    int irq;
    int i;
    u32 lcdcon1;
    int *ptest = NULL;
    *ptest = 0x1234;
    mach_info = pdev->dev.platform_data;
Combined with the disassembly, it is easy to see that "*ptest = 0x1234;" causes the error, because ptest is NULL.
In most cases it is not this easy to map the disassembly back to the C code. Doing so requires a strong ability to read assembly. Knowing the call relationship of the functions from the stack backtrace helps to locate many problems.

Manual stack backtracing using the Oops stack information
As mentioned earlier, the PC register value in the Oops information identifies the function and the instruction where the crash occurred. But the error may have been introduced by the caller, so the call relationship of the functions also needs to be found.
Because the kernel here is configured with CONFIG_FRAME_POINTER, stack backtrace information is printed when an Oops occurs. If the kernel is not configured with CONFIG_FRAME_POINTER, you can analyze the stack information yourself to find the call relationship.
1. The role of the stack
A program consists of a code segment, a data segment, a BSS segment, a heap, and a stack. The data segment stores global data with a non-zero initial value, the BSS segment stores global data with an initial value of 0, the heap is used for dynamic memory allocation, and the stack is used to implement function calls and to store local variables.
Before a called function executes its body, it saves the values of some registers on the stack, including the return address register LR. If you know the saved LR value, you know who the caller is. By finding, in the stack information, the LR value saved by each function in turn, you can identify every calling function. This is the principle of stack backtracing.
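When the kernel is built with frame pointers, the saved LR values can be found by following the chain of saved FP values. The sketch below walks that chain over part of the stack dump from the Oops above, starting from the fp value c0481e9c in its register dump. It assumes the frame layout produced by the "stmdb sp!, {..., fp, ip, lr, pc}" prologue seen in the disassembly of this kernel, where fp points at the saved pc, the saved lr is at fp - 4 and the saved fp is at fp - 12. The function names and the stack base parameter are choices made for this example.

```python
def load_stack(lines, base):
    """Map address -> 32-bit word from 'offset: w0 w1 ...' dump lines.

    `base` is the address that the printed 16-bit offsets are relative to.
    Only lines that start on a 32-byte boundary with all words present
    are handled correctly.
    """
    memory = {}
    for line in lines:
        offset, words = line.split(":")
        start = base + int(offset, 16)
        for i, word in enumerate(words.split()):
            memory[start + 4 * i] = int(word, 16)
    return memory

def walk_frame_pointers(memory, fp):
    """Follow the saved-fp chain and collect the saved lr of each frame."""
    return_addresses = []
    while fp in memory and (fp - 12) in memory:
        return_addresses.append(memory[fp - 4])  # saved lr = return address
        next_fp = memory[fp - 12]                # saved fp = caller's frame
        if next_fp <= fp:                        # the stack grows downwards
            break
        fp = next_fp
    return return_addresses

dump = [
    "1e80: c0389a3c c03c420c c0024864 00000000 c0481eac c0481ea0 c01bf4e8 c001a704",
    "1ea0: c0481ed0 c0481eb0 c01bd5a8 c01bf4d8 c0362644 c036256c c01bd708 c0389a3c",
    "1ec0: 00000000 c0481ee8 c0481ed4 c01bd788 c01bd4d0 00000000 c0481eec c0481f14",
    "1ee0: c0481eec c01bc5a8 c01bd718 c038dac8 c038dac8 c03625b4 00000000 c0389a3c",
]
memory = load_stack(dump, 0xc0480000)
for address in walk_frame_pointers(memory, 0xc0481e9c):
    print(hex(address))
```

The addresses printed are the "from" addresses of the first four backtrace entries. The walk stops when the next frame pointer falls outside the lines that were loaded.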
2. Stack backtrace example analysis
Still taking the LCD driver above as the example, the analysis uses the stack information from the Oops information above, which is as follows:
Stack: (0xc0481e64 to 0xc0482000)
1e60: c02b1f70 00000020 c03625d4 c036256c c036256c 00000000
1e80: c0389a3c c03c420c c0024864 00000000 c0481eac c0481ea0 c01bf4e8 c001a704
1ea0: c0481ed0 c0481eb0 c01bd5a8 c01bf4d8 c0362644 c036256c c01bd708 c0389a3c
1ec0: 00000000 c0481ee8 c0481ed4 c01bd788 c01bd4d0 00000000 c0481eec c0481f14
1ee0: c0481eec c01bc5a8 c01bd718 c038dac8 c038dac8 c03625b4 00000000 c0389a3c
...
(1) From the PC register value, find the first function, determine its stack size, and determine the function that called it.
The Oops information gives the PC value c001a70c. Looking it up in the kernel disassembly vmlinux.dis shows that it lies within the s3c2410fb_probe function.
From the assembly code at the beginning of this function, the size of its stack frame and the position of the saved LR value in the stack can be determined. The code is as follows:
c001a6f4 <s3c2410fb_probe>:
c001a6f4: e1a0c00d mov ip, sp
c001a6f8: e92ddff0 stmdb sp!, {r4, r5, r6, r7, r8, r9, sl, fp, ip, lr, pc}
c001a6fc: e24cb004 sub fp, ip, #4 ; 0x4
c001a700: e24dd010 sub sp, sp, #16 ; 0x10
...
c001a70c: e5873000 str r3, [r7]    <= the instruction at the PC value c001a70c
...
The 11 registers {r4, r5, r6, r7, r8, r9, sl, fp, ip, lr, pc} are saved on the stack, and the instruction "sub sp, sp, #16" moves the stack pointer down another 16 bytes, so the stack frame of this function is (11 x 4 + 16) bytes, that is, 15 words of 32 bits each.
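The same frame size can be computed from the machine code of the two prologue instructions. In an ARM block transfer instruction such as stmdb, the low 16 bits are a register list with one bit per register, so counting the set bits gives the number of registers pushed. The sketch below does that and extracts the immediate of "sub sp, sp, #imm". Both function names are invented for this example, and only these two instruction forms are handled.

```python
def stm_register_count(word):
    """Number of registers in the register list of an ARM LDM/STM instruction."""
    if (word >> 25) & 0x7 != 0b100:
        raise ValueError("not a block data transfer instruction")
    return bin(word & 0xFFFF).count("1")

def sub_sp_immediate(word):
    """Immediate value of 'sub sp, sp, #imm' (8-bit value with 4-bit rotation)."""
    if (word & 0x0FFFF000) != 0x024DD000:
        raise ValueError("not 'sub sp, sp, #imm'")
    imm8 = word & 0xFF
    rotate = 2 * ((word >> 8) & 0xF)
    # Rotate imm8 right by `rotate` bits within 32 bits
    return ((imm8 >> rotate) | (imm8 << (32 - rotate))) & 0xFFFFFFFF

saved_registers = stm_register_count(0xe92ddff0)  # stmdb sp!, {r4-r9, sl, fp, ip, lr, pc}
local_bytes = sub_sp_immediate(0xe24dd010)        # sub sp, sp, #16
frame_bytes = 4 * saved_registers + local_bytes
print(saved_registers, local_bytes, frame_bytes, frame_bytes // 4)
```

Applied to e92dd800, the stmdb at the start of platform_drv_probe, stm_register_count returns 4.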
The first 15 words of the stack information are the stack contents of this function. The registers they hold are listed below.
1e60: c02b1f70 00000020 c03625d4 c036256c c036256c 00000000
                                 r4       r5       r6
1e80: c0389a3c c03c420c c0024864 00000000 c0481eac c0481ea0 c01bf4e8 c001a704
      r7       r8       r9       sl       fp       ip       lr       pc
The LR value here is c01bf4e8. It is the return address after the function s3c2410fb_probe finishes executing, that is, an address inside the function that called it. The backtrace procedure of this step is then repeated using this LR value.
(2) From the LR register value, find the calling function, determine its stack size, and determine the next caller up.
Looking up the LR value (c01bf4e8) obtained in the previous step in the kernel disassembly vmlinux.dis shows that it lies within the platform_drv_probe function.
From the disassembly at the beginning of this function, the size of its stack frame and the position of the saved LR value in the stack can be determined. The code is as follows:
c01bf4c8 <platform_drv_probe>:
c01bf4c8: e1a0c00d mov ip, sp
c01bf4cc: e92dd800 stmdb sp!, {fp, ip, lr, pc}
...
c01bf4e8: e89da800 ldmia sp, {fp, sp, pc}    <= the instruction at the LR value (c01bf4e8)
The 4 registers {fp, ip, lr, pc} are saved on the stack, so the stack frame of this function is 4 words. In the Oops stack information, the 4 words following the stack frame of the previous function s3c2410fb_probe are the stack contents of the function platform_drv_probe, as shown below:
1ea0: c0481ed0 c0481eb0 c01bd5a8 c01bf4d8
      fp       ip       lr       pc
The LR value here is c01bd5a8. It is the return address after the function platform_drv_probe finishes executing, that is, an address inside the next caller up. Using this LR value, repeat the lookup of this step until the stack information has been fully parsed or no further analysis is needed. In this way all the function call relationships can be found.
Some functions are simple and do not use the stack (the SP value does not change within the function), or do not save the LR value on the stack. Such cases require flexibility on the reader's part, and a strong ability to read assembly is the key.
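The two steps above can be sketched as a loop over frame sizes. The example below assumes that each frame ends with the saved lr and pc words, as in the two prologues shown, so the saved LR is the second-to-last word of the frame. The frame sizes (15 and 4 words) are the ones worked out by hand above, and the function name manual_backtrace is made up for this example. In a real analysis each size has to be read from the disassembly of the function that the previous LR value points into.

```python
def manual_backtrace(memory, sp, frame_sizes):
    """Return the saved LR of each frame, innermost first.

    memory: dict mapping address -> 32-bit word
    sp: stack pointer at the time of the crash
    frame_sizes: frame size of each function in 32-bit words
    Assumes each frame ends with the saved lr and pc words.
    """
    return_addresses = []
    for words in frame_sizes:
        top = sp + 4 * words                       # first address above this frame
        return_addresses.append(memory[top - 8])   # saved lr, just below saved pc
        sp = top                                   # the caller's frame starts here
    return return_addresses

# Stack words from the Oops above, starting at address 0xc0481e80
words = ("c0389a3c c03c420c c0024864 00000000 c0481eac c0481ea0 c01bf4e8 c001a704 "
         "c0481ed0 c0481eb0 c01bd5a8 c01bf4d8").split()
memory = {0xc0481e80 + 4 * i: int(w, 16) for i, w in enumerate(words)}

# sp is c0481e64; frames: s3c2410fb_probe (15 words), platform_drv_probe (4 words)
for address in manual_backtrace(memory, 0xc0481e64, [15, 4]):
    print(hex(address))
```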

Reposted from: http://blog.csdn.net/kangear/article/details/8217329
