Linux Kernel oops

Source: Internet
Author: User
Tags: tainted

What is an oops? Linguistically, "oops" is an interjection: it is what you say after a minor accident or an embarrassing slip. "Oops, sorry, I didn't mean to break your cup." That is what oops means.

What is an oops in Linux kernel development? It is essentially the same thing, except that the speaker is now the Linux kernel. When a serious error occurs, such as an invalid memory access, the kernel apologizes to us: "Oh, sorry, I screwed up." It prints an oops message that shows the current register state, the stack contents, and the complete call trace, which helps us locate the error.

Next, let's look at an example. To keep the focus on the oops itself, the only thing this example does is dereference a null pointer.
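The module source that this example refers to is missing from this copy of the article. A minimal sketch of what such a module might look like is given below; it is a reconstruction, not the author's listing. The name hello_init and the store `*p = 1` come from the oops and the disassembly quoted later in the article; hello_exit, the choice of headers, and the layout (arranged so that the faulting store falls on the eighth line) are assumptions.

```c
#include <linux/init.h>
#include <linux/module.h>

static int __init hello_init(void)
{
	int *p = NULL;		/* deliberately invalid pointer */

	*p = 1;			/* write through NULL: this is what triggers the oops */
	return 0;
}

/* Hypothetical empty exit handler so the module can be unloaded. */
static void __exit hello_exit(void)
{
}

module_init(hello_init);
module_exit(hello_exit);

MODULE_LICENSE("GPL");
```

Built as hello.ko with an ordinary out-of-tree kbuild Makefile, a module like this should fault inside hello_init as soon as it is loaded. The exact instructions a given compiler emits may differ from the disassembly quoted later.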

Obviously, the error is caused by line 8.

Next, we compile the module and load it into the kernel with insmod. As expected, an oops appears.

[100.731783] f1b9ff88 c0101131 f82cf040 c076d240 fffffffc f82cf040 0072cff4 f82d2000

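The oops first describes the bug and then points to its location: "IP: [<f82d2005>] hello_init+0x5/0x11 [hello]". In other words, the faulting instruction is at offset 0x5 into hello_init, a function 0x11 bytes long, which belongs to the module hello.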
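To make the layout of that line concrete, here is a small user-space sketch that splits it into its parts. The struct oops_ip and the function parse_oops_ip are hypothetical names made up for this illustration, and the sketch assumes the exact "IP: [<addr>] symbol+offset/size [module]" shape quoted above; other kernel versions print the location differently.

```c
#include <stdio.h>
#include <string.h>

/* Fields of an oops "IP:" line; names are illustrative only. */
struct oops_ip {
    unsigned long addr;    /* address of the faulting instruction */
    char symbol[64];       /* nearest symbol, e.g. hello_init */
    unsigned long offset;  /* byte offset of the fault inside the symbol */
    unsigned long size;    /* total size of the symbol in bytes */
    char module[64];       /* module that owns the symbol */
};

/* Returns 0 on success, -1 if the line does not have the expected shape. */
int parse_oops_ip(const char *line, struct oops_ip *out)
{
    memset(out, 0, sizeof(*out));

    /* %lx accepts an optional 0x prefix, so "0x5" and "f82d2005" both parse. */
    int n = sscanf(line, "IP: [<%lx>] %63[^+]+%lx/%lx [%63[^]]]",
                   &out->addr, out->symbol, &out->offset, &out->size,
                   out->module);

    return n == 5 ? 0 : -1;
}
```

For the line above, addr minus offset gives f82d2000, the address where hello_init was loaded. The same value is the last word of the stack line quoted from the oops.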

Here we need a helper tool, objdump, to analyze the problem. objdump can disassemble object files. The command is:

objdump -S hello.o

The following is the disassembly of hello.o, with the C source interleaved (this requires the module to be built with debug information), which makes it easy to read.

According to the oops, the error is at hello_init+0x5. In the disassembly, the instruction at that offset is:

5: c7 05 00 00 00 00 01 00 00 00    movl $0x1,0x0

This instruction stores the value 1 to address 0, which is of course an invalid operation.

We can also see that the corresponding C code is:

*p = 1;

Bingo! With the help of the oops, we quickly located the problem.

Let's go back to the oops above and see what other useful information the Linux kernel left for us.

Oops: 0002 [#1]

Here, 0002 is the oops error code (a write fault that occurred in kernel space), and #1 means that this is the first oops since the system booted.

The oops error code is defined differently depending on the cause of the error. The code in this article comes from a page fault; if an oops you encounter does not match that definition, it is best to look it up in the kernel source.
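The definition the article points to is missing from this copy. As a sketch, and assuming an x86 page fault like the one in this example, the low bits of the error code can be decoded as follows. The bit meanings are those of the x86 page-fault error code; the struct pf_info and the function names are hypothetical, and the number after "Oops:" is read as hexadecimal.

```c
#include <stdbool.h>
#include <stdio.h>

/* Decoded view of an x86 page-fault error code; names are illustrative. */
struct pf_info {
    bool protection;   /* bit 0: 0 = page not present, 1 = protection fault */
    bool write;        /* bit 1: 0 = read access,      1 = write access */
    bool user;         /* bit 2: 0 = kernel mode,      1 = user mode */
    bool reserved;     /* bit 3: reserved bit set in a page-table entry */
    bool instr_fetch;  /* bit 4: fault was an instruction fetch */
};

struct pf_info decode_pf_error(unsigned long code)
{
    struct pf_info info = {
        .protection  = (code >> 0) & 1,
        .write       = (code >> 1) & 1,
        .user        = (code >> 2) & 1,
        .reserved    = (code >> 3) & 1,
        .instr_fetch = (code >> 4) & 1,
    };
    return info;
}

/* Prints a one-line, human-readable summary of the error code. */
void print_pf_error(unsigned long code)
{
    struct pf_info i = decode_pf_error(code);

    printf("%04lx: %s, %s, %s mode%s%s\n", code,
           i.protection ? "protection fault" : "page not present",
           i.write ? "write" : "read",
           i.user ? "user" : "kernel",
           i.reserved ? ", reserved bit set" : "",
           i.instr_fetch ? ", instruction fetch" : "");
}
```

Calling print_pf_error(0x0002) describes the code from this article as a write to a page that is not present, performed in kernel mode, which matches the null pointer store in hello_init.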

Sometimes the oops also prints Tainted information, which indicates why the kernel is considered "tainted". The specific definition is as follows:
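The list of flags is missing from this copy of the article. As a rough sketch, the taint state can also be read as a bit mask (for example from /proc/sys/kernel/tainted) and turned into the flag letters. The letter order below follows the kernel's tainted-kernels documentation for bits 0 to 15 as far as I know it; older kernels, including the one this article was written against, define fewer flags. The function name decode_taint is hypothetical.

```c
#include <stddef.h>

/*
 * Letters for taint bits 0..15, lowest bit first:
 * P proprietary module, F forced load, S out-of-spec system, R forced unload,
 * M machine check, B bad page, U set by userspace, D kernel died (oops/BUG),
 * A ACPI table overridden, W warning issued, C staging driver, I firmware
 * workaround, O out-of-tree module, E unsigned module, L soft lockup,
 * K live patched.
 */
static const char taint_letters[] = "PFSRMBUDAWCIOELK";

/*
 * Writes the letters of all set bits into out (NUL-terminated, truncated
 * to fit len) and returns the number of letters written.
 */
size_t decode_taint(unsigned long mask, char *out, size_t len)
{
    size_t n = 0;

    for (size_t bit = 0; bit < sizeof(taint_letters) - 1; bit++) {
        if ((mask & (1UL << bit)) && n + 1 < len)
            out[n++] = taint_letters[bit];
    }
    if (len > 0)
        out[n] = '\0';
    return n;
}
```

A mask of 4097 (bits 0 and 12), for instance, decodes to "PO": a proprietary module and an out-of-tree module were loaded. Unlike the kernel's own output, this sketch only lists the flags that are set; it does not print "G" or blanks for the ones that are clear.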

This Tainted information is mainly intended for kernel developers. If you hit an oops while using Linux and send its contents to the kernel developers for debugging, the Tainted information tells them what environment the kernel was running in when the oops occurred. If we are only debugging our own driver, this information is of little use.

The example in this article is very simple: the oops does not bring the system down, so we can read the complete message with dmesg. In many cases, however, the system goes down together with the oops. The error messages then never reach a log file, and once the power is turned off they are gone. We can only record them some other way: copy them by hand or take a photograph.

Even worse, if the oops is long, it does not fit on one screen. How can we see the whole thing? The first method is to use the vga parameter in GRUB to select a higher resolution so that more text fits on the screen. Obviously, this only helps up to a point. The second method is to use two machines and send the oops output of the machine being debugged to the screen of the host machine over a serial port. However, most laptops no longer have serial ports, so this solution is also quite limited. The third method is to use the kernel dump tool kdump to save the memory and CPU register contents at the time of the oops to a file, and then analyze the problem with GDB.

The problems that come up during kernel driver development are many and strange, and the debugging methods are just as diverse. An oops is a hint from the Linux kernel, and we should make good use of it.
