Kernel-State Call Trace
The kernel state has three types of errors: bug, oops, and panic.
A bug is a minor error, such as calling a sleeping function while holding a spin_lock, which can lead to a deadlock.
An oops means that a user process has hit an error and must be killed. If that process holds locks at this point, those locks are never released, which leaves the system potentially unstable. Note that an oops by itself does not crash the system; it triggers a panic, and with it a system crash, only when the panic_on_oops option is enabled.
A panic is a serious error that means the entire system has crashed.
Oops
When Linux hits an oops, it enters the die function in traps.c.
int die(const char *str, struct pt_regs *regs, long err)
... ...
show_regs(regs);
The void show_regs(struct pt_regs *regs) function calls show_stack, which prints the kernel-state stack. The principle is as follows:
Find the current stack frame from the registers. Each frame holds a pointer to the stack frame of the calling function; follow that pointer back to the previous frame, and so on.
In the PowerPC EABI standard, the frame header address of the current stack frame (the bottom of the frame, not the top of the stack) is stored in register GPR1. In the stack space that GPR1 points to, the first DWORD is the frame header pointer of the calling function (the back chain word), and the second DWORD is the return address of the current function in its caller (the LR save word). Backtracking level by level in this way produces the whole call dump. Besides this approach, the built-in function __builtin_frame_address should also work in theory, although it does not appear to be used in the kernel. (The ftrace module in 2.6.29 uses the __builtin_return_address function.)
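The level-by-level backtracking described above can be sketched in portable C against a simulated frame layout. This is a simplified model, not kernel code: the names ppc_frame and walk_back_chain are hypothetical, and the layout (back chain word first, LR save word second) simply follows the description above.

```c
#include <stddef.h>

/* Hypothetical model of a frame header as described above:
 * the first word is the back chain (pointer to the caller's frame header),
 * the second word is the LR save word (a return address). */
struct ppc_frame {
    struct ppc_frame *back_chain;
    unsigned long lr_save;
};

/* Walk the back chain starting at sp (the value GPR1 would hold) and
 * store up to max return addresses in out. Returns how many were stored.
 * A NULL back chain marks the outermost frame in this model. */
static size_t walk_back_chain(const struct ppc_frame *sp,
                              unsigned long *out, size_t max)
{
    size_t n = 0;

    while (sp != NULL && n < max) {
        out[n++] = sp->lr_save;   /* record this level's return address */
        sp = sp->back_chain;      /* step back to the caller's frame */
    }
    return n;
}
```

The max argument bounds the walk, so a corrupted chain that loops back on itself cannot make the loop run forever.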
When show_regs produces the call trace, it only prints the stack information with printk. If the system has no terminal, the kernel must be modified to save the stack information somewhere else as required.
For example, you can reserve a region of the system's flash dedicated to saving this output. Then write a kernel module and add a callback to the die function. Whenever the callback is invoked, the custom kernel module is notified, and the module can save the call stack and other information of interest to the reserved flash region. Note that the kernel may be unstable during an oops, so to make sure the information is written to flash correctly, the flash write function should avoid interrupts and use polling instead. Functions that may block, such as semaphore operations and sleep, should not be used either.
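The polling approach can be illustrated with a small user-space simulation. Everything here is an assumption made for illustration: fake_flash, flash_ready, and flash_write_polled are hypothetical names, and the busy counter only stands in for a real controller's status register.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a flash device: a buffer plus a counter
 * that models how many status polls remain before the device is ready. */
struct fake_flash {
    uint8_t data[256];
    size_t pos;
    int busy_polls;
};

/* Poll the (simulated) status flag once; returns 1 when ready. */
static int flash_ready(struct fake_flash *f)
{
    if (f->busy_polls > 0) {
        f->busy_polls--;
        return 0;
    }
    return 1;
}

/* Write len bytes by busy-waiting on the status flag: no interrupts,
 * no sleeping, no locks that may block. The spin count is bounded so a
 * dead device cannot hang the caller. Returns 0 on success, -1 on error. */
static int flash_write_polled(struct fake_flash *f, const void *buf,
                              size_t len, unsigned max_spins)
{
    const uint8_t *p = buf;

    for (size_t i = 0; i < len; i++) {
        unsigned spins = 0;

        while (!flash_ready(f)) {
            if (++spins > max_spins)
                return -1;            /* device never became ready */
        }
        if (f->pos >= sizeof(f->data))
            return -1;                /* reserved region is full */
        f->data[f->pos++] = p[i];
        f->busy_polls = 2;            /* simulate the device going busy */
    }
    return 0;
}
```

Bounding the spin count matters in this setting: if the device never becomes ready, the write fails instead of waiting forever inside an already unstable kernel.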
In addition, since the system is still running after an oops, a message (signal, netlink, etc.) can be sent to user space to tell it to collect some information.
Panic
In a panic, Linux is in its most serious error state: the entire system is unusable, that is, interrupts, process scheduling, and so on have stopped, but the stack has not been destroyed. Therefore the stack backtracking used for an oops is still available in theory. The printk function can also still be used because it does not block.
User-State Call Trace
A user program can produce a call trace in the following situations to ease debugging:
1) When the program crashes, it receives a signal. When the Linux system receives certain signals, it automatically prints the call trace.
2) Checkpoints are added to the user program, similar to the assert mechanism; if a checkpoint's condition is not satisfied, a call trace is produced.
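The checkpoint idea in item 2 can be sketched with glibc's backtrace and backtrace_symbols_fd from execinfo.h. This swaps in the glibc helpers rather than the manual frame walk this article describes, and the names dump_call_trace and CHECKPOINT are hypothetical.

```c
#include <execinfo.h>
#include <stdio.h>
#include <unistd.h>

#define TRACE_MAX 32

/* Print the current call trace to stderr.
 * Returns the number of frames captured. */
static int dump_call_trace(void)
{
    void *frames[TRACE_MAX];
    int n = backtrace(frames, TRACE_MAX);

    backtrace_symbols_fd(frames, n, STDERR_FILENO);
    return n;
}

/* Hypothetical checkpoint macro. Unlike assert, it does not abort:
 * it evaluates to 0 when cond holds, otherwise it reports the failed
 * condition, dumps the call trace, and evaluates to the frame count. */
#define CHECKPOINT(cond)                                          \
    ((cond) ? 0                                                   \
            : (fprintf(stderr, "checkpoint failed: %s (%s:%d)\n", \
                       #cond, __FILE__, __LINE__),                \
               dump_call_trace()))
```

A call such as CHECKPOINT(ptr != NULL) then leaves a trace in stderr without stopping the program. Function names appear in the output only when the symbols are exported, for example by linking with -rdynamic.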
The user-state call trace works the same way as the kernel-state one and also follows the EABI standard. The principle is as follows:
GCC provides the built-in function __builtin_frame_address. It returns the frame header pointer (the stack bottom) of the current execution context, which is also a pointer to the back chain word, and this pointer gives access to the current stack frame. That frame holds the frame header pointer of the calling function, which leads back to the previous frame. Repeating this step completes the call dump.
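As a minimal sketch, assuming GCC or a compatible compiler, the two built-ins can be wrapped as follows. The helper names are hypothetical. Only level 0 is used here, because the GCC documentation warns that nonzero levels may return unpredictable values or crash the program.

```c
#include <stdio.h>

/* noinline keeps each helper in its own stack frame (GCC/Clang attribute). */
__attribute__((noinline))
static void *current_frame(void)
{
    /* Frame address of current_frame itself (level 0). */
    return __builtin_frame_address(0);
}

__attribute__((noinline))
static void *caller_address(void)
{
    /* Address in the caller to which caller_address will return. */
    return __builtin_return_address(0);
}

/* Print both values; a real tracer would follow the back chain from here. */
static void print_frame_info(void)
{
    printf("frame header:   %p\n", current_frame());
    printf("return address: %p\n", caller_address());
}
```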
Once a function's address is obtained, its name can be looked up in the symbol table. If the function is defined in a dynamic library, the extension function dladdr can also return information about that library.
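A minimal sketch of the dladdr lookup follows, assuming glibc, where dladdr is an extension that requires _GNU_SOURCE and, on older glibc versions, linking with -ldl. The helper name describe_address is hypothetical.

```c
#define _GNU_SOURCE   /* dladdr is a GNU extension in glibc */
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

/* Format "symbol (object file)" for addr into out.
 * Returns 0 on success, -1 if no loaded object contains addr. */
static int describe_address(const void *addr, char *out, size_t len)
{
    Dl_info info;

    if (dladdr(addr, &info) == 0)
        return -1;

    /* dli_sname is NULL when no symbol matches the address. */
    snprintf(out, len, "%s (%s)",
             info.dli_sname ? info.dli_sname : "?",
             info.dli_fname ? info.dli_fname : "?");
    return 0;
}
```

Passing each return address collected during the backtrace to describe_address would yield the function name and the executable or shared library that contains it, where the symbol is visible to the dynamic linker.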