Red Hat Linux Fault Location Techniques and Examples (2)


On-line fault location means that when a failure occurs, the operating system environment is still accessible: the troubleshooter can log into the operating system through the console, SSH, and so on, and run various commands or test programs in a shell to observe, analyze, and experiment with the failure environment in order to locate the cause of the failure.


3. Kernel failure situations and handling

(1) Kernel panic

A panic is the kernel's most direct report of a fault: when the kernel panics, it has already concluded that the fault leaves the operating system unable to continue running normally. On panic, Linux disables interrupts and process scheduling on all CPUs, so the system appears completely unresponsive; if the user is running a graphical interface, no panic information will be visible on the screen.

The situations we usually encounter, where the machine does not respond and cannot be pinged, are mostly panics. When a panic occurs, the kernel prints the call stack at the panic site directly on the console. Traditionally, users connected a serial line to the machine to collect what was printed on the console, but the serial port is clearly inconvenient to use; Linux today, for example RHEL5 and RHEL6, uses the Kdump method to collect panic information. With Kdump configured, at panic time the system uses kexec to load and switch to a fresh capture kernel (placed in a pre-reserved region of memory) and saves all or part of the system's memory data to disk or across the network.
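
Whether the capture kernel is actually in place can be checked on a running system; a minimal sketch using standard kernel interfaces:

    #> grep crashkernel /proc/cmdline      # the boot line must reserve memory, e.g. crashkernel=128M
    #> cat /sys/kernel/kexec_crash_loaded  # 1 means a capture kernel has been loaded with kexec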

With the panic data collected by Kdump, the user can view the code path that led to the panic directly with the crash tool.
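
For example, a minimal sketch of opening a vmcore with crash (the vmcore path depends on where kdump was configured to dump; the debuginfo vmlinux path follows RHEL conventions):

    #> crash /usr/lib/debug/lib/modules/`uname -r`/vmlinux /var/crash/<dump-dir>/vmcore
    crash> bt    # call stack of the panicking task
    crash> log   # kernel ring buffer captured at panic time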

A panic is generally very intuitive: the stack in the panic message can often directly reveal the cause of the bug, such as an MCE failure, an NMI failure, or a failed data structure allocation. Sometimes, however, the panic happens because the kernel proactively discovers an inconsistency in a critical data structure; when and by which code the inconsistency was introduced is then unclear, and it may take repeated tests, captured with tools such as SystemTap, to find out.

(2) Deadlocks on kernel execution paths in a multi-processor environment

A kernel deadlock is different from a panic: when a deadlock occurs, the kernel does not actively put itself into a suspended state. But when a kernel deadlock happens, the execution paths of two or more CPUs can no longer make progress in kernel mode; they block, typically spinning on spinlocks at 100% CPU, which directly or indirectly prevents processes on all CPUs from being scheduled. Kernel deadlocks come in two kinds:

- Deadlocks involving interrupt context. In this case, interrupts are blocked on at least one CPU, and the system may be unable to answer ping requests. Because that CPU cannot respond to interrupts, its local APIC timer interrupt stops working, which the NMI watchdog can detect (it checks a counter variable maintained by the local APIC timer handler). The NMI watchdog can call panic() in its handler; the user can then collect the memory image with Kdump, analyze the call stacks of the deadlocked CPUs, and investigate the logical cause of the deadlock (see the watchdog sketch after the next item).

- Deadlocks not involving interrupt context. In this kind of deadlock, interrupts on every CPU remain normal and the system can still answer ping requests, so the NMI watchdog is never triggered. In kernels before 2.6.16 there was no good way to handle this situation. The RHEL5 and RHEL6 kernels provide a watchdog kernel thread on each CPU; when this kind of deadlock occurs, the watchdog thread on the deadlocked CPU cannot be scheduled (even though it is a highest-priority real-time task) and therefore cannot update its counter variable. The NMI watchdog interrupt on each CPU periodically checks the counter belonging to its CPU and, on finding it not updated, calls panic(); the user can then collect the memory image with Kdump and analyze the call stacks of the deadlocked CPUs to investigate the logical cause of the deadlock.
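
A minimal sketch of inspecting and arming both watchdog mechanisms on a RHEL6-era kernel (the file names are an assumption tied to that era; later kernels renamed softlockup_thresh to watchdog_thresh):

    #> grep NMI /proc/interrupts                   # per-CPU NMI counts grow while the NMI watchdog is active
    #> cat /proc/sys/kernel/nmi_watchdog           # 1 means the NMI watchdog is enabled
    #> cat /proc/sys/kernel/softlockup_thresh      # seconds a CPU may be held before a soft lockup is reported
    #> echo 1 > /proc/sys/kernel/softlockup_panic  # panic (and hence kdump) instead of only logging the lockup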

(3) Kernel oops or warning

An oops or warning, like a panic, is an exception the kernel reports proactively upon finding an inconsistency. However, the problems behind an oops or warning are far less severe than those behind a panic, so the kernel does not need to hang the system while handling them. When generating an oops or warning, the kernel has usually already recorded considerable information in dmesg; an oops in particular prints at least the call trace of the faulting code. An oops can also be converted into a panic/kdump for offline debugging by setting the panic_on_oops variable under /proc/sys/kernel to 1.
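
For example, to make every oops escalate into a panic that kdump can capture (add kernel.panic_on_oops = 1 to /etc/sysctl.conf to keep the setting across reboots):

    #> echo 1 > /proc/sys/kernel/panic_on_oops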

There are many direct causes of an oops or warning, such as a segmentation fault inside the kernel, or the kernel discovering that the counter value in some data structure is wrong; the segment fault or the bad counter value in turn has a deeper cause, which usually cannot be seen from the dmesg output. The way to attack this kind of problem is to probe with SystemTap: if a counter's value is found to be wrong, write a SystemTap probe that records every piece of code that accesses the counter, and then analyze the recording.
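
A minimal sketch of such a probe, assuming the suspect counter is only modified through one known kernel function; my_counter_update() is hypothetical, so substitute the real accessor:

    #> stap -e 'probe kernel.function("my_counter_update") {
           printf("%s(%d) in %s\n", execname(), pid(), probefunc())
           print_backtrace()   # shows which code path reached the counter
       }'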

Locating an oops or warning is much harder than locating an application's memory access fault, because the kernel cannot track the allocation and use of its data structures the way valgrind traces an application.

(4) Other (hardware-related) failures

An automatic machine restart is a common failure situation, usually caused by hardware such as failing physical memory; a software failure generally leads only to a deadlock or a panic, and there is almost no code in the kernel that reboots the machine when it finds a problem. There is a parameter "panic" in the /proc/sys/kernel directory; if its value is set to non-zero, the kernel restarts the machine that many seconds after a panic occurs. High-end PC servers now try to handle physical memory failures in software: for example, the MCA "HWPoison" method isolates the failing physical page and kills the process using it, and RHEL6 supports HWPoison. On machines without MCA capability, a physical memory failure does not produce an MCE exception; the machine is simply rebooted by the hardware mechanism.
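
For example, to have the machine reboot automatically ten seconds after a panic (the value is the delay in seconds; 0, the default, means stay hung forever):

    #> echo 10 > /proc/sys/kernel/panic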

4. Introduction to debugging technologies on RHEL6

(1) Kdump fault information collection and crash analysis

Kdump is used to collect the system's memory image when the kernel panics; the user can also trigger it on a live system with the SysRq 'c' key. Kdump performs the dump work from an unpolluted second kernel, so it is more reliable than the earlier diskdump and lkcd methods. With Kdump, users can choose to dump the data to local disk or over the network, and can filter the memory to be collected through makedumpfile parameters, reducing the downtime Kdump requires.
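
A minimal sketch of a RHEL6-style kdump setup (the dump path and filter level are illustrative choices, not required values):

    #> grep -v '^#' /etc/kdump.conf
    path /var/crash
    core_collector makedumpfile -c -d 31   # -c compresses; -d 31 drops zero, cache, user, and free pages
    #> service kdump start
    #> echo c > /proc/sysrq-trigger        # deliberately panics the machine to test capture; use with care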

Crash is the tool for analyzing kdump data; it is essentially a wrapper around GDB. When using crash it is best to install the kernel-debuginfo package, so that the kernel data collected by kdump can be resolved to symbols. How far crash can go in locating a problem depends entirely on the user's ability to understand and analyze kernel code.

Refer to "#> man kdump.conf", "#> man crash", and "#> man makedumpfile" to learn how to use Kdump and crash. Visit http://ftp.redhat.com to download debuginfo packages.

(2) Locating bugs with SystemTap

SystemTap is a probe-style locating tool: it can place probes at specified locations in kernel or user code, and when execution reaches a specified location, or data at a specified location is accessed, the user-defined probe function runs automatically and can print the call stack, parameter values, variable values, and other information. SystemTap is very flexible about where probes may be placed, and that is its great strength (see the sketches after this list). SystemTap probe points include the following:

- All system calls in the kernel, and the entry and exit points of all functions in the kernel and in modules

- Custom timer probe points

- Any specified code or data access location in the kernel

- Any code or data access location in a specific user process

- Numerous preset probe points in each functional subsystem; for example, the tcp, udp, nfs, and signal subsystems each provide many preset probe points
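
The available probe points can be enumerated before writing a script; a minimal sketch (the patterns are just illustrations):

    #> stap -l 'syscall.*'                  # list all system call probe points
    #> stap -l 'kernel.function("*nfs*")'   # list probeable kernel functions matching a pattern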

SystemTap scripts are written in the stap scripting language; script code calls the APIs provided by stap to gather statistics, print data, and so on. For the API functions the stap language provides, refer to "#> man stapfuncs". For the capabilities and usage of SystemTap, refer to "#> man stap" and "#> man stapprobes".
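
A minimal script sketch in the stap language: it counts, per process, entries into the preset probe point syscall.open for ten seconds and then prints the ten busiest processes:

    global opens
    probe syscall.open { opens[execname()] <<< 1 }   # record one hit per open() per process
    probe timer.s(10) {
        foreach (name in opens- limit 10)            # iterate in descending order of count
            printf("%-16s %d\n", name, @count(opens[name]))
        exit()
    }

Save it as opens.stp and run it with "#> stap opens.stp".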

(3) Ftrace

Ftrace is an event tracing mechanism implemented in the Linux kernel on the tracepoints infrastructure. Its role is to give a clear picture of the activity a system or process performs during a certain period, such as function call paths and process switching flows. Ftrace can be used to observe the latency of various parts of the system when optimizing real-time applications, and it also helps locate faults by recording kernel activity over a period of time. Use the following method to trace a process's function calls over a period of time:

    #> echo "function" > /sys/kernel/debug/tracing/current_tracer
    #> echo "xxx" > /sys/kernel/debug/tracing/set_ftrace_pid    # xxx is the pid of the process to trace
    #> echo 1 > /sys/kernel/debug/tracing/tracing_enabled
    #> cat /sys/kernel/debug/tracing/trace                      # read the recorded function calls

In addition to function calls, ftrace can also trace activities such as process scheduling and wakeups, block device accesses, and kernel data structure allocation. Note that tracing is different from profiling: tracing records all activity over a period of time, not statistics, and the user can set the buffer size through buffer_size_kb under /sys/kernel/debug/tracing in order to record data for longer periods.
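
A minimal sketch of enlarging the buffer and recording scheduler activity on a RHEL6-era kernel (later kernels renamed tracing_enabled to tracing_on and replaced the sched_switch tracer with trace events):

    #> echo 20480 > /sys/kernel/debug/tracing/buffer_size_kb    # 20 MB of trace buffer per CPU
    #> echo sched_switch > /sys/kernel/debug/tracing/current_tracer
    #> echo 1 > /sys/kernel/debug/tracing/tracing_enabled
    #> sleep 10                                                 # let the workload run
    #> echo 0 > /sys/kernel/debug/tracing/tracing_enabled
    #> cat /sys/kernel/debug/tracing/trace > /tmp/sched.trace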

For the specific use of ftrace, refer to Documentation/trace in the kernel source.

(4) Oprofile and perf

Both oprofile and perf are tools for profiling the system (sampling and statistics), used mainly to address system and application performance problems. Perf is more powerful and comprehensive, and its user-space tool is maintained and released together with the kernel source, so users can promptly enjoy new perf features of the kernel. Perf is available in RHEL6; RHEL5 has no perf. Both oprofile and perf use the hardware counters present in modern CPUs for their statistical work, but perf can also use the "software counters" and "tracepoints" defined in the kernel, and can therefore do more. Oprofile's sampling uses the CPU's NMI interrupt, while perf can use both the NMI interrupt and the periodic interrupts provided by the hardware counters. For example, it is easy to use perf to profile the execution time distribution of a process or of the whole system:

    #> perf top -F 1000 -p <pid>   # sample at 1000 Hz; <pid> is the target process ID

The system-defined "software counters" and the "tracepoints" of each subsystem can also be used to analyze the system. For example:

    #> perf stat -a -e kmem:mm_page_alloc -e kmem:mm_page_free_direct -e kmem:mm_pagevec_free sleep 6

counts the activity of the kmem subsystem on all CPUs for 6 seconds (this is in fact implemented using the tracepoints provided by ftrace).
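
The tracepoints and software events a given kernel exposes can be checked with perf list (a minimal sketch):

    #> perf list                 # all hardware, software, and tracepoint events
    #> perf list | grep kmem:    # only the kmem subsystem's tracepoints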

In my view, with perf available, users no longer need oprofile.

