Red Hat Linux Fault Location Techniques Explained with Examples (3)
Online fault location applies when the operating system environment is still accessible after a fault has occurred. The person handling the fault can log in to the operating system through the console, SSH, or similar means, and run commands or test programs in the shell to observe, analyze, and test the failing environment in order to locate the cause of the fault.
5. Example: locating a kernel fault with the Kdump tool
A) Deploying Kdump
The steps for deploying Kdump to collect fault information are as follows:
(1) Set the relevant kernel boot parameters.
Add the following to /boot/grub/menu.lst:
- crashkernel=[email protected] nmi_watchdog=1
The crashkernel parameter reserves memory for the Kdump kernel. nmi_watchdog=1 activates the NMI watchdog. Because we cannot know in advance whether a fault will occur with interrupts disabled, the NMI watchdog is needed to make sure that a panic is still triggered in that case. Restart the system so that the settings take effect.
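After the reboot, it can be worth confirming that both parameters actually reached the kernel. The following is a minimal sketch and not part of the original procedure: the helper names has_boot_param and report_kdump_boot_params are made up for illustration, and the only system interface assumed is /proc/cmdline, where Linux exposes the boot command line.

```shell
#!/bin/sh
# has_boot_param CMDLINE NAME
# Succeeds if the command line string contains NAME or NAME=value
# as a whole word. (Hypothetical helper, for illustration only.)
has_boot_param() {
    for word in $1; do
        case "$word" in
            "$2"|"$2"=*) return 0 ;;
        esac
    done
    return 1
}

# report_kdump_boot_params CMDLINE
# Prints one status line for each parameter set in step (1).
report_kdump_boot_params() {
    for name in crashkernel nmi_watchdog; do
        if has_boot_param "$1" "$name"; then
            echo "$name: present"
        else
            echo "$name: missing"
        fi
    done
}

# On a live system, check the command line the kernel was booted with.
if [ -r /proc/cmdline ]; then
    report_kdump_boot_params "$(cat /proc/cmdline)"
fi
```

If either line reports "missing", the edit to menu.lst probably went to a different boot entry than the one that was booted.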
(2) Set the relevant sysctl kernel parameters.
Append the following line to the end of /etc/sysctl.conf:
- kernel.softlockup_panic = 1
This setting ensures that panic is called when a soft lockup occurs, which in turn triggers Kdump. Run #>sysctl -p to make the setting take effect.
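To double-check step (2), the configured value can be compared with what the running kernel reports. This is a sketch under assumptions: the function names are invented here, and the live value is assumed to be readable from /proc/sys/kernel/softlockup_panic, which is where sysctl keys under kernel.* normally appear.

```shell
#!/bin/sh
# sysctl_conf_value FILE KEY
# Prints the last value assigned to KEY in a sysctl.conf-style file,
# skipping lines that start with # or ;. (Hypothetical helper.)
sysctl_conf_value() {
    awk -F= -v key="$2" '
        /^[ \t]*[#;]/ { next }
        {
            k = $1
            gsub(/[ \t]/, "", k)
            if (k == key) {
                v = $2
                gsub(/^[ \t]+|[ \t]+$/, "", v)
                found = v
            }
        }
        END { if (found != "") print found }
    ' "$1"
}

# check_softlockup_panic [CONF] [LIVE]
# Compares the configured value with the value the kernel is using.
# Both paths can be overridden, which also makes the function testable.
check_softlockup_panic() {
    conf=${1:-/etc/sysctl.conf}
    live=${2:-/proc/sys/kernel/softlockup_panic}
    if [ ! -r "$conf" ] || [ ! -r "$live" ]; then
        echo "cannot read $conf or $live" >&2
        return 2
    fi
    want=$(sysctl_conf_value "$conf" kernel.softlockup_panic)
    have=$(cat "$live")
    if [ "$want" = "$have" ]; then
        echo "ok: softlockup_panic=$have"
    else
        echo "mismatch: configured '$want', running '$have'"
        return 1
    fi
}
```

Calling check_softlockup_panic with no arguments after running sysctl -p uses the default paths. A mismatch usually means sysctl -p has not been run yet.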
(3) Configure /etc/kdump.conf.
Add the following lines to /etc/kdump.conf:
- ext3 /dev/sdb1
- core_collector makedumpfile -c --message-level 7 -d 31 -i /mnt/vmcoreinfo
- path /var/crash
- default reboot
Here /dev/sdb1 is the file system that holds the dump file. The dump file is written under /var/crash, so the /var/crash directory must be created in advance on the /dev/sdb1 partition. "-d 31" specifies the filtering level for the dump content, which matters when the dump partition cannot hold the full contents of memory or when the user does not want dumping to interrupt the business for too long. The vmcoreinfo file is placed in the / directory of the /dev/sdb1 partition and must be generated with the following command:
#>makedumpfile -g //vmcoreinfo -x /usr/lib/debug/lib/modules/2.6.18-128.el5.x86_64/vmlinux
The vmlinux file is provided by the kernel-debuginfo package, so the kernel-debuginfo and kernel-debuginfo-common packages matching the kernel in use must both be installed before running makedumpfile. Both packages can be downloaded from http://ftp.redhat.com. "default reboot" tells Kdump to restart the system after it has collected the dump information.
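The value passed to -d is a bitmask, which is why 31 gives the strongest filtering. The sketch below decodes a dump level into the page types that makedumpfile leaves out. It assumes the bit assignments documented in the makedumpfile manual page (1 zero pages, 2 cache pages, 4 private cache pages, 8 user data pages, 16 free pages); the function name describe_dump_level is invented for this example.

```shell
#!/bin/sh
# describe_dump_level LEVEL
# Lists the page types excluded from the dump for "makedumpfile -d LEVEL".
# Bit meanings are taken from the makedumpfile manual page (assumption
# stated in the text above); the helper itself is hypothetical.
describe_dump_level() {
    level=$1
    case "$level" in
        ''|*[!0-9]*)
            echo "dump level must be an integer from 0 to 31" >&2
            return 1
            ;;
    esac
    if [ "$level" -gt 31 ]; then
        echo "dump level must be an integer from 0 to 31" >&2
        return 1
    fi
    bit=1
    for name in "zero pages" "cache pages" "private cache pages" \
                "user data pages" "free pages"; do
        if [ $((level & bit)) -ne 0 ]; then
            echo "excluded: $name"
        fi
        bit=$((bit * 2))
    done
}

# The level used in the configuration above sets all five bits.
describe_dump_level 31
```

A lower level such as 17 would drop only zero pages and free pages, producing a larger dump that keeps cache and user data for analysis.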
(4) Activate Kdump.
Run the #>service kdump start command. When it completes successfully, a file named initrd-2.6.18-128.el5.x86_64kdump.img is generated in the /boot/ directory. This file is the initrd for the kernel that Kdump loads, and the work of collecting the dump information is carried out in the environment started from that initrd. Reading the /etc/init.d/kdump script shows that it calls the mkdumprd command to create the initrd used for the dump.
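A quick way to confirm that step (4) succeeded is to look for the generated initrd. The sketch below assumes the file naming seen above (initrd-VERSIONkdump.img under /boot), which may differ on other releases; the function names are hypothetical.

```shell
#!/bin/sh
# kdump_initrd_path VERSION [BOOTDIR]
# Builds the expected initrd path, following the file name quoted in the
# text. The naming pattern is an assumption for other releases.
kdump_initrd_path() {
    printf '%s/initrd-%skdump.img\n' "${2:-/boot}" "$1"
}

# check_kdump_initrd VERSION [BOOTDIR]
# Succeeds if the Kdump initrd exists and is not empty.
check_kdump_initrd() {
    img=$(kdump_initrd_path "$1" "$2")
    if [ -s "$img" ]; then
        echo "found $img"
    else
        echo "missing $img"
        return 1
    fi
}
```

On the machine being configured, check_kdump_initrd "$(uname -r)" checks the initrd for the running kernel.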
1. Test the effectiveness of the Kdump deployment
To test whether the Kdump deployment works, I wrote the following kernel module. Loading it with insmod creates a kernel thread that, after about 10 seconds, occupies 100% of a CPU and, after about 20 seconds, triggers Kdump. After the system restarts, check the /var/crash directory on the /oracle partition to confirm that the vmcore file has been generated.
- zqfthread.c:
    /* Headers needed by the calls used below. */
    #include <linux/module.h>
    #include <linux/init.h>
    #include <linux/kernel.h>
    #include <linux/kthread.h>
    #include <linux/sched.h>
    #include <linux/delay.h>
    #include <linux/err.h>

    MODULE_AUTHOR("[email protected]");
    MODULE_DESCRIPTION("A module to test ....");
    MODULE_LICENSE("GPL");

    static struct task_struct *zqf_thread;

    static int zqfd_thread(void *data);

    /*
     * Sleeps one second per iteration at first, then spins without ever
     * yielding the CPU so that the soft lockup detector fires.
     */
    static int zqfd_thread(void *data)
    {
        int i = 0;

        while (!kthread_should_stop()) {
            i++;
            if (i < 10) {        /* about 10 seconds, as described in the text */
                msleep_interruptible(1000);
                printk("%d seconds\n", i);
            }
            /* Busy loop, running in the kernel. The bound 1000 is an assumed
             * value: any number above 10 keeps i from overflowing. */
            if (i == 1000)
                i = 11;
        }
        return 0;
    }

    static int __init zqfinit(void)
    {
        struct task_struct *p;

        p = kthread_create(zqfd_thread, NULL, "%s", "ZQFD");
        if (IS_ERR(p))           /* failure is reported as ERR_PTR, not NULL */
            return PTR_ERR(p);
        zqf_thread = p;
        wake_up_process(zqf_thread);    /* actually start it up */
        return 0;
    }

    static void __exit zqffini(void)
    {
        kthread_stop(zqf_thread);
    }

    module_init(zqfinit);
    module_exit(zqffini);
- Makefile: obj-m += zqfthread.o
- Build: #>make -C /usr/src/kernels/2.6.32-71.el6.x86_64/ M=`pwd` modules
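Once the module has triggered Kdump and the system has restarted, the check for a vmcore file described above can be scripted. This is a minimal sketch: list_vmcores is a made-up helper, and the directory to search is passed in because it depends on where the dump partition is mounted (the /oracle partition in this article).

```shell
#!/bin/sh
# list_vmcores [DIR]
# Prints every non-empty file whose name starts with "vmcore" below DIR.
# DIR defaults to /var/crash. (Hypothetical helper, for illustration.)
list_vmcores() {
    dir=${1:-/var/crash}
    if [ ! -d "$dir" ]; then
        echo "no such directory: $dir" >&2
        return 1
    fi
    # -size +0 skips empty files left behind by an interrupted dump.
    find "$dir" -type f -name 'vmcore*' -size +0 | sort
}
```

An empty result from list_vmcores means no dump was written, in which case the boot parameters, the sysctl setting, and the kdump service status are the first things to re-check.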
2. Analyze the vmcore file with the crash tool
The command-line format for analyzing a vmcore with the crash command is shown below. After opening the vmcore with crash, mainly use the dmesg and bt commands to print the call trace of the execution path where the problem occurred, use dis to disassemble the code, and finally identify the C source location that corresponds to the call trace before analyzing the logic.
- #>crash /usr/lib/debug/lib/modules/2.6.18-128.el5.x86_64/vmlinux /boot/System.map-2.6.18-128.el5.x86_64 ./vmcore
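The dmesg, bt, and dis steps can also be collected in a command file so that the same analysis is repeatable. The sketch below only writes such a file and does not run crash itself. It assumes that the installed crash accepts a command file through its -i option and that dis -l prints source line information when debuginfo is available; the helper name and the symbol zqfd_thread (the thread function of the test module) are chosen for illustration.

```shell
#!/bin/sh
# write_crash_script OUTFILE [SYMBOL...]
# Writes a crash command file that prints the kernel log and the
# backtrace of the panicking task, then disassembles each SYMBOL.
# (Hypothetical helper; it does not invoke crash.)
write_crash_script() {
    out=$1
    shift
    {
        echo "dmesg"
        echo "bt"
        for sym in "$@"; do
            echo "dis -l $sym"
        done
        echo "quit"
    } > "$out"
}
```

For example, write_crash_script /tmp/kdump.cmds zqfd_thread creates the file, which can then be passed to the crash command line shown above with -i /tmp/kdump.cmds added.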