1. Enable core dump
In Linux, core dump files are not generated by default.
$ ulimit -c
This prints the core file size limit. The initial value is usually 0, which means core dumps are disabled.
$ ulimit -c unlimited   # do not limit the size of the core file
$ ulimit -c 1024        # set an upper limit on the size of the core file (unit: KB)
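As a minimal sketch of the check-then-enable sequence above, the function below raises the soft limit to the hard limit only when it is currently 0, then prints the resulting limit. The name enable_core_dumps is hypothetical, and the sketch assumes a shell whose ulimit builtin accepts -c, -S and -H (bash and dash do).

```shell
#!/bin/sh
# enable_core_dumps is a hypothetical helper name used for illustration.
enable_core_dumps() {
    # The soft limit is what "ulimit -c" reports; 0 means no core file is written.
    if [ "$(ulimit -S -c)" = "0" ]; then
        # An unprivileged process may raise the soft limit only up to the hard limit.
        ulimit -S -c "$(ulimit -H -c)" || return 1
    fi
    # Print the limit now in effect for this shell and the processes it starts.
    ulimit -S -c
}

enable_core_dumps
```

The change applies only to the current shell and its children, so the function has to run in the same shell that later starts the program to be debugged.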
2. Generate a core dump file in a dedicated directory
The location of the core file can be set with the kernel.core_pattern variable in /etc/sysctl.conf.
For example, to place core files under /var/core:
$ cat /etc/sysctl.conf
kernel.core_pattern = /var/core/%t-%e-%p-%c.core
kernel.core_uses_pid = 0
$ sysctl -p
If kernel.core_uses_pid = 1, the PID is appended to the core file name.
Format specifiers accepted by kernel.core_pattern:
%p - PID of the dumped process
%u - current UID
%g - current GID
%s - number of the signal that caused the core dump
%t - UNIX time at which the core dump occurred
%h - name of the host on which the core dump occurred
%e - name of the executable that dumped core
%c - core file size limit of the dumped process
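To preview which file name a pattern would produce, the substitution can be imitated in the shell. This only illustrates the specifiers listed above; it is not how the kernel implements them. The function name expand_core_pattern and all sample values are hypothetical, %c is taken to be the core file size limit, and the values are assumed not to contain | or &.

```shell
#!/bin/sh
# expand_core_pattern is a hypothetical helper, not a system tool.
# Arguments: pattern pid uid gid signal time host exe core_limit
expand_core_pattern() {
    # Replace each specifier with the matching argument.
    # "%%" (a literal percent sign) is not handled in this sketch.
    printf '%s\n' "$1" | sed \
        -e "s|%p|$2|g" -e "s|%u|$3|g" -e "s|%g|$4|g" \
        -e "s|%s|$5|g" -e "s|%t|$6|g" -e "s|%h|$7|g" \
        -e "s|%e|$8|g" -e "s|%c|$9|g"
}

# Sample values are made up for illustration.
expand_core_pattern '/var/core/%t-%e-%p-%c.core' \
    1234 1000 1000 11 1700000000 myhost test 1048576
```

With these sample values the call prints /var/core/1700000000-test-1234-1048576.core.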
3. Automatic compression of core files
A user-mode helper program can be used to compress the core file.
Set the kernel.core_pattern variable in /etc/sysctl.conf so that the core is piped to the helper:
$ cat /etc/sysctl.conf
kernel.core_pattern = |/usr/local/sbin/core_helper %t %e %p %c
kernel.core_uses_pid = 0
$ sysctl -p
Contents of core_helper:
$ cat /usr/local/sbin/core_helper
#!/bin/sh
# The kernel writes the core image to stdin; $1..$4 are %t %e %p %c.
exec gzip - > /var/core/$1-$2-$3-$4.core.gz
4. Exclude shared memory from the dump
The memory segments included in a core dump are selected by setting /proc/<pid>/coredump_filter.
coredump_filter is a bit mask in which each bit stands for one type of memory:
bit 0  anonymous private memory
bit 1  anonymous shared memory
bit 2  file-backed private memory
bit 3  file-backed shared memory
bit 4  ELF file mapping (ELF header pages)
To skip all shared memory segments, set coredump_filter to 1, so that only anonymous private memory is dumped.
$ echo 1 > /proc/<pid>/coredump_filter
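Because the filter is a bit mask, a value can be built by OR-ing together the bits to keep. A small sketch, in which filter_mask is a hypothetical helper name and the bit numbers are the ones listed above:

```shell
#!/bin/sh
# filter_mask is a hypothetical helper: it ORs together the given bit numbers
# and prints the resulting coredump_filter value in hexadecimal.
filter_mask() {
    mask=0
    for bit in "$@"; do
        mask=$(( $mask | (1 << $bit) ))
    done
    printf '0x%x\n' "$mask"
}

filter_mask 0      # anonymous private memory only: prints 0x1
filter_mask 0 1    # anonymous private and shared memory: prints 0x3
```

The printed value can then be written to the filter file in the same way as above, for example echo 0x1 > /proc/<pid>/coredump_filter.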
5. Example
After compiling with debugging information (-g), use GDB to examine the core file and find the source line that caused the core dump.
gdb [exec file] [core file]
For example:
gdb ./test test.core
Once inside GDB, run the bt command to print the backtrace, which shows where the program was executing and the file and line at which it dumped core.
// test.c
#include <stdio.h>

void a()
{
    char *p = NULL;
    printf("%d\n", *p);   // dereferences a NULL pointer and crashes here
}

int main()
{
    a();
    return 0;
}
Compile:
gcc -g -o test test.c
Run ./test:
Segmentation fault (core dumped)
If the resulting core file is test.core:
gdb ./test test.core
6. Differences between system dump and core dump
(1) System dump
Every open-systems operating system is subject to system dumps.
Cause:
A critical system or kernel process hits a serious, unrecoverable error. To avoid further damage to system resources, the operating system forcibly stops running and saves the structures currently in memory, the location of the fault in the kernel process, and its code state for later analysis. The most common causes are runaway instructions, buffer overflows, and out-of-bounds memory accesses. When the control flow goes wrong, execution jumps to a location where it does not belong and executes whatever happens to be there (it may be code or data, and it may not even be the start of a valid instruction), or it accesses memory that does not belong to the process. Anyone who has written C or assembly programs will be familiar with these symptoms.
Characteristics of the system dump process:
While the dump is being generated, the operating system uses code that is as simple as possible. If the dump path touched many structures and the fault lay in one of the resources involved, the dump itself could fail. The dump code therefore bypasses all complex management layers, such as file systems and LVM. That is why almost all open systems require the dump device space to be physically contiguous: there is no need to locate data blocks one by one, and data is written from the start of the dump device until the dump is complete. This process can only be performed at the BIOS level. It is also why, even though enterprise UNIX systems generally use LVM, the dump device must be a raw device rather than a file in a file system. For a device used only for dumps, LVM mirroring is useless: the system performs no LVM operations at this point, so it ignores any mirror and writes sequentially to the first copy.
Therefore, UNIX systems write the dump to a raw device or a tape device. On reboot, if the configured dump directory (a directory in a file system) has enough space, the dump is converted into a file in the file system. By default this is a vmcore* file under /var/adm/ras/ on AIX, and a directory with files under /var/adm/crash on HP-UX.
Of course, you can also transfer it to a tape device.
The main causes of system dumps are:
- Inconsistent or missing system patch levels.
- Bugs in system kernel extensions (for example, Oracle installs kernel extensions).
- Bugs in drivers (device drivers generally run at kernel level).
So if system dumps occur frequently, consider the following:
- Update the system patch packages to the latest, consistent level.
- Upgrade the microcode.
- Upgrade the device drivers (including FC multipath redundancy software).
- Upgrade the patch packages of any software that installs kernel extensions.
(2) Process core dump
The technical causes of a process core dump are essentially the same as those of a system dump; the underlying principle is the same.
However, a process runs at a lower privilege level (this is the CPU's privilege level for executing instructions, not the scheduling priority that the system assigns to a process) and is controlled by the operating system. When a process goes wrong, the operating system can therefore abort it without affecting other processes and save its environment. The saved environment is the core dump file, which is available for analysis.
If the process was written in a high-level language and you have the source, you can include a debugging symbol table at compile time (all high-level language compilers support this). With the analysis tools provided by the system and the core file, you can then identify the source statement that caused the problem and correct it fairly easily. For this to work, the program must have been compiled with the symbol table from the start; otherwise you have to recompile the program and run it again to reproduce the error.
If you do not have the source, you can only analyze the problem at the level of assembly instructions, and it is difficult to find and correct the cause. In that case there is little point in trying, because even if you find where the problem is, you cannot fix it.