Some issues with core files under Linux

Source: Internet
Author: User
Tags arithmetic posix

I previously reposted an article on how to set up, create, and debug core files, but some points need attention in practice, such as under what circumstances a core file is actually produced.

Here are some common issues:

One, how to use the core file

1. Using the core file

In the directory where the core file is located, type:

gdb -c core

This launches the GNU debugger on the core file and displays the name of the program that generated it, the signal that terminated the program, and so on.

If you already know which program generated the core file, for example MyServer crashed and produced core.12345, debug with this command:

gdb -c core.12345 MyServer

From there on, ordinary GDB usage applies.

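2. A quick way to test core file generation

Enter this command directly (note that $$ is the current shell, so the shell it is typed into is the process that gets terminated):

kill -s SIGSEGV $$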
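To keep the interactive shell alive, the same signal can be sent to a child shell instead. The sketch below is only an illustration: the scratch directory and the variable names are made up for the example, the signal is written in the POSIX form without the SIG prefix, and whether a core file actually appears depends on the system's core dump settings, which the script does not change.

```sh
# Crash a child shell in a scratch directory and inspect the result.
# workdir and status are hypothetical names used only in this sketch.
workdir=$(mktemp -d)
status=0
(
    cd "$workdir" || exit 1
    # Raise the soft core size limit as far as the hard limit allows.
    ulimit -S -c "$(ulimit -H -c)"
    # Replace the subshell with a child shell that sends SIGSEGV to itself.
    exec sh -c 'kill -s SEGV $$'
) || status=$?

# A shell reports death by signal N as exit status 128 + N.
echo "exit status: $status"
# A core file is listed here only if the kernel is configured to write one
# into the working directory of the crashing process.
ls -l "$workdir"
rm -rf "$workdir"
```

On Linux, where SIGSEGV is signal 11, the reported status is 139 whether or not a core file was written.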

Two, why programs dump core

There are many reasons for a program to dump core; here is a summary based on past experience:

1 Out-of-bounds memory access
a) An array is accessed out of bounds because a wrong subscript is used
b) When searching a string, the code relies on the string terminator to detect the end of the string, but the string is not properly terminated
c) Using strcpy, strcat, sprintf, strcmp, strcasecmp, and other string functions overruns the target string on read or write. Use strncpy, strlcpy, strncat, strlcat, snprintf, strncmp, strncasecmp, and similar functions to prevent out-of-bounds reads and writes.

2 A multithreaded program uses thread-unsafe functions.
Use the following reentrant functions instead, taking particular care with those that are easy to use incorrectly:
asctime_r(3c), gethostbyname_r(3n), getservbyname_r(3n), ctermid_r(3s), gethostent_r(3n), getservbyport_r(3n), ctime_r(3c), getlogin_r(3c), getservent_r(3n), fgetgrent_r(3c), getnetbyaddr_r(3n), getspent_r(3c), fgetpwent_r(3c), getnetbyname_r(3n), getspnam_r(3c), fgetspent_r(3c), getnetent_r(3n), gmtime_r(3c), gamma_r(3m), getnetgrent_r(3n), lgamma_r(3m), getauclassent_r(3), getprotobyname_r(3n), localtime_r(3c), getauclassnam_r(3), getprotobynumber_r(3n), nis_sperror_r(3n), getauevent_r(3), getprotoent_r(3n), rand_r(3c), getauevnam_r(3), getpwent_r(3c), readdir_r(3c), getauevnum_r(3), getpwnam_r(3c), strtok_r(3c), getgrent_r(3c), getpwuid_r(3c), tmpnam_r(3s), getgrgid_r(3c), getrpcbyname_r(3n), ttyname_r(3c), getgrnam_r(3c), getrpcbynumber_r(3n), gethostbyaddr_r(3n), getrpcent_r(3n)

3 Data read and written by multiple threads is not protected by a lock.
Global data that several threads can access concurrently should be protected by a lock; otherwise a core dump is easy to cause.

4 Illegal pointers
a) Using a null pointer
b) Casual pointer conversions. Do not convert a pointer to a block of memory into a pointer to a structure or type unless you are sure the memory was originally allocated as that structure or type, or as an array of it. Instead, copy the memory into an object of that structure or type and access the object. If the start address of the memory is not aligned as the structure or type requires, accessing it through the converted pointer can easily cause a bus error and a core dump.

5 Stack overflow
Do not use large local variables (local variables are allocated on the stack). They can easily overflow the stack and corrupt the program's stack and heap, which leads to baffling errors.

Three, points to note

To make sure that a crashing program generates a core dump on Linux, pay attention to the following:

1. Make sure that the directory where the core dump is stored exists and that the process has write permission to it. The core dump is stored in the current directory of the process, which is usually the directory from which the command that started the process was issued. If the process is started by a script, however, the script may change the current directory, so the actual current directory of the process differs from the directory where the script was run. In that case, look at the target of the /proc/<PID>/cwd symbolic link to find the actual current directory of the process. Processes started as system services can be checked in the same way.
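To see the /proc/<PID>/cwd lookup in action, the following sketch starts a background process from a scratch directory and then reads the link from elsewhere. It assumes a Linux system with /proc mounted; sleep stands in for the real service, and the variable names are illustrative.

```sh
# Resolve the scratch directory to its physical path, since the kernel
# reports the cwd link without symbolic links in it.
workdir=$(cd "$(mktemp -d)" && pwd -P)
start_dir=$PWD

# Start a stand-in process whose current directory is the scratch directory.
cd "$workdir"
sleep 30 &
pid=$!
cd "$start_dir"

# The cwd link shows where the process really runs, which is also where
# its core file would be looked for first.
actual_cwd=$(readlink "/proc/$pid/cwd")
echo "process $pid runs in: $actual_cwd"

# Clean up the stand-in process and the scratch directory.
kill "$pid"
wait "$pid" 2>/dev/null || true
rmdir "$workdir"
```

The link is read from the directory the script started in, which mirrors the situation where a start-up script changed directory before launching the program.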

2. If the program calls seteuid()/setegid() to change the effective user or group of the process, the system does not generate core dumps for that process by default. Many service programs call seteuid(). MySQL is one example: whichever user runs mysqld_safe to start MySQL, the effective user of mysqld is always the mysql user. If you started a program as user A but ps shows it running as user B, the process has called seteuid(). To let such processes generate core dumps, change the content of the /proc/sys/fs/suid_dumpable file to 1 (it is usually 0 by default).
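The current setting can be checked before anything is changed. This is a small sketch that assumes a Linux system exposing the file named above; the variable names and messages are illustrative, and the commented-out commands that change the value need root privileges.

```sh
dumpable_file=/proc/sys/fs/suid_dumpable
# Fall back to "unknown" if the file is missing or unreadable.
mode=$(cat "$dumpable_file" 2>/dev/null || echo unknown)

case "$mode" in
    0) echo "suid_dumpable=0: processes that changed user or group do not dump core" ;;
    1) echo "suid_dumpable=1: such processes dump core like any other process" ;;
    *) echo "suid_dumpable=$mode: see the kernel documentation for this mode" ;;
esac

# To change the setting until the next reboot (as root):
#   echo 1 > /proc/sys/fs/suid_dumpable
# or equivalently:
#   sysctl -w fs.suid_dumpable=1
```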

3. Set a large enough core file size limit. The size of the core file generated when a program crashes is roughly the amount of memory the program is using at that moment. The behavior of a crashing program cannot be estimated from its normal behavior, however. Errors such as buffer overflows can corrupt the stack, so that a variable is overwritten with a garbage value; if the program then uses that value as an allocation size, it ends up with far more memory than usual. So however little memory the program uses when running normally, the safest way to make sure the core file is generated is to set the size limit to unlimited.

Use the command ulimit -c unlimited in the shell. This change applies only to the current session and is temporary. To make it permanent, modify a configuration file such as .bash_profile, /etc/profile, or /etc/security/limits.conf.

For example, edit /etc/profile:

# vim /etc/profile

Comment out the line marked with a red box in the original screenshot (not reproduced here) and add a new line:

ulimit -c unlimited

Save and exit.

Four, when core files are produced

When a program crashes, the kernel may write the program's current memory image to a core file so that the programmer can find out where the program went wrong. The error that nearly every C programmer has run into most often is the "segmentation fault", and it is also one of the hardest to track down. Below we analyze how a core file is generated for a segmentation fault and how to use the core file to find the place where the crash occurred.

What is a core file

When a program crashes, the memory image of the process is written to a core file in the current working directory of the process. The core file is simply a memory image (plus debugging information) and is used mainly for debugging.

A core file is generated when the program receives one of the following UNIX signals:

Name     Description                           ANSI C  POSIX.1  SVR4  4.3+BSD  Default action
SIGABRT  abnormal termination (abort)            •        •      •       •     terminate w/core
SIGBUS   hardware fault                                   •      •       •     terminate w/core
SIGEMT   hardware fault                                          •       •     terminate w/core
SIGFPE   arithmetic exception                    •        •      •       •     terminate w/core
SIGILL   illegal hardware instruction            •        •      •       •     terminate w/core
SIGIOT   hardware fault                                          •       •     terminate w/core
SIGQUIT  terminal quit character                          •      •       •     terminate w/core
SIGSEGV  invalid memory reference                •        •      •       •     terminate w/core
SIGSYS   invalid system call                                     •       •     terminate w/core
SIGTRAP  hardware fault                                          •       •     terminate w/core
SIGXCPU  CPU limit exceeded (setrlimit)                          •       •     terminate w/core
SIGXFSZ  file size limit exceeded (setrlimit)                    •       •     terminate w/core

In the Default action column, "terminate w/core" means that the memory image of the process is written to a core file in the current working directory of the process (the file is named core, which shows how long this feature has been part of UNIX). Most UNIX debuggers use the core file to examine the state of the process at the time it terminated.

The creation of the core file is not part of POSIX.1; it is an implementation feature of many UNIX versions. UNIX Version 6 did not check conditions (a) and (b) (listed later in this article), and its source code contained the following comment: "If you are looking for protection signals, there are probably a large number of them here when this occurs on a set-user-ID command." 4.3+BSD produces a file named core.prog, where prog is the first 16 characters of the name of the program that was executed. This gives the core file some identity, so it is an improvement.

The "hardware fault" entries in the table correspond to implementation-defined hardware faults. Many of these names come from the early UNIX implementation on the PDP-11. Check the manual of the system you are using to determine exactly which kinds of errors these signals correspond to.

These signals are described in more detail below.

SIGABRT This signal is generated when the abort function is called. The process terminates abnormally.

SIGBUS Indicates an implementation-defined hardware fault.

SIGEMT Indicates an implementation-defined hardware fault. The name EMT comes from the PDP-11 "emulator trap" instruction.

SIGFPE Indicates an arithmetic exception, such as division by 0 or floating-point overflow.

SIGILL Indicates that the process has executed an illegal hardware instruction. 4.3BSD generated this signal from the abort function; SIGABRT is now used for this.

SIGIOT Indicates an implementation-defined hardware fault. The name IOT comes from the PDP-11 mnemonic for the "input/output trap" instruction. Earlier versions of System V generated this signal from the abort function; SIGABRT is now used for this.

SIGQUIT Generated when the user presses the quit key on the terminal (usually Ctrl-\) and sent to all processes in the foreground process group. This signal not only terminates the foreground process group (as SIGINT does) but also produces a core file.

SIGSEGV Indicates that the process has made an invalid memory reference. The name SEGV stands for "segmentation violation".

SIGSYS Indicates an invalid system call. For some unknown reason, the process executed a system call instruction, but the parameter indicating the type of system call was invalid.

SIGTRAP Indicates an implementation-defined hardware fault. The signal name comes from the PDP-11 TRAP instruction.

SIGXCPU SVR4 and 4.3+BSD support the concept of resource limits. This signal is generated if the process exceeds its soft CPU time limit.

SIGXFSZ SVR4 and 4.3+BSD generate this signal if a process exceeds its soft file size limit.

Excerpted from Chapter 10 of Advanced Programming in the UNIX Environment.
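When a shell reports that a command died from a signal, the exit status can be translated back into a signal name and checked against the signals listed in this section. The helper below is a hypothetical sketch, not part of any tool. It assumes the usual shell convention that death by signal N is reported as status 128 + N, uses kill -l to print the name of a signal number, and takes its list of core-dumping signals from the table in this section.

```sh
# describe_exit_status is a made-up helper name for this sketch.
describe_exit_status() {
    code=$1
    name=
    if [ "$code" -gt 128 ]; then
        # kill -l prints the signal name without the SIG prefix.
        name=$(kill -l "$((code - 128))" 2>/dev/null) || name=
    fi
    case "$name" in
        "")
            echo "exited with status $code" ;;
        ABRT|BUS|EMT|FPE|ILL|IOT|QUIT|SEGV|SYS|TRAP|XCPU|XFSZ)
            echo "killed by SIG$name (default action: terminate with core)" ;;
        *)
            echo "killed by SIG$name (not in the core-dumping table)" ;;
    esac
}

describe_exit_status 0
describe_exit_status 139
```

On Linux, where SIGSEGV is signal 11, the second call reports SIGSEGV. A typical use is to call the helper with $? right after the command of interest.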

Debugging a program with a core file

Consider the following example:

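/* core_dump_test.c */
#include <stdio.h>

char *str = "Test";   /* non-const pointer to a read-only string literal, so the bad write below compiles */
void core_test(void)  /* writes to the literal: undefined behavior, typically SIGSEGV on Linux */
{
    str[1] = 'T';
}

int main(void)
{
    core_test();
    return 0;
}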

Compile:
gcc -g core_dump_test.c -o core_dump_test

If you need to debug the program, compile it with gcc's -g option, which makes it easier to find the faulty location when debugging the core file.

Run:
./core_dump_test
Segmentation fault

A segmentation fault occurred while running core_dump_test, but no core file was produced. This is because the system's default core file size limit is 0, so no core file is created. Use the ulimit command to view and modify the limit:
ulimit -c
0
ulimit -c 1000
ulimit -c
1000

The -c option selects the core file size limit, and 1000 is the new limit, counted in blocks rather than bytes. You can also remove the limit on core file size altogether:

ulimit -c unlimited
ulimit -c
unlimited

To make the change permanent, modify a configuration file such as .bash_profile, /etc/profile, or /etc/security/limits.conf.
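The limit has a soft value, which is the one that takes effect, and a hard value that caps how far an unprivileged user may raise the soft one. As a sketch, the commands below read both and try out changes inside command substitutions so that the current session stays untouched; the variable names are illustrative.

```sh
# -S selects the soft limit, -H the hard limit.
soft=$(ulimit -S -c)
hard=$(ulimit -H -c)
echo "core size limit: soft=$soft hard=$hard"

# Command substitution runs in a subshell, so these changes are discarded
# once the value has been printed.
raised=$(ulimit -S -c "$hard"; ulimit -S -c)
disabled=$(ulimit -S -c 0; ulimit -S -c)
echo "after raising to the hard limit: $raised"
echo "after disabling core files:      $disabled"
```

If the hard limit itself is 0, ulimit -c unlimited fails for an ordinary user, and the limit has to be raised through one of the configuration files instead.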

Run again:
./core_dump_test
Segmentation fault (core dumped)
ls core.*
core.6133

You can see that a file named core.6133 has been created; 6133 is the process ID under which core_dump_test ran.

Debugging the core file
The core file is a binary file, and a tool is needed to analyze the memory image of the program at the time of the crash.

file core.6133

core.6133: ELF 32-bit LSB core file Intel 80386, version 1 (SYSV), SVR4-style, from 'core_dump_test'

You can use GDB to debug a core file under Linux.

gdb core_dump_test core.6133

GNU gdb Red Hat Linux (5.3post-0.20021129.18rh)
Copyright 2003 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...
Core was generated by `./core_dump_test'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /lib/tls/libc.so.6...done.
Loaded symbols for /lib/tls/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
#0  0x080482fd in core_test () at core_dump_test.c:7
7           str[1] = 'T';
(gdb) where
#0  0x080482fd in core_test () at core_dump_test.c:7
#1  0x08048317 in main () at core_dump_test.c:12
#2  0x42015574 in __libc_start_main () from /lib/tls/libc.so.6

Typing where in GDB shows the stack at the time of the crash, that is, the chain of all functions called up to and including the current one (GDB shows only the most recent ones). From it we can readily see that the program crashed at line 7 of core_dump_test.c. Note: the program must have been compiled with the -g option. You can also try other commands, such as frame and list. For more detailed usage, consult the GDB documentation.

Where is the core file created?

It is created in the current working directory of the process, which is usually the same path as the program. If the program calls the chdir function, however, the current working directory may change, and the core file is then created in the directory specified by chdir. Many programs crash without our being able to find their core files, and the chdir function is often the reason. Of course, a program crash does not necessarily produce a core file at all.

When is a core file not produced?

A core file is not produced under the following conditions:
(a) the process is set-user-ID, and the current user is not the owner of the program file;
(b) the process is set-group-ID, and the current user is not the group owner of the program file;
(c) the user does not have permission to write in the current working directory;
(d) the file is too big.
The permissions of the core file (assuming the file does not already exist) are usually user-read, user-write, group-read, and other-read.
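Several of these conditions can be checked from the shell before a crash is reproduced. The function below is a rough sketch with a made-up name. It only looks at the set-user-ID and set-group-ID bits of the program file, the write permission on a directory, and the soft core size limit, so it does not cover every case in the list.

```sh
# core_dump_blockers PROGRAM [DIRECTORY]
# Prints one line per likely reason why PROGRAM, crashing with DIRECTORY
# as its working directory, would not leave a core file behind.
core_dump_blockers() {
    prog=$1
    dir=${2:-.}
    # Conditions (a) and (b): set-user-ID or set-group-ID program file.
    if [ -u "$prog" ] || [ -g "$prog" ]; then
        echo "$prog is set-user-ID or set-group-ID"
    fi
    # Condition (c): the working directory must be writable.
    if [ ! -w "$dir" ]; then
        echo "no write permission on $dir"
    fi
    # Condition (d): with a soft limit of 0, any core file is too big.
    if [ "$(ulimit -S -c)" = "0" ]; then
        echo "core size soft limit is 0"
    fi
}

core_dump_blockers ./core_dump_test .
```

An empty result does not guarantee that a core file will be written; it only rules out the checks the function performs.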

With GDB and the core file, we are no longer helpless when a program crashes.

Article Source: http://blog.csdn.net/fengxinze/article/details/6800175
