Debugging segmentation faults in programs on Linux

The following error occurred in today's program:

testrouter[17281]: segfault at 13a4 ip 0000003c0ac0920b sp 00007f1ebdd64bc0 error 4 in libc-2.15.so[3c0ac00000+20000]

The message identifies the error as a segmentation fault and gives the faulting address, the instruction pointer, and the stack pointer. A segmentation fault can have several causes:

1. Out-of-bounds memory access
a) An array is accessed out of bounds because of an incorrect subscript.
b) A string is scanned until its terminator is found, but the string was never properly NUL-terminated.
c) String functions such as strcpy, strcat, sprintf, strcmp, and strcasecmp read or write past the end of a buffer. Prefer the length-bounded variants strncpy, strlcpy, strncat, strlcat, snprintf, strncmp, and strncasecmp. Note that strncpy does not add a terminator when the source fills the whole buffer, and that strlcpy and strlcat come from BSD and are missing from older glibc versions.
2. A multithreaded program uses a function that is not thread-safe.
Use the following reentrant functions instead:
    asctime_r (3c)         gethostbyname_r (3n)      getservbyname_r (3n)
    ctermid_r (3s)         gethostent_r (3n)         getservbyport_r (3n)
    ctime_r (3c)           getlogin_r (3c)           getservent_r (3n)
    fgetgrent_r (3c)       getnetbyaddr_r (3n)       getspent_r (3c)
    fgetpwent_r (3c)       getnetbyname_r (3n)       getspnam_r (3c)
    fgetspent_r (3c)       getnetent_r (3n)          gmtime_r (3c)
    gamma_r (3m)           getnetgrent_r (3n)        lgamma_r (3m)
    getauclassent_r (3)    getprotobyname_r (3n)     localtime_r (3c)
    getauclassnam_r (3)    getprotobynumber_r (3n)   nis_sperror_r (3n)
    getauevent_r (3)       getprotoent_r (3n)        rand_r (3c)
    getauevnam_r (3)       getpwent_r (3c)           readdir_r (3c)
    getauevnum_r (3)       getpwnam_r (3c)           strtok_r (3c)
    getgrent_r (3c)        getpwuid_r (3c)           tmpnam_r (3s)
    getgrgid_r (3c)        getrpcbyname_r (3n)       ttyname_r (3c)
    getgrnam_r (3c)        getrpcbynumber_r (3n)
    gethostbyaddr_r (3n)   getrpcent_r (3n)

3. Data shared between threads is not protected by a lock.
Global data that several threads access at the same time must be protected by a lock; otherwise a core dump is likely.
4. Invalid pointers
a) Dereferencing a NULL pointer.
b) Careless pointer casts. Do not cast a pointer to a block of memory into a pointer to some structure or type unless the memory was originally allocated as that structure or type, or as an array of it. Instead, copy the memory into an object of that structure or type and access the copy. If the start address of the memory is not aligned as the structure or type requires, accessing it through the cast pointer can raise a bus error and cause a core dump.
5. Stack overflow
Avoid large local variables, because local variables are allocated on the stack. They can easily overflow the stack and corrupt the stack and heap, which leads to errors that are hard to explain.
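
Cause 1c in the list above can be illustrated with a small bounded-copy helper. This is a minimal sketch rather than a drop-in replacement: the name copy_bounded and its return convention are assumptions made for this example. It relies only on snprintf, which never writes more than the given size and always terminates the result when the size is non-zero.

```c
#include <stdio.h>

/* Copy src into dst (capacity dst_size), always NUL-terminating.
 * Returns 0 if the whole string fit, -1 if it was truncated or the
 * arguments were unusable. copy_bounded is a hypothetical helper. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    int needed;

    if (dst == NULL || src == NULL || dst_size == 0)
        return -1;

    /* snprintf writes at most dst_size bytes, terminator included,
     * and returns the length the full output would have had. */
    needed = snprintf(dst, dst_size, "%s", src);
    if (needed < 0 || (size_t)needed >= dst_size)
        return -1;
    return 0;
}
```

With a 5-byte buffer, copying "TEST" succeeds, while copying "TESTING" returns -1 and leaves the truncated but still terminated string "TEST" in the buffer.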

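The copy-then-access advice for pointer casts (cause 4b) might look like the sketch below. The struct record layout and the name read_record are hypothetical and exist only for this illustration; the point is that memcpy places no alignment requirement on the source bytes, whereas dereferencing a cast pointer does.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical record layout, used only for this illustration. */
struct record {
    uint32_t id;
    uint16_t port;
};

/* Read a record from raw bytes at an arbitrary, possibly unaligned,
 * offset. Returns 0 on success, -1 if an argument is NULL or the
 * buffer is too short. */
int read_record(const unsigned char *buf, size_t len, size_t offset,
                struct record *out)
{
    if (buf == NULL || out == NULL)
        return -1;
    if (offset > len || len - offset < sizeof *out)
        return -1;

    /* Copy the bytes instead of dereferencing
     * (const struct record *)(buf + offset). */
    memcpy(out, buf + offset, sizeof *out);
    return 0;
}
```

The length check also covers cause 1: the function refuses to read past the end of the buffer.
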
By default, a Linux system does not write a core file when a segmentation fault occurs. The following command shows the current limit on the core file size:

ulimit -c
The result is usually 0. The following two commands set the limit to 2048 blocks or remove it entirely (in bash the unit is a 1024-byte block, not a byte):
ulimit -c 2048
ulimit -c unlimited
Note that a setting entered in the terminal applies only to the current shell session. To make it permanent, add one of the commands above to /etc/profile or an equivalent shell startup file.
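
A process can also raise the limit for itself. The sketch below assumes a Linux/glibc system and uses the standard getrlimit and setrlimit calls with RLIMIT_CORE; the function name enable_core_dumps is made up for this example. It only raises the soft limit to the hard limit, so it matches "ulimit -c unlimited" only when the hard limit is itself unlimited.

```c
#include <sys/resource.h>

/* Raise this process's soft core-file limit to its hard limit.
 * Returns 0 on success, -1 on failure (errno is set).
 * enable_core_dumps is a hypothetical helper name. */
int enable_core_dumps(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_CORE, &lim) != 0)
        return -1;

    lim.rlim_cur = lim.rlim_max;   /* the soft limit may rise up to the hard limit */
    return setrlimit(RLIMIT_CORE, &lim);
}
```

Calling it early in main affects the process itself and any children it starts afterwards, without touching the shell's settings.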

The core file serves the following purpose:

When a program crashes, the kernel can write the current memory of the process to a core file so that the programmer can find out where the program went wrong. The error that almost every C programmer runs into most often is the segmentation fault. When the program crashes, an image of the process's memory is saved in a core file in the current working directory of the process. The core file is simply a memory image used mainly for debugging; the debugger combines it with the debugging information in the executable. By default, Linux/UNIX produces a core file when a process receives one of the following signals:

    Name       Description                              Default action
    SIGABRT    abnormal termination (abort)             terminate w/core
    SIGBUS     hardware fault                           terminate w/core
    SIGEMT     hardware fault                           terminate w/core
    SIGFPE     arithmetic exception                     terminate w/core
    SIGILL     illegal hardware instruction             terminate w/core
    SIGIOT     hardware fault                           terminate w/core
    SIGQUIT    terminal quit character                  terminate w/core
    SIGSEGV    invalid memory reference                 terminate w/core
    SIGSYS     invalid system call                      terminate w/core
    SIGTRAP    hardware fault                           terminate w/core
    SIGXCPU    CPU limit exceeded (setrlimit)           terminate w/core
    SIGXFSZ    file size limit exceeded (setrlimit)     terminate w/core

Some of these signals are described in more detail below:

SIGABRT This signal is generated by calling the abort function. The process terminates abnormally.

SIGBUS This indicates an implementation-defined hardware fault.

SIGEMT This indicates an implementation-defined hardware fault. The name EMT comes from the PDP-11 "emulator trap" instruction.

SIGFPE This signals an arithmetic exception, such as division by 0 or floating-point overflow.

SIGILL This signal indicates that the process has executed an illegal hardware instruction. 4.3BSD generated this signal from the abort function; SIGABRT is now used for that.

SIGIOT This indicates an implementation-defined hardware fault. The name IOT comes from the PDP-11 mnemonic for the "input/output trap" instruction. Earlier versions of System V generated this signal from the abort function; SIGABRT is now used for that.

SIGQUIT This signal is generated when the user presses the quit key on the terminal (usually Ctrl-\) and is sent to all processes in the foreground process group. It not only terminates the foreground process group (as SIGINT does) but also produces a core file.

SIGSEGV This signal indicates that the process has made an invalid memory access. The name SEGV stands for "segmentation violation".

SIGSYS This signals an invalid system call. The process somehow executed a system call instruction, but the parameter that indicates the type of system call was invalid.

SIGTRAP This indicates an implementation-defined hardware fault. The name comes from the PDP-11 TRAP instruction.

SIGXCPU SVR4 and 4.3+BSD support the concept of resource limits. This signal is generated if the process exceeds its soft CPU time limit.

SIGXFSZ This signal is generated by SVR4 and 4.3+BSD if the process exceeds its soft file size limit.
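
How a parent process observes one of these signals can be sketched as follows. The example assumes a Linux/glibc toolchain in its default (GNU) mode; the names terminating_signal, call_abort, and do_nothing are invented for this illustration. WCOREDUMP is not part of POSIX, so its use is guarded.

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run fn in a child process and return the number of the signal that
 * killed it, 0 if it exited normally, or -1 on error. If dumped is
 * not NULL, *dumped is set to 1 when a core dump is reported. */
int terminating_signal(void (*fn)(void), int *dumped)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        fn();
        _exit(0);              /* reached only if fn returns */
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (dumped != NULL) {
        *dumped = 0;
#ifdef WCOREDUMP
        if (WIFSIGNALED(status) && WCOREDUMP(status))
            *dumped = 1;
#endif
    }
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}

/* Example child bodies (hypothetical). */
void call_abort(void) { abort(); }
void do_nothing(void) { }
```

terminating_signal(call_abort, NULL) returns SIGABRT, matching the first row of the table. Whether a core dump is reported depends on the core file size limit in effect.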

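The following example shows a program that causes a segmentation fault:

#include <stdio.h>
#include <string.h>

int main(void)
{
	/* ptr points at a string literal, which is stored in read-only memory */
	char *ptr = "test";

	/* writing through ptr is invalid and raises SIGSEGV */
	strcpy(ptr, "TEST");
	return 0;
}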

gcc -g test.c -o core_test
After compiling, run the program with ./core_test. The program is killed and the shell prints "Segmentation fault". A file named core.15649 appears in the current directory and can be used to track down the error (it can be passed to GDB as a second argument, as in gdb ./core_test core.15649). Here we load the program into the debugger GDB:

gdb ./core_test
GDB displays the following information:

GNU gdb (GDB) Red Hat Enterprise Linux (7.2-60.el6)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
Then execute run at the GDB prompt:

(gdb) run
Starting program: /home/cyl/openwrt/switch/switch/test_core 

Program received signal SIGSEGV, Segmentation fault.
0x00762f16 in __memcpy_ssse3 () from /lib/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.132.el6.i686
This output shows that the process received SIGSEGV, the signal for a segmentation fault, and that the faulting instruction is at address 0x00762f16 inside __memcpy_ssse3 () in the /lib/libc.so.6 library.

When you have finished debugging, type quit to exit GDB.

objdump can also be used to disassemble the program and find the code behind the segmentation fault.

First redirect the disassembly to a file with the following command:

objdump -d ./test_core > segfault3dump
Then search segfault3dump for the location that the error message refers to.

dmesg shows the address that caused the segmentation fault, the instruction pointer, and the stack pointer:

test_core[15649]: segfault at 80484c4 ip 00762f16 sp bfc9d7b8 error 7 in libc-2.12.so[62e000+191000]
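
The number after "error" in this log line is the x86 page-fault error code. The sketch below decodes its three lowest bits; the function name describe_fault is an assumption for this example, and higher bits (such as the instruction-fetch bit) are deliberately ignored.

```c
#include <stdio.h>

/* Describe the low three bits of an x86 page-fault error code as
 * printed after "error" in the kernel's segfault log line.
 * bit 0: 0 = page not present, 1 = protection violation
 * bit 1: 0 = read access,      1 = write access
 * bit 2: 0 = kernel mode,      1 = user mode
 * Writes into buf and returns buf. describe_fault is hypothetical. */
const char *describe_fault(unsigned code, char *buf, size_t size)
{
    snprintf(buf, size, "%s-mode %s, %s",
             (code & 4u) ? "user" : "kernel",
             (code & 2u) ? "write" : "read",
             (code & 1u) ? "protection violation" : "page not present");
    return buf;
}
```

For the two log lines in this article, error 4 decodes to "user-mode read, page not present" and error 7 to "user-mode write, protection violation". The second is what a write to read-only memory looks like.
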
Next, search the disassembly for the faulting address. The result is as follows:

[direwolf@direwolf switch]$ grep -n -A 10 -B 10 "80484c4" ./segfault3dump
121- 80483b9:	c7 04 24 80 95 04 08 	movl   $0x8049580,(%esp)
122- 80483c0:	ff d0                	call   *%eax
123- 80483c2:	c9                   	leave
124- 80483c3:	c3                   	ret
125-
126-080483c4 <main>:
127- 80483c4:	55                   	push   %ebp
128- 80483c5:	89 e5                	mov    %esp,%ebp
129- 80483c7:	83 e4 f0             	and    $0xfffffff0,%esp
130- 80483ca:	83 ec 20             	sub    $0x20,%esp
131: 80483cd:	c7 44 24 1c c4 84 04 	movl   $0x80484c4,0x1c(%esp)
132- 80483d4:	08
133- 80483d5:	c7 44 24 08 05 00 00 	movl   $0x5,0x8(%esp)
134- 80483dc:	00
135- 80483dd:	c7 44 24 04 c9 84 04 	movl   $0x80484c9,0x4(%esp)
136- 80483e4:	08
137- 80483e5:	8b 44 24 1c          	mov    0x1c(%esp),%eax
138- 80483e9:	89 04 24             	mov    %eax,(%esp)
139- 80483ec:	e8 03 ff ff ff       	call   80482f4 <memcpy@plt>
140- 80483f1:	b8 00 00 00 00       	mov    $0x0,%eax
141- 80483f6:	c9
The faulting address 80484c4 appears inside the main function, in the assembly instruction movl $0x80484c4,0x1c(%esp). To map this instruction back to the C source, first compile the program with debugging information and then disassemble it with the source interleaved:

gcc -g test_core.c
objdump -S a.out
This produces the following output; only the entry for the main function is shown here:

080483c4 <main>:
 ************************************************************************/

#include <stdio.h>

int main ()
{
 80483c4:	55                   	push   %ebp
 80483c5:	89 e5                	mov    %esp,%ebp
 80483c7:	83 e4 f0             	and    $0xfffffff0,%esp
 80483ca:	83 ec 20             	sub    $0x20,%esp

	char *ptr = "test";
 80483cd:	c7 44 24 1c c4 84 04 	movl   $0x80484c4,0x1c(%esp)
 80483d4:	08
	strcpy (ptr, "TEST");
 80483d5:	c7 44 24 08 05 00 00 	movl   $0x5,0x8(%esp)
 80483dc:	00
 80483dd:	c7 44 24 04 c9 84 04 	movl   $0x80484c9,0x4(%esp)
 80483e4:	08
 80483e5:	8b 44 24 1c          	mov    0x1c(%esp),%eax
 80483e9:	89 04 24             	mov    %eax,(%esp)
 80483ec:	e8 03 ff ff ff       	call   80482f4 <memcpy@plt>
	return 0;
 80483f1:	b8 00 00 00 00       	mov    $0x0,%eax
}
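The instruction movl $0x80484c4,0x1c(%esp) stores the address of the string literal "test" in ptr. The faulting address reported by dmesg is therefore the literal itself, and the error occurs in the strcpy call (compiled here into a call to memcpy), which tries to write to it.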
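
One possible correction, assuming the intent is simply to overwrite the text, is to copy into writable storage instead of into the literal. The names text and overwrite_text are made up for this sketch.

```c
#include <string.h>

/* Writable storage: the array is initialised with a copy of "test",
 * so overwriting it is legal, unlike writing through a pointer to
 * the string literal itself. */
static char text[] = "test";

/* Overwrite the buffer with "TEST" and return it.
 * overwrite_text is a hypothetical name for this illustration. */
const char *overwrite_text(void)
{
    strcpy(text, "TEST");   /* same length as "test", so it fits exactly */
    return text;
}
```

Declaring the original pointer as const char *ptr = "test"; would additionally let the compiler warn about the invalid strcpy call at build time.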

These two methods are the common ways to locate a segmentation fault. When one occurs, keep the following points in mind:

1. When a segmentation fault occurs, first recall what a segment is, and reason about the cause of the error from there.

2. When using pointers, initialize each pointer when you define it, and check whether it is NULL before you use it.

3. When using arrays, check whether the array has been initialized, whether the subscript is out of bounds, whether the element exists, and so on.

4. When accessing a variable, check whether the memory it occupies has already been freed by the program.

5. When formatting variables, check that the format specifiers match the variables being printed or read.
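
Points 2 and 3 can be combined into a checked accessor, sketched below. The name get_at and its return convention are assumptions for this example; real code would choose whatever error handling suits the program.

```c
#include <stddef.h>

/* Fetch arr[index] into *out only when every precondition holds.
 * Returns 0 on success, -1 for a NULL pointer or an index that is
 * out of range. get_at is a hypothetical helper. */
int get_at(const int *arr, size_t len, size_t index, int *out)
{
    if (arr == NULL || out == NULL)
        return -1;
    if (index >= len)
        return -1;
    *out = arr[index];
    return 0;
}
```

For an array of three elements, index 2 succeeds and index 3 is rejected instead of reading past the end.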
