Causes of Linux memory errors and debugging methods

Source: Internet
Author: User
A segmentation fault occurs when a program accesses a memory segment it should not. Typically the program either lacks permission for that address or no physical memory is mapped there; accessing address 0 is the classic case.

More precisely, a segmentation fault means that the accessed memory lies outside the address space the system has given to the program. On 32-bit x86 these bounds are described by the global descriptor table (GDT). The 48-bit GDTR register locates the table: 32 bits hold the GDT's base address and 16 bits hold its limit. A 16-bit segment selector then picks an entry: its upper 13 bits are the index into the GDT, and its lower 3 bits hold the table indicator and the requested privilege level. Each GDT entry is 64 bits wide and stores the base address and limit of a code or data segment, a present flag, the privilege level at which the program runs, and the granularity of the limit. When a program accesses memory outside these bounds, the CPU raises a protection exception, and the program is reported as having a segmentation fault.

In practice, the following mistakes most often cause segmentation faults, and nearly all of them come down to incorrect pointer use.

1) Accessing a system data area, especially writing to a memory address protected by the system.
The most common case is dereferencing a pointer whose value is address 0.
2) Out-of-bounds memory access (array index out of range, mismatched variable types, and so on), which touches memory that does not belong to the program.

Solution

When we write programs in C/C++, most memory management is left to us. It is tedious work: no matter how clever or experienced you are, small mistakes are hard to avoid. These errors are usually simple and easy to fix once found, but hunting for them by hand is slow and frustrating. This article shows how to quickly locate the statement responsible for an out-of-bounds memory access that ends in a segmentation fault.
The methods below are demonstrated on the following program, which contains a segmentation fault:

 1  void dummy_function(void)
 2  {
 3      unsigned char *ptr = 0x00;
 4      *ptr = 0x00;
 5  }
 6
 7  int main(void)
 8  {
 9      dummy_function();
10
11      return 0;
12  }

To an experienced C/C++ programmer the bug is obvious: the code writes to the memory at address 0, which is normally inaccessible, so it is bound to fail. Let's compile and run it:

xiaosuo@gentux test $ ./a.out
Segmentation fault

As expected, the program crashed and exited.

1. Use gdb to track down the segmentation fault step by step:

This method is the best known and most widely used. First we need an executable with debugging information, so we compile with the "-g -rdynamic" options and then run the newly built program under gdb. The steps are as follows:

xiaosuo@gentux test $ gcc -g -rdynamic d.c
xiaosuo@gentux test $ gdb ./a.out
GNU gdb 6.5
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...Using host libthread_db library "/lib/libthread_db.so.1".

(gdb) r
Starting program: /home/xiaosuo/test/a.out

Program received signal SIGSEGV, Segmentation fault.
0x08048524 in dummy_function () at d.c:4
4       *ptr = 0x00;
(gdb)

As it turns out, we did not even need to step through the program: gdb pointed straight to line 4 of d.c. It really is that simple.

We can also see that the process was terminated by the SIGSEGV signal. The documentation (man 7 signal) says that the default action for SIGSEGV is to terminate the process and generate a core file (the "Segmentation fault" message itself is printed by the shell), which leads us to method 2.

2. Analyze the core file:

What is a core file?

The default action of certain signals is to cause a process to terminate and produce a core dump file, a disk file containing an image of the process's memory at the time of termination. A list of the signals which cause a process to dump core can be found in signal(7).

The text above is taken from the man page (man 5 core). Strangely, no core file appeared on my system. Then I remembered that I had disabled core file generation to cut down on junk files (I am a bit of a neat freak, which is one reason I like Gentoo). A quick check confirmed it, so I raised the core file size limit and tried again:

xiaosuo@gentux test $ ulimit -c
0
xiaosuo@gentux test $ ulimit -c 1000
xiaosuo@gentux test $ ulimit -c
1000
xiaosuo@gentux test $ ./a.out
Segmentation fault (core dumped)
xiaosuo@gentux test $ ls
a.out core d.c f.c g.c pango.c test_iconv.c test_regex.c


The core file is finally generated. Use gdb to debug it:

xiaosuo@gentux test $ gdb ./a.out core
GNU gdb 6.5
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...Using host libthread_db library "/lib/libthread_db.so.1".

warning: Can't read pathname for load map: Input/output error.
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Core was generated by `./a.out'.
Program terminated with signal 11, Segmentation fault.
#0  0x08048524 in dummy_function () at d.c:4
4       *ptr = 0x00;

Impressive: once again a single step took me to the location of the error. I have to admire the design of Linux/Unix systems.
I then remembered that when I used Internet Explorer on Windows, some web pages would trigger a "runtime error". If a Windows compiler was installed on the machine, a dialog box would pop up asking whether to debug; choosing yes opened the compiler in debugging mode.

How can we do the same on Linux? After some quick thinking, I had it: call gdb from the SIGSEGV handler. And so the third method was born:

3. Start the debugger when a segmentation fault occurs:

#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

void dump(int signo)
{
    char buf[1024];
    char cmd[1024];
    FILE *fh;

    snprintf(buf, sizeof(buf), "/proc/%d/cmdline", getpid());
    if (!(fh = fopen(buf, "r")))
        exit(0);
    if (!fgets(buf, sizeof(buf), fh))   /* buf now holds the program name */
        exit(0);
    fclose(fh);
    if (buf[strlen(buf) - 1] == '\n')
        buf[strlen(buf) - 1] = '\0';
    snprintf(cmd, sizeof(cmd), "gdb %s %d", buf, getpid());
    system(cmd);                        /* attach gdb to this process */

    exit(0);
}

void
dummy_function(void)
{
    unsigned char *ptr = 0x00;
    *ptr = 0x00;
}

int
main(void)
{
    signal(SIGSEGV, &dump);             /* run dump() on a segmentation fault */
    dummy_function();

    return 0;
}


Compiling and running it gives the following:

xiaosuo@gentux test $ gcc -g -rdynamic f.c
xiaosuo@gentux test $ ./a.out
GNU gdb 6.5
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...Using host libthread_db library "/lib/libthread_db.so.1".

Attaching to program: /home/xiaosuo/test/a.out, process 9563
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
0xffffe410 in __kernel_vsyscall ()
(gdb) bt
#0  0xffffe410 in __kernel_vsyscall ()
#1  0xb7ee4b53 in waitpid () from /lib/libc.so.6
#2  0xb7e925c9 in strtold_l () from /lib/libc.so.6
#3  0x08048830 in dump (signo=11) at f.c:22
#4  <signal handler called>
#5  0x0804884c in dummy_function () at f.c:31
#6  0x08048886 in main () at f.c:38


How about that? Pretty cool, isn't it?

All of the methods above assume that gdb is available on the system. What if it is not? In fact, glibc provides a family of functions that can dump the stack contents; see /usr/include/execinfo.h for details (I could not find a man page for these functions, which is why they are easy to overlook). The GNU manual also documents them.

4. Use backtrace and objdump for analysis:

The rewritten code is as follows:

#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>

/* A dummy function to make the backtrace more interesting. */
void
dummy_function(void)
{
    unsigned char *ptr = 0x00;
    *ptr = 0x00;
}

void dump(int signo)
{
    void *array[10];
    size_t size;
    char **strings;
    size_t i;

    size = backtrace(array, 10);                /* collect return addresses */
    strings = backtrace_symbols(array, size);   /* translate them to text */

    printf("Obtained %zu stack frames.\n", size);

    for (i = 0; i < size; i++)
        printf("%s\n", strings[i]);

    free(strings);

    exit(0);
}

int
main(void)
{
    signal(SIGSEGV, &dump);
    dummy_function();

    return 0;
}


The compilation and running results are as follows:

xiaosuo@gentux test $ gcc -g -rdynamic g.c
xiaosuo@gentux test $ ./a.out
Obtained 5 stack frames.
./a.out(dump+0x19) [0x80486c2]
[0xffffe420]
./a.out(main+0x35) [0x804876f]
/lib/libc.so.6(__libc_start_main+0xe6) [0xb7e02866]
./a.out [0x8048601]

This time you may be disappointed: the output does not seem to give enough information to pinpoint the error. Don't worry. Let's first see what can be worked out from it. Disassemble the program with objdump and find the code at address 0x804876f:

xiaosuo@gentux test $ objdump -d a.out



 8048765:       e8 02 fe ff ff          call   804856c
 804876a:       e8 25 ff ff ff          call   8048694 <dummy_function>
 804876f:       b8 00 00 00 00          mov    $0x0,%eax
 8048774:       c9                      leave

We have still found the function in which the error occurred (dummy_function). The information is incomplete, but it is better than nothing!

Postscript:

This article has presented several methods for analyzing segmentation faults. Do not dismiss them as the equivalent of Kong Yiji's four ways of writing the character "回": each method has its own scope and suitable environment. Use them as appropriate, or as the doctor orders.
