Reposted from http://blog.csdn.net/sduliulun/article/details/7732906
Reference: http://bbs.ednchina.com/BLOG_ARTICLE_1772918.HTM
What is Valgrind?
Valgrind is a suite of simulation-based debugging tools for Linux, released as open source (GPL v2). It consists of a core and a set of debugging tools built on that core. The core acts as a framework: it simulates a CPU environment and provides services to the tools. The tools act as plug-ins that use those services to carry out specific memory debugging tasks. The architecture of Valgrind is as follows:
[Figure: structure diagram of Valgrind]
Valgrind includes the following tools:
- Memcheck. The most widely used tool in Valgrind: a heavyweight memory checker that finds most of the memory errors made during development, such as use of uninitialized memory, use of freed memory, and out-of-bounds memory access. It is also the tool this article focuses on.
- Callgrind. A profiler that records the call graph of the program, used mainly to find problems in how functions call one another.
- Cachegrind. A cache profiler, used mainly to find problems with cache usage in the program.
- Helgrind. Used mainly to find race conditions in multithreaded programs.
- Massif. A heap profiler, used mainly to find problems with heap usage in the program.
- Extensions. You can use the services provided by the core to write your own debugging tools.
Memory layout of a program under Linux:
The memory space of a typical Linux C program consists of the following parts:
- Code segment (.text). This holds the instructions the CPU executes. The code segment is shareable, so only one copy of the same code exists in memory, and it is read-only, which prevents the program from modifying its own instructions by mistake.
- Initialized data segment (.data). This holds variables that the program explicitly gives an initial value, such as a global variable defined outside all functions: int val = 100;. Note that the two segments above are stored in the program's executable file, and the kernel reads them from that file when an exec function is called to start the program.
- Uninitialized data segment (.bss). The kernel initializes the data in this segment to 0 or null pointers before the program starts running. An example is a global variable defined outside any function: int sum;
- Heap. This segment serves dynamic memory requests made by the program; the commonly used malloc family and new request memory from it.
- Stack. Local variables of functions and the temporary values produced during function calls are stored in this segment.
Memcheck can detect memory problems because it maintains two global tables.
- Valid-value table:
For each byte in the entire address space of the process there are 8 corresponding bits (V bits), and for each CPU register there is a corresponding bit vector. These bits record whether the byte or register holds a valid, initialized value.
- Valid-address table:
For each byte in the entire address space of the process there is 1 corresponding bit (A bit), which records whether that address may be read or written.
Detection principle:
- When a byte in memory is about to be read or written, Memcheck first checks the A bit of that byte. If the A bit shows that the location is invalid, Memcheck reports a read or write error.
- The core behaves like a virtual CPU environment, so when a byte in memory is loaded into the real CPU, the V bits of that byte are loaded into the virtual CPU environment as well. Once the value in a register is used to generate a memory address, or can affect the program's output, Memcheck checks the corresponding V bits; if the value has not been initialized, it reports a use of uninitialized memory.
Using Valgrind
Usage: valgrind [options] prog-and-args
[options]: common options that apply to all Valgrind tools
- --tool=<name> The most commonly used option. Run the Valgrind tool called <name>. Default: memcheck.
- -h --help Display help information.
- --version Display the version of the Valgrind core; each tool has its own version.
- -q --quiet Run quietly, printing only error messages.
- -v --verbose Print more detailed information and more error statistics.
- --trace-children=no|yes Trace child processes? [no]
- --track-fds=no|yes Track open file descriptors? [no]
- --time-stamp=no|yes Add timestamps to log messages? [no]
- --log-fd=<number> Write the log to the given file descriptor [2=stderr]
- --log-file=<file> Write the output to <file>.<pid>, where <pid> is the process ID of the running program
- --log-file-exactly=<file> Write the log to exactly <file>
- --log-file-qualifier=<VAR> Use the value of the environment variable <VAR> as part of the output file name. [none]
- --log-socket=ipaddr:port Write the log to the socket at ipaddr:port
Log information output:
- --xml=yes Output the information in XML format; available only for Memcheck
- --num-callers=<number> Show <number> callers in stack traces [12]
- --error-limit=no|yes Stop showing new errors if there are too many? [yes]
- --error-exitcode=<number> Exit code to return if errors are found [0=disable]
- --db-attach=no|yes When an error occurs, Valgrind automatically starts the debugger gdb. [no]
- --db-command=<command> Command line used to start the debugger [gdb -nw %f %p]
Options specific to the Memcheck tool:
- --leak-check=no|summary|full How much detail to give about leaks? [summary]
- --leak-resolution=low|med|high How much backtrace merging to do in the leak check [low]
- --show-reachable=no|yes Show reachable blocks in the leak check? [no]
Valgrind usage example (1)
The following is a C program with problems, test.c:
#include <stdlib.h>
void f(void)
{
    int *x = malloc(10 * sizeof(int));
    x[10] = 0;    /* Issue 1: array subscript out of bounds */
}                 /* Issue 2: memory not released */
int main(void)
{
    f();
    return 0;
}
Use Valgrind to check the program for bugs: valgrind --tool=memcheck --leak-check=full ./test
Use of uninitialized memory
Problem Analysis:
Variables in different segments of a program start with different initial values: global and static variables start at 0, while local variables and dynamically allocated memory start with indeterminate (effectively random) values. If a program uses such a variable before assigning to it, the program's behavior becomes unpredictable.
The example program shows a common way in which uninitialized variables get used. Array a is a local variable whose initial contents are indeterminate, and the initialization loop does not assign every element, so later use of the array is a latent memory problem.
Results Analysis:
Suppose the file is named badloop.c and the generated executable is badloop. Testing it with Memcheck shows that at line 11 of the program a conditional jump depends on an uninitialized variable. The problem described above is located accurately.
Memory read/write out of bounds
Problem Analysis:
This refers to accessing memory that the program should not or may not access, such as going out of bounds when indexing an array, or going beyond the requested size when accessing dynamically allocated memory. The example program shows a typical out-of-bounds problem. pt is a local array variable of size 4, and p initially points to the start of pt; after p is incremented repeatedly in the loop, it points beyond the end of pt, and writing through p at that point has unpredictable consequences.
Results Analysis:
Assuming the file is named badacc.cpp and the generated executable is badacc, testing it with Memcheck shows an illegal write at line 15 of the program and an illegal read at line 16. The problems described above are identified accurately.
Memory overwrite
Problem Analysis:
What makes C both powerful and dangerous is that it can manipulate memory directly. The C standard library provides many functions for this, such as strcpy, strncpy, memcpy, and strcat. What they have in common is that the caller supplies a source address (src) and a destination address (dst), and the regions that src and dst point to must not overlap; otherwise the result is unpredictable.
The example program has overlapping src and dst. At lines 15 and 17, the addresses that src and dst point to differ by 20, but the specified copy length is 21, so the copy overwrites values that were copied earlier. Line 24 is similar: src (x+20) and dst (x) point to addresses 20 apart, but 21 bytes are copied to dst, so memory is overwritten there as well.
Results Analysis:
Assuming the file is named badlap.cpp and the generated executable is badlap, testing it with Memcheck shows that at lines 15, 17, and 24 of the program the source and destination regions overlap. The problems described above are found accurately.
Dynamic memory management errors
Problem Analysis:
Memory is commonly allocated in three ways: static storage, stack allocation, and heap allocation. Global variables use static storage, and their space is laid out at compile time; local variables inside functions are allocated on the stack; the most flexible form is allocation on the heap, also called dynamic memory allocation. Commonly used dynamic allocation functions include malloc, calloc, realloc, and new; the release functions are free and delete.
Once dynamic memory has been obtained successfully, we have to manage it ourselves, and this is where errors are most likely. The example program contains several errors that are common in dynamic memory management.
Common dynamic memory management errors include the following:
- Inconsistent allocation and release functions
Because C++ is compatible with C, and C and C++ use different functions to request and release memory, a C++ program has two sets of dynamic memory management functions available. The rule that never changes is that memory obtained the C way is released the C way, and memory obtained the C++ way is released the C++ way. That is, memory obtained with malloc/calloc/realloc is released with free, and memory obtained with new is released with delete. The example program requests memory with malloc and releases it with delete; in many cases this causes no visible problem, but it is definitely a latent defect.
- Mismatched number of allocations and releases
However much memory is requested, that much must be released when it is no longer needed. Not releasing it, or releasing less, is a memory leak; releasing more also causes problems. In the example program, the pointers p and pt point to the same block of memory, yet the block is released twice in succession.
- Reading or writing after release
In essence, the system maintains a linked list of dynamic memory blocks on the heap. Once a block is released, it can be handed out again to other parts of the program; accessing it after release may overwrite data that belongs to another part of the program, which is a serious error. Line 16 of the example program still writes to memory after it has been released.
Results Analysis:
Assuming the file is named badmac.cpp and the generated executable is badmac, testing it with Memcheck shows that the allocation and release functions at line 14 do not match, that line 16 performs an illegal write to memory that has already been freed, and that the release at line 17 is invalid. The three problems described above are found accurately.
Memory leaks
Problem Description:
A memory leak is memory that the program requested dynamically, did not release after use, and can no longer reach from any other part of the program. Memory leaks are the most vexing problem in developing large programs, to the point that some people say they are unavoidable. In fact, preventing memory leaks starts with good programming habits; the other important measure is stronger unit testing, and Memcheck is an excellent tool for that.
The example is a typical memory leak case. The main function calls the mk function to create tree nodes, but after those calls complete it never calls the corresponding function nodefr to release the memory, so the tree structure in memory can no longer be reached by any other part of the program, and the memory leaks.
Within a single function, everyone is fairly alert to memory leaks. In many cases, however, we wrap malloc/free or new/delete to suit our own needs, and the allocation and the release no longer happen in the same function. This example also shows where leaks are most likely to occur: at the interface between two parts, where one function requests memory and another function releases it. These functions are often written and used by different people, which makes leaks more likely. Good unit testing habits are needed to eliminate memory leaks at this early stage.
Results Analysis:
Assuming the files above are named tree.h, tree.cpp, and badleak.cpp, and the generated executable is badleak, test it with Memcheck.
The sample program builds a tree with 8 nodes in total, each node 12 bytes in size (taking memory alignment into account). The Memcheck output shows that every leaked block is found. Memcheck divides leaks into two kinds: possible leaks (possibly lost) and definite leaks (definitely lost). Possibly lost means that a pointer can still reach the block, but the pointer no longer points to the start of the block. Definitely lost means that the block can no longer be reached at all. Definitely lost blocks are further divided into directly lost and indirectly lost: a directly lost block has no pointer to it anywhere, while an indirectly lost block is pointed to only by pointers that are themselves located in leaked memory. In the example, the root node is directly lost and the other nodes are indirectly lost.