C++ Memory Leak Mechanism

Last Update:2018-12-03 Source: Internet

Author: User


For a C/C++ programmer, memory leaks are a common and troublesome problem. Many techniques have been developed to address it, such as smart pointers and garbage collection. Smart pointer technology is relatively mature, and the STL already includes a smart pointer class, but it does not seem to be widely used, and it cannot solve every problem. Garbage collection is mature in Java, but its development in the C/C++ world has not been smooth, although people have long proposed adding GC support to C++. The reality is that, as a C/C++ programmer, memory leaks are always a worry at the back of your mind. Fortunately, many tools can help us verify whether a leak exists and find the code responsible.

Definition of memory leaks

　
Generally, a memory leak means a heap memory leak. Heap memory is memory that the program allocates from the heap. It can be of any size (the size of the block can be determined at run time), and it must be released explicitly after use. Applications generally use functions such as malloc, realloc, and new to allocate a block from the heap. After use, the program must call free or delete to release the block; otherwise the memory cannot be used again, and we say that it has leaked. The following small program demonstrates a heap memory leak:

void MyFunction(int nSize)
{
    char* p = new char[nSize];
    if (!GetStringFrom(p, nSize)) {
        MessageBox("Error");
        return;                 // p is not released on this path
    }
    ... // use the string pointed to by p
    delete[] p;                 // arrays allocated with new[] need delete[]
}

Example 1

When GetStringFrom() returns zero, the memory pointed to by p is never released. This is a common kind of memory leak. The program allocates memory at the entry of a function and releases it at the exit, but a C function can return from any point, so a leak occurs whenever some exit path fails to release memory that should be released.
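
One common way to close every exit path at once is to let an object own the buffer. The following is a minimal sketch, not the original author's code: GetStringFrom() here is a hypothetical stand-in with an extra simulateFailure parameter, and MyFunction() returns a length only so that the result can be observed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical stand-in for the article's GetStringFrom(): fills the buffer
// with a string, or returns false to simulate a failure.
bool GetStringFrom(char* buffer, std::size_t size, bool simulateFailure)
{
    if (simulateFailure || size == 0) {
        return false;
    }
    std::memset(buffer, 'a', size - 1);
    buffer[size - 1] = '\0';
    return true;
}

// Returns the length of the string that was read, or -1 on failure.
long MyFunction(std::size_t nSize, bool simulateFailure)
{
    // unique_ptr<char[]> calls delete[] when p goes out of scope,
    // whichever return statement is taken.
    std::unique_ptr<char[]> p(new char[nSize]);
    if (!GetStringFrom(p.get(), nSize, simulateFailure)) {
        return -1;              // no leak: p is released here
    }
    return static_cast<long>(std::strlen(p.get()));
}
```

Because the release is tied to the lifetime of p rather than to a particular line of code, adding another early return later does not reintroduce the leak.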

　
Broadly speaking, memory leaks include not only heap memory leaks but also system resource leaks, such as kernel handles, GDI objects, sockets, interfaces, and so on. These objects, which are allocated by the operating system, also consume memory, so leaking them eventually leaks memory too. In addition, some objects consume kernel-mode memory, and when such objects leak badly the entire operating system becomes unstable. System resource leaks are therefore more serious than heap memory leaks.

GDI object leaks are a common kind of resource leak:

void CMyView::OnPaint(CDC* pDC)
{
    CBitmap bmp;
    CBitmap* pOldBmp;
    bmp.LoadBitmap(IDB_MYBMP);
    pOldBmp = pDC->SelectObject(&bmp);
    ...
    if (Something()) {
        return;
    }
    pDC->SelectObject(pOldBmp);
    return;
}

Example 2

　
When Something() returns a non-zero value, the program exits without selecting pOldBmp back into pDC, so the HBITMAP object that pOldBmp points to leaks. If the program runs for a long time, the display of the whole system may become garbled. This problem shows up easily under Win9x, because the GDI heap of Win9x is much smaller than that of Win2k or NT.
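
The same early-return hazard can be handled with a small guard object whose destructor restores the previous selection. The sketch below is portable and deliberately does not use GDI: FakeDC and SelectionGuard are hypothetical names, and objects are represented by plain integers, so it only illustrates the pattern.

```cpp
#include <cassert>

// Hypothetical stand-in for a device context: it only remembers which
// object is currently selected, identified here by a plain integer.
struct FakeDC {
    int selected = 0;
    int SelectObject(int obj)
    {
        int old = selected;
        selected = obj;
        return old;             // like GDI, hand back the previous object
    }
};

// Selects an object on construction and restores the old one on destruction.
class SelectionGuard {
public:
    SelectionGuard(FakeDC& dc, int obj) : dc_(dc), old_(dc.SelectObject(obj)) {}
    ~SelectionGuard() { dc_.SelectObject(old_); }
    SelectionGuard(const SelectionGuard&) = delete;
    SelectionGuard& operator=(const SelectionGuard&) = delete;
private:
    FakeDC& dc_;
    int old_;
};

// Mirrors the shape of OnPaint() in Example 2, including the early return.
bool Paint(FakeDC& dc, bool exitEarly)
{
    SelectionGuard guard(dc, 42);   // 42 plays the role of &bmp
    if (exitEarly) {
        return false;               // the guard still restores the selection
    }
    return true;
}
```

With a real CDC the guard would hold the CBitmap* returned by SelectObject() instead of an integer.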

How memory leaks occur

Memory leaks can be classified by the way they occur:

1. Frequent memory leaks. The leaking code is executed many times and leaks a block of memory each time. For example, if Something() in Example 2 always returns true, the HBITMAP object pointed to by pOldBmp leaks every time.

　
2. Occasional memory leaks. The leaking code runs only in certain environments or under certain operations. For example, if Something() in Example 2 returns true only in a specific environment, the HBITMAP object pointed to by pOldBmp does not always leak. Frequent and occasional are relative terms: in a particular environment, an occasional leak may become a frequent one. The test environment and test method are therefore crucial for detecting memory leaks.

3. One-time memory leaks. The leaking code is executed only once, or, because of a flaw in the algorithm, exactly one block of memory ever leaks. For example, memory is allocated in a class constructor but not released in the destructor; because the class is a singleton, the leak occurs only once. Another example:

char* g_lpszFileName = NULL;

void SetFileName(const char* lpcszFileName)
{
    if (g_lpszFileName) {
        free(g_lpszFileName);
    }
    g_lpszFileName = strdup(lpcszFileName);
}

Example 3

If the program does not release the string pointed to by g_lpszFileName before it exits, then no matter how many times SetFileName() is called, exactly one block of memory leaks.

　
4. Implicit memory leaks. The program keeps allocating memory while it runs but does not release it until it exits. Strictly speaking there is no leak, because the program eventually releases all the memory it requested. However, for a server program that must run for days, weeks, or even months, failing to release memory promptly may eventually exhaust all the memory of the system. We therefore call this kind of leak an implicit memory leak. For example:

class Connection
{
public:
    Connection(SOCKET s);
    ~Connection();
    ...
private:
    SOCKET _socket;
    ...
};

class ConnectionManager
{
public:
    ConnectionManager() {}
    ~ConnectionManager() {
        list<Connection*>::iterator it;
        for (it = _connList.begin(); it != _connList.end(); ++it) {
            delete (*it);
        }
        _connList.clear();
    }
    void OnClientConnected(SOCKET s) {
        Connection* p = new Connection(s);
        _connList.push_back(p);
    }
    void OnClientDisconnected(Connection* pConn) {
        _connList.remove(pConn);
        delete pConn;
    }
private:
    list<Connection*> _connList;
};

Example 4

　
Suppose that, after a client disconnects from the server, the server fails to call OnClientDisconnected(). The Connection object for that connection is then not deleted promptly (when the server program exits, all Connection objects are deleted in the ConnectionManager destructor). As connections are continually established and dropped, an implicit memory leak occurs.
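
The effect can be made visible with a live-object counter. The following is a simplified sketch of Example 4, under the assumption that a socket can be represented by an int; liveCount and SimulateClients() are hypothetical helpers added only for observation, and the list stores std::unique_ptr so that the destructor needs no explicit loop.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>

struct Connection {
    explicit Connection(int s) : socket(s) { ++liveCount; }
    ~Connection() { --liveCount; }
    Connection(const Connection&) = delete;
    Connection& operator=(const Connection&) = delete;
    int socket;                 // assumption: an int stands in for SOCKET
    static int liveCount;       // number of Connection objects alive
};
int Connection::liveCount = 0;

class ConnectionManager {
public:
    Connection* OnClientConnected(int s)
    {
        connList_.push_back(std::unique_ptr<Connection>(new Connection(s)));
        return connList_.back().get();
    }
    void OnClientDisconnected(Connection* pConn)
    {
        // Erasing the unique_ptr deletes the Connection it owns.
        connList_.remove_if([pConn](const std::unique_ptr<Connection>& c) {
            return c.get() == pConn;
        });
    }
private:
    std::list<std::unique_ptr<Connection>> connList_;
};

// Connects `clients` clients one after another and returns how many
// Connection objects are still alive while the manager exists.
int SimulateClients(int clients, bool notifyOnDisconnect)
{
    ConnectionManager mgr;
    for (int i = 0; i < clients; ++i) {
        Connection* c = mgr.OnClientConnected(i);
        if (notifyOnDisconnect) {
            mgr.OnClientDisconnected(c);
        }
    }
    return Connection::liveCount;
}
```

When the disconnect notification is skipped, every object is still released once the manager is destroyed, which is why a check at program exit reports nothing, yet the count grows for as long as the server keeps running.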

From the point of view of the user, a memory leak in itself does no harm; ordinary users never notice that it exists. The real danger is the accumulation of leaks, which eventually consumes all the memory of the system. From this point of view, a one-time leak is harmless because it does not accumulate, whereas an implicit leak is very harmful because it is harder to detect than frequent and occasional leaks.

Detecting memory leaks

The key to detecting memory leaks is being able to intercept calls to the functions that allocate and release memory. By intercepting these two kinds of functions, we can track the life cycle of every block of memory. For example, each time a block is successfully allocated, its pointer is added to a global list; each time a block is released, its pointer is removed from the list. When the program ends, the pointers remaining in the list point to the blocks that were never released. This is only a simple description of the basic principle; for detailed algorithms, see Writing Solid Code by Steve Maguire.

To detect heap memory leaks, it is enough to intercept malloc/realloc/free and new/delete (in fact, new/delete ultimately use malloc/free, so intercepting the first group is sufficient). Other kinds of leaks can be detected in a similar way by intercepting the corresponding allocation and release functions. For example, to detect BSTR leaks, intercept SysAllocString/SysFreeString; to detect HMENU leaks, intercept CreateMenu/DestroyMenu. (Some resources have several allocation functions but only one release function; for example, SysAllocStringLen can also allocate a BSTR. In that case all of the allocation functions must be intercepted.)
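
The bookkeeping described above can be sketched in a few lines. This is an illustration of the principle only, not the algorithm from Writing Solid Code: LeakTracker and TRACKED_ALLOC are hypothetical names, and instead of intercepting malloc/free the sketch asks the caller to go through explicit wrapper functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <map>

class LeakTracker {
public:
    // Allocates a block and remembers where the request came from.
    void* Alloc(std::size_t size, const char* file, int line)
    {
        void* p = std::malloc(size);
        if (p != nullptr) {
            live_[p] = Record{size, file, line};
        }
        return p;
    }
    // Releases a block and forgets it.
    void Free(void* p)
    {
        if (p == nullptr) {
            return;
        }
        live_.erase(p);
        std::free(p);
    }
    std::size_t LiveBlocks() const { return live_.size(); }
    std::size_t LiveBytes() const
    {
        std::size_t total = 0;
        for (const auto& entry : live_) {
            total += entry.second.size;
        }
        return total;
    }
    // Whatever is still in the map at this point has not been released.
    void Report() const
    {
        for (const auto& entry : live_) {
            std::fprintf(stderr, "%p: %zu bytes allocated at %s(%d)\n",
                         entry.first, entry.second.size,
                         entry.second.file, entry.second.line);
        }
    }
private:
    struct Record {
        std::size_t size;
        const char* file;
        int line;
    };
    std::map<void*, Record> live_;
};

// Captures the call site automatically.
#define TRACKED_ALLOC(tracker, size) (tracker).Alloc((size), __FILE__, __LINE__)
```

Calling Report() just before the program exits lists the blocks that were allocated through the tracker and never freed. Resources with several allocation functions would simply route all of them to the same map.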

On Windows, three kinds of memory leak detection tools are commonly used: the detection functions built into the MS C-Runtime Library; plug-in detection tools such as Purify and BoundsChecker; and the Performance Monitor that comes with Windows NT. Each has its own strengths and weaknesses. The MS C-Runtime Library is weaker than the plug-in tools, but it is free. Performance Monitor cannot point to the code that causes the problem, but it can reveal implicit memory leaks, which the other two kinds of tools cannot.

The three kinds of tools are discussed in detail below.

Detecting memory leaks under VC

Applications developed with MFC automatically include memory leak detection code when compiled in debug mode. When the program ends, any leaked memory blocks are listed in the debug window. The following two lines show the information for one leaked block:

E:/testmemleak/testdlg.cpp(70) : {59} normal block at 0x00881710, 200 bytes long.
 Data: <abcdefghijklmnop> 61 62 63 64 65 66 67 68 69 6A 6B 6C 6D 6E 6F 70

　
The first line shows that the block was allocated at line 70 of testdlg.cpp, at address 0x00881710, and is 200 bytes long. {59} is the request order of the call to the memory allocation function; for details, see the help for _CrtSetBreakAlloc() in MSDN. The second line shows the first 16 bytes of the block: the text in angle brackets is the ASCII rendering, followed by the same bytes in hexadecimal.

We often mistakenly assume that these leak detection functions are provided by MFC, but they are not. MFC merely wraps and uses the debug functions of the MS C-Runtime Library. Non-MFC programs can also use the debug functions of the MS C-Runtime Library to add leak detection. The MS C-Runtime Library has leak detection built into its implementations of functions such as malloc/free and strdup.

Note that in a project generated by the MFC Application Wizard, every CPP file has the following macro definitions at its head:

#ifdef _DEBUG
#define new DEBUG_NEW
#undef THIS_FILE
static char THIS_FILE[] = __FILE__;
#endif

With this definition, every new in the CPP file is replaced by DEBUG_NEW when the debug version is compiled. So what is DEBUG_NEW? DEBUG_NEW is also a macro. The following is taken from afx.h, line 1632:

#define DEBUG_NEW new(THIS_FILE, __LINE__)

So if the code contains a line like this:

char* p = new char[200];

after macro replacement it becomes:

char* p = new(THIS_FILE, __LINE__) char[200];

According to the C++ standard, for this use of new the compiler looks for an operator new declared like this:

void* operator new(size_t, LPCSTR, int)

We find an implementation of such an operator new at line 63 of afxmem.cpp:

void* AFX_CDECL operator new(size_t nSize, LPCSTR lpszFileName, int nLine)
{
    return ::operator new(nSize, _NORMAL_BLOCK, lpszFileName, nLine);
}

void* __cdecl operator new(size_t nSize, int nType, LPCSTR lpszFileName, int nLine)
{
    ...
    pResult = _malloc_dbg(nSize, nType, lpszFileName, nLine);
    if (pResult != NULL)
        return pResult;
    ...
}

The second operator new is relatively long, so for simplicity only part of it is quoted. Clearly, the memory is ultimately allocated through _malloc_dbg, which is one of the debug functions of the MS C-Runtime Library. Besides the size of the block, this function takes two more parameters, a file name and a line number, which record the code that made the allocation. If the block has not been released when the program ends, this information is written to the debug window.
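
The same mechanism can be tried outside MFC with a portable sketch. The code below is not the MFC implementation: AllocSite, g_lastSite, and TRACKED_NEW are hypothetical names, and instead of calling _malloc_dbg the overloads forward to the ordinary global operator new and merely remember the most recent call site.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Where the most recent tracked allocation was requested.
struct AllocSite {
    const char* file;
    int line;
    std::size_t size;
};
static AllocSite g_lastSite = {nullptr, 0, 0};

// Placement forms of operator new that receive the file name and line number.
void* operator new(std::size_t size, const char* file, int line)
{
    void* p = ::operator new(size);         // the ordinary allocation
    g_lastSite = AllocSite{file, line, size};
    return p;
}
void* operator new[](std::size_t size, const char* file, int line)
{
    void* p = ::operator new[](size);
    g_lastSite = AllocSite{file, line, size};
    return p;
}

// Matching placement deletes; the compiler calls these only when a
// constructor throws during a tracked new expression.
void operator delete(void* p, const char*, int) noexcept
{
    ::operator delete(p);
}
void operator delete[](void* p, const char*, int) noexcept
{
    ::operator delete[](p);
}

// Plays the role of DEBUG_NEW.
#define TRACKED_NEW new (__FILE__, __LINE__)
```

A statement such as char* p = TRACKED_NEW char[200]; then expands to new (__FILE__, __LINE__) char[200], which selects the array overload above. The block is released with a normal delete[] because the memory came from the ordinary operator new[].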

While we are here, a word about THIS_FILE, __FILE__, and __LINE__. Both __FILE__ and __LINE__ are macros defined by the compiler. When it encounters __FILE__, the compiler replaces it with a string holding the path name of the file being compiled. When it encounters __LINE__, the compiler replaces it with a number, the line number of the current line of code. The definition of DEBUG_NEW uses THIS_FILE rather than __FILE__ directly in order to reduce the size of the object file. Suppose a CPP file uses new in 100 places. If __FILE__ were used directly, the compiler would generate 100 constant strings, all holding the path of the same CPP file, which is obviously redundant. With THIS_FILE, the compiler generates only one constant string, and all 100 calls to new use a pointer to it.

　
Looking again at a project generated by the MFC Application Wizard, we find that only new is mapped in the CPP files. If the program allocates memory by calling malloc directly, the file name and line number of the malloc call are not recorded. If such a block leaks, the MS C-Runtime Library can still detect it, but the information printed for the block does not include the file name and line number where it was allocated.

To enable leak detection in a non-MFC program, simply add the following lines of code at the entry point of the program:

int tmpFlag = _CrtSetDbgFlag(_CRTDBG_REPORT_FLAG);
tmpFlag |= _CRTDBG_LEAK_CHECK_DF;
_CrtSetDbgFlag(tmpFlag);

With this in place, after WinMain, main, or DllMain returns, information about any blocks that have not been released is printed to the debug window.

If you create a non-MFC application, add the code above at its entry point, and deliberately leave some blocks unreleased, you will see information like the following in the debug window:

{47} normal block at 0x00C91C90, 200 bytes long.
 Data: <                > 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F

The leak is indeed detected, but compared with the MFC example above, the file name and line number are missing. For a large program, solving the problem without this information is very difficult.

To find out where a leaked block was allocated, you need to implement a mapping similar to the one in MFC, so that functions such as new and malloc are mapped to _malloc_dbg. The details are not covered here; you can refer to the MFC source code.
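
For malloc and its relatives, the C-Runtime Library itself offers a ready-made mapping: defining _CRTDBG_MAP_ALLOC before including stdlib.h and crtdbg.h redirects them to their _dbg versions in debug builds. The sketch below combines that with the flag setup shown earlier. EnableLeakCheck() is a hypothetical helper name, and the preprocessor guards are an assumption added so that the file also compiles, as a no-op, with other compilers.

```cpp
#ifdef _MSC_VER
#define _CRTDBG_MAP_ALLOC       // map malloc/free to _malloc_dbg/_free_dbg
#include <stdlib.h>
#include <crtdbg.h>
#else
#include <stdlib.h>
#endif
#include <cassert>

// Turns on the leak report that is printed at program exit.
// Returns true when the debug heap is available, false otherwise.
bool EnableLeakCheck()
{
#if defined(_MSC_VER) && defined(_DEBUG)
    int flags = _CrtSetDbgFlag(_CRTDBG_REPORT_FLAG);   // read current flags
    flags |= _CRTDBG_LEAK_CHECK_DF;                    // report leaks at exit
    _CrtSetDbgFlag(flags);
    return true;
#else
    return false;   // release build or non-Microsoft compiler: nothing to do
#endif
}
```

Call EnableLeakCheck() once at the start of main or WinMain. In a debug build with the Microsoft compiler, blocks leaked through malloc should then be reported together with the file name and line number of the call.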

　
Because the debug functions are implemented in the MS C-Runtime Library, they can detect only heap memory leaks, and only for memory allocated by malloc, realloc, or strdup. Leaks of system resources such as handles and GDI objects, and of memory not allocated through the C-Runtime Library, such as VARIANT and BSTR, cannot be detected. This is a major limitation of the approach. In addition, to record where a block was allocated, the source code must be modified to cooperate, which is troublesome when debugging older programs; modifying source code is never a carefree job. This is another limitation of the approach.

For the development of a large program, the detection functions provided by the MS C-Runtime Library are far from sufficient. Next we look at the plug-in detection tools. The one I use most is BoundsChecker, partly because its functionality is more complete and, more importantly, because it is stable. If such a tool is unstable, it hinders more than it helps. After all, it comes from the well-known NuMega, and I have had essentially no major problems using it.

Using BoundsChecker to detect memory leaks

BoundsChecker uses a technique called code injection to intercept calls to the functions that allocate and release memory. Put simply, when your program starts running, the BoundsChecker DLL is loaded automatically (this can be done through a system-level hook). It then modifies the calls to memory allocation and release functions in the process so that they go to its own code first and then execute the original code. BoundsChecker does all this without requiring any change to the source code or project configuration of the program being debugged, which makes it very simple and direct to use.

Here we take the malloc function as an example; other functions are intercepted in a similar way.

The functions to be intercepted may be in a DLL or in the code of the program itself. For example, if the C-Runtime Library is linked statically, the code of malloc is linked into the program. To intercept calls to such functions, BoundsChecker dynamically modifies the instructions of these functions.

Below are two pieces of assembly code, the first without BoundsChecker involved and the second with it:

126:  _CRTIMP void * __cdecl malloc (
127:          size_t nSize
128:          )
129:  {
00403c10   push        ebp
00403c11   mov         ebp,esp
130:          return _nh_malloc_dbg(nSize, _newmode, _NORMAL_BLOCK, NULL, 0);
00403c13   push        0
00403c15   push        0
00403c17   push        1
00403c19   mov         eax,[_newmode (0042376c)]
00403c1e   push        eax
00403c1f   mov         ecx,dword ptr [nSize]
00403c22   push        ecx
00403c23   call        _nh_malloc_dbg (00403c80)
00403c28   add         esp,14h
131:  }

The following is the same code with BoundsChecker involved:

126:  _CRTIMP void * __cdecl malloc (
127:          size_t nSize
128:          )
129:  {
00403c10   jmp         01f41ec8
00403c15   push        0
00403c17   push        1
00403c19   mov         eax,[_newmode (0042376c)]
00403c1e   push        eax
00403c1f   mov         ecx,dword ptr [nSize]
00403c22   push        ecx
00403c23   call        _nh_malloc_dbg (00403c80)
00403c28   add         esp,14h
131:  }

　
After BoundsChecker steps in, the first three assembly instructions of malloc are replaced by a single JMP instruction, and the original three instructions are moved to address 01f41ec8. When the program enters malloc, it first jumps to 01f41ec8 and executes the original three instructions; from there on it is in BoundsChecker territory. Roughly, BoundsChecker first records the return address of the function (the return address is on the stack, so it is easy to modify), then points the return address at code belonging to BoundsChecker, and then jumps to the remaining original instructions of malloc, that is, to address 00403c15. When malloc finishes, because the return address has been modified, it returns into BoundsChecker code. BoundsChecker records the pointer that malloc allocated and then jumps to the original return address.

If the allocation and release functions are in a DLL, BoundsChecker intercepts calls to them in a different way: it modifies the DLL import table of the program so that the function addresses in the table point to its own code.
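
The import-table idea can be illustrated with an ordinary table of function pointers. This is an analogy only, not how BoundsChecker or the Windows loader is implemented: ImportTable, g_imports, InstallHooks(), and the other names are hypothetical, and a real import table lives in the executable image rather than in a C++ struct.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Stand-in for a DLL import table: callers reach the allocation functions
// only through these pointers.
struct ImportTable {
    void* (*alloc)(std::size_t);
    void (*release)(void*);
};

static void* RealAlloc(std::size_t n) { return std::malloc(n); }
static void RealFree(void* p) { std::free(p); }

static ImportTable g_imports = {RealAlloc, RealFree};         // what callers use
static const ImportTable g_original = {RealAlloc, RealFree};  // saved targets
static long g_outstanding = 0;                                // blocks not yet freed

// The hooks do their bookkeeping and then forward to the original functions.
static void* HookedAlloc(std::size_t n)
{
    void* p = g_original.alloc(n);
    if (p != nullptr) {
        ++g_outstanding;
    }
    return p;
}
static void HookedFree(void* p)
{
    if (p != nullptr) {
        --g_outstanding;
    }
    g_original.release(p);
}

// Redirect the table entries, as a detection tool would patch the import table.
void InstallHooks()
{
    g_imports.alloc = HookedAlloc;
    g_imports.release = HookedFree;
}

// Put the original entries back.
void RemoveHooks()
{
    g_imports = g_original;
}
```

Code that calls g_imports.alloc() and g_imports.release() does not change at all; only the table entries do, which is why this style of interception needs no changes to the program being examined.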

　
By intercepting these allocation and release functions, BoundsChecker can record the life cycle of allocated memory or resources. The next question is how this relates to the source code: when BoundsChecker detects a leak, how does it report which code allocated the block? The answer is debug information. When we compile a debug build, the compiler records the correspondence between source code and binary code and places it in a separate file (.pdb) or links it directly into the target program. By reading the debug information directly, BoundsChecker can determine which line of which source file allocated a given block. Using code injection together with debug information, BoundsChecker can record not only the source location of the call to the allocation function but also the call stack at the time of the allocation and the source locations of the functions on that stack. This is very useful when you use a class library such as MFC. Here is an example:

void ShowXItemMenu()
{
    ...
    CMenu menu;

    menu.CreatePopupMenu();
    // add menu items.
    menu.TrackPopupMenu();
    ...
}

void ShowYItemMenu()
{
    ...
    CMenu menu;
    menu.CreatePopupMenu();
    // add menu items.
    menu.TrackPopupMenu();
    menu.Detach(); // this will cause an HMENU leak
    ...
}

BOOL CMenu::CreatePopupMenu()
{
    ...
    hMenu = CreatePopupMenu();
    ...
}

　
When ShowYItemMenu() is called, we deliberately cause an HMENU leak. To BoundsChecker, however, the leaked HMENU was allocated in CMenu::CreatePopupMenu(). Suppose that many places in your program use the CreatePopupMenu() function of CMenu. Knowing only that the leak came from CMenu::CreatePopupMenu(), you still cannot tell where the root of the problem lies: is it in ShowYItemMenu(), or in some other place that uses CreatePopupMenu()? With call stack information, the question is easy to answer. BoundsChecker reports the leaked HMENU as follows:

Function                  File                                    Line
CMenu::CreatePopupMenu    E:/8168/vc98/mfc/include/afxwin1.inl    1009
ShowYItemMenu             E:/testmemleak/mytest.cpp               100

Other function calls are omitted here.

From this we can easily see that the function causing the problem is ShowYItemMenu(). When you program with a class library such as MFC, most API calls are wrapped inside the classes of the library; with call stack information, we can easily trace back to the code that is really responsible for the leak.

Recording call stack information slows the program down, so BoundsChecker does not record it by default. To turn on the option for recording call stack information, follow these steps:

1. Open the menu: BoundsChecker | Setting...

2. On the Error Detection page, select Custom in the Error Detection Scheme list.

3. Select Pointer and Leak Error Check in the Category combo box.

4. Check the Report Call Stack check box.

5. Click OK.

Based on code injection, BoundsChecker also provides API parameter validation, memory overrun detection, and other functions. These are very helpful for program development, but they are outside the scope of this article and are not described in detail here.

Powerful as BoundsChecker is, it is still helpless in the face of implicit memory leaks. So next we look at how to use Performance Monitor to detect memory leaks.

Using Performance Monitor to detect memory leaks

The NT kernel was designed with system monitoring built in: it maintains counters for CPU usage, memory usage, I/O operation frequency, and so on, and applications can read these counters to learn the running state of the whole system or of a single process. Performance Monitor is one such application.

To detect memory leaks, we can generally monitor the Handle Count, Virtual Bytes, and Working Set counters of the Process object. Handle Count records the number of handles the process currently has open; monitoring it helps reveal handle leaks. Virtual Bytes records the size of the virtual memory the process is using in its virtual address space. Memory allocation on NT takes two steps. First, a range of virtual address space is reserved; at this point the operating system allocates no physical memory and merely sets the address range aside. Then the range is committed, and only then does the operating system allocate physical memory. Virtual Bytes is therefore generally larger than the Working Set of the program. Monitoring Virtual Bytes can help us find some low-level system problems. Working Set records the total amount of memory the operating system has committed for the process. This value is closely related to the total amount of memory the program has requested. If the program leaks memory, this value keeps increasing, whereas Virtual Bytes increases in jumps.

Monitoring these counters shows us how the process uses memory. If there is a leak, even an implicit one, the counter values keep increasing. However, this tells us only that there is a problem, not where it is, so we generally use Performance Monitor to verify whether a leak exists and BoundsChecker to locate and fix it.
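
Once counter values have been logged at regular intervals, judging whether a value "keeps increasing" can be automated. The sketch below is one possible heuristic rather than anything Performance Monitor provides: it fits a least-squares line through the samples and treats a slope above a caller-chosen threshold as suspicious. GrowthPerSample() and LooksLikeLeak() are hypothetical names, and the sample values in the usage note are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Least-squares slope of the samples, in counter units per sampling interval.
double GrowthPerSample(const std::vector<double>& samples)
{
    const std::size_t n = samples.size();
    if (n < 2) {
        return 0.0;
    }
    double sum = 0.0;
    for (double v : samples) {
        sum += v;
    }
    const double meanX = (static_cast<double>(n) - 1.0) / 2.0;
    const double meanY = sum / static_cast<double>(n);
    double num = 0.0;
    double den = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double dx = static_cast<double>(i) - meanX;
        num += dx * (samples[i] - meanY);
        den += dx * dx;
    }
    return num / den;
}

// True when the counter grows faster than minGrowth per sampling interval.
bool LooksLikeLeak(const std::vector<double>& samples, double minGrowth)
{
    return GrowthPerSample(samples) > minGrowth;
}
```

A steadily rising series such as 100, 110, 120, 130 has a slope of 10 per sample, while a series that oscillates around a fixed level, such as 100, 120, 100, 120, 100, has a slope of 0. As the text says, a positive trend shows only that something is growing; it does not show where.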

When Performance Monitor shows a leak but BoundsChecker cannot detect it, there are two possibilities. The first is an occasional memory leak. In this case, make sure that the running environment and the sequence of operations are the same under Performance Monitor and under BoundsChecker. The second is an implicit memory leak. In this case you have to re-examine the design of the program, then study the graph of counter values recorded by Performance Monitor carefully, analyze how the changes relate to the running logic of the program, and work out some possible causes. This is a painful process, full of hypothesis, conjecture, verification, and failure, but it is also an excellent opportunity to accumulate experience.

Summary

Memory leaks are a large and complex problem. Even in environments with garbage collection, such as Java and .NET, leaks are still possible, for example implicit memory leaks. Limited by space and by my own ability, this article can only scratch the surface of the topic. Other issues, such as leak detection across multiple modules and how to analyze memory usage while a program is running, deserve further study. If you have ideas or suggestions, or find any errors, please contact me.

