[Repost] A brief discussion of memory leaks and the tools for detecting them

Source: Internet
Author: User

Reposted from: http://www.cnblogs.com/taoxu0903/archive/2007/10/27/939261.html

Memory leaks are a common and frustrating problem for C++ programmers. A number of techniques have been developed to address them, such as smart pointers and garbage collection. Smart pointer technology is fairly mature, and the STL already includes a class that supports smart pointers, but it does not seem to be widely used, and it does not solve every problem. Garbage collection is mature in Java, but its development in the C++ world has not gone smoothly, even though adding GC support to C++ was proposed early on. Such is the real world: for a C++ programmer, memory leaks are a lasting pain. Fortunately, there are now many tools that can help us verify whether a memory leak exists and identify the offending code.

Definition of memory leaks

In general, when we speak of memory leaks we mean leaks of heap memory. Heap memory is memory that the program allocates from the heap, of arbitrary size (the size of a block can be decided at run time), and that must be explicitly freed after use. Applications typically use functions such as malloc, realloc, and new to allocate a block from the heap; after use, the program is responsible for calling free or delete to release it. Otherwise the block can never be reused, and we say that the memory has leaked. The following short program demonstrates a heap memory leak:

void MyFunction(int nSize)
{
    char* p = new char[nSize];
    if (!GetStringFrom(p, nSize)) {
        MessageBox("Error");
        return;                 // leak: p is not freed on this path
    }
    ... // use the string pointed to by p
    delete[] p;                 // array form matches new char[nSize]
}


Example One

When the function GetStringFrom() returns zero, the memory pointed to by p is not freed. This is a common way for a memory leak to occur. The program allocates memory at the entry of a function and frees it at the exit, but a C function can exit from anywhere, so a leak occurs whenever some exit fails to free memory that should have been freed.
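
One common way to cover every exit path at once is to let an object own the buffer, so that its destructor frees the memory wherever the function returns. The following is a minimal sketch under assumptions, not the author's code: GetStringFrom here is a hypothetical stand-in for the function in Example One, MyFunctionSafe is an invented name, and std::vector replaces the raw new/delete pair.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for GetStringFrom() in Example One:
// fills the buffer and reports failure when it is too small.
bool GetStringFrom(char* buffer, std::size_t size) {
    const char text[] = "hello";
    if (buffer == nullptr || size < sizeof(text)) {
        return false;
    }
    std::memcpy(buffer, text, sizeof(text));
    return true;
}

// Same shape as MyFunction(), but the vector owns the memory.
// Returns the length of the string read, or -1 on failure.
int MyFunctionSafe(std::size_t nSize) {
    std::vector<char> buffer(nSize);   // freed by ~vector on every return
    if (!GetStringFrom(buffer.data(), buffer.size())) {
        return -1;                     // early exit, nothing leaks
    }
    return static_cast<int>(std::strlen(buffer.data()));
}
```

Calling MyFunctionSafe(2) takes the early return and MyFunctionSafe(16) takes the normal one; in both cases the buffer is released when the vector goes out of scope.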

Broadly speaking, memory leaks include not only heap memory leaks but also leaks of system resources (resource leaks), such as kernel-mode HANDLEs, GDI objects, SOCKETs, interfaces, and so on. Fundamentally, these objects allocated by the operating system also consume memory, so leaking them eventually leads to memory leaks as well. Moreover, some objects consume kernel-mode memory, and leaking them heavily can make the whole operating system unstable. By comparison, then, system resource leaks are more serious than heap memory leaks.

A GDI object leak is a common kind of resource leak:

void CMyView::OnPaint(CDC* pDC)
{
    CBitmap bmp;
    CBitmap* pOldBmp;
    bmp.LoadBitmap(IDB_MYBMP);
    pOldBmp = pDC->SelectObject(&bmp);
    ...
    if (Something()) {
        return;                 // leak: pOldBmp is never selected back
    }
    pDC->SelectObject(pOldBmp);
    return;
}


Example Two

When the function Something() returns nonzero, the program exits without selecting pOldBmp back into pDC, which causes the HBITMAP object that pOldBmp points to to leak. If this program runs for a long time, it may corrupt the display of the whole system. The problem shows up more readily under Win9x, because the Win9x GDI heap is much smaller than that of Win2K or NT.
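
The same idea of tying cleanup to a destructor applies to resource leaks like this one. The sketch below is only an illustration under assumptions: FakeDC is a hypothetical stand-in for CDC that merely remembers the selected object, and ScopedSelect and Paint are invented names, not MFC classes or functions.

```cpp
#include <cassert>

// Hypothetical stand-in for a device context: it only remembers
// which object is currently selected, as CDC::SelectObject does.
struct FakeDC {
    int selected = 0;
    int SelectObject(int object) {     // returns the previous selection
        int previous = selected;
        selected = object;
        return previous;
    }
};

// Selects an object on construction and restores the previous one
// in the destructor, so every return path restores it.
class ScopedSelect {
public:
    ScopedSelect(FakeDC& dc, int object)
        : dc_(dc), previous_(dc.SelectObject(object)) {}
    ~ScopedSelect() { dc_.SelectObject(previous_); }
    ScopedSelect(const ScopedSelect&) = delete;
    ScopedSelect& operator=(const ScopedSelect&) = delete;
private:
    FakeDC& dc_;
    int previous_;
};

// Mirrors OnPaint() in Example Two, including the early return.
void Paint(FakeDC& dc, bool something) {
    ScopedSelect guard(dc, 42);        // 42 plays the role of &bmp
    if (something) {
        return;                        // previous selection is still restored
    }
    // ... draw with the selected object ...
}
```

In real MFC code such a guard would have to be declared after the CBitmap, so that it is destroyed first and the bitmap is deselected before it is deleted.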

How a memory leak occurs:

By the way they occur, memory leaks can be divided into four categories:

1. Frequent memory leaks. The leaking code is executed many times and leaks a block of memory on every execution. In Example Two, if Something() always returns true, the HBITMAP object that pOldBmp points to always leaks.

2. Occasional memory leaks. The leaking code leaks only under certain circumstances or sequences of operations. In Example Two, if Something() returns true only in a specific environment, the HBITMAP object that pOldBmp points to does not always leak. Frequent and occasional are relative: in a particular environment, an occasional leak may become a frequent one. The test environment and test method are therefore critical to detecting memory leaks.

3. One-time memory leaks. The leaking code is executed only once, or, because of a flaw in the algorithm, exactly one block of memory always leaks. For example, memory is allocated in a class's constructor but not released in its destructor; because the class is a singleton, the leak happens only once. Another example:

char* g_lpszFileName = NULL;

void SetFileName(const char* lpcszFileName)
{
    if (g_lpszFileName) {
        free(g_lpszFileName);
    }
    g_lpszFileName = strdup(lpcszFileName);
}


Example Three

If the program does not free the string that g_lpszFileName points to before it ends, then no matter how many times SetFileName() is called, exactly one block of memory leaks.

4. Implicit memory leaks. The program keeps allocating memory while it runs, but does not free it until it ends. Strictly speaking there is no leak here, because the program does eventually free all the memory it requested. But for a server program that must run for days, weeks, or even months, failing to free memory promptly can also end up exhausting all of the system's memory. We therefore call this kind of leak an implicit memory leak. An example:

class Connection
{
public:
    Connection(SOCKET s);
    ~Connection();
    ...
private:
    SOCKET _socket;
    ...
};

class ConnectionManager
{
public:
    ConnectionManager() {}
    ~ConnectionManager() {
        std::list<Connection*>::iterator it;
        for (it = _connlist.begin(); it != _connlist.end(); ++it) {
            delete (*it);
        }
        _connlist.clear();
    }
    void OnClientConnected(SOCKET s) {
        Connection* p = new Connection(s);
        _connlist.push_back(p);
    }
    void OnClientDisconnected(Connection* pconn) {
        _connlist.remove(pconn);
        delete pconn;
    }
private:
    std::list<Connection*> _connlist;
};


Example Four

Suppose the server does not call OnClientDisconnected() after a client disconnects. The Connection object representing that connection is then not deleted promptly (when the server program exits, all Connection objects are deleted in the destructor of ConnectionManager). As connections are continually established and dropped, an implicit memory leak occurs.
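
To make the accumulation visible, the scenario can be simulated in a few lines. This is a portable sketch under assumptions rather than the code of Example Four: an int replaces SOCKET, and LiveCount and SimulateClients are hypothetical helpers added only to observe how many objects are still held.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Portable stand-in for the Connection class in Example Four;
// an int replaces SOCKET so the sketch builds anywhere.
struct Connection {
    explicit Connection(int s) : socket(s) {}
    int socket;
};

class ConnectionManager {
public:
    ~ConnectionManager() {
        for (Connection* c : connections_) delete c;
    }
    Connection* OnClientConnected(int s) {
        connections_.push_back(new Connection(s));
        return connections_.back();
    }
    void OnClientDisconnected(Connection* c) {
        connections_.remove(c);
        delete c;
    }
    std::size_t LiveCount() const { return connections_.size(); }
private:
    std::list<Connection*> connections_;
};

// Simulates `clients` connect/disconnect cycles and returns how many
// Connection objects are still held afterwards.
std::size_t SimulateClients(int clients, bool notifyDisconnect) {
    ConnectionManager manager;
    for (int i = 0; i < clients; ++i) {
        Connection* c = manager.OnClientConnected(i);
        if (notifyDisconnect) manager.OnClientDisconnected(c);
    }
    return manager.LiveCount();   // the destructor frees whatever is left
}
```

With notifyDisconnect set to false the count grows with every client, even though the destructor still frees everything at the end, which is exactly why a tool that only inspects the heap at exit reports nothing.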

From the point of view of someone using the program, a memory leak does no harm in itself; an ordinary user does not notice it at all. What really does harm is the accumulation of leaks, which eventually consumes all the memory in the system. Seen this way, a one-time memory leak is harmless because it does not accumulate, while an implicit memory leak is very harmful because it is harder to detect than frequent and occasional leaks.

Detecting memory leaks

The key to detecting memory leaks is the ability to intercept calls to the functions that allocate and free memory. Having intercepted these two kinds of function, we can track the lifetime of every block of memory. For example, each time a block is successfully allocated, its pointer is added to a global list; each time a block is freed, its pointer is removed from the list. When the program ends, the pointers remaining in the list point to the blocks that were never freed. This is only a simple description of the basic principle; a detailed algorithm can be found in Steve Maguire's "Writing Solid Code".
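
The principle described above can be sketched as a pair of wrappers around malloc and free that maintain the global list. This is a minimal illustration under assumptions, not the algorithm from "Writing Solid Code": all names (TrackedAlloc, TrackedFree, LiveBlocks, LeakedBlockCount, TRACKED_ALLOC) are hypothetical, and the sketch is not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>

// Where a block was allocated; recorded when the block is created.
struct AllocationInfo {
    std::size_t size;
    const char* file;
    int line;
};

// The "global list" of live blocks, keyed by pointer.
std::map<void*, AllocationInfo>& LiveBlocks() {
    static std::map<void*, AllocationInfo> blocks;
    return blocks;
}

// Wrapper around malloc that registers each successful allocation.
void* TrackedAlloc(std::size_t size, const char* file, int line) {
    void* p = std::malloc(size);
    if (p != nullptr) {
        LiveBlocks()[p] = AllocationInfo{size, file, line};
    }
    return p;
}

// Wrapper around free that removes the block from the list.
void TrackedFree(void* p) {
    if (p == nullptr) return;
    LiveBlocks().erase(p);
    std::free(p);
}

// Number of blocks that were allocated but not yet freed.
std::size_t LeakedBlockCount() {
    return LiveBlocks().size();
}

// Captures the call site automatically.
#define TRACKED_ALLOC(size) TrackedAlloc((size), __FILE__, __LINE__)
```

At program exit, whatever remains in LiveBlocks() is the leak report: each entry carries the size of the block and the file and line that allocated it.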

To detect heap memory leaks, it is enough to intercept malloc/realloc/free and new/delete (in fact new/delete are themselves implemented with malloc/free, so intercepting the first group is sufficient). For other leaks, the same approach applies: intercept the corresponding allocation and release functions. For example, to detect BSTR leaks, intercept SysAllocString/SysFreeString; to detect HMENU leaks, intercept CreateMenu/DestroyMenu. (Some resources have several allocation functions but only one release function. For example, SysAllocStringLen can also be used to allocate a BSTR, so in such cases several allocation functions must be intercepted.)

On the Windows platform there are three commonly used kinds of tools for detecting memory leaks: the detection facilities built into the MS C-Runtime Library; add-on detection tools such as Purify and BoundsChecker; and the Performance Monitor that comes with Windows NT. Each has its strengths and weaknesses. The MS C-Runtime Library is weaker than the add-on tools, but it is free. Performance Monitor cannot identify the offending code, but it can reveal the existence of implicit memory leaks, which is where the other two kinds of tools are powerless.

Below we discuss these three kinds of tools in detail.

Detecting memory leaks under VC

An application developed with MFC automatically includes memory leak detection code when it is compiled in debug mode. When the program ends, any leaked memory blocks are listed in the Debug window. The following two lines show the information for one leaked block:

E:\TestMemLeak\TestDlg.cpp(70) : {59} normal block at 0x00881710, 200 bytes long.

Data: <abcdefghijklmnop> 61 62 63 64 65 66 67 68 69 6A 6B 6C 6D 6E 6F 70

The first line shows that the block was allocated at line 70 of TestDlg.cpp, that its address is 0x00881710, and that its size is 200 bytes. {59} is the request order number of the call to the memory allocation function; for details, see the MSDN help for _CrtSetBreakAlloc(). The second line shows the contents of the first 16 bytes of the block: in ASCII between the angle brackets, followed by the same bytes in hexadecimal.

It is commonly assumed that these leak detection features are provided by MFC, but they are not. MFC merely wraps and uses the debug functions of the MS C-Runtime Library. Non-MFC programs can also use these debug functions to add memory leak detection. The MS C-Runtime Library has leak detection built into its implementation of functions such as malloc, free, and strdup.

Notice that in a project generated by the MFC Application Wizard, each CPP file has the following macro definitions at the top:

#ifdef _DEBUG
#define new DEBUG_NEW
#undef THIS_FILE
static char THIS_FILE[] = __FILE__;
#endif


With this definition, every occurrence of new in the CPP file is replaced with DEBUG_NEW when the debug version is compiled. So what is DEBUG_NEW? It is also a macro; the following is excerpted from line 1632 of afx.h:

#define DEBUG_NEW new(THIS_FILE, __LINE__)


So if there's a line of code like this:

char* p = new char[200];


After the macro substitution, it becomes:

char* p = new(THIS_FILE, __LINE__) char[200];


According to the C++ standard, for this form of new the compiler looks for an operator new defined as follows:

void* operator new(size_t, LPCSTR, int)


We find an implementation of this operator new at line 63 of afxmem.cpp:

void* AFX_CDECL operator new(size_t nSize, LPCSTR lpszFileName, int nLine)
{
    return ::operator new(nSize, _NORMAL_BLOCK, lpszFileName, nLine);
}

void* __cdecl operator new(size_t nSize, int nType, LPCSTR lpszFileName, int nLine)
{
    ...
    pResult = _malloc_dbg(nSize, nType, lpszFileName, nLine);
    if (pResult != NULL)
        return pResult;
    ...
}


The second operator new function is rather long, so for simplicity only part of it is excerpted. Clearly, the memory allocation is ultimately carried out by the _malloc_dbg function, which is one of the debug functions of the MS C-Runtime Library. Besides the size of the memory to allocate, this function also takes a file name and a line number, which record the code that made the allocation. If the block has not been freed when the program ends, this information is written to the Debug window.

A word here about THIS_FILE, __FILE__, and __LINE__. Both __FILE__ and __LINE__ are macros defined by the compiler. When it encounters __FILE__, the compiler replaces it with a string holding the path name of the file being compiled. When it encounters __LINE__, the compiler replaces it with a number, the line number of the current line of code. DEBUG_NEW uses THIS_FILE rather than __FILE__ directly in order to reduce the size of the object file. Suppose a CPP file uses new in 100 places. If __FILE__ were used directly, the compiler would generate 100 constant strings, all holding the path name of that same CPP file, which is clearly redundant. With THIS_FILE, the compiler generates only one constant string, and all 100 uses of new refer to it through a pointer.
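
The mechanism can be reproduced in portable C++ to see how the pieces fit. The sketch below is an assumption-laden imitation, not MFC's implementation: TRACKED_NEW, kThisFile, AllocationSite, and LastSite are hypothetical names, and the overload simply remembers the most recent call site instead of passing it to _malloc_dbg.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Last allocation site seen by the tagged operator new below.
struct AllocationSite {
    const char* file = nullptr;
    int line = 0;
    std::size_t size = 0;
};

AllocationSite& LastSite() {
    static AllocationSite site;
    return site;
}

// Overload selected by `new (file, line) T`; it records the call
// site and then forwards to the ordinary global operator new.
void* operator new(std::size_t size, const char* file, int line) {
    void* p = ::operator new(size);   // throws std::bad_alloc on failure
    LastSite().file = file;
    LastSite().line = line;
    LastSite().size = size;
    return p;
}

// Matching form, called only if the constructor throws.
void operator delete(void* p, const char*, int) noexcept {
    ::operator delete(p);
}

// One string per source file, playing the role of THIS_FILE.
static const char kThisFile[] = __FILE__;

// Hypothetical macro playing the role of DEBUG_NEW.
#define TRACKED_NEW new (kThisFile, __LINE__)
```

An expression such as `int* p = TRACKED_NEW int(7);` then expands to `new (kThisFile, __LINE__) int(7)`, and the object is released with an ordinary `delete p;` because the memory came from the global operator new.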

Looking again at the project generated by the MFC Application Wizard, we find that only new is mapped in the CPP files. If your program allocates memory by calling malloc directly, the file name and line number of the malloc call are not recorded. If such a block leaks, the MS C-Runtime Library can still detect it, but the information printed for the block does not include the file name and line number of its allocation.

Turning on memory leak detection in a non-MFC program is very easy. Simply add the following lines of code at the entry point of the program:

#include <crtdbg.h>

// Read the current debug flags, add the leak check, and write them back.
int tmpFlag = _CrtSetDbgFlag(_CRTDBG_REPORT_FLAG);
tmpFlag |= _CRTDBG_LEAK_CHECK_DF;
_CrtSetDbgFlag(tmpFlag);


Then, when the program ends, that is, after WinMain, main, or DllMain returns, information about any memory blocks that have not been freed is printed to the Debug window.

If you create a non-MFC application, add the code above at its entry point, and deliberately leave some memory blocks unfreed, you will see information like the following in the Debug window:

{...} normal block at 0x00C91C90, ... bytes long.

Data: < > ... 0A 0B 0C 0D 0E 0F


The memory leak was detected, but the file name and line number were missing compared to the example of the MFC program above. For a larger program, without this information, solving the problem will become very difficult.

To find out where a leaked block was allocated, you need to implement a mapping similar to MFC's, mapping new and malloc to the _malloc_dbg function. I will not go into the details here; you can refer to the MFC source code.
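
As a rough sketch of what such a mapping can look like outside MFC: the CRT itself maps malloc to its debug version when _CRTDBG_MAP_ALLOC is defined before the CRT headers are included, and new can be mapped with a macro in the style of DEBUG_NEW. The names DBG_NEW, EnableLeakReport, AllocateWithMalloc, and AllocateWithNew below are hypothetical, and the guards make the code fall back to plain allocation when it is not built as an MSVC debug build.

```cpp
// Must come before the CRT headers so that malloc maps to _malloc_dbg.
#if defined(_MSC_VER) && defined(_DEBUG)
#define _CRTDBG_MAP_ALLOC
#include <stdlib.h>
#include <crtdbg.h>
// Hypothetical macro name; use it in place of plain `new`.
#define DBG_NEW new (_NORMAL_BLOCK, __FILE__, __LINE__)
#else
#include <stdlib.h>
#define DBG_NEW new
#endif
#include <cassert>

// Turns on the leak report at exit. Returns true when the CRT debug
// heap is available (MSVC debug builds), false elsewhere.
bool EnableLeakReport() {
#if defined(_MSC_VER) && defined(_DEBUG)
    int flags = _CrtSetDbgFlag(_CRTDBG_REPORT_FLAG);
    _CrtSetDbgFlag(flags | _CRTDBG_LEAK_CHECK_DF);
    return true;
#else
    return false;
#endif
}

// Allocates through both mapped paths; the caller frees the results.
char* AllocateWithMalloc(size_t n) { return static_cast<char*>(malloc(n)); }
int* AllocateWithNew() { return DBG_NEW int(5); }
```

If either result is deliberately left unfreed in an MSVC debug build, the leak report should then include the file name and line number of the allocation, unlike the output shown above.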

Because these debug functions are implemented in the MS C-Runtime Library, they can detect only heap memory leaks, and only for memory allocated through malloc, realloc, strdup, and the like. Leaks of system resources such as HANDLEs and GDI objects, or of memory not allocated through the C-Runtime Library such as VARIANTs and BSTRs, cannot be detected. This is a significant limitation of the method. In addition, to record where each block was allocated, the source code must cooperate, which is troublesome when debugging old programs, since modifying source code is never a carefree task. This is another limitation of the method.

For developing a large program, the detection facilities provided by the MS C-Runtime Library are far from sufficient. Next we look at add-on detection tools. The one I use most is BoundsChecker, partly because its feature set is comprehensive and, more importantly, because it is stable. An unstable tool of this kind only adds to the trouble. It comes, after all, from the well-known NuMega, and in my use I have had essentially no major problems with it.

Detecting memory leaks using BoundsChecker

BoundsChecker uses a technique called code injection to intercept calls to the functions that allocate and free memory. Put simply, when your program starts running, BoundsChecker's DLL is automatically loaded into the process's address space (this can be done with a system-level hook), and it then modifies the calls to the memory allocation and release functions in the process so that they pass through its code first and then go on to execute the original code. BoundsChecker does all this without requiring changes to the source code or project configuration files of the program being debugged, which makes it very simple and direct to use.

Here we take the malloc function as an example; the methods for intercepting other functions are similar.

The function to be intercepted may be in a DLL or in the program's own code. For example, if the C-Runtime Library is statically linked, the code of malloc is linked into the program. To intercept calls to such functions, BoundsChecker dynamically modifies their instructions.

Below are two assembly listings, the first without BoundsChecker's intervention and the second with it:

126:  _CRTIMP void * __cdecl malloc (
127:          size_t nSize
128:          )
129:  {
00403C10 push ebp
00403C11 mov ebp,esp
130:  return _nh_malloc_dbg(nSize, _newmode, _NORMAL_BLOCK, NULL, 0);
00403C13 push 0
00403C15 push 0
00403C17 push 1
00403C19 mov eax,[__newmode (0042376c)]
00403C1E push eax
00403C1F mov ecx,dword ptr [nSize]
00403C22 push ecx
00403C23 call _nh_malloc_dbg (00403c80)
00403C28 add esp,14h
131:  }


The following section of code has BoundsChecker intervention:

126:  _CRTIMP void * __cdecl malloc (
127:          size_t nSize
128:          )
129:  {
00403C10 jmp 01F41EC8
00403C15 push 0
00403C17 push 1
00403C19 mov eax,[__newmode (0042376c)]
00403C1E push eax
00403C1F mov ecx,dword ptr [nSize]
00403C22 push ecx
00403C23 call _nh_malloc_dbg (00403c80)
00403C28 add esp,14h
131:  }


When BoundsChecker intervenes, the first three assembly instructions of malloc are replaced with a single jmp instruction, and the original three instructions are moved to address 01F41EC8. When the program enters malloc, it first jumps to 01F41EC8 and executes the original three instructions; from there on it is BoundsChecker's territory. Roughly, it first records the function's return address (the return address is on the stack, so it is easy to modify), then points the return address at code belonging to BoundsChecker, and then jumps back to the remaining original instructions of malloc, that is, to address 00403C15. When malloc finishes, because the return address has been modified, it returns into BoundsChecker's code, where BoundsChecker records the pointer to the memory that malloc allocated and then jumps to the original return address.

If the memory allocation and release functions are in a DLL, BoundsChecker uses a different method to intercept calls to them. It modifies the program's DLL import table so that the function addresses in the table point to its own code, and thereby achieves the interception.
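
The import table approach amounts to swapping a function pointer that every caller goes through. The following portable sketch imitates that idea with an ordinary global pointer; it does not touch a real import table, it is not how BoundsChecker is implemented, and all names in it (g_allocSlot, HookedAlloc, InstallHook, and so on) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Signature shared by the real allocator and the hook.
using AllocFn = void* (*)(std::size_t);

// The function the program would normally import from a DLL.
void* RealAlloc(std::size_t size) { return std::malloc(size); }

// Stand-in for one import table slot: callers always go through
// this pointer instead of calling RealAlloc directly.
AllocFn g_allocSlot = &RealAlloc;

// Saved original target and a counter kept by the "tool".
AllocFn g_originalAlloc = nullptr;
int g_interceptedCalls = 0;

// The hook records the call, then forwards to the original function.
void* HookedAlloc(std::size_t size) {
    ++g_interceptedCalls;
    return g_originalAlloc(size);
}

// Redirects the slot to the hook, as patching the import table would.
void InstallHook() {
    if (g_allocSlot == &HookedAlloc) return;   // already installed
    g_originalAlloc = g_allocSlot;
    g_allocSlot = &HookedAlloc;
}

// Program code: it is unaware of the hook.
void* ProgramAllocates(std::size_t size) {
    return g_allocSlot(size);
}
```

After InstallHook() runs, every call made through the slot passes through HookedAlloc first, without any change to ProgramAllocates itself.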

Having intercepted these allocation and release functions, BoundsChecker can record the lifetime of each allocated block of memory or resource. The next question is how to relate this to the source code: when BoundsChecker detects a leak, how does it report which code allocated the block? The answer is debug information. When we compile a debug version of a program, the compiler records the correspondence between source code and binary code and stores it in a separate file (.pdb) or directly in the target program. By reading the debug information, BoundsChecker can find which line of which source file allocated a given block. Using code injection together with debug information, BoundsChecker can record not only the source location of the call to the allocation function but also the call stack at the time of allocation, along with the source locations of the functions on that call stack. This is very useful when working with a class library such as MFC. The following example illustrates it:


void ShowXItemMenu()
{
    ...
    CMenu menu;

    menu.CreatePopupMenu();
    // add menu items.
    menu.TrackPopupMenu();
    ...
}

void ShowYItemMenu()
{
    ...
    CMenu menu;
    menu.CreatePopupMenu();
    // add menu items.
    menu.TrackPopupMenu();
    menu.Detach(); // this will cause an HMENU leak
    ...
}

BOOL CMenu::CreatePopupMenu()
{
    ...
    hMenu = ::CreatePopupMenu(); // the Win32 API, not the member function
    ...
}


When ShowYItemMenu() is called, we deliberately cause an HMENU leak. As far as BoundsChecker is concerned, however, the leaked HMENU was allocated in CMenu::CreatePopupMenu(). If your program calls CMenu's CreatePopupMenu() in many places and the tool tells you only that the leak comes from CMenu::CreatePopupMenu(), you still cannot pin down the root of the problem: is it in ShowXItemMenu(), in ShowYItemMenu(), or somewhere else that also uses CreatePopupMenu()? With call stack information, the problem becomes easy. BoundsChecker reports the leaked HMENU as follows:

Function                  File                                        Line
CMenu::CreatePopupMenu    E:\8168\vc98\mfc\mfc\include\afxwin1.inl    1009
ShowYItemMenu             E:\testmemleak\mytest.cpp                   100


Other function calls are omitted here.

So it is easy to see that the problem is in ShowYItemMenu(). When programming with a class library such as MFC, most API calls are wrapped inside the library's classes; with call stack information, we can easily trace back to the code that actually caused the leak.

Recording call stack information makes the program run very slowly, so by default BoundsChecker does not record it. You can turn on the option that records call stack information with the following steps:

1. Open the menu: BoundsChecker | Setting...

2. On the Error Detection page, select Custom in the Error Detection Scheme list.

3. Select Pointer and Leak Error Check in the Category combo box.

4. Check the Report Call Stack check box.

5. Click OK.

Building on code injection, BoundsChecker also provides API parameter validation, memory overrun detection, and other features. These are very useful in program development, but because they are outside the topic of this article they are not covered in detail here.

Powerful as BoundsChecker is, it is still powerless against implicit memory leaks. So let us look at how to detect memory leaks with Performance Monitor.

Detecting memory leaks using Performance Monitor

Performance monitoring was built into the NT kernel from the design stage. Measures such as CPU utilization, memory usage, and the frequency of I/O operations are exposed as individual counters, and applications can read these counters to learn about the state of the whole system or of a particular process. Performance Monitor is one such application.

To detect memory leaks, we generally monitor three counters of the Process object: Handle Count, Virtual Bytes, and Working Set. Handle Count records the number of handles the process currently has open; monitoring it helps us find out whether the program leaks handles. Virtual Bytes records the size of the virtual address space the process is currently using. NT allocates memory in two steps: first it reserves a range in the virtual address space, at which point the operating system does not allocate physical memory but merely sets aside a range of addresses; then the range is committed, and only then does the operating system allocate physical memory. Virtual Bytes is therefore generally larger than the program's Working Set. Monitoring Virtual Bytes can help us find some low-level problems in the system. Working Set records the total amount of memory the operating system has committed for the process. This value is closely related to the amount of memory the program has requested; if the program leaks memory, it keeps increasing steadily, whereas Virtual Bytes increases in jumps.

Monitoring these counters lets us understand how the process uses memory. If a leak occurs, even an implicit one, the values of these counters keep increasing. However, this tells us only that there is a problem, not where it is, so we generally use Performance Monitor to verify whether there is a memory leak and BoundsChecker to find and fix it.

When Performance Monitor shows a memory leak but BoundsChecker cannot detect it, there are two possibilities. First, an occasional memory leak has occurred. In that case, make sure the program is run in the same environment and with the same operations under Performance Monitor and under BoundsChecker. Second, an implicit memory leak has occurred. In that case you need to re-examine the design of the program, study carefully the graphs of the counter values recorded by Performance Monitor, and analyze how the changes relate to the program's logic in order to find possible causes. It is a painful process, full of hypotheses, guesses, verifications, and failures, but it is also an excellent opportunity to accumulate experience.
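
When counter values are logged over time, a first rough check for sustained growth can be automated. The sketch below is only one possible heuristic under stated assumptions: the samples are taken to be periodic readings of a counter such as Working Set, and the function name LooksLikeLeak and both thresholds are invented for illustration rather than taken from Performance Monitor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true when the samples never fall by more than `tolerance`
// from one reading to the next and the last reading exceeds the first
// by at least `minGrowth`. Both thresholds are assumptions of the sketch.
bool LooksLikeLeak(const std::vector<std::size_t>& samples,
                   std::size_t tolerance, std::size_t minGrowth) {
    if (samples.size() < 2) return false;
    for (std::size_t i = 1; i < samples.size(); ++i) {
        if (samples[i] + tolerance < samples[i - 1]) {
            return false;            // a real drop: memory was released
        }
    }
    return samples.back() >= samples.front() + minGrowth;
}
```

A positive result only says that the value kept growing during the observation window; relating that growth to the program's logic remains the manual analysis described above.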

Summary

Memory leaks are a large and complex problem. Even environments with garbage collection, such as Java and .NET, can leak, for example through implicit memory leaks. Limited by space and by my own ability, this article can only scratch the surface of the topic. Other problems, such as detecting leaks across multiple modules or analyzing memory usage while a program runs, are all topics that deserve deeper study. If you have ideas or suggestions, or find any errors, you are welcome to get in touch with me.
