Reprint: C++ memory leak mechanism

Source: Internet
Author: User
Tags memory usage cpu usage

Memory leaks are a common and troublesome problem for C++ programmers. Many techniques have been developed to deal with them, such as smart pointers and garbage collection. Smart pointer technology is relatively mature, and the STL already contains classes that support smart pointers, but they do not seem to be widely used, and they do not solve every problem. Garbage collection has matured in Java, but its development in the C++ world has not gone smoothly, even though adding GC support to C++ was proposed very early on. The reality is that for a C/C++ programmer, memory leaks are a lasting source of pain. Fortunately, there are a number of tools that can help us verify that a memory leak exists and find the code responsible.

Definition of memory leaks

When we talk about a memory leak, we usually mean a heap memory leak. Heap memory is memory that the program allocates from the heap, of arbitrary size (the size of a block can be decided while the program runs), and that must be explicitly freed after use. An application generally uses functions such as malloc, realloc, and new to allocate a block from the heap; after use, the program is responsible for calling free or delete to release it. Otherwise the block can never be reused, and we say that this memory has leaked. The following small program demonstrates a heap memory leak:

void MyFunction(int nSize)
{
    char* p = new char[nSize];
    if (!GetStringFrom(p, nSize)) {
        MessageBox("Error");
        return;  // leak: p is never released on this path
    }
    ... // using the string pointed to by p
    delete[] p;  // array form, matching new char[nSize]
}

Example One

When GetStringFrom() returns zero, the memory that the pointer p points to is not released. This is a common way for memory leaks to occur: the program allocates memory at the entrance of a function and releases it at the exit, but a C++ function can exit at any point, so a leak occurs whenever some exit fails to release memory that should be freed.
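
One way to cover every exit at once is to let an object own the buffer, so that its destructor releases the memory wherever the function returns. The following is a minimal sketch under stated assumptions, not the original author's code: get_string_from is a hypothetical stand-in for GetStringFrom(), my_function returns a value only so the result can be checked, and std::unique_ptr replaces the raw new/delete pair.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical stand-in for GetStringFrom(): fails when the buffer is too small.
bool get_string_from(char* buffer, std::size_t size) {
    static const char text[] = "hello";
    if (size < sizeof(text)) {
        return false;  // simulated failure
    }
    std::memcpy(buffer, text, sizeof(text));
    return true;
}

// Returns the length of the fetched string, or -1 on failure.
// The unique_ptr runs delete[] on every return path.
int my_function(std::size_t size) {
    std::unique_ptr<char[]> p(new char[size]);
    if (!get_string_from(p.get(), size)) {
        return -1;  // early exit: the buffer is still released
    }
    return static_cast<int>(std::strlen(p.get()));
}
```

With this shape, adding another early return later does not reintroduce the leak, because no exit path has to remember the release.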

In a broad sense, memory leaks include not only heap memory leaks but also leaks of system resources (resource leaks), such as kernel-mode handles, GDI objects, sockets, and interfaces. Fundamentally, these objects allocated by the operating system also consume memory, so leaking them ultimately leaks memory as well. Moreover, some objects consume kernel-mode memory, and leaking enough of them can make the entire operating system unstable. By comparison, then, system resource leaks are more serious than heap memory leaks.

Leaking a GDI object is a common kind of resource leak:

void CMyView::OnPaint(CDC* pDC)
{
    CBitmap bmp;
    CBitmap* pOldBmp;
    bmp.LoadBitmap(IDB_MYBMP);
    pOldBmp = pDC->SelectObject(&bmp);
    ...
    if (Something()) {
        return;  // leak: pOldBmp is not selected back into pDC
    }
    pDC->SelectObject(pOldBmp);
    return;
}

Example Two

When Something() returns non-zero, the program exits without selecting pOldBmp back into pDC, which leaks the HBITMAP object that pOldBmp points to. If this program runs for a long time, it may corrupt the display of the whole system. The problem shows up more readily under Win9x, because the Win9x GDI heap is much smaller than that of Win2K or NT.
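
The same idea of tying the release to a destructor applies to selected GDI objects. The sketch below is only an illustration that builds without MFC: FakeDC is a hypothetical stand-in for a device context, and SelectionGuard is an assumed helper name, not an MFC class. The guard restores the previous selection when it goes out of scope, including on the early return.

```cpp
#include <string>

// Hypothetical stand-in for a device context: it only remembers which object
// is selected and returns the previous one, as SelectObject does.
struct FakeDC {
    std::string selected = "default bitmap";
    std::string select_object(const std::string& obj) {
        std::string old = selected;
        selected = obj;
        return old;
    }
};

// Selects an object on construction and restores the previous one on destruction.
class SelectionGuard {
public:
    SelectionGuard(FakeDC& dc, const std::string& obj)
        : dc_(dc), old_(dc.select_object(obj)) {}
    ~SelectionGuard() { dc_.select_object(old_); }
    SelectionGuard(const SelectionGuard&) = delete;
    SelectionGuard& operator=(const SelectionGuard&) = delete;

private:
    FakeDC& dc_;
    std::string old_;
};

// Same shape as OnPaint(): the early return no longer skips the restore.
void on_paint(FakeDC& dc, bool something) {
    SelectionGuard guard(dc, "my bitmap");
    if (something) {
        return;
    }
    // ... normal drawing would go here ...
}
```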

How memory leaks occur:

By the way they occur, memory leaks can be divided into four categories:

1. Frequent memory leaks. The leaking code is executed many times, and every execution leaks a block of memory. In the GDI example above, if Something() always returns true, the HBITMAP object that pOldBmp points to always leaks.

2. Occasional (accidental) memory leaks. The leaking code runs only under certain circumstances or sequences of operations. In the GDI example above, if Something() returns true only in a particular environment, the HBITMAP object that pOldBmp points to does not always leak. Frequent and occasional are relative: in a particular environment, an occasional leak may become a frequent one. The test environment and test method are therefore critical to detecting memory leaks.

3. One-time memory leaks. The leaking code is executed only once, or, because of a flaw in the algorithm, there is always exactly one leaked block. For example, if a class allocates memory in its constructor and does not release it in its destructor, but the class is a singleton, the leak happens only once. Another example:

char* g_lpszFileName = NULL;

void SetFileName(const char* lpcszFileName)
{
    if (g_lpszFileName) {
        free(g_lpszFileName);
    }
    g_lpszFileName = strdup(lpcszFileName);
}

Example Three

If the program does not free the string that g_lpszFileName points to before it ends, then no matter how many times SetFileName() is called, exactly one block of memory, and only one, will leak.

4. Implicit memory leaks. The program keeps allocating memory while it runs but does not release it until it ends. Strictly speaking, this is not a leak, because the program eventually frees all the memory it requested. But a server program may run for days, weeks, or even months, and failing to release memory promptly can eventually exhaust all of the system's memory. We therefore call this kind of leak an implicit memory leak. For example:

class Connection
{
public:
    Connection(SOCKET s);
    ~Connection();
    ...
private:
    SOCKET _socket;
    ...
};

class ConnectionManager
{
public:
    ConnectionManager() {}
    ~ConnectionManager() {
        std::list<Connection*>::iterator it;
        for (it = _connlist.begin(); it != _connlist.end(); ++it) {
            delete (*it);
        }
        _connlist.clear();
    }
    void OnClientConnected(SOCKET s) {
        Connection* p = new Connection(s);
        _connlist.push_back(p);
    }
    void OnClientDisconnected(Connection* pconn) {
        _connlist.remove(pconn);
        delete pconn;
    }
private:
    std::list<Connection*> _connlist;
};

Example Four

Suppose the server fails to call OnClientDisconnected() after a client disconnects. The Connection object representing that connection is then not deleted promptly (when the server program exits, all Connection objects are deleted in the ConnectionManager destructor). As connections are continually established and dropped, an implicit memory leak occurs.
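
The effect can be reproduced in a small, portable model. This is a sketch under stated assumptions rather than the code from Example Four: SOCKET is replaced by a plain int, the manager holds its connections through std::unique_ptr, and simulate() and live_connections() are hypothetical helpers added so the growth can be observed.

```cpp
#include <cstddef>
#include <list>
#include <memory>

// SOCKET is replaced by a plain int so the sketch builds anywhere.
struct Connection {
    explicit Connection(int s) : socket(s) {}
    int socket;
};

class ConnectionManager {
public:
    Connection* on_client_connected(int s) {
        connections_.push_back(std::unique_ptr<Connection>(new Connection(s)));
        return connections_.back().get();
    }
    void on_client_disconnected(Connection* conn) {
        connections_.remove_if(
            [conn](const std::unique_ptr<Connection>& p) { return p.get() == conn; });
    }
    std::size_t live_connections() const { return connections_.size(); }

private:
    // Everything still in the list is freed when the manager is destroyed,
    // which is exactly why the growth never shows up as a leak at exit.
    std::list<std::unique_ptr<Connection>> connections_;
};

// Runs `clients` connect/disconnect cycles and returns how many Connection
// objects the manager still holds just before it is destroyed.
std::size_t simulate(int clients, bool notify_on_disconnect) {
    ConnectionManager manager;
    for (int i = 0; i < clients; ++i) {
        Connection* c = manager.on_client_connected(i);
        if (notify_on_disconnect) {
            manager.on_client_disconnected(c);
        }
    }
    return manager.live_connections();
}
```

When the disconnect notification is skipped, the number of held objects equals the number of clients ever seen, even though an exit-time leak report would show nothing.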

From the point of view of the program's user, a memory leak in itself does no harm; an ordinary user does not notice that it exists at all. What really does harm is the accumulation of leaked memory, which eventually consumes all the memory in the system. In this sense, a one-time memory leak is harmless because it does not accumulate, whereas an implicit memory leak is very harmful because it is harder to detect than frequent and occasional leaks.

Detecting memory leaks

The key to detecting memory leaks is being able to intercept calls to the functions that allocate and free memory. By intercepting these two kinds of function, we can track the lifetime of every block of memory. For example, each time a block is successfully allocated, its pointer is added to a global list; each time a block is freed, its pointer is removed from the list. When the program ends, the pointers remaining in the list point to memory that was never freed. This is only a simple description of the basic principle; a detailed algorithm can be found in Steve Maguire's "Writing Solid Code".
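
The bookkeeping described above can be sketched in a few lines. This is a minimal illustration, not the algorithm from the book: tracked_malloc and tracked_free are hypothetical wrapper names, the "global list" is a hash map from pointer to size, and the sketch is not thread-safe.

```cpp
#include <cstddef>
#include <cstdlib>
#include <unordered_map>

// Registry of live blocks: pointer -> requested size.
static std::unordered_map<void*, std::size_t>& live_blocks() {
    static std::unordered_map<void*, std::size_t> blocks;
    return blocks;
}

// Allocates a block and records it on success.
void* tracked_malloc(std::size_t size) {
    void* p = std::malloc(size);
    if (p != nullptr) {
        live_blocks()[p] = size;
    }
    return p;
}

// Removes the block from the registry, then frees it.
void tracked_free(void* p) {
    if (p != nullptr) {
        live_blocks().erase(p);
        std::free(p);
    }
}

// Whatever is still registered at the end of the run was never freed.
std::size_t leaked_block_count() { return live_blocks().size(); }

std::size_t leaked_byte_count() {
    std::size_t total = 0;
    for (const auto& entry : live_blocks()) {
        total += entry.second;
    }
    return total;
}
```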

To detect heap memory leaks, it is enough to intercept malloc/realloc/free and new/delete (in fact, new/delete ultimately use malloc/free, so intercepting the first group is sufficient). Other kinds of leak can be handled similarly by intercepting the corresponding allocation and release functions. For example, to detect BSTR leaks you intercept SysAllocString/SysFreeString, and to detect HMENU leaks you intercept CreateMenu/DestroyMenu. (Some resources have more than one allocation function; for example, SysAllocStringLen can also allocate a BSTR, so several allocation functions must be intercepted.)

On the Windows platform, three kinds of tool are commonly used to detect memory leaks: the detection features built into the MS C-Runtime Library; external detection tools such as Purify and BoundsChecker; and the Performance Monitor that comes with Windows NT. Each has its strengths and weaknesses. The MS C-Runtime Library is weaker than the external tools, but it is free. Performance Monitor cannot point to the code that causes the problem, but it can detect the existence of implicit memory leaks, which the other two kinds of tool cannot.

Below we discuss these three kinds of detection tool in detail.

Detecting memory leaks under VC

Applications developed with MFC automatically include memory leak detection code when built in debug mode. When the program ends, if any memory has leaked, the Debug window shows information about every leaked block. The following two lines show the information for one leaked block:

E:/testmemleak/testdlg.cpp(70) : {59} normal block at 0x00881710, 200 bytes long.

Data: <abcdefghijklmnop> 61 62 63 64 65 66 67 68 69 6A 6B 6C 6D 6E 6F 70

The first line shows that the block was allocated at line 70 of TestDlg.cpp, that its address is 0x00881710, and that its size is 200 bytes. {59} is the request order number of the call to the memory allocation function; for details, see the MSDN help for _CrtSetBreakAlloc(). The second line shows the first 16 bytes of the block: the part in angle brackets is the ASCII rendering, followed by the same bytes in hexadecimal.

It is commonly assumed that these memory leak detection features are provided by MFC, but they are not. MFC merely wraps and uses the debug functions of the MS C-Runtime Library. Non-MFC programs can also use the debug functions of the MS C-Runtime Library to add leak detection. The MS C-Runtime Library has leak detection built into its implementations of functions such as malloc, free, and strdup.

Look at a project generated by the MFC Application Wizard: at the top of every CPP file there is a macro definition like this:

#ifdef _DEBUG
#define new DEBUG_NEW
#undef THIS_FILE
static char THIS_FILE[] = __FILE__;
#endif

With this definition, when the debug version is compiled, every new in the CPP file is replaced with DEBUG_NEW. So what is DEBUG_NEW? It is also a macro; the following is excerpted from line 1632 of afx.h:

#define DEBUG_NEW new(THIS_FILE, __LINE__)

So if you have one line of code:

char* p = new char[200];

After macro substitution it becomes:

char* p = new(THIS_FILE, __LINE__) char[200];

According to the C++ standard, for this form of new the compiler looks for an operator new defined like this:

void* operator new(size_t, LPCSTR, int)

We find an implementation of this operator new at line 63 of afxmem.cpp:

void* AFX_CDECL operator new(size_t nSize, LPCSTR lpszFileName, int nLine)
{
    return ::operator new(nSize, _NORMAL_BLOCK, lpszFileName, nLine);
}

void* __cdecl operator new(size_t nSize, int nType, LPCSTR lpszFileName, int nLine)
{
    ...
    pResult = _malloc_dbg(nSize, nType, lpszFileName, nLine);
    if (pResult != NULL)
        return pResult;
    ...
}

The second operator new function is fairly long, so for simplicity only part of it is excerpted. Clearly, the final allocation is performed by the _malloc_dbg function, which is one of the debug functions of the MS C-Runtime Library. This function takes not only the size of the memory to allocate but also a file name and a line number. The file name and line number record which piece of code made the allocation. If the block has not been released when the program ends, this information is written to the Debug window.

While we are here, a word about THIS_FILE, __FILE__, and __LINE__. __FILE__ and __LINE__ are both macros defined by the compiler. When the compiler encounters __FILE__, it replaces it with a string containing the path name of the file being compiled. When it encounters __LINE__, it replaces it with a number, the line number of the current line of code. DEBUG_NEW uses THIS_FILE instead of __FILE__ directly in order to reduce the size of the object file. Suppose a CPP file contains 100 uses of new. If __FILE__ were used directly, the compiler would generate 100 constant strings, all of them the path name of that CPP file, which is clearly redundant. With THIS_FILE, the compiler generates only one constant string, and all 100 new calls use a pointer to it.

Looking again at the project generated by the MFC Application Wizard, we find that only new is mapped in the CPP files. If the program calls malloc directly to allocate memory, the file name and line number of the malloc call are not recorded. If that memory leaks, the MS C-Runtime Library can still detect it, but the report for the block does not include the file name and line number where it was allocated.

Turning on memory leak detection in a non-MFC program is very easy. Just add the following lines at the program's entry point:

int tmpFlag = _CrtSetDbgFlag(_CRTDBG_REPORT_FLAG);

tmpFlag |= _CRTDBG_LEAK_CHECK_DF;

_CrtSetDbgFlag(tmpFlag);

Then, when the program ends, that is, when WinMain, main, or DllMain returns, information about any memory blocks that have not been released is printed to the Debug window.

If you try to create a non-MFC application and add the above code at the entrance to the program and deliberately do not release some memory blocks in the program, you will see the following information in the Debug window:

{...} normal block at 0x00C91C90, ... bytes long.

Data: < > ... 0A 0B 0C 0D 0E 0F

The memory leak is detected, but compared with the MFC example above, the file name and line number are missing. For a larger program, the problem becomes very hard to solve without this information.

To find out where a leaked block was allocated, you need to implement a mapping like MFC's, mapping functions such as new and malloc to _malloc_dbg. I will not go into it again here; you can refer to the MFC source code.
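
The shape of such a mapping can be shown without the MS C-Runtime Library. The sketch below is a portable illustration, not the MFC source: it scopes the file-and-line operator new to a hypothetical base class Tracked instead of replacing the global operator, and AllocSite, alloc_sites, TRACKED_NEW, and Widget are all assumed names. The class-level operator new plays the role that _malloc_dbg plays in MFC.

```cpp
#include <cstddef>
#include <cstdlib>
#include <map>
#include <new>

// Where each live allocation was made, and how large it is.
struct AllocSite {
    const char* file;
    int line;
    std::size_t size;
};

// Live allocations keyed by address; not thread-safe in this sketch.
std::map<void*, AllocSite>& alloc_sites() {
    static std::map<void*, AllocSite> sites;
    return sites;
}

// Classes that derive from Tracked must be created with TRACKED_NEW.
struct Tracked {
    // Chosen by `new (file, line) T`; records the call site.
    static void* operator new(std::size_t size, const char* file, int line) {
        void* p = std::malloc(size);
        if (p == nullptr) throw std::bad_alloc();
        AllocSite site = { file, line, size };
        alloc_sites()[p] = site;
        return p;
    }
    // Used by an ordinary `delete p`.
    static void operator delete(void* p) noexcept {
        alloc_sites().erase(p);
        std::free(p);
    }
    // Used only if a constructor throws after the placement new above.
    static void operator delete(void* p, const char*, int) noexcept {
        Tracked::operator delete(p);
    }
};

// Same trick as DEBUG_NEW: the macro supplies the call site.
#define TRACKED_NEW new (__FILE__, __LINE__)

struct Widget : Tracked {
    int value = 0;
};
```

An object created with `TRACKED_NEW Widget` and never deleted stays in alloc_sites() together with the file and line of the allocation, which is the information the plain leak report lacks.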

Because the debug functions are implemented in the MS C-Runtime Library, they can detect only heap memory leaks, and only for memory allocated by malloc, realloc, strdup, and the like. Leaks of system resources such as handles and GDI objects, or of memory not allocated through the C-Runtime Library such as VARIANT and BSTR, cannot be detected. This is a major limitation of this method. In addition, to record where a block was allocated, the source code has to be modified to cooperate, which is a nuisance when debugging old programs, since changing source code is never a carefree task. This is another limitation of this method.

The detection features provided by the MS C-Runtime Library are far from sufficient for developing a large program. Next we look at external detection tools. The one I use most is BoundsChecker, partly because its features are fairly comprehensive, and more importantly because it is stable. A tool of this kind that is not stable only adds to the trouble. It does, after all, come from the well-known NuMega, and in my use it has had essentially no major problems.

Detecting memory leaks using BoundsChecker

BoundsChecker uses a technique called code injection to intercept calls to the functions that allocate and release memory. Put simply, when your program starts running, the BoundsChecker DLL is automatically loaded into the process's address space (this can be done with a system-level hook). It then modifies the calls to the allocation and release functions in the process so that they go to its own code first and then execute the original code. To do this, BoundsChecker needs no changes to the source code or the project configuration of the program being debugged, which makes it simple and direct to use.

Here we take the malloc function as an example; other functions are intercepted in a similar way.

The functions to be intercepted may be in a DLL or in the program's own code. For example, if the C-Runtime Library is linked statically, the code of malloc is linked into the program. To intercept calls to such functions, BoundsChecker dynamically modifies their instructions.

Below are two assembly listings, the first without BoundsChecker involved and the second with it:

126: _CRTIMP void * __cdecl malloc (
127:         size_t nSize
128:         )
129: {

00403C10 push ebp
00403C11 mov ebp,esp
130:         return _nh_malloc_dbg(nSize, _newmode, _NORMAL_BLOCK, NULL, 0);
00403C13 push 0
00403C15 push 0
00403C17 push 1
00403C19 mov eax,[__newmode (0042376c)]
00403C1E push eax
00403C1F mov ecx,dword ptr [nSize]
00403C22 push ecx
00403C23 call _nh_malloc_dbg (00403c80)
00403C28 add esp,14h
131: }

The following section of code has BoundsChecker intervention:

126: _CRTIMP void * __cdecl malloc (
127:         size_t nSize
128:         )
129: {

00403C10 jmp 01F41EC8
00403C15 push 0
00403C17 push 1
00403C19 mov eax,[__newmode (0042376c)]
00403C1E push eax
00403C1F mov ecx,dword ptr [nSize]
00403C22 push ecx
00403C23 call _nh_malloc_dbg (00403c80)
00403C28 add esp,14h
131: }

With BoundsChecker involved, the first three instructions of malloc have been replaced by a jmp instruction, and the original three instructions have been moved to address 01F41EC8. When the program enters malloc, it first jumps to 01F41EC8 and executes the original three instructions; from then on BoundsChecker is in control. Broadly, it records the function's return address (the return address is on the stack, so it is easy to modify), points the return address at code belonging to BoundsChecker, and then jumps back to the original instructions of malloc, that is, to address 00403C15. When malloc finishes, because the return address has been modified, it returns into BoundsChecker's code, where BoundsChecker records the pointer to the memory malloc allocated and then jumps to the original return address.

If the allocation and release functions are in a DLL, BoundsChecker intercepts calls to them in a different way: it modifies the function addresses in the program's DLL import table so that they point to its own code.
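
The import-table approach can be modelled with an ordinary table of function pointers. This is only a conceptual sketch and does not touch a real executable's import table or reflect BoundsChecker's actual implementation: ImportTable, install_hooks, app_alloc, and the other names are all hypothetical. The point is that the calling code is unchanged; only the addresses in the table are swapped.

```cpp
#include <cstddef>
#include <cstdlib>
#include <set>

// Stand-in for a module's import table: calls go through these pointers.
struct ImportTable {
    void* (*alloc)(std::size_t);
    void (*release)(void*);
};

static void* real_alloc(std::size_t n) { return std::malloc(n); }
static void real_release(void* p) { std::free(p); }

static ImportTable g_imports = { real_alloc, real_release };
static ImportTable g_original = { real_alloc, real_release };
static std::set<void*> g_live;  // blocks allocated while hooked and not yet released

// Hooks: forward to the original entry, then record what happened.
static void* hook_alloc(std::size_t n) {
    void* p = g_original.alloc(n);
    if (p != nullptr) g_live.insert(p);
    return p;
}
static void hook_release(void* p) {
    g_live.erase(p);
    g_original.release(p);
}

// Overwrites the table entries, saving the old ones for forwarding.
void install_hooks() {
    if (g_imports.alloc == hook_alloc) return;  // already installed
    g_original = g_imports;
    g_imports.alloc = hook_alloc;
    g_imports.release = hook_release;
}
void remove_hooks() { g_imports = g_original; }
std::size_t live_count() { return g_live.size(); }

// Application code never changes: it always calls through the table.
void* app_alloc(std::size_t n) { return g_imports.alloc(n); }
void app_release(void* p) { g_imports.release(p); }
```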

By intercepting these allocation and release functions, BoundsChecker can record the lifetime of every allocated block of memory or resource. The next question is how to relate this to the source code: when BoundsChecker detects a leak, how does it report which piece of code allocated the block? The answer is debug information. When we build the debug version of a program, the compiler records the correspondence between source code and binary code and puts it in a separate file (.pdb) or links it directly into the target program. By reading the debug information, the tool can determine in which file and on which line a block of memory was allocated. Using code injection and debug information, BoundsChecker records not only the source location of the call to the allocation function but also the call stack at the time of allocation, together with the source locations of the functions on that call stack. This is very useful when class libraries such as MFC are used. Here is an example:


void ShowXItemMenu()
{
    ...
    CMenu menu;

    menu.CreatePopupMenu();
    // add menu items.
    menu.TrackPopupMenu();
    ...
}

void ShowYItemMenu()
{
    ...
    CMenu menu;
    menu.CreatePopupMenu();
    // add menu items.
    menu.TrackPopupMenu();
    menu.Detach(); // this will cause an HMENU leak
    ...
}

BOOL CMenu::CreatePopupMenu()
{
    ...
    hMenu = ::CreatePopupMenu(); // the Win32 API, not the member function
    ...
}

When ShowYItemMenu() is called, we deliberately cause an HMENU leak. As far as BoundsChecker is concerned, however, the leaked HMENU was allocated in CMenu::CreatePopupMenu(). Suppose your program uses CMenu's CreatePopupMenu() function in many places. If you were told only that the leak was caused by CMenu::CreatePopupMenu(), you still could not tell where the root of the problem lies: in ShowXItemMenu(), in ShowYItemMenu(), or in some other place that also uses CreatePopupMenu(). With call stack information the problem becomes easy. BoundsChecker reports the leaked HMENU as follows:

Function                  File                                        Line

CMenu::CreatePopupMenu    E:/8168/vc98/mfc/mfc/include/afxwin1.inl    1009

ShowYItemMenu             E:/testmemleak/mytest.cpp                   100

The other function calls are omitted here

It is now easy to see that the function with the problem is ShowYItemMenu(). When programming with class libraries such as MFC, most API calls are wrapped inside the class library, and with call stack information we can easily trace the leak back to the code that really caused it.
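
Why the call stack matters can be seen in a small model. This sketch does not read the machine stack or debug information as a real tool does; it keeps a manually maintained stack of function names through a hypothetical Frame helper, and handles are plain integers. Every name in it is an assumption made for illustration.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Manually maintained call stack of function names.
static std::vector<std::string>& call_stack() {
    static std::vector<std::string> frames;
    return frames;
}

// Pushes a name on entry to a function and pops it on exit.
struct Frame {
    explicit Frame(const char* name) { call_stack().push_back(name); }
    ~Frame() { call_stack().pop_back(); }
};

// Live handle -> call stack captured when the handle was created.
static std::map<int, std::vector<std::string>>& live_handles() {
    static std::map<int, std::vector<std::string>> handles;
    return handles;
}

static int g_next_handle = 1;

// Plays the role of CMenu::CreatePopupMenu: every menu is created here.
int create_popup_menu() {
    Frame frame("create_popup_menu");
    int handle = g_next_handle++;
    live_handles()[handle] = call_stack();  // snapshot of the callers
    return handle;
}

void destroy_menu(int handle) { live_handles().erase(handle); }

void show_x_item_menu() {
    Frame frame("show_x_item_menu");
    int handle = create_popup_menu();
    destroy_menu(handle);
}

void show_y_item_menu() {
    Frame frame("show_y_item_menu");
    create_popup_menu();  // handle intentionally never destroyed
}
```

Both callers allocate in the same function, so the allocation site alone cannot separate them; the captured stack of the surviving handle names show_y_item_menu as the caller.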

Recording call stack information makes the program run very slowly, so by default BoundsChecker does not record it. You can turn on the option that records call stack information with the following steps:

1. Open the menu: BoundsChecker | Setting...

2. On the Error Detection page, select Custom in the Error Detection Scheme list

3. Select Pointer and leak error check in the Category combo box

4. Tick the Call stack check box

5. Click OK

Building on code injection, BoundsChecker also provides API parameter validation, memory overrun detection, and other features. These are very useful for program development, but since they are outside the subject of this article, they are not described in detail here.

Powerful as BoundsChecker is, it is still helpless against implicit memory leaks. So next we look at how to use Performance Monitor to detect memory leaks.

Detecting memory leaks using Performance Monitor

System monitoring was designed into the NT kernel from the start: CPU usage, memory usage, the frequency of I/O operations, and so on are each exposed as a counter, and applications can read these counters to learn the running state of the whole system or of a single process. Performance Monitor is one such application.

To detect memory leaks, we generally monitor three counters of the Process object: Handle Count, Virtual Bytes, and Working Set. Handle Count records the number of handles the process currently has open; monitoring it helps us find out whether the program leaks handles. Virtual Bytes records the amount of virtual address space the process is currently using. NT allocates memory in two steps: first it reserves a range in the virtual address space, at which point the operating system allocates no physical memory and merely reserves a range of addresses; then the range is committed, and only then does the operating system allocate physical memory. Virtual Bytes is therefore generally larger than the program's Working Set, and monitoring Virtual Bytes can help us find some lower-level problems. Working Set records the total amount of physical memory that the operating system has assigned to the process. This value is closely related to the total amount of memory the program requests: if the program leaks memory, Working Set keeps increasing steadily, whereas Virtual Bytes increases in jumps.

Monitoring these counters lets us understand how the process uses memory. If there is a leak, even an implicit one, the values of these counters keep increasing. However, we then know that there is a problem but not where it is, so in general Performance Monitor is used to verify whether a memory leak exists, and BoundsChecker is used to find and fix it.
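
Whether a counter "keeps increasing" can be judged from periodically logged samples. The following is a rough sketch of one possible check, not a feature of Performance Monitor: it fits a least-squares line through equally spaced samples and compares the slope with a threshold. The function names, the sampling scheme, and the threshold are all assumptions chosen for illustration.

```cpp
#include <cstddef>
#include <vector>

// Least-squares slope of samples taken at equal intervals,
// expressed in counter units per sample.
double growth_per_sample(const std::vector<double>& samples) {
    const std::size_t n = samples.size();
    if (n < 2) {
        return 0.0;
    }
    const double mean_x = (n - 1) / 2.0;
    double mean_y = 0.0;
    for (double v : samples) {
        mean_y += v;
    }
    mean_y /= static_cast<double>(n);

    double num = 0.0;
    double den = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double dx = static_cast<double>(i) - mean_x;
        num += dx * (samples[i] - mean_y);
        den += dx * dx;
    }
    return num / den;
}

// Flags a series whose average growth exceeds the caller's threshold.
bool looks_like_leak(const std::vector<double>& samples, double threshold) {
    return growth_per_sample(samples) > threshold;
}
```

A steady slope over a long run is only a hint; as the text says, it shows that something is growing, not which code is responsible.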

When Performance Monitor shows a memory leak but BoundsChecker cannot detect it, there are two possibilities. First, an occasional (accidental) memory leak has occurred; in that case, make sure that the program's running environment and the way it is operated are the same under Performance Monitor as under BoundsChecker. Second, an implicit memory leak has occurred; in that case you have to re-examine the design of the program, study the graphs of the counter values recorded by Performance Monitor carefully, analyze how the changes relate to the program's logic, and look for possible causes. It is a painful process, full of assumptions, conjectures, verifications, and failures, but it is also an excellent opportunity to accumulate experience.

Summary

Memory leaks are a large and complex problem. Even environments with garbage collection, such as Java and .NET, can leak, for example through implicit memory leaks. Because of limited space and ability, this article can only scratch the surface of the subject. Other questions, such as leak detection across multiple modules and how to analyze memory usage while a program is running, are all topics that deserve deeper study. If you have any ideas or suggestions, or find any errors, you are welcome to get in touch with me.