A previous article introduced a simple way of using the Linux tool Valgrind to find memory problems in Linux programs. This article implements a memory leak detection tool of its own, covering both the underlying principles and the implementation details.
Both articles come from the IBM developerWorks community (reading the originals is recommended). The address of this article is: https://www.ibm.com/developerworks/cn/linux/l-mleak/
Overview
This paper discusses methods for detecting memory leaks in C++ programs under Linux and their implementation. It covers the basic principles of new and delete in C++, the principles and techniques behind the implementation of the memory detection subsystem, and some advanced topics in memory leak detection. As part of the subsystem implementation, a mutex class with more convenient usage semantics is also provided.
1. Development background
When programming with VC under Windows, we usually run the program in debug mode; when the program exits, the debugger prints out information about memory that was allocated on the heap during the run, including the source file name, line number, and block size. This is a built-in mechanism of the MFC framework, encapsulated within its class hierarchy.
Under Linux or UNIX, our C++ programs have no such facility for examining memory; we can only watch the process's total dynamic memory with the top command, and when the program exits we learn nothing about any memory leaks. To better support development on Linux, we designed and implemented a memory detection subsystem in our class library project. The following sections describe the basic principles of new and delete in C++, and then discuss the implementation principles of the memory detection subsystem, the techniques used in the implementation, and some advanced topics in memory leak detection.
2. The principles of new and delete
When we write new and delete in a program, we are actually invoking the new operator and delete operator built into the C++ language. "Built into the language" means that we cannot change their meaning; their behavior is always the same. The new operator always allocates enough memory and then invokes the constructor of the appropriate type to initialize that memory; the delete operator always calls the type's destructor first and then frees the memory (Figure 1). What we can influence is how memory is allocated and freed during the execution of the new operator and the delete operator.
The function that the new operator calls to allocate memory is named operator new, and its usual form is void* operator new(size_t size);. Its return type is void* because it returns a raw pointer to uninitialized memory. The size parameter determines how much memory is allocated. You can overload operator new with extra parameters, but the first parameter must remain of type size_t.
The function that the delete operator calls to release memory is named operator delete, and its usual form is void operator delete(void* memoryToBeDeallocated);. It releases the memory region that the argument points to.
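As an illustration (this sketch is not part of the article's subsystem; the file-and-line form is the same one used later in the text), the following program shows how a plain new decomposes into operator new plus a constructor call, and how an operator new overloaded with extra parameters is selected with the new (args) X syntax:

#include <cstddef>
#include <cstdio>
#include <new>

struct X {
    X()  { std::puts("X constructed"); }
    ~X() { std::puts("X destroyed"); }
};

void* operator new(std::size_t size, const char* file, int line) {
    std::printf("operator new: %lu bytes requested at %s:%d\n",
                (unsigned long)size, file, line);
    return ::operator new(size);   // delegate the actual allocation to the default version
}

// The matching placement form of operator delete; the compiler calls it only if
// X's constructor throws during "new (file, line) X".
void operator delete(void* p, const char*, int) noexcept {
    ::operator delete(p);
}

int main() {
    X* p = new (__FILE__, __LINE__) X;   // operator new(size, file, line), then X::X()
    delete p;                            // X::~X(), then the default ::operator delete(p)
    return 0;
}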
One question arises here: when we call the new operator to allocate memory there is a size parameter indicating how much memory is needed, but when the delete operator is called there is no such parameter, so how does the delete operator know the size of the memory block the pointer refers to? The answer is that for built-in data types the language itself can determine the size of the block, while for user-defined types (such as our own classes) operator new and operator delete need to pass this information to each other.
When we use operator new to allocate memory for an object of a user-defined type, the memory we actually obtain is larger than the object itself: besides holding the object's data, it must also record the size of the memory block. This record is called a cookie, and how it is implemented depends on the compiler. For example, MFC chooses to store the object's actual data at the head of the allocated block, with a boundary flag and the size information stored after it; g++ uses the first 4 bytes of the allocated block for its bookkeeping information, with the object's actual data stored behind it. When we later release the block with the delete operator, operator delete can use that information to correctly free the memory block the pointer refers to.
The above describes memory allocation and deallocation for a single object. When allocating or freeing an array of objects we still use the new operator and delete operator, but the internal behavior is different: the new operator calls the array version of operator new, operator new[], and then calls the constructor once for each array element; the delete operator calls the destructor for each array element and then calls operator delete[] to free the memory. Note that when we create or release an array of a user-defined type, the compiler again uses its compiler-specific cookie technique so that operator delete[] can identify the size of the memory block to be freed.
In summary, if we want to detect memory leaks we must record and analyze every memory allocation and release in the program. In other words, we need to overload the four global functions operator new, operator new[], operator delete, and operator delete[] to intercept the memory operation information we want to examine.
3. Basic implementation principle of memory detection
As mentioned above, to detect memory leaks we must record every memory allocation and release in the program. The way to do this is to overload all forms of operator new and operator delete and intercept the information generated while the new operator and delete operator execute. The overloaded forms are listed below:
void* operator new(size_t nSize, char* pszFileName, int nLineNum);
void* operator new[](size_t nSize, char* pszFileName, int nLineNum);
void operator delete(void* ptr);
void operator delete[](void* ptr);
For operator new we define a new version that, in addition to the required size_t nSize parameter, takes a file name and line number: the file name and line number of the place where the new operator was invoked. This information is printed when a memory leak is found, to help the user locate the exact position of the leak. For operator delete, since we cannot define a version with extra parameters for it, we simply override both versions of the global operator delete.
In the overloaded operator new we first call the corresponding version of the global operator new, passing along the size_t argument, and then record the pointer value returned by the global operator new together with the file name and line number of this allocation. The data structure used is an STL map keyed by the pointer value. When operator delete is invoked, if the call is well formed (the mismatched case is discussed later), we can find the corresponding entry in the map using the incoming pointer value, delete it, and then call free to release the memory block the pointer refers to. When the program exits, whatever entries remain in the map are exactly the memory leak information we are after: allocations made on the heap that were never released.
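As a minimal, self-contained sketch of this interception (the record structure and helper below are assumptions for illustration, not the article's actual code; the global-object wrapper and locking are added in the next step):

#include <cstddef>
#include <cstdlib>
#include <map>
#include <string>

struct AllocRecord {
    std::string file;   // file name of the new call
    int         line;   // line number of the new call
    std::size_t size;   // number of bytes requested
};

static std::map<void*, AllocRecord>& records() {
    static std::map<void*, AllocRecord> m;   // keyed by the pointer returned to the caller
    return m;
}

void* operator new(std::size_t nSize, const char* pszFileName, int nLineNum) {
    void* p = std::malloc(nSize);            // error handling omitted in this sketch
    AllocRecord r = { pszFileName ? pszFileName : "", nLineNum, nSize };
    records()[p] = r;
    return p;
}

void* operator new[](std::size_t nSize, const char* pszFileName, int nLineNum) {
    return operator new(nSize, pszFileName, nLineNum);   // same bookkeeping for arrays
}

void operator delete(void* ptr) noexcept {
    // Note: allocations that bypassed our new(file, line) also land here; this
    // simple sketch frees them with free(), which assumes a malloc-based default
    // operator new, as on glibc.
    records().erase(ptr);
    std::free(ptr);
}

void operator delete[](void* ptr) noexcept {
    operator delete(ptr);
}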
These are the basic principles of memory detection. Two basic issues remain unresolved:
1) How to obtain the file name and line number of the code that allocates memory, and have the new operator pass them to our overloaded operator new.
2) How to manage the map that stores the memory records: when to create it, and when to print the memory leak information.
Let us resolve issue 1 first. We can take advantage of the C preprocessor macros __FILE__ and __LINE__, which expand at compile time to the file name and line number of the location where they appear. We then need to replace the default global new operator with our customized version that passes in the file name and line number. In the subsystem header file MemRecord.h we define:
#define DEBUG_NEW new(__FILE__, __LINE__)
Then, at the top of every .cpp file of the client program that should use memory detection, add:
#include "MemRecord.h" #define NEW debug_new
This replaces every call to the default global new operator in the client source file with a call of the form new(__FILE__, __LINE__), which in turn invokes our operator new(size_t nSize, char* pszFileName, int nLineNum): nSize is computed and passed in by the new operator, while the file name and line number of the call site are supplied by our customized form of new. We recommend adding the macros above to all of the user's own source files; if some files use the memory detection subsystem and others do not, the subsystem cannot monitor the whole program and may emit spurious leak warnings.
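For example, an instrumented client file looks like the following (the class name Foo is purely illustrative); after preprocessing, the plain new below becomes new(__FILE__, __LINE__):

#include "MemRecord.h"
#define new DEBUG_NEW

class Foo {
public:
    int x;
};

void f() {
    Foo* p = new Foo;   // expands to: Foo* p = new(__FILE__, __LINE__) Foo;
    delete p;           // intercepted by our replacement of the global operator delete
}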
Now for the second issue. The map we use to manage this bookkeeping must be created before the client's first call to new operator or delete operator, and the leak information must be printed after the last call to new operator or delete operator. In other words, it must come into existence before the client code runs and do its analysis after the client code exits. There is indeed something whose lifetime spans the whole client program: a global object, which we will call appMemory. We can design a class that encapsulates the map together with insert and erase operations on it, and then construct a global object appMemory of that class: the data structure is created and initialized in the global object's constructor, and the remaining entries are analyzed and reported in its destructor. operator new calls the insert interface of this global object to record the pointer value, file name, line number, memory block size, and so on into the map, keyed by the pointer value; operator delete calls the erase interface to remove the entry for that pointer value from the map. Do not forget that access to the map must be synchronized with a mutex, because multiple threads may perform heap operations at the same time.
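A sketch of such a global bookkeeping object (the class and member names below follow the description loosely and are not the article's actual code; a plain pthread mutex is used here, whereas the article later introduces its own CCommonMutex):

#include <pthread.h>
#include <cstddef>
#include <cstdio>
#include <map>
#include <string>

class AppMemory {
public:
    AppMemory() { pthread_mutex_init(&m_lock, NULL); }

    ~AppMemory() {   // runs after the client code: whatever is still recorded has leaked
        for (std::map<void*, Entry>::iterator it = m_records.begin();
             it != m_records.end(); ++it)
            std::fprintf(stderr, "LEAK: %lu bytes at %p, allocated at %s:%d\n",
                         (unsigned long)it->second.size, it->first,
                         it->second.file.c_str(), it->second.line);
        pthread_mutex_destroy(&m_lock);
    }

    void insert(void* p, std::size_t size, const char* file, int line) {
        pthread_mutex_lock(&m_lock);
        Entry e = { size, file ? file : "", line };
        m_records[p] = e;
        pthread_mutex_unlock(&m_lock);
    }

    void erase(void* p) {
        pthread_mutex_lock(&m_lock);
        m_records.erase(p);
        pthread_mutex_unlock(&m_lock);
    }

private:
    struct Entry { std::size_t size; std::string file; int line; };
    std::map<void*, Entry> m_records;
    pthread_mutex_t        m_lock;
};

AppMemory appMemory;   // global: constructed before and destroyed after the client's heap activity

The overloaded operator new would then call appMemory.insert(...) and the overloaded operator delete would call appMemory.erase(...).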
With that, the basic memory detection features are in place. But do not forget that to detect memory leaks we have added a layer of indirection on top of the global operator new, plus a mutex to guarantee safe access to the data structure, and this reduces the program's runtime efficiency. We therefore need to let users enable and disable the memory detection feature conveniently; after all, memory leak detection belongs in the debugging and testing phase. We can use conditional compilation, with the following macro definitions in the user's instrumented files:
#include "MemRecord.h" #if defined (mem_debug) #define NEW DEBUG_NEW#ENDIF
When the user wants memory detection, the instrumented files can be compiled with the following command:
g++ -c -DMEM_DEBUG xxxxxx.cpp
This enables the memory detection feature. When the program is officially released, simply drop the -DMEM_DEBUG compile switch to disable memory detection and eliminate its efficiency impact.
Figure 2 shows a piece of leaking code being run with memory detection enabled, along with the detection results.
Figure 2
4. Problems caused by mismatched forms of delete
Now that we've built a subsystem with basic memory leak detection, let's look at some slightly more advanced topics on memory leaks.
First, in C++ programs we sometimes need to create a single object on the heap and sometimes an array of objects. From the principles of new and delete described above we know that for a single object and for an array of objects the memory allocation and deletion actions are quite different, so we should always use matching forms of new and delete. In some situations, however, it is easy to get this wrong, as in the following code:
class Test {};
Test* pAry = new Test[10];   // creates an array of 10 Test objects
Test* pObj = new Test;       // creates a single object
...
delete [] pObj;   // should have used the single-object form "delete pObj", but the array form was used by mistake
delete pAry;      // should have used the array form "delete [] pAry", but the single-object form was used by mistake
What problems do mismatched new and delete cause? The C++ standard's answer is "undefined behavior": nobody can guarantee what will happen, but one thing is certain: it is rarely anything good. With code produced by some compilers the program may crash; with code produced by others the program may run without visible problems but leak memory.
Now that we know the problems caused by mismatched forms of new and delete, we want to expose them relentlessly; after all, we have already overloaded all forms of the memory operations: operator new, operator new[], operator delete, and operator delete[].
The first idea is this: when the user calls a particular form of operator new (single object or array) to allocate memory, record in the data structure associated with that pointer how it was allocated. When the user later invokes a form of operator delete, find the entry for that pointer in the map and compare the allocation form with the release form. If they match, delete the entry from the map normally; if they do not, move the entry to a so-called "errorDelete" list and print it along with the memory leak information when the program finally exits.
This approach sounds the most logical, but in practice it does not work well, for two reasons. The first we mentioned above: when the forms of new and delete do not match, the result is undefined. If we are unlucky and the program crashes while executing the mismatched delete, the data stored in our global object appMemory is lost and nothing gets printed. The second reason is compiler related. As described earlier, when handling new and delete for user-defined types and arrays of user-defined types, the compiler typically uses its compiler-specific cookie technique. One possible implementation of this cookie technique is the following: the new operator computes the amount of memory required to hold all the objects, adds the space it needs to record the cookie, and passes the total to operator new for allocation. When operator new returns the requested block, the new operator records the cookie information, calls the constructor the appropriate number of times to initialize the valid data, and then returns a pointer to the valid data to the user. In other words, the pointer our overloaded operator new allocated and recorded is not necessarily the same as the pointer that the new operator returns to the caller (Figure 3). When the caller later passes the pointer returned by the new operator to the delete operator, and the call forms match, the delete operator reverses the process: it calls the destructor the appropriate number of times, derives the address of the entire block (including the cookie) from the pointer to the valid data, and passes that address to operator delete to free the memory. If the call forms do not match, the delete operator does not perform these steps and passes the pointer to the valid data (rather than the pointer to the entire block) directly to operator delete. Because what we recorded in operator new was the pointer to the entire allocated block, and what operator delete now receives is not, we cannot find the corresponding allocation record in the data kept by the global object appMemory.
Figure 3
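The effect of the cookie can be observed directly. The following small program is a sketch only: the exact cookie size and layout are implementation specific (the 4-byte figure quoted above applies to older 32-bit g++). It replaces the global operator new[] and operator delete[] and prints the sizes and pointers involved; with a type that has a non-trivial destructor, the size requested from operator new[] is usually larger than 4 * sizeof(Widget), and the pointer handed back to the caller is typically offset from the one operator new[] returned:

#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <new>

struct Widget {
    int data;
    Widget() : data(0) {}
    ~Widget() {}   // a non-trivial destructor typically forces the compiler to add a cookie
};

void* operator new[](std::size_t size) {
    void* p = std::malloc(size);   // error handling omitted in this sketch
    std::printf("operator new[] asked for %lu bytes, returning %p\n",
                (unsigned long)size, p);
    return p;
}

void operator delete[](void* p) noexcept {
    std::printf("operator delete[] received %p\n", p);
    std::free(p);
}

int main() {
    Widget* w = new Widget[4];   // the size passed to operator new[] is usually 4*sizeof(Widget) + cookie
    std::printf("new[] returned %p to the caller\n", (void*)w);   // typically offset past the cookie
    delete [] w;                 // the pointer operator delete[] receives is the original, un-offset one
    return 0;
}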
In summary, when the call forms of new and delete do not match, either the program may crash or the memory subsystem cannot find the corresponding allocation record, so only a few "lucky" mismatches end up printed as "errorDelete" entries when the program exits. Still, we must do something; we cannot let such a harmful error slip past us. Since we cannot settle the score after the fact, we will output a warning in real time to remind the user. When should the warning be raised? Very simply: whenever operator delete or operator delete[] is called and we cannot find, in the map of the global object appMemory, any allocation record matching the incoming pointer value, we should alert the user.
Having decided to output a warning, the next question is how to describe it so that the user can easily locate the mismatched deletion. The answer: print the file name and line number of the delete call in the warning. This is a little tricky, because for operator delete we cannot create an overloaded version with extra parameters the way we did for operator new; we can only redefine its implementation while preserving its interface, so our operator delete receives nothing but the pointer value. And when the new/delete call forms do not match, we very likely cannot find the original allocation record in the map of the global object appMemory either. What can we do? As a last resort, we use global variables. In the implementation file of the detection subsystem we define two global variables (DELETE_FILE, DELETE_LINE) that record the file name and line number at the moment delete is called, and, to synchronize concurrent delete operations' access to these two variables, a mutex (why it is a CCommonMutex rather than a pthread_mutex_t is discussed in detail in the "Implementation problems" section; here it simply acts as a mutex):
char DELETE_FILE[FILENAME_LENGTH] = {0};
int DELETE_LINE = 0;
CCommonMutex globalLock;
The following DEBUG_DELETE is defined in the header file of our detection subsystem:
extern char DELETE_FILE[FILENAME_LENGTH];
extern int DELETE_LINE;
extern CCommonMutex globalLock;   // explained later

#define DEBUG_DELETE globalLock.Lock(); \
        if (DELETE_LINE != 0) BuildStack();   /* explained in Section 6 */ \
        strncpy(DELETE_FILE, __FILE__, FILENAME_LENGTH - 1); \
        DELETE_FILE[FILENAME_LENGTH - 1] = '\0'; \
        DELETE_LINE = __LINE__; \
        delete
And we add to the original macro definitions in the user's instrumented files:
#include "MemRecord.h" #if defined (mem_debug) #define NEW Debug_new#define Delete Debug_delete#endif
In this way, before an instrumented file invokes the delete operator it first acquires the mutex, then assigns the call site's file name and line number to the global variables (DELETE_FILE, DELETE_LINE), and then invokes the delete operator. When the delete operator eventually calls our operator delete, after retrieving the file name and line number of the call we reinitialize the two global variables and release the mutex, letting the next delete operator blocked on the mutex proceed.
With the delete operator modified this way, when we find that the pointer passed to operator delete has no corresponding allocation record, we print a warning that includes the file name and line number of the delete call.
Nothing is perfect. Since we now provide a way to detect mismatched deletions, we need to consider the following exceptional cases:
1. The user calls a third-party library that performs its own memory allocation and deallocation, or some of the user's instrumented implementation files do their allocation and deallocation without using our macro definitions. Because we have replaced the global operator delete, the user's delete calls in these cases are also intercepted by us. The code did not use our DEBUG_NEW macro, so we cannot find a corresponding allocation record in the data structure of the global object appMemory; but since it also did not use DEBUG_DELETE, the two globals DELETE_FILE and DELETE_LINE hold no values, and therefore no warning is printed.
2. One of the user's implementation files calls new to allocate memory but does not use our DEBUG_NEW macro, while the code in another implementation file is responsible for deleting that memory and, unfortunately, does use the DEBUG_DELETE macro. In this case the memory detection subsystem reports a warning and prints the file name and line number of the delete call.
3. The opposite of case 2: one of the user's implementation files calls new to allocate memory and uses our DEBUG_NEW macro, while another implementation file deletes that memory without using the DEBUG_DELETE macro. In this case no warning is printed, because we are able to find the original record of the allocation.
4. In the case of nested deletes (defined in the "Implementation problems" section below), cases 1 and 3 above may cause incorrect warnings to be printed; a detailed analysis is given in that section.
You may feel such warnings are too casual and potentially misleading. As a detection subsystem, the principle we adopt is that we would rather report a false positive than let a real error slip through: if a warning points at a real mistake, fix it; if not, treat it as a reminder to stay careful.
5. Detection of dynamic memory leak information
The memory leak detection described above can print, when the program ends, information about memory that was allocated on the heap during the run and never freed, allowing programmers to find and fix "explicit" memory leaks. But if the program releases all the memory it allocated before it ends, does that mean it has no memory leaks? No! In our programming practice we found two other, more dangerous kinds of "implicit" memory leak: no leak shows up when the program exits, yet while the program runs its memory consumption keeps growing until the entire system collapses.
1. One thread of the program keeps allocating memory and saving the pointers in a data store (such as a list), but no thread frees that memory while the program runs; only when the program exits are the blocks referenced from the data store released one by one. (A minimal sketch of this pattern follows the list.)
2. N threads of the program allocate memory and pass the pointers to a data store, while M threads take items from the data store, process them, and free the memory. Because N is much larger than M, or because the M threads take too long to process each item, memory is allocated much faster than it is released. Still, when the program exits, the blocks referenced from the data store are released one by one.
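A made-up minimal example of pattern 1 (names and sizes here are purely illustrative): a producer thread keeps pushing heap blocks into a shared list, nothing consumes them while the program runs, and everything is released only at shutdown, so an exit-time leak report stays clean even though resident memory grows the whole time:

#include <pthread.h>
#include <unistd.h>
#include <cstddef>
#include <list>

static std::list<char*> g_store;
static pthread_mutex_t  g_storeLock = PTHREAD_MUTEX_INITIALIZER;

static void* producer(void*) {
    for (;;) {
        char* block = new char[4096];     // allocated on every iteration
        pthread_mutex_lock(&g_storeLock);
        g_store.push_back(block);         // stored, but never consumed while running
        pthread_mutex_unlock(&g_storeLock);
        usleep(1000);                     // also serves as a cancellation point
    }
    return NULL;
}

int main() {
    pthread_t tid;
    pthread_create(&tid, NULL, producer, NULL);
    sleep(10);                            // ... the longer this runs, the more memory is held ...
    pthread_cancel(tid);                  // stop the producer before cleaning up
    pthread_join(tid, NULL);
    for (std::list<char*>::iterator it = g_store.begin(); it != g_store.end(); ++it)
        delete [] *it;                    // only now is everything released: no leak at exit
    return 0;
}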
These cases are more dangerous because the problem is hard to find: the program may run for tens of hours without trouble and therefore pass even strict system tests. But run 7x24 in the field, the system collapses at some unpredictable moment, and the cause of the crash is hard to trace from the logs or from the program's outward behavior.
To drag this problem into the daylight, we added a dynamic detection module, MemSnapShot, which periodically reports the program's current memory usage and memory allocations while it is running, so that users can monitor the program's dynamic memory allocation.
When a user runs the MemSnapShot process to monitor a running process, the memory subsystem of the monitored process transmits its memory allocation and deallocation information to MemSnapShot in real time. At fixed intervals, MemSnapShot aggregates the information it has received, computing the process's total memory usage and, using the file name and line number of each new call as the index, the total amount of memory currently allocated from each allocation site. If, over several consecutive intervals, the total allocated from some file and line keeps growing and never reaches equilibrium or falls back, it must be one of the two problems described above.
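The statistics step can be pictured like this (a sketch with assumed types, not MemSnapShot's actual code): live allocations are kept in a map keyed by pointer, and at each interval they are re-grouped by the "file:line" of the new call, so the total held by each allocation site and its share of the process total can be printed:

#include <cstddef>
#include <cstdio>
#include <map>
#include <string>

struct AllocInfo {
    std::string site;   // "file:line" of the new call
    std::size_t size;   // bytes still held by this allocation
};

typedef std::map<void*, AllocInfo> LiveMap;   // keyed by the allocated pointer

void report(const LiveMap& live) {
    std::map<std::string, std::size_t> perSite;
    std::size_t total = 0;
    for (LiveMap::const_iterator it = live.begin(); it != live.end(); ++it) {
        perSite[it->second.site] += it->second.size;
        total += it->second.size;
    }
    for (std::map<std::string, std::size_t>::const_iterator it = perSite.begin();
         it != perSite.end(); ++it)
        std::printf("%-40s %10lu bytes (%5.1f%% of %lu)\n",
                    it->first.c_str(), (unsigned long)it->second,
                    total ? 100.0 * it->second / total : 0.0,
                    (unsigned long)total);
}

An allocation site whose total keeps climbing interval after interval is a candidate for one of the implicit leaks above.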
In the implementation, the constructor of the memory detection subsystem's global object appMemory creates a message queue whose key is derived from the current PID, and writes a record to the queue each time operator new or operator delete is called. When the MemSnapShot process starts, the user enters the PID of the process to be monitored; MemSnapShot assembles the same key, locates the message queue created by that process, and begins reading and analyzing the data in it. When it receives an operator new record it stores the allocation information; when it receives an operator delete record it removes the corresponding allocation information. At the same time, a statistics thread runs and, at fixed intervals, computes the currently allocated memory grouped by allocation site (the same file name and line number), along with each site's percentage of the process total.
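A rough sketch of the instrumented side (the record layout and key scheme here are assumptions, not the article's actual MemRecord protocol): the constructor creates a SysV message queue keyed by the PID, and the overloaded operators send one record per allocation or deallocation; MemSnapShot opens the same queue with msgget and reads records with msgrcv.

#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <unistd.h>
#include <cstddef>
#include <cstdio>
#include <cstring>

struct MemMsg {
    long mtype;                 // 1 = allocation, 2 = deallocation
    struct {
        void*       addr;       // pointer value handed to the caller
        std::size_t size;       // bytes requested (0 for frees)
        char        file[64];   // __FILE__ of the new call
        int         line;       // __LINE__ of the new call
    } body;
};

static int open_queue() {
    key_t key = (key_t)getpid();             // simplified: real code may use ftok() or add an offset
    return msgget(key, IPC_CREAT | 0600);
}

static void report_alloc(int qid, void* p, std::size_t n, const char* file, int line) {
    MemMsg m;
    m.mtype = 1;
    m.body.addr = p;
    m.body.size = n;
    std::strncpy(m.body.file, file ? file : "", sizeof(m.body.file) - 1);
    m.body.file[sizeof(m.body.file) - 1] = '\0';
    m.body.line = line;
    if (msgsnd(qid, &m, sizeof(m.body), IPC_NOWAIT) == -1)   // never block the instrumented program
        std::perror("msgsnd");
}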
Figure 4 shows MemSnapShot monitoring the dynamic memory allocation of a running process:
Figure 4
The only tricky part of implementing MemSnapShot is handling abnormal termination of the monitored process. The message queue that the memory detection subsystem creates in the monitored process for transferring data between processes is a kernel resource: its lifetime is that of the kernel, and once created it is never freed unless it is explicitly deleted or the system is rebooted.
Of course, we can delete the message queue in the destructor of the global object appMemory in the memory detection subsystem, but if the monitored process exits abnormally (Ctrl+C, a crash on a segmentation fault, and so on), the message queue is left orphaned. Could we then use the signal system call in the global object's constructor to register handlers for SIGINT, SIGSEGV, and other signals, and delete the message queue in the handler? Still no, because the monitored process may well register its own handlers for those signals, which would replace ours. In the end, the approach we took was to use fork to create an orphan process that monitors the state of the instrumented process and attempts to delete the message queue the instrumented process created once that process has exited (whether gracefully or unexpectedly). The principle, briefly, is as follows:
In the constructor of the global object appMemory, once the message queue has been created successfully, we call fork to create a child process; that child calls fork again to create a grandchild process and then exits, making the grandchild an "orphan" process (we use an orphan because we need to sever the signal relationship between the instrumented process and the process we created). The grandchild uses the copy of the global object appMemory inherited from its parent (the instrumented process) to obtain the instrumented process's PID and the identifier of the newly created message queue, and passes them to a new program image, memcleaner, loaded by calling one of the exec functions.
The memcleaner program simply calls kill(pid, 0) to probe the state of the instrumented process. If that process no longer exists (having exited normally or abnormally), kill returns a non-zero value, and at that point we attempt to remove the message queue that may have been left behind.
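A rough sketch of this cleanup mechanism (the program and function names here are illustrative; the real memcleaner is a separate program started with exec):

#include <sys/types.h>
#include <sys/wait.h>
#include <sys/msg.h>
#include <unistd.h>
#include <signal.h>
#include <cerrno>
#include <cstdio>
#include <cstdlib>

// Called from the instrumented process once its message queue exists.
void spawn_cleaner(pid_t watched_pid, int msg_queue_id) {
    pid_t child = fork();
    if (child == 0) {                        // first child
        pid_t grandchild = fork();
        if (grandchild == 0) {               // grandchild: becomes an orphan once its parent exits
            char pid_arg[32], qid_arg[32];
            std::snprintf(pid_arg, sizeof(pid_arg), "%d", (int)watched_pid);
            std::snprintf(qid_arg, sizeof(qid_arg), "%d", msg_queue_id);
            execlp("memcleaner", "memcleaner", pid_arg, qid_arg, (char*)NULL);
            _exit(1);                        // exec failed
        }
        _exit(0);                            // first child exits immediately
    }
    waitpid(child, NULL, 0);                 // reap the first child; the grandchild is re-parented to init
}

// Roughly what the memcleaner program does with the two arguments it receives.
int cleaner_loop(pid_t watched_pid, int msg_queue_id) {
    for (;;) {
        if (kill(watched_pid, 0) != 0 && errno == ESRCH) {   // process is gone (normal or abnormal exit)
            msgctl(msg_queue_id, IPC_RMID, NULL);            // remove the leftover message queue
            return 0;
        }
        sleep(1);                                            // poll once per second
    }
}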
6. Implementation problems: nested deletes
In the "Problem with error mode removal" section, we performed a minor operation on delete operator--added two global variables (delete_file,delete_line) to record the file name and line number of the delete operation. And in order to synchronize access to global variables (delete_file,delete_line), a global mutex is added. At the outset, we used pthread_mutex_t, but in testing we found the limitations of pthread_mutex_t in this application environment.
For example, the following code:
class B { ... };

class A {
public:
    A() { m_pb = NULL; }
    A(B* pb) { m_pb = pb; }
    ~A() {
        if (m_pb != NULL)
            delete m_pb;    // line number 1: this statement is the deadly one
    }
private:
    class B* m_pb;
};

int main() {
    A* pA = new A(new B);
    ...
    delete pA;              // line number 2
}
In the code above, the delete pA in the main function is what we call a "nested delete": when we delete the A object, A's destructor in turn performs delete on a B object. When the user uses our memory detection subsystem, the delete pA action should translate into the following sequence of actions:
acquire the global lock, and assign the file name and line number 2 to the globals (DELETE_FILE, DELETE_LINE)
delete operator for A is invoked
    ~A() is invoked
        acquire the global lock, and assign the file name and line number 1 to the globals (DELETE_FILE, DELETE_LINE)
        delete operator for B is invoked
            ~B() is invoked; ~B() returns
            operator delete for B is invoked: it records the globals' value (line number 1), clears the globals, and releases the global lock
            operator delete for B returns
        delete operator for B returns
    ~A() returns
    operator delete for A is invoked: it records the globals' value, which no longer describes the delete pA call (it was overwritten and then cleared by the nested delete), clears the globals, and releases the global lock
    operator delete for A returns
delete operator for A returns
Two technical problems arise in this process: one is mutex reentrancy, and the other is preserving the values of the global variables (DELETE_FILE, DELETE_LINE) across a nested delete.
The so-called mutex reentrancy problem is that, within the same thread context, lock may be called on the same mutex several times in succession, followed by the same number of unlock calls. In other words, our application needs a mutex with the following properties:
1. The same thread context may acquire the same mutex multiple times, and gives up ownership of the mutex only after calling unlock the same number of times in that thread context.
2. When different thread contexts attempt to acquire the mutex, only one thread at a time may hold it; another thread can acquire it only after the holder releases it.
A pthread_mutex_t (with its default attributes) does not have these properties: a second call to pthread_mutex_lock blocks even in the same thread context. So we must implement our own mutex. Here we use a semaphore to implement a mutex, CCommonMutex, with the properties described above (see the attachment for the source code).
To support property 2, the CCommonMutex class encapsulates a semaphore whose resource value is initialized to 1 in the constructor. When CCommonMutex::Lock() is called, sem_wait acquires the semaphore, bringing its value to 0, so that other threads calling Lock() are suspended. When CCommonMutex::Unlock() is called, sem_post restores the semaphore's value to 1, allowing one of the suspended threads to acquire it.
To support property 1, CCommonMutex also checks the current thread's PID and keeps an access count. When a thread calls Lock() for the first time, we call sem_wait, record the current PID in the member variable m_pid, and set the access count to 1; subsequent calls from the same thread (m_pid == getpid()) only increment the count and are not suspended. When Unlock() is called, if the count is greater than 1 it is only decremented; once the count drops back to 1 we clear the PID and call sem_post. (See the attachment for the specific code.)
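A simplified sketch following that description (member names are illustrative, and the unsynchronized owner check mirrors the article's scheme rather than a production-grade recursive mutex; note that identifying threads by getpid() only works under the old LinuxThreads model, where every thread had its own PID):

#include <semaphore.h>
#include <unistd.h>

class CCommonMutex {
public:
    CCommonMutex()  { sem_init(&m_sem, 0, 1); m_pid = 0; m_count = 0; }
    ~CCommonMutex() { sem_destroy(&m_sem); }

    void Lock() {
        pid_t self = getpid();
        if (m_count > 0 && m_pid == self) {   // already held by this caller: just count
            ++m_count;
            return;
        }
        sem_wait(&m_sem);                     // first acquisition: block other callers
        m_pid = self;
        m_count = 1;
    }

    void Unlock() {
        if (m_count > 1) {                    // nested unlock: only decrement the count
            --m_count;
            return;
        }
        m_pid = 0;                            // last unlock: release the semaphore
        m_count = 0;
        sem_post(&m_sem);
    }

private:
    sem_t m_sem;
    pid_t m_pid;
    int   m_count;
};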
The problem of preserving the global variables (DELETE_FILE, DELETE_LINE) across a nested delete is this: in the sequence above, when delete m_pb is executed in A's destructor, the assignment of its file name and line number to the globals overwrites the values that were assigned when the main program called delete pA, so by the time operator delete runs for A, the information for delete pA has been lost.
To preserve this global information, a stack is the natural choice, and here we use the stack container provided by the STL. In the DEBUG_DELETE macro, before assigning to the globals (DELETE_FILE, DELETE_LINE), we first check whether they have already been assigned, by testing whether the line number variable is non-zero. If it is non-zero, we push the existing information onto a stack (by calling a global function BuildStack() that pushes the current global file name and line number onto a global stack, globalStack), then assign the new values to the globals, and only then invoke the delete operator. In the erase interface provided by the memory subsystem's global object appMemory, if the incoming file name and line number are found to be zero, the data we need has probably been overwritten by a nested delete, so the corresponding data must be popped off the stack and used instead.
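A sketch of the field-protection pieces under these assumptions (FILENAME_LENGTH and the exact pop logic are illustrative; DELETE_FILE and DELETE_LINE are the globals defined earlier in the subsystem's implementation file):

#include <stack>
#include <string>
#include <utility>

const int FILENAME_LENGTH = 256;              // assumed; matches the globals used by DEBUG_DELETE
extern char DELETE_FILE[FILENAME_LENGTH];
extern int  DELETE_LINE;

static std::stack<std::pair<std::string, int> > globalStack;

// Called by DEBUG_DELETE when DELETE_LINE is already non-zero: save the outer
// delete's call site before the nested delete overwrites the globals.
void BuildStack() {
    globalStack.push(std::make_pair(std::string(DELETE_FILE), DELETE_LINE));
}

// Inside AppMemory::erase(ptr, file, line), roughly:
//     if (line == 0 && !globalStack.empty()) {
//         file = globalStack.top().first.c_str();   // restore the outer delete's site
//         line = globalStack.top().second;
//         globalStack.pop();
//     }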
The nested delete problem is now basically solved. But when a nested delete coincides with exception cases 1 or 3 described in the "Problems caused by mismatched forms of delete" section, incorrect warnings may still appear, because the user's delete call did not go through our DEBUG_DELETE macro. The root cause is that we use the stack to preserve the delete information recorded through DEBUG_DELETE for later use in operator delete and in the erase interface of the global object appMemory; a delete that did not go through DEBUG_DELETE performs no push and calls operator delete directly, so it may pop delete information that does not belong to it and destroy the order and validity of the information on the stack. Then, when we cannot find the allocation record for this and subsequent delete operations, we may print incorrect warnings.