Simple Memory Leak Tracking

Source: Internet
Author: User
Preface

Memory leaks are an annoying yet ever-present topic in C++ programming. I recently introduced a GC and wanted to verify that it really works and leaves no memory unreleased, so I first wrote a tracer for memory allocation.

A tracer is different from an allocator. An allocator mainly solves two problems:

1. Performance. Pool-based allocation is often much faster than allocating directly from the system with VirtualAlloc. This is said to no longer hold after Vista because Microsoft changed the implementation of VirtualAlloc, but that is hearsay and I have not tested it.

2. Fragmentation. An allocator avoids the situation where many small scattered allocations break up contiguous free memory, so that a later allocation fails because no contiguous block is large enough.

Our tracer mainly solves one problem: recording when memory is allocated, when it is freed, and what is still allocated when the program exits.

 

Solution 1: debug_new

Many predecessors have written about this topic, and nothing here goes beyond their solutions. I simply learned and understood them while building this module.

There are two common solutions to this problem. One is the DEBUG_NEW approach of MFC, which the Max SDK also uses.

The principle is very simple. If we can obtain the file name and line number of the current statement, then at each new we ask the tracer to record the allocated address together with that file name and line number, and at each delete we remove the record for that address.

How do we implement it?

This comes down to two problems. First, we need to take over new so that we can record our own information.

operator new in C++ can be overloaded with additional parameters, for example:

void* operator new(size_t InSize, const char* InLogMsg)
{
    std::cout << InLogMsg << std::endl;
    return ::malloc(InSize);
}

To call this overload, the new expression has to be written as follows:

int* p = new ("I am allocating memory") int;

Note that new and operator new are not the same thing. Also, once operator new is provided in a custom form, an operator delete of the matching form must be provided as well; otherwise the compiler issues a level 1 warning.
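
The need for a matching operator delete can be seen in a small sketch. It is not code from the original module: the tag type TraceTag, the counter g_placementDeleteCalls, and the class Fragile are hypothetical names introduced only for illustration. When the constructor throws, the compiler calls the operator delete whose extra parameters match those of the operator new that was used.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <stdexcept>

// Hypothetical tag type that selects our custom allocation form.
struct TraceTag {};

// Counts how often the matching operator delete ran (illustration only).
static int g_placementDeleteCalls = 0;

// Custom form of operator new: it only allocates, it constructs nothing.
void* operator new(std::size_t size, TraceTag) {
    void* p = std::malloc(size ? size : 1);
    if (!p) throw std::bad_alloc();
    return p;
}

// Matching form of operator delete: invoked automatically when a
// constructor throws inside `new (TraceTag()) T`.
void operator delete(void* p, TraceTag) noexcept {
    ++g_placementDeleteCalls;
    std::free(p);
}

// A class whose constructor always fails.
struct Fragile {
    Fragile() { throw std::runtime_error("constructor failed"); }
};

// Returns true if the constructor threw and the storage was reclaimed
// through the matching operator delete.
bool constructFragile() {
    int before = g_placementDeleteCalls;
    try {
        Fragile* f = new (TraceTag()) Fragile;
        (void)f;
        return false;  // not reached: the constructor always throws
    } catch (const std::runtime_error&) {
        return g_placementDeleteCalls == before + 1;
    }
}
```

Without the second function the program would still compile, but the block allocated for Fragile would never be released after the exception.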

If you are interested in this point, see Supplement 1 at the end of this part. It is not central to the topic, so we set it aside for now.

The second problem: how do we know the file and line of the current statement?

C++ provides the predefined macros __FILE__ and __LINE__ for the file name and line number of the current statement:

std::cout << "The current file is: " << __FILE__ << ". The current line number is: " << __LINE__ << std::endl;

With the preparations done, let's begin!

 

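First, we need a tracer that records the file name and line number information:

class TracerFileLn
{
public:
    // Returns the single global tracer instance.
    static TracerFileLn& singleton();

private:
    struct _AllocInfo
    {
        const char* filename_;
        size_t fileLn_;
    };
    // qword is assumed to be an unsigned integer typedef wide enough to hold a pointer.
    typedef std::hash_map<qword, _AllocInfo> alloc_hash;
    alloc_hash allocInfoHash_;

public:
    // Records where the block at InPtr was allocated.
    void traceMalloc(void* InPtr, const char* InFilename, size_t InFileLn)
    {
        _AllocInfo sAllocInfo = { InFilename, InFileLn };
        allocInfoHash_.insert(alloc_hash::value_type((qword)InPtr, sAllocInfo));
    }

    // Removes the record for InPtr; addresses that were never recorded are ignored.
    void traceFree(void* InPtr)
    {
        auto it = allocInfoHash_.find((qword)InPtr);
        if (it != allocInfoHash_.end())
            allocInfoHash_.erase(it);
    }
};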

With that, we can provide the following overload of operator new:

void* operator new(size_t InSize, const char* InFilename, size_t InFileLn)

It is implemented as follows:

void* operator new(size_t InSize, const char* InFilename, size_t InFileLn)
{
    void* pPtr = ::malloc(InSize);
    TracerFileLn::singleton().traceMalloc(pPtr, InFilename, InFileLn);
    return pPtr;
}

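Then operator delete needs to be implemented as follows:

// Ordinary form: called by every delete expression.
void operator delete(void* InPtr)
{
    TracerFileLn::singleton().traceFree(InPtr);
    ::free(InPtr);  // release the block obtained from ::malloc
}

// Matching form: called only if a constructor throws during new (file, line).
void operator delete(void* InPtr, const char* InFilename, size_t InFileLn)
{
    TracerFileLn::singleton().traceFree(InPtr);
    ::free(InPtr);
}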

Remember that new[] and delete[] must be implemented in the same way.

From now on we no longer write the native new expression of C++; we must use the new form instead:

int* pPtr = new (__FILE__, __LINE__) int;

 

Is it too much trouble to rewrite every place where new is used? That is not hard to solve with macros, just like MFC's DEBUG_NEW:

#define debug_new new (__FILE__, __LINE__)

#define new debug_new

After that, every new is replaced with new (__FILE__, __LINE__), so we can keep writing:

int* pPtr = new int;

The only requirement is that the macro definition #define new debug_new appears before this statement.

 

With the information traced, after the program ends we only need to check which _AllocInfo entries remain in the hash. Dumping them one by one reveals the corresponding memory leaks.
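
The whole flow, from recording at new to dumping what remains, can be sketched in one file. This is a minimal illustration rather than the module's actual code. It assumes a fixed-size table instead of a hash map so that the tracer itself never allocates, and it replaces the global operator new only to keep every new paired with malloc and every delete paired with free. AllocRecord, g_records, kMaxRecords, and dumpLeaks are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <new>

// One record per live traced allocation; ptr == nullptr marks a free slot.
struct AllocRecord { void* ptr; const char* file; std::size_t line; };
const std::size_t kMaxRecords = 1024;
static AllocRecord g_records[kMaxRecords];  // zero-initialized static storage

void traceMalloc(void* p, const char* file, std::size_t line) {
    for (std::size_t i = 0; i < kMaxRecords; ++i) {
        if (g_records[i].ptr == nullptr) {
            g_records[i] = AllocRecord{p, file, line};
            return;
        }
    }
    // Table full: this allocation simply goes untracked.
}

void traceFree(void* p) {
    if (!p) return;
    for (std::size_t i = 0; i < kMaxRecords; ++i) {
        if (g_records[i].ptr == p) { g_records[i].ptr = nullptr; return; }
    }
}

// Prints every block that is still recorded and returns how many there are.
std::size_t dumpLeaks() {
    std::size_t count = 0;
    for (std::size_t i = 0; i < kMaxRecords; ++i) {
        if (g_records[i].ptr != nullptr) {
            std::printf("leak: %p allocated at %s:%zu\n",
                        g_records[i].ptr, g_records[i].file, g_records[i].line);
            ++count;
        }
    }
    return count;
}

// Traced forms, selected by new (__FILE__, __LINE__) T.
void* operator new(std::size_t size, const char* file, std::size_t line) {
    void* p = std::malloc(size ? size : 1);
    if (!p) throw std::bad_alloc();
    traceMalloc(p, file, line);
    return p;
}
void* operator new[](std::size_t size, const char* file, std::size_t line) {
    return operator new(size, file, line);
}
// Matching forms, used only when a constructor throws.
void operator delete(void* p, const char*, std::size_t) noexcept { traceFree(p); std::free(p); }
void operator delete[](void* p, const char*, std::size_t) noexcept { traceFree(p); std::free(p); }

// Global replacements: untraced new, but every delete removes its record.
void* operator new(std::size_t size) {
    void* p = std::malloc(size ? size : 1);
    if (!p) throw std::bad_alloc();
    return p;
}
void operator delete(void* p) noexcept { traceFree(p); std::free(p); }

#define debug_new new (__FILE__, __LINE__)
```

Calling dumpLeaks() at exit lists each block that was allocated through debug_new and never deleted, together with the file and line of the allocating statement.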

It looks very convenient, right? In fact it is not so convenient, as we will discuss later.

 

 

Supplement 1: Why must a matching operator delete be provided after a custom operator new?

According to the C++ standard, constructing an object may throw an exception. If it does, the memory of the object under construction must be reclaimed. If the object was created with a custom form of new, then when the constructor throws, C++ reclaims the memory by calling the operator delete of the matching custom form, so that operator delete must be provided; otherwise the memory is not reclaimed.

But if everything goes well, which delete is called when you write delete yourself? The answer is the standard one: void operator delete(void* InPtr).

Also, are new and operator new really not the same thing? Yes. new is a C++ keyword. What does a new expression do?

1. It calls operator new in the corresponding form to allocate memory.

2. It constructs the object in that memory by calling the constructor, which is what placement new does.

In other words, new has two steps. The first is operator new, and operator new only allocates memory and does nothing else.

Code that should not be traced

The previous part introduced the principle and implementation of debug_new.

In TracerFileLn above, we used a hash_map to provide the trace function.

There is a potential trap here, and we need to eliminate it before moving on to the next part.

 

What if we replace the original operator new of C++ and add tracing to it?

void* operator new(size_t InSize)
{
    void* pPtr = ::malloc(InSize);
    TracerFileLn::singleton().traceMalloc(pPtr, "<null>", (size_t)-1);
    return pPtr;
}

 

Stack overflow!

Why?

Once the native ::operator new of C++ is replaced, every call to ::operator new goes to the user-provided version. As a result:

The insert of hash_map also allocates memory through ::operator new. Remember the default implementation of allocator in C++? allocator::allocate calls ::operator new!

That ::operator new has been replaced and calls traceMalloc, so:

new -> operator new -> traceMalloc -> insert -> operator new -> traceMalloc -> insert -> ...

 

Therefore, to solve this problem, you must implement an allocator that no longer calls ::operator new.

I will not go into the details of the implementation; see the STL book translated by Hou Jie, which explains it clearly.

In short, change the calls to operator new and operator delete into malloc and free.
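
A minimal sketch of such an allocator follows, written against the C++11 allocator requirements. MallocAllocator, TraceMap, and the counter g_newCalls are hypothetical names, and std::unordered_map stands in here for the hash_map used earlier. The replaced global operator new only counts its calls, to show that the container never reaches it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <new>
#include <unordered_map>
#include <utility>

// Stand-in for the hooked global operator new: it only counts its calls.
static std::size_t g_newCalls = 0;
void* operator new(std::size_t size) {
    ++g_newCalls;
    void* p = std::malloc(size ? size : 1);
    if (!p) throw std::bad_alloc();
    return p;
}
void operator delete(void* p) noexcept { std::free(p); }

// Allocator that goes straight to malloc/free and never touches operator new.
template <typename T>
struct MallocAllocator {
    typedef T value_type;

    MallocAllocator() noexcept {}
    template <typename U>
    MallocAllocator(const MallocAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        if (n > static_cast<std::size_t>(-1) / sizeof(T)) throw std::bad_alloc();
        void* p = std::malloc(n * sizeof(T));
        if (!p) throw std::bad_alloc();
        return static_cast<T*>(p);
    }
    void deallocate(T* p, std::size_t) noexcept { std::free(p); }
};

// All instances are interchangeable because the allocator holds no state.
template <typename T, typename U>
bool operator==(const MallocAllocator<T>&, const MallocAllocator<U>&) noexcept { return true; }
template <typename T, typename U>
bool operator!=(const MallocAllocator<T>&, const MallocAllocator<U>&) noexcept { return false; }

// Address -> line number map whose nodes and buckets come from malloc.
typedef std::unordered_map<
    std::size_t, int,
    std::hash<std::size_t>, std::equal_to<std::size_t>,
    MallocAllocator<std::pair<const std::size_t, int> > > TraceMap;
```

Because every node and bucket array of TraceMap is obtained through MallocAllocator, inserting into it from inside a hooked operator new cannot re-enter that hook.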

 

Similarly, if you do not want to trace STL containers, give them your own allocator.

And what if I do not want to trace a class I wrote myself?

That is also simple. If a C++ class implements operator new itself, a new expression for that class calls the class's own operator new in preference to the global one. So derive the class from the following class, or implement these methods in it directly:

class UseSystemMallocForNew
{
public:
    void* operator new(size_t Size) { return ::malloc(Size); }
    void operator delete(void* Ptr) { ::free(Ptr); }
    void* operator new[](size_t Size) { return ::malloc(Size); }
    void operator delete[](void* Ptr) { ::free(Ptr); }
};

  

Solution 2: dbghelp

The dbghelp solution is more complicated and slower, but for tracing purposes that is acceptable.

The principle: dbghelp.lib and dbghelp.h provide functions that take the current call stack state (ESP, EBP) and, together with the PDB file of the corresponding module, resolve the calling module (DLL), function, source line, and instruction.

I will not go into the details. There are many documents on using dbghelp on the Internet; here are a few references:

HOWTO: Dump call stack

Implementation of a debugger (5): debugging symbols

Using dbghelp to obtain the function call stack: the inline assembly method

 

Now the implementation principle:

We still intercept new, but we do not need a special form of new; we simply hijack the global one.

On each call to new, use the dbghelp functions to obtain the current call stack and drop the frames between new and the tracer, that is, the top few frames. How many to drop depends on the implementation. In mine the capture happens three levels below new, so I drop three frames; adjust this to your own code.

Create a hash whose key is still the allocated memory address. The value can be optimized. A call stack is large, so storing a full call stack for every allocation would make the tracer use too much memory. However, although new may be called hundreds of thousands or millions of times in a program, the number of distinct call stacks that lead to new is limited, perhaps a few thousand or tens of thousands. So once a call stack is obtained, we cache it in a vector and record its CRC value in another map. The value in the hash then only needs to store the ID of the call stack.

After obtaining a call stack, compute its CRC value and look it up in the map. If a matching call stack already exists, reuse it; otherwise create a new entry. This keeps memory usage down.
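
The caching step can be sketched independently of dbghelp. In this illustration a call stack is just a vector of return addresses. CallStackCache, Frames, and hashFrames are hypothetical names, and FNV-1a is swapped in for the CRC mentioned above purely to keep the example short. In a real tracer these containers would also need an allocator that bypasses the hooked operator new.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

// Return addresses of one call stack, innermost frame first.
typedef std::vector<std::uintptr_t> Frames;

// FNV-1a over the bytes of every frame address (stand-in for a CRC).
std::uint64_t hashFrames(const Frames& frames) {
    std::uint64_t h = 0xcbf29ce484222325ull;
    for (std::size_t k = 0; k < frames.size(); ++k) {
        for (unsigned i = 0; i < sizeof(std::uintptr_t); ++i) {
            h ^= (frames[k] >> (i * 8)) & 0xffu;
            h *= 0x100000001b3ull;
        }
    }
    return h;
}

class CallStackCache {
public:
    // Returns the ID of an identical cached stack, or stores a new one.
    std::size_t intern(const Frames& frames) {
        const std::uint64_t h = hashFrames(frames);
        auto range = idsByHash_.equal_range(h);
        for (auto it = range.first; it != range.second; ++it) {
            // Compare the frames too, so a hash collision cannot merge stacks.
            if (stacks_[it->second] == frames) return it->second;
        }
        stacks_.push_back(frames);
        const std::size_t id = stacks_.size() - 1;
        idsByHash_.insert(std::make_pair(h, id));
        return id;
    }

    const Frames& get(std::size_t id) const { return stacks_[id]; }
    std::size_t size() const { return stacks_.size(); }

private:
    std::vector<Frames> stacks_;                                     // ID -> frames
    std::unordered_multimap<std::uint64_t, std::size_t> idsByHash_;  // hash -> ID
};
```

The per-allocation record then stores only the small ID returned by intern, while each distinct call stack is stored once.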

The rest is the same as in the previous tracer. On delete, the record for the address is removed from the tracer table. When the program finishes, check whether the hash is empty and dump whatever remains.

The dump also requires dbghelp, to turn the stored addresses back into readable symbols.
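
Capturing and dumping a stack might look roughly like the sketch below. It is an assumption-laden illustration, not the implementation from the reference code: it uses the Windows function CaptureStackBackTrace for the capture instead of the inline assembly or stack walking methods described in the references, and the names CallStack, captureStack, dumpStack, and kMaxFrames are hypothetical. Outside Windows the functions compile to stubs that return an empty stack. The dbghelp functions are not thread safe, so a real tracer would have to serialize these calls.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstring>
#ifdef _WIN32
#include <windows.h>
#include <dbghelp.h>
#pragma comment(lib, "dbghelp.lib")
#endif

const unsigned kMaxFrames = 32;

struct CallStack {
    void* frames[kMaxFrames];
    unsigned count;
};

// Captures the return addresses of the current thread and drops the
// innermost `skip` frames (the tracer's own frames).
CallStack captureStack(unsigned skip) {
    CallStack cs;
    std::memset(&cs, 0, sizeof(cs));
#ifdef _WIN32
    // The extra 1 also drops captureStack itself.
    cs.count = CaptureStackBackTrace(skip + 1, kMaxFrames, cs.frames, NULL);
#else
    (void)skip;  // no dbghelp outside Windows: the stack stays empty
#endif
    return cs;
}

// Prints "function (file:line)" for each frame; returns the frames printed.
unsigned dumpStack(const CallStack& cs) {
    unsigned printed = 0;
#ifdef _WIN32
    HANDLE process = GetCurrentProcess();
    static bool symbolsReady = false;
    if (!symbolsReady) {
        SymSetOptions(SYMOPT_LOAD_LINES | SYMOPT_UNDNAME);
        symbolsReady = SymInitialize(process, NULL, TRUE) != FALSE;
    }
    if (!symbolsReady) return 0;
    for (unsigned i = 0; i < cs.count; ++i) {
        DWORD64 address =
            static_cast<DWORD64>(reinterpret_cast<std::uintptr_t>(cs.frames[i]));
        // SYMBOL_INFO ends in a variable-length name, so reserve room for it.
        alignas(SYMBOL_INFO) char buffer[sizeof(SYMBOL_INFO) + MAX_SYM_NAME];
        std::memset(buffer, 0, sizeof(buffer));
        SYMBOL_INFO* symbol = reinterpret_cast<SYMBOL_INFO*>(buffer);
        symbol->SizeOfStruct = sizeof(SYMBOL_INFO);
        symbol->MaxNameLen = MAX_SYM_NAME;
        DWORD64 symbolOffset = 0;
        const char* name = SymFromAddr(process, address, &symbolOffset, symbol)
                               ? symbol->Name : "<unknown>";
        IMAGEHLP_LINE64 line;
        std::memset(&line, 0, sizeof(line));
        line.SizeOfStruct = sizeof(line);
        DWORD lineOffset = 0;
        if (SymGetLineFromAddr64(process, address, &lineOffset, &line))
            std::printf("  %s (%s:%lu)\n", name, line.FileName, line.LineNumber);
        else
            std::printf("  %s\n", name);
        ++printed;
    }
#else
    (void)cs;
#endif
    return printed;
}
```

Resolving symbols only at dump time keeps the per-allocation cost down to the capture itself; function names and line numbers appear only when the PDB files of the modules can be found.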

 

Because this implementation hijacks the original void* operator new(size_t InSize) of C++, all hash and vector operations must, as described in the previous part, use our own allocator so that they bypass the tracer. Any other use of operator new or new inside the implementation of traceMalloc must be eliminated as well.

I do not understand dbghelp thoroughly, so I will not comment further on that part here and may supplement it later.

 

Reference code, organization, and discussion of several issues

Reference code

I have extracted the relevant code and uploaded it to my CSDN resource page. Please download it from the following link:

http://download.csdn.net/detail/noslopforever/4568056

Tracer variants

With slight changes the tracer can record more information. For example, instead of a hash you can record directly into a list and never remove entries on free, so the list only grows, and each entry can hold the allocation time, destruction time, allocation size, thread, and so on. In this way the memory of the entire application can be monitored. U3 uses a dbghelp trace to record all allocations of the application over a period of time, or even over its entire life cycle. This provides more information about memory allocation: we can tell when allocation calls are too concentrated, when deallocation calls are too concentrated, and whether allocation and destruction proceed smoothly. However, every piece of information added makes the tracer slower.

Advantages and disadvantages of file line tracer and dbghelp tracer

First, performance. The information a file line tracer needs is produced at compile time; at run time there is no overhead beyond passing the arguments on the stack and updating the hash. The dbghelp information is gathered at run time, so its overhead is naturally much higher than that of the file line tracer.

Second, the file line tracer requires #define new, which brings some minor troubles that are discussed in the next section. The dbghelp tracer does not need it.

Third, the file line tracer only knows the file and line number of the statement that allocated the memory, whereas dbghelp gives the whole call stack of the allocation, which helps locate the faulty allocation more quickly.

 

The #define new of the file line tracer and the problems it causes

The biggest problem with #define new is that you must ensure the #define new debug_new macro is defined before every use of new.

When header files include each other in a tangled way, this is quite painful. If there is a precompiled header, you can put this line at the top of it. Without a precompiled header, you have to maintain the correctness yourself.

Otherwise, suppose an object is created with new in a .h file and deleted in a .cpp file (or the other way round), and #define new takes effect only after that .h file. Misjudgments then come easily: an allocation that is in fact deleted gets reported as a leak, or a delete arrives for an object whose new was never recorded.

The order of header files really is a perennial pain in C++...

 

Tracer optimization

Cookie optimization

Whichever tracer is used, it adds some overhead.

If the memory allocator is also your own, there is a convenient option: reserve a small cookie in front of each block at allocation time and record the required information in it. When you need the information, you only step back a few bytes from the pointer to reach the cookie. This has the best performance but introduces two problems:

1. The whole memory allocation stack has to be your own. Mature third-party allocators such as dlmalloc and tlmalloc cannot be used, which is why my example code uses an independent tracer.

2. You must verify that the memory being accessed was actually allocated by this allocator. As soon as new and delete are not paired, this problem surfaces. An imperfect solution that works in most cases is to record a magic number in the cookie and check it before accessing the cookie; after all, arbitrary data in memory is very unlikely to match the magic number. Still, the problem is hard to rule out entirely, and it becomes more prominent when what you provide is not a complete solution but a small module that other people include at the source level rather than the binary level.
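
A minimal sketch of the cookie layout, assuming malloc as the underlying allocator. Cookie, kCookieMagic, cookieAlloc, cookieOf, and cookieFree are hypothetical names, and the magic value is arbitrary. As noted above, the magic number check is a heuristic: stepping back from a pointer that did not come from cookieAlloc reads memory the caller must be allowed to read.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

const std::uint32_t kCookieMagic = 0xC00C1E5Au;  // arbitrary marker value

// Header stored directly in front of every block handed to the user.
// The alignment keeps the user block suitably aligned for any type.
struct alignas(std::max_align_t) Cookie {
    std::uint32_t magic;
    const char* file;
    std::size_t line;
    std::size_t size;
};

// Allocates size bytes plus a cookie and returns the user pointer.
void* cookieAlloc(std::size_t size, const char* file, std::size_t line) {
    if (size > static_cast<std::size_t>(-1) - sizeof(Cookie)) return nullptr;
    void* raw = std::malloc(sizeof(Cookie) + size);
    if (!raw) return nullptr;
    Cookie* c = static_cast<Cookie*>(raw);
    c->magic = kCookieMagic;
    c->file = file;
    c->line = line;
    c->size = size;
    return c + 1;  // the user block starts right after the cookie
}

// Steps back from the user pointer to its cookie.
// Returns nullptr if the magic number does not match.
const Cookie* cookieOf(const void* userPtr) {
    const Cookie* c = static_cast<const Cookie*>(userPtr) - 1;
    return c->magic == kCookieMagic ? c : nullptr;
}

// Frees a block from cookieAlloc; returns false for blocks that are not ours.
bool cookieFree(void* userPtr) {
    if (!userPtr) return true;
    Cookie* c = static_cast<Cookie*>(userPtr) - 1;
    if (c->magic != kCookieMagic) return false;
    c->magic = 0;  // makes most double frees detectable
    std::free(c);
    return true;
}
```

No lookup table is needed here: the information travels with the block, at the cost of sizeof(Cookie) extra bytes per allocation.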

 

Multi-process optimization

To avoid the hash overhead, another method is to use a separate thread to send the trace information to another process for handling. Because the messages can be queued and processed while the program is idle or exiting, the overhead on the running program is reduced. The transport can be writing to a file, TCP to another machine, or shared memory, depending on your preference and test results.

Memory usage optimization

If we change the tracer as described in the variants section so that it intercepts everything and never discards records, the next problem is that memory usage keeps rising as allocations accumulate.

Here, too, the tracer can be changed so that this process does not handle the records itself but sends messages over a TCP connection to another application, which collects and processes them. For dbghelp, however, the message must contain the full text of the call stack; if the other application receives only a stack ID, resolving it there is far more troublesome. Fortunately, the sending can run on a separate thread, and the full text of each call stack needs to be resolved and cached only once. The number of distinct call stacks is far smaller than the number of allocations, right?

 

Cross-module use of the tracer

Once the tracer is used across modules, it becomes a headache.

If you maintain all the modules yourself, you can make sure the tracer is used consistently in every module without problems.

However, if you are releasing a relatively small, single-purpose library, you should be very careful with the tracer.

First, users may not want the tracer enabled in your module, so you need to provide a dedicated memory leak debug build.

Second, if the tracer is used, memory allocated by a traced new must also be released by a traced delete. This requires the rule "memory allocated by a module is freed by that module", but it is easier said than done. You have been burned by this before, right?

Third, if users do not want the tracer, it has to be removed from the include files of the release version, and the corresponding uses in .h files must be guarded by macros. Everyone should do this anyway, so it hardly needs saying.

Finally, if others want to use your tracer, believe me, this is only the beginning of a nightmare... You never know how users will use the libraries and interfaces you provide... So the only thing you can do is leave them no choice.

Cross-module support is a troublesome feature. If your project is fully under your control, I recommend not going too far down this road. After all, what fits your needs is best, don't you think?
