Last time we built a memory leak detection tool. It detects leaks when the program exits and prints the function call stack of each leaked allocation, which makes it quick to locate the code that leaked. But people are lazy animals, and after using the tool a few times it started to feel uncomfortable. The main annoyances were:

1. The call stack is recorded on every memory allocation. Each allocation then triggers several more allocations to store the file name, function name, and line number, yet this information is rarely used, because leaks are comparatively rare. As a result, program performance drops sharply once the tool is enabled. For software such as databases, test efficiency suffers badly.

2. The source code is not under the test department's control. Every time we receive the source code and want to run a leak test, we have to modify it again, and the changes are tedious: initialize the tool, record the pointer and size on each allocation, remove the entry from the linked list on each release, and pick a suitable place to print the leaked memory. The first time it was a novelty. The second time it was a chore and easy to get wrong. The third time I decided to give up on the method: it was too much trouble, and the program might not even have a leak.

These reasons were enough for me to optimize the memory leak detection tool. The goal is to detect leaks through configuration, changing the program as little as possible. The first idea is to use hooks to intercept the memory allocation and release functions: after an allocation, record the pointer and size; on release, delete the record; whatever is left at the end is leaked memory. The next step is to study how to intercept function calls inside a program.
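The bookkeeping described above (record pointer and size on allocation, delete on release, report what remains) can be sketched in portable C. This is only a minimal illustration: the names `tracked_malloc`, `tracked_free`, and `leaked_bytes` are hypothetical, and the sketch calls the wrappers explicitly instead of installing a hook.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One record per live allocation: the pointer and its size. */
typedef struct AllocRecord {
    void *ptr;
    size_t size;
    struct AllocRecord *next;
} AllocRecord;

static AllocRecord *g_live = NULL;   /* list of allocations not yet freed */

/* Hypothetical wrapper: allocate a block and remember it. */
void *tracked_malloc(size_t size)
{
    void *p = malloc(size);
    if (p == NULL)
        return NULL;
    AllocRecord *rec = malloc(sizeof *rec);
    if (rec == NULL) {               /* cannot track it, so do not hand it out */
        free(p);
        return NULL;
    }
    rec->ptr = p;
    rec->size = size;
    rec->next = g_live;
    g_live = rec;
    return p;
}

/* Hypothetical wrapper: forget the block, then release it. */
void tracked_free(void *p)
{
    if (p == NULL)
        return;
    for (AllocRecord **link = &g_live; *link != NULL; link = &(*link)->next) {
        if ((*link)->ptr == p) {
            AllocRecord *dead = *link;
            *link = dead->next;      /* unlink the record */
            free(dead);
            break;
        }
    }
    free(p);
}

/* Whatever is still on the list at exit was never freed. */
size_t leaked_bytes(void)
{
    size_t total = 0;
    for (const AllocRecord *r = g_live; r != NULL; r = r->next) {
        assert(r->ptr != NULL);      /* invariant: NULL is never recorded */
        total += r->size;
    }
    return total;
}
```

In the real tool these wrappers would be reached through the hook rather than called by name, so the program under test would not need to change.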
Windows provides dedicated hook functions, but they can only intercept messages. Function calls inside a program are not delivered as messages, so this approach is essentially ruled out. The book "Programming Guru Proverbs" describes a way to hook C functions on Windows. To paraphrase its description: every program compiled on Windows has an import table made up of JMP entries. Every function call first jumps into this table, and the JMP there transfers control to the function that actually executes. So to install a hook, you only need to replace the address in the table entry with the address of the hook function. When anyone calls the original function, the JMP lands in the hook function instead.

Let's look at how a function call works on Windows by writing a simple program.

    #include <stdlib.h>
    #include <stdio.h>

    void test1() { printf("call test1\n"); return; }
    void test2() { printf("call test2\n"); return; }

    int main() { test1(); return 0; }

Set a breakpoint at the call to test1. Start debugging and look at the assembly code:

    call @ILT+10(test1) (0040100f)

The call to test1 is a call to @ILT+10. ILT stands for Incremental Link Table, which exists only in Debug builds. In other words, test1 corresponds to the JMP instruction at offset 10 in the ILT, and the call jumps to address 0040100f. Press F11 to step in and see what is there:

    0040100F jmp test1 (00401020)
    00401014 jmp test2 (00401070)

There is a JMP instruction at address 0040100f that jumps to 00401020, and the JMP instruction for test2 sits right next to it. Press F11 again to see what is at 00401020, and we finally arrive at the definition of test1. It starts with the usual stack setup for a function call, which I will not go into here because it is of no use for hooking.

Now we know the basic mechanism of a function call: a CALL, followed by a JMP. To install a hook, we only need to modify the JMP address in the ILT.
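As a portable analogy for "a CALL, followed by a JMP", the sketch below routes every call through a table of function pointers and hooks a function by overwriting its slot. This is not the real ILT, which holds machine-code JMP instructions rather than pointers. The names `Handler`, `g_table`, `call_slot`, and `hook_slot` are hypothetical, and `test1`/`test2` return a number instead of printing so the effect is easy to check.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*Handler)(void);

int test1(void) { return 1; }   /* stands for: printf("call test1\n") */
int test2(void) { return 2; }   /* stands for: printf("call test2\n") */

/* Stand-in for the ILT: callers never reach test1 directly, only slot 0. */
enum { SLOT_TEST1 = 0, SLOT_TEST2 = 1, SLOT_COUNT = 2 };
static Handler g_table[SLOT_COUNT] = { test1, test2 };

/* Every call goes through the table, like "call @ILT+n" followed by "jmp". */
int call_slot(size_t slot)
{
    assert(slot < SLOT_COUNT);
    return g_table[slot]();
}

/* Hooking means overwriting one table entry; the old target is returned
   so the caller can still reach the original function. */
Handler hook_slot(size_t slot, Handler replacement)
{
    assert(slot < SLOT_COUNT);
    Handler old = g_table[slot];
    g_table[slot] = replacement;
    return old;
}
```

After `hook_slot(SLOT_TEST1, test2)`, a call through slot 0 runs `test2` while the call site itself is unchanged, which is the property the ILT patch relies on.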
For example, to hook test1 so that it jumps to test2, you only need to change the address after test1's JMP to the address of test2. Note that this memory cannot be modified freely; otherwise an accidental write to the wrong address would wreak havoc. Windows is not that strict, though: it provides the WriteProcessMemory function, which lets you write into the EXE's memory on the understanding that you know what you are doing.

One detail cost me a few detours. In the debugger, with the breakpoint at test1(), a watch on test1 shows the value 00401020, the location of the function definition. But if the program prints test1, the value is 0040100F, the location of test1's entry in the ILT.

Let's write a simple program to test whether the idea works.

    #include <windows.h>
    #include <string.h>

    void hookfunc()
    {
        LPBYTE lpByte1;
        LPBYTE lpByte2;
        DWORD dwAddr1;

        lpByte1 = (LPBYTE)test1;
        // Get old function JMP addr (skip the one-byte opcode)
        lpByte1 = (LPBYTE)&lpByte1[1];
        lpByte2 = (LPBYTE)test2;
        // Get new function JMP addr (skip the one-byte opcode)
        lpByte2 = (LPBYTE)&lpByte2[1];
        // Copy the new function's JMP operand over the old function's
        memcpy(&dwAddr1, lpByte2, sizeof(DWORD));
        WriteProcessMemory(GetCurrentProcess(), lpByte1, &dwAddr1, sizeof(DWORD), NULL);
    }

This program first obtains the addresses of test1 and test2, that is, the addresses of their entries in the ILT. The JMP opcode occupies one byte and the JMP operand is a DWORD. So the code skips the opcode byte, points at the operand after it, copies the operand of test2's JMP, and uses WriteProcessMemory to write it over the operand of test1's JMP.

Build and run it in the debugger. Unfortunately, the program crashes. What is wrong with the idea? Let's trace it. Set a breakpoint at the call to test1 and step into the ILT. We find:

    00401005 jmp test1+4Bh (0040107B)
    0040100A jmp test2 (00401080)

test1's JMP instruction is not the same as test2's. The address it jumps to is not the test2 address we expected.
In theory, for our idea to hold, this address should be the address of test2 (00401080 in this build). Something is still wrong with the idea, so let's print the JMP operands. Printing the values after the JMP opcodes of test1 and test2 gives 0x26 and 0x71. These are definitely not absolute function addresses; the jump must use a relative address. How stupid of me, I had forgotten my assembly. Right-click in the disassembly window and turn on Code Bytes to take a look:

    00401005 E9 26 00 00 00 jmp test1 (00401030)

The machine code of this JMP instruction is E9 26 00 00 00. E9 is a near jump with a 32-bit relative displacement, so the jump target here is 00401005 + 5 (the length of this instruction) + 26 = 00401030, and 00401030 is the real address of test1. Now all the doubts are resolved. By the same logic, after our modification the jump actually goes to 00401005 + 5 + 71 = 0040107B, which is not the real address of test2.

The program therefore needs the following change: take into account the distance between the "JMP test1" and "JMP test2" instructions. The adjusted target is 00401005 + 5 + (0040100A - 00401005) + 71 = 00401080, and 00401080 is the real address of test2. So the value of dwAddr1 above must be computed with this formula to get the new displacement: 71 + (0040100A - 00401005) = 76.

Try again: when test1 is called, "call test2" is printed. The hook on test1 works!

But now there is another problem. What if I want to call the real test1? The ILT no longer contains a JMP to test1. One way is to create another, empty function and change its JMP in the ILT to "JMP test1", so that calling the empty function calls the real test1. Another way is to change "JMP test2" into "JMP test1", that is, to swap the ILT entries of test1 and test2.
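The displacement arithmetic above can be checked on plain byte arrays, without touching a live process. The sketch below decodes an `E9 rel32` instruction, re-encodes it for a new target, and swaps two entries; the helper names `jmp_target`, `jmp_retarget`, and `jmp_swap` are hypothetical, and the addresses in the usage note are the ones from the listings in this article.

```c
#include <assert.h>
#include <stdint.h>

enum { JMP_REL32_OPCODE = 0xE9, JMP_REL32_LEN = 5 };

/* Read and write a 32-bit little-endian value, independent of host order. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static void write_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)((v >> 8) & 0xFF);
    p[2] = (uint8_t)((v >> 16) & 0xFF);
    p[3] = (uint8_t)((v >> 24) & 0xFF);
}

/* Target of an "E9 rel32" jump located at address `at`:
   address of the next instruction plus the displacement. */
uint32_t jmp_target(uint32_t at, const uint8_t *insn)
{
    assert(insn[0] == JMP_REL32_OPCODE);
    return at + JMP_REL32_LEN + read_le32(insn + 1);
}

/* Rewrite the jump at `at` so that it lands on `target`. */
void jmp_retarget(uint32_t at, uint8_t *insn, uint32_t target)
{
    insn[0] = (uint8_t)JMP_REL32_OPCODE;
    write_le32(insn + 1, target - (at + JMP_REL32_LEN));
}

/* Swap the targets of two jumps, as in the "exchange test1 and test2" idea. */
void jmp_swap(uint32_t at_a, uint8_t *a, uint32_t at_b, uint8_t *b)
{
    uint32_t target_a = jmp_target(at_a, a);
    uint32_t target_b = jmp_target(at_b, b);
    jmp_retarget(at_a, a, target_b);
    jmp_retarget(at_b, b, target_a);
}
```

With the entry at 00401005 holding `E9 26 00 00 00` and the entry at 0040100A holding `E9 71 00 00 00`, retargeting the first entry to 00401080 yields the displacement 0x76 computed above. In the real program the bytes would be written with WriteProcessMemory instead of into a local array.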
Of these two methods, the second (swapping the entries) seems more appropriate. We can reasonably assume that test2 is the hook function, and we would never call test2 directly; it is only reached through the hook. So after the ILT entries of test1 and test2 are swapped, a call to test1 automatically runs test2, and an explicit call to test2 jumps to test1. A third way is to record the absolute address of test1 and call it directly from assembly code. That is a lot of trouble, and I have not tried it.

In addition, for Windows API functions no such JMP entry is generated in Debug builds; the CALL instruction goes directly through the address of the API function. "Programming Guru Proverbs" handles this case by allocating a global variable so that the call still goes through a JMP, and then modifying the JMP address.

So far we have managed to hook functions inside our own process, which removes the biggest obstacle to optimizing the memory leak tool. Now consider the next question: do we ship the source code with an initialization function, have the user add all the code to the project, and then call the initialization function to install the hooks? Compared with the first version, where all memory allocation and release code had to be modified, this is already a big improvement, but it is still not very transparent. Can it be more convenient? Can we build a dynamic link library and call the initialization function implicitly? Then the user would only need to link the .lib file when building the program.

Here we need one particular use of the #pragma directive: #pragma comment(linker, "/include:... The pragma comment directive places a comment record into an object file or executable file. Its most common form is #pragma comment(lib, "ws2_32.lib"), which tells the linker to link the ws2_32.lib library into the target.
With linker as the first argument, the directive places a linker option into the object file, and /include forces a symbol to be included. So we can create an initialization class in the DLL and declare a global object of that class, for example __declspec(dllexport) ResourceLeakDetector rld; which exports the rld object. In the header file rld.h, add the following pragma: #pragma comment(linker, "/include:__imp_?rld@@3VResourceLeakDetector@@A") to force the rld object to be included. The characters after the prefix are the C++ decorated name of the exported object, and __imp_ is the prefix of an imported symbol. To make things easier for users, we also add this line to the header file: #pragma comment(lib, "rld.lib"). Now, to use the memory leak detector, it is enough to include rld.h in the project.

The last issue is the efficiency of capturing the call stack. Because leaks are rare, capturing the call stack and resolving file names and function names on every call is a waste of resources. A reasonable approach is to capture only the function offset addresses and not resolve file names and function names at that point. At the end, when a leak is reported, the file names and function names are resolved from the recorded addresses. The memory leak detector is now much more efficient than the original version, and much less complicated to use :).

In addition, a path containing Chinese characters used to be truncated. The reason is that the DbgHelp.dll installed on many machines is too old to provide the functions that resolve wide-character paths. After downloading the latest debugging tools from the Windows website, this problem was solved. The problem that the call stack could not be obtained under Vista was solved as well.
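The "capture addresses now, resolve names only for leaks" idea can be sketched as follows. This is a simplified, portable illustration under stated assumptions: the fixed-size table, the `Resolver` callback type, and the names `record_alloc`, `record_free`, `report_leaks`, and `hex_resolver` are all hypothetical. On Windows the resolver is where the DbgHelp lookups would go; here it only formats the address as hexadecimal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

enum { MAX_FRAMES = 8, MAX_RECORDS = 64 };

/* Hypothetical resolver: turns a code address into readable text. */
typedef void (*Resolver)(uintptr_t addr, char *out, size_t out_len);

typedef struct {
    void *ptr;                      /* the allocated block */
    uintptr_t frames[MAX_FRAMES];   /* raw return addresses, unresolved */
    size_t frame_count;
    int live;                       /* 1 until the block is freed */
} StackRecord;

static StackRecord g_records[MAX_RECORDS];
static size_t g_record_count = 0;

/* Cheap path, run on every allocation: copy addresses, resolve nothing. */
int record_alloc(void *ptr, const uintptr_t *frames, size_t n)
{
    if (g_record_count == MAX_RECORDS)
        return 0;                   /* table full; this sketch just gives up */
    StackRecord *r = &g_records[g_record_count++];
    r->ptr = ptr;
    r->frame_count = n < MAX_FRAMES ? n : MAX_FRAMES;
    for (size_t i = 0; i < r->frame_count; ++i)
        r->frames[i] = frames[i];
    r->live = 1;
    return 1;
}

/* Run on every release: mark the matching record as no longer live. */
void record_free(void *ptr)
{
    for (size_t i = 0; i < g_record_count; ++i) {
        if (g_records[i].live && g_records[i].ptr == ptr) {
            g_records[i].live = 0;
            return;
        }
    }
}

/* Placeholder resolver: prints the address instead of a symbol name. */
void hex_resolver(uintptr_t addr, char *out, size_t out_len)
{
    snprintf(out, out_len, "0x%llx", (unsigned long long)addr);
}

/* Expensive path, run once at exit: resolve only the frames of leaked
   blocks. Returns the number of frames that had to be resolved. */
size_t report_leaks(Resolver resolve, FILE *out)
{
    size_t resolved = 0;
    char text[128];
    assert(resolve != NULL && out != NULL);
    for (size_t i = 0; i < g_record_count; ++i) {
        const StackRecord *r = &g_records[i];
        if (!r->live)
            continue;
        fprintf(out, "leaked block %p allocated from:\n", r->ptr);
        for (size_t f = 0; f < r->frame_count; ++f) {
            resolve(r->frames[f], text, sizeof text);
            fprintf(out, "  %s\n", text);
            ++resolved;
        }
    }
    return resolved;
}
```

The point of the split is that the per-allocation cost is a handful of integer copies, while the slow symbol lookup runs only for the few blocks that actually leaked.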
This article is from the "more test more happy" blog. Please be sure to keep this source link: http://happytest.blog.51cto.com/324097/62791