Supplements and improvements to the article "Identify the source code error line through the crash address only"

Source: Internet
Author: User

http://www.vckbase.com/document/viewdoc?id=1473



Author: ROC, Shanghai Weigong Communications


After reading Lao Luo's article "Identify the source code error line through the crash address only" (hereinafter "Luo's article"), I felt that there is a lot to learn from it. However, it contains a few inaccurate statements, and some of its operations are too cumbersome. After studying it, and on the basis of many experiments, I have supplemented and improved part of its content. I hope this helps you debug programs, release builds in particular. Criticism and corrections are welcome.

I. Applicability of this method
Programs on Windows can crash for many reasons, and the method described here applies only to crashes that a single statement causes immediately. The division by zero in Luo's article is one example. In my own work I more often meet another case: a pointer holds an invalid address, and the program then reads or writes through it. For example:

void Crash1()
{
    char * p =(char*)100;
    *p=100;
}

For this kind of crash, the method can pinpoint the statement inside the function that causes it, in both debug and release builds; the details are described below. Another common cause of crashes in practice is writing past the end of a local array in a function, which overwrites the function's return address so that the program crashes when the function returns. For example:

#include <string.h>

void Crash2();

int main(int argc,char* argv[])
{
    Crash2();
    return 0;
}

void Crash2()
{
    char p[1];
    strcpy(p,"0123456789");
}

Compile the release version of this program in VC and run it; the following error dialog appears.


Figure 1 running result of the preceding example

The crash address is 0x34333231. In a crash like this, the faulty statement only shows its effect later in the program's execution, so the method described in this article is of no help. Even so, this example leaves some clues to the cause. The local array p in function Crash2 has only one byte, so copying the string "0123456789" writes the excess characters past the end of p, that is, *(p+1)='1', *(p+2)='2', *(p+3)='3', *(p+4)='4', and so on. The ASCII code of '1' is 0x31, of '2' is 0x32, of '3' is 0x33, of '4' is 0x34, and so on. Because Intel CPUs store the low byte of an int at the low address, the memory holding the characters "1234", displayed as a 4-byte int, reads 0x34333231. Evidently, when the string "0123456789" was copied, the characters "1234" overwrote the return address of function Crash2, which crashed the program. If you have other ways of troubleshooting crashes of this kind, you are welcome to discuss them.
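The byte-order argument can be checked with a small sketch. It is an illustration only: the helper names are hypothetical, and the value is assembled by hand so that the result does not depend on the byte order of the machine that runs it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: interpret four consecutive bytes the way a
// little-endian CPU (such as x86) reads a 4-byte integer, that is,
// the byte at the lowest address is the least significant one.
std::uint32_t bytes_as_le_u32(const char* bytes) {
    assert(bytes != nullptr);
    std::uint32_t value = 0;
    for (int i = 3; i >= 0; --i) {
        value = (value << 8) | static_cast<unsigned char>(bytes[i]);
    }
    return value;
}

// According to the article, the characters at p+1 .. p+4 of the copied
// string "0123456789" are the ones that land on the return address.
std::uint32_t overwritten_return_address() {
    const char copied[] = "0123456789";
    return bytes_as_le_u32(copied + 1);  // bytes '1', '2', '3', '4'
}
```

Calling overwritten_return_address() reproduces the value shown in the error dialog, which is a quick way to recognise ASCII text in a crash address.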

II. Settings for generating a map file
Luo's article generates the map file by adding build parameters manually. In fact, the VC6 IDE has a configuration option for generating map files. Click "Project" > "Settings...", select the "Link" tab in the dialog that pops up, make sure "General" is selected under "Category", and then check the "Generate mapfile" option. To include line-number information in the map file, also add /MAPINFO:LINES to "Project Options". Line-number information is essential to the method Luo's article uses to locate the faulty source line. However, I will introduce a better way to locate the faulty line below, one that does not need line-number information.


Figure 2 create a map file

III. Methods for locating the crash statement
The way Luo's article finds the function containing the crash is correct: among the start addresses of the functions listed in the map file, the nearest one that does not exceed the crash address belongs to the function that contains the crashing statement. Its method for then locating the faulty line, however, is not the most appropriate, because it presupposes that the base address is 0x00400000 and that the code section of the PE file starts at offset 0x1000. This is the common case, but VC lets you set the base address to another value, for example 0x00500000. In that case the formula

Crash line offset = crash address - 0x00400000 - 0x1000

obviously no longer yields the crash line offset. If the formula is changed to

Crash line offset = crash address - absolute address of the crash function + relative offset of the function

it becomes universal. Take the example from Luo's article: in the map file of the crashing program discussed there, the entry for the crashing function is

 0001:00000020       ?Crash@@YAXXZ              00401020 f   CrashDemo.obj

For this entry, the "absolute address of the crash function" in my formula is 00401020, and the "relative offset of the function" is 00000020. When the crash address is 0x0040104a, crash line offset = crash address - absolute address of the crash function + relative offset of the function = 0x0040104a - 0x00401020 + 0x00000020 = 0x4a. This is the same result that Luo's article calculates, but the formula is more general.
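The difference between the two formulas can be sketched in a few lines of C++. This is a minimal illustration, not part of either article: the struct, the function names, and the parsing of the map line are assumptions, and the rebased addresses used in the note after the code (0x00501020 and 0x0050104a) are hypothetical values obtained by shifting the example to a base of 0x00500000.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// The two numbers the formula needs from one "Publics by Value" line.
// The struct and its field names are illustrative, not part of any tool.
struct MapFunction {
    std::uint32_t relative_offset;   // 0x00000020 in "0001:00000020"
    std::uint32_t absolute_address;  // 0x00401020 in the "Rva+Base" column
};

// Parse a line such as
//   "0001:00000020 ?Crash@@YAXXZ 00401020 f CrashDemo.obj".
// Returns true on success. Only the layout shown in the article is assumed.
bool parse_map_line(const char* line, MapFunction* out) {
    unsigned section = 0, relative = 0, absolute = 0;
    // %*s skips the decorated function name between the two addresses.
    if (std::sscanf(line, "%x:%x %*s %x", &section, &relative, &absolute) != 3) {
        return false;
    }
    out->relative_offset = relative;
    out->absolute_address = absolute;
    return true;
}

// General formula: crash line offset = crash address
//   - absolute address of the crash function + relative offset of the function.
std::uint32_t crash_line_offset(std::uint32_t crash_address, const MapFunction& fn) {
    assert(crash_address >= fn.absolute_address);
    return crash_address - fn.absolute_address + fn.relative_offset;
}

// Formula with the default base address and code offset hard-coded.
std::uint32_t crash_line_offset_fixed_base(std::uint32_t crash_address) {
    return crash_address - 0x00400000u - 0x1000u;
}
```

With the entry above and the crash address 0x0040104a, both functions return 0x4a. For the hypothetical rebased image, crash_line_offset still returns 0x4a, while the fixed-base version does not.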

IV. A better way to locate the crash statement
In fact, line-number information in the map file is not the only way to pin down the faulty line. VC6 can also write the assembly statements and machine code generated for the program, together with the corresponding C/C++ statements, into a ".cod" file, which can be used to locate the faulty line. To generate a ".cod" file containing all three kinds of information, click "Project" > "Settings...", select the "C/C++" tab in the dialog that pops up, select "Listing Files" under "Category", and then select "Assembly, Machine Code, and Source" in the "Listing file type" combo box. A concrete example below illustrates how the method works.


Figure 3 Generating a ".cod" file

Preparation step 1) The program that crashes is as follows:

01 //****************************************************************
02 // File name: Crash.cpp
03 // Purpose: demonstrate the new method of finding the source code error line through the crash address
04 // Author: Weigong Communications ROC
05 // Date: 2005-5-16
06 //****************************************************************
07 void Crash1();
08 int main(int argc,char* argv[])
09 {
10     Crash1();
11     return 0;
12 }
13
14 void Crash1()
15 {
16     char * p =(char*)100;
17     *p=100;
18 }

Preparation step 2) Set up map file generation as described above (line-number information is not required).
Preparation step 3) Set up ".cod" file generation as described above.
Preparation step 4) Compile. The debug version is used here as the example. (For the release version, change the compiler options to turn optimization off; otherwise the optimizer treats the code above as having no effect and does not compile it, so the crash cannot be observed.) Compilation produces an ".exe" file, a ".map" file, and a ".cod" file.
Running the program produces the following crash prompt:


Figure 4 running result of the preceding example

Troubleshooting step 1) Locate the crash function by consulting the map file. Part of the map file generated on my machine is as follows:

 Crash

 Timestamp is 42881a01 (Mon May 16 11:56:49 2005)

 Preferred load address is 00400000

 Start         Length     Name                   Class
 0001:00000000 0000ddf1H .text                   CODE
 0001:0000ddf1 0001000fH .textbss                CODE
 0002:00000000 00001346H .rdata                  DATA
 0002:00001346 00000000H .edata                  DATA
 0003:00000000 00000104H .CRT$XCA                DATA
 0003:00000104 00000104H .CRT$XCZ                DATA
 0003:00000208 00000104H .CRT$XIA                DATA
 0003:0000030c 00000109H .CRT$XIC                DATA
 0003:00000418 00000104H .CRT$XIZ                DATA
 0003:0000051c 00000104H .CRT$XPA                DATA
 0003:00000620 00000104H .CRT$XPX                DATA
 0003:00000724 00000104H .CRT$XPZ                DATA
 0003:00000828 00000104H .CRT$XTA                DATA
 0003:0000092c 00000104H .CRT$XTZ                DATA
 0003:00000a30 00000b93H .data                   DATA
 0003:000015c4 00001974H .bss                    DATA
 0004:00000000 00000014H .idata$2                DATA
 0004:00000014 00000014H .idata$3                DATA
 0004:00000028 00000110H .idata$4                DATA
 0004:00000138 00000110H .idata$5                DATA
 0004:00000248 000004afH .idata$6                DATA

  Address         Publics by Value              Rva+Base     Lib:Object

 0001:00000020       _main                      00401020 f   Crash.obj
 0001:00000060       ?Crash1@@YAXXZ             00401060 f   Crash.obj
 0001:000000a0       __chkesp                   004010a0 f   LIBCD:chkesp.obj
 0001:000000e0       _mainCRTStartup            004010e0 f   LIBCD:crt0.obj
 0001:00000210       __amsg_exit                00401210 f   LIBCD:crt0.obj
 0001:00000270       __CrtDbgBreak              00401270 f   LIBCD:dbgrpt.obj
 ...

For the crash address 0x00401082, the nearest address not exceeding it (in the "Rva+Base" column) is 00401060, and the corresponding function name is ?Crash1@@YAXXZ. All function names that begin with a question mark are C++ decorated names, and "@@YAXXZ" is the suffix used to distinguish overloaded functions, so ?Crash1@@YAXXZ is the function Crash1() in our source program.
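The lookup rule used in this step (the greatest start address that does not exceed the crash address) can be written down as a short sketch. The type and function names are hypothetical, and the table simply repeats the "Rva+Base" values from the map file excerpt above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// One "Publics by Value" entry: decorated name plus its Rva+Base address.
// The type is illustrative only.
struct MapSymbol {
    std::string name;
    std::uint32_t address;
};

// The entries listed in the map file excerpt of this article.
std::vector<MapSymbol> sample_map_symbols() {
    return {
        {"_main", 0x00401020u},
        {"?Crash1@@YAXXZ", 0x00401060u},
        {"__chkesp", 0x004010a0u},
        {"_mainCRTStartup", 0x004010e0u},
        {"__amsg_exit", 0x00401210u},
        {"__CrtDbgBreak", 0x00401270u},
    };
}

// Return the symbol with the greatest address not exceeding crash_address,
// or nullptr when the crash address lies below every listed function.
const MapSymbol* find_crash_function(const std::vector<MapSymbol>& symbols,
                                     std::uint32_t crash_address) {
    const MapSymbol* best = nullptr;
    for (const MapSymbol& symbol : symbols) {
        if (symbol.address <= crash_address &&
            (best == nullptr || symbol.address > best->address)) {
            best = &symbol;
        }
    }
    return best;
}

// Offset of the crash address from the start of the function containing it.
std::uint32_t offset_within_function(const std::vector<MapSymbol>& symbols,
                                     std::uint32_t crash_address) {
    const MapSymbol* fn = find_crash_function(symbols, crash_address);
    assert(fn != nullptr);
    return crash_address - fn->address;
}
```

For the crash address 0x00401082 the search selects ?Crash1@@YAXXZ. The linear scan does not rely on the entries being sorted, which keeps the sketch safe if entries are collected from several parts of a map file.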
Troubleshooting step 2) Locate the faulty line. Open the generated ".cod" file. The file generated on my machine reads as follows:

        TITLE   E:/Crash/Crash.cpp
        .386P
include listing.inc
if @Version gt 510
.model FLAT
else
_TEXT   SEGMENT PARA USE32 PUBLIC 'CODE'
_TEXT   ENDS
_DATA   SEGMENT DWORD USE32 PUBLIC 'DATA'
_DATA   ENDS
CONST   SEGMENT DWORD USE32 PUBLIC 'CONST'
CONST   ENDS
_BSS    SEGMENT DWORD USE32 PUBLIC 'BSS'
_BSS    ENDS
$$SYMBOLS       SEGMENT BYTE USE32 'DEBSYM'
$$SYMBOLS       ENDS
$$TYPES SEGMENT BYTE USE32 'DEBTYP'
$$TYPES ENDS
_TLS    SEGMENT DWORD USE32 PUBLIC 'TLS'
_TLS    ENDS
;       COMDAT _main
_TEXT   SEGMENT PARA USE32 PUBLIC 'CODE'
_TEXT   ENDS
;       COMDAT ?Crash1@@YAXXZ
_TEXT   SEGMENT PARA USE32 PUBLIC 'CODE'
_TEXT   ENDS
FLAT    GROUP _DATA, CONST, _BSS
        ASSUME  CS: FLAT, DS: FLAT, SS: FLAT
endif
PUBLIC  ?Crash1@@YAXXZ                          ; Crash1
PUBLIC  _main
EXTRN   __chkesp:NEAR
;       COMDAT _main
_TEXT   SEGMENT
_main   PROC NEAR                               ; COMDAT

; 9    : {

  00000 55               push    ebp
  00001 8b ec            mov     ebp, esp
  00003 83 ec 40         sub     esp, 64                 ; 00000040H
  00006 53               push    ebx
  00007 56               push    esi
  00008 57               push    edi
  00009 8d 7d c0         lea     edi, DWORD PTR [ebp-64]
  0000c b9 10 00 00 00   mov     ecx, 16                 ; 00000010H
  00011 b8 cc cc cc cc   mov     eax, -858993460         ; ccccccccH
  00016 f3 ab            rep stosd

; 10   :     Crash1();

  00018 e8 00 00 00 00   call    ?Crash1@@YAXXZ          ; Crash1

; 11   :     return 0;

  0001d 33 c0            xor     eax, eax

; 12   : }

  0001f 5f               pop     edi
  00020 5e               pop     esi
  00021 5b               pop     ebx
  00022 83 c4 40         add     esp, 64                 ; 00000040H
  00025 3b ec            cmp     ebp, esp
  00027 e8 00 00 00 00   call    __chkesp
  0002c 8b e5            mov     esp, ebp
  0002e 5d               pop     ebp
  0002f c3               ret     0
_main   ENDP
_TEXT   ENDS
;       COMDAT ?Crash1@@YAXXZ
_TEXT   SEGMENT
_p$ = -4
?Crash1@@YAXXZ PROC NEAR                        ; Crash1, COMDAT

; 15   : {

  00000 55               push    ebp
  00001 8b ec            mov     ebp, esp
  00003 83 ec 44         sub     esp, 68                 ; 00000044H
  00006 53               push    ebx
  00007 56               push    esi
  00008 57               push    edi
  00009 8d 7d bc         lea     edi, DWORD PTR [ebp-68]
  0000c b9 11 00 00 00   mov     ecx, 17                 ; 00000011H
  00011 b8 cc cc cc cc   mov     eax, -858993460         ; ccccccccH
  00016 f3 ab            rep stosd

; 16   :     char * p =(char*)100;

  00018 c7 45 fc 64 00
        00 00            mov     DWORD PTR _p$[ebp], 100 ; 00000064H

; 17   :     *p=100;

  0001f 8b 45 fc         mov     eax, DWORD PTR _p$[ebp]
  00022 c6 00 64         mov     BYTE PTR [eax], 100     ; 00000064H

; 18   : }

  00025 5f               pop     edi
  00026 5e               pop     esi
  00027 5b               pop     ebx
  00028 8b e5            mov     esp, ebp
  0002a 5d               pop     ebp
  0002b c3               ret     0
?Crash1@@YAXXZ ENDP                             ; Crash1
_TEXT   ENDS
END

In it, the line

?Crash1@@YAXXZ PROC NEAR                        ; Crash1, COMDAT

is the first line of the assembly code for Crash1, so the code that causes the crash lies somewhere after it. The next line is:

; 15   : {

The "{" after the colon is the statement in the source file, and the "15" before the colon is the line number of that statement in the source file. What follows is the offset address, machine code, and assembly code that the statement compiles to. For example:

  00000 55               push    ebp

"00000" is the offset relative to the start address of the function, "55" is the compiled machine code, and "push ebp" is the assembly code. The ".cod" file shows that one C/C++ statement usually compiles into several assembly statements. In addition, when the machine code of an assembly statement is too long, it is displayed on two lines, for example:

  00018 c7 45 fc 64 00
        00 00            mov     DWORD PTR _p$[ebp], 100 ; 00000064H

"00018" is the relative offset. In the debug version, it is the offset relative to the start address of the function (the first statement of every function has offset 00000). In the release version, it is the offset relative to the first statement of the code segment (that statement has relative offset 00000, and the first statement of each later function does not). "c7 45 fc 64 00 00 00" is the compiled machine code, and "mov DWORD PTR _p$[ebp], 100" is the assembly code. In assembly language, everything after ";" is a comment, so "; 00000064H" is a comment noting that 100 in hexadecimal is 00000064H.
Next, we locate the statement that causes the crash.
Step 1: Calculate the offset of the crash address relative to the crash function. In this example the address of the crashing statement (0x00401082) and the start address of its function (0x00401060) are both known, so the offset is easy to calculate:

Crash offset address = crash statement address - start address of the crash function = 0x00401082 - 0x00401060 = 0x22

Step 2: Calculate the relative offset of the faulty assembly statement in the ".cod" file. The ".cod" file shows that the relative offset of function Crash1() is 0000, so

Relative offset of the crash statement in the ".cod" file = relative offset of the crash function in the ".cod" file + crash offset address = 0x0000 + 0x22 = 0x22

Step 3: Look up the code at relative offset 0x22 in function Crash1. The result is:

  00022 c6 00 64         mov     BYTE PTR [eax], 100     ; 00000064H

This assembly statement stores 100 into the memory location that register eax points to; the store is 1 byte wide. The program crashed while executing this instruction, so eax evidently held an invalid address.
Step 4: Look a few lines above this assembly statement for its source statement. The result is:

; 17   :  *p=100;

Here 17 means that the statement is on line 17 of the source file, and "*p=100;" is the statement in the source file that causes the crash.
At this point we have found not only the source statement that causes the crash and its exact position in the source file, but even the exact assembly instruction that crashes.
How about it, doesn't that feel better?
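The four steps can be condensed into a short sketch. It assumes that the ".cod" listing of Crash1 has already been reduced, by hand or by a parser that is not shown, to pairs of instruction offset and source line number; the type and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One instruction line of a ".cod" listing: its offset and the number of
// the source line given by the "; NN : ..." comment that precedes it.
struct CodInstruction {
    std::uint32_t offset;
    int source_line;
};

// Offsets and source lines of Crash1 as they appear in the listing above.
std::vector<CodInstruction> crash1_listing() {
    return {
        {0x00u, 15}, {0x01u, 15}, {0x03u, 15}, {0x06u, 15}, {0x07u, 15},
        {0x08u, 15}, {0x09u, 15}, {0x0cu, 15}, {0x11u, 15}, {0x16u, 15},
        {0x18u, 16},
        {0x1fu, 17}, {0x22u, 17},
        {0x25u, 18}, {0x26u, 18}, {0x27u, 18}, {0x28u, 18}, {0x2au, 18},
        {0x2bu, 18},
    };
}

// Steps 1 to 4: compute the offset of the crash inside the function, add the
// function's own offset in the ".cod" file (0 in a debug build), and look up
// the instruction that starts at that offset. Returns the source line number,
// or -1 if no instruction starts exactly there.
int source_line_for_crash(const std::vector<CodInstruction>& listing,
                          std::uint32_t function_cod_offset,
                          std::uint32_t crash_address,
                          std::uint32_t function_address) {
    assert(crash_address >= function_address);
    const std::uint32_t cod_offset =
        function_cod_offset + (crash_address - function_address);
    for (const CodInstruction& instruction : listing) {
        if (instruction.offset == cod_offset) {
            return instruction.source_line;
        }
    }
    return -1;
}
```

Called with the listing above, a function offset of 0, the crash address 0x00401082 and the function address 0x00401060, the lookup returns 17. The function_cod_offset parameter is there for listings in which a function does not start at offset 0.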

V. Summary

1. The new method has the same scope of application: crashes that a single statement causes immediately. Also, I do not know whether compilers other than VC6 can generate a similar ".cod" file.
2. By comparing the ".cod" files that the new method generates for the debug and release versions, we can track down bugs (or other behavior) that appear only in the release version (or only in the debug version). For the example in this article, opening the release ".cod" file shows why the debug version crashes and the release version does not: in the release version, the crashing statements are not compiled at all. To see the crash in the release version of the same example, change the compiler options to a non-optimized configuration.
