C++ from the perspective of assembly (introduction)

[Disclaimer: All rights reserved. Reprinting is welcome, but not for commercial purposes. Contact email: feixiaoxing@163.com]

 

Many of my friends, myself included, are not very familiar with a good number of C++ features. When I was looking for a job a few years ago, I often forced myself to memorize complicated questions and answers just to get through employers' written exams. But after a while everything was back where it started: what I had not understood, I still did not understand. Only some years later, with more coding experience behind me, did I start trying to explain these phenomena with assembly code and memory contents. Some people are afraid of assembly language, but there is no need to be. If you know some C and understand the stack, you already have the basics. In the next several posts I will look at how x86 assembly, data types, the logic of data operations, pointers, data, classes, and overloaded operators play out at the assembly level, and share some personal opinions along the way. To start, here are a few small tests, each explained through its assembly. You can try them yourself as you read.

 

 

(1) char name[] and char *name

 

 

    2:    void process()
    3:    {
    00401020   push        ebp
    00401021   mov         ebp,esp
    00401023   sub         esp,4Ch
    00401026   push        ebx
    00401027   push        esi
    00401028   push        edi
    00401029   lea         edi,[ebp-4Ch]
    0040102C   mov         ecx,13h
    00401031   mov         eax,0CCCCCCCCh
    00401036   rep stos    dword ptr [edi]
    4:        char name_tmp[] = {"hello"};
    00401038   mov         eax,[string "hello" (0042201c)]
    0040103D   mov         dword ptr [ebp-8],eax
    00401040   mov         cx,word ptr [string "hello"+4 (00422020)]
    00401047   mov         word ptr [ebp-4],cx
    5:        char *name_glb = "hello";
    0040104B   mov         dword ptr [ebp-0Ch],offset string "hello" (0042201c)
    6:    }
    00401052   pop         edi
    00401053   pop         esi
    00401054   pop         ebx
    00401055   mov         esp,ebp
    00401057   pop         ebp
    00401058   ret

The code above makes the difference between the two clear. The string "hello" is read-only global data located at address 0x0042201C. name_tmp is a char array local to the function. The four instructions under source line 4 copy the global "hello" into name_tmp in two moves: first a dword (four bytes), then a word (two bytes). name_tmp therefore holds 6 bytes, the five characters plus the terminating zero. name_glb, by comparison, copies nothing. It only stores the address of the global string, so it is just a pointer.
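
The same contrast can be checked from C++ itself. The following is a minimal sketch, not part of the original test; the function name array_is_private_copy is made up for illustration, and the pointer is declared const because modern compilers no longer accept a plain char * pointing at a literal.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: contrasts a char array with a pointer to a literal.
bool array_is_private_copy() {
    char name_tmp[] = "hello";        // the literal is copied into a local array
    const char *name_glb = "hello";   // only the address of the literal is stored

    // The array owns 6 bytes: five characters plus the terminating zero.
    static_assert(sizeof(name_tmp) == 6, "array holds the whole string");
    // The pointer has pointer size, whatever the length of the string.
    static_assert(sizeof(name_glb) == sizeof(void *), "just a pointer");

    name_tmp[0] = 'H';                // fine: changes the local copy only
    // name_glb[0] = 'H';             // does not compile: the literal is read-only

    assert(std::strcmp(name_tmp, "Hello") == 0);
    return std::strcmp(name_glb, "hello") == 0;   // the literal is untouched
}
```

Calling array_is_private_copy() should return true: the write went to the stack copy, not to the literal.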

(2) apple a() and apple b

Assume class apple is defined as follows:

    class apple
    {
    public:
        apple() {}
        ~apple() {}
    };

So how are apple a() and apple b each compiled?

 

 

    9:    void process()
    10:   {
    00401020   push        ebp
    00401021   mov         ebp,esp
    00401023   sub         esp,44h
    00401026   push        ebx
    00401027   push        esi
    00401028   push        edi
    00401029   lea         edi,[ebp-44h]
    0040102C   mov         ecx,11h
    00401031   mov         eax,0CCCCCCCCh
    00401036   rep stos    dword ptr [edi]
    11:       apple a();
    12:       apple b;
    00401038   lea         ecx,[ebp-4]
    0040103B   call        @ILT+20(apple::apple) (00401019)
    13:   }
    00401040   lea         ecx,[ebp-4]
    00401043   call        @ILT+10(apple::~apple) (0040100f)
    00401048   pop         edi
    00401049   pop         esi
    0040104A   pop         ebx
    0040104B   add         esp,44h
    0040104E   cmp         ebp,esp
    00401050   call        _chkesp (004010b0)
    00401055   mov         esp,ebp
    00401057   pop         ebp
    00401058   ret

 

Why does apple a() generate no code at all? The reason is simple: the compiler parses apple a(); as the declaration of a function named a that takes no arguments and returns an apple, not as the definition of an object. apple b, on the other hand, really does define a local object in the function, which is why two apple member functions are called in the listing: the apple constructor at the definition, and the apple destructor a little further down, just before the function returns.
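
One way to confirm this without reading assembly is to count constructor calls. This is a small sketch under assumptions of my own: the counter constructed and the function count_constructions are illustrative names, and the brace form apple c{} requires C++11 or later.

```cpp
#include <cassert>

static int constructed = 0;   // illustrative counter, not part of the original class

class apple {
public:
    apple() { ++constructed; }
    ~apple() {}
};

// Hypothetical helper: returns how many apple objects were actually built.
int count_constructions() {
    constructed = 0;
    apple a();   // declares a function a returning apple; no object, no constructor call
    apple b;     // defines an object; the constructor runs
    apple c{};   // C++11 braces also define an object; the constructor runs
    assert(constructed == 2);
    return constructed;
}
```

Many compilers warn about the apple a(); line precisely because it is so often a mistake.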

(3) ((apple *)0)->print()

Here, class apple is defined as follows:

    class apple
    {
        int value;
    public:
        apple() {}
        ~apple() {}
        void print() { return; }
    };

If 0 is cast to apple *, can the function print still be called through it?

 

 

    10:   void process()
    11:   {
    00401030   push        ebp
    00401031   mov         ebp,esp
    00401033   sub         esp,40h
    00401036   push        ebx
    00401037   push        esi
    00401038   push        edi
    00401039   lea         edi,[ebp-40h]
    0040103C   mov         ecx,10h
    00401041   mov         eax,0CCCCCCCCh
    00401046   rep stos    dword ptr [edi]
    12:       ((apple *)0)->print();
    00401048   xor         ecx,ecx
    0040104A   call        @ILT+0(apple::print) (00401005)
    13:   }
    0040104F   pop         edi
    00401050   pop         esi
    00401051   pop         ebx
    00401052   add         esp,40h
    00401055   cmp         ebp,esp
    00401057   call        _chkesp (004010e0)
    0040105C   mov         esp,ebp
    0040105E   pop         ebp
    0040105F   ret

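Running the function raises no exception. Why? The listing shows ecx being cleared to 0 before the call, and ecx carries the familiar this pointer, so print is called with this equal to 0. But print never uses this: it does not touch this->value and consists of a single return statement, so nothing is ever read from or written to address 0. With this compiler, then, a null object pointer is not dangerous in itself; what is dangerous is using it to access data in memory. Keep in mind that the C++ standard still treats calling a non-static member function through a null pointer as undefined behavior, so this is an observation about what this particular build does, not something to rely on.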
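
Because the member call through a null pointer is itself undefined, the idea is easier to show with free functions that take the object pointer explicitly, the way ecx carries this in the listing. The sketch below is an analogy only; apple_data, print_like, read_value and null_pointer_demo are hypothetical names, not the original class.

```cpp
#include <cassert>

struct apple_data {
    int value;
};

// Stands in for a member function that ignores `this`: `self` is never dereferenced,
// so passing a null pointer touches no memory.
int print_like(apple_data *self) {
    (void)self;
    return 0;
}

// Stands in for a member function that reads a field: the pointer must be checked
// before it is used, otherwise a null `self` would fault.
int read_value(apple_data *self, int fallback) {
    return self ? self->value : fallback;
}

// Hypothetical driver tying the two cases together.
bool null_pointer_demo() {
    apple_data d{7};
    assert(print_like(nullptr) == 0);       // harmless: pointer unused
    assert(read_value(nullptr, -1) == -1);  // guarded: fallback returned
    return read_value(&d, -1) == 7;         // valid object: field is read
}
```

With ordinary functions a null argument is well defined as long as it is never dereferenced, which is the part of the observation above that carries over safely.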

(4) int m = 1; int n = m++ + ++m; what is n?

    10:   void process()
    11:   {
    0040D4D0   push        ebp
    0040D4D1   mov         ebp,esp
    0040D4D3   sub         esp,48h
    0040D4D6   push        ebx
    0040D4D7   push        esi
    0040D4D8   push        edi
    0040D4D9   lea         edi,[ebp-48h]
    0040D4DC   mov         ecx,12h
    0040D4E1   mov         eax,0CCCCCCCCh
    0040D4E6   rep stos    dword ptr [edi]
    12:       int m = 1;
    0040D4E8   mov         dword ptr [ebp-4],1
    13:       int n = m++ + ++m;
    0040D4EF   mov         eax,dword ptr [ebp-4]
    0040D4F2   add         eax,1
    0040D4F5   mov         dword ptr [ebp-4],eax
    0040D4F8   mov         ecx,dword ptr [ebp-4]
    0040D4FB   add         ecx,dword ptr [ebp-4]
    0040D4FE   mov         dword ptr [ebp-8],ecx
    0040D501   mov         edx,dword ptr [ebp-4]
    0040D504   add         edx,1
    0040D507   mov         dword ptr [ebp-4],edx
    14:   }
    0040D50A   pop         edi
    0040D50B   pop         esi
    0040D50C   pop         ebx
    0040D50D   mov         esp,ebp
    0040D50F   pop         ebp

In the assembly code, [ebp-4] is the stack slot of m and [ebp-8] is the stack slot of n. Nine instructions follow int n = m++ + ++m. The first three increment m by 1, so m is 2. The fourth loads m into ecx, so ecx = 2. The fifth adds m to ecx, that is, ecx = ecx + m, so ecx = 4. The sixth stores ecx into n, so n = 4. The seventh to ninth increment m once more, leaving m = 3. Why does it come out this way? This compiler applies the prefix increment before it evaluates the sum and defers the postfix increment until the sum has been stored. In other words it executes ++m first, then n = m + m, and finally m++. This should not be read as a rule of the language, though. C++ does not fix the order in which the two operands of + are evaluated, and modifying m twice in one expression without sequencing is undefined behavior, so another compiler or another optimization level may well give a different n.
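
An expression that modifies m and also uses it elsewhere without sequencing cannot be tested portably, but the order this listing follows can be written out as separate statements, each of which is well defined. This is a sketch of that reading only; the struct result and the function listing_order are names introduced here for illustration.

```cpp
#include <cassert>

struct result {
    int m;
    int n;
};

// Hypothetical helper: replays the instruction order of the listing as three
// separate, well-defined statements.
result listing_order() {
    int m = 1;
    ++m;              // first three instructions: m becomes 2
    int n = m + m;    // next three: ecx = m, ecx += m, n = ecx, so n is 4
    m++;              // last three: m becomes 3
    assert(m == 3 && n == 4);
    return {m, n};
}
```

Writing the steps out like this is also the practical advice: if the order matters, put each side effect in its own statement.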

(5) What is the difference between *p++ and (*p)++?

 

 

    10:   void process()
    11:   {
    0040D4D0   push        ebp
    0040D4D1   mov         ebp,esp
    0040D4D3   sub         esp,48h
    0040D4D6   push        ebx
    0040D4D7   push        esi
    0040D4D8   push        edi
    0040D4D9   lea         edi,[ebp-48h]
    0040D4DC   mov         ecx,12h
    0040D4E1   mov         eax,0CCCCCCCCh
    0040D4E6   rep stos    dword ptr [edi]
    12:       char data = 'a';
    0040D4E8   mov         byte ptr [ebp-4],61h
    13:       char *p = &data;
    0040D4EC   lea         eax,[ebp-4]
    0040D4EF   mov         dword ptr [ebp-8],eax
    14:       *p++;
    0040D4F2   mov         ecx,dword ptr [ebp-8]
    0040D4F5   add         ecx,1
    0040D4F8   mov         dword ptr [ebp-8],ecx
    15:       (*p)++;
    0040D4FB   mov         edx,dword ptr [ebp-8]
    0040D4FE   mov         al,byte ptr [edx]
    0040D500   add         al,1
    0040D502   mov         ecx,dword ptr [ebp-8]
    0040D505   mov         byte ptr [ecx],al
    16:   }
    0040D507   pop         edi
    0040D508   pop         esi
    0040D509   pop         ebx
    0040D50A   mov         esp,ebp
    0040D50C   pop         ebp
    0040D50D   ret

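First the local variable data is created, then its address is stored in p. The assembly code shows clearly that, as a standalone statement, *p++ does the same as p++: only the pointer is incremented, because ++ binds to p before * is applied and the value read through * is discarded. (*p)++ first loads the pointer into edx, then loads the char at the address in edx into al and adds 1 to al. It then loads p again into ecx and stores al at the address held in ecx. It is that simple. One caveat: in this listing p has already been advanced by *p++, so (*p)++ modifies the byte just after data rather than data itself, which real code must not do.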
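
The difference is easier to observe when the pointer stays inside an array. The following sketch uses a two-element array so that both statements touch valid memory; the function name pointer_increment_demo is made up for this example.

```cpp
#include <cassert>

// Hypothetical helper: shows which object each form of increment changes.
bool pointer_increment_demo() {
    char data[2] = {'a', 'a'};
    char *p = data;

    char first = *p++;          // reads data[0], then advances p to data[1]
    assert(first == 'a');
    assert(p == data + 1);      // the pointer moved
    assert(data[0] == 'a');     // the character it pointed to did not change

    (*p)++;                     // increments the character p points to
    assert(p == data + 1);      // the pointer did not move
    return data[1] == 'a' + 1;  // data[1] went from 'a' to the next code
}
```

In short, *p++ changes where p points, while (*p)++ changes what p points to.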

 

There are many other problems of this kind. You may want to try a few yourself:

(1) How is the following union laid out in memory? Do gcc and VC allocate the same amount of memory for it?

    typedef union
    {
        char m : 3;
        char n : 7;
        int data;
    } Value;

(2) Are the following addresses the same?

    char value1[] = {"hello"};
    char value2[] = {"hello"};
    const char *pValue1 = "hello";
    const char *pValue2 = "hello";

Do value1 and value2 have the same address? What about pValue1 and pValue2?

(3) Why is the following code incorrect? Why does it leak memory? How would you fix it?

    class apple
    {
        char *pName;
    public:
        apple() { pName = (char *)malloc(10); }
        ~apple() { if (NULL != pName) free(pName); }
    };

    void process()
    {
        apple a, b;
        a = b;
    }

 

(End)
