[Disclaimer: All Rights Reserved. You are welcome to reprint it. Do not use it for commercial purposes. Contact Email: feixiaoxing @ 163.com]
Many of my friends, myself included, are not very familiar with many features of the C++ language. When I was looking for a job a few years ago, I often forced myself to memorize complicated questions and answers just to get through employers' written tests. But as time passed, everything went back to square one: the questions I had never really understood stayed unclear. It was only some years later, after I had accumulated coding experience, that I started trying to explain these phenomena using assembly code and memory contents. Some people are afraid of assembly language, but there is no need to be. If you know some C and understand the stack, you already have the basics. Over the next several posts I will look at how data types, arithmetic and logic, pointers, data, classes, and overloaded operators are implemented in x86 assembly, and share some personal opinions along the way. To start, here are a few small tests with explanations at the assembly level. Feel free to try them yourself.
(1) char name[] and char* name
1:
2:    void process()
3:    {
00401020 push ebp
00401021 mov ebp,esp
00401023 sub esp,4Ch
00401026 push ebx
00401027 push esi
00401028 push edi
00401029 lea edi,[ebp-4Ch]
0040102C mov ecx,13h
00401031 mov eax,0CCCCCCCCh
00401036 rep stos dword ptr [edi]
4:        char name_tmp[] = {"hello"};
00401038 mov eax,[string "hello" (0042201c)]
0040103D mov dword ptr [ebp-8],eax
00401040 mov cx,word ptr [string "hello"+4 (00422020)]
00401047 mov word ptr [ebp-4],cx
5:        char* name_glb = "hello";
0040104B mov dword ptr [ebp-0Ch],offset string "hello" (0042201c)
6:    }
00401052 pop edi
00401053 pop esi
00401054 pop ebx
00401055 mov esp,ebp
00401057 pop ebp
00401058 ret
The listing above makes the difference between the two clear. The string "hello" is a string literal stored in global read-only data at address 0x0042201c. name_tmp is a char array local to the function. The four instructions under line 4 copy the global "hello" into name_tmp in two moves: first a DWORD (four bytes), then a WORD (two bytes). name_tmp therefore holds 6 bytes, the five characters plus the terminating '\0'. name_glb, by contrast, holds no characters of its own. It only stores the address of the global string, so it is just a pointer.
(2) apple a() and apple b
Assume that class apple is defined:
class apple
{
public:
    apple() {}
    ~apple() {}
};
So how does the compiler handle apple a() and apple b?
9:    void process()
10:   {
00401020 push ebp
00401021 mov ebp,esp
00401023 sub esp,44h
00401026 push ebx
00401027 push esi
00401028 push edi
00401029 lea edi,[ebp-44h]
0040102C mov ecx,11h
00401031 mov eax,0CCCCCCCCh
00401036 rep stos dword ptr [edi]
11:       apple a();
12:       apple b;
00401038 lea ecx,[ebp-4]
0040103B call @ILT+20(apple::apple) (00401019)
13:   }
00401040 lea ecx,[ebp-4]
00401043 call @ILT+10(apple::~apple) (0040100f)
00401048 pop edi
00401049 pop esi
0040104A pop ebx
0040104B add esp,44h
0040104E cmp ebp,esp
00401050 call __chkesp (004010b0)
00401055 mov esp,ebp
00401057 pop ebp
00401058 ret
Why did apple a() compile to nothing? The reason is simple: the compiler treats apple a() as the declaration of a function named a that takes no arguments and returns an apple, not as an object definition. apple b, in contrast, is a real local variable defined in the function, which is why the listing contains two calls for it: the apple constructor and, a little further down, the apple destructor.
(3) ((apple*)(0))->print()
Here, class apple is defined as follows:
class apple
{
    int value;
public:
    apple() {}
    ~apple() {}
    void print() { return; }
};
If 0 is cast to apple*, can the function print still be called?
10:   void process()
11:   {
00401030 push ebp
00401031 mov ebp,esp
00401033 sub esp,40h
00401036 push ebx
00401037 push esi
00401038 push edi
00401039 lea edi,[ebp-40h]
0040103C mov ecx,10h
00401041 mov eax,0CCCCCCCCh
00401046 rep stos dword ptr [edi]
12:       ((apple*)(0))->print();
00401048 xor ecx,ecx
0040104A call @ILT+0(apple::print) (00401005)
13:   }
0040104F pop edi
00401050 pop esi
00401051 pop ebx
00401052 add esp,40h
00401055 cmp ebp,esp
00401057 call __chkesp (004010e0)
0040105C mov esp,ebp
0040105E pop ebp
0040105F ret
Running the function produces no exception. Why? ecx is passed to print as 0; that is, the familiar this pointer is 0. But print never uses this: it does not access this->value and consists of a single return statement. So in this build a NULL object pointer is harmless in itself; what causes the crash is using the NULL pointer to access data in memory. Note that the C++ standard still classifies calling a non-static member function through a null pointer as undefined behavior, so this result shows what this compiler happened to generate and is not something to rely on.
(4) int m = 1; int n = m++ + ++m; what is n?
10:   void process()
11:   {
0040D4D0 push ebp
0040D4D1 mov ebp,esp
0040D4D3 sub esp,48h
0040D4D6 push ebx
0040D4D7 push esi
0040D4D8 push edi
0040D4D9 lea edi,[ebp-48h]
0040D4DC mov ecx,12h
0040D4E1 mov eax,0CCCCCCCCh
0040D4E6 rep stos dword ptr [edi]
12:       int m = 1;
0040D4E8 mov dword ptr [ebp-4],1
13:       int n = m++ + ++m;
0040D4EF mov eax,dword ptr [ebp-4]
0040D4F2 add eax,1
0040D4F5 mov dword ptr [ebp-4],eax
0040D4F8 mov ecx,dword ptr [ebp-4]
0040D4FB add ecx,dword ptr [ebp-4]
0040D4FE mov dword ptr [ebp-8],ecx
0040D501 mov edx,dword ptr [ebp-4]
0040D504 add edx,1
0040D507 mov dword ptr [ebp-4],edx
14:   }
0040D50A pop edi
0040D50B pop esi
0040D50C pop ebx
0040D50D mov esp,ebp
0040D50F pop ebp
From the assembly we can see that [ebp-4] is the stack address of m and [ebp-8] is the stack address of n. There are nine instructions under int n = m++ + ++m. The first three increment m by 1. The fourth sets ecx = m, so ecx = 2. The fifth adds m to ecx, that is, ecx = ecx + m, so ecx = 4. The sixth stores n = ecx. The seventh to ninth increment m by 1 again. Why this order? This compiler applies the prefix increment on the right-hand side first, then evaluates the sum, and applies the postfix increment last: first ++m, then n = m + m, and finally m++. So here n is 4 and m ends up as 3. Keep in mind that the expression modifies m twice without sequencing, which the C++ standard treats as undefined behavior, so another compiler or optimization level may give a different answer.
(5) What is the difference between *p++ and (*p)++?
10:   void process()
11:   {
0040D4D0 push ebp
0040D4D1 mov ebp,esp
0040D4D3 sub esp,48h
0040D4D6 push ebx
0040D4D7 push esi
0040D4D8 push edi
0040D4D9 lea edi,[ebp-48h]
0040D4DC mov ecx,12h
0040D4E1 mov eax,0CCCCCCCCh
0040D4E6 rep stos dword ptr [edi]
12:       char data = 'a';
0040D4E8 mov byte ptr [ebp-4],61h
13:       char* p = &data;
0040D4EC lea eax,[ebp-4]
0040D4EF mov dword ptr [ebp-8],eax
14:       *p++;
0040D4F2 mov ecx,dword ptr [ebp-8]
0040D4F5 add ecx,1
0040D4F8 mov dword ptr [ebp-8],ecx
15:       (*p)++;
0040D4FB mov edx,dword ptr [ebp-8]
0040D4FE mov al,byte ptr [edx]
0040D500 add al,1
0040D502 mov ecx,dword ptr [ebp-8]
0040D505 mov byte ptr [ecx],al
16:   }
0040D507 pop edi
0040D508 pop esi
0040D509 pop ebx
0040D50A mov esp,ebp
0040D50C pop ebp
0040D50D ret
First the local variable data is created, then its address is stored in p. The assembly shows that, as a statement on its own, *p++ is equivalent to p++: only the pointer is incremented, because the value read is discarded. (*p)++ first loads the pointer into edx, reads the char at the address in edx into al, and increments al by 1. It then loads p into ecx and writes al back to the address in ecx. It is that simple.
There are many other similar questions. You may wish to try them yourself:
(1) How is the following union laid out in memory? Is the allocated size the same when it is compiled with GCC and with VC?
typedef union {
    char m:3;
    char n:7;
    int data;
} value;
(2) Are the following addresses the same?
char value1[] = {"hello"};
char value2[] = {"hello"};
char* pvalue1 = "hello";
char* pvalue2 = "hello";
Are the addresses of value1 and value2 the same? What about pvalue1 and pvalue2?
(3) Why is the following code incorrect? Why does it leak memory? How should it be fixed?
class apple
{
    char* pName;
public:
    apple() { pName = (char*)malloc(10); }
    ~apple() { if (NULL != pName) free(pName); }
};

void process()
{
    apple a, b;
    a = b;
}
(End)