C++ from the perspective of assembly (x86 assembly) 02

Last Update:2018-12-06 Source: Internet

Author: User

To look at C++ code from the perspective of assembly, we first have to be able to read assembly. To be honest, assembly is not difficult. We only need to understand the following questions:

(1) What kind of language is assembly?

(2) What are the main contents of assembly?

(3) How does assembly language correspond, statement by statement, to actual C/C++ code?

(1) What kind of language is assembly?

Assembly language is in fact a mnemonic notation for the CPU's instruction codes. Different CPUs have different instruction sets. CPUs in ordinary PCs generally come from AMD or Intel and use the x86 instruction set, which is what we discuss today. Other CPU families include PowerPC, used mainly in the switches and routers of telecom companies; ARM, used mainly in smart terminals and instrumentation devices; and Sun SPARC, used mainly in Sun servers. Because assembly instructions and binary machine code correspond almost one to one, assembly language not only helps us quickly understand the machine's hardware but also helps us understand how a program actually runs on the device.
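
To make the near one-to-one mapping concrete, here is a minimal C++ sketch that converts a single x86 instruction, `mov eax, imm32`, to its machine-code bytes and back. The function names are made up for illustration. The encoding used (opcode byte 0xB8 followed by the 32-bit immediate in little-endian order) covers only this one instruction form.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Encode `mov eax, imm32`: opcode 0xB8, then the immediate, lowest byte first.
std::array<std::uint8_t, 5> encode_mov_eax_imm32(std::uint32_t imm) {
    return {
        0xB8,
        static_cast<std::uint8_t>(imm & 0xFF),
        static_cast<std::uint8_t>((imm >> 8) & 0xFF),
        static_cast<std::uint8_t>((imm >> 16) & 0xFF),
        static_cast<std::uint8_t>((imm >> 24) & 0xFF),
    };
}

// Decode the same five bytes back into the immediate operand.
std::uint32_t decode_mov_eax_imm32(const std::array<std::uint8_t, 5>& code) {
    assert(code[0] == 0xB8 && "not a `mov eax, imm32` encoding");
    return static_cast<std::uint32_t>(code[1]) |
           (static_cast<std::uint32_t>(code[2]) << 8) |
           (static_cast<std::uint32_t>(code[3]) << 16) |
           (static_cast<std::uint32_t>(code[4]) << 24);
}
```

For example, `encode_mov_eax_imm32(10)` yields the bytes B8 0A 00 00 00, and decoding those bytes gives back 10.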

(2) What are the main contents of assembly?

Assembly language covers a lot of ground, but not much of it is relevant to C/C++. In general, you only need to know registers, segment addresses, the stack, and the basic arithmetic and memory-access operations performed through registers.

(3) How does assembly language correspond, statement by statement, to actual C/C++ code?

We start with an example. In general, one C/C++ statement is split into several assembly instructions. For example:

int m = 10;
int n = 20;
int p = m + n;

Assume that m, n, and p are all inside a function, so all three are local (temporary) variables. On entry to the function, EBP and ESP are adjusted to reserve stack space for them. The three statements compile as follows:

43:   int m = 10;
004012e8   mov   dword ptr [ebp-4],0Ah
44:   int n = 20;
004012ef   mov   dword ptr [ebp-8],14h
45:   int p = m + n;
004012f6   mov   eax,dword ptr [ebp-4]
004012f9   add   eax,dword ptr [ebp-8]
004012fc   mov   dword ptr [ebp-0Ch],eax

The code above shows the correspondence between the assembly and the C statements directly. The first statement assigns 10 to m, whose storage lies just below EBP, at [ebp-4]. The second statement is similar. The third is a little more complicated. First the CPU loads m from the stack, that is, the data at address [ebp-4], into the register eax; it then locates n in the same way and adds it directly to eax. The last step simply stores eax to the address [ebp-0Ch]. Every local variable inside a function takes this form: it is accessed at an offset from EBP.
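
The five instructions can be replayed on a toy machine. The following is only a sketch: `Machine`, its 64-byte memory, and the starting value of `ebp` are assumptions chosen for illustration, not how a real CPU or debugger is organized.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// A toy 32-bit machine: flat byte memory plus the two registers used here.
struct Machine {
    std::vector<std::uint8_t> mem = std::vector<std::uint8_t>(64);
    std::uint32_t ebp = 32;  // hypothetical frame base inside `mem`
    std::uint32_t eax = 0;

    void store32(std::uint32_t addr, std::uint32_t value) {
        assert(addr + 4 <= mem.size());
        std::memcpy(&mem[addr], &value, 4);
    }
    std::uint32_t load32(std::uint32_t addr) const {
        assert(addr + 4 <= mem.size());
        std::uint32_t value = 0;
        std::memcpy(&value, &mem[addr], 4);
        return value;
    }
};

// Replays the listing and returns the final value of p.
std::uint32_t run_locals() {
    Machine m;
    m.store32(m.ebp - 4, 0x0A);       // mov dword ptr [ebp-4],0Ah    (m = 10)
    m.store32(m.ebp - 8, 0x14);       // mov dword ptr [ebp-8],14h    (n = 20)
    m.eax = m.load32(m.ebp - 4);      // mov eax,dword ptr [ebp-4]
    m.eax += m.load32(m.ebp - 8);     // add eax,dword ptr [ebp-8]
    m.store32(m.ebp - 0x0C, m.eax);   // mov dword ptr [ebp-0Ch],eax  (p = m + n)
    return m.load32(m.ebp - 0x0C);
}
```

Calling `run_locals()` returns 30, the value stored in the slot that plays the role of p.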

What happens if p is a global variable?

45:   int m = 10;
004012e8   mov   dword ptr [ebp-4],0Ah
46:   int n = 20;
004012ef   mov   dword ptr [ebp-8],14h
47:   p = m + n;
004012f6   mov   eax,dword ptr [ebp-4]
004012f9   add   eax,dword ptr [ebp-8]
004012fc   mov   [p (0042b0b4)],eax

In the code above, the assignments to m and n have not changed. What has changed is that the final value of eax is stored to the absolute address 0x42b0b4. This shows that once the program is loaded into memory, a global variable has its own fixed address, which does not move as the stack grows and shrinks.
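
The same difference can be observed from C++ without reading any disassembly. In this sketch (the names `Snapshot` and `record` are invented for illustration), a recursive function records the address of the global p and of its own local variable at each call depth.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

int p = 0;  // global variable: one fixed address for the whole run

struct Snapshot {
    std::uintptr_t global_addr;  // address of p seen from this call
    std::uintptr_t local_addr;   // address of this call's own local
};

// Records one snapshot per nesting level, outermost call first.
void record(int depth, std::vector<Snapshot>& out) {
    assert(depth >= 0);
    int local = depth;
    out.push_back({reinterpret_cast<std::uintptr_t>(&p),
                   reinterpret_cast<std::uintptr_t>(&local)});
    if (depth > 0) {
        record(depth - 1, out);
    }
    p += local;  // uses `local` after the nested call, so it stays alive
}
```

After `record(2, snaps)`, every snapshot holds the same `global_addr`, while each nesting level reports a different `local_addr`, because each call has its own stack frame.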

As we said, all local variables of a function are stored in the stack space between EBP and ESP. How is that space set up? Let's look at the complete assembly code of the function:

41:   void process()
42:   {
004012d0   push  ebp
004012d1   mov   ebp,esp
004012d3   sub   esp,4Ch
004012d6   push  ebx
004012d7   push  esi
004012d8   push  edi
004012d9   lea   edi,[ebp-4Ch]
004012dc   mov   ecx,13h
004012e1   mov   eax,0CCCCCCCCh
004012e6   rep stos dword ptr [edi]
43:   int m = 10;
004012e8   mov   dword ptr [ebp-4],0Ah
44:   int n = 20;
004012ef   mov   dword ptr [ebp-8],14h
45:   int p = m + n;
004012f6   mov   eax,dword ptr [ebp-4]
004012f9   add   eax,dword ptr [ebp-8]
004012fc   mov   dword ptr [ebp-0Ch],eax
46:   }

This is the complete code of the function. Before the local variable m is touched, the function has already done a lot of preparation. The main purposes are: (1) to reserve space for the local variables; (2) to save the registers the function will use. Registers are shared by all functions; if their original values are not saved, they are lost by the time the function returns, and the caller cannot continue computing from its original state. There are 10 instructions between address 0x4012d0 and address 0x4012e6. The first pushes EBP onto the stack; the second copies ESP to EBP; the third subtracts 0x4C from ESP, a size that generally depends on the local variables defined inside the function; the fourth pushes EBX; the fifth pushes ESI; the sixth pushes EDI; the seventh to tenth fill the 0x4C bytes starting at [ebp-4Ch] with 0xCC: EDI holds the start address, ECX holds the repeat count 0x13, and DWORD means 4 bytes are written each time (0x13 × 4 = 0x4C).
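
The fill loop at the end of the prologue can be written out in C++ as a rough model. This is a sketch with an invented function name: it mirrors what `rep stos dword ptr [edi]` does with the register values from the listing, using a vector in place of the real stack memory.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Models `rep stos dword ptr [edi]`: store EAX into ECX consecutive dwords.
std::vector<std::uint32_t> fill_locals_area() {
    const std::uint32_t frame_bytes = 0x4C;  // from `sub esp,4Ch`
    std::uint32_t ecx = 0x13;                // mov ecx,13h
    const std::uint32_t eax = 0xCCCCCCCC;    // mov eax,0CCCCCCCCh
    assert(ecx * 4 == frame_bytes);          // 19 dwords * 4 bytes = 76 bytes

    std::vector<std::uint32_t> area;         // stands for memory from [ebp-4Ch] upward
    while (ecx != 0) {                       // rep: repeat until ECX reaches 0
        area.push_back(eax);                 // stos: store EAX, advance EDI by 4
        --ecx;
    }
    return area;
}
```

The result has 19 dwords, each equal to 0xCCCCCCCC, which is 76 (0x4C) bytes of 0xCC in total.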

So what does the function do before it returns?

46:   }
004012ff   pop   edi
00401300   pop   esi
00401301   pop   ebx
00401302   mov   esp,ebp
00401304   pop   ebp
00401305   ret

What the function does before returning is simple. The first instruction pops EDI, the second pops ESI, and the third pops EBX, in the reverse order of the earlier pushes. The last three instructions are particularly important: EBP is copied to ESP, EBP is popped, and the function returns, so everything is restored to its state before the call.
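
To check that the pops really undo the pushes, the entry and exit sequences can be replayed on a small simulated stack. This is a sketch under assumptions: the `Cpu` struct, the 256-byte stack, and the initial register values are invented, and the final `ret` is left out because no return address is modelled here.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Cpu {
    std::uint32_t esp, ebp, ebx, esi, edi;
    std::vector<std::uint32_t> stack;  // one slot per dword; index = address / 4

    void push(std::uint32_t v) {
        assert(esp >= 4);
        esp -= 4;
        stack[esp / 4] = v;
    }
    std::uint32_t pop() {
        assert(esp / 4 < stack.size());
        std::uint32_t v = stack[esp / 4];
        esp += 4;
        return v;
    }
};

// Hypothetical starting state: 256-byte stack, arbitrary register contents.
Cpu make_cpu() {
    return Cpu{0x100, 0x200, 1, 2, 3, std::vector<std::uint32_t>(64)};
}

void enter_and_leave(Cpu& cpu) {
    cpu.push(cpu.ebp);        // push ebp
    cpu.ebp = cpu.esp;        // mov ebp,esp
    cpu.esp -= 0x4C;          // sub esp,4Ch
    cpu.push(cpu.ebx);        // push ebx
    cpu.push(cpu.esi);        // push esi
    cpu.push(cpu.edi);        // push edi

    cpu.ebx = cpu.esi = cpu.edi = 0xDEADBEEF;  // function body clobbers them

    cpu.edi = cpu.pop();      // pop edi
    cpu.esi = cpu.pop();      // pop esi
    cpu.ebx = cpu.pop();      // pop ebx
    cpu.esp = cpu.ebp;        // mov esp,ebp
    cpu.ebp = cpu.pop();      // pop ebp
}
```

After `enter_and_leave` runs on the state from `make_cpu()`, ESP, EBP, EBX, ESI, and EDI all hold their starting values again, even though the body overwrote three of them.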

So how are arguments passed in a function call?

53:   process(20);
0040efa4   push  14h
0040efa6   call  @ILT+40(process) (0040102d)
0040efab   add   esp,4

The code above shows the call when process takes one parameter. The argument is pushed before the call; after the call returns, the caller adds 4 to ESP to restore the stack, because the argument occupies 4 bytes. The following figure shows the stack layout during a function call (higher addresses at the top):

| Function parameters |
| Return address      |
| Saved EBP           | <------------------------ EBP
| Temporary variables |
| Saved registers     | <------------------------ ESP (stack top)
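
The layout in the figure can be traced step by step. The sketch below replays the call sequence on a small simulated stack; `Stack32` and `call_process` are invented names, and the callee is reduced to its frame setup and teardown. It shows that, with this calling sequence, the first argument ends up at [ebp+8], above the return address and the saved EBP.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Stack32 {
    std::vector<std::uint32_t> slots = std::vector<std::uint32_t>(32);
    std::uint32_t esp = 32 * 4;  // empty stack: ESP just past the highest slot
    std::uint32_t ebp = 0;

    void push(std::uint32_t v) {
        assert(esp >= 4);
        esp -= 4;
        slots[esp / 4] = v;
    }
    std::uint32_t pop() {
        assert(esp / 4 < slots.size());
        std::uint32_t v = slots[esp / 4];
        esp += 4;
        return v;
    }
    std::uint32_t at(std::uint32_t addr) const {
        assert(addr / 4 < slots.size());
        return slots[addr / 4];
    }
};

// Returns the argument as the callee would read it from its frame.
std::uint32_t call_process(Stack32& s) {
    s.push(0x14);                         // push 14h     (the argument, 20)
    s.push(0x0040EFAB);                   // call         (pushes the return address)
    s.push(s.ebp);                        // push ebp     (callee entry)
    s.ebp = s.esp;                        // mov ebp,esp
    std::uint32_t arg = s.at(s.ebp + 8);  // the argument sits at [ebp+8]
    s.esp = s.ebp;                        // mov esp,ebp  (callee exit)
    s.ebp = s.pop();                      // pop ebp
    s.pop();                              // ret          (pops the return address)
    s.esp += 4;                           // add esp,4    (caller drops the argument)
    return arg;
}
```

`call_process` returns 20 and leaves ESP and EBP exactly where they started, which is the effect of the final `add esp,4`.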

Other notes:

(1) The CPU has several general-purpose registers, such as eax, ebx, ecx, and edx. What we usually call ax, bx, cx, and dx are their low 16 bits.

(2) The segment registers identify the program's code segment, data segment, and stack segment. The code segment holds all the program code, the data segment holds the global variables, and the stack segment is the stack space.

(3) The VC compiler supports inline assembly. If you are interested, you can try it inside a function. The following code is just an example:

void process(int* q)
{
    __asm {
        push eax
        push ebx
        push ecx
        mov eax, 0x10
        mov ebx, 0x15
        add eax, ebx        // eax = 0x10 + 0x15
        mov ecx, q          // ecx = the pointer q
        mov [ecx], eax      // *q = eax
        pop ecx
        pop ebx
        pop eax
    }
}
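
The `__asm` block syntax is specific to Microsoft's compiler when targeting 32-bit x86. As a sketch, the same function can be guarded so that it also builds with other compilers; the guard macros `_MSC_VER` and `_M_IX86` are the ones MSVC predefines, and the fallback branch is simply the C++ statement the assembly implements.

```cpp
#include <cassert>

void process(int* q) {
    assert(q != nullptr);
#if defined(_MSC_VER) && defined(_M_IX86)
    // 32-bit MSVC only: inline assembly as in the example above.
    __asm {
        mov eax, 0x10
        add eax, 0x15
        mov ecx, q
        mov [ecx], eax
    }
#else
    // Everywhere else: the equivalent C++ statement.
    *q = 0x10 + 0x15;
#endif
}
```

Either branch stores 0x25 (37) into the integer that q points to.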

(End)

[Preview: the next post covers assembly language and pointers.]

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.
