"Assembly Language Program Design" study notes (3) C and assembly language

Source: Internet
Author: User
3.1 80x86 Assembly and C Language (1)

3.1.1 80x86 Assembly and C Language (1): The assembly-level system structure as seen by the programmer

How to generate assembly code from C code

gcc -O2 -S code.c -m32 -fno-omit-frame-pointer

-O2 enables a moderate level of optimization;
-S compiles the C source file code.c into an assembly file, code.s, and stops there;
-m32 generates 32-bit code.

Assembly language data formats

3.1.2 80x86 Assembly and C Language (1): A first assembly instruction

Example of a first assembly instruction

In addl, the l suffix indicates that two 32-bit (double-word) integers are added.

The add instruction has two operands. For the C statement t = x + y it takes the form addl 8(%ebp),%eax: the value stored at memory address EBP + 8 is added to the value in EAX, and the sum is written back to EAX. The register on the right is the destination, so it is both a source and the destination; the operand on the left is the second source.

Data transfer instructions (mov)

(
An immediate is a constant integer.
)

Operand type combinations supported by the data transfer instructions

A register name inside parentheses denotes a memory address: the register holds the address.

(
For example, (%eax) denotes the memory address held in EAX.
)

Simple addressing modes

When an operand accesses memory, the addressing mode specifies how its memory address is calculated.

(
Indirect addressing

Take movl (%ecx),%eax as an example:

The value in register ECX is used as a memory address. The data stored at that address is read and moved into register EAX.

Note:
1. A register in parentheses, such as (%ecx), denotes a memory address.
2. A constant written outside the parentheses has no $ prefix. With a $ prefix it would be an immediate constant, not part of an address.

Base address + offset addressing

Take movl 8(%ebp),%edx as an example:

The value of register EBP plus 8 gives the memory address. The data at that address is read and moved into EDX.
)

Indirect addressing

In movl (%ecx),%eax, the operand (%ecx) uses the value in register ECX as a memory address. The instruction reads the data stored at that address, not the address itself, and moves it into EAX. This is called indirect addressing.

Addressing with base address plus offset

The other mode is base address plus offset. A constant is written in front of the parentheses, as in 8(%ebp): the constant 8 is added to the value in the EBP register, the sum is used as the memory address, and the data at that address is read and moved to the destination. This is called base address plus offset addressing.
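The two addressing modes can be mimicked in C with pointers. The sketch below is only an illustration: the function names load_indirect and load_base_offset are made up here, and a byte pointer stands in for the register that holds the address.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Indirect addressing, as in movl (%ecx),%eax:
   the register holds an address; the data at that address is loaded. */
int32_t load_indirect(const unsigned char *ecx)
{
    int32_t value;
    memcpy(&value, ecx, sizeof value);            /* read 4 bytes at address ecx */
    return value;
}

/* Base address plus offset, as in movl 8(%ebp),%edx:
   the constant is added to the register value to form the address. */
int32_t load_base_offset(const unsigned char *ebp, ptrdiff_t offset)
{
    int32_t value;
    memcpy(&value, ebp + offset, sizeof value);   /* read 4 bytes at ebp + offset */
    return value;
}
```

With an array of four 32-bit integers as the memory region, load_base_offset(base, 8) reads the third element, because each element occupies 4 bytes.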

Note the notation. A register in parentheses, such as (%ecx), means its value is an address. A constant in front of the parentheses has no dollar sign; with a dollar sign it would be an immediate constant rather than part of an address, so do not add one. The constant and the register value are added to form the memory address, and the data read from that address is one of the operands.

3.1.3 80x86 Assembly and C Language (1): Addressing mode usage example

swap.c:

void swap(int *xp, int *yp)
{
        int t0 = *xp;
        int t1 = *yp;
        *xp = t1;
        *yp = t0;
}

The generated assembly file, swap.s, is:

$ gcc -O2 -S swap.c -m32 -fno-omit-frame-pointer
$ cat swap.s
        .file   "swap.c"
        .section        .text.unlikely,"ax",@progbits
.LCOLDB0:
        .text
.LHOTB0:
        .p2align 4,,15
        .globl  swap
        .type   swap, @function
swap:
.LFB0:
        .cfi_startproc
        pushl   %ebp
        .cfi_def_cfa_offset 8
        .cfi_offset 5, -8
        movl    %esp, %ebp
        .cfi_def_cfa_register 5
        pushl   %ebx
        .cfi_offset 3, -12
        movl    8(%ebp), %edx
        movl    12(%ebp), %eax
        movl    (%edx), %ecx
        movl    (%eax), %ebx
        movl    %ebx, (%edx)
        movl    %ecx, (%eax)
        popl    %ebx
        .cfi_restore 3
        popl    %ebp
        .cfi_restore 5
        .cfi_def_cfa 4, 4
        ret
        .cfi_endproc
.LFE0:
        .size   swap, .-swap
        .section        .text.unlikely
.LCOLDE0:
        .text
.LHOTE0:
        .ident  "GCC: (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609"
        .section        .note.GNU-stack,"",@progbits
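One way to read the listing is to rewrite swap with one C statement per movl instruction, using the register names as local variables. This is a reading aid rather than compiler output; the name swap_annotated is made up here, and the sketch assumes the two arguments are loaded from 8(%ebp) and 12(%ebp).

```c
/* Each statement is paired with the instruction from swap.s that implements it. */
void swap_annotated(int *xp, int *yp)
{
    int *edx = xp;     /* movl 8(%ebp),%edx   : load the first argument  */
    int *eax = yp;     /* movl 12(%ebp),%eax  : load the second argument */
    int ecx = *edx;    /* movl (%edx),%ecx    : t0 = *xp */
    int ebx = *eax;    /* movl (%eax),%ebx    : t1 = *yp */
    *edx = ebx;        /* movl %ebx,(%edx)    : *xp = t1 */
    *eax = ecx;        /* movl %ecx,(%eax)    : *yp = t0 */
}
```

The pushl/popl pairs around these six instructions save and restore EBP and EBX; they have no counterpart in the C source.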

Example Analysis

(
The video for this section walks through the first piece of assembly code in great detail, in particular how each instruction exchanges, moves, and stores data on the stack. For the transformation above, watching the video is recommended; only two screenshots are included here.

It also helps to understand the stack data structure before watching the video.
)

3.1.4 80x86 Assembly and C Language (1): Address calculation and other instructions (1)

Addressing modes

Addressing mode instances

Address Calculation Instructions

The address calculation instruction is leal. The l suffix means the operand is a double word. Like mov, it takes two operands, src and dest. Here src is an address calculation expression, which in its general form has four parts: base address + index × scale + displacement. Any legal address expression can serve as src.

dest is generally a register. leal assigns the address computed from the src expression to dest.

leal looks very similar to mov, but the two differ in an essential way. When the src of a mov is an address expression, mov computes the address, accesses memory at that address, and moves the data found there.

leal is simpler. The computed address is itself the data: once the address expression has been evaluated, the address is moved to dest without any memory access. This is the key difference, because leal performs only the address calculation.

Its main use is therefore address calculation without a memory access. Suppose C code defines an array x and computes p = &x[i], which is very common. leal does exactly this: it adds the base address of x to the index i multiplied by the element size (sizeof). The resulting address is the goal, and it is assigned to p. No memory is accessed.
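The address calculation behind p = &x[i] can be written out by hand in C. The sketch below is illustrative only; the function names are made up here, and neither function reads the array contents.

```c
#include <stddef.h>

/* What the programmer writes: take the address of element i. */
int *element_address(int *x, size_t i)
{
    return &x[i];
}

/* The same address computed explicitly: base + index * scale,
   where the scale is the element size. */
int *element_address_manual(int *x, size_t i)
{
    return (int *)((char *)x + i * sizeof(int));
}
```

Both functions return the same pointer for any index inside the array, which is the base + index × scale pattern that a single leal can evaluate.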

The second use is more subtle. Because an address expression has four elements (base, index, scale, and the constant displacement D), leal can evaluate integer expressions of the form x + k*y, where x and y are registers (that is, variables) and k is a constant scale factor (1, 2, 4, or 8). If an integer calculation can be written in this form, a single leal is usually more efficient than separate add, shift, or multiply instructions. This is why compilers use leal heavily, not only for address calculation but also for integer arithmetic.
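Two small C functions show integer expressions that fit the x + k*y pattern. The function names are made up here, and the instructions named in the comments are examples of what a compiler could emit; the actual output depends on the compiler and its flags.

```c
/* x + 4*y fits base + index*scale directly,
   for example leal (%eax,%edx,4),%eax. */
int add_scaled(int x, int y)
{
    return x + 4 * y;
}

/* x * 12 split into two steps that avoid a multiply instruction. */
int times12(int x)
{
    int t = x + 2 * x;   /* 3x: fits the form leal (%eax,%eax,2),%eax */
    return t * 4;        /* 12x: a left shift by 2 */
}
```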

An arithmetic right shift differs from a logical right shift in what fills the vacated high-order bits. A logical right shift fills them with 0. An arithmetic right shift fills them with the sign bit (the highest bit) of the value being shifted.
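The difference can be checked in C. In the sketch below the function names are made up, and the arithmetic shift is written without using >> on a negative signed value, because the C standard leaves that result implementation-defined. The shift count n is assumed to be between 0 and 31.

```c
#include <stdint.h>

/* Logical right shift: vacated high bits are filled with 0.
   In C, >> on an unsigned value is always logical. */
uint32_t shift_right_logical(uint32_t x, unsigned n)
{
    return x >> n;
}

/* Arithmetic right shift: vacated high bits are filled with the sign bit. */
int32_t shift_right_arithmetic(int32_t x, unsigned n)
{
    if (x >= 0)
        return (int32_t)((uint32_t)x >> n);
    /* Negative x: invert the bits, shift logically, then invert back.
       Inverting r is written as -r - 1 to stay within signed arithmetic. */
    uint32_t inverted = ~(uint32_t)x;
    uint32_t r = inverted >> n;
    return -(int32_t)r - 1;
}
```

For the bit pattern of -8 shifted right by one, the logical shift gives 0x7FFFFFFC, a large positive number, while the arithmetic shift gives -4.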

3.1.5 80x86 Assembly and C Language (1): Address calculation and other instructions (2)

Using the leal instruction for calculations (example 1)

Example 2

In sarl, a stands for arithmetic and r for right shift; the l suffix indicates a double-word operation.

3.1.6 80x86 Assembly and C Language (1): x86-64 general-purpose registers and assembly instructions, a first look

Data type widths on x86-32 and x86-64

General-purpose registers of x86-64

RSP is still reserved as the stack pointer register.

swap on x86-32

x86-64 under the ...

swap with long int arguments on x86-64

Summary

format of the x86 assembly

Exercises

3.2 80x86 Assembly and C Language (2)

3.2.1 80x86 Assembly and C Language (2): Condition codes

Control flow

The assembly-level system structure as seen by the programmer (partial)

Condition Code

First, what are the condition codes? There are four of them, each one bit wide; you can think of them as a four-bit register in which each bit stores one flag.

The first is CF, the carry flag. The second is SF, the sign flag. The third is ZF, the zero flag, which records whether the result of a calculation is 0. The fourth is OF, the overflow flag.

CF can be used to detect overflow in unsigned integer operations. Recall that when two unsigned integers are added, a carry out of the highest bit indicates that the unsigned operation has overflowed.

If the result of the addition is negative, its highest bit is 1, and that bit is treated as the sign. In that case SF is set to 1; otherwise it is set to 0. Each flag is set according to the result.

Recall from earlier lectures that at the machine level every integer is just a string of 0s and 1s. The storage format alone does not show whether a value is signed or unsigned. Because of the properties of two's complement, addition and subtraction of signed values and of unsigned values are carried out by the same circuit and the same instruction. So add does not distinguish between signed and unsigned integers.

The difference is that when add executes, it sets both flags, CF and OF. Treating the two operands as unsigned integers, CF is set if the addition overflows. Treating them as signed integers, OF is set if the addition overflows. In other words, the hardware covers both interpretations.
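The four flags can be modelled in C for a 32-bit addition. This is a sketch of the rules described above, not a description of how the hardware is built; the struct and function names are made up here.

```c
#include <stdint.h>

struct flags { int cf, zf, sf, of; };

/* Model of the four condition codes after a 32-bit add of a and b. */
struct flags add_flags(uint32_t a, uint32_t b)
{
    uint32_t sum = a + b;                  /* wraps modulo 2^32, like addl */
    struct flags f;
    f.cf = sum < a;                        /* carry out of the top bit: unsigned overflow */
    f.zf = sum == 0;                       /* result is zero */
    f.sf = (int)((sum >> 31) & 1u);        /* highest bit of the result */
    /* Signed overflow: both operands have the same sign bit
       and the result has a different one. */
    f.of = (int)(((~(a ^ b) & (a ^ sum)) >> 31) & 1u);
    return f;
}
```

Adding 0xFFFFFFFF and 1 sets CF but not OF: as unsigned numbers the sum overflows, while as signed numbers it is -1 + 1 = 0. Adding 0x7FFFFFFF and 1 does the opposite: no carry, but the signed result wraps to a negative value, so OF is set.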

We asked earlier: if the hardware does not distinguish signed from unsigned numbers, who does? The compiler knows whether a value is signed or unsigned, but to act on that it relies on hardware support, namely these condition codes. If the compiler treats two numbers as unsigned, it checks for overflow through CF; if it treats them as signed, it uses OF.

3.2.2 80x86 Assembly and C Language (2): Compare and test instructions

Condition codes

3.2.3 80x86 Assembly and C Language (2): Instructions that read the condition codes (1)

Reading the condition codes

3.2.4 80x86 Assembly and C Language (2): Instructions that read the condition codes (2)

Condition codes

Next, reading the condition codes. Here is a typical, simple C function, and we look at how GCC translates it into assembly. At the C level, the function compares the sizes of two numbers and returns the result of the comparison.

How is this implemented at the assembly level? A setX instruction reads the current condition codes, or some combination of them, and writes the result into a destination byte register. It writes only one byte; the remaining three bytes of the 32-bit register are not modified. So when the destination is the lowest byte of a register, only that byte changes.
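The effect of a one-byte write on a 32-bit register can be modelled with bit masks. The sketch below is illustrative; the function names are made up, and the register is represented as a plain uint32_t. The second helper shows why a zero extension is needed afterwards: it clears the three upper bytes that the byte write left untouched.

```c
#include <stdint.h>

/* Model of a setX instruction: writes 0 or 1 into the low byte only;
   the upper three bytes keep their old contents. */
uint32_t set_low_byte(uint32_t reg, int condition)
{
    return (reg & 0xFFFFFF00u) | (condition ? 1u : 0u);
}

/* Model of a byte-to-double-word zero extension:
   keep the low byte and clear the upper three bytes. */
uint32_t zero_extend_low_byte(uint32_t reg)
{
    return reg & 0xFFu;
}
```

Starting from a register holding 0xAABBCCDD, a satisfied condition gives 0xAABBCC01 after the byte write and 0x00000001 after the zero extension.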

What about the upper three bytes? In general a movzbl instruction is then used to zero-extend the destination register. As the name says, and as covered in an earlier lesson, it is a move instruction with z for zero extension, from b (a byte) to l (a double word): the 8-bit value is extended to 32 bits by filling the upper bits with zeros.

3.2.5 80x86 Assembly and C Language (2): Reading the condition codes on x86-64

Reading the condition codes on x86-64

The manual explains the reason, so it is worth reading closely what it says.

On the 64-bit x86-64 architecture, a 32-bit operation that produces a 32-bit result automatically zero-extends it into the upper 32 bits of the 64-bit register. For example, the first instruction here XORs EAX with itself, which of course yields 0, and that 0 is automatically extended into the upper 32 bits of RAX.

This looks a little odd at first. Judging from the instruction, the destination register is EAX and the operation is 32 bits wide, so why are the upper 32 bits of RAX automatically cleared to 0?
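The rule can be modelled in C with a 64-bit integer standing in for RAX. The function names are made up for this sketch. The 8-bit case is included only for contrast: a byte write keeps the remaining bits, whereas a 32-bit write clears the upper half.

```c
#include <stdint.h>

/* Model of a 32-bit write to a 64-bit register on x86-64:
   the upper 32 bits are cleared, whatever they held before. */
uint64_t write_32bit_result(uint64_t rax, uint32_t result)
{
    (void)rax;                      /* the old contents do not survive */
    return (uint64_t)result;
}

/* For contrast, a model of an 8-bit write: the other 56 bits are kept. */
uint64_t write_8bit_result(uint64_t rax, uint8_t result)
{
    return (rax & ~(uint64_t)0xFF) | result;
}
```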

The reason lies in the processor pipeline. Clearing the upper bits reduces data dependences between different instructions running in the pipeline. The details are not covered here.

3.2.6 80x86 Assembly and C Language (2): Jump instructions

Jump instructions

Condition codes are used heavily for conditional execution. A preceding instruction such as add, test, or compare performs some operation; a compare in particular determines which of two values is larger. Depending on the outcome, the program takes a different if/else path, and that requires jump instructions, especially conditional jumps. Much of the use of condition codes is at this level.

jX: the instruction name starts with j for jump, followed by a suffix that states which condition code, or combination of condition codes, the jump depends on. Depending on the current condition codes, execution either continues with the next instruction in sequence or jumps to another location. The first of these, jmp, is the unconditional jump: its condition is always satisfied.

The rest are conditional. The suffixes are the same as for set: where we had sete and setne, here we have je and jne. For example, je jumps if ZF is 1, meaning the result of the comparison was zero, that is, the operands were equal. If the condition holds, the jump is taken; otherwise execution continues in sequence.

jne works the same way, and so do js and jns. As for ja, jb, jg, and jl: as discussed for set, one group is used for signed integer comparisons and the other for unsigned comparisons, and the two must be kept apart. The groups are exactly the same as for set: jg and jl handle signed numbers, ja and jb handle unsigned numbers.

Conditional jump instruction example

Look at the cmpl instruction. cmpl compares EAX with EDX, which actually computes EDX minus EAX. Before it, movl 8(%ebp),%edx and movl 12(%ebp),%eax simply load the two parameters x and y. After the comparison, if x is greater than y, the code computes EDX minus EAX, the larger minus the smaller. If x is less than or equal to y, it jumps to code that computes EAX minus EDX instead. Either way the difference ends up in EAX, which holds the return value. That is the overall logic.
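The control flow described here can be written in C with goto, which keeps the shape of the assembly visible. This is a sketch of the logic only; the function name absdiff_goto and the labels are made up, and the comments name the kind of instruction each line stands for.

```c
/* Goto form of the compare-and-jump logic: return the larger value
   minus the smaller one. */
int absdiff_goto(int x, int y)
{
    int result;
    if (x <= y)            /* compare, then conditional jump on "less or equal" */
        goto else_branch;
    result = x - y;        /* fall-through path: x > y */
    goto done;             /* unconditional jump over the other path */
else_branch:
    result = y - x;        /* jump target: x <= y */
done:
    return result;         /* the difference is the return value */
}
```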

3.2.7 80x86 Assembly and C Language (2): Conditional move instructions

C language: conditional expressions

x86-64 ...

Here is a cmovle. Let's guess what it does. Notice first that there is no conditional jump. cmov is a conditional move instruction, a move that happens only on a condition. What is the condition here? It is the suffix, le.

le is one of the suffixes already seen with the set and jump instructions, and it means the same thing here: if the le condition is satisfied, the data is moved from src to dest. The suffix after cmov is the same as the suffix after set and after j. In this code, if EDI is less than or equal to ESI, EDX is moved into EAX; otherwise nothing happens.

Now look back at the code. It first computes both values, x minus y and y minus x, and puts one of them in EAX, because the return value is passed in EAX. Then it compares. If the le condition is not satisfied, the cmovle acts as an empty instruction and is simply passed over. If the le condition is satisfied, EDX is moved into EAX, replacing the value placed there first with the other difference, and that is returned. One obvious benefit is that a conditional move instruction has replaced a conditional jump instruction. Then why did we not discuss this for x86-32?
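The same logic in C: both differences are computed first, and a conditional assignment then picks one. The function name absdiff_cmov is made up for this sketch, and whether a compiler actually emits cmovle for it depends on the compiler, the target, and the optimization flags.

```c
/* Compute both candidate results, then select one without taking
   a different code path. */
int absdiff_cmov(int x, int y)
{
    int result = x - y;    /* placed in the return register first */
    int other  = y - x;    /* the alternative difference */
    if (x <= y)            /* stands in for cmovle: overwrite result when x <= y */
        result = other;
    return result;
}
```

Unlike the jump version, every statement here executes on every call; only the final assignment is conditional.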

3.3 80x86 Assembly and C Language (2, cont.)

3.3.1 80x86 Assembly and C Language (2, cont.): Architectural background of conditional move instructions (1)

Microarchitecture background *

3.3.2 80x86 Assembly and C Language (2, cont.): Architectural background of conditional move instructions (2)

Microarchitecture background *

A modern processor can fetch several instructions at once and feed them into the pipeline: rather than one instruction per cycle, it reads multiple instructions per cycle. As a result, dozens or even hundreds of instructions may be running in the pipeline at the same time. On the one hand this is good, because throughput improves.

For a jump instruction, especially a conditional jump, it creates a problem. A conditional jump is critical because it decides whether the next instruction is the one that follows in sequence or the one at the jump target address.

Take the code above as an example. Execution is divided into a number of pipeline stages, so when a conditional jump enters the pipeline, the processor cannot know immediately whether it will jump. The outcome may not be known until the middle of the pipeline or even a later stage. The question is which instructions to fetch immediately after the jump.

This is a big problem: should the processor continue in sequence or assume the jump is taken? It has to fetch the next instructions right away, yet it does not know the direction. A naive solution is to fetch nothing after a conditional jump and wait until the jump has finished executing before continuing. That is far too wasteful.

The alternative is to bet on whether the jump is taken. If the processor bets that it is not taken, it keeps fetching the instructions that follow in sequence. If the jump turns out to be taken, all of those instructions must be cancelled. In a deep pipeline, cancelling many instructions, possibly dozens, is very inefficient.

So a conditional jump may cost performance either way: waiting for the outcome is a loss, and guessing wrong is a loss. The jumps should therefore be eliminated where possible.

Conditional jump instructions often cause a performance penalty, so they should be eliminated as much as possible.

How can they be eliminated? By using conditional move instructions: one conditional move instruction (cmov) replaces one conditional jump instruction.

Recall the code snippet just now: its instructions execute sequentially, without any branch.

The conditional jump is gone, replaced by one conditional move instruction.

But conditional move instructions also have significant limitations.

(See video for detailed reasons)

3.3.3 80x86 Assembly and C Language (2, cont.): Assembly representation of loops (1)

"Do-while" Loop

3.3.4 80x86 Assembly and C Language (2, cont.): Assembly representation of loops (2)

"While" loop (version 1)

"While" loop (version 2,do-while mode)

"for" Loop

"for" –> "while" –> "Do-while"

Supplement

3.3.5 80x86 Assembly and C Language (2, cont.): Assembly representation of loops (3)

"While" loop, version 3 (jump-to-middle)

Jump-to-middle Instances

"for" –> "while" (Jump-to-middle)

jump-to-middle Mode
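The loop translations named in these headings can be sketched in C with goto. The example loop (summing 1 to n) and all function names are chosen here for illustration and are not taken from the lecture slides. The first function is the original while loop, the second is the do-while mode with a guard test in front, and the third is the jump-to-middle mode.

```c
/* Original while loop: sum of 1..n. */
int sum_while(int n)
{
    int sum = 0, i = 1;
    while (i <= n) {
        sum += i;
        i++;
    }
    return sum;
}

/* Do-while mode: test once up front, then loop with the test at the bottom. */
int sum_dowhile_mode(int n)
{
    int sum = 0, i = 1;
    if (!(i <= n))
        goto done;         /* guard: skip the loop entirely */
loop:
    sum += i;
    i++;
    if (i <= n)
        goto loop;         /* test at the bottom of the body */
done:
    return sum;
}

/* Jump-to-middle mode: jump straight to the test, which sits after the body. */
int sum_jump_to_middle(int n)
{
    int sum = 0, i = 1;
    goto test;             /* first jump goes into the middle of the loop */
loop:
    sum += i;
    i++;
test:
    if (i <= n)
        goto loop;
    return sum;
}
```

All three compute the same value; they differ only in where the loop test is placed and how the first test is reached.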

3.3.6 80x86 Assembly and C Language (2, cont.): Architectural background of the loop representation

Microarchitecture background *

Back to the microarchitecture background. As discussed, conditional jump instructions often cause a performance loss. How do we eliminate it? One method is the conditional move, but its scope is limited, and in many cases the jump cannot be eliminated. What then? The processor has to guess whether the jump is taken. It guesses using branch prediction, also called jump prediction. The principle is actually very simple.

Branch prediction introduces a table used to make the predictions. A prediction is based on history: an instruction at some address is a branch instruction, and once it has executed, the processor records what happened.

The table has two main columns, plus a small column on the right. The left column holds the PC address of a conditional jump instruction that has been executed, so the PC identifies the instruction. The column to its right holds the target address: if that jump was taken in the past, the address it jumped to is recorded there. The small column on the far right holds the prediction of whether the jump will be taken.

The principle is very simple: the prediction is based on history. Suppose a single bit records whether the jump was taken last time. If it was taken, the bit is 1 and the processor predicts taken; if it was not, the bit is 0 and the processor predicts not taken. If a prediction turns out to be wrong, the bit is flipped, for example from 0 to 1, so the next time the state is 1 and the processor predicts taken. The amount of recorded history can also be extended beyond a single bit.

In short, whether a branch instruction jumps is predicted from its history: if it jumped in the past, it is predicted to jump again; if it did not, it probably will not.
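The one-bit scheme described above can be simulated in a few lines of C. This is a simplified sketch under stated assumptions: the table size, the indexing by PC modulo the table size, the default prediction of "not taken" for an unknown branch, and all names are choices made here, not details from the lecture.

```c
#include <stdint.h>

#define TABLE_SIZE 16   /* hypothetical table size */

struct entry {
    uint32_t pc;            /* address of the conditional jump */
    uint32_t target;        /* address it jumped to when last taken */
    int predict_taken;      /* 1-bit history: 1 = taken last time */
    int valid;              /* 0 until this slot has been filled */
};

struct predictor { struct entry table[TABLE_SIZE]; };

/* Predict: 1 = jump, 0 = continue in sequence. Unknown branches predict 0. */
int predict(const struct predictor *p, uint32_t pc)
{
    const struct entry *e = &p->table[pc % TABLE_SIZE];
    return (e->valid && e->pc == pc) ? e->predict_taken : 0;
}

/* Record the actual outcome so that the next prediction repeats it. */
void update(struct predictor *p, uint32_t pc, int taken, uint32_t target)
{
    struct entry *e = &p->table[pc % TABLE_SIZE];
    if (!e->valid || e->pc != pc) {     /* new branch: claim the slot */
        e->pc = pc;
        e->target = 0;
        e->valid = 1;
    }
    e->predict_taken = taken;
    if (taken)
        e->target = target;
}

/* Simulate a loop-closing branch: taken on every iteration except the last.
   Returns how many predictions were wrong. */
int count_mispredictions(struct predictor *p, uint32_t pc,
                         uint32_t target, int iterations)
{
    int misses = 0;
    for (int i = 0; i < iterations; i++) {
        int taken = (i < iterations - 1);
        if (predict(p, pc) != taken)
            misses++;
        update(p, pc, taken, target);
    }
    return misses;
}
```

For a loop of 10 iterations the one-bit predictor is wrong twice: on the first iteration, because the recorded state says "not taken", and on the last, because the state says "taken". Running the same loop again gives two mispredictions again, since the final fall-through resets the bit to 0.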

Branch Prediction
