1. Assume that a C program consists of two files, p1.c and p2.c. On an IA32 machine, we compile the code from the Unix command line as follows:
UNIX> gcc -O1 -o p p1.c p2.c
The gcc command actually invokes a sequence of programs to turn the source code into executable code. First, the C preprocessor expands the source code, inserting every file named in an #include directive and expanding every macro defined with #define. Second, the compiler generates assembly code for the two source files, named p1.s and p2.s. Next, the assembler converts the assembly code into binary object-code files, p1.o and p2.o. Object code is one form of machine code: it contains the binary representation of all instructions, but the addresses of global values have not yet been filled in. Finally, the linker merges the two object files with the code that implements library functions (such as printf) and generates the final executable file, p. Executable code is the second form of machine code considered here; it is the exact form of code that the processor executes.
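As a small illustration of these stages, the sketch below shows what a hypothetical p1.c might contain. The macro SQUARE and the function sum_of_squares are invented for this example and are not part of the original program. With GCC, the -E option stops after preprocessing, -S stops after generating assembly (producing p1.s), and -c stops after assembling (producing p1.o), so each intermediate form can be inspected.

```c
/* Hypothetical p1.c; the names below are illustrative only. */
#define SQUARE(x) ((x) * (x))   /* expanded by the preprocessor */

/* After preprocessing, the return expression reads
 * ((a) * (a)) + ((b) * (b)); the macro name is gone. */
int sum_of_squares(int a, int b)
{
    return SQUARE(a) + SQUARE(b);
}
```

Running gcc -O1 -S p1.c on such a file and reading p1.s is a simple way to see how the C code maps to assembly.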
During compilation, the compiler does most of the work, translating programs expressed in the relatively abstract execution model of C into the very elementary instructions that the processor executes. Assembly code is very close to machine code. Its main difference from the binary format of machine code is that it is written in a more readable textual format. Being able to understand assembly code and how it relates to the original C code is a key step in understanding how computers execute programs.
2. Operand specifiers (Ea: any register; Eb: base register; Ei: index register; s: scale factor, which must be 1, 2, 4, or 8)
Type | Form | Operand value | Name
Immediate | $Imm | Imm | Immediate
Register | Ea | R[Ea] | Register
Memory | Imm | M[Imm] | Absolute
Memory | (Ea) | M[R[Ea]] | Indirect
Memory | Imm(Ea) | M[Imm + R[Ea]] | Base + displacement
Memory | (Eb, Ei) | M[R[Eb] + R[Ei]] | Indexed
Memory | Imm(Eb, Ei) | M[Imm + R[Eb] + R[Ei]] | Indexed
Memory | (, Ei, s) | M[R[Ei] * s] | Scaled indexed
Memory | Imm(, Ei, s) | M[Imm + R[Ei] * s] | Scaled indexed
Memory | (Eb, Ei, s) | M[R[Eb] + R[Ei] * s] | Scaled indexed
Memory | Imm(Eb, Ei, s) | M[Imm + R[Eb] + R[Ei] * s] | Scaled indexed
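The memory forms in the table are all special cases of the most general one, Imm(Eb, Ei, s). A minimal sketch of that address calculation in C follows; the function name effective_address and its parameters are made up for illustration, and register contents are simply passed in as integers.

```c
#include <stdint.h>

/* Address computed by the general form Imm(Eb, Ei, s):
 *     Imm + R[Eb] + R[Ei] * s
 * The simpler forms drop parts of this sum: pass 0 for a missing
 * displacement, base, or index, and 1 for a missing scale factor. */
uint32_t effective_address(uint32_t imm, uint32_t base,
                           uint32_t index, uint32_t scale)
{
    return imm + base + index * scale;
}
```

For example, with R[Eb] = 0x100 and R[Ei] = 0x3, the operand 8(Eb, Ei, 4) refers to address 0x8 + 0x100 + 0x3 * 4 = 0x114.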
3. In the mov class, the data movement instructions copy the value of the source operand to the destination operand. The source operand designates a value that is an immediate, stored in a register, or stored in memory. The destination operand designates a location that is either a register or a memory address. IA32 imposes one restriction: the two operands of a move instruction cannot both refer to memory locations. Copying a value from one memory location to another therefore takes two instructions: the first loads the source value into a register, and the second writes that register's value to the destination.
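The two-instruction rule can be seen in a memory-to-memory copy written in C. The function copy_word below is a hypothetical example, and the assembly in the comment is only a sketch of what a compiler would typically emit; the actual register choice depends on the compiler.

```c
/* Copy one int from *src to *dst. */
void copy_word(const int *src, int *dst)
{
    /* IA32 has no memory-to-memory mov, so this assignment needs
     * two instructions, for example (registers are illustrative):
     *     movl (%eax), %edx     load the source value into a register
     *     movl %edx, (%ecx)     store the register to the destination
     */
    *dst = *src;
}
```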
4. Arithmetic and logical operations
The load effective address instruction leal is actually a variant of the movl instruction. It has the form of an instruction that reads from memory into a register, but it does not reference memory at all. Its first operand looks like a memory reference, but instead of reading from the designated location, the instruction writes the effective address itself to the destination operand. This instruction can be used to generate pointers for later memory references.
leal S, D    D <- &S
In addition, it can compactly express common arithmetic operations. For example, given leal 7(%edx, %edx, 4), %eax, if %edx holds the value x, then %eax is set to 5x + 7.
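The arithmetic in this example can be written out in C to make the correspondence with the operand form explicit. The function name five_x_plus_seven is invented for illustration; whether a compiler actually chooses leal for such code depends on the compiler and optimization level.

```c
/* Value that leal 7(%edx, %edx, 4), %eax leaves in %eax
 * when %edx holds x. */
int five_x_plus_seven(int x)
{
    /* base + index * scale + displacement = x + x*4 + 7 = 5x + 7 */
    return x + x * 4 + 7;
}
```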
5. Control (p. 125)
Assembly code does not record the types of program values. Instead, different instructions determine the operand sizes and whether the operands are treated as signed or unsigned.
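A short sketch of how the same bit pattern gives different results depending on which comparison the compiler selects. The function names are hypothetical. For the signed version a compiler typically uses a signed condition (such as setl or jl), and for the unsigned version an unsigned one (such as setb or jb).

```c
/* Signed comparison: the operands are interpreted as two's complement. */
int less_signed(int a, int b)
{
    return a < b;
}

/* Unsigned comparison: the same bits are interpreted as non-negative. */
int less_unsigned(unsigned a, unsigned b)
{
    return a < b;
}
```

Calling less_signed(-1, 1) yields 1, while less_unsigned((unsigned)-1, 1u) yields 0, because the bit pattern of -1 is the largest unsigned value.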
In assembly code, a jump target is given as a label, for example ".L1". An indirect jump is written as "*" followed by an operand specifier (an indirect jump reads the jump target from a register or a memory location). Conditional jumps can only be direct. When assembled, jump instructions have several different encodings, but the most commonly used are PC-relative (PC stands for program counter): they encode the difference between the address of the target instruction and the address of the instruction immediately following the jump. These offsets can be encoded in 1, 2, or 4 bytes. A second encoding method gives an "absolute" address, using 4 bytes to specify the target directly.
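The PC-relative calculation can be sketched in C for the 1-byte case. The function name jump_target and the addresses used below are illustrative assumptions, not taken from a particular program.

```c
#include <stdint.h>

/* Target of a PC-relative jump with a 1-byte signed offset.
 * next_instr_addr is the address of the instruction that follows
 * the jump; the offset is sign-extended before being added. */
uint32_t jump_target(uint32_t next_instr_addr, int8_t offset)
{
    return next_instr_addr + (uint32_t)(int32_t)offset;
}
```

For instance, if the instruction after the jump is at address 0xa and the encoded offset is 0x0d, the target is 0x17. An offset byte of 0xf3 (decimal -13) following an instruction at 0x17 jumps backward to 0xa.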
The switch statement provides a multiway branch based on the value of an integer index. It is particularly useful for tests with a large number of possible outcomes. Switch statements not only make the C code more readable, they also allow an efficient implementation using a data structure called a jump table. A jump table is an array in which entry i is the address of the code segment implementing the action the program should take when the switch index equals i. The code uses the switch index to perform an array reference into the jump table and so determine the target of a jump instruction.
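The idea of a jump table can be imitated in standard C with an array of function pointers. This is only an analogy under stated assumptions: a compiler's jump table holds addresses of code segments inside one function, whereas here each hypothetical handler (case0, case1, case2, fallback) is a separate function.

```c
/* Each handler plays the role of one code segment of the switch. */
static int case0(int x) { return x + 10; }
static int case1(int x) { return x * 2; }
static int case2(int x) { return x - 1; }
static int fallback(int x) { return x; }    /* the default case */

int dispatch(int index, int x)
{
    /* Entry i holds the address of the code to run when index == i. */
    static int (*const table[])(int) = { case0, case1, case2 };
    const unsigned n = sizeof table / sizeof table[0];

    /* One unsigned comparison rejects both negative and too-large
     * indices, since a negative int converts to a large unsigned value. */
    if ((unsigned)index >= n)
        return fallback(x);

    return table[index](x);                 /* indexed, indirect call */
}
```

The cost of choosing a case is one bounds check and one array lookup, independent of the number of cases.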
The general form of the if-else statement in C is as follows:
I