Information Security System Design Foundation Fourth Week study summary

Last Update:2015-10-11 Source: Internet

Author: User

Chapter III Machine-Level Representation of Programs

3.1 Historical View

The Intel processor family, commonly known as x86, has evolved over a long period.

Each successive processor is designed to be backward compatible, meaning that code compiled for earlier versions can run on newer processors.

3.2 Program code

Suppose we write a C program as two files, p1.c and p2.c. We can compile this code on an IA32 machine using the following UNIX command line:

unix> gcc -O1 -o p p1.c p2.c

The command gcc invokes the GCC C compiler.

The -O1 flag tells the compiler to apply the first level of optimization. (Raising the optimization level generally makes the final program run faster, but compilation takes longer and debugging becomes more difficult.)

In fact, the gcc command invokes a series of programs that convert the source code into executable code:

First, the C preprocessor expands the source code.

The compiler then generates assembly code for the two source files, in files with the suffix .s.

Next, the assembler translates the assembly code into binary object code, in files with the suffix .o.

Finally, the linker merges the two object-code files with the code that implements the library functions, producing the final executable file p.

3.2.1 Machine-Level code

The computer system uses a variety of different forms of abstraction, using simpler abstract models to hide the details of the implementation.

There are two important abstractions for machine-level programming:

1. The format and behavior of a machine-level program are defined by the instruction set architecture (ISA), which specifies the processor state, the format of the instructions, and the effect each instruction has on that state.

Most ISAs describe program behavior as if the instructions were executed in sequential order.

2. The memory addresses used by a machine-level program are virtual addresses, which provide a memory model that appears to be a very large byte array.

The compiler does most of the work in the overall compilation sequence. Assembly code is very close to machine code; its main feature is that, unlike the binary format of machine code, it is expressed in a more readable text format.

IA32 machine code differs greatly from the original C code. Parts of the processor state that are normally hidden from the C programmer are visible:

• Program counter (PC, called %eip in IA32): holds the address in memory of the next instruction to be executed.

• Integer registers: eight named locations, each storing a 32-bit value, which can hold addresses (corresponding to C pointers) or integer data.

• Condition code registers: hold status information about the most recently executed arithmetic or logical instruction, used to implement conditional changes in control or data flow.

• Floating-point registers: store floating-point data.

3.2.2 Code Example

To inspect the contents of object-code files, the most valuable tool is a disassembler. On Linux, objdump with the -d command-line flag serves this role.

Several features of machine code and its disassembled representation are worth noting:

• IA32 instructions range in length from 1 to 15 bytes.

• The instruction format is designed so that, from a given starting position, there is a unique decoding of the bytes into machine instructions.

• The disassembler determines the assembly code based purely on the byte sequences in the machine-code file; it does not require access to the program's source code or assembly code.

• The disassembler uses a somewhat different naming convention for instructions than the assembly code generated by GCC.
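As a small illustration, the following C file is the kind of input one might compile and then disassemble. The file name code.c and the names accum and sum are assumptions chosen for this sketch; the commands in the comment use standard GCC and objdump flags, and -m32 assumes that 32-bit support is installed on a 64-bit system.

```c
/*
 * code.c (hypothetical file name): a tiny translation unit to inspect.
 *
 * Commands one might try:
 *   gcc -O1 -S code.c      writes assembly code to code.s
 *   gcc -O1 -c code.c      writes object code to code.o
 *   objdump -d code.o      disassembles code.o
 * On a 64-bit system, adding -m32 asks GCC for IA32 code instead.
 */
int accum = 0;

/* Adds x and y, accumulates the result in a global, and returns it. */
int sum(int x, int y)
{
    int t = x + y;
    accum += t;
    return t;
}
```

Comparing code.s with the objdump output is a quick way to see the naming differences mentioned above, such as size suffixes that the disassembler may omit.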

3.3 Data formats

IA32 defines a representation for each of the basic C data types.

Most assembly-code instructions generated by GCC have a single-character suffix that denotes the size of the operand, for example:

movb: move byte (8 bits)

movw: move word (16 bits)

movl: move double word (32 bits)

3.4 Access Information

An IA32 central processing unit (CPU) contains a set of eight registers that store 32-bit values: %eax, %ecx, %edx, %ebx, %esi, %edi, %esp, and %ebp.

3.4.1 Operand Specifiers

IA32 supports a number of operand formats. Operands fall into three types:

• Immediate: a constant value

• Register: the contents of a register

• Memory: a memory location accessed at a computed effective address

3.4.2 Data Transfer Instructions

The data movement instructions are grouped into instruction classes: the instructions in a class perform the same operation, but on operands of different sizes.

The instructions in the mov class copy the value of the source operand to the destination operand.

Both the movs and movz instruction classes copy a smaller source value to a larger destination, using sign extension (movs) or zero extension (movz).

Sign extension: all high-order bits of the destination are filled with the most significant bit of the source value.

Zero extension: the high-order bits are filled with zeros.
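The two kinds of extension can be observed from C through integer conversions. The sketch below is only an illustration: the function names are made up for this example, and whether a compiler actually emits movsbl or movzbl for them depends on the target and optimization settings.

```c
#include <stdint.h>

/* Sign extension: the new high-order bits copy the source's sign bit,
 * as a movsbl instruction does for a byte-to-double-word move. */
int32_t sign_extend_byte(int8_t b)
{
    return (int32_t)b;
}

/* Zero extension: the new high-order bits are all 0,
 * as a movzbl instruction does for a byte-to-double-word move. */
uint32_t zero_extend_byte(uint8_t b)
{
    return (uint32_t)b;
}
```

For the byte pattern 0xF0, sign extension yields 0xFFFFFFF0 (the value -16), while zero extension yields 0x000000F0 (the value 240).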

The pushl and popl instructions push data onto the program stack and pop data off it.

3.5 Arithmetic and logical operations

3.5.1 Load Effective Address

Instruction: leal S, D. Effect: D <- &S

Writes the effective address of the source operand to the destination operand, which must be a register.
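Although leal has the form of a memory reference, it does not access memory, so compilers often use it for compact arithmetic. The following sketch shows an expression of the kind that may be compiled this way; the function name and the register choice in the comment are illustrative assumptions, not guaranteed compiler output.

```c
/* Computes x + 4*y + 7. A compiler targeting IA32 may implement this with
 * a single instruction along the lines of
 *     leal 7(%eax,%edx,4), %eax
 * where the registers shown are only an example. */
int scale_add(int x, int y)
{
    return x + 4 * y + 7;
}
```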

3.5.2 Unary and Binary Operations

Unary operation: has a single operand that serves as both source and destination; it can be a register or a memory location.

Binary operation: the second operand serves as both a source and the destination; the two operands cannot both be memory locations.

3.5.3 Shift Operations

The shift amount is given first and the value to shift second. Both arithmetic and logical right shifts are available, but the shift amount is limited to 0 to 31 bits.
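The difference between the two right shifts can be sketched in C. The function names are hypothetical, and masking the shift amount with 31 is done here to keep the C code well defined while mirroring the 0 to 31 range. Note the labeled assumption about signed shifts: the C standard leaves right shifts of negative values implementation-defined, although GCC documents them as arithmetic.

```c
#include <stdint.h>

/* Logical right shift: vacated high-order bits are filled with 0,
 * corresponding to the shrl instruction. */
uint32_t shift_right_logical(uint32_t x, unsigned k)
{
    return x >> (k & 31);   /* keep only the low 5 bits of the shift amount */
}

/* Arithmetic right shift: vacated bits copy the sign bit, corresponding
 * to the sarl instruction.
 * Assumption: the compiler shifts negative signed values arithmetically. */
int32_t shift_right_arithmetic(int32_t x, unsigned k)
{
    return x >> (k & 31);
}
```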

3.6 Control

3.6.1 Condition Code

CF: carry flag

ZF: zero flag

SF: sign flag

OF: overflow flag

3.6.2 Access Condition Code

Condition codes are usually not read directly. They are commonly used in three ways:

1. Set a single byte to 0 or 1 depending on some combination of the condition codes;

2. Conditionally jump to some other part of the program;

3. Conditionally transfer data.

3.6.3 Jump instruction and its encoding

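The jmp instruction jumps unconditionally. In assembly code, the jump destination is generally indicated by a label.

The other jump instructions are conditional: depending on some combination of the condition codes, they either jump or continue with the next instruction in the code sequence.

The jump instructions include jmp (unconditional) and conditional jumps such as je, jne, jg, jge, jl, and jle.

3.6.4 Translating Conditional Branches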

The most common way to translate conditional expressions and statements from the C language into machine code is to combine conditional and unconditional jumps.
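One way to picture this translation is to rewrite an if-else statement in C using goto, so that the control flow matches the jump structure of the machine code. The function names below are illustrative, and the goto version is a teaching sketch rather than actual compiler output.

```c
/* Ordinary C: returns the absolute difference of x and y. */
int absdiff(int x, int y)
{
    if (x < y)
        return y - x;
    else
        return x - y;
}

/* The same logic in "goto form": a conditional jump skips to the else
 * part, and an unconditional jump skips over it after the then part. */
int gotodiff(int x, int y)
{
    int result;
    if (x >= y)
        goto x_ge_y;      /* conditional jump */
    result = y - x;
    goto done;            /* unconditional jump */
x_ge_y:
    result = x - y;
done:
    return result;
}
```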

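3.6.5 Loops

1. do-while

The body executes first and the test follows, so the body always runs at least once.

2. while

The test comes first, so the loop may terminate before the body has executed even once.

3. for

Behaves like a while loop with an initialization before the loop and an update at the end of each iteration.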
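The three loop forms can be compared on one small computation, here a factorial. The function names are made up for this sketch, and the goto versions only illustrate the test placement that machine code typically uses; a real compiler may arrange the code differently.

```c
/* do-while: the body runs first, then the test decides whether to repeat. */
int fact_do(int n)
{
    int result = 1;
    do {
        result *= n;
        n = n - 1;
    } while (n > 1);
    return result;
}

/* Goto form of the do-while loop: a single test at the bottom. */
int fact_do_goto(int n)
{
    int result = 1;
loop:
    result *= n;
    n = n - 1;
    if (n > 1)
        goto loop;
    return result;
}

/* while: the test comes first, so the body may never run. */
int fact_while(int n)
{
    int result = 1;
    while (n > 1) {
        result *= n;
        n = n - 1;
    }
    return result;
}

/* Goto form of the while loop: an initial test guards a do-while loop. */
int fact_while_goto(int n)
{
    int result = 1;
    if (!(n > 1))
        goto done;        /* initial test may skip the loop entirely */
loop:
    result *= n;
    n = n - 1;
    if (n > 1)
        goto loop;
done:
    return result;
}

/* for: initialization, test, and update gathered in one header. */
int fact_for(int n)
{
    int result = 1;
    for (int i = 2; i <= n; i++)
        result *= i;
    return result;
}
```

Because its body always runs once, fact_do multiplies by n even when n is 0, whereas the while and for versions return 1 in that case.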

3.6.6 Conditional Move Instructions

The traditional way to implement conditional operations is through a conditional transfer of control.

A conditional transfer of data is an alternative strategy: it computes both outcomes of a conditional operation and then selects one based on whether the condition holds.
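The two strategies can be contrasted in C. The function names below are hypothetical, and whether a compiler actually emits a conditional move instruction for the second function depends on the target processor and the optimization settings.

```c
/* Conditional control transfer: only one of the two expressions is
 * evaluated, depending on the outcome of the test. */
int absdiff_branch(int x, int y)
{
    if (x < y)
        return y - x;
    return x - y;
}

/* Conditional data transfer: both outcomes are computed first and one is
 * then selected. A compiler may translate the selection into a
 * conditional move, though it is not required to. */
int absdiff_select(int x, int y)
{
    int then_val = y - x;
    int else_val = x - y;
    int test = x < y;
    return test ? then_val : else_val;
}
```

This style is only safe when evaluating both outcomes has no side effects and cannot fault, for example when neither one dereferences a possibly null pointer.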

3.6.7 switch Statements

A switch statement provides a multi-way branch based on the value of an integer index. It is particularly useful for tests with many possible outcomes: it not only makes the C code more readable but also allows an efficient implementation using a data structure called a jump table.
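The idea of a jump table can be approximated in portable C with an array of function pointers indexed by the switch value. This is only an analogy under stated assumptions: a compiler's jump table holds code addresses within one function rather than function pointers, and all names and case values below are invented for the sketch.

```c
/* Reference version using an ordinary switch statement. */
int switch_plain(int x, int n)
{
    switch (n) {
    case 100: return x * 13;
    case 102: return x + 10;
    case 103: return x + 11;
    case 104:
    case 106: return x * x;
    default:  return 0;
    }
}

/* One small handler per case, standing in for the code blocks. */
static int case_100(int x)     { return x * 13; }
static int case_102(int x)     { return x + 10; }
static int case_103(int x)     { return x + 11; }
static int case_104_106(int x) { return x * x; }
static int case_default(int x) { (void)x; return 0; }

typedef int (*handler_fn)(int);

/* Entry i handles n == 100 + i; missing cases point at the default. */
static const handler_fn jump_table[7] = {
    case_100, case_default, case_102, case_103,
    case_104_106, case_default, case_104_106
};

/* Table-driven version: one range check, then one indexed transfer. */
int switch_table(int x, int n)
{
    unsigned index = (unsigned)n - 100u;  /* n < 100 wraps to a large value */
    if (index > 6)
        return case_default(x);
    return jump_table[index](x);
}
```

The benefit of the table is that the time to reach the right case does not depend on the number of cases.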

3.7 Procedures

A procedure call involves passing both data (in the form of procedure parameters and return values) and control from one part of the code to another. In addition, it must allocate space for the procedure's local variables on entry and free that space on exit.

3.7.1 Stack Frame Structure

The machine uses the stack to pass procedure arguments, store return information, save registers for later restoration, and provide local storage. The portion of the stack allocated for a single procedure call is called a stack frame.

Suppose procedure P (the caller) calls procedure Q (the callee). The arguments to Q are placed in the stack frame of P. When P calls Q, the return address within P is pushed onto the stack, forming the end of P's stack frame. The return address is where the program should resume execution when it returns from Q. The stack frame of Q starts with the saved value of the frame pointer, followed by copies of any other saved register values.

Procedure Q also uses the stack for any local variables that cannot be stored in registers, for the following reasons:

• There are not enough registers to store all local variables

• Some local variables are arrays or structs, so you must access them through an array or struct reference

• The address operator & is applied to a local variable, so we must be able to generate an address for it
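The last reason can be illustrated with a short C sketch. The function and variable names are assumptions made for this example, and an optimizing compiler that inlines the call may still keep the values in registers; the point is only that taking an address obliges the compiler to be able to provide a memory location.

```c
/* Adds *xp and *yp, swapping the two values as a side effect. */
int swap_add(int *xp, int *yp)
{
    int x = *xp;
    int y = *yp;
    *xp = y;
    *yp = x;
    return x + y;
}

/* arg1 and arg2 have their addresses taken with &, so they need memory
 * locations, typically in the stack frame of caller. */
int caller(void)
{
    int arg1 = 534;
    int arg2 = 1057;
    int sum = swap_add(&arg1, &arg2);   /* arg1 and arg2 are now swapped */
    int diff = arg1 - arg2;
    return sum * diff;
}
```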

3.7.2 Transferring Control

The instructions that support procedure calls and returns are call, leave, and ret.

The call instruction has a target indicating the address of the instruction where the called procedure starts. A call can be either direct or indirect. In assembly code, the target of a direct call is given as a label, and the target of an indirect call is given by * followed by an operand specifier.

The effect of a call instruction is to push the return address onto the stack and jump to the start of the called procedure.

The ret instruction pops an address off the stack and jumps to that location.

3.7.3 Register Usage Conventions

When a caller invokes a callee, the callee must not overwrite the value of any register that the caller plans to use later.

Suppose procedure P computes a value y before calling Q and uses y again after Q returns. There are two ways to meet this requirement:

• Before calling Q, P stores the value of y in its own stack frame; when Q returns, P can retrieve y from the stack. In other words, the caller saves the value of y.

• P stores the value of y in a callee-save register. Any procedure that wants to use such a register must first save its value and then restore it before returning.
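The situation with y can be written out in C. The body of Q here is a placeholder invented for the sketch; any callee would do. On IA32, %eax, %edx, and %ecx are caller-save registers, while %ebx, %esi, and %edi are callee-save registers.

```c
/* Hypothetical callee; its body is arbitrary for this illustration. */
int Q(int y)
{
    return y + 1;
}

/* y is computed before the call to Q and used again afterwards, so its
 * value must survive the call: either P saves it in its own stack frame
 * (caller save), or it lives in a callee-save register that Q preserves. */
int P(int x)
{
    int y = x * x;
    int z = Q(y);
    return y + z;
}
```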

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
