"copyright notice: respect for the original, reproduced please retain the source: blog.csdn.net/shallnet, the article only for learning Exchange, do not use for commercial purposes"
At the lowest level of computer operation, every processor manipulates data according to binary codes that the manufacturer defines inside the processor chip. These codes define what functions the processor should perform on the data the programmer supplies, and they are known as instruction codes. Different types of processors have different instruction codes, but they all handle instruction codes in a similar way.
While the processor chip is running, it reads the instruction codes stored in memory. Each instruction code consists of one or more bytes of information (the length varies) that direct the processor to complete a specific task. Each instruction code is read from memory, and the data the instruction operates on is also read from memory.
Each instruction must contain at least one byte of opcode (operation code), which defines what the processor should do. Each processor family defines its own opcodes.
Intel IA-32 microprocessors use a specific format for their instruction codes, and understanding this format is useful for assembly language programming. The IA-32 instruction code format consists of four main parts:
- Optional instruction prefix
- Opcode (operation code)
- Optional modifiers
- Optional data element
Of these parts of the IA-32 instruction format, the only required one is the opcode. Every instruction code must contain an opcode that defines the basic task the instruction performs.
The instruction prefix can contain zero to four 1-byte prefixes that modify the behavior of the opcode. Prefixes are divided into four groups by function, and an instruction can use at most one prefix from each group (hence at most four in total). The four prefix groups are:
- Lock and repeat prefixes
- Segment override and branch hint prefixes
- Operand-size override prefix
- Address-size override prefix
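The prefix groups above can be illustrated with a small sketch that peels prefix bytes off the front of an instruction. The byte values are the documented IA-32 prefix encodings; the names `PREFIX_GROUPS` and `split_prefixes` are hypothetical, and raising an error on a repeated group is a choice of this sketch that mirrors the "one per group" rule, not a model of what real hardware does.

```python
# Sketch only: classify leading IA-32 prefix bytes by group.
# PREFIX_GROUPS and split_prefixes are illustrative names, not a library API.
PREFIX_GROUPS = {
    0xF0: (1, "LOCK"),
    0xF2: (1, "REPNE/REPNZ"),
    0xF3: (1, "REP/REPE/REPZ"),
    0x2E: (2, "CS segment override / branch-not-taken hint"),
    0x36: (2, "SS segment override"),
    0x3E: (2, "DS segment override / branch-taken hint"),
    0x26: (2, "ES segment override"),
    0x64: (2, "FS segment override"),
    0x65: (2, "GS segment override"),
    0x66: (3, "operand-size override"),
    0x67: (4, "address-size override"),
}


def split_prefixes(code: bytes):
    """Return (prefixes, rest) for the instruction bytes in `code`.

    `prefixes` is a list of (group, name) tuples for the leading prefix
    bytes; `rest` is everything from the first non-prefix byte onward.
    """
    prefixes = []
    seen_groups = set()
    pos = 0
    while pos < len(code) and code[pos] in PREFIX_GROUPS:
        group, name = PREFIX_GROUPS[code[pos]]
        if group in seen_groups:
            # Assumption of this sketch: enforce "at most one per group".
            raise ValueError(f"more than one prefix from group {group}")
        seen_groups.add(group)
        prefixes.append((group, name))
        pos += 1
    return prefixes, code[pos:]


if __name__ == "__main__":
    # F3 66 A5: a repeat prefix and an operand-size override before opcode A5.
    print(split_prefixes(bytes([0xF3, 0x66, 0xA5])))
```

An instruction with no prefix simply yields an empty prefix list, which matches the point that prefixes are optional.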
Some opcodes require additional modifiers to define which registers or memory locations are involved in the operation. The modifiers consist of three parts:
- Addressing-mode specifier (ModR/M) byte
- Scale-Index-Base (SIB) byte
- 1, 2, or 4 address displacement bytes
The ModR/M byte is made up of three fields. The mod field (2 bits) is used together with the r/m field (3 bits) to define the register or addressing mode used by the instruction. The reg/opcode field (3 bits) either supplies three extra bits that further define the opcode function or specifies a register. The SIB byte also consists of three fields. The scale field specifies the scale factor for the operation, the index field specifies the register used as the index register in the memory access, and the base field specifies the register used as the base register in the memory access.
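A minimal sketch of how these two bytes split into their fields, assuming the standard IA-32 bit layout (2 bits, 3 bits, 3 bits from the most significant end). The helper names `decode_modrm` and `decode_sib` and the `REG32` table are hypothetical and exist only for this illustration.

```python
# Sketch only: split ModR/M and SIB bytes into their three fields.
# Register numbers follow the IA-32 encoding order for 32-bit registers.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_modrm(byte: int):
    """Return (mod, reg_or_opcode, rm) from a ModR/M byte."""
    mod = (byte >> 6) & 0b11    # bits 7-6
    reg = (byte >> 3) & 0b111   # bits 5-3: register number or opcode extension
    rm = byte & 0b111           # bits 2-0
    return mod, reg, rm


def decode_sib(byte: int):
    """Return (scale_factor, index, base) from a SIB byte."""
    scale = 1 << ((byte >> 6) & 0b11)  # 2-bit field encodes a factor of 1, 2, 4, or 8
    index = (byte >> 3) & 0b111        # bits 5-3: index register number
    base = byte & 0b111                # bits 2-0: base register number
    return scale, index, base


if __name__ == "__main__":
    # Bytes 8B 44 8B 08 encode a load of EAX from memory at EBX + ECX*4 + 8.
    mod, reg, rm = decode_modrm(0x44)
    scale, index, base = decode_sib(0x8B)
    print(mod, REG32[reg], rm)              # rm == 4 means a SIB byte follows
    print(scale, REG32[index], REG32[base])
```

In the example bytes, an r/m value of 4 with a mod value other than 3 signals that a SIB byte follows the ModR/M byte.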
The address displacement bytes specify an offset to the memory location defined by the ModR/M and SIB bytes. The data element is the last part of the instruction code. Some instruction codes read data from memory locations or processor registers, while others contain the data within the instruction code itself.
As you can see, programming directly with processor instruction codes is very difficult, which is why high-level languages came later. But the processor has no idea how to handle high-level language code, so some mechanism must convert that code into instruction codes the processor can execute.
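As a sketch of how the displacement fits in, the code below works out how many displacement bytes follow the ModR/M (and SIB) bytes and reads them as a signed little-endian value. It assumes 32-bit addressing only, where the displacement is 0, 1, or 4 bytes long; the function names are hypothetical.

```python
# Sketch only, assuming 32-bit addressing: displacement length and value.


def displacement_size(mod: int, rm: int, sib_base=None) -> int:
    """Return the number of displacement bytes implied by the mod/rm fields."""
    if mod == 3:
        return 0   # register operand, no memory access
    if mod == 1:
        return 1   # 8-bit displacement
    if mod == 2:
        return 4   # 32-bit displacement
    # mod == 0: no displacement, except for the two "disp32 only" encodings.
    if rm == 5 or (rm == 4 and sib_base == 5):
        return 4
    return 0


def read_displacement(code: bytes, offset: int, size: int) -> int:
    """Read `size` displacement bytes at `offset` as a signed little-endian value."""
    if size == 0:
        return 0
    return int.from_bytes(code[offset:offset + size], "little", signed=True)


if __name__ == "__main__":
    # 8B 45 FC: opcode 8B, ModR/M 45 (mod=1, reg=0, rm=5), displacement FC.
    code = bytes([0x8B, 0x45, 0xFC])
    mod, rm = (code[1] >> 6) & 0b11, code[1] & 0b111
    size = displacement_size(mod, rm)
    print(size, read_displacement(code, 2, size))
```

The displacement is signed, so the single byte FC in the example stands for an offset of -4 from the base register.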
To program in assembly language you must first understand the processor environment, because instruction codes differ between processor types. We will use the IA-32 platform as the example for describing assembly language. Although different processor families incorporate different instruction sets and features, most processors share the same set of core components. A computer typically consists of four components: the processor, system memory, input devices, and output devices.
The processor contains the hardware and instruction codes that control the operation of the computer. It is connected to the other elements of the computer (the memory storage unit, input devices, and output devices) through three separate buses: the control bus, the address bus, and the data bus. The processor itself is made up of many components, each of which plays a role when the processor handles data. Assembly language programs can access and control all of them, so it is important to understand these components. The main components of the processor are:
- Control Unit
- Execution Unit
- Registers
- Flags
The control unit is the center of the processor. Its main job is to control what the processor is doing at any given time. The control unit implements four basic functions:
- Retrieve instructions from memory.
- Decode the instructions for operation.
- Retrieve the required data from memory.
- Store the results if necessary.
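The four functions above can be sketched as a loop over a toy instruction set. This is not IA-32: the opcodes `LOAD`, `ADD`, `STORE`, and `HALT`, the single accumulator, and the function `run` are all hypothetical and serve only to show the fetch, decode, retrieve, and store steps in order.

```python
# Sketch only: a toy control loop, not a model of a real IA-32 control unit.


def run(program, memory):
    """Execute `program`, a list of (opcode, address) tuples, against `memory`.

    `memory` is a dict mapping addresses to integer values; it is modified
    in place and also returned.
    """
    pc = 0    # program counter: index of the next instruction
    acc = 0   # a single accumulator register
    while pc < len(program):
        opcode, address = program[pc]   # 1. retrieve the instruction
        pc += 1
        if opcode == "LOAD":            # 2. decode the instruction
            acc = memory[address]       # 3. retrieve the required data
        elif opcode == "ADD":
            acc += memory[address]
        elif opcode == "STORE":
            memory[address] = acc       # 4. store the result
        elif opcode == "HALT":
            break
        else:
            raise ValueError(f"unknown opcode: {opcode!r}")
    return memory


if __name__ == "__main__":
    mem = {0: 2, 1: 3, 2: 0}
    program = [("LOAD", 0), ("ADD", 1), ("STORE", 2), ("HALT", None)]
    print(run(program, mem))
```

The program counter in this sketch plays the same role as the instruction pointer register: it always identifies the next instruction to retrieve.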
The execution unit is responsible for executing processor instructions. It consists of one or more arithmetic logic units (ALUs).
Registers address the problem of the processor waiting for data to be read from memory. When the processor accesses a data element, the request is sent outside the processor, across the control bus, and into the memory storage unit, and the processor waits while this happens. Registers can hold the data elements to be processed on the processor chip itself, without any access to the memory storage unit. However, the number of registers built into the processor chip is limited.
The IA-32 platform has several groups of registers of different sizes. The general-purpose registers temporarily store data; the segment registers are used to reference memory locations; the instruction pointer register, also known as the program counter (PC), tracks the next instruction code to execute; and the control registers determine the operating mode of the processor. The general-purpose registers and their uses are described in the following table:
| Register | Description |
|---|---|
| EAX | Accumulator for operands and result data |
| EBX | Pointer to data in the data memory segment |
| ECX | Counter for string and loop operations |
| EDX | I/O pointer |
| EDI | Data pointer for the destination of string operations |
| ESI | Data pointer for the source of string operations |
| ESP | Stack pointer |
| EBP | Stack data pointer |
Segment registers and their descriptions:
| Segment register | Description |
|---|---|
| CS | Code segment |
| DS | Data segment |
| SS | Stack segment |
| ES | Extra segment pointer |
| FS | Extra segment pointer |
| GS | Extra segment pointer |
Processor flags record the status of each operation the processor performs, so a program can use them to determine whether the operation succeeded.
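The text does not name individual flags. As an illustration only, the sketch below models four status flags from the IA-32 EFLAGS register (CF carry, ZF zero, SF sign, OF overflow) for a 32-bit addition. The function name `add32_flags` is hypothetical, and real hardware updates more flags than these four.

```python
# Sketch only: how a 32-bit ADD would set four common status flags.


def add32_flags(a: int, b: int):
    """Return (result, flags) for a 32-bit addition of a and b."""
    mask = 0xFFFFFFFF
    a &= mask
    b &= mask
    full = a + b              # unbounded Python integer sum
    result = full & mask      # what fits in a 32-bit register
    flags = {
        "CF": full > mask,                  # carry out of bit 31 (unsigned overflow)
        "ZF": result == 0,                  # result is zero
        "SF": bool(result & 0x80000000),    # sign bit of the result
        # Signed overflow: both operands differ in sign from the result.
        "OF": bool((a ^ result) & (b ^ result) & 0x80000000),
    }
    return result, flags


if __name__ == "__main__":
    print(add32_flags(0xFFFFFFFF, 1))
    print(add32_flags(0x7FFFFFFF, 1))
```

A program would typically test such a flag right after the arithmetic instruction, for example checking the carry flag to detect that an unsigned addition did not fit in 32 bits.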
Linux Platform x86 Assembly (II): Processor Instruction Codes and Understanding the IA-32 Platform