Assembly language is architecture-specific. On Linux, GCC emits assembly in AT&T syntax by default. This article introduces the main features of AT&T syntax on x86: operand size, immediate operands, register references, operand order, symbolic constants, and memory references.
1. Operand size
The operand size is given by a suffix on the instruction mnemonic: b (byte, 8 bits), w (word, 16 bits), or l (long, 32 bits). For example: "movb %al, %bl", "movw %ax, %bx", "movl %eax, %ebx".
If the suffix is omitted, the assembler infers the size from the register operands, typically the destination. For example, in "mov %ax, %bx" the destination %bx is a word register, so the assembler treats the instruction as "movw %ax, %bx". Similarly, "mov $4, %ebx" is equivalent to "movl $4, %ebx", and "push %ax" is equivalent to "pushw %ax". When the suffix is omitted and the size cannot be inferred because no operand is a register, the assembler reports an error, as it does for "mov $4, (%ebx)".
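The suffix rules can be tried out with a small file such as the following. This is only a sketch: it assumes the GNU assembler in 32-bit mode (for example "as --32"), and the label size_demo is a hypothetical name used for illustration.

```asm
# Operand-size suffixes, GNU as, 32-bit mode (assumed: as --32).
        .text
        .globl  size_demo           # hypothetical label
size_demo:
        movb    %al, %bl            # explicit byte move (8 bits)
        movw    %ax, %bx            # explicit word move (16 bits)
        movl    %eax, %ebx          # explicit long move (32 bits)

        mov     %ax, %bx            # no suffix: sized from the registers, same as movw
        mov     $4, %ebx            # no suffix: %ebx is 32 bits, same as movl

        movl    $4, (%ebx)          # memory destination: the suffix supplies the size
        # mov   $4, (%ebx)          # rejected if uncommented: no register operand
                                    # from which the size could be inferred
        ret
```

Writing the suffix explicitly everywhere, as GCC's own output does, avoids having to remember when inference applies.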
2. Immediate operands
An immediate operand is written with a $ prefix, as in "movl $0x04, %ebx". The value can also come from a symbol:
para = 0x04
movl $para, %ebx
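A slightly fuller sketch of immediate operands follows, again assuming GNU as in 32-bit mode. The label imm_demo is hypothetical; the symbol para is the one defined in the text.

```asm
# Immediate operands, GNU as, 32-bit mode (assumed: as --32).
        .text
para = 0x04                         # absolute symbol with the value 4
        .globl  imm_demo            # hypothetical label
imm_demo:
        movl    $0x04, %ebx         # literal immediate: %ebx = 4
        movl    $para, %ebx         # immediate taken from the symbol: %ebx = 4
        movl    $para+1, %ecx       # constant expressions are allowed: %ecx = 5
        addl    $-1, %ecx           # immediates may be negative: %ecx = 4
        # movl  para, %ebx          # without $ this would be a memory load from
                                    # absolute address 4, not the constant 4
        ret
```

The commented-out line shows the most common mistake: leaving out the $ turns the operand into a memory reference.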
3. Register references
A register name must be preceded by a percent sign (%), for example "movl %eax, %ebx".
The 80386 has the following registers:
Eight 32-bit registers: %eax, %ebx, %ecx, %edx, %edi, %esi, %ebp, %esp;
Eight 16-bit registers, which are the low 16 bits of the eight 32-bit registers above: %ax, %bx, %cx, %dx, %di, %si, %bp, %sp;
Eight 8-bit registers: %ah, %al, %bh, %bl, %ch, %cl, %dh, %dl. These are the high and low 8 bits of %ax, %bx, %cx, and %dx;
Six segment registers: %cs (code), %ds (data), %ss (stack), %es, %fs, %gs;
Three control registers: %cr0, %cr2, %cr3;
Six debug registers: %db0, %db1, %db2, %db3, %db6, %db7;
Two test registers: %tr6, %tr7;
Eight floating-point stack registers: %st(0), %st(1), %st(2), %st(3), %st(4), %st(5), %st(6), %st(7).
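The way the 16-bit and 8-bit names overlap their 32-bit register can be seen by writing through each name in turn. This is an illustrative sketch for GNU as in 32-bit mode; the label reg_demo is hypothetical.

```asm
# Sub-register aliasing, GNU as, 32-bit mode (assumed: as --32).
        .text
        .globl  reg_demo            # hypothetical label
reg_demo:
        movl    $0x11223344, %eax   # %eax = 0x11223344
        movw    $0xaaaa, %ax        # low 16 bits replaced: %eax = 0x1122aaaa
        movb    $0xbb, %ah          # bits 8..15 replaced:  %eax = 0x1122bbaa
        movb    $0xcc, %al          # bits 0..7 replaced:   %eax = 0x1122bbcc
        movw    %ds, %bx            # segment registers take the % prefix too
        ret
```

Each narrower write leaves the remaining bits of %eax unchanged.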
4. Operand order
Operands are written with the source first (left) and the destination last (right), the reverse of Intel syntax. For example, in "movl %eax, %ebx", %eax is the source and %ebx is the destination.
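For readers used to Intel syntax, a side-by-side comparison may help. The sketch below assumes GNU as in 32-bit mode, with the Intel-syntax equivalent shown in each comment; the label order_demo is hypothetical.

```asm
# Operand order: AT&T is "source, destination" (assumed: as --32).
        .text
        .globl  order_demo          # hypothetical label
order_demo:
        movl    %eax, %ebx          # Intel: mov ebx, eax    (ebx = eax)
        addl    $4, %ebx            # Intel: add ebx, 4      (ebx = ebx + 4)
        subl    %ecx, %ebx          # Intel: sub ebx, ecx    (ebx = ebx - ecx)
        ret
```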
5. Symbolic constants
A symbol is referenced directly by name. For example, given the definition
value: .long 0x12a3f2de
the instruction
movl value, %ebx
loads the constant 0x12a3f2de into the register %ebx.
To reference the address of a symbol, prefix it with $. For example, "movl $value, %ebx" loads the address of value into the register %ebx.
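The difference between a symbol's contents and its address can be sketched as follows, assuming GNU as in 32-bit mode. The label sym_demo is hypothetical; value is the symbol from the text.

```asm
# Symbol contents versus symbol address (assumed: as --32).
        .data
value:  .long   0x12a3f2de          # four bytes of data labelled "value"

        .text
        .globl  sym_demo            # hypothetical label
sym_demo:
        movl    value, %ebx         # contents of value:  %ebx = 0x12a3f2de
        movl    $value, %ecx        # address of value:   %ecx = &value
        movl    (%ecx), %edx        # load via the address: %edx = 0x12a3f2de
        ret
```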
6. Memory references
In Intel syntax, an indirect memory reference has the form: section:[base + index*scale + displacement]
In AT&T syntax, the corresponding form is: section:displacement(base, index, scale)
Here base and index are the optional 32-bit base and index registers. scale can be 1, 2, 4, or 8, and defaults to 1 when omitted. section can name any segment register as a segment override prefix; which segment register is the default depends on the context. If the segment you specify is the same as the default, the assembler does not emit a segment prefix in the object code. The following are some examples:
-4(%ebp): base = %ebp, displacement = -4. section is not specified; because the base is %ebp, the default section is %ss. index and scale are both omitted.
foo(,%eax,4): index = %eax, scale = 4, displacement = foo. The other fields are not specified. The default section is %ds.
foo(,1): this expression references the value at the address foo. Note that there is neither base nor index, and only one comma. This is a syntactic exception, but it is legal.
%gs:foo: this expression references the value of foo, located in the segment %gs.
An absolute call or jump operand must be prefixed with "*", which tells the assembler that the call/jmp instruction takes its target address from a register or memory location. Without the "*", the operand is treated as a relative (PC-relative) address.
Any instruction that has a memory operand but no register operand must specify its operand size (byte, word, long) with an instruction suffix (b, w, l).
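The addressing forms above can be put together in one small routine. This is a sketch only, assuming GNU as in 32-bit mode; the names mem_demo, target, fptr, and the table contents of foo are hypothetical and chosen for illustration.

```asm
# Memory operands and indirect calls (assumed: as --32).
        .data
foo:    .long   10, 20, 30, 40      # hypothetical table of four longs
fptr:   .long   target              # hypothetical pointer to a routine

        .text
        .globl  mem_demo            # hypothetical label
mem_demo:
        pushl   %ebp
        movl    %esp, %ebp
        subl    $4, %esp            # reserve one 4-byte local variable

        movl    $2, -4(%ebp)        # Intel: [ebp - 4]; suffix needed, no register operand
        movl    -4(%ebp), %eax      # %eax = 2
        movl    foo(,%eax,4), %ebx  # Intel: [foo + eax*4];     %ebx = 30
        movl    $foo, %esi          # %esi = address of foo
        movl    4(%esi,%eax,4), %ecx # Intel: [esi + eax*4 + 4]; %ecx = 40

        call    target              # no "*": PC-relative call to a label
        call    *fptr               # "*": call the address stored at fptr
        movl    $target, %edx
        call    *%edx               # "*": call the address held in %edx

        leave
        ret
target:
        ret
```

Of the three calls, only the first encodes a relative displacement; the other two read an absolute target address at run time.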