2.6.1 Comparison of AT&T and Intel assembly language
Linux is a member of the Unix family, and although its own history is short, much of what surrounds it originates from UNIX. The 386 assembly language used by Linux is one example. UNIX was originally developed for the PDP-11 and was later ported to the VAX and 68000 series processors, where the assembly language used the AT&T instruction format. When UNIX was ported to the i386, it naturally kept the AT&T assembler format instead of the Intel format. The two syntaxes differ in notation, but the underlying hardware knowledge is the same, so if you already know the Intel syntax well, you can easily carry that knowledge over to AT&T. The comparison of Intel and AT&T syntax below is meant to help you do that quickly.
1. Prefix
In Intel syntax, neither registers nor immediate operands carry a prefix. In AT&T syntax, registers are prefixed with "%" and immediate operands with "$". Intel marks hexadecimal and binary immediates with the suffixes "h" and "b" respectively, whereas AT&T marks hexadecimal immediates with the prefix "0x". Table 2.2 gives several corresponding examples.
Table 2.2 Prefix differences between Intel and AT&T syntax
Intel syntax | AT&T syntax
mov eax,8 | movl $8,%eax
mov ebx,0ffffh | movl $0xffff,%ebx
int 80h | int $0x80
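The prefix and suffix rules for immediates can be sketched as a small C routine. This is only an illustration under stated assumptions: the function name intel_imm_to_att is hypothetical, it handles only plain numeric literals (no labels or expressions), and it emits binary literals in hexadecimal because the text only describes the "0x" form for AT&T.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the source text): rewrite an Intel-style
 * immediate such as "80h", "0FFFFh", "1010b" or "8" as an AT&T-style
 * immediate in `out`. Returns 0 on success, -1 if the literal is malformed
 * or `out` is too small. */
int intel_imm_to_att(const char *intel, char *out, size_t out_size)
{
    assert(intel != NULL && out != NULL);

    size_t len = strlen(intel);
    if (len == 0 || !isxdigit((unsigned char)intel[0]))
        return -1;

    /* The Intel suffix selects the radix: h = hex, b = binary, none = decimal. */
    int base = 10;
    size_t digits = len;
    char last = (char)tolower((unsigned char)intel[len - 1]);
    if (last == 'h') {
        base = 16;
        digits = len - 1;
    } else if (last == 'b') {
        base = 2;
        digits = len - 1;
    }

    /* Every character before the suffix must be a valid digit in that radix. */
    char *end = NULL;
    unsigned long value = strtoul(intel, &end, base);
    if (digits == 0 || end != intel + digits)
        return -1;

    /* AT&T: "$" marks an immediate, "0x" marks hexadecimal. */
    int n = (base == 10) ? snprintf(out, out_size, "$%lu", value)
                         : snprintf(out, out_size, "$0x%lx", value);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

Called with "80h" and a sufficiently large buffer, the routine writes the operand used in the int example of Table 2.2.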
2. Operand order
Intel and AT&T order their operands in opposite directions. In Intel syntax, the first operand is the destination and the second is the source. In AT&T syntax, the first operand is the source and the second is the destination, an order that matches the usual left-to-right reading habit.
For example, in Intel syntax: mov eax,[ecx]
In AT&T syntax: movl (%ecx),%eax
3. Memory operands
As the example above shows, memory operands are also written differently. Intel syntax encloses the base register in "[]", while AT&T syntax encloses it in "()".
For example, in Intel syntax: mov eax,[ebx+5]
In AT&T syntax: movl 5(%ebx),%eax
4. Indirect addressing
Compared with Intel syntax, AT&T's indirect addressing form can be harder to read. The Intel format is segreg:[base+index*scale+disp], and the AT&T format is %segreg:disp(base,index,scale). The index, scale, disp, and segreg parts are all optional and may be omitted. If index is given without scale, scale defaults to 1. Whether segreg is needed depends on the instruction and on whether the program runs in real mode or protected mode: in real mode it depends on the instruction, while in protected mode segreg is redundant. In AT&T syntax, an immediate value used for scale or disp must not carry the "$" prefix. Table 2.3 gives the general syntax and several corresponding examples.
Table 2.3 Syntax and examples of memory operands
Intel syntax | AT&T syntax
instr foo,segreg:[base+index*scale+disp] | instr %segreg:disp(base,index,scale),foo
mov eax,[ebx+20h] | movl 0x20(%ebx),%eax
add eax,[ebx+ecx*2h] | addl (%ebx,%ecx,0x2),%eax
lea eax,[ebx+ecx] | leal (%ebx,%ecx),%eax
sub eax,[ebx+ecx*4h-20h] | subl -0x20(%ebx,%ecx,0x4),%eax
As the table shows, the AT&T form is harder to read: the meaning of [base+index*scale+disp] is clear at a glance, whereas that of disp(base,index,scale) is not.
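To make the disp(base,index,scale) layout concrete, here is a minimal C sketch that assembles an AT&T memory operand from its parts. The struct mem_operand type and the format_att function are hypothetical names introduced for illustration, register names are passed without the "%" prefix, and the scale is printed in decimal rather than in the hexadecimal form used in Table 2.3.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical description of one memory operand (not from the source
 * text). A NULL register or a zero displacement means "omitted". */
struct mem_operand {
    const char *segreg; /* segment register, e.g. "es", or NULL */
    const char *base;   /* base register, e.g. "ebx", or NULL   */
    const char *index;  /* index register, e.g. "ecx", or NULL  */
    int scale;          /* 1, 2, 4 or 8; used only with index   */
    long disp;          /* displacement, 0 if omitted           */
};

/* Render `m` as %segreg:disp(%base,%index,scale), leaving out the parts
 * that are absent. Returns 0 on success, -1 if `out` is too small. */
int format_att(const struct mem_operand *m, char *out, size_t n)
{
    char seg[16] = "", disp[32] = "", regs[64] = "";

    assert(m != NULL && out != NULL);

    if (m->segreg)
        snprintf(seg, sizeof seg, "%%%s:", m->segreg);

    /* No "$" here: scale and disp are not immediate operands. */
    if (m->disp < 0)
        snprintf(disp, sizeof disp, "-0x%lx", 0UL - (unsigned long)m->disp);
    else if (m->disp > 0 || (!m->base && !m->index))
        snprintf(disp, sizeof disp, "0x%lx", (unsigned long)m->disp);

    if (m->base && m->index && m->scale > 1)
        snprintf(regs, sizeof regs, "(%%%s,%%%s,%d)", m->base, m->index, m->scale);
    else if (m->base && m->index)
        snprintf(regs, sizeof regs, "(%%%s,%%%s)", m->base, m->index); /* scale defaults to 1 */
    else if (m->base)
        snprintf(regs, sizeof regs, "(%%%s)", m->base);
    else if (m->index)
        snprintf(regs, sizeof regs, "(,%%%s,%d)", m->index, m->scale > 0 ? m->scale : 1);

    int w = snprintf(out, n, "%s%s%s", seg, disp, regs);
    return (w < 0 || (size_t)w >= n) ? -1 : 0;
}
```

For instance, base "ebx", index "ecx", scale 4, and displacement -0x20 correspond to the Intel operand [ebx+ecx*4h-20h] from the last row of the table.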
This addressing mode is often used to access a field within a particular element of an array of structures: base is the starting address of the array, scale is the size of each element, and index is the subscript. If each array element is itself a structure, disp is the offset of the desired field within that structure.
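The same calculation can be written out by hand in C. The sketch below is illustrative only: struct point and field_y_address are hypothetical names, and the 8-byte element size is chosen so that it is also a legal hardware scale (1, 2, 4, or 8).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical element type: two 32-bit fields, so each element is 8 bytes. */
struct point {
    int32_t x;
    int32_t y;
};

/* Address of field `y` in element `index` of the array starting at `base`,
 * computed explicitly as base + index*scale + disp. */
int32_t *field_y_address(struct point *base, size_t index)
{
    assert(base != NULL);

    const size_t scale = sizeof(struct point);     /* size of one element   */
    const size_t disp = offsetof(struct point, y); /* offset of the field   */

    return (int32_t *)((unsigned char *)base + index * scale + disp);
}
```

With the array address in %ebx and the subscript in %ecx, a load of this field could be expressed as a single instruction along the lines of movl 4(%ebx,%ecx,8),%eax; the register assignment here is an assumption made for the example.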
5. Opcode suffixes
As you may have noticed in the examples above, AT&T opcodes carry a suffix that indicates the size of the operand: "l" denotes a long (32 bits), "w" a word (16 bits), and "b" a byte (8 bits). In Intel syntax, the size is instead given by placing byte ptr, word ptr, or dword ptr in front of a memory operand, where dword corresponds to AT&T's "long". Table 2.4 shows several corresponding examples.
Table 2.4 Examples of opcode suffixes
Intel syntax | AT&T syntax
mov al,bl | movb %bl,%al
mov ax,bx | movw %bx,%ax
mov eax,ebx | movl %ebx,%eax
mov eax,dword ptr [ebx] | movl (%ebx),%eax
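The suffix rule amounts to a lookup from operand size to one letter. A minimal C sketch follows, assuming only the three sizes named in the text; att_suffix and att_mnemonic are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: AT&T suffix for an operand of `bytes` bytes.
 * Returns '\0' for sizes the text does not cover. */
char att_suffix(size_t bytes)
{
    switch (bytes) {
    case 1: return 'b'; /* byte,  8 bits (Intel: byte ptr)  */
    case 2: return 'w'; /* word, 16 bits (Intel: word ptr)  */
    case 4: return 'l'; /* long, 32 bits (Intel: dword ptr) */
    default: return '\0';
    }
}

/* Append the size suffix to a bare opcode such as "mov".
 * Returns 0 on success, -1 for an unsupported size or a short buffer. */
int att_mnemonic(const char *opcode, size_t bytes, char *out, size_t n)
{
    assert(opcode != NULL && out != NULL);

    char suffix = att_suffix(bytes);
    if (suffix == '\0')
        return -1;

    int w = snprintf(out, n, "%s%c", opcode, suffix);
    return (w < 0 || (size_t)w >= n) ? -1 : 0;
}
```

Passing "mov" with a size of 4 yields the mnemonic used in the last two rows of Table 2.4.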
This comparison of AT&T and Intel assembly syntax is excerpted from the book "Deep Analysis of the Linux Kernel Source Code".