AT&T x86 ASM syntax

Article attributes: Translation
Article submission: e4gle (e4gle_at_hackermail.com)

El8 <el8@m4in.org>, alert7 <alert7@m4in.org>
From m4in security teams (www.m4in.org)

DJGPP uses AT&T-format assembly syntax, which differs somewhat from the usual Intel format. The main differences are as follows:

AT&T syntax reverses the order of the source and destination operands: the destination comes after the source. Register operands take a % prefix, and immediate operands take a $ prefix. The size of a memory operand is determined by the last character of the opcode: b (8-bit), w (16-bit), or l (32-bit).
Here are some examples. The left part is the AT&T format, and the comment on the right is the Intel equivalent.
movw %bx, %ax // mov ax, bx
xorl %eax, %eax // xor eax, eax
movw $1, %ax // mov ax, 1
movb X, %ah // mov ah, byte ptr X
movw X, %ax // mov ax, word ptr X
movl X, %eax // mov eax, X
Most opcodes are the same in AT&T and Intel format, except for the following:
movsSD // movsx
movzSD // movzx

Here S and D stand for the size suffixes of the source and destination operands, respectively.
movswl %ax, %ecx // movsx ecx, ax
cbtw // cbw
cwtl // cwde
cwtd // cwd
cltd // cdq
lcall $S, $O // call far S:O
ljmp $S, $O // jmp far S:O
lret $V // ret far V
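
The extension opcodes above can be modelled in plain C, which may help when checking what an instruction leaves in a register. This is only a sketch and not code from the article: the function names borrow the AT&T mnemonics for illustration, and each function mimics the register effect with an integer conversion instead of emitting the instruction.

```c
#include <assert.h>
#include <stdint.h>

/* movswl %ax, %ecx : sign-extend a 16-bit source into a 32-bit destination. */
int32_t movswl(int16_t ax)
{
    return (int32_t)ax;
}

/* movzwl %ax, %ecx : zero-extend a 16-bit source into a 32-bit destination. */
uint32_t movzwl(uint16_t ax)
{
    return (uint32_t)ax;
}

/* cltd (Intel cdq): sign-extend eax into edx:eax.
   The returned value models what ends up in edx. */
uint32_t cltd_edx(int32_t eax)
{
    uint32_t edx = (eax < 0) ? 0xFFFFFFFFu : 0u;
    /* edx must equal the upper half of the 64-bit sign extension. */
    assert(edx == (uint32_t)((uint64_t)(int64_t)eax >> 32));
    return edx;
}
```

The S and D suffixes map directly onto the source and destination types: movswl takes a 16-bit (w) source and produces a 32-bit (l) result.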
Opcode prefixes should not be written on the same line as the instruction they act on. For example, rep and stosd should be written as two separate instructions, with the latter immediately following the former. Memory references are a little different as well. The usual Intel form is:

section:[base + index*scale + disp]

In AT&T syntax this is written:

section:disp(base, index, scale)

Here are some examples:

movl 4(%ebp), %eax // mov eax, [ebp + 4]
addl (%eax,%eax,4), %ecx // add ecx, [eax + eax*4]
movb $4, %fs:(%eax) // mov byte ptr fs:[eax], 4
movl _array(,%eax,4), %eax // mov eax, [4*eax + array]
movw _array(%ebx,%eax,4), %cx // mov cx, [ebx + 4*eax + array]
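
To make the disp(base, index, scale) form concrete, the address it denotes can be computed in C. The helper below is a hypothetical illustration (the name effective_address and its parameters are not from the article), and it ignores the segment part, assuming flat 32-bit addressing.

```c
#include <assert.h>
#include <stdint.h>

/* Address denoted by disp(base, index, scale), i.e. [base + index*scale + disp].
   Arithmetic wraps modulo 2^32, as 32-bit address calculation does. */
uint32_t effective_address(uint32_t disp, uint32_t base,
                           uint32_t index, uint32_t scale)
{
    /* x86 only allows scale factors of 1, 2, 4 and 8. */
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return disp + base + index * scale;
}
```

For instance, 4(%ebp) with ebp = 0x1000 corresponds to effective_address(4, 0x1000, 0, 1), and (%eax,%eax,4) with eax = 8 corresponds to effective_address(0, 8, 8, 4).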

Jump instructions normally use the shortest displacement that reaches the target. The following instructions, however, only work with one-byte displacements: jcxz, jecxz, loop, loopz, loope, loopnz, and loopne. As the online documentation suggests, a jcxz foo whose target is out of range can be expanded to the following:
jcxz cx_zero
jmp cx_nonzero
cx_zero:
jmp foo
cx_nonzero:
The documentation also warns about the mul and imul instructions. The extended (widening) multiply takes only one operand. For example, imul %ebx, %ebx will not put the result into edx:eax. Use the single-operand form imul %ebx to obtain the extended result.
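
The difference between the two imul forms can be sketched in C with 64-bit arithmetic. The type edx_eax and both function names are made up for this illustration; they model the register contents rather than emit the instructions.

```c
#include <assert.h>
#include <stdint.h>

/* Models the register pair edx:eax after a one-operand multiply. */
typedef struct {
    uint32_t eax; /* low 32 bits of the product  */
    uint32_t edx; /* high 32 bits of the product */
} edx_eax;

/* imul %ebx (one operand): eax * ebx, full 64-bit result in edx:eax. */
edx_eax imul_one_operand(int32_t eax, int32_t ebx)
{
    int64_t product = (int64_t)eax * (int64_t)ebx;
    edx_eax r;
    r.eax = (uint32_t)((uint64_t)product & 0xFFFFFFFFu);
    r.edx = (uint32_t)((uint64_t)product >> 32);
    /* The two halves together must reproduce the 64-bit product. */
    assert((((uint64_t)r.edx << 32) | r.eax) == (uint64_t)product);
    return r;
}

/* imul %ebx, %ebx style (two operands): only the low 32 bits survive. */
uint32_t imul_two_operand(int32_t dst, int32_t src)
{
    return (uint32_t)dst * (uint32_t)src;
}
```

Multiplying 0x10000 by 0x10000 shows the difference: the one-operand model yields edx = 1 and eax = 0, while the two-operand model yields 0 because the high half is discarded.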

--------------------------------------------------------------------------------

Inline ASM
I will start with inline asm, since it seems to raise the most questions. This is the basic syntax, as described in the online help:
__asm__(asm statements : outputs : inputs : registers-modified);

The four fields are:

asm statements - AT&T-format instructions, separated by newlines.
outputs - each a constraint in quotation marks followed by a name in parentheses, separated by commas.
inputs - each a constraint in quotation marks followed by a name in parentheses, separated by commas.
registers-modified - register names in quotation marks, separated by commas.
A small example:
__asm__(
"pushl %eax\n\t"
"movl $1, %eax\n\t"
"popl %eax"
);
If you need no input or output variables and do not modify any registers, the other three fields can simply be omitted.
Let's look at input variables.

int i = 0;

__asm__(
"pushl %%eax\n\t"
"movl %0, %%eax\n\t"
"addl $1, %%eax\n\t"
"movl %%eax, %0\n\t"
"popl %%eax"
:
: "g" (i)
); // increment i
Don't let the code above scare you! I will try my best to explain it. We want to add 1 to the input variable i. We have no output variables and do not change any register values (we save and restore eax ourselves), so the second and last fields are empty. Because the input field is specified, we still need to keep an empty output field, but the last field can be left out entirely because it is not used. Leave a newline or at least a space between two consecutive colons. Note that GCC formally treats input operands as read-only; the output field, described below, is the reliable way to write to a variable.

Let's take a look at the input field. Additional descriptors (constraints) tell the compiler how to handle each variable, and they are enclosed in double quotes. So what does "g" do? It lets the compiler decide where to load the value of i from, as long as the result is a valid assembly instruction. In general, most of your input variables can be given the "g" constraint so that the compiler decides how to load them (GCC may even optimize them!). Other commonly used constraints are "r" (load into any available register), "a" (ax/eax), "b" (bx/ebx), "c" (cx/ecx), "d" (dx/edx), "D" (di/edi), "S" (si/esi), and so on.

In the asm code, an input variable is referred to as %0. If there are two inputs, the first is %0 and the second is %1, in the order they are listed in the input field (as in the next example). With N input variables and no outputs, %0 through %N-1 correspond to the variables in the input field, in order.

If any of the input, output, or registers-modified fields is used, register names in the asm code must be prefixed with two percent signs (%%) instead of one. Compare this with the first example, which uses none of the last three fields and therefore writes %eax.

Let's look at an example with two input variables, which also introduces "volatile":

int i = 0, j = 1;
__asm__ __volatile__(
"pushl %%eax\n\t"
"movl %0, %%eax\n\t"
"addl %1, %%eax\n\t"
"movl %%eax, %0\n\t"
"popl %%eax"
:
: "g" (i), "g" (j)
); // increment i by j
Okay, now we have two input variables. No problem: just remember that %0 corresponds to the first input (i in this example) and %1 to j, which is listed after i.
Oh yes, what does volatile mean? It prevents the compiler from modifying your asm code: no optimization (reordering, deleting, combining, and so on) is applied, and the code is emitted as written. Using volatile is recommended.

Let's take a look at the output field:

int i = 0;
__asm__ __volatile__(
"pushl %%eax\n\t"
"movl $1, %%eax\n\t"
"movl %%eax, %0\n\t"
"popl %%eax"
: "=g" (i)
); // assign 1 to i
This looks much like the earlier input examples, and indeed it is not very different. Every output constraint must be preceded by an = character. Outputs are also referred to as %0 through %N-1 in the asm code, in the order they are listed in the output field. You may ask how the numbering works when input and output fields are used together. The following example uses both.
int i = 0, j = 1, k = 0;
__asm__ __volatile__(
"pushl %%eax\n\t"
"movl %1, %%eax\n\t"
"addl %2, %%eax\n\t"
"movl %%eax, %0\n\t"
"popl %%eax"
: "=g" (k)
: "g" (i), "g" (j)
); // k = i + j
Okay, the only unclear part is how the operands are numbered in the asm code. Here is the rule.
When both input and output fields are used:

%0 ... %K are the output variables

%K+1 ... %N are the input variables

In our example, %0 corresponds to k, %1 corresponds to i, and %2 corresponds to j. Simple, right?
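
The numbering rule can be checked with a small function in which operand order matters. This is a sketch under a few assumptions that are not from the article: the function name sub_ints is hypothetical, the "=&r" early-clobber modifier and the "cc" clobber are standard GCC features the article does not discuss, and the plain C branch is a fallback for non-x86 targets.

```c
#include <assert.h>

/* Returns i - j. Subtraction is not commutative, so the result shows
   that %1 really is i and %2 really is j. */
int sub_ints(int i, int j)
{
    int k;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ volatile (
        "movl %1, %0\n\t"     /* k = i  */
        "subl %2, %0"         /* k -= j */
        : "=&r" (k)           /* %0: output, written before %2 is read */
        : "r" (i), "r" (j)    /* %1 and %2: inputs, in listed order */
        : "cc"                /* subl changes the flags */
    );
#else
    k = i - j;                /* portable fallback, no inline asm */
#endif
    /* Cross-check the asm result against unsigned C arithmetic. */
    assert((unsigned)k == (unsigned)i - (unsigned)j);
    return k;
}
```

Calling sub_ints(7, 2) gives 5; swapping %1 and %2 in the template would give -5 instead.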

So far we have not used the last field (registers-modified). If we want to use a register in our asm code, we must either save and restore it explicitly with push and pop, or list it in the last field and let GCC take care of it.

Here is the previous example without explicitly saving and restoring eax.

int i = 0, j = 1, k = 0;
__asm__ __volatile__(
"movl %1, %%eax\n\t"
"addl %2, %%eax\n\t"
"movl %%eax, %0"
: "=g" (k)
: "g" (i), "g" (j)
: "ax", "memory"
); // k = i + j
This lets GCC save and restore eax if necessary. A 16-bit register name covers the 32-, 16-, and 8-bit forms of that register. If the asm code writes to memory (to a variable, for example), it is recommended to specify "memory" in the registers-modified field. This means that every example above except the first should have included it, but I left it out until now to keep things simple.
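
As a compilable illustration of the registers-modified field, the sketch below uses eax as scratch space and declares it as clobbered instead of pushing and popping it. The function name add_via_eax is hypothetical, the clobber is spelled "eax" here (GCC also accepts the article's "ax"), "cc" is added because addl changes the flags, and the C branch is a fallback for non-x86 targets. No "memory" clobber is listed because the template writes nothing except its declared output.

```c
#include <assert.h>

/* Returns i + j, using eax as a scratch register inside the asm. */
int add_via_eax(int i, int j)
{
    int k;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ volatile (
        "movl %1, %%eax\n\t"  /* eax = i  */
        "addl %2, %%eax\n\t"  /* eax += j */
        "movl %%eax, %0"      /* k = eax  */
        : "=g" (k)
        : "g" (i), "g" (j)
        : "eax", "cc"         /* tell GCC that eax and the flags change */
    );
#else
    k = i + j;                /* portable fallback, no inline asm */
#endif
    /* Cross-check the asm result against unsigned C arithmetic. */
    assert((unsigned)k == (unsigned)i + (unsigned)j);
    return k;
}
```

Because eax appears in the clobber list, GCC will not place any of the operands in it and will preserve its previous value if that value is still needed.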

In inline asm, local labels should be referenced with a b or f suffix: b for a backward reference (the label is defined earlier) and f for a forward reference (the label is defined later).

For example,

__asm__ __volatile__(
"0:\n\t"
...
"jmp 0b\n\t"
...
"jmp 1f\n\t"
...
"1:\n\t"
...
);
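
A complete loop built from these labels might look like the sketch below. It is an illustration rather than code from the article: the function name sum_to is hypothetical, the "+r" (read-write) modifier is a standard GCC feature the article does not cover, and the C loop is a fallback for non-x86 targets.

```c
#include <assert.h>

/* Returns n + (n-1) + ... + 1, looping with local labels 0 and 1. */
unsigned sum_to(unsigned n)
{
    unsigned total = 0;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ volatile (
        "0:\n\t"               /* loop head */
        "testl %1, %1\n\t"     /* is n zero? */
        "jz 1f\n\t"            /* yes: jump forward to label 1 */
        "addl %1, %0\n\t"      /* total += n */
        "decl %1\n\t"          /* n-- */
        "jmp 0b\n"             /* jump backward to label 0 */
        "1:"                   /* loop exit */
        : "+r" (total), "+r" (n)
        :
        : "cc"
    );
#else
    while (n != 0) {           /* portable fallback, no inline asm */
        total += n;
        n--;
    }
#endif
    assert(n == 0);            /* the counter has run down to zero */
    return total;
}
```

Numeric labels can be defined more than once, so the function stays safe even if the compiler inlines it at several call sites; a named label would be defined twice in that case.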
Here is an example of jumps that mix inline asm with C code (thanks to Srikanth B. R for this tip).

void myfunction(int x, int y)
{
__asm__("start:");
__asm__(...do some comparison...);
__asm__("jl label_1");

callfunction(&x, &y);
__asm__("jmp start");

label_1:
return;
}

--------------------------------------------------------------------------------

External ASM
Blah... okay, fine. Here's a hint: take some of your C/C++ files and compile them with gcc -S file.c, then look at the resulting file.s. The basic structure is as follows:
.file "myasm.s"

.data
somedata: .word 0
...

.text
.globl _myasmfunc
_myasmfunc:
...
ret
Macros, macros! The header file libc/asmdefs.h makes writing asm more convenient. Include it at the top of your assembly source and you can use its macros. Example: myasm.S:
#include <libc/asmdefs.h>

.file "myasm.S"

.data
.align 2
somedata: .word 0
...

.text
.align 4
FUNC(_myexternalasmfunc)
ENTER
movl ARG1, %eax
...
jmp mylabel
...
mylabel:
...
LEAVE
This is a good skeleton for a pure assembly source file.

--------------------------------------------------------------------------------

Other resources
The best way to learn all of this is to look at other people's code. There is some inline asm code in sys/farptr.h. Also, if you run Linux, FreeBSD, etc., somewhere in the kernel source tree (i386/ or something) there are plenty of asm sources. Check the djgpp2/ directory at x2ftp.oulu.fi for graphics and gaming libraries that come with sources.

If you have ASM code that needs to be converted from Intel to at&t syntax, or just want to stick)
