X86 inline assembly in Linux

Bharata B. Rao offers a general guide to the use and structure of x86 inline assembly on the Linux platform. He covers the basics of inline assembly and its various uses, gives some basic inline assembly coding guidelines, and explains examples of inline assembly code in the Linux kernel.

If you are a Linux kernel developer, you probably find yourself coding highly architecture-dependent functions or optimizing a code path fairly often. You typically do this by inserting assembly language instructions into the middle of C statements, a method known as inline assembly. Let's take a look at the specific usage of inline assembly in Linux. (We will limit the discussion to IA32 assembly.)

GNU Assembler

Let's first take a look at the basic assembler syntax used in Linux. GCC, the GNU C Compiler for Linux, uses AT&T assembly syntax. Some basic rules of this syntax are listed below. (The list is by no means complete; it includes only the rules relevant to inline assembly.)

Register name
Register names are prefixed with %. That is, if eax must be used, it is written as %eax.

Source and destination ordering
In every instruction, the source operand comes first and the destination operand follows. This differs from Intel syntax, which places the source operand after the destination operand.

    mov %eax, %ebx -- transfers the contents of eax to ebx


Operand size
Instructions are suffixed with b, w, or l, depending on whether the operand is a byte, word, or long. The suffix is not mandatory; GCC tries to supply the appropriate suffix by reading the operands. However, specifying the suffix manually improves readability and eliminates the possibility of the compiler guessing incorrectly.

    movb %al, %bl   -- Byte move
    movw %ax, %bx   -- Word move
    movl %eax, %ebx -- Longword move


Immediate operand
An immediate operand is specified by using $.

    movl $0xffff, %eax -- moves the value 0xffff into the eax register


Indirect memory reference
Any indirect reference to memory is written using ( ).

    movb (%esi), %al -- transfers the byte in the memory pointed to by esi into the al register
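The syntax rules above can be tried out from C. The following is a minimal sketch, not code from the kernel: the helper names load_byte and immediate_mask are hypothetical, and the "asm" construct they rely on is explained in the sections that follow. It assumes GCC on an x86 target.

```c
/* Hypothetical helper: returns the byte that p points to,
 * using an indirect memory reference through a register. */
static unsigned char load_byte(const unsigned char *p)
{
    unsigned char value;

    __asm__ ("movb (%1), %0"   /* source (%1) comes first, destination %0 second */
             : "=q" (value)    /* "q": a register that has a byte name such as %al */
             : "r" (p),        /* the pointer itself, in any general register */
               "m" (*p));      /* tells GCC that the asm reads the pointed-to byte */
    return value;
}

/* Hypothetical helper: loads an immediate operand, written with $. */
static unsigned int immediate_mask(void)
{
    unsigned int value;

    __asm__ ("movl $0xffff, %0" : "=r" (value));
    return value;
}
```

Calling load_byte(buf) on a byte array returns buf[0], and immediate_mask() returns 0xffff.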


Inline assembly

GCC provides a special construct, "asm", for inline assembly. It has the following format:

GCC's "asm" construct

    asm ( assembler template
        : output operands                  (optional)
        : input operands                   (optional)
        : list of clobbered registers      (optional)
        );


In this format, the assembler template consists of assembly instructions. The input operands are the C expressions that serve as inputs to the instructions. The output operands are the C expressions that receive the output of the assembly instructions.

The significance of inline assembly lies primarily in its ability to operate on C variables and make its output visible through them. Because of this capability, "asm" works as an interface between the assembly instructions and the C program that contains them.

One very basic but important distinction is that basic inline assembly consists only of instructions, whereas extended inline assembly also takes operands. To illustrate, consider the following example:

The basics of inline assembly

    {
        int a = 10, b;
        asm ("movl %1, %%eax;"
             "movl %%eax, %0;"
            : "=r" (b)    /* output */
            : "r" (a)     /* input */
            : "%eax");    /* clobbered register */
    }


In the above example, we use assembly instructions to make the value of "b" equal to that of "a". Note the following:

* "b" is the output operand, referenced by %0, and "a" is the input operand, referenced by %1.
* "r" is a constraint on the operands, which specifies that the variables "a" and "b" are to be stored in registers. Note that the output operand constraint must contain the constraint modifier "=", which marks it as an output operand.
* To use the register %eax inside "asm", %eax must be preceded by one more %, in other words %%eax, because "asm" uses %0, %1, and so on to identify variables. Anything with a single % is treated as an input/output operand, not as a register.
* The clobbered register %eax after the third colon tells GCC that the value of %eax will be modified inside "asm", so GCC will not use this register to store any other value.
* movl %1, %%eax moves the value of "a" into %eax, and movl %%eax, %0 moves the contents of %eax into "b".
* Because "b" is specified as the output operand, it reflects the updated value when execution of "asm" is complete. In other words, the change made to "b" inside "asm" is visible outside "asm".
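The points above can be checked with a complete function. This is a minimal sketch assuming GCC on an x86 target; the function name copy_via_eax is hypothetical, and adjacent string literals are used because a string literal cannot span lines in current C compilers.

```c
/* Hypothetical wrapper around the basic example: copies a to b through %eax. */
static int copy_via_eax(int a)
{
    int b;

    __asm__ ("movl %1, %%eax\n\t"   /* %1 is the input operand a */
             "movl %%eax, %0"       /* %0 is the output operand b */
             : "=r" (b)             /* output */
             : "r" (a)              /* input */
             : "%eax");             /* clobbered register */
    return b;
}
```

A call such as copy_via_eax(10) returns 10, showing that the value written to %0 inside "asm" is visible in the C variable afterwards.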

Now let's take a closer look at the meaning of each item.



Assembler Template

The assembler template is the set of assembly instructions (either a single instruction or a group of instructions) that is inserted into the C program. Either each instruction is enclosed in double quotes, or the entire group of instructions is enclosed in double quotes. Each instruction should end with a delimiter. The valid delimiters are newline (\n) and semicolon (;). '\n' may be followed by a tab (\t) as a formatter to improve the readability of the instructions GCC emits in the assembly file. The instructions refer to the C expressions (which are specified as operands) by the numbers %0, %1, and so on.

If you want to ensure that the compiler does not optimize away the instructions inside "asm", add the keyword "volatile" after "asm". If the program must be ANSI C compatible, use __asm__ and __volatile__ instead of asm and volatile.
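As an illustration of the delimiter and keyword rules, here is a small sketch that puts each instruction in its own string, ends it with "\n\t", and uses the ANSI-compatible spellings. The function name add_asm is hypothetical, and GCC on an x86 target is assumed.

```c
/* Hypothetical helper: adds two ints with a three-instruction template. */
static int add_asm(int a, int b)
{
    int sum;

    __asm__ __volatile__ (
        "movl %1, %%eax\n\t"    /* one instruction per string, "\n\t" as delimiter */
        "addl %2, %%eax\n\t"
        "movl %%eax, %0"        /* the last instruction needs no delimiter */
        : "=r" (sum)
        : "r" (a), "r" (b)
        : "%eax", "cc");        /* addl also changes the condition flags */
    return sum;
}
```

Compiling with gcc -S and looking between the #APP and #NO_APP markers shows the three instructions on separate, tab-indented lines.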


Operands

C expressions serve as operands for the assembly instructions inside "asm". Operands are the main feature of inline assembly in cases where assembly instructions do meaningful work by operating on the C expressions of a C program.

Each operand is specified by an operand constraint string followed by a C expression in parentheses, that is, "constraint" (C expression). The primary function of an operand constraint is to determine the addressing mode of the operand.

Multiple operands can be used in both the input and output sections. Each operand is separated by a comma.

Inside the assembler template, each operand is referenced by number. If there are n operands in total (inputs and outputs together), the first output operand is numbered 0, the numbering continues in increasing order, and the last input operand is numbered n-1. In the GCC releases this article describes, the total number of operands is limited to ten, or to the maximum number of operands in any instruction pattern in the machine description, whichever is greater (more recent GCC documentation states a limit of 30).
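The numbering rule can be seen in a sketch with two outputs and two inputs, so that n = 4 and the operands are %0 through %3. The function name sum_and_diff is hypothetical; GCC on an x86 target is assumed.

```c
/* Hypothetical helper: computes a+b and a-b in one asm statement. */
static void sum_and_diff(int a, int b, int *sum, int *diff)
{
    int s, d;

    __asm__ ("movl %2, %%eax\n\t"   /* %2 is a, the first input */
             "addl %3, %%eax\n\t"   /* %3 is b, the last input (n-1 = 3) */
             "movl %2, %%edx\n\t"
             "subl %3, %%edx\n\t"
             "movl %%eax, %0\n\t"   /* %0 is s, the first output */
             "movl %%edx, %1"       /* %1 is d, the second output */
             : "=r" (s), "=r" (d)   /* outputs are numbered first: 0, 1 */
             : "r" (a), "r" (b)     /* inputs follow: 2, 3 */
             : "%eax", "%edx", "cc");
    *sum = s;
    *diff = d;
}
```

The outputs are written only after both inputs have been read, so it does not matter whether GCC places an output in a register that previously held an input.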


List of clobbered registers

If the instructions in "asm" refer to hardware registers, we can tell GCC that we will use and modify them ourselves. GCC will then not assume that the values it loaded into these registers remain valid. It is generally unnecessary to list input and output registers as clobbered, because GCC knows that "asm" uses them (they are specified explicitly as constraints). However, if the instructions use any other registers, implicitly or explicitly (and those registers appear in neither the input nor the output constraint list), they must be specified in the clobbered list. Clobbered registers are listed after the third colon, with each register name given as a string.

As for keywords, if the instructions modify memory in some unpredictable, non-explicit fashion, the "memory" keyword may be added to the list of clobbered registers. This tells GCC not to keep memory values cached in registers across the instructions.
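A minimal sketch of the "memory" keyword follows, assuming GCC on an x86 target. The asm writes through a pointer that is passed only as an input register, so GCC cannot see which memory changes; the function name store_zero is hypothetical.

```c
/* Hypothetical helper: stores 0 through p without naming *p as an output. */
static void store_zero(int *p)
{
    __asm__ __volatile__ ("movl $0, (%0)"
                          :             /* no output operands */
                          : "r" (p)     /* only the address is an operand */
                          : "memory");  /* memory changes behind GCC's back */
}
```

Without "memory" in the clobbered list, GCC would be free to keep using a copy of *p that it had cached in a register before the "asm".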


Operand Constraints

As mentioned above, each operand in "asm" should be described by an operand constraint string followed by a C expression in parentheses. Operand constraints essentially determine the addressing mode of the operand in the instruction. Constraints can also specify:

* Whether an operand is allowed to be in a register, and which kinds of registers it may be in
* Whether the operand can be a memory reference, and which kinds of addresses may be used in that case
* Whether the operand may be an immediate constant

Constraints may also require two operands to match.


Common constraints

Only a small part of the available operand constraints are commonly used. The constraints and brief descriptions are listed below. For a complete list of operand constraints, see the GCC and GAS manuals.

Register operand constraints (r)
When operands are specified using this constraint, they are stored in general-purpose registers (GPRs). Take the following example:

    asm ("movl %%cr3, %0\n" : "=r" (cr3val));

Here the variable cr3val is kept in a register, the value of %cr3 is copied into that register, and the value of cr3val is then updated in memory from this register. When the "r" constraint is specified, GCC may keep the variable cr3val in any of the available GPRs. To specify a particular register, you must use the specific register constraint that names it:



    a    %eax
    b    %ebx
    c    %ecx
    d    %edx
    S    %esi
    D    %edi


Memory operand constraints (m)
When the operands are in memory, any operations performed on them occur directly in the memory location. This contrasts with register constraints, which first store the value in a register to be modified and then write it back to the memory location. Register constraints are usually used only when they are absolutely necessary for an instruction or when they significantly speed up the process. Memory constraints are most effective when a C variable needs to be updated inside "asm" and you do not want to use a register to hold its value. For example, here the value of idtr is stored in the memory location loc:

    asm ("sidt %0\n" : "=m" (loc));



Matching (digit) constraints
In some cases, a single variable may serve as both the input and the output operand. Such cases can be specified in "asm" by using matching constraints.

    asm ("incl %0" : "=a" (var) : "0" (var));

In this matching constraint example, the register %eax is used as both the input and the output variable. The input var is read into %eax, and the updated %eax is stored in var again after the increment. Here "0" specifies the same constraint as the 0th output variable. That is, it specifies that the output instance of var should be stored in %eax only. This constraint can be used:

* In cases where input is read from a variable, or the variable is modified and the modification is written back to the same variable
* In cases where separate instances of the input and output operands are not necessary

The most important effect of using matching constraints is that they lead to efficient use of the available registers.
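The matching constraint example can be wrapped in a function to observe the effect. This is a minimal sketch assuming GCC on an x86 target; the function name increment is hypothetical.

```c
/* Hypothetical helper: increments var in %eax using a matching constraint. */
static int increment(int var)
{
    __asm__ ("incl %0"
             : "=a" (var)   /* output: var is written from %eax */
             : "0" (var)    /* input: same location as operand 0, so also %eax */
             : "cc");       /* incl changes the condition flags */
    return var;
}
```

Because input and output share %eax, no second register is needed to carry the value in and out of the "asm".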



Examples of common inline assembly usage

The following examples illustrate usage through different operand constraints. There are too many constraints to give an example for each one, so only the most frequently used constraint types are covered here.

"asm" and the register constraint "r"
Let's first take a look at "asm" with the register constraint "r". Our example shows how GCC allocates registers and how it updates the value of output variables.

    int main(void)
    {
        int x = 10, y;

        asm ("movl %1, %%eax;"
             "movl %%eax, %0;"
            : "=r" (y)    /* y is output operand */
            : "r" (x)     /* x is input operand */
            : "%eax");    /* %eax is clobbered register */
    }


In this example, the value of x is copied to y inside "asm". Both x and y are passed to "asm" in registers. The assembly code generated for this example looks like this:

    main:
        pushl %ebp
        movl %esp,%ebp
        subl $8,%esp
        movl $10,-4(%ebp)
        movl -4(%ebp),%edx   /* x=10 is stored in %edx */
    #APP                     /* asm starts here */
        movl %edx, %eax      /* x is moved to %eax */
        movl %eax, %edx      /* y is allocated in %edx and updated */
    #NO_APP                  /* asm ends here */
        movl %edx,-8(%ebp)   /* value of y on the stack is updated with
                                the value in %edx */

When the "r" constraint is used, GCC is free to allocate any register. In our example it chose %edx for storing x. After reading the value of x into %edx, it allocated the same register for y.

Since y is specified in the output operand section, the updated value in %edx is stored in -8(%ebp), the location of y on the stack. If y had been specified in the input section, the value of y on the stack would not be updated, even though it is updated in y's temporary register storage (%edx).

Since %eax is specified in the clobbered list, GCC does not use it anywhere else to store data.

Both input x and output y were allocated in the same %edx register, on the assumption that inputs are consumed before outputs are produced. Note that this may not hold when there are many instructions. To make sure that input and output are allocated in different registers, specify the & constraint modifier. Here is the example with the constraint modifier added.


    int main(void)
    {
        int x = 10, y;

        asm ("movl %1, %%eax;"
             "movl %%eax, %0;"
            : "=&r" (y)   /* y is output operand, note the
                             & constraint modifier */
            : "r" (x)     /* x is input operand */
            : "%eax");    /* %eax is clobbered register */
    }


The following is the assembly code generated for this example, from which it is evident that x and y are stored in different registers across "asm".

    main:
        pushl %ebp
        movl %esp,%ebp
        subl $8,%esp
        movl $10,-4(%ebp)
        movl -4(%ebp),%ecx   /* x, the input, is in %ecx */
    #APP
        movl %ecx, %eax
        movl %eax, %edx      /* y, the output, is in %edx */
    #NO_APP
        movl %edx,-8(%ebp)
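In the example above, the result is the same with or without &, because the input is read before the output is written. The following sketch, with the hypothetical function name twice and assuming GCC on an x86 target, shows a case where the modifier is actually needed: the output is written by the first instruction, before the input has been read.

```c
/* Hypothetical helper: returns 2*x, writing the output before reading the input. */
static int twice(int x)
{
    int y;

    __asm__ ("movl $0, %0\n\t"   /* y is written here, before x has been read */
             "addl %1, %0\n\t"   /* x is read for the first time */
             "addl %1, %0"       /* and read again */
             : "=&r" (y)         /* & keeps y out of any register that holds an input */
             : "r" (x)
             : "cc");
    return y;
}
```

If "=r" were used instead and GCC chose the same register for x and y, the first instruction would overwrite x and the function would return 0.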



Use of specific register constraints

Now let's take a look at how to specify individual registers as constraints for the operands. In the following example, the cpuid instruction takes its input in the %eax register and gives its output in four registers: %eax, %ebx, %ecx, and %edx. The input to cpuid (the variable "op") is passed to "asm" in the eax register, as cpuid expects. The a, b, c, and d constraints are used in the output to collect the values from the four registers, respectively.

    asm ("cpuid"
        : "=a" (_eax),
          "=b" (_ebx),
          "=c" (_ecx),
          "=d" (_edx)
        : "a" (op));


Below you can see the assembly code generated for this (assuming that the _eax, _ebx, and other variables are stored on the stack):

        movl -20(%ebp),%eax   /* store 'op' in %eax -- input */
    #APP
        cpuid
    #NO_APP
        movl %eax,-4(%ebp)    /* store %eax in _eax -- output */
        movl %ebx,-8(%ebp)    /* store other registers in
        movl %ecx,-12(%ebp)      respective output variables */
        movl %edx,-16(%ebp)
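As a usage sketch of the same constraints, the function below queries cpuid leaf 0, which returns the highest supported leaf in %eax and the 12-byte vendor string in %ebx, %edx, and %ecx (in that order). The function name cpuid_vendor is hypothetical; GCC on an x86 target is assumed.

```c
#include <string.h>

/* Hypothetical helper: fills vendor (13 bytes) and returns the highest cpuid leaf. */
static unsigned int cpuid_vendor(char vendor[13])
{
    unsigned int eax_out, ebx_out, ecx_out, edx_out;

    __asm__ __volatile__ ("cpuid"
                          : "=a" (eax_out),
                            "=b" (ebx_out),
                            "=c" (ecx_out),
                            "=d" (edx_out)
                          : "a" (0U));        /* leaf 0 goes in through %eax */

    memcpy(vendor + 0, &ebx_out, 4);          /* vendor string order: ebx, edx, ecx */
    memcpy(vendor + 4, &edx_out, 4);
    memcpy(vendor + 8, &ecx_out, 4);
    vendor[12] = '\0';
    return eax_out;
}
```

Printing the buffer afterwards with printf("%s\n", vendor) shows the vendor identification string of the processor the program runs on.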

The strcpy function can be implemented using the "S" and "D" constraints in the following manner:

    asm ("cld\n"
         "rep\n"
         "movsb"
        : /* no output */
        : "S" (src), "D" (dst), "c" (count));

The source pointer src is put into %esi by using the "S" constraint, and the destination pointer dst is put into %edi by using the "D" constraint. The count value is put into %ecx because the rep prefix needs it there.
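The fragment above lists %esi, %edi, and %ecx as plain inputs even though rep movsb changes all three. GCC's documentation asks that input-only operands not be modified, so a version intended for current compilers would declare them as read-write operands with "+" and add "memory" to the clobbered list. The following is a sketch under those assumptions; the function name copy_bytes is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: copies count bytes from src to dst with rep movsb. */
static void copy_bytes(void *dst, const void *src, size_t count)
{
    __asm__ __volatile__ ("cld\n\t"
                          "rep movsb"
                          : "+S" (src),     /* source pointer, advanced by the copy */
                            "+D" (dst),     /* destination pointer, advanced as well */
                            "+c" (count)    /* count, decremented to zero by rep */
                          :
                          : "memory", "cc"); /* the copy writes memory GCC cannot see */
}
```

For example, copy_bytes(dst, "hello", 6) copies the five characters and the terminating null byte into dst.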

You can see one more constraint below. It uses the two registers %eax and %edx to combine two 32-bit values and produce a 64-bit value:

    #define rdtscll(val) \
        __asm__ __volatile__ ("rdtsc" : "=A" (val))

The generated assembly looks like this (if val has 64 bits of memory space):

    #APP
        rdtsc
    #NO_APP
        movl %eax,-8(%ebp)   /* As a result of the A constraint, */
        movl %edx,-4(%ebp)   /* %eax and %edx serve as outputs */

Note here that the values in %edx:%eax serve as the 64-bit output.
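The "A" constraint describes the %edx:%eax pair only on 32-bit x86. On x86-64, GCC treats "A" as "either %rax or %rdx", so the macro above would not collect both halves there. A sketch that behaves the same on both targets reads the halves separately and combines them in C; the function name read_tsc is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: returns the 64-bit time-stamp counter. */
static uint64_t read_tsc(void)
{
    uint32_t lo, hi;

    __asm__ __volatile__ ("rdtsc"
                          : "=a" (lo),    /* low 32 bits arrive in %eax */
                            "=d" (hi));   /* high 32 bits arrive in %edx */
    return ((uint64_t)hi << 32) | lo;
}
```

Two successive calls can be subtracted to get a rough cycle count for the code between them.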



Use of matching constraints

The code for a system call with four parameters is shown below:

    #define _syscall4(type,name,type1,arg1,type2,arg2,type3,arg3,type4,arg4) \
    type name (type1 arg1, type2 arg2, type3 arg3, type4 arg4) \
    { \
    long __res; \
    __asm__ volatile ("int $0x80" \
        : "=a" (__res) \
        : "0" (__NR_##name),"b" ((long)(arg1)),"c" ((long)(arg2)), \
          "d" ((long)(arg3)),"S" ((long)(arg4))); \
    __syscall_return(type,__res); \
    }

In the above example, the four arguments to the system call are put into %ebx, %ecx, %edx, and %esi by using the constraints b, c, d, and S. Note that the "=a" constraint is used in the output so that the return value of the system call, which is in %eax, is put into the variable __res. By using the matching constraint "0" as the first operand constraint in the input section, the syscall number __NR_##name is put into %eax and serves as the input to the system call. Thus %eax serves here as both an input and an output register, and no separate registers are needed for this purpose. Note also that the input (the syscall number) is consumed (used) before the output (the return value of the syscall) is produced.


Use of memory operand constraints

Consider the following atomic decrement operation:

    __asm__ __volatile__(
        "lock; decl %0"
        : "=m" (counter)
        : "m" (counter));

The assembly generated for it looks similar to this:

    #APP
        lock
        decl -24(%ebp)   /* counter is modified in its memory location */
    #NO_APP

You might think of using a register constraint for counter here. If you did, the value of counter would first have to be copied into a register, decremented, and then written back to its memory location. That would defeat the whole purpose of locking and atomicity, which clearly shows the necessity of using the memory constraint.
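The same operation can be written as a function that takes a pointer. This sketch uses the "+" modifier, which GCC documents as marking an operand that is both read and written, in place of the separate "=m" and "m" operands; the function name atomic_dec is hypothetical, and GCC on an x86 target is assumed.

```c
/* Hypothetical helper: atomically decrements *counter in place. */
static void atomic_dec(int *counter)
{
    __asm__ __volatile__ ("lock; decl %0"
                          : "+m" (*counter)   /* read and written directly in memory */
                          :
                          : "cc");            /* decl changes the condition flags */
}
```

After int c = 5; atomic_dec(&c); the variable c holds 4, and the decrement never passes through a general register.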


Using clobbered registers

Consider an elementary implementation of memory copy.

    asm ("movl $count, %%ecx;"
         "up: lodsl;"
         "stosl;"
         "loop up;"
        :                        /* no output */
        : "S" (src), "D" (dst)   /* input */
        : "%ecx", "%eax");       /* clobbered list */

While lodsl and stosl use %eax implicitly, %ecx is loaded with count explicitly. GCC does not know this unless we tell it, which we do by including %eax and %ecx in the clobbered set. Until then, GCC assumes that %eax and %ecx are free and may decide to use them to store other data. Note that %esi and %edi are used by "asm" but are not in the clobbered list, because they have already been declared in the input operand list. The bottom line is that if a register is used inside "asm" (implicitly or explicitly) and is present in neither the input nor the output operand list, it must be listed as a clobbered register.
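A self-contained variant of this copy loop is sketched below, assuming GCC on an x86 target. It differs from the fragment above in a few labeled ways: the count is passed as an operand in %ecx rather than as the symbol count, the registers that the instructions modify are declared read-write with "+", and a numeric local label replaces "up". %eax still has to appear in the clobbered list, because lodsl and stosl use it implicitly. The function name copy_longs is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copies count 32-bit words from src to dst. */
static void copy_longs(uint32_t *dst, const uint32_t *src, size_t count)
{
    if (count == 0)
        return;              /* "loop" decrements before testing, so 0 would wrap */

    __asm__ __volatile__ ("cld\n\t"
                          "1:\n\t"
                          "lodsl\n\t"      /* loads (%esi) into %eax, advances %esi */
                          "stosl\n\t"      /* stores %eax to (%edi), advances %edi */
                          "loop 1b"        /* decrements %ecx, jumps back while nonzero */
                          : "+S" (src), "+D" (dst), "+c" (count)
                          :
                          : "%eax", "memory", "cc");  /* %eax is used only implicitly */
}
```

Leaving "%eax" out of the clobbered list would allow GCC to keep a live value in that register across the "asm", where lodsl would silently overwrite it.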


Conclusion

On the whole, inline assembly is a large topic, and it provides many features that are not even touched on here. However, with a grasp of the basic material in this article, you should be able to start writing inline assembly of your own.


References

* For more information, see the original article on the developerWorks global site.

* See the Using and Porting the GNU Compiler Collection (GCC) manual.

* Refer to the GNU Assembler (GAS) manual.

* Read Brennan's Guide to Inline Assembly.


About the author

Bharata B. Rao has a bachelor's degree in electronics and communication engineering from Mysore University, India. He has been working for IBM Global Services, India, since 1999. He is a member of the IBM Linux Technology Center, where he concentrates primarily on Linux RAS (reliability, availability, and serviceability). Other areas of interest include operating system internals and processor architecture. You can contact him at rbharata@in.ibm.com.
