Using Inline Assembly in Visual C++


I. Advantages
 

Inline assembly lets you embed assembly language instructions in C/C++ code without separate assembly and link steps. In Visual C++, the inline assembler is built into the compiler, so you do not need a standalone assembler such as MASM. This article uses Visual Studio .NET 2003 as the reference environment for describing inline assembly in Visual C++ (earlier versions may differ in some details).

Inline assembly code can use C/C++ variables and functions, so it integrates easily into C/C++ code. It can perform tasks that are cumbersome or impossible in C/C++ alone.

Uses of inline assembly include:

- writing specific functions in assembly language;
- writing speed-critical code;
- accessing hardware directly in device drivers;
- writing the prolog and epilog code of naked functions.

 

 

II. Keywords

The __asm keyword invokes the inline assembler. It can appear anywhere a C/C++ statement is allowed. Here are some examples:

- A simple __asm block:

__asm
{
    mov al, 2
    mov dx, 0xD007
    out dx, al
}

- The __asm keyword placed before each assembly instruction:

__asm mov al, 2
__asm mov dx, 0xD007
__asm out dx, al

- Because the __asm keyword is a statement separator, several assembly instructions can be placed on the same line:

__asm mov al, 2 __asm mov dx, 0xD007 __asm out dx, al

The first form fits C/C++ style best: it clearly separates the assembly code from the C/C++ code and avoids repeating the __asm keyword, so it is the recommended form.

Unlike braces in C/C++, the braces of an __asm block do not affect the scope of C/C++ variables. __asm blocks can also be nested, and nesting does not affect variable scope either.

For compatibility with earlier versions of Visual C++, _asm and __asm have the same meaning. Visual C++ also accepts the Standard C++ asm keyword, but it generates no instructions; it is recognized only so that the compiler does not report an error. To use inline assembly, you must use the __asm keyword rather than asm.

 

 

III. Assembly Language

1. Instruction Set

The inline assembler supports the full Intel Pentium 4 and AMD Athlon instruction sets. Instructions for other processors can be generated with the _emit pseudo-instruction (described below).

2. MASM Expressions

Inline assembly code can use any MASM expression (a MASM expression is a combination of operators and operands that evaluates to a value or an address).

3. Data Directives and Operators

Although an __asm block can refer to C/C++ data types and objects, it cannot define data objects with MASM directives and operators. Specifically, the MASM definition directives (DB, DW, DD, DQ, DT, and DF) and the DUP and THIS operators are not allowed in an __asm block. MASM structures and records are also unavailable: the inline assembler does not accept STRUC, RECORD, WIDTH, or MASK.

4. EVEN and ALIGN Directives

Although the inline assembler does not support most MASM directives, it does support EVEN and ALIGN. These directives insert NOP (no-operation) instructions into the assembly code where needed to align labels on specific boundaries, which lets some processors fetch instructions more efficiently.

5. MASM Macro Directives

The inline assembler is not a macro assembler. It cannot use the MASM macro directives (MACRO, REPT, IRC, IRP, and ENDM) or the macro operators (<>, !, &, %, and .TYPE).

6. Segments

Segments must be specified by register rather than by name (the segment name "_TEXT" is invalid). In addition, segment overrides must be written explicitly, as in ES:[EBX].

7. Type and Variable Size

In inline assembly, the LENGTH, SIZE, and TYPE operators return the size of C/C++ variables and types.

- The LENGTH operator returns the number of elements in a C/C++ array (1 if the variable is not an array).
- The SIZE operator returns the size of a C/C++ variable (the product of its LENGTH and TYPE).
- The TYPE operator returns the size of a C/C++ type or variable (for an array, the size of a single element).

For example, suppose the program defines an array of 8 integers:

int iArray[8];

The following table shows the values of iArray and its elements as obtained by assembly and C expressions:

 

__asm            C                                      Size
LENGTH iArray    sizeof(iArray) / sizeof(iArray[0])     8
SIZE iArray      sizeof(iArray)                         32
TYPE iArray      sizeof(iArray[0])                      4
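The C column of the table can be checked directly. The following is a minimal sketch in portable C; the function names array_length, array_size, element_type_size, and check_size_relation are hypothetical helpers introduced here for illustration, and the values 32 and 4 in the table assume a 4-byte int, as on 32-bit Visual C++.

```c
#include <assert.h>
#include <stddef.h>

/* The array from the example above. */
int iArray[8];

/* C counterpart of LENGTH iArray: number of elements. */
size_t array_length(void)
{
    return sizeof(iArray) / sizeof(iArray[0]);
}

/* C counterpart of SIZE iArray: total size in bytes. */
size_t array_size(void)
{
    return sizeof(iArray);
}

/* C counterpart of TYPE iArray: size of one element in bytes. */
size_t element_type_size(void)
{
    return sizeof(iArray[0]);
}

/* SIZE is the product of LENGTH and TYPE. */
void check_size_relation(void)
{
    assert(array_size() == array_length() * element_type_size());
}
```

Calling check_size_relation() confirms the relation SIZE = LENGTH * TYPE for this array, whatever the size of int on the target platform.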
 

 

8. Comments

Assembly-language comments, introduced by a semicolon (;), can be used in inline assembly. For example:

__asm mov eax, OFFSET pbBuff ; Load address of pbBuff

Because a C/C++ macro expands into a single logical line, assembly-style comments inside macros can cause confusion. To avoid this, inline assembly also accepts C/C++-style comments.

9. The _emit Pseudo-Instruction

The _emit pseudo-instruction is similar to DB in MASM, but it defines only one byte at a time, at the current location in the current code segment (.text). For example:

__asm
{
    jmp _CodeLabel

    _emit 0x00      ; data mixed into the code segment
    _emit 0x01

_CodeLabel:         ; code starts here
    _emit 0x90      ; NOP instruction
}

10. Register Usage

In general, you cannot assume that a register holds any particular value when an __asm block begins, and register values are not guaranteed to be preserved from one __asm block to the next.

If a function is declared as __fastcall, its arguments are passed in registers rather than on the stack. This causes problems for __asm blocks, because the function has no way to tell which argument is in which register. If the function happens to receive an argument in EAX and immediately stores something else in EAX, the original argument is lost. In addition, ECX must be preserved in any function declared as __fastcall. To avoid such conflicts, do not use the __fastcall calling convention for functions that contain an __asm block.

* Tip: You do not need to preserve the EAX, EBX, ECX, EDX, ESI, and EDI registers when you use them. If you use DS, SS, SP, BP, or the flags register, however, you should save them with PUSH and restore them afterward.

* Tip: If your code changes the direction flag with STD or CLD, it must restore the flag to its original value.

 

 

IV. Using C/C++ Elements

1. Available C/C++ Elements

C/C++ and assembly language can be mixed freely. Inline assembly can use C/C++ variables and many other C/C++ elements, including:

- symbols, including labels, variables, and function names;
- constants, including symbolic constants and enum members;
- macros and preprocessor directives;
- comments, both /* */ and //;
- type names, wherever a MASM type would be valid;
- typedef names, generally used with operators such as PTR and TYPE or to specify structure or union members.

Inline assembly accepts both C/C++ and assembly-language radix notation. For example, 0x100 and 100h are equivalent.

2. Using Operators

Inline assembly cannot use C/C++-specific operators such as <<. However, operators shared by C/C++ and MASM (such as * and []) are interpreted as assembly-language operators. For example:

int iArray[10];

__asm mov iArray[6], bx ; Store BX at iArray + 6 (not scaled)
iArray[6] = 0;          // Store 0 at iArray + 24 (scaled by the 4-byte int size)

* Tip: In inline assembly, you can use the TYPE operator to get the same scaling as in C/C++. For example, the following two statements are equivalent:

__asm mov iArray[6 * TYPE int], 0 ; Store 0 at iArray + 24
iArray[6] = 0;                    // Store 0 at iArray + 24
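To make the scaling difference concrete, here is a small sketch in portable C. The helper names unscaled_address, scaled_address, and byte_offset are assumptions made for this illustration; they mimic what the assembly operand iArray[6] and the C expression iArray[6] refer to.

```c
#include <assert.h>
#include <stddef.h>

int iArray[10];

/* What the assembly operand iArray[n] addresses: base + n bytes. */
char *unscaled_address(size_t n)
{
    assert(n < sizeof(iArray));
    return (char *)iArray + n;
}

/* What the C expression iArray[n] addresses: base + n * sizeof(int) bytes. */
char *scaled_address(size_t n)
{
    assert(n < sizeof(iArray) / sizeof(iArray[0]));
    return (char *)&iArray[n];
}

/* Byte offset of an address from the start of iArray. */
size_t byte_offset(const char *p)
{
    return (size_t)(p - (const char *)iArray);
}
```

With these helpers, byte_offset(unscaled_address(6)) is 6, while byte_offset(scaled_address(6)) is 6 * sizeof(int), which is the offset that 6 * TYPE int produces in an __asm block.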

 

3. Using C/C++ Symbols

An __asm block can refer to any C/C++ symbol in scope, including variable names, function names, and labels. However, it cannot call C++ member functions.

The following restrictions apply to C/C++ symbols in inline assembly:

- Each assembly statement can contain only one C/C++ symbol. Multiple symbols can appear in the same instruction only within LENGTH, TYPE, or SIZE expressions.
- Functions referenced in an __asm block must be declared beforehand. Otherwise the compiler cannot tell function names from labels in the __asm block.
- C/C++ symbols whose names are MASM reserved words (case-insensitive) cannot be used in an __asm block. MASM reserved words include instruction names (such as PUSH) and register names (such as ESI).
- Structure and union tags are not recognized in an __asm block.

4. Accessing C/C++ Data

One of the great conveniences of inline assembly is that it can refer to C/C++ variables by name. For example, if the C/C++ variable iVar is in scope:

__asm mov eax, iVar ; Store the value of iVar in EAX

If a class, structure, or union member has a unique name, an __asm block can refer to it by the member name alone (omitting the variable or typedef name before the "." operator). If the member name is not unique, you must place a variable or typedef name immediately before the "." operator. For example, the following two structures both have a member named SameName:

struct FIRST_TYPE
{
    char *pszWeasel;
    int SameName;
};

struct SECOND_TYPE
{
    int iWonton;
    long SameName;
};

If the variables are declared as follows:

struct FIRST_TYPE ftTest;
struct SECOND_TYPE stTemp;

then every reference to the SameName member must include the variable name, because SameName is not unique. The pszWeasel member, on the other hand, has a unique name, so it can be referenced by its member name alone:

__asm
{
    mov ebx, OFFSET ftTest
    mov ecx, [ebx]ftTest.SameName ; "ftTest" is required
    mov esi, [ebx].pszWeasel      ; "ftTest" can be omitted
}

* Tip: Omitting the variable name is only a notational convenience; the generated instructions are the same.
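In C terms, a reference such as [ebx].pszWeasel resolves to the base address held in EBX plus the member's offset within the structure. The sketch below illustrates this with offsetof in portable C; read_same_name and read_weasel are hypothetical helpers, not part of the original example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct FIRST_TYPE
{
    char *pszWeasel;
    int SameName;
};

/* Counterpart of "mov ecx, [ebx]ftTest.SameName": base plus member offset. */
int read_same_name(const struct FIRST_TYPE *base)
{
    int value;
    assert(base != NULL);
    memcpy(&value, (const char *)base + offsetof(struct FIRST_TYPE, SameName),
           sizeof value);
    return value;
}

/* Counterpart of "mov esi, [ebx].pszWeasel": base plus member offset. */
char *read_weasel(const struct FIRST_TYPE *base)
{
    char *value;
    assert(base != NULL);
    memcpy(&value, (const char *)base + offsetof(struct FIRST_TYPE, pszWeasel),
           sizeof value);
    return value;
}
```

Whether the variable name is written or omitted, the address computed is the same: the base pointer plus a fixed offset known at compile time.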

 

5. Writing Functions with Inline Assembly

When a function is written with inline assembly, passing arguments and returning a value is very easy. The following example compares the same function written in standalone assembly and in inline assembly:

; PowerAsm.asm
; Compute the power of an integer

        PUBLIC GetPowerAsm
_TEXT   SEGMENT WORD PUBLIC 'CODE'
GetPowerAsm PROC
        push ebp                ; Save EBP
        mov ebp, esp            ; Move ESP into EBP so we can refer
                                ; to arguments on the stack
        mov eax, [ebp+8]        ; Get first argument
        mov ecx, [ebp+12]       ; Get second argument
        shl eax, cl             ; EAX = EAX * (2 ^ CL)
        pop ebp                 ; Restore EBP
        ret                     ; Return with result in EAX
GetPowerAsm ENDP
_TEXT   ENDS
        END

C/C++ functions generally pass arguments on the stack, so the function above has to access its arguments by stack position (MASM and some other assemblers also allow stack arguments and local stack variables to be accessed by name).

The following program does the same with inline assembly:

 

// PowerC.c

#include <stdio.h>

int GetPowerC(int iNum, int iPower);

int main()
{
    printf("3 times 2 to the power of 5 is %d\n", GetPowerC(3, 5));
    return 0;
}

int GetPowerC(int iNum, int iPower)
{
    __asm
    {
        mov eax, iNum    ; Get first argument
        mov ecx, iPower  ; Get second argument
        shl eax, cl      ; EAX = EAX * (2 to the power of CL)
    }
    // Return with result in EAX
}

 

The inline-assembly GetPowerC function can refer to its parameters by name. Because GetPowerC does not execute a C return statement, the compiler issues a warning, which can be disabled with #pragma warning.
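One way to silence that warning is sketched below. It assumes the warning in question is C4035 ("no return value") and wraps the function in #pragma warning(push) and #pragma warning(pop) so that the setting does not leak into other code. The #else branch is a plain C fallback added here only so that the sketch also builds with compilers that lack __asm; it is not part of the original program.

```c
#include <assert.h>

#if defined(_MSC_VER) && defined(_M_IX86)

#pragma warning(push)
#pragma warning(disable : 4035)  /* C4035: no return value */

int GetPowerC(int iNum, int iPower)
{
    __asm
    {
        mov eax, iNum    ; Get first argument
        mov ecx, iPower  ; Get second argument
        shl eax, cl      ; EAX = EAX * (2 to the power of CL)
    }
    /* Return with result in EAX */
}

#pragma warning(pop)

#else

/* Fallback for compilers without __asm (an assumption of this sketch). */
int GetPowerC(int iNum, int iPower)
{
    assert(iPower >= 0 && iPower < 32);
    return (int)((unsigned)iNum << iPower);
}

#endif
```

Restoring the warning state with pop keeps the check active for ordinary functions, where a missing return statement is usually a real mistake.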

 

One use of inline assembly is writing the prolog and epilog code of naked functions. For ordinary functions, the compiler automatically generates the prolog (setting up the frame pointer, allocating local variables, and so on) and the epilog (restoring the stack and returning). With inline assembly we can write naked functions ourselves, but then we must provide the prolog and epilog code. For example:

void __declspec(naked) MyNakedFunction()
{
    // Naked functions must provide their own prolog.
    __asm
    {
        push ebp
        mov ebp, esp
        sub esp, __LOCAL_SIZE
    }

    .
    .
    .

    // And we must provide the epilog.
    __asm
    {
        mov esp, ebp
        pop ebp
        ret
    }
}

 

6. Calling C/C++ Functions

When inline assembly calls a C/C++ function declared as __cdecl (the default), the caller must remove the arguments from the stack after the call. The following example calls a C/C++ function:

#include <stdio.h>

char szFormat[] = "%s %s\n";
char szHello[] = "Hello";
char szWorld[] = "world";

int main()
{
    __asm
    {
        mov eax, OFFSET szWorld
        push eax
        mov eax, OFFSET szHello
        push eax
        mov eax, OFFSET szFormat
        push eax
        call printf

        // Three arguments were pushed, so adjust the stack after the call.
        add esp, 12
    }
    return 0;
}

* Tip: Arguments are pushed onto the stack from right to left.

When calling a __stdcall function, you do not need to clean up the stack yourself, because the function returns with RET n, which removes the arguments automatically. Most Windows API functions use the __stdcall calling convention (with a few exceptions such as wsprintf). The following example calls the MessageBox function:

#include <windows.h>

TCHAR g_tszAppName[] = TEXT("API Test");

int main()
{
    TCHAR tszHello[] = TEXT("Hello, world!");

    __asm
    {
        push MB_OK OR MB_ICONINFORMATION
        push OFFSET g_tszAppName  ; OFFSET is used for global variables
        lea eax, tszHello         ; LEA is used for local variables
        push eax
        push 0

        // Note: this is not a direct CALL MessageBox; the call goes
        // through the relocated address of the function.
        call DWORD PTR [MessageBox]
    }
    return 0;
}

* Tip: In an __asm block you can access C++ member variables freely, but you cannot call C++ member functions.

 

7. Defining __asm Blocks as C/C++ Macros

C/C++ macros offer a convenient way to insert assembly code into source code. Keep in mind, however, that a macro expands into a single logical line. To avoid problems, follow these rules when writing such macros:

- enclose the __asm block in braces;
- put the __asm keyword in front of each assembly instruction;
- use classic C-style comments ("/* comment */"), not assembly-style comments ("; comment") or single-line C/C++ comments ("// comment").

For example, the following defines a simple macro:

#define PORTIO __asm      \
/* Port output */         \
{                         \
    __asm mov al, 2       \
    __asm mov dx, 0xD007  \
    __asm out dx, al      \
}

At first glance, the last three __asm keywords seem redundant. They are needed, however, because the macro expands into a single line:

__asm /* Port output */ { __asm mov al, 2 __asm mov dx, 0xD007 __asm out dx, al }

The expanded code shows that the third and fourth __asm keywords are required as statement separators. Within an __asm block, only the __asm keyword and the newline character are recognized as statement separators, and a block defined as a macro is one logical line, so the __asm keyword must precede every instruction.

The braces are also required. Without them, the compiler cannot tell where the assembly code ends, and C/C++ statements that follow the __asm block on the same line would be treated as assembly instructions.

Likewise, because of macro expansion, assembly-style comments ("; comment") and single-line C/C++ comments ("// comment") can cause errors. To avoid them, use classic C-style comments ("/* comment */") when defining an __asm block as a macro.

Like a C/C++ macro, an __asm block written as a macro can take arguments. Unlike a C/C++ macro, however, an __asm macro cannot return a value, so it cannot be used in a C/C++ expression.

Take care not to invoke macros of this kind indiscriminately. For example, invoking an assembly-language macro in a function declared as __fastcall may produce unexpected results (see the discussion of register usage above).
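As a rough illustration of an __asm macro that takes arguments, the sketch below stores its result in a variable passed as the first argument, since the macro itself cannot yield a value. The names SHL_INTO and ShiftDemo are hypothetical and exist only for this example, and the #else branch is a plain C stand-in, assumed here so that the sketch also compiles where __asm is unavailable.

```c
#include <assert.h>

#if defined(_MSC_VER) && defined(_M_IX86)

/* dst, src, and count must each be a single C variable name. */
#define SHL_INTO(dst, src, count) __asm  \
/* dst = src << count */                 \
{                                        \
    __asm mov eax, src                   \
    __asm mov ecx, count                 \
    __asm shl eax, cl                    \
    __asm mov dst, eax                   \
}

#else

/* Plain C stand-in with the same statement-like usage. */
#define SHL_INTO(dst, src, count) \
    do { (dst) = (int)((unsigned)(src) << (count)); } while (0)

#endif

/* The macro is used as a statement; the result comes back through iResult. */
int ShiftDemo(int iValue, int iCount)
{
    int iResult = 0;
    assert(iCount >= 0 && iCount < 32);
    SHL_INTO(iResult, iValue, iCount);
    return iResult;
}
```

Wrapping the macro in an ordinary function such as ShiftDemo is one way to get a value that can be used inside a C/C++ expression.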

 

8. Jumps

You can use goto in C/C++ to jump to a label inside an __asm block, and assembly instructions can jump to labels inside or outside the __asm block. Labels defined in an __asm block are not case-sensitive (nor are instructions, directives, and so on). For example:

void MyFunction()
{
    goto C_Dest;    /* correct */
    goto c_dest;    /* error */

    goto A_Dest;    /* correct */
    goto a_dest;    /* correct */

    __asm
    {
        jmp C_Dest  ; correct
        jmp c_dest  ; error

        jmp A_Dest  ; correct
        jmp a_dest  ; correct

    a_dest:         ; __asm label
    }

C_Dest:             /* C/C++ label */
    return;
}

Do not use a function name as a label; otherwise the jump goes to the function rather than to the label. For example, because exit is a C/C++ function, the following jump does not go to the exit label:

    ; Error: a function name is used as the label
    jne exit
    .
    .
    .
exit:
    .
    .
    .

The dollar sign ($) denotes the current instruction location and is often used in conditional jumps. For example:

    jne $+5         ; the next instruction is 5 bytes long
    jmp _Label
    nop             ; $+5, the jump lands here
    .
    .
    .
_Label:
    .
    .
    .

 

 

V. Using Standalone Assembly in Visual C++ Projects

Inline assembly code is not easily portable. If your program is meant to run on different types of machines (such as x86 and Alpha), you may want to place the machine-specific code in separate modules. In that case you can use MASM (Microsoft Macro Assembler), which also offers more convenient macro and data directives.

The following briefly describes how to have Visual Studio .NET 2003 invoke MASM to assemble a standalone assembly file.

Add the .asm file, written according to MASM's requirements, to the Visual C++ project. In Solution Explorer, right-click the file and choose "Properties". In the Properties dialog box, select "Custom Build Step" and set the following items:

Command Line: ml.exe /nologo /c /coff "-Fo$(IntDir)\$(InputName).obj" "$(InputPath)"
Outputs: $(IntDir)\$(InputName).obj

To generate debugging information, add the /Zi option to the command line. You can also generate .lst and .sbr files as needed.

If you want to call the Windows API from an assembly file, you can download the MASM32 package from the Internet (it includes the MASM assembler, a comprehensive set of Windows API include and library files, useful macros, and a large number of Win32 assembly examples). In that case, add the "/I X:\masm32\include" option to the command line to specify the path of the Windows API assembly include files (.inc). The latest version of the package can be downloaded from the MASM32 home page: http://www.masm32.com

 
