Part 1: ARM assembly syntax in Linux
Although it is convenient to write programs in C or C++ under Linux, assembly source is still used for the most basic system initialization, such as initializing the stack pointer, setting up the page table, and operating the ARM coprocessor. After initialization, execution can jump to C code. Note that the GNU assembler follows AT&T assembly syntax conventions; the relevant documentation can be downloaded from the GNU website (www.gnu.org).
I. Assembly line structure in Linux
Any assembly line is structured as follows:
[<label>:] [<instruction or directive>] @ comment
In Linux ARM assembly, any identifier ending with a colon is treated as a label; it does not have to be at the beginning of a line.
[Example 1] Define an "add" function that returns the sum of its two parameters.
.section .text, "x"
.global add @ give the symbol add external linkage
add:
add r0, r0, r1 @ add input arguments
mov pc, lr @ return from subroutine
@ end of program
II. Labels in the Linux assembler
A label may consist only of the characters a-z, A-Z, 0-9, ".", and "_". A label that is a number from 0 to 9 is a local label. Local labels can be defined repeatedly and are referenced as follows:
<label>f: refers forward, to the next definition of the label after the point of reference
<label>b: refers backward, to the most recent definition of the label before the point of reference
[Example 2] Using a local label in a loop
1:
subs r0, r0, #1 @ each pass makes r0 = r0 - 1
bne 1b @ jump back to label 1 while r0 is not 0
A label stands for the address at which it is defined, so it can be used as the name of a variable or a function.
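As a further illustration of forward and backward references, the sketch below defines the local label 1 twice in one routine. The function name count_down is made up for this example, and the routine assumes that r0 holds a positive count on entry.

```asm
        .text
        .global count_down      @ hypothetical function name
count_down:
1:
        subs r0, r0, #1         @ r0 = r0 - 1, updating the flags
        beq 1f                  @ forward: the next "1:" after this line
        b 1b                    @ backward: the most recent "1:" before this line
1:
        mov pc, lr              @ return once r0 has reached 0
```

Because each reference is resolved relative to its own position, the same number can be reused without name clashes.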
III. Sections in the Linux assembler
(1) The .section pseudo operation
You can use the .section pseudo operation to define a custom section. The format is as follows:
.section section_name [, "flags" [, %type [, flag_specific_arguments]]]
Each section starts at its name and ends at the name of the next section or at the end of the file. The predefined sections have default flags that the linker recognizes. (This is the same as AREA in armasm.)
The flags allowed for sections in ELF format are listed below:
<flag> meaning
a allocatable section
w writable section
x executable section
[Example 3] Defining a section
.section .mysection @ custom data section named ".mysection"
.align 2
strtemp:
.ascii "Temp string\n\0"
(2) Predefined section names
.text @ code section
.data @ initialized data section
.bss @ uninitialized data section
.sdata @
.sbss @
Note that in the source program the .bss section should come before .text.
IV. Defining the entry point
The default entry point of an assembly program is the _start label. You can use the ENTRY directive in the linker script to specify a different entry point.
[Example 4] Defining the entry point
.section .data
<initialized data here>
.section .bss
<uninitialized data here>
.section .text
.globl _start
_start:
<instruction code goes here>
V. Macro definitions in the Linux assembler
The format is as follows:
.macro macro_name parameter_list @ the .macro pseudo operation defines a macro
macro body
.endm @ .endm marks the end of the macro
If a macro takes a parameter, prefix the parameter name with "\" wherever it is used in the macro body. Macro parameters can also be given default values.
You can use the .exitm pseudo operation to leave the macro early.
[Example 5] Macro definition
.macro shiftleft a, b
.if \b < 0
mov \a, \a, asr #-\b
.exitm
.endif
mov \a, \a, lsl #\b
.endm
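The default values mentioned above are written as name=value in the parameter list. The following is a minimal sketch; the macro name addconst and the function name bump are invented for this illustration.

```asm
        .macro addconst reg, value=1    @ "value" defaults to 1 when omitted
        add \reg, \reg, #\value
        .endm

        .text
        .global bump                    @ hypothetical function name
bump:
        addconst r0                     @ expands to: add r0, r0, #1
        addconst r0, 4                  @ expands to: add r0, r0, #4
        mov pc, lr
```

A caller that leaves out the second argument gets the default, which keeps the common case short.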
VI. Constants in the Linux assembler
(1) A decimal number starts with a non-zero digit, for example 123 and 9876;
(2) A binary number starts with 0b; the letter may also be uppercase (0B);
(3) An octal number starts with 0, for example 0456 and 0123;
(4) A hexadecimal number starts with 0x, for example 0xabcd and 0x123f;
(5) String constants must be enclosed in double quotation marks; escape characters can be used inside them, for example "You are welcome!\n";
(6) The current address is represented by "."; this symbol can be used in assembly source to denote the address of the current instruction;
(7) Expressions: an expression in assembly source can use constants or numeric values. "-" means negation, "~" means bitwise complement, and "<>" means not equal. Other operators, such as +, -, *, /, %, <, <<, >, >>, |, &, ^, !, ==, >=, <=, &&, and ||, are used much as in C.
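The constant forms listed above can be combined in data definitions. The sketch below is only an illustration; the label names nums, msg, and flags and the symbol msg_len are assumptions made for this example.

```asm
        .data
nums:
        .long 123                   @ decimal
        .long 0b1010                @ binary (decimal 10)
        .long 0456                  @ octal (decimal 302)
        .long 0xabcd                @ hexadecimal
msg:
        .ascii "You are welcome!\n" @ string with an escape character
        .equ msg_len, . - msg       @ "." is the current address, so this is the string length
flags:
        .long (1 << 4) | 0x3        @ expression, evaluates to 0x13
```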
VII. Common pseudo operations for ARM assembly in Linux
Some of the pseudo operations, several of which have already appeared above, are:
Data definition pseudo operations: .byte, .short, .long, .quad, .float, .string/.asciz/.ascii;
Repetition pseudo operation: .rept;
Assignment statement: .equ/.set;
Function definition;
Alignment pseudo operation: .align;
End-of-source-file pseudo operation: .end;
.include pseudo operation;
.if pseudo operation;
.global/.globl pseudo operation;
.type pseudo operation;
Listing control statements;
In addition to the common gas pseudo operations, the following are specific to ARM: .req, .unreq, .code, .thumb, .thumb_func, .thumb_set, .ltorg, .pool
1. Data definition pseudo operations
(1) .byte: defines single bytes, for example: .byte 34, 072, 0b01, 's';
(2) .short: defines 2-byte data, for example: .short 0x1234, 60000;
(3) .long: defines 4-byte data, for example: .long 0x12345678, 23876565;
(4) .quad: defines 8-byte data, for example: .quad 0x1234567890abcd;
(5) .float: defines floating-point numbers, for example:
.float 0f-314159265358979323846264338327\
95028841971.693993751e-40 @ -pi
(6) .string/.asciz/.ascii: define one or more strings, for example:
.string "ABCD", "efgh", "Hello!"
.asciz "qwer", "Sun", "World!"
.ascii "Welcome\0"
Note that a string defined with the .ascii pseudo operation is not terminated automatically, so the terminating character '\0' must be added explicitly.
(7) .rept: the repetition pseudo operation, in the following format:
.rept count
data definitions
.endr @ end of the repeated definitions
For example:
.rept 3
.byte 0x23
.endr
(8) .equ/.set: assignment statement, in the following format:
.equ (.set) symbol_name, expression
For example:
.equ abc, 3 @ make abc = 3
2. Pseudo operations for defining functions
(1) A function is defined in the following format:
function_name:
function body
return statement
Generally, if a function needs to be called from other files, it must be declared global with the .global pseudo operation. To avoid confusion when the function is called from C programs, the APCS rules for register usage should be followed. A compiler emits function code in the same way, as assembly code marked with .global.
(2) Functions should follow these rules:
Registers a1-a4 (argument, result, or scratch registers; synonyms for r0 to r3) and floating-point registers f0-f3 (if a floating-point coprocessor is present) need not be preserved by the function;
If the function returns a value no larger than one word, the value should be placed in r0 at the end of the function;
If the function returns a floating-point number, it should be placed in floating-point register f0 at the end of the function;
If the function modifies sp (stack pointer, r13), fp (frame pointer, r11), sl (stack limit, r10), lr (link register, r14), v1-v8 (variable registers, r4 to r11), or f4-f7, these registers must be restored at the end of the function to the values they held on entry.
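The rules above can be sketched as a small function that uses one callee-saved register. The name sum3 and its C prototype are hypothetical; the point is only that r4 and lr are saved on entry and restored on exit, while the result is left in r0.

```asm
        .text
        .global sum3                @ hypothetical; int sum3(int a, int b, int c)
        .type sum3, %function
sum3:
        stmfd sp!, {r4, lr}         @ save callee-saved r4 and the return address
        add r4, r0, r1              @ r4 = a + b (a in r0, b in r1)
        add r0, r4, r2              @ result a + b + c goes in r0
        ldmfd sp!, {r4, pc}         @ restore r4 and return by loading pc
```

The arguments arrive in r0 to r2, which the function may overwrite freely, so only r4 and lr need to be stacked.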
3. The .align, .end, .include, and .incbin pseudo operations
(1) .align: specifies data alignment. The format is as follows:
.align [absexpr1, absexpr2]
The location counter is advanced to the requested alignment, and the skipped storage is filled with a value. The first expression gives the alignment and the second gives the fill value. Note that on ARM the GNU assembler treats the first expression as a power of two, so .align 2 aligns to a 4-byte boundary; to give the alignment directly in bytes (4, 8, 16, or 32), use .balign.
(2) .end: marks the end of the source file.
(3) .include: expands the specified file at the point where .include appears. It is typically used for header files, for example:
.include "myarmasm.h"
(4) .incbin: includes a binary file verbatim in the current file. The usage is as follows:
.incbin "file" [, skip [, count]]
skip is the number of bytes to skip from the start of the file before reading, and count is the maximum number of bytes to read.
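As an illustration of .align and .incbin together, the sketch below embeds part of a binary file at an aligned address. The file name logo.bin and the symbol names are assumptions made for this example, and the file must exist when the source is assembled.

```asm
        .section .rodata
        .align 2                    @ ARM gas: 2^2 = 4-byte boundary
blob_start:
        .incbin "logo.bin", 16, 64  @ hypothetical file: skip 16 bytes, read at most 64
blob_end:
        .equ blob_size, blob_end - blob_start   @ number of bytes actually included
```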
4. The .if pseudo operation
The .if pseudo operation decides, based on the value of an expression, whether the code that follows is assembled. The .endif pseudo operation marks the end of the conditional block; an .else in between introduces the code to assemble when the .if condition is not met.
.if has several variants:
.ifdef symbol @ tests whether symbol is defined
.ifc string1, string2 @ tests whether string1 and string2 are equal; the strings can be enclosed in single quotes
.ifeq expression @ tests whether the value of expression is 0
.ifeqs string1, string2 @ tests whether string1 and string2 are equal; the strings must be enclosed in double quotation marks
.ifge expression @ tests whether the value of expression is greater than or equal to 0
.ifgt absolute expression @ tests whether the value of the expression is greater than 0
.ifle expression @ tests whether the value of expression is less than or equal to 0
.iflt absolute expression @ tests whether the value of the expression is less than 0
.ifnc string1, string2 @ tests whether string1 and string2 are not equal; the opposite of .ifc
.ifndef symbol, .ifnotdef symbol @ tests whether symbol is not defined; the opposite of .ifdef
.ifne expression @ assembles the following code if the value of expression is not 0
.ifnes string1, string2 @ assembles the following code if string1 and string2 are not equal
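A short sketch of conditional assembly follows. The symbol DEBUG and the function name get_mode are hypothetical and exist only to show which lines are assembled.

```asm
        .equ DEBUG, 1               @ hypothetical build-time switch

        .text
        .global get_mode            @ hypothetical function name
get_mode:
        .ifdef DEBUG
        mov r0, #DEBUG              @ assembled only because DEBUG is defined
        .else
        mov r0, #0                  @ assembled when DEBUG is not defined
        .endif
        .if DEBUG >= 2
        add r0, r0, #0x10           @ skipped while DEBUG is 1
        .endif
        mov pc, lr
```

Code in a branch that is not taken is never assembled, so it does not appear in the object file.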
5. The .global, .type, .title, and .list pseudo operations
(1) .global/.globl: defines a global symbol. The format is as follows:
.global symbol or .globl symbol
(2) .type: specifies whether a symbol is of function type or object type. Object type generally means data. The format is as follows:
.type symbol, type_description
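[Example 6]
.globl a
.data
.align 4
.type a, %object
.size a, 4
a:
.long 10
[Example 7]
.section .text
.type asmfunc, %function
.globl asmfunc
asmfunc:
mov pc, lr
Note that on ARM the type is written with "%" (%object, %function) rather than "@", because "@" starts a comment.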
(3) Listing control statements:
.title: specifies the title of the assembly listing, for example:
.title "My program"
.list: turns on output to the listing file.
6. ARM-specific pseudo operations
(1) .req: assigns an alias to a register. The format is as follows:
alias .req register_name
(2) .unreq: removes the alias of a register. The format is as follows:
.unreq register_alias
Note that the alias being removed must have been defined beforehand; otherwise the assembler reports an error. This pseudo operation can also be used to remove predefined aliases such as r0, but doing so is not recommended unless really necessary.
(3) .code: selects the ARM or Thumb instruction set. The format is as follows:
.code expression
If the expression evaluates to 16, the instructions that follow are Thumb instructions; if it evaluates to 32, they are ARM instructions.
(4) .thumb: equivalent to .code 16, selecting Thumb instructions. Likewise, .arm is equivalent to .code 32.
(5) .force_thumb: forces selection of the Thumb instruction set, regardless of whether the target processor supports it.
(6) .thumb_func: indicates that a function is a Thumb function.
(7) .thumb_set: works like .set in that it creates an alias for a symbol; in addition, it marks the symbol as the entry point of a Thumb function, as .thumb_func does.
(8) .ltorg: declares the start of a literal pool (data buffer pool), which can allocate a large amount of space.
(9) .pool: equivalent to .ltorg.
(10) .space <number_of_bytes> {, <fill_byte>}
Allocates number_of_bytes bytes of data space, filled with fill_byte. If fill_byte is not specified, the space is filled with 0 by default. (This is the same as SPACE in armasm.)
(11) .word <word1> {, <word2>} ...
Inserts a list of 32-bit values. (This is the same as DCD in armasm.)
You can use .word to store the value of an identifier as a constant.
For example:
start:
valueofstart:
.word start
In this way, the address of the start label is stored in the memory variable valueofstart.
(12) .hword <short1> {, <short2>} ...
Inserts a list of 16-bit values. (This is the same as DCW in armasm.)
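Several of the ARM-specific pseudo operations above can be seen working together in the following sketch. The names load_const, count, and buffer are hypothetical; the constant is loaded with the ldr rX, =value pseudo-instruction, which the assembler may satisfy from the literal pool emitted at .ltorg.

```asm
        .text
        .arm                        @ same as .code 32
        .global load_const          @ hypothetical function name
load_const:
count   .req r2                     @ "count" is now an alias for r2
        ldr count, =0x12345678      @ constant comes from the literal pool
        mov r0, count               @ return it in r0
        .unreq count                @ remove the alias again
        mov pc, lr
        .ltorg                      @ literal pool placed here, after the return

        .bss
buffer:
        .space 64                   @ 64 bytes, zero-filled by default
```

Placing .ltorg after the return instruction keeps the pool within reach of the ldr without the processor ever executing it as code.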
VIII. Special characters and syntax of GNU ARM assembly
Comment character within a code line: '@'
Whole-line comment character: '#'
Statement separator: ';'
Immediate operand prefix: '#' or '$'
Part 2: GNU compiler and debugging tools
I. Compilation tools
1. Introduction to the compilation tools
The compilation tools provided by GNU include the assembler as, the C compiler gcc, the C++ compiler g++, the linker ld, and the binary conversion tool objcopy. The corresponding tools for the ARM platform are arm-linux-as, arm-linux-gcc, arm-linux-g++, arm-linux-ld, and arm-linux-objcopy. The GNU compiler is very powerful and has hundreds of options, which is a headache for beginners. In actual development, however, only a few of them are needed, and most can be left at their defaults. The development flow with the GNU tools is as follows: write the C, C++, or assembly source program; use gcc or g++ to generate the object files; write the linker script; use the linker to generate the final target file (ELF format); and use the binary conversion tool to generate downloadable binary code.
(1) Write the C, C++, or assembly source program
Assembly source is usually used for the most basic system initialization, such as initializing the stack pointer, setting up the page table, and operating the ARM coprocessor. After initialization, execution can jump to C code. Note that the GNU assembler follows AT&T assembly syntax conventions; the relevant documentation can be downloaded from the GNU website (www.gnu.org). The default entry point is the _start label. You can also use the ENTRY directive in the linker script to specify another entry point (see the description of the linker script below).
(2) Use gcc or g++ to generate the object files
If the application consists of several files, they must be compiled separately and then combined by the linker. For example, the author's boot program contains three files: init.s (assembly code, hardware initialization), xmrecever.c (communication module, Xmodem protocol), and flash.c (flash erase module).
Use the following commands to generate the object files:
arm-linux-gcc -c -O2 -o init.o init.s
arm-linux-gcc -c -O2 -o xmrecever.o xmrecever.c
arm-linux-gcc -c -O2 -o flash.o flash.c
Here the -c option means that only object code is generated, without linking; the -o option specifies the name of the output file; and -O2 selects second-level optimization, which generally makes the generated code shorter and faster. If the project contains many files, you need to write a makefile. For more information about makefiles, see the references.
(3) Write a linker script
gcc and the other compilers have a built-in default linker script. If the default script is used, the generated target code must be loaded and run by an operating system. To run directly on an embedded system, you need to write your own linker script. To write one, you must first understand the format of the object file. Object files generated by the GNU compiler are in ELF format by default. An ELF file consists of several sections. Unless otherwise specified, the object code generated from a C source program contains the following sections: .text (text section), which contains the program's instruction code; .data (data section), which contains fixed data such as constants and strings; and .bss (uninitialized data section), which contains uninitialized variables and arrays. Object code generated from a C++ source program also includes .fini (destructor code) and .init (constructor code). The linker's task is to combine the .text, .data, and .bss sections of the object files, and the linker script tells the linker where to place these sections. For example, the linker script link.lds is:
ENTRY(begin)
SECTIONS
{
. = 0x30000000;
.text : { *(.text) }
.data : { *(.data) }
.bss : { *(.bss) }
}
ENTRY(begin) indicates that the entry point of the program is the begin label. The line . = 0x30000000 sets the starting address of the target code to 0x30000000, an address in the on-chip RAM of the MX1. .text : { *(.text) } places the code sections of all object files starting at 0x30000000. It is followed by .data : { *(.data) }, which places the data sections starting at the end of the code, and then by the .bss sections.
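To show how a source file lines up with this script, here is a minimal sketch of a startup file that defines the begin label. It is not the author's init.s: the stack size, the symbol names stack_bottom and stack_top, and the C function main are assumptions made for illustration.

```asm
        .text
        .global begin               @ matches ENTRY(begin) in link.lds
begin:
        ldr sp, =stack_top          @ initialize the stack pointer
        bl main                     @ hypothetical C entry function, defined elsewhere
1:
        b 1b                        @ loop forever if main returns

        .bss
        .align 3                    @ 2^3 = 8-byte alignment
stack_bottom:
        .space 1024                 @ assumed stack size in bytes
stack_top:
```

With the script above, the code lands in .text at 0x30000000 and the stack area in .bss after the data.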
(4) Use the linker to generate the final target file
With the linker script in place, the following command generates the final target file:
arm-linux-ld -nostdlib -o bootstrap.elf -Tlink.lds init.o xmrecever.o flash.o
-nostdlib means that the system libraries are not linked and the program starts directly from the begin entry; -o specifies the name of the output file; -T specifies the linker script to use (you can also use -Ttext address to specify the address of the text section); the list of object files to link comes last.
(5) Generate binary code
The ELF file produced by linking cannot be downloaded and executed directly. Use the objcopy tool to generate the final binary file:
arm-linux-objcopy -O binary bootstrap.elf bootstrap.bin
-O binary specifies that a binary file is generated. objcopy can also generate S-record files; just replace the option with -O srec. You can also use the -S option to remove all symbol and relocation information. To disassemble the generated target code, use the objdump tool:
arm-linux-objdump -D bootstrap.elf
At this point, the generated file can be written directly to flash and run.
2. Makefile example
Example: head.s main.c
arm-linux-gcc -c -o head.o head.s