Part One: ARM Assembly Syntax under Linux
Although it is convenient to write programs in C or C++ under Linux, assembly source is still used for the most basic system initialization, such as initializing the stack pointer, setting up the page tables, and operating the ARM coprocessors. Once initialization is complete, execution can jump to C code. Note that the GNU assembler uses its own syntax rather than that of ARM's armasm (the AT&T-versus-Intel distinction applies to x86 targets, not to ARM); its manual can be downloaded from the GNU site (www.gnu.org).
I. Structure of an assembly line
Every line of assembly has the following structure:
[<label>:] [<instruction | directive | pseudo-instruction>] @ comment
In Linux ARM assembly, any identifier that ends with a colon is treated as a label; it does not have to start at the beginning of a line.
"Example 1" defines an "add" function that returns the sum of its two arguments.
.section .text, "ax"
.global add            @ give the symbol add external linkage
add:
    add r0, r0, r1     @ add the input arguments
    mov pc, lr         @ return from subroutine
@ end of the program
II. Labels in the Linux assembler
Labels may consist only of the characters a-z, A-Z, 0-9, "." and "_". A label made up only of the digits 0-9 is a local label. A local label may be defined more than once and is referenced as follows:
<label>f: refers forward, to the next definition of the label after the point of reference
<label>b: refers backward, to the most recent definition of the label before the point of reference
"Example 2" uses a local label to build a loop
1:
    subs r0, r0, #1    @ each iteration sets r0 = r0 - 1
    bne 1b             @ branch back to label 1 while r0 is not zero
A label stands for the address at which it is defined, so it can also be used as the name of a variable or a function.
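The two reference forms can be sketched together as follows. This is only an illustration: the function names delay_a and delay_b are hypothetical, and the point is that both functions reuse the local label 1 without conflict.

```asm
@ Sketch: numeric local labels with backward (b) and forward (f) references.
@ delay_a and delay_b are hypothetical names; r0 holds the iteration count.
        .text
        .global delay_a
delay_a:
1:
        subs    r0, r0, #1          @ r0 = r0 - 1, update the flags
        bne     1b                  @ back to the nearest preceding "1:"
        mov     pc, lr              @ return

        .global delay_b
delay_b:
        cmp     r0, #0
        beq     2f                  @ count is zero: skip forward to the next "2:"
1:                                  @ "1" is defined again; this is allowed
        subs    r0, r0, #1
        bne     1b                  @ refers to the "1:" just above, not delay_a's
2:
        mov     pc, lr              @ return
```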
III. Sections in the Linux assembler
(1) The .section directive
A user-defined section is created with the .section directive, which has the following format:
.section section_name [, "flags" [, %type [, flag_specific_arguments]]]
Each section starts at its section name and ends at the next section name or at the end of the file. Sections carry default flags that the linker recognizes. (This corresponds to AREA in armasm.)
The section flags allowed in the ELF format are listed below:
Flag    Meaning
a       allocatable section
w       writable section
x       executable section
"Example 3" defines a section
.section .mysection    @ custom data section named ".mysection"
.align 2
strtemp:
    .ascii "Temp string\n\0"
(2) Section names predefined by the assembler
.text     @ code section
.data     @ initialized data section
.bss      @ uninitialized data section
.sdata    @ small initialized data section
.sbss     @ small uninitialized data section
Note that in the source program the .bss section should be placed before the .text section.
IV. Defining the entry point
The default entry point is the _start label. A different entry point can be specified with the ENTRY command in the linker script.
"Example 4" defines an entry point
.section .data
    <initialized data here>
.section .bss
    <uninitialized data here>
.section .text
.globl _start
_start:
    <instruction code goes here>
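A filled-in version of this skeleton for a bare-metal start-up might look like the sketch below. It assumes two symbols that are not part of the original text: stack_top, which would have to be provided by the linker script, and main, which would come from a C file.

```asm
@ Sketch of a bare-metal entry point.
@ stack_top and main are hypothetical: stack_top is assumed to be defined
@ by the linker script, main by a separately compiled C file.
        .section .text
        .globl  _start
_start:
        ldr     sp, =stack_top      @ initialise the stack pointer
        bl      main                @ continue in C code
1:
        b       1b                  @ main is not expected to return; spin if it does
```

Keeping the entry code this small leaves everything that can be written in C to the C side, which matches the division of labour described at the start of this part.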
V. Macro definitions in the Linux assembler
The format is as follows:
.macro macro_name parameter_list    @ the .macro directive defines a macro
    macro body
.endm                               @ .endm marks the end of the macro
If the macro takes parameters, prefix each parameter with "\" when it is used in the macro body. Macro parameters can also be given default values.
The .exitm directive exits the macro early.
"Example 5" defines a macro
.macro SHIFTLEFT a, b
.if \b < 0
    mov \a, \a, asr #-\b
    .exitm
.endif
    mov \a, \a, lsl #\b
.endm
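To show how such a macro is invoked, and what a default parameter value looks like, here is a short sketch. SHIFTLEFT is repeated from Example 5 so that the fragment assembles on its own; ADDCONST is a hypothetical macro added only for illustration.

```asm
@ Sketch: invoking macros. ADDCONST is a hypothetical example macro.
        .macro  SHIFTLEFT a, b
        .if     \b < 0
        mov     \a, \a, asr #-\b
        .exitm
        .endif
        mov     \a, \a, lsl #\b
        .endm

        .macro  ADDCONST reg, value=1   @ "value" defaults to 1
        add     \reg, \reg, #\value
        .endm

        .text
        SHIFTLEFT r0, 3             @ b >= 0: mov r0, r0, lsl #3
        SHIFTLEFT r1, -2            @ b < 0: arithmetic shift right by 2
        ADDCONST  r2                @ default used: add r2, r2, #1
        ADDCONST  r2, 4             @ explicit value: add r2, r2, #4
```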
VI. Constants in the Linux assembler
(1) A decimal number begins with a non-zero digit, such as 123 and 9876;
(2) A binary number begins with 0b; the letter may also be uppercase (0B);
(3) An octal number begins with 0, such as 0456 and 0123;
(4) A hexadecimal number begins with 0x, such as 0xabcd and 0x123f;
(5) A string constant is enclosed in double quotes and may contain escape characters, such as "You are welcome!\n";
(6) The current address is written as "."; this symbol can be used in assembly source to stand for the address of the current instruction;
(7) Expressions: an expression may use constants or symbol values. "-" denotes negation, "~" denotes bitwise NOT, and "<>" denotes inequality. The other operators, +, -, *, /, %, <, <<, >, >>, |, &, ^, !, ==, >=, <=, && and ||, are used much as in C.
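The rules above can be seen together in a small sketch. All symbol names here (FLAG_A, FLAG_B, MODE, FLAGS, MASK, msg, MSG_LEN) are hypothetical and serve only to illustrate the notation.

```asm
@ Sketch: constants and expressions. All symbol names are hypothetical.
        .equ    FLAG_A, 0b0001              @ binary constant
        .equ    FLAG_B, 0x10                @ hexadecimal constant
        .equ    MODE,   0644                @ octal constant (leading 0)
        .equ    FLAGS,  FLAG_A | FLAG_B     @ bitwise OR, as in C
        .equ    MASK,   ~FLAGS              @ bitwise NOT

        .data
msg:
        .ascii  "You are welcome!\n"        @ string with an escape character
        .equ    MSG_LEN, . - msg            @ "." is the current address, so
                                            @ this is the length of msg in bytes
```

Computing a length as ". - label" right after the data means the value follows the string automatically if the text is edited later.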
VII. Common directives for ARM assembly under Linux
Some directives have already been introduced above; the following are also commonly used:
Data definition directives: .byte, .short, .long, .quad, .float, .string/.asciz/.ascii; the repeat directive .rept; the assignment directives .equ/.set;
Function definition;
The alignment directive .align;
The end-of-source directive .end;
The .include directive;
The .if directive;
The .global/.globl directive;
The .type directive;
Listing control directives.
In contrast to the general gas directives above, the following are ARM-specific: .req, .unreq, .code, .thumb, .thumb_func, .thumb_set, .ltorg, .pool
1. Data definition directives
(1) .byte: defines single bytes, such as: .byte 1, 2, 0b01, 0x34, 072, 's'
(2) .short: defines two-byte data, such as: .short 0x1234, 60000
(3) .long: defines 4-byte data, such as: .long 0x12345678, 23876565
(4) .quad: defines 8-byte data, such as: .quad 0x1234567890abcd
(5) .float: defines floating-point numbers, such as:
.float 0f-314159265358979323846264338327\
95028841971.693993751E-40 @ -pi
(6) .string/.asciz/.ascii: define one or more strings, such as:
.string "abcd", "efgh", "hello!"
.asciz "qwer", "sun", "world!"
.ascii "welcome\0"
Note that .ascii does not append a terminator, so the terminating '\0' must be written explicitly.
(7) .rept: repeats a definition, in the following format:
.rept number_of_repetitions
    data definition
.endr                  @ end of the repeated definition
For example:
.rept 3
    .byte 0x23
.endr
(8) .equ/.set: assignment, in the following format:
.equ (.set) symbol, expression
For example:
.equ abc, 3            @ let abc = 3
2. Function definition
(1) A function is defined in the following format:
function_name:
    function body
    return statement
In general, a function that is to be called from another file must be declared global with the .global directive. So that assembly functions can be called from C code without confusion, register usage must follow the APCS rules. The compiler emits the code of each function as a block of assembly declared with .global.
(2) Functions should be written according to the following rules:
Registers a1-a4 (argument, result, or scratch registers; synonyms for r0-r3) and floating-point registers f0-f3 (if a floating-point coprocessor is present) need not be preserved by the function;
If the function returns a value no larger than one word, the value must be in r0 when the function returns;
If the function returns a floating-point number, the value must be in floating-point register f0 when the function returns;
If the function changes sp (stack pointer, r13), fp (frame pointer, r11), sl (stack limit, r10), lr (link register, r14), v1-v8 (variable registers, r4-r11) or f4-f7, those registers must be restored at the end of the function to the values they held on entry.
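As a minimal sketch of these rules, the function below uses one callee-saved register and therefore saves and restores it. The name sum3 is hypothetical; it is assumed to correspond to a C prototype such as int sum3(int a, int b, int c).

```asm
@ Sketch: a function that follows the register rules above.
@ sum3 is a hypothetical name; assumed C prototype:
@     int sum3(int a, int b, int c);
        .text
        .global sum3
        .type   sum3, %function
sum3:
        stmfd   sp!, {r4, lr}       @ save v1 (r4) and the return address
        add     r4, r0, r1          @ r4 = a + b (r4 must be preserved, hence the save)
        add     r0, r4, r2          @ one-word result is returned in r0
        ldmfd   sp!, {r4, pc}       @ restore r4 and return to the caller
```

A function this small could do its work in r0-r3 alone and skip the save and restore; r4 is used here only to show the convention.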
3. The .align, .end, .include and .incbin directives
(1) .align: specifies the alignment of the data that follows, in the following format:
.align [absexpr1, absexpr2]
Pads the unused storage up to the requested alignment boundary with a fill value. The first expression gives the alignment and the second gives the fill value. Note that on ARM the first expression is the number of low-order zero bits the address must have, so .align 2 aligns to 4 bytes and .align 3 to 8 bytes; use .balign to give the alignment directly in bytes.
(2) .end: marks the end of the source file.
(3) .include: expands the named file at the point where .include is used; the file is typically a header file, such as:
.include "myarmasm.h"
(4) .incbin: includes a binary file verbatim in the current file, in the following format:
.incbin "file" [, skip [, count]]
skip is the number of bytes to skip from the beginning of the file before reading, and count is the maximum number of bytes to read.
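A typical use of these directives is embedding a binary file and exporting its bounds, sketched below. The file name logo.bin and the symbols logo_start, logo_end, logo_size and logo_header are hypothetical, and the file must exist when the source is assembled.

```asm
@ Sketch: embedding a binary file. "logo.bin" and all symbol names
@ are hypothetical.
        .section .rodata
        .align  2                   @ on ARM: 2^2 = 4-byte alignment
        .global logo_start
        .global logo_end
logo_start:
        .incbin "logo.bin"          @ the whole file, byte for byte
logo_end:
        .equ    logo_size, logo_end - logo_start   @ size in bytes

logo_header:
        .incbin "logo.bin", 0, 16   @ skip 0 bytes, read at most 16 bytes
```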
4. The .if directive
The .if directive decides, from the value of an expression, whether the code that follows is assembled. The .endif directive marks the end of the conditional block, and .else can be used in between to mark the code that is assembled when the .if condition is not satisfied.
The .if directive has several variants:
.ifdef symbol                  @ true if symbol is defined
.ifc string1,string2           @ true if string1 and string2 are equal; the strings may be enclosed in single quotes
.ifeq expression               @ true if the value of expression is 0
.ifeqs string1,string2         @ true if string1 and string2 are equal; the strings must be enclosed in double quotes
.ifge expression               @ true if the value of expression is greater than or equal to 0
.ifgt absolute expression      @ true if the value of expression is greater than 0
.ifle expression               @ true if the value of expression is less than or equal to 0
.iflt absolute expression      @ true if the value of expression is less than 0
.ifnc string1,string2          @ true if string1 and string2 are not equal; the opposite of .ifc
.ifndef symbol, .ifnotdef symbol   @ true if symbol is not defined; the opposite of .ifdef
.ifne expression               @ true if the value of expression is not 0, in which case the following code is assembled
.ifnes string1,string2         @ true if string1 and string2 are not equal, in which case the following code is assembled
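Conditional assembly is often driven by symbols defined on the command line, for example with the assembler's --defsym option. The sketch below assumes two hypothetical symbols, DEBUG and UART_BASE, and a hypothetical function init_hw; the address 0x10000000 is a placeholder, not a real device address.

```asm
@ Sketch: conditional assembly. DEBUG, UART_BASE and init_hw are
@ hypothetical; DEBUG could be supplied with: arm-linux-as --defsym DEBUG=1
.ifndef UART_BASE
        .equ    UART_BASE, 0x10000000   @ placeholder default address
.endif

        .text
        .global init_hw
init_hw:
        ldr     r1, =UART_BASE
.ifdef DEBUG
        mov     r0, #1              @ assembled only when DEBUG is defined
.else
        mov     r0, #0              @ assembled when DEBUG is not defined
.endif
        str     r0, [r1]            @ write the selected value
        mov     pc, lr              @ return
```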
5. The .global, .type, .title and .list directives
(1) .global/.globl: defines a global symbol, in the following format:
.global symbol or .globl symbol
(2) .type: specifies whether a symbol is of function type or object type (an object is typically data), in the following format:
.type symbol, type_description
"Example 6"
.globl a
.data
.align 4
.type a, %object
.size a, 4
a:
    .long 10
"Example 7"
.section .text
.type asmfunc, %function
.globl asmfunc
asmfunc:
    mov pc, lr
(3) Listing control directives:
.title: specifies the title of the assembly listing, for example:
.title "My Program"
.list: turns on output to the listing file.
6. ARM-specific directives
(1) .req: gives a register an alias, in the following format:
alias .req register_name
(2) .unreq: removes a register alias, in the following format:
.unreq register_alias
Note that the alias being removed must have been defined beforehand, otherwise the assembler reports an error. This directive can also remove the predefined aliases, such as r0, but doing so is not recommended unless it is really necessary.
(3) .code: selects the ARM or Thumb instruction set, in the following format:
.code expression
If the expression is 16, the instructions that follow are Thumb instructions; if it is 32, they are ARM instructions.
(4) .thumb is equivalent to .code 16 and selects Thumb instructions; similarly, .arm is equivalent to .code 32.
(5) .force_thumb forces selection of the Thumb instruction set even if the target processor does not support it.
(6) .thumb_func marks the function that follows as a Thumb function.
(7) .thumb_set is similar to .set: it creates an alias for a symbol and, in addition, marks that alias as the entry point of a Thumb function, just as .thumb_func does.
(8) .ltorg tells the assembler to emit the current literal pool (a data pool for constants) at this point.
(9) .pool has the same effect as .ltorg.
(10) .space <number_of_bytes> {, <fill_byte>}
Allocates number_of_bytes bytes of data space and fills them with fill_byte; if no fill value is given, the space is filled with 0. (Same as SPACE in armasm.)
(11) .word <word1> {, <word2>} ...
Inserts a list of 32-bit values. (Same as DCD in armasm.)
.word also lets an identifier be used as a constant.
For example:
Start:
valueOfStart:
    .word Start
The address of Start is then stored in the memory variable valueOfStart.
(12) .hword <short1> {, <short2>}
Inserts a list of 16-bit values. (Same as DCW in armasm.)
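Several of these directives are combined in the sketch below. It is an illustration only: count_down, counter, table and thumb_inc are hypothetical names, and the constant 0x12345678 is chosen simply because it cannot be encoded as an immediate and so has to go into the literal pool.

```asm
@ Sketch combining several ARM-specific directives. All names are hypothetical.
        .text
        .arm                        @ same as .code 32
        .global count_down
count_down:
counter .req    r4                  @ "counter" is now an alias for r4
        stmfd   sp!, {r4, lr}       @ r4 is callee-saved
        ldr     counter, =0x12345678    @ constant is placed in the literal pool
1:
        subs    counter, counter, #1
        bne     1b                  @ loop until counter reaches zero
        ldmfd   sp!, {r4, pc}       @ restore r4 and return
        .unreq  counter             @ remove the alias
        .ltorg                      @ emit the literal pool here, after the return

table:
        .word   count_down          @ 32-bit value: the address of count_down
        .hword  0x1234, 0x5678      @ two 16-bit values
        .space  8, 0xff             @ 8 bytes filled with 0xff

        .align  2
        .thumb                      @ same as .code 16
        .thumb_func                 @ the next symbol is a Thumb function
        .global thumb_inc
thumb_inc:
        add     r0, r0, #1          @ Thumb instruction
        bx      lr                  @ return to the caller
```

Placing .ltorg directly after a return keeps the pool close to the instruction that uses it without any risk of the pool being executed as code.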
VIII. Special characters and syntax in GNU ARM assembly
Comment character within a line of code: '@'
Whole-line comment character: '#'
Statement separator: ';'
Immediate operand prefix: '#' or '$'
Part Two: GNU Compilers and Debugging Tools
I. Compilation tools
1. Introduction to the compilation tools
The compilation tools provided by GNU include the assembler as, the C compiler gcc, the C++ compiler g++, the linker ld, and the binary conversion tool objcopy. The corresponding tools for the ARM platform are arm-linux-as, arm-linux-gcc, arm-linux-g++, arm-linux-ld and arm-linux-objcopy. The GNU tools are powerful and have hundreds of options, which is exactly why they give beginners headaches. In actual development, however, only a handful of options are needed, and most can be left at their defaults. The development flow with the GNU tools is as follows: write the C, C++ or assembly source; generate object files with gcc or g++; write the linker script; generate the final object file (in ELF format) with the linker; generate a downloadable binary with the binary conversion tool.
(1) Writing the C, C++ or assembly source
Assembly source is usually used for the most basic system initialization, such as initializing the stack pointer, setting up the page tables, and operating the ARM coprocessors. Once initialization is complete, execution can jump to C code. Note that the GNU assembler uses its own syntax rather than that of ARM's armasm (the AT&T-versus-Intel distinction applies to x86 targets, not to ARM); the reader can download its manual from the GNU site (www.gnu.org). The default entry point is the _start label, and a different entry point can be specified with the ENTRY command in the linker script (linker scripts are described below).
(2) Generating object files with gcc or g++
If your application consists of several files, compile them separately and then link them with the linker. For example, the author's bootloader consists of three files: init.s (assembly code that initializes the hardware), xmrecever.c (the communication module, which uses the Xmodem protocol) and flash.c (the flash erase module).
The object files are generated separately with the following commands:
arm-linux-gcc -c -O2 -o init.o init.s
arm-linux-gcc -c -O2 -o xmrecever.o xmrecever.c
arm-linux-gcc -c -O2 -o flash.o flash.c
Here the -c option means that only object code is generated and no linking is done, the -o option names the output file, and -O2 selects level-two optimization, which makes the generated code shorter and faster. If your project contains many files, you will need to write a makefile; interested readers should consult the relevant material on makefiles.
(3) Writing the linker script
Compilers such as gcc have a default linker script built in. Code linked with the default script needs an operating system to load and run it. To run directly on an embedded system, you must write your own linker script, and to do that you first need to understand the object file format. By default the GNU compiler generates object files in ELF format. An ELF file consists of several sections. Unless specified otherwise, object code generated from C source contains the following sections: .text (the text section) holds the program's instructions; .data (the data section) holds fixed data such as constants and strings; .bss (the uninitialized data section) holds uninitialized variables, arrays and so on. Object code generated from C++ source also contains .fini (destructor code), .init (constructor code) and so on. The linker's job is to combine the .text, .data and .bss sections of the object files, and the linker script tells the linker the address at which to start placing them. For example, the linker script link.lds is:
ENTRY(begin)
SECTIONS
{
    . = 0x30000000;
    .text : { *(.text) }
    .data : { *(.data) }
    .bss : { *(.bss) }
}
Here ENTRY(begin) specifies that the program's entry point is the begin label; . = 0x30000000 sets the start address of the object code to 0x30000000, which is the on-chip RAM of the MX1; .text : { *(.text) } places the code sections of all object files starting at 0x30000000; .data : { *(.data) } places the data sections immediately after the end of the code, followed by the .bss sections.
(4) Generating the final object file with the linker
With the linker script in place, the following command generates the final object file:
arm-linux-ld -nostdlib -o bootstrap.elf -Tlink.lds init.o xmrecever.o flash.o
Here -nostdlib means that the system runtime libraries are not linked, so execution starts directly at the begin entry; -o names the output file; -T specifies the linker script (alternatively, -Ttext address can be used, where address is the address of the execution region); the command ends with the list of object files to be linked.
(5) Generating the binary image
The ELF file produced by linking cannot be downloaded and run directly; the final binary file is generated with the objcopy tool:
arm-linux-objcopy -O binary bootstrap.elf bootstrap.bin
Here -O binary specifies that the output is a raw binary file. objcopy can also generate files in S-record format; simply replace the parameter with -O srec. The -S option can also be used to strip all symbol and relocation information. To disassemble the generated object code, use the objdump tool:
arm-linux-objdump -d bootstrap.elf
At this point, the generated binary file can be written directly to flash and run.
2. Makefile example
example: head.s main.c
	arm-linux-gcc -c -o head.o head.s
	arm-linux-gcc -c -o main.o main.c
	arm-linux-ld -Tlink.lds head.o main.o -o example.elf
	arm-linux-objcopy -O binary -S example.elf example
	arm-linux-objdump -D -b binary -m arm example > ttt.s
II. Debugging tools
The main GNU debugging tools under Linux are gdb, gdbserver and kgdb. gdb and gdbserver together allow remote debugging of applications running under Linux on the target board. gdbserver is a very small application that runs on the target board, monitors the process being debugged, and communicates with gdb on the host over a serial port. The developer enters commands into gdb on the host to control the process on the target board and to inspect memory and registers. gdb versions after 5.1.1 support the ARM processor; adding the --target=arm parameter at configuration time generates a gdbserver for the ARM platform directly. gdb can be downloaded from ftp://ftp.gnu.org/pub/gnu/gdb/.
For debugging the Linux kernel, the kgdb tool can be used. It likewise communicates with gdb on the host over a serial port in order to debug the Linux kernel on the target board. Its usage is described at http://oss.sgi.com/projects/kgdb/.