The statement format of the assembly language is as follows:
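{symbol} {instruction | directive | pseudo-instruction} {; comment}
Symbol: a label, local label, constant, or variable
Instruction: a machine instruction
Directive: an assembler directive (pseudo-operation)
Pseudo-instruction: an instruction that the assembler expands into real instructions
Comment: explanatory text, ignored by the assembler
Items in braces {} are optional.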
Note:
1. A symbol must start in the first column. (In ARM assembly no ":" follows the symbol; in GNU assembly a ":" must follow it.)
2. Symbol naming rule: a symbol consists of letters, digits, and underscores. Except for local labels, it cannot start with a digit.
3. Instructions cannot start in the first column.
4. ARM instructions, pseudo-instructions, directives, and register names can be written entirely in uppercase or entirely in lowercase, but not in mixed case.
5. If a statement is too long, you can split it across several lines by ending each line with "\", which means that the next line continues the current statement. No character, including spaces and tabs, may follow the "\".
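The continuation rule in note 5 can be modelled with a short Python sketch. It is only an illustration of the rule as stated here, not the assembler's actual implementation; the function name join_continuations and the sample lines are hypothetical.

```python
def join_continuations(lines):
    """Merge physical lines that end in a backslash into logical statements."""
    statements = []
    pending = ""
    for line in lines:
        if line.endswith("\\"):
            # The backslash must be the very last character on the line.
            # Drop it and keep collecting the same statement.
            pending += line[:-1]
        else:
            statements.append(pending + line)
            pending = ""
    if pending:
        # A backslash on the final line leaves an unfinished statement.
        statements.append(pending)
    return statements


sample = [
    "        DCD     1, 2, \\",
    "                3, 4",
    "        MOV     r1, #0",
]
for statement in join_continuations(sample):
    print(statement)
```

A line that ends in a backslash followed by a space is not treated as a continuation in this model, which mirrors the rule that nothing may follow the "\".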
Symbols in ARM assembly language
Symbol (label)
Essence: a label represents an address value. The address of a label inside the section is determined at assembly time; the address of a label outside the section is determined at link time.
Categories: three kinds (by how the label is generated)
PC-based labels. A PC-based label is a label placed before a target instruction or before a data-definition directive in the program. At assembly time the label is processed as the PC value plus (or minus) a numeric constant. (It is usually used as the target address of a branch instruction, or to refer to a small amount of data embedded in the code section.)
Register-based labels. Register-based labels are usually defined with MAP and FIELD, and can also be defined with EQU. At assembly time the label is processed as a register value plus (or minus) a numeric constant. (It is used to access data in data sections.)
Absolute addresses. An absolute address is a 32-bit value. It can address the range [0, 2^32 - 1], that is, the entire memory space directly.
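The "PC value plus (or minus) a numeric constant" arithmetic can be sketched in Python. This is a rough model, not the assembler's code: it assumes ARM state, where reading the PC yields the address of the current instruction plus 8 (Thumb state differs), and the function names and addresses are hypothetical.

```python
PC_READ_AHEAD = 8  # assumption: ARM state, PC reads as instruction address + 8


def pc_relative_offset(instruction_address, label_address):
    """Constant that must be added to the PC value to reach the label."""
    return label_address - (instruction_address + PC_READ_AHEAD)


def is_absolute_address(value):
    """True if the value fits the 32-bit absolute address range."""
    return 0 <= value <= 2**32 - 1


# A label 16 bytes after the instruction and one 16 bytes before it.
print(pc_relative_offset(0x8000, 0x8010))
print(pc_relative_offset(0x8010, 0x8000))
```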
Symbol (local label)
A local label is mainly used within a local range. It consists of two parts: a number from 0 to 99, optionally followed by a symbol that names the scope of the local label. The scope of a local label is usually the current section. You can also use the ROUT directive to define the scope.
Syntax for defining a local label: n{routname}
n: a number from 0 to 99.
routname: the name of the current local scope (a symbol), defined by the ROUT directive.
Syntax for referencing a local label: %{F|B}{A|T}n{routname}
%: indicates a reference.
F: the assembler searches forward only.
B: the assembler searches backward only.
A: the assembler searches all macro nesting levels.
T: the assembler searches only the current macro level.
n: the number of the local label.
routname: the name of the current scope (defined by the ROUT directive).
Note:
1. If neither F nor B is specified, the assembler searches backward first and then forward.
2. If neither A nor T is specified, the assembler searches all macro levels from the current level up to the top level.
3. If routname is specified, the assembler checks it against the name given by the nearest preceding ROUT directive. If the names do not match, the assembler reports an error and the assembly fails.
Symbol (constant)
A numeric constant is a 32-bit integer. In ARM assembly language, EQU is used to define numeric constants. A numeric constant cannot be modified once it is defined. In comparisons, numeric constants are treated as unsigned.
{Numeric constant: a decimal number, a hexadecimal number, or n_xxx (n is the base, from 2 to 9, and xxx is the digits in that base, for example 8_3777)
Character constant: enclosed in a pair of single quotes; contains a single character or a standard C escape character. Example: 'A', '\n'
String constant: enclosed in a pair of double quotation marks; contains a string of characters, which may include standard C escape characters.
Boolean constants: {TRUE} and {FALSE}
}
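The n_xxx notation can be illustrated with a small Python sketch that converts such a literal to its value. It is a simplified model: the 0x prefix for hexadecimal is an assumption (the notes do not give the hexadecimal notation), and parse_numeric_constant is a hypothetical name.

```python
def parse_numeric_constant(text):
    """Parse a decimal, 0x-prefixed hexadecimal, or n_xxx (base n) literal."""
    if text.lower().startswith("0x"):
        # Assumed hexadecimal notation for this sketch.
        value = int(text[2:], 16)
    elif "_" in text:
        base_text, digits = text.split("_", 1)
        base = int(base_text)
        if not 2 <= base <= 9:
            raise ValueError("base must be between 2 and 9")
        value = int(digits, base)
    else:
        value = int(text)
    # Numeric constants are 32-bit values.
    return value & 0xFFFFFFFF


print(parse_numeric_constant("8_3777"))  # octal 3777 = 2047
```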
Note: the assembler does not distinguish between -n and 2^32 - n. Relational operators treat their operands as unsigned numbers during assembly, which means that 0 > -1 is {FALSE}.
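The unsigned treatment of relational operators can be reproduced with a few lines of Python. This is only a model of the 32-bit arithmetic described in the note; the helper names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def to_unsigned32(value):
    """Reduce a Python integer to its 32-bit unsigned representation."""
    return value & MASK32


def greater_than(a, b):
    """Model of a '>' comparison that treats both operands as unsigned."""
    return to_unsigned32(a) > to_unsigned32(b)


print(hex(to_unsigned32(-1)))   # 0xffffffff
print(greater_than(0, -1))      # False
```

Because -1 is stored as 0xFFFFFFFF, it compares as the largest 32-bit value, so 0 is not greater than it.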
Symbol (variable)
In a program, the value of a variable may change during assembly. There are three types of variables in ARM assembly: numeric variables, logical variables, and string variables. The type of a variable cannot be changed in the program.
The value range of a numeric variable is that of a numeric constant or numeric expression:
unsigned [0, 2^32 - 1], signed [-2^31, 2^31 - 1]
The values of a logical variable are {TRUE} and {FALSE}.
The value range of a string variable is the range that a string expression can represent: [0, 512] bytes.
Note: variable substitution during assembly
1. If a string variable is preceded by a $ character, the assembler replaces the variable with its value.
2. If a numeric variable is preceded by a $ character, the assembler converts the value of the variable into a hexadecimal string and replaces the variable after the $ with that string.
3. If a logical variable is preceded by a $ character, the assembler replaces the variable with its value (T or F).
4. If the program needs a literal $ character, write $$. The assembler does not perform substitution but treats $$ as $. Generally, a $ between two vertical bars (|) does not indicate variable substitution. However, if the vertical bars are inside double quotation marks, the substitution is performed.
5. Use "." to mark the end of the variable name.
"." can also denote the current instruction address.
Expressions in ARM assembly language
An expression is composed of symbols, values, unary or binary operators, and parentheses.
1. String expression
A string expression is composed of strings, string variables, operators, and parentheses. The maximum length of a string is 512 bytes, and the minimum length is 0. The elements of a string expression are described below.
String: a series of characters enclosed in double quotation marks. The length of a string is limited by the length of an ARM assembly language statement. When the string contains a dollar sign $ or a quotation mark ", write $$ to represent one $ and "" to represent one ".
String variable: declared with the directive GBLS or LCLS and assigned a value with SETS.
Operators:
(1) LEN: returns the length of a string.
:LEN: A
where A is a string variable.
(2) CHR: converts an integer between 0 and 255 into a string containing one ASCII character. When an ASCII character is not convenient to type in a string, you can use CHR to place it in a string expression.
:CHR: A
where A is the ASCII value of a character.
(3) STR: converts a numeric or logical expression into a string. For a 32-bit numeric value, STR produces a string of eight hexadecimal digits. For a logical expression, STR produces the string T or F.
:STR: A
where A is a numeric or logical expression.
(4) LEFT: returns a substring of a given length from the left end of a string.
A :LEFT: B
where A is the source string and B is the number of characters that LEFT returns.
(5) RIGHT: returns a substring of a given length from the right end of a string.
A :RIGHT: B
where A is the source string and B is the number of characters that RIGHT returns.
(6) CC: concatenates two strings.
A :CC: B
where A is the first source string and B is the second source string. The CC operator appends string B to the end of string A.
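The six string operators above can be modelled in Python to make their results concrete. This is a sketch of the described behaviour, not the assembler itself; the function names are hypothetical, and the use of uppercase hexadecimal digits in op_str is an assumption.

```python
def op_len(a):
    """:LEN: A, the length of string A."""
    return len(a)


def op_chr(a):
    """:CHR: A, a one-character string for the ASCII value A (0 to 255)."""
    if not 0 <= a <= 255:
        raise ValueError("value must be between 0 and 255")
    return chr(a)


def op_str(a):
    """:STR: A, eight hexadecimal digits for a number, T or F for a logical."""
    if isinstance(a, bool):
        return "T" if a else "F"
    return format(a & 0xFFFFFFFF, "08X")  # digit case is an assumption


def op_left(a, b):
    """A :LEFT: B, the leftmost B characters of A."""
    return a[:b]


def op_right(a, b):
    """A :RIGHT: B, the rightmost B characters of A."""
    return a[max(len(a) - b, 0):]


def op_cc(a, b):
    """A :CC: B, string B appended to string A."""
    return a + b


print(op_str(255), op_left("hello", 2), op_right("hello", 3), op_cc("ab", "cd"))
```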
2. Numeric expression
A numeric expression consists of numeric constants, numeric variables, operators, and parentheses.
A numeric variable is declared with the directive GBLA or LCLA and assigned with SETA. It represents a 32-bit number.
Operators:
(1) NOT: bitwise inversion
:NOT: A
where A is a 32-bit number.
(2) +, -, *, /, and MOD: arithmetic operators
A + B, A - B, A * B, A / B
A :MOD: B is the remainder of A divided by B.
(3) ROL, ROR, SHL, and SHR: rotate and shift operators
A :ROL: B rotates the integer A left by B bits.
A :SHL: B shifts the integer A left by B bits.
(4) AND, OR, and EOR: bitwise logical operators
A :AND: B is the bitwise logical AND of numeric expressions A and B.
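The difference between rotating and shifting a 32-bit value is easy to see in a small Python model. It is a sketch of the arithmetic only, under the assumption that SHR is a logical (zero-filling) shift; the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def op_not(a):
    """:NOT: A, bitwise inversion within 32 bits."""
    return ~a & MASK32


def op_shl(a, b):
    """A :SHL: B, shift left; bits moved past bit 31 are lost."""
    return (a << b) & MASK32


def op_shr(a, b):
    """A :SHR: B, logical shift right (assumed zero-filling)."""
    return (a & MASK32) >> b


def op_rol(a, b):
    """A :ROL: B, rotate left; bits leaving bit 31 re-enter at bit 0."""
    a &= MASK32
    b %= 32
    return ((a << b) | (a >> (32 - b))) & MASK32


def op_ror(a, b):
    """A :ROR: B, rotate right; bits leaving bit 0 re-enter at bit 31."""
    a &= MASK32
    b %= 32
    return ((a >> b) | (a << (32 - b))) & MASK32


print(hex(op_rol(0x80000001, 1)))  # 0x3: the top bit wraps around
print(hex(op_shl(0x80000001, 1)))  # 0x2: the top bit is discarded
```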
3. Register-based and PC-based expressions
A register-based expression represents a register value plus (or minus) a numeric expression.
A PC-based expression represents the value of the PC register plus (or minus) a numeric expression. A PC-based expression usually consists of a program label and a numeric expression. Related operators:
(1) BASE: returns the number of the register in a register-based expression.
:BASE: A
where A is a register-based expression.
(2) INDEX: returns the offset of a register-based expression relative to its base register.
:INDEX: A
where A is a register-based expression.
(3) +, -: unary plus and minus signs, which can be placed before a numeric expression or a PC-based expression.
+A (-A)
where A is a PC-based or numeric expression.
4. Logical expressions
A logical expression consists of logical values, logical operators, relational operators, and parentheses. Its value is {FALSE} or {TRUE}.
Relational operators: express the relationship between two expressions of the same type. A relational operator and its two operands form a logical expression whose value is {FALSE} or {TRUE}.
For example, A = B means that A is equal to B.
A /= B and A <> B both mean that A is not equal to B.
Logical operators: perform basic logical operations on logical expressions. The result is {FALSE} or {TRUE}.
:LNOT: A negates the value of logical expression A.
A :LAND: B is the logical AND of logical expressions A and B.
5. Other operators
(1) ?: returns the number of bytes of executable code generated by the line that defines a symbol.
?A
where A is a symbol.
(2) DEF: determines whether a symbol has been defined.
:DEF: A
If the symbol A has been defined, the result is {TRUE}; otherwise, it is {FALSE}.
(3) SB_OFFSET_19_12
:SB_OFFSET_19_12: label
where label is a label.
Returns bits [19:12] of (label - sb).
(4) SB_OFFSET_11_0
:SB_OFFSET_11_0: label
where label is a label.
Returns bits [11:0] of (label - sb).
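The operator names suggest that the two SB_OFFSET operators split the offset (label - sb) into bits 19 to 12 and bits 11 to 0. The Python sketch below models that split. It keeps each field in its original bit position so that the two parts add back up to the low 20 bits of the offset; whether the assembler yields the upper field in place or shifted down is not stated in these notes, so treat that choice as an assumption. The addresses are made up for illustration.

```python
def sb_offset_19_12(label_address, sb):
    """Bits [19:12] of (label - sb), kept in place (assumption)."""
    return (label_address - sb) & 0x000FF000


def sb_offset_11_0(label_address, sb):
    """Bits [11:0] of (label - sb)."""
    return (label_address - sb) & 0x00000FFF


sb = 0x20000000      # hypothetical static base value
label = 0x20012345   # hypothetical label address
high = sb_offset_19_12(label, sb)
low = sb_offset_11_0(label, sb)
print(hex(high), hex(low), hex(high + low))
```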