3.2 GNU as assembly (most of the content in this article is quoted from the original text and is not original work)


The as86 assembler is used only to assemble the boot sector program boot/bootsect.s and the real-mode setup program boot/setup.s in the kernel. All other assembly language programs in the kernel (including those generated by the C compiler) are assembled with gas and linked with the modules produced by compiling the C code.

3.2.2 Main differences between GNU assembly syntax and Intel assembly syntax:

* In AT&T syntax (the GNU assembly syntax), an immediate operand must be preceded by '$', a register operand must be preceded by a percent sign '%', and the operand of an absolute jump/call (as opposed to a jump/call relative to the program counter) must be preceded by an asterisk '*'. Intel assembly syntax has none of these requirements.

* AT&T syntax and Intel syntax use opposite orders for the source and destination operands: AT&T puts the source first and the destination last. For example, the AT&T statement "addl $4, %eax" adds 4 to the value in the eax register and stores the result in eax. In Intel syntax the same statement is "add eax, 4".

* The size (width) of a memory operand in AT&T syntax is determined by the last character of the opcode. The suffixes 'b', 'w', and 'l' indicate a memory reference of 8 bits (byte), 16 bits (word), and 32 bits (long). Intel assembly uses the operand prefixes "byte ptr", "word ptr", and "dword ptr" for the same purpose. The Intel statement "mov al, byte ptr foo" therefore corresponds to the AT&T statement "movb foo, %al".

* In AT&T syntax, the immediate form of a far jump and far call is "ljmp/lcall $section, $offset", while the Intel form is "jmp/call far section:offset". Similarly, the AT&T far return instruction "lret $stack-adjust" corresponds to Intel's "ret far stack-adjust".

* The AT&T assembler does not support programs with multiple code segments. UNIX operating systems require all code to be in one segment.
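The differences listed above can be seen side by side in a short fragment. This is a minimal sketch, assuming a 32-bit x86 target (for example "as --32"); the labels foo, target and next and the selector value 0x08 are made up for illustration, and the fragment is meant to be assembled and read, not run.

```asm
# Sketch only: foo, target and next are hypothetical labels, and 0x08 is
# an arbitrary segment selector chosen for the example.
        .data
foo:    .long   0

        .text
        addl    $4, %eax        # Intel: add eax, 4 (AT&T: source first, destination last)
        movb    foo, %al        # Intel: mov al, byte ptr foo    ('b' = 8 bits)
        movw    foo, %ax        # Intel: mov ax, word ptr foo    ('w' = 16 bits)
        movl    foo, %eax       # Intel: mov eax, dword ptr foo  ('l' = 32 bits)
        movl    $target, %ebx   # '$' makes the address of target an immediate value
        jmp     *%ebx           # '*' marks an absolute (indirect) jump through %ebx
target:
        ljmp    $0x08, $next    # far jump, Intel: jmp far 0x08:next
next:
        lret    $4              # far return, Intel: ret far 4
```

Note that "movb foo, %al" reads the byte stored at foo, while "movl $target, %ebx" loads the address of target itself; the only difference is the '$' prefix.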

A symbol is an identifier composed of characters. Valid characters are the uppercase and lowercase letters, the digits, and the three characters "_", "." and "$". A symbol cannot begin with a digit, and case is significant. There is no limit on its length.

A statement ends with a newline or a line separator character. The last statement in a file must end with a newline. A statement can span several lines if a backslash "\" is placed immediately before each newline inside it. When as reads a backslash followed by a newline, both characters are ignored.

A constant is a number. Constants are divided into character constants and numeric constants. Character constants are further divided into strings and single characters. Numeric constants are divided into integers, large numbers (bignums), and floating-point numbers.
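The rules for symbols, statements and constants can be illustrated with a small data section. This is a sketch under assumptions: every name in it is made up, a 32-bit x86 target is assumed, and on that target the ';' character acts as the line separator.

```asm
# Sketch only: all symbol names here are hypothetical.
        .data
Count:      .long   31                  # "Count" and "count" are two different symbols
count:      .long   0x1f, 037, 0b11111  # the value 31 written in hex, octal and binary
buf_1.len:  .long   64                  # '_' and '.' are valid inside a name
letter:     .byte   'A                  # single character constant
text:       .ascii  "as\n"              # string constant
wide:       .quad   0x123456789abcdef0  # integer too large for 32 bits (a bignum)
ratio:      .float  0.5                 # floating-point constant

        .text
        incl    %eax ; decl %ebx        # two statements on one line, separated by ';'
```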

3.2.3 Instruction statements, operands, and addressing

An instruction is an operation performed by the CPU; an instruction is usually also called an opcode.

An operand is the object that an instruction operates on.

An address is the location of the specified data in memory.

Instruction opcode naming

An opcode prefix modifies the instruction that follows it. An opcode prefix can generally be written as an instruction without operands on a line of its own, and it must come directly before the instruction it affects. It is better, however, to put it on the same line as the instruction it modifies. For example, the string scan instruction "scas" uses a prefix to repeat the operation: repne scas %es:(%edi), %al.
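A repeat prefix in context might look like the following string-length routine. This is a sketch, not code from the kernel: the name str_len is hypothetical, and it assumes a 32-bit x86 target, the cdecl calling convention (argument on the stack), and a flat memory model where es and ds refer to the same segment.

```asm
# Sketch only: str_len is a hypothetical helper. On entry 4(%esp) holds a
# pointer to a NUL-terminated string; the length is returned in %eax.
        .text
        .globl  str_len
str_len:
        pushl   %edi                # %edi is callee-saved under cdecl
        movl    8(%esp), %edi       # string pointer (pushed %edi moved it from 4 to 8)
        xorl    %eax, %eax          # %al = 0, the byte to search for
        movl    $-1, %ecx           # largest possible repeat count
        cld                         # scan towards higher addresses
        repne scasb                 # prefix on the same line as the instruction it modifies
        notl    %ecx                # %ecx = number of bytes scanned, including the NUL
        decl    %ecx                # drop the NUL from the count
        movl    %ecx, %eax
        popl    %edi
        ret
```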

Local symbols help compilers and programmers use names temporarily. A program has 10 local symbol names ("0" ... "9") that can be reused. To define a local symbol, write a label of the form "N:" (where N is any digit). To refer to the most recent previous definition of that symbol, write "Nb"; to refer to the next definition, write "Nf" (b for backwards, f for forwards). There are no restrictions on how local labels are used, but at any point you can refer to at most 10 backward and 10 forward local labels.
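A small loop shows both reference forms. This is a sketch assuming a 32-bit x86 target and the cdecl calling convention; the routine name sum_to_n is made up for illustration.

```asm
# Sketch only: sum_to_n is a hypothetical routine that returns
# 1 + 2 + ... + n in %eax, with n passed at 4(%esp).
        .text
        .globl  sum_to_n
sum_to_n:
        movl    4(%esp), %ecx       # n
        xorl    %eax, %eax          # running total
        testl   %ecx, %ecx
        jle     2f                  # forward reference: the next "2:" below
1:      addl    %ecx, %eax
        decl    %ecx
        jnz     1b                  # backward reference: the most recent "1:" above
2:      ret
```

The labels "1:" and "2:" can be defined again later in the same file; each "1b" or "2f" always resolves to the nearest definition in the stated direction.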

Symbol attributes

In addition to its name, each symbol has a "value" attribute and a "type" attribute. Depending on the output format, a symbol can also have auxiliary attributes. If a symbol is used without being defined, as assumes that all of its attributes are 0. This indicates that the symbol is defined externally.

The value of a symbol is usually 32 bits. For a symbol that labels a location in the text, data, bss, or absolute section, the value is the offset from the start of that section to the label. For the text, data, and bss sections, the value of a symbol usually changes during linking, because ld changes the base address of the section. The value of a symbol in the absolute section does not change.

ld treats the value of an undefined symbol specially. If the value of the undefined symbol is 0, the symbol is not defined in the assembler source program, and ld will try to determine its value from the other files being linked. Such a symbol is produced when a program uses a symbol but does not define it. If the value of the undefined symbol is not 0, the value is the length of the common storage to be reserved, as declared by a .comm directive. The symbol refers to the first address of that storage.

The type attribute of a symbol contains relocation information for the linker and debugger, a flag indicating that the symbol is external, and other optional information.
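The two kinds of undefined symbol described above can be produced with a fragment like the following. This is a sketch assuming a 32-bit x86 target; the names buffer, get_ticks and jiffies are used only for illustration, and jiffies is assumed to be defined in some other object file.

```asm
# Sketch only: all names are hypothetical; jiffies is assumed to be
# defined in another file that ld links against.
        .comm   buffer, 64          # common symbol: its value records the length, 64 bytes

        .text
        .globl  get_ticks
get_ticks:
        movl    jiffies, %eax       # used but never defined here: undefined symbol, value 0
        movl    %eax, buffer        # store into the common storage that ld will allocate
        ret
```

Listing the resulting object file with nm should show jiffies marked "U" (undefined) and buffer marked "C" (common).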

3.2.6 as assembler directives

An assembler directive is a pseudo-instruction that tells the assembler how to operate. Directives are used to make the assembler allocate space for variables, determine the program start address, specify the current section being assembled, and modify the value of the location counter. The names of all directives start with "."; the rest of the name consists of letters and is case-insensitive, although lowercase is normally used.
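The roles listed above can be sketched with a few common GNU as directives. The fragment assumes a 32-bit x86 ELF target, and all symbol names in it are made up for illustration.

```asm
# Sketch only: counter, table, late, scratch and _start are example names.
        .data                       # select the data section
counter: .long  0                   # allocate 4 bytes for a variable
table:   .fill  16, 4, 0            # allocate 16 entries of 4 bytes, each set to 0
         .org   0x80                # advance the location counter to offset 0x80
late:    .byte  1                   # this byte is placed at offset 0x80 of the data section

        .bss                        # select the bss section
scratch: .space 32                  # reserve 32 bytes

        .text                       # select the text section
        .globl  _start              # export the entry symbol so ld can see it
_start:
        incl    counter
```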

1. .align abs-expr1, abs-expr2, abs-expr3

.align is the memory alignment directive. It advances the location counter to the next specified storage boundary in the current subsection. The first absolute expression, abs-expr1, specifies the required alignment. For 80x86 systems that use a.out format object files, the value of this expression is the number of low-order zero bits the location counter must have after it is advanced, that is, a power of 2. For example, ".align 3" advances the location counter to a multiple of 8; if it is already a multiple of 8, nothing changes. For 80x86 systems that use the ELF format, however, the value of this expression is the alignment in bytes. For example, ".align 8" advances the location counter to a multiple of 8.

The second expression, abs-expr2, gives the byte value used to fill the skipped space. It can be omitted, in which case the fill bytes are normally 0. The third expression, abs-expr3, is also optional and gives the maximum number of bytes that the alignment operation may skip. If the alignment would require skipping more bytes than this maximum, the alignment is not performed at all.
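The three expressions can be traced through a short data section. This is a sketch assuming a 32-bit x86 ELF target, where the first expression is a byte count; the labels are made up, and the offsets in the comments are relative to the start of this file's data section.

```asm
# Sketch only: flag, value, marker, next and last are hypothetical labels.
        .data
flag:   .byte   1               # offset 0, location counter is now 1
        .align  8               # skip 7 zero bytes, counter becomes 8
value:  .long   0x12345678      # offsets 8 to 11
marker: .byte   2               # offset 12, counter is now 13
        .align  16, 0xff, 4     # needs 3 bytes (at most 4 allowed): filled with 0xff, counter 16
next:   .long   0               # offsets 16 to 19, counter is now 20
        .align  32, 0, 4        # would need 12 bytes (more than 4): alignment is skipped
last:   .byte   3               # stays at offset 20
```

On an a.out target the first directive would be written ".align 3" for the same effect. GNU as also provides .balign (always a byte count) and .p2align (always a power of 2), which avoid this format dependence.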
