Any operating system written in a high-level language still has a small part of its kernel source written in assembly language. Readers who have seen the UNIX System V source code know that, of its roughly 30,000 lines of kernel code, about 2,000 lines are written in assembly. They are spread over fewer than 20 files with the extensions .s and .m, most of which are low-level routines for interrupt and exception handling, plus some common subroutines called from the rest of the kernel.
Parts of the kernel are written in assembly language for the following reasons:
● The lowest-level code in the kernel deals directly with the hardware and needs special instructions that have no counterpart in C. On the 386 architecture, for example, the port input/output instructions for peripherals, such as inb and outb, have no corresponding C construct, so these low-level operations must be written in assembly. The same holds for operations on CPU registers: setting a segment register, for instance, has to be done in assembly.
● Some special CPU instructions have no corresponding C construct either, such as those that disable and enable interrupts. In addition, different chips within the same architecture, especially newer ones, often add new instructions. The Pentium, Pentium II, and Pentium MMX, for example, all extended the original instruction set, and using those instructions also requires assembly.
● Certain routines, code fragments, or functions in the kernel are called very frequently at run time, so their (time) efficiency matters a great deal. With the same algorithm and data structures, code written in assembly is usually more efficient than code written in a high-level language, and in such code the use of every single instruction usually has to be weighed carefully. The entry to and return from a system call is a typical example. This path is used extremely often, possibly tens of thousands of times per second, so its time efficiency is critical. Moreover, entering and leaving a system call involves switching between user space and system space, and some of the instructions needed for that have no C counterpart, so system call entry and return must be written in assembly.
● In some special cases the space efficiency of a program matters as well. The boot loader of an operating system is an example: the boot code must fit in the first sector of the disk, and if it is even one byte too large it will not work, so it can only be written in assembly.
In the Linux kernel source, code written in assembly language takes several different forms:
The first form is a complete assembly source file, with .s as the file name suffix. In fact, even for "pure" assembly code, modern assemblers have borrowed the advantages of C preprocessing and add a preprocessing pass before assembly; a file that has not yet been preprocessed carries the suffix .S. Like a C program, such a .S file can use #include, #ifdef, and similar directives, and the data structures it uses can likewise be defined in .h files.
The second form is an assembly fragment embedded in a C program. Although the ANSI C standard says nothing about this, practically every C compiler in real use provides an extension for it, and the GNU C compiler gcc provides a particularly powerful one.
In addition, the kernel code contains a few assembly programs in Intel format, which are used for booting the system.
Since we focus on the Linux kernel on the Intel i386 architecture, we introduce only GNU's support for i386 assembly language.
Readers new to the Linux kernel source, even those familiar with i386 assembly, may find the assembly files and fragments in the kernel hard to read, and some may even be discouraged. There are two reasons. First, the kernel uses GNU's i386 assembly syntax, which differs from the syntax commonly used for 386 assembly. Second, the fragments embedded in C programs add further language constructs that tell the tools how to allocate registers and how to bind operands to variables defined in the C program. These constructs turn the assembly embedded in C into something of an intermediate language between 386 assembly and C.
Therefore, we first concentrate on the 386 assembly language used in these two forms in the kernel, and later explain specific assembly code again where it appears in a concrete scenario.
1. GNU 386 Assembly Language
In the DOS/Windows world, 386 assembly language uses the statement (instruction) format defined by Intel, which is also the format used by almost every textbook and reference book on 386 assembly programming. In the UNIX world, however, the format defined by AT&T is used. AT&T defined this format when it ported UNIX to the 80386 processor. UNIX was originally developed on the PDP-11 and was later ported to the VAX and the 68000 series. The assembly languages of those machines differ from Intel's in style and format, and AT&T's 386 assembly format stays closer to them. The format was later retained in UnixWare. GNU software was developed mainly in the UNIX world (even though GNU stands for "GNU's Not Unix"). To stay as compatible as possible with earlier UNIX versions and tools, the system tools developed by GNU naturally adopted AT&T's 386 assembly format rather than Intel's.
So how far apart are the two assembly formats? They are in fact largely the same, but the small differences matter, and overlooking them causes problems. Specifically, the differences are as follows:
(1) Intel format mostly uses upper-case letters, while AT&T format uses lower case.
(2) In AT&T format, register names must be prefixed with "%"; in Intel format they are not.
(3) In AT&T's 386 assembly language, the order of an instruction's source and destination operands is exactly the opposite of Intel's. In Intel format the destination comes first and the source second; in AT&T format the source comes first and the destination second. For example, to copy the contents of register eax into ebx, Intel format writes "MOV EBX, EAX" while AT&T format writes "mov %eax, %ebx". The Intel designers were thinking "EBX = EAX", while the AT&T designers were thinking "%eax -> %ebx".
(4) In AT&T format, the operand size (width) of an instruction that accesses memory is determined by the last letter of the opcode mnemonic, which acts as a suffix. The suffix letters are b (8 bits), w (16 bits), and l (32 bits). Intel format instead places "BYTE PTR", "WORD PTR", or "DWORD PTR" before the memory operand. For example, loading the byte at memory location FOO into the 8-bit register AL is written in the two formats as:
    MOV AL, BYTE PTR FOO    (Intel format)
    movb FOO, %al           (AT&T format)
(5) In AT&T format, an immediate operand must carry the "$" prefix; in Intel format it does not. So Intel's "PUSH 4" becomes "pushl $4" in AT&T format.
(6) In AT&T format, the operand of an absolute jump or call instruction (jmp/call), that is, the target address of the jump or call, must be prefixed with "*" (which may remind readers of pointers in C); in Intel format it is not.
(7) The mnemonics for the far jump and far subroutine call instructions are "ljmp" and "lcall" in AT&T format, and "JMP FAR" and "CALL FAR" in Intel format. When the target is given as immediate operands, the two formats look like this:
    CALL FAR SECTION:OFFSET    (Intel format)
    JMP FAR SECTION:OFFSET     (Intel format)
    lcall $section, $offset    (AT&T format)
    ljmp $section, $offset     (AT&T format)
The corresponding far return instruction is:
    RET FAR STACK_ADJUST       (Intel format)
    lret $stack_adjust         (AT&T format)
(8) The general form of indirect addressing differs as follows:
    SECTION:[BASE + INDEX*SCALE + DISP]    (Intel format)
    section:disp(base, index, scale)       (AT&T format)
Note that in AT&T format the address computation is implicit. For example, when section, index, and scale are all omitted, base is ebp, and disp (the displacement) is -4, the operand is written as:
    [EBP-4]      (Intel format)
    -4(%ebp)     (AT&T format)
If only base appears inside the parentheses in AT&T format, the commas can be omitted; otherwise they cannot. So (%ebp) is equivalent to (%ebp,,), which conceptually stands for (ebp, 0, 0). As another example, when index is eax, scale is 4 (32 bits), disp is foo, and everything else is omitted, the operand is:
    [foo + EAX*4]    (Intel format)
    foo(,%eax,4)     (AT&T format)
This addressing mode is often used to access a particular field of a particular element in an array of data structures: base is the start address of the array, scale is the size of each element, and index is the subscript. If the elements are data structures, disp is the offset of the field within the structure.
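As a rough illustration of the disp(base, index, scale) form, the following sketch reads one field from an array of structures with a single movl. The struct pair type and the load_value function are hypothetical names invented for this example and do not come from the kernel. The structure is deliberately 8 bytes long, because scale can only be 1, 2, 4, or 8, and a plain C fallback is assumed for non-x86 machines. The operand syntax inside the asm statement is the gcc extension covered in the next section.

```c
#include <stddef.h>

/* Hypothetical record type: 8 bytes long, so its size is a legal scale. */
struct pair {
    int key;    /* offset 0 */
    int value;  /* offset 4 */
};

/* Return table[index].value using disp(base, index, scale) addressing. */
static int load_value(const struct pair *table, size_t index)
{
    int result;
#if defined(__i386__) || defined(__x86_64__)
    /* 4(%1,%2,8): disp = 4 (offset of value), base = table,
       index = index, scale = 8 (the size of struct pair). */
    __asm__ __volatile__("movl 4(%1,%2,8), %0"
                         : "=r" (result)
                         : "r" (table), "r" (index)
                         : "memory");
#else
    result = table[index].value;  /* portable fallback for other CPUs */
#endif
    return result;
}
```

Calling load_value(table, 2) on an array of three such records should fetch the value field of the last one; the CPU computes table + 2*8 + 4 as part of the instruction, with no separate arithmetic.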
2. Assembly Language Embedded in C
When an assembly fragment has to be embedded in a C program, you can use the "asm" statement provided by gcc. Its form is:
    __asm__ ("assembly code")
    __asm__ __volatile__ (operand specification + "assembly code")
Since the full rules are rather complex, we cover only the main ones relevant to the kernel source and illustrate them with a few examples. For the remaining rules, refer to the gcc documentation.
Example 1: include/asm-i386/io.h contains this line:
    #define __SLOW_DOWN_IO __asm__ __volatile__("outb %al, $0x80")
This is an 8-bit output instruction. The suffix b marks it as 8 bits wide, and 0x80 is a constant, a so-called "immediate operand", so it needs the "$" prefix, while the register name al needs the "%" prefix.
Example 2: Several lines of assembly can also be placed in one asm statement. In the same file, __SLOW_DOWN_IO has a different definition under another condition:
    #define __SLOW_DOWN_IO __asm__ __volatile__("\njmp 1f\n1:\tjmp 1f\n1:")
This one is less intuitive. It inserts three lines of assembly; "\n" is a newline and "\t" is a tab, following the same escape rules as in a printf format string, so the assembler sees:
        jmp 1f
    1:  jmp 1f
    1:
The jump target 1f means: search forward (f for forward) for the first line labeled 1. Correspondingly, 1b means search backward. So this short fragment simply makes the CPU execute two do-nothing jumps, which consumes a little time.
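The local labels 1f and 1b can be tried out in a small loop. The sketch below is not kernel code: sum_down is a hypothetical helper, and it assumes gcc's "+r" read-write operand form and a "cc" clobber, both of which are gcc extended-asm features beyond what the two examples above use. On non-x86 machines an equivalent C loop is substituted.

```c
/* Sum n + (n-1) + ... + 1 with a loop built from a local label.
   Assumes n >= 1. */
static int sum_down(int n)
{
    int sum = 0;
#if defined(__i386__) || defined(__x86_64__)
    __asm__ __volatile__(
        "1:\n\t"             /* local label 1 */
        "addl %1, %0\n\t"    /* sum += n */
        "decl %1\n\t"        /* n -= 1; sets the zero flag when n hits 0 */
        "jnz 1b"             /* jump backward to the nearest label 1 */
        : "+r" (sum), "+r" (n)
        :
        : "cc");
#else
    do { sum += n; } while (--n != 0);  /* portable fallback */
#endif
    return sum;
}
```

Because the label is just the number 1, the same fragment can be expanded many times in one file without label clashes, which is why macros such as __SLOW_DOWN_IO use numeric labels.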
Example 3: The following code comes from include/asm-i386/atomic.h:
    static __inline__ void atomic_add(int i, atomic_t *v)
    {
        __asm__ __volatile__(
            LOCK "addl %1,%0"
            :"=m" (v->counter)
            :"ir" (i), "m" (v->counter));
    }
In general, inserting assembly code into C code is considerably more complicated, because register allocation has to be reconciled with the variables in the C code. For that purpose the assembly language has to be extended with constructs that guide the tools.
We now describe and explain the general form of assembly inserted into C code; later, when specific code comes up, we will add pointers as needed.
An assembly fragment inserted into C code has four parts separated by ":". The general form is:
    instructions : outputs : inputs : clobbers
Take care not to confuse these ":" with the ":" that ends a label in the assembly code (such as the "1:" above).
The first part is the assembly statements themselves. Their format is essentially the same as in an assembly source file, with some differences that are discussed shortly. This part, which we call the "instruction part", is mandatory; the other parts may be omitted as the situation allows. In the simplest case it therefore looks just like ordinary assembly, as in the first two examples.
In the instruction part, a number prefixed with "%", such as %0 or %1, denotes a placeholder operand that needs a register. How many such operands can be used depends on the number of general-purpose registers in the CPU. The number of distinct operands used in the instruction part thus tells how many variables have to be bound to registers; gcc and gas handle them at compile and assembly time according to the constraints that follow.
How are the constraints for binding variables expressed? That is the job of the remaining parts. The "output part", which follows the instruction part, specifies the constraints for binding the output variables, that is, the destination operands. The output part may contain several constraints separated by commas. Each output constraint starts with "=", followed by a letter describing the operand type, and then the variable to bind to, in parentheses. In Example 3 the output part is
    :"=m" (v->counter)
which contains a single constraint: "=m" says that the destination operand (%0 in the instruction part) is a memory location, namely v->counter. Any register bound to an operand described in the output part, and the operand itself, no longer holds its previous contents after the embedded assembly has run; this gives gcc the information it needs to schedule the use of these registers.
The output part is followed by the "input part". Input constraints have the same form as output constraints but without the "=". In Example 3 the input part has two constraints. The first, "ir" (i), says that %1 in the instruction may be either an immediate operand or a register, and that the operand comes from the C variable i (in parentheses). The second, "m" (v->counter), has the same meaning as in the output constraint.
Looking back at the %-numbers in the instruction part: they are operand numbers, counted from the first constraint in the output part (number 0) and increasing by one for each constraint, through the output part and then the input part.
In addition, for instructions that operate on a single byte of an operand, you can specify which byte: inserting "b" between "%" and the number selects the lowest byte, and inserting "h" selects the second-lowest byte.
Commonly used constraints:
    "m", "v", "o"         a memory location
    "r"                   any register
    "q"                   one of the registers eax, ebx, ecx, edx
    "i", "h"              an immediate operand
    "E", "F"              a floating-point number
    "g"                   "anything"
    "a", "b", "c", "d"    the registers eax, ebx, ecx, edx, respectively
    "S", "D"              the registers esi and edi, respectively
    "I"                   a constant (0 to 31)
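To see the register-specific constraints at work, here is a small sketch that is not taken from the kernel. The function name mul_32x32 is hypothetical. It assumes the documented behavior of the x86 mull instruction (eax times the operand, result in edx:eax), which is why the outputs are pinned with "a" and "d", and it falls back to ordinary C arithmetic on other CPUs.

```c
#include <stdint.h>

/* Multiply two 32-bit values and return the full 64-bit product. */
static uint64_t mul_32x32(uint32_t x, uint32_t y)
{
    uint32_t lo, hi;
#if defined(__i386__) || defined(__x86_64__)
    /* %0 = lo in eax, %1 = hi in edx, %2 = x in eax, %3 = y anywhere.
       mull multiplies eax by %3 and leaves the product in edx:eax. */
    __asm__("mull %3"
            : "=a" (lo), "=d" (hi)
            : "a" (x), "rm" (y)
            : "cc");
#else
    uint64_t p = (uint64_t) x * y;   /* portable fallback */
    lo = (uint32_t) p;
    hi = (uint32_t) (p >> 32);
#endif
    return ((uint64_t) hi << 32) | lo;
}
```

The operand numbers follow the counting rule described above: the two outputs are %0 and %1, and the inputs continue with %2 and %3.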
Returning to Example 3, readers should now readily see that the code adds the value of the parameter i to v->counter. The LOCK prefix in the code locks the system bus while the addl instruction executes, which guarantees the atomicity of the operation.
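The same operation can be reproduced outside the kernel with a few lines. In this sketch, my_atomic_t and my_atomic_add are hypothetical stand-ins for the kernel's atomic_t and atomic_add. It assumes gcc's "+m" read-write constraint instead of listing v->counter once as an output and once as an input, and on non-x86 machines it assumes gcc's __atomic_fetch_add builtin as a substitute.

```c
/* Hypothetical stand-in for the kernel's atomic_t. */
typedef struct { volatile int counter; } my_atomic_t;

/* Atomically add i to v->counter. */
static inline void my_atomic_add(int i, my_atomic_t *v)
{
#if defined(__i386__) || defined(__x86_64__)
    /* %0 is the counter in memory (read and written), %1 is i,
       either as an immediate or in a register. */
    __asm__ __volatile__("lock; addl %1, %0"
                         : "+m" (v->counter)
                         : "ir" (i)
                         : "cc");
#else
    __atomic_fetch_add(&v->counter, i, __ATOMIC_SEQ_CST);  /* fallback */
#endif
}
```

When i is a compile-time constant after inlining, the "i" alternative lets gcc emit it directly as an immediate operand; otherwise the "r" alternative places it in a register.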
Example 4: Here is another piece of embedded assembly, this time from include/asm-i386/bitops.h:
    #ifdef CONFIG_SMP
    #define LOCK_PREFIX "lock ; "
    #else
    #define LOCK_PREFIX ""
    #endif

    #define ADDR (*(volatile long *) addr)

    static __inline__ void set_bit(int nr, volatile void * addr)
    {
        __asm__ __volatile__( LOCK_PREFIX
            "btsl %1,%0"
            :"=m" (ADDR)
            :"Ir" (nr));
    }
The btsl instruction sets one bit of a 32-bit operand to 1. With the parameters nr and addr, it sets bit nr of the 32-bit word at memory address addr to 1.
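A user-space approximation of set_bit might look like the sketch below. The name my_set_bit is hypothetical, the function takes a pointer to a 32-bit word rather than a void pointer, and it assumes 0 <= nr <= 31. As a further assumption it uses gcc's "+m" read-write constraint, and on non-x86 machines it substitutes gcc's __atomic_fetch_or builtin.

```c
/* Set bit nr (0..31) of *word to 1. */
static inline void my_set_bit(int nr, volatile unsigned int *word)
{
#if defined(__i386__) || defined(__x86_64__)
    /* "Ir": nr may be an immediate in the range 0..31 or a register. */
    __asm__ __volatile__("lock; btsl %1, %0"
                         : "+m" (*word)
                         : "Ir" (nr)
                         : "cc");
#else
    __atomic_fetch_or(word, 1u << nr, __ATOMIC_SEQ_CST);  /* fallback */
#endif
}
```

Setting bit 3 of a zeroed word, for instance, leaves the value 8 in it, and bits that are already set are not disturbed.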
Example 5: Now a more complex but very important example, from include/asm-i386/string.h:
    static inline void * __memcpy(void * to, const void * from, size_t n)
    {
        int d0, d1, d2;
        __asm__ __volatile__(
            "rep ; movsl\n\t"
            "testb $2,%b4\n\t"
            "je 1f\n\t"
            "movsw\n"
            "1:\ttestb $1,%b4\n\t"
            "je 2f\n\t"
            "movsb\n"
            "2:"
            : "=&c" (d0), "=&D" (d1), "=&S" (d2)
            :"0" (n/4), "q" (n),"1" ((long) to),"2" ((long) from)
            : "memory");
        return (to);
    }
The __memcpy function here is the kernel's low-level implementation of the familiar memcpy, which copies the contents of a block of memory. The parameter to is the destination address, from is the source address, and n is the length to copy, in bytes. gcc generates the following code:
        rep ; movsl
        testb $2,%b4
        je 1f
        movsw
    1:  testb $1,%b4
        je 2f
        movsb
    2:
The output part has three constraints. The function's local variables d0, d1, and d2 correspond to operands %0 through %2: d0 must be placed in register ecx, d1 in edi, and d2 in esi. Now look at the input part. It has four constraints, corresponding to operands %3, %4, %5, and %6. Operand %3 shares a register with operand %0, so it also lives in ecx, and it holds the copy length converted from a byte count to a long-word count (n/4). Operand %4 is n itself and needs a register of its own. Operands %5 and %6 are the parameters to and from; they share registers with %1 and %2 respectively, that is, edi and esi.
Now look at the instruction part. The first item is "rep", which is only a prefix: it says that the following instruction, movsl, is to be executed repeatedly, with the contents of ecx decremented by 1 on each repetition until they reach 0. In this code, therefore, movsl executes n/4 times. movsl is an important complex instruction in the 386 instruction set: it copies one long word from the location esi points to into the location edi points to, and then adds 4 to both esi and edi. So by the time the movsl has finished and the first testb is about to execute, all the long words have been copied and at most three bytes remain. The three registers above are used implicitly in this process, which explains why the input and output parts must pin those operands to specific registers.
The remaining bytes (at most three) are handled next. The first testb tests operand %4, checking bit 1 (mask $2) of the lowest byte of n. If this bit is 1, at least two bytes remain, so movsw copies one short word (adding 2 to esi and edi); otherwise the movsw is skipped. The second testb then tests bit 0 (mask $1) of operand %4. If this bit is 1, one more byte remains, so movsb copies it; otherwise it is skipped. When label 2 is reached, the copy is complete.
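The walkthrough can be checked with a user-space adaptation of the same fragment. This is a sketch under stated assumptions, not the kernel's code: copy_bytes is a hypothetical name, the dummy outputs are declared as uintptr_t rather than int so that the fragment also works where pointers are 64 bits wide, a "cc" clobber is added, and the C library's memcpy is used as a fallback on non-x86 machines. It also assumes the direction flag is clear, as the usual calling conventions guarantee.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy n bytes from `from` to `to`; returns `to`. */
static void *copy_bytes(void *to, const void *from, size_t n)
{
#if defined(__i386__) || defined(__x86_64__)
    uintptr_t d0, d1, d2;  /* dummy outputs for the count, edi and esi */
    __asm__ __volatile__(
        "rep ; movsl\n\t"         /* copy n/4 long words */
        "testb $2, %b4\n\t"       /* bit 1 of n set? */
        "je 1f\n\t"
        "movsw\n"                 /* copy two more bytes */
        "1:\ttestb $1, %b4\n\t"   /* bit 0 of n set? */
        "je 2f\n\t"
        "movsb\n"                 /* copy one more byte */
        "2:"
        : "=&c" (d0), "=&D" (d1), "=&S" (d2)
        : "0" ((uintptr_t) (n / 4)), "q" (n),
          "1" ((uintptr_t) to), "2" ((uintptr_t) from)
        : "memory", "cc");
    (void) d0; (void) d1; (void) d2;
#else
    memcpy(to, from, n);          /* portable fallback */
#endif
    return to;
}
```

For a 13-byte copy, for example, the rep movsl moves three long words, bit 1 of 13 is clear so movsw is skipped, and bit 0 is set so movsb moves the last byte.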
The include/asm-i386 directory contains many basic assembly functions of this kind; if you have time, pick a few of them to work through as an exercise.