A good way to read the Linux operating system kernel source code
This article is for Linux enthusiasts with a strong interest in the kernel. It aims to introduce an entry-level method for reading the Linux kernel source code rather than to explain the kernel's complex mechanisms.
I. Kernel source file organization:
1. The Linux kernel source code is usually installed in /usr/src/linux. It follows a very simple numbering convention: a kernel whose second (minor) version number is even (such as 2.0.30) is a stable release, and a kernel whose minor version number is odd (such as 2.1.42) is a development release.
This article is based on the stable 2.2.5 source code. The platform used in Part II is Red Hat Linux 6.0.
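The numbering convention described above can be checked mechanically. The following is a minimal sketch, assuming version strings of the form "major.minor.patch"; the function name kernel_series_is_stable is made up for illustration and is not part of the kernel.

```c
#include <assert.h>
#include <stdio.h>

/* Returns 1 if the version string names a stable series (even minor
 * number), 0 if it names a development series (odd minor number),
 * and -1 if the string is not of the form "major.minor.patch".
 * Hypothetical helper, written only to illustrate the convention. */
int kernel_series_is_stable(const char *version)
{
    int major, minor, patch;

    assert(version != NULL);
    if (sscanf(version, "%d.%d.%d", &major, &minor, &patch) != 3)
        return -1;
    if (major < 0 || minor < 0 || patch < 0)
        return -1;
    return (minor % 2 == 0) ? 1 : 0;
}
```

With this helper, "2.0.30" and "2.2.5" are classified as stable and "2.1.42" as development, matching the examples in the text.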
2. The kernel source files are organized in a tree structure. At the top of the source tree you will see the following directories:
● arch: The arch subdirectory contains all architecture-specific kernel code. Each of its subdirectories represents one supported architecture; for example, i386 holds the code for Intel CPUs and compatible architectures. PCs are generally based on this directory;
● include: The include subdirectory contains most of the header files needed to build the kernel. Platform-independent header files are in include/linux, header files specific to Intel CPUs are in include/asm-i386, and include/scsi holds the header files for SCSI devices;
● init: This directory contains the kernel initialization code (note: not the system boot code) in two files, main.c and version.c. It is a very good starting point for studying how the kernel works;
● mm: This directory contains all architecture-independent memory management code, such as page-based memory allocation and freeing. Architecture-specific memory management code is located in arch/*/mm/, for example arch/i386/mm/fault.c;
● kernel: The main kernel code. The files in this directory implement most of the kernel functions of a Linux system; the most important file is sched.c. Likewise, the architecture-specific code is in arch/*/kernel;
● drivers: Holds all the device drivers in the system. Each kind of driver occupies its own subdirectory; for example, /block holds the block device drivers, such as IDE (ide.c). To see how all devices that may contain a file system are initialized, look at device_setup() in drivers/block/genhd.c. It initializes not only the hard disks but also the network, because the network is needed to mount an NFS file system;
● lib: Holds the kernel's library code;
● net: The kernel's networking code;
● ipc: Contains the kernel's inter-process communication code;
● fs: All the file system code and the code for the various types of file operations. Each subdirectory supports one file system, such as fat and ext2;
● scripts: Contains the script files used to configure the kernel.
Generally, each directory contains a .depend file and a Makefile. Both are auxiliary files used during compilation, and reading them carefully helps clarify the relationships and dependencies between files. In addition, some directories contain a README file that describes the files in that directory, which also helps in understanding the kernel source code.
II. Practice: adding a system call to your kernel
The Linux kernel source code is organized sensibly in a tree structure, with functionally related files placed in the same subdirectory, which makes the code more readable. However, the kernel source is very large and very complex. Even with a sensible file organization there are still many relationships between files in different directories, and analyzing one part of the kernel usually means reading several other related files, which may not be in the same subdirectory.
The sheer size of the system and the intricate relationships between files are probably the main reasons many people are put off. The reward for this daunting work is nonetheless very attractive: you not only learn a great deal of low-level computer knowledge (such as the system boot process described below), you also come to appreciate the elegance of the overall operating system architecture and the cleverness of the algorithms used to solve specific problems. More importantly, analyzing the source code makes you more professional bit by bit. After analyzing even a tenth of the code, you will have a clear sense of what kind of code is written by a professional programmer and what kind is written by a hobbyist.
To give readers a better feel for this, a concrete kernel analysis example is presented below. It should give readers a concrete understanding of how the Linux kernel is organized and also teach some methods of kernel analysis.
The analysis example follows:
A. Operating platform:
Hardware: CPU: Intel Pentium II;
Software: Red Hat Linux 6.0; kernel version 2.2.5
B. Kernel source code analysis:
1. System boot and initialization: A Linux system can be booted in several ways: LILO, Loadlin, and the Linux boot sector loader (bootsect-loader). The last of these corresponds to the source file arch/i386/boot/bootsect.S, a real-mode assembly program that, for reasons of space, is not analyzed here. Whichever boot method is used, the system eventually jumps to arch/i386/boot/setup.S. setup.S performs initialization in real mode to prepare the system for entering protected mode. After that, the system runs arch/i386/kernel/head.S (for a compressed kernel, arch/i386/boot/compressed/head.S runs first);
setup_idt, an assembly routine defined in head.S, is responsible for creating the IDT (Interrupt Descriptor Table), the table that holds the entry points of all traps and interrupts, including the entry point of the system call dispatcher system_call. head.S also carries out some other initialization work;
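To make the role of the IDT more concrete, here is a minimal user-space sketch in C. It models the table as 256 function pointers that are first all pointed at a default "ignore" handler, which is what setup_idt does with ignore_int, and then shows one vector being replaced by a real handler at a later stage. All names ending in _model are invented for this illustration, and the real table holds gate descriptors rather than C function pointers.

```c
#include <assert.h>
#include <stddef.h>

#define IDT_ENTRIES 256              /* the i386 IDT has 256 vectors */

typedef void (*handler_t)(void);

int ignored_count;                   /* calls that reached the default handler */
int syscall_hits;                    /* calls that reached the sample handler  */
handler_t idt_model[IDT_ENTRIES];    /* simplified stand-in for the IDT        */

/* Stand-in for the default handler installed in every slot at first. */
void ignore_int_model(void) { ignored_count++; }

/* Sample "real" handler, standing in for system_call. */
void system_call_model(void) { syscall_hits++; }

/* Model of setup_idt: point every vector at the default handler. */
void setup_idt_model(void)
{
    for (size_t i = 0; i < IDT_ENTRIES; i++)
        idt_model[i] = ignore_int_model;
}

/* Model of a later stage replacing one vector with a real handler. */
void install_handler_model(unsigned int vector, handler_t h)
{
    assert(vector < IDT_ENTRIES && h != NULL);
    idt_model[vector] = h;
}

/* Model of the CPU delivering an interrupt on the given vector. */
void raise_model(unsigned int vector)
{
    assert(vector < IDT_ENTRIES);
    idt_model[vector]();
}
```

In this model, raising vector 0x80 right after setup_idt_model() only reaches the default handler; it reaches system_call_model once install_handler_model(0x80, system_call_model) has run.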
2. The first kernel routine to run after system initialization, asmlinkage void __init start_kernel(void), is defined in /usr/src/linux/init/main.c. It calls a function in /usr/src/linux/arch/i386/kernel/traps.c, void __init trap_init(void), which installs the entry addresses of the trap and interrupt service routines in the IDT. The system call dispatcher system_call is one of these interrupt service routines. trap_init() invokes set_system_gate(SYSCALL_VECTOR, &system_call); to attach the entry of the system call dispatcher to interrupt 0x80;
SYSCALL_VECTOR is the constant 0x80, defined in /usr/src/linux/arch/i386/kernel/irq.h, and system_call is the entry address of the dispatcher. The dispatcher itself is written in assembly language in /usr/src/linux/arch/i386/kernel/entry.S;
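What "attaching" a handler to vector 0x80 means at the hardware level can be sketched as follows. An i386 IDT entry is an 8-byte gate descriptor that packs the handler address, a code segment selector, a gate type and a privilege level (DPL). The sketch below encodes such a descriptor in portable C. It is an illustration under assumptions, not the kernel's own code: the struct and function names are invented, and the handler address and the selector value 0x10 in the usage note are example values only.

```c
#include <assert.h>
#include <stdint.h>

#define SYSCALL_VECTOR_MODEL 0x80  /* value the article gives for SYSCALL_VECTOR */
#define GATE_TYPE_INTR32     14    /* i386 32-bit interrupt gate */
#define GATE_TYPE_TRAP32     15    /* i386 32-bit trap gate */
#define DPL_KERNEL           0     /* reachable only from ring 0 */
#define DPL_USER             3     /* reachable from user mode via "int" */

/* Simplified view of one 8-byte gate descriptor. */
struct gate_desc_model {
    uint32_t low;   /* selector in bits 16-31, handler offset bits 0-15 */
    uint32_t high;  /* handler offset bits 16-31, present bit, DPL, type */
};

/* Pack a handler address, selector, gate type and DPL into a descriptor. */
struct gate_desc_model encode_gate(uint32_t handler, uint16_t selector,
                                   unsigned int type, unsigned int dpl)
{
    struct gate_desc_model g;

    assert(type < 16 && dpl < 4);
    g.low  = ((uint32_t)selector << 16) | (handler & 0x0000FFFFu);
    g.high = (handler & 0xFFFF0000u)   /* upper half of the handler address */
           | 0x8000u                   /* present bit */
           | (dpl << 13)               /* descriptor privilege level */
           | (type << 8);              /* gate type */
    return g;
}

/* Recover the handler address from a descriptor. */
uint32_t gate_handler(struct gate_desc_model g)
{
    return (g.high & 0xFFFF0000u) | (g.low & 0x0000FFFFu);
}

/* Recover the privilege level from a descriptor. */
unsigned int gate_dpl(struct gate_desc_model g)
{
    return (g.high >> 13) & 3u;
}
```

For example, encode_gate(0xC0109A48u, 0x10, GATE_TYPE_TRAP32, DPL_USER) describes a gate that user programs are allowed to trigger with an int instruction. The DPL of 3 is the property that makes vector 0x80 usable as a system call entry.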
3. The dispatcher is mainly responsible for saving the processor state from before the system call, checking whether the current call is valid, and redirecting the processor to the entry of the corresponding system service routine stored in sys_call_table. After the service routine returns, it restores the processor state and returns to the user program;
The system call vectors (the system call numbers) are defined in /usr/src/linux/include/asm-i386/unistd.h, and sys_call_table is defined in /usr/src/linux/arch/i386/kernel/entry.S. /usr/src/linux/include/asm-i386/unistd.h also defines the user programming interface for system calls;
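The steps the dispatcher performs (save state, validate the call number, jump through the table, restore state, return the result) can be modeled in ordinary C. The sketch below is a user-space illustration only. The real dispatcher is assembly code in entry.S, and every name ending in _model, the two sample service routines and the table size of 4 are assumptions made for this example. The use of eax for the call number and return value and ebx for the first argument follows the i386 system call convention.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NR_SYSCALLS_MODEL 4        /* hypothetical table size */

typedef long (*syscall_fn)(long arg);

/* Hypothetical service routines standing in for real sys_xxx() functions. */
long sys_zero_model(long arg)   { (void)arg; return 0; }
long sys_double_model(long arg) { return 2 * arg; }

/* Model of sys_call_table: call number -> service routine. */
syscall_fn sys_call_table_model[NR_SYSCALLS_MODEL] = {
    [0] = sys_zero_model,
    [1] = sys_double_model,
    /* slots 2 and 3 stay NULL, meaning "not implemented" in this model */
};

/* Tiny stand-in for the saved user registers. */
struct regs_model {
    long eax;   /* call number on entry, return value on exit */
    long ebx;   /* first argument */
};

/* Model of the dispatcher logic. */
long system_call_model(struct regs_model *regs)
{
    struct regs_model saved;
    long nr, ret;

    assert(regs != NULL);
    saved = *regs;                          /* save the caller's state */
    nr = regs->eax;

    if (nr < 0 || nr >= NR_SYSCALLS_MODEL || sys_call_table_model[nr] == NULL)
        ret = -ENOSYS;                      /* invalid or unimplemented call */
    else
        ret = sys_call_table_model[nr](regs->ebx);

    *regs = saved;                          /* restore the caller's state */
    regs->eax = ret;                        /* hand the result back in eax */
    return ret;
}
```

Calling system_call_model with eax set to 1 and ebx set to 21 runs sys_double_model and leaves 42 in eax, while a call number outside the table yields -ENOSYS.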
4. As this shows, Linux system calls work much like the int 21h interrupt service in DOS: interrupt 0x80 serves as the single entry point, and control is then transferred to the entry addresses of the individual service routines stored in sys_call_table, which provide the different services;
From the source code analysis above, adding a system call requires adding an entry to sys_call_table that holds the entry address of your own system service routine, and then recompiling the kernel. The system service routine itself is, of course, also required.
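Adding a call therefore comes down to three pieces that must agree: a number, a service routine, and a table slot at exactly that number. The sketch below models this in user-space C. It does not show the actual contents of unistd.h or entry.S; the call number 5, the table size 8 and the routine sys_getanswer_model are hypothetical, chosen only to show how the pieces line up.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NR_SYSCALLS_MODEL  8   /* hypothetical table size */
#define NR_GETANSWER_MODEL 5   /* hypothetical number chosen for the new call */

typedef long (*syscall_fn)(void);

/* Placeholder used for slots that have no implementation. */
long sys_ni_syscall_model(void) { return -ENOSYS; }

/* The new system service routine (hypothetical). */
long sys_getanswer_model(void) { return 42; }

/* Model of sys_call_table. The position of each entry is its call number,
 * so the new routine has to sit at index NR_GETANSWER_MODEL. */
syscall_fn sys_call_table_model[NR_SYSCALLS_MODEL] = {
    sys_ni_syscall_model,   /* 0 */
    sys_ni_syscall_model,   /* 1 */
    sys_ni_syscall_model,   /* 2 */
    sys_ni_syscall_model,   /* 3 */
    sys_ni_syscall_model,   /* 4 */
    sys_getanswer_model,    /* 5: NR_GETANSWER_MODEL, the added entry */
    sys_ni_syscall_model,   /* 6 */
    sys_ni_syscall_model,   /* 7 */
};

/* Look up a call number and run the routine stored in that slot. */
long invoke_model(unsigned int nr)
{
    if (nr >= NR_SYSCALLS_MODEL)
        return -ENOSYS;
    assert(sys_call_table_model[nr] != NULL);
    return sys_call_table_model[nr]();
}
```

If the routine were placed in a different slot than the number handed to user programs, invoke_model(NR_GETANSWER_MODEL) would reach the wrong routine or the placeholder, which is why the number and the table position have to be kept in step.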
As we have seen, the source files related to system calls in this version of the Linux kernel include the following:
arch/i386/boot/bootsect.S
arch/i386/boot/setup.S
arch/i386/boot/compressed/head.S
arch/i386/kernel/head.S
init/main.c
arch/i386/kernel/traps.c
arch/i386/kernel/entry.S
arch/i386/kernel/irq.h
include/asm-i386/unistd.h
Of course, these are only the main files involved. In practice, the only files that actually have to be modified to add a system call are include/asm-i386/unistd.h and arch/i386/kernel/entry.S.
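From user space, a system call is selected purely by its number, which is why the number published in unistd.h and the table slot in entry.S have to match. As a minimal sketch, assuming a present-day Linux system with glibc rather than the 2.2.5 setup used in this article, the generic syscall() wrapper can invoke an existing call by number. The function name getpid_by_number is invented for this example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* ask glibc to declare syscall() */
#endif
#include <assert.h>
#include <sys/syscall.h>       /* SYS_getpid */
#include <sys/types.h>
#include <unistd.h>            /* syscall(), getpid() */

/* Invoke getpid by its call number instead of through the usual wrapper. */
long getpid_by_number(void)
{
    long pid = syscall(SYS_getpid);

    assert(pid > 0);           /* getpid cannot fail for a running process */
    return pid;
}
```

The value returned by getpid_by_number() is the same process ID that the ordinary getpid() wrapper reports, since both end up in the same service routine.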