Linux0.11 Kernel anatomy – Kernel architecture ©fanwu
"Linux kernel full comment" Download: http://files.cnblogs.com/files/HanBlogs/linux-kernel.pdf (click on the lower right corner to save the PDF after entering it ^_^)
A complete, usable operating system consists of four main parts: hardware, the operating system kernel, operating system services, and user applications.
User applications are programs such as word processors, Internet browsers, and applications that users write themselves.
Operating system service programs provide services to users and are regarded as part of the operating system's functionality.
On Linux, these programs include the X Window System, the shell command interpreter, the kernel programming interface, and other system programs. The operating system kernel is the part this article focuses on; it is mainly responsible for abstracting hardware resources and scheduling access to them.
The main purpose of the Linux kernel is to interact with the computer hardware: to program and control hardware components and operate their interfaces, to schedule access to hardware resources, and to give user programs a high-level execution environment and a virtual interface to the hardware. Based on the Linux 0.11 kernel source code, this article first briefly describes the basic architecture of the Linux kernel and its main modules. It then describes several important data structures that appear in the source code. Finally, it describes how to set up an environment for compiling the Linux 0.11 kernel and experimenting with it.
1. Linux kernel mode
Operating system kernels currently follow one of two structural models: the monolithic kernel model and the layered microkernel model. The Linux 0.11 kernel is monolithic.
The main advantages of a monolithic kernel are compact code and fast execution; its main weakness is that the layering is not strict.
In a monolithic system, the operating system provides a service as follows. The application executes the system call instruction (int 0x80) with specified parameter values, which switches the CPU from user mode to kernel mode. The operating system then invokes the specific system call service routine selected by those parameter values, and the service routine in turn calls underlying support functions as needed to complete the work. After completing the service the application requested, the operating system switches from kernel mode back to user mode and returns to the application, which continues with its next instruction.
In summary, a monolithic kernel can be roughly divided into three layers: the main program layer that invokes services, the service layer that executes system calls, and the underlying functions that support the system calls, as shown in the following figure:
Simple structure model with single kernel mode
2. Linux Kernel system Architecture
The Linux kernel consists of 5 modules: the process scheduling module, the memory management module, the file system module, the interprocess communication module, and the network interface module.
The process scheduling module controls how processes use CPU resources. Its scheduling policy aims to give every process fair and reasonable access to the CPU while ensuring that the kernel can perform hardware operations in time.
The memory management module ensures that all processes can safely share the machine's main memory. It also supports virtual memory management, which lets Linux support processes that use more memory than is physically present. The file system can be used to swap unused memory blocks out to external storage and swap them back in when they are needed.
The file system module supports driving external devices and storing data on them. The virtual file system module hides the details of different hardware devices by providing a common file interface to all external storage devices. This lets Linux support a variety of file system formats that are compatible with other operating systems.
The interprocess communication module supports the exchange of information among processes.
The network interface module provides access to a variety of network communication standards and supports many kinds of network hardware.
The dependencies between these modules are shown in the figure. Lines represent dependencies, and the dashed lines and dashed boxes mark parts that Linux 0.11 does not yet implement (the virtual file system appears starting with Linux 0.95, and network interface support with version 0.96).
Linux kernel module structure and interdependencies:
Following the monolithic structure model, the main kernel modules can also be drawn as a block diagram based on the structure of the Linux 0.11 kernel source code:
3. Linux Kernel Process Control
In the Linux 0.11 kernel, the system can have at most 64 processes at the same time. Apart from the first process, which is set up "by hand", every process is created by an existing process with the fork system call. The kernel identifies each process by its process ID (PID). A process consists of executable instruction code, data, and a stack area. The code and data parts of a process correspond to the code segment and data segment of an executable file. Each process can only execute its own code and access its own data and stack area. Communication between processes must go through system calls. On a system with only one CPU, only one process is running at any moment. The kernel's scheduler lets the processes take turns running in time slices.
In a Linux system, a process can execute in kernel mode or in user mode, so Linux keeps the kernel stack and the user stack separate. The user stack temporarily holds function call parameters, local variables, and other data while the process runs in user mode. The kernel stack holds the information for function calls made while kernel code executes. The kernel manages processes through a process table in which each process occupies one entry. In Linux, a process table entry is a task structure.
While a process executes, the values in all CPU registers, the state of the process, and the contents of its stack are called the context of the process. When the kernel needs to switch to another process, it must save all the state of the current process, that is, its context, so that when the process is scheduled again it can resume from the state it was in at the switch. When an interrupt occurs, the kernel runs the interrupt service routine in kernel mode, in the context of the interrupted process. It preserves all the resources it needs to use so that the interrupted process can resume when the interrupt service ends.
A process can be in one of several states during its lifetime, called process states. See the following figure:
When a process is being executed by the CPU, it is in the running state. When a process is waiting for a system resource, it is in a sleeping (waiting) state; Linux distinguishes interruptible and uninterruptible wait states. When the resource becomes available, the process is woken up and becomes ready to run, which is called the ready state. When a process has stopped running but its parent process has not yet queried its status, it is in the zombie state. When a process is halted, it is in the stopped state. The kernel performs a process switch only when a process moves from the kernel running state to sleep. A process running in kernel mode cannot be preempted by other processes, and one process cannot change the state of another. To avoid kernel data errors caused by process switching, the kernel disables all interrupts while it executes critical-section code.
4. How the Linux kernel uses memory
In the Linux 0.11 kernel, physical memory is divided into several functional areas so that it can be used effectively.
When discussing address mapping in Linux 0.11, we need to distinguish 3 kinds of address: a. the process virtual address, which starts at virtual address 0 and extends to at most 64 MB; b. the CPU's linear address space (0 to 4 GB); c. the actual physical memory address. A process virtual address is first translated through the process's local segment descriptor into an address in the CPU's linear address space, and is then mapped to an actual physical page through the page directory table PDT (the first-level page table) and a page table PT (the second-level page table). These two translation steps must not be confused. To use actual physical memory, the linear addresses of each process are dynamically mapped to different pages of the main memory area through the two-level page tables. The maximum virtual memory space available to each process is therefore 64 MB. The logical address of a process can be converted to a linear address by adding the task number × 64 MB. In the comments, however, we usually refer to an address within a process simply as a linear address.
5. Linux kernel source code directory structure
Because the Linux kernel is monolithic, almost all programs in the kernel are closely connected, with many dependencies and calls between them. When reading one source file you therefore often need to consult other related files, so it is worth becoming familiar with the directory structure and layout of the source files before starting to read the kernel source code.
Here we first list the complete source code directory of the Linux kernel, including subdirectories. We then describe, one by one, the main function of the programs in each directory, so that a general framework of the whole kernel source can take shape in our minds before we start reading the code. When linux-0.11.tar.gz is unpacked with the tar command, the kernel source files are placed in the linux directory. The directory structure is:
This kernel version's source tree contains 14 subdirectories with 102 code files in total. The contents of these subdirectories are described below.
1. Kernel home directory linux
The linux directory is the home directory of the source code. Besides the 14 subdirectories, it contains a single Makefile. This file is the configuration file for the build tool make. The main purpose of make is to determine automatically which files in a program made of many source files need to be recompiled, by identifying which files have been modified. make is therefore a management tool for program projects. The Makefile in the linux directory also invokes the Makefiles in the subdirectories, so that when any file under the linux directory (including subdirectories) is modified, make recompiles it. To compile all the source files of the entire kernel, it is enough to run make once in the linux directory.
2. Boot program directory boot
The boot directory contains 3 assembly language files, which are the first files in the kernel source to be compiled. Their main job is to boot the kernel when the computer is powered on, load the kernel code into memory, and perform some system initialization before entering 32-bit protected mode. bootsect.s and setup.s must be assembled with as86 and use the as86 assembly syntax (similar to Microsoft's), while head.s must be assembled with GNU as and uses AT&T syntax. Both assembly syntaxes are briefly described in the code comments of the next chapter and in the notes that follow the code listings. bootsect.s is the disk boot block program; after compilation it resides in the first sector of the disk (the boot sector: track (cylinder) 0, head 0, sector 1). After the PC is powered on and the ROM BIOS self-test completes, the BIOS loads it to memory address 0x7c00 and executes it. setup.s mainly reads the machine's hardware configuration parameters and moves the kernel module system to the appropriate memory location. head.s is compiled and linked at the very front of the system module; it mainly probes and sets up hardware devices and performs the initial setup of memory management paging.
3. File system directory fs
This directory contains the file system implementation, a total of 17 C programs. The main reference relationships between these programs are shown in a figure in which each box represents a file, arranged from top to bottom roughly in order of reference. The .c suffix is omitted from each file name. Dashed boxes are program files that do not belong to the file system, lines with arrows represent a reference, and thick lines indicate mutual references.
4. Header file home directory include
<a.out.h> // a.out header file. Defines the a.out executable file format and some macros.
<const.h> // Constant symbol header file. Currently only defines the flag bits of the i_mode field in the i-node.
<ctype.h> // Character type header file. Defines some macros for character type tests and conversions.
<errno.h> // Error number header file. Contains the various error numbers in the system. (Linus took them from Minix.)
<fcntl.h> // File control header file. Defines the operation control constants used for files and their descriptors.
<signal.h> // Signal header file. Defines signal symbol constants, signal structures, and signal operation function prototypes.
<stdarg.h> // Standard argument header file. Defines variable argument lists in the form of macros. It mainly describes one type (va_list) and three macros (va_start, va_arg, and va_end), used by the vsprintf, vprintf, and vfprintf functions.
<stddef.h> // Standard definition header file. Defines NULL and offsetof(TYPE, MEMBER).
<string.h> // String header file. Mainly defines some embedded functions for string operations.
<termios.h> // Terminal input/output function header file. Mainly defines the terminal interface that controls asynchronous communication ports.
<time.h> // Time type header file. Mainly defines the tm structure and some time-related function prototypes.
<unistd.h> // Linux standard header file. Defines various symbolic constants and types and declares various functions. If __LIBRARY__ is defined, it also includes the system call numbers and the embedded assembly macro _syscall0() and its relatives.
<utime.h> // User time header file. Defines the access and modification time structure and the utime() prototype.
Architecture-related header files subdirectory include/asm
These header files mainly define data structures, macro functions, and variables that are closely tied to the CPU architecture. There are 4 files in total.
<asm/io.h> // I/O header file. Defines the functions that operate on I/O ports, in the form of macros with embedded assembly.
<asm/memory.h> // Memory copy header file. Contains the memcpy() embedded assembly macro function.
<asm/segment.h> // Segment operation header file. Defines embedded assembly functions for segment register operations.
<asm/system.h> // System header file. Defines embedded assembly macros that set or modify descriptors, interrupt gates, and so on.
Linux kernel dedicated header files subdirectory include/linux
<linux/config.h> // Kernel configuration header file. Defines the keyboard language and hard disk type (HD_TYPE) options.
<linux/fdreg.h> // Floppy disk header file. Contains some definitions of the floppy disk controller parameters.
<linux/fs.h> // File system header file. Defines the file table structures (file, buffer_head, m_inode, etc.).
<linux/hdreg.h> // Hard disk parameter header file. Defines the hard disk register ports, status codes, partition table, and other information.
<linux/head.h> // head header file. Defines the simple structure of a segment descriptor and several selector constants.
<linux/kernel.h> // Kernel header file. Contains prototype definitions of some commonly used kernel functions.
<linux/mm.h> // Memory management header file. Contains the page size definition and some page release function prototypes.
<linux/sched.h> // Scheduler header file. Defines the task structure task_struct, the data for the initial task 0, and some embedded assembly function macros for setting and reading descriptor parameters.
<linux/sys.h> // System call header file. Contains the 72 system call C function handlers, whose names begin with 'sys_'.
<linux/tty.h> // tty header file. Defines the parameters and constants related to tty_io and serial communication.
System-specific data structure subdirectory include/sys
<sys/stat.h> // File status header file. Contains the file or file system status structure stat{} and constants.
<sys/times.h> // Defines the process running-time structure tms and the times() function prototype.
<sys/types.h> // Type header file. Defines the basic system data types.
<sys/utsname.h> // System name structure header file.
<sys/wait.h> // Wait call header file. Defines the system calls wait() and waitpid() and the related constant symbols.
5. Kernel initialization program directory init
This directory contains only one file, main.c, which performs all kernel initialization and then moves to user mode, creates a new process, and runs the shell on the console device. The program first allocates the buffer memory according to the amount of memory in the machine; if a virtual disk is configured, space is reserved for it after the buffer memory. It then initializes all the hardware, "manually" creates the first task (task 0), and sets the interrupt enable flag. After moving from kernel mode to user mode, it calls the process creation function fork() for the first time to create a process that runs init(). That process sets up the console environment and then creates a child process to run the shell.
6. Kernel program home directory kernel
The linux/kernel directory contains 12 code files and one Makefile, plus 3 subdirectories. Because the call relationships between the code in these files are complex, a reference graph between the files is not given here, but the files can still be roughly categorized:
asm.s // Handles the interrupts caused by system hardware exceptions. The actual handling of each hardware exception is in traps.c; during each interrupt, the corresponding C handler function in traps.c is called.
exit.c // Mainly contains the system calls that handle process termination. Includes the functions for releasing a process, terminating a session (process group), and handling program exit, as well as system call functions for killing, terminating, and suspending processes.
fork.c // Provides the two C functions used by the sys_fork() system call: find_empty_process() and copy_process().
mktime.c // Contains the time function mktime() used by the kernel, which computes the number of seconds from 00:00 on January 1, 1970 to the boot date, used as the boot time. It is called only once, in init/main.c.
panic.c // Contains the function panic(), which displays a kernel error message and halts.
printk.c // Contains the kernel's own message display function printk().
sched.c // Contains the basic scheduling functions (sleep_on, wakeup, schedule, etc.) and some simple system call functions. There are also several disk operation functions related to timing.
signal.c // Contains 4 system calls related to signal handling and the function do_signal(), which processes signals in the corresponding interrupt handler.
sys.c // Contains many system call functions, some of which are not yet implemented.
system_call.s // Implements the handling of the Linux system call interface (int 0x80). The actual work is done in the C handler function for each system call, and these handlers are distributed throughout the Linux kernel code.
vsprintf.c // Implements the string formatting function that is now part of the standard library.
Block device driver subdirectory kernel/blk_dev
A user usually accesses a device through the file system, so a device driver implements the calling interface for the file system. Block devices move large amounts of data, so a buffer cache sits between user processes and block devices to make access to the data efficient. When data on a block device is accessed, the system first reads it into the buffer cache one block at a time and then provides it to the user. The blk_dev subdirectory contains 4 C files and 1 header file. The header file blk.h is kept together with the C files because it is used only by the block device programs. These files are roughly related as follows:
blk.h // Defines the block device structure and the block request structure for the 3 C programs.
hd.c // Mainly implements the low-level driver functions for reading and writing hard disk data blocks, chiefly the do_hd_request() function.
floppy.c // Mainly implements the driver functions for reading and writing floppy disk data blocks, chiefly the do_fd_request() function.
ll_rw_blk.c // Implements the low-level block device read/write function ll_rw_block(). All other programs in the kernel read and write block device data through this function.
You will see ll_rw_block() called in many places that access block device data, especially in the buffer cache file fs/buffer.c.
Character device driver subdirectory kernel/chr_dev
tty_io.c // Contains the tty character device read function tty_read() and write function tty_write(), which provide the upper-level access interface for the file system. Also contains the C function do_tty_interrupt(), which is called during serial interrupt handling when the interrupt type is "character read".
console.c // Mainly contains the console initialization routine and the console write function con_write(), which is called by the tty device. Also contains con_init(), which performs the initial setup of the display and keyboard interrupts.
rs_io.s // Assembly program that implements the interrupt handlers for the two serial interfaces. The handler dispatches on the 4 interrupt types read from the interrupt identification register (port 0x3fa or 0x2fa) and calls do_tty_interrupt() in the code that handles the "character read" type.
serial.c // Initializes the UART of the asynchronous serial communication chip and sets the interrupt vectors for the two communication ports. Also contains the rs_write() function, which the tty uses for serial output.
tty_ioctl.c // Implements the tty I/O control interface function tty_ioctl() and the functions that read and write the termio(s) terminal I/O structures. It is called from fs/ioctl.c, the program that implements the system call sys_ioctl().
keyboard.S // Mainly implements the keyboard interrupt handler keyboard_interrupt.
Coprocessor emulation and operation program subdirectory kernel/math
This subdirectory currently contains only one C program, math_emulate.c. Its math_emulate() function is the C function called by the int 7 interrupt handler. This interrupt is raised when the machine has no math coprocessor and the CPU executes a coprocessor instruction, so the interrupt can be used to emulate the coprocessor in software. The kernel version discussed in this book does not yet contain the coprocessor emulation code. The program only prints an error message and sends the coprocessor error signal SIGFPE to the user program.
7. Kernel library function directory lib
The kernel library functions are mainly intended to be called from user programs and form one of the interface function sets of the compilation system's standard library. There are 12 C files in total. Except for malloc.c, written by tytso, which is longer, the programs are very short, and some contain only one or two lines of code.
8. Memory management program directory mm
This directory contains 2 code files. They manage how programs use the main memory area and implement the mapping from a process's logical addresses to linear addresses and from linear addresses to physical addresses in main memory. Through the paging mechanism, they establish the correspondence between the virtual memory pages of a process and the physical pages of the main memory area.
The page.s file contains the handler for the memory page exception interrupt (int 14). It mainly handles page exceptions caused by missing pages and page protection faults caused by access to illegal addresses.
The memory.c program contains the memory initialization function mem_init() and the functions do_no_page() and do_wp_page(), which are called by the memory interrupt handler in page.s. When a new process is created and the process copy operation runs, the memory functions in this file are used to allocate and manage memory space.
9. Kernel build tools directory tools
The build.c program in this directory combines the linked object code produced in each Linux directory into a runnable kernel image file, Image. Its specific functions are described in detail later.
6. Relationship between the kernel and user programs
In a Linux system, the kernel provides two kinds of interface to applications. One is the system call interface, that is, the interrupt call int 0x80. The other is the kernel library functions, through which programs exchange information with the kernel. The kernel library functions are part of the basic C function library libc, and many system calls are implemented as part of that library. System calls are mainly used directly by system software or used to implement library functions. Ordinary user programs access kernel resources by calling functions in libraries such as libc. By calling these library functions, application code can perform many common tasks, such as opening and closing a file or device, carrying out scientific calculations, handling errors, and reading system information such as group and user IDs. System calls are the kernel's top-level interface to the outside world. In the kernel, each system call has a number (defined in the include/unistd.h header file) and is often implemented in the form of a macro.