Linux Kernel Structure
Author: Chen Lijun
The Linux kernel consists of five subsystems: process scheduling, memory management, virtual file system, network interface, and inter-process communication.
1. Process scheduling (SCHED) controls processes' access to the CPU. When the next process to run must be chosen, the scheduler selects the process that most deserves to run. A runnable process is one that is waiting only for the CPU; a process that is waiting for any other resource is not runnable. Linux uses a simple priority-based scheduling algorithm to select the new process.
2. Memory management (MM) allows multiple processes to share main memory safely. Linux memory management supports virtual memory, that is, the total size of a program's code, data, and stack may exceed the size of physical memory. The operating system keeps only the blocks currently in use in memory and leaves the remaining blocks on disk. When necessary, the operating system swaps program blocks between disk and memory. Memory management is logically divided into a hardware-independent part and a hardware-dependent part. The hardware-independent part provides process mapping and logical memory swapping; the hardware-dependent part provides a virtual interface to the memory management hardware.
3. The virtual file system (VFS) hides the details of the various hardware and provides a unified interface to all devices. VFS supports dozens of different file systems. The virtual file system can be divided into logical file systems and device drivers. Logical file systems are the file systems that Linux supports, such as ext2 and FAT. Device drivers are the driver modules written for each hardware controller.
4. The network interface (NET) provides access to various network standards and support for various network hardware. The network interface can be divided into network protocols and network drivers. The network protocol part implements each supported network transmission protocol. Network device drivers communicate with the hardware devices; each supported hardware device has a corresponding device driver.
5. Inter-process communication (IPC) supports the various mechanisms by which processes communicate with one another.
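The priority-based selection described in item 1 can be sketched in a few lines of C. This is only an illustration of the idea, not the code in kernel/sched.c: the names demo_state, demo_proc, and demo_pick_next are hypothetical, and the real scheduler weighs more factors than a single priority number.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified process states (not the kernel's definitions). */
enum demo_state { DEMO_RUNNABLE, DEMO_WAITING };

/* Hypothetical, simplified process descriptor. */
struct demo_proc {
    int pid;
    enum demo_state state;  /* waiting only for the CPU, or for something else */
    int priority;           /* larger value = more deserving of the CPU */
};

/*
 * Return the runnable process with the highest priority, or NULL if no
 * process is runnable. Ties go to the earliest entry in the array.
 */
const struct demo_proc *demo_pick_next(const struct demo_proc *procs, size_t n)
{
    const struct demo_proc *best = NULL;

    assert(procs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (procs[i].state != DEMO_RUNNABLE)
            continue;       /* blocked on another resource: not a candidate */
        if (best == NULL || procs[i].priority > best->priority)
            best = &procs[i];
    }
    return best;
}
```

A process with a high priority that is waiting for a disk or network operation is skipped, which matches the definition of a runnable process given above.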
All other subsystems depend on process scheduling, which sits at the center, because every subsystem needs to suspend or resume processes. Generally, a process is suspended while it waits for a hardware operation to complete, and it is resumed when the operation has completed. For example, when a process sends a message over the network, the network interface needs to suspend the sending process until the hardware has sent the message. After the message has been sent, the network interface returns a code to the process indicating whether the operation succeeded or failed. The other subsystems depend on process scheduling for similar reasons.
Dependencies between subsystems are as follows:
The relationship between process scheduling and memory management: these two subsystems depend on each other. In a multiprogramming environment, a process must be created for a program to run, and the first step in creating a process is to load the program and its data into memory.
The relationship between inter-process communication and memory management: the inter-process communication subsystem depends on memory management to support shared memory communication. This mechanism allows two processes to access a common memory area in addition to their own private address spaces.
The relationship between the virtual file system and the network interface: the virtual file system uses the network interface to support the network file system (NFS), and it also uses memory management to support ramdisk devices.
The relationship between memory management and the virtual file system: memory management uses the virtual file system to support swapping. The swap daemon (kswapd) is scheduled regularly by the scheduler, and this is the only reason memory management depends on process scheduling. When a memory mapping accessed by a process has been swapped out, memory management sends a request to the file system and suspends the currently running process.
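The shared memory dependency between inter-process communication and memory management can be observed from user space. The following is a minimal sketch, assuming a Linux system with System V shared memory enabled; the function name demo_share_int is hypothetical. It uses the standard shmget, shmat, shmdt, and shmctl calls: a child process writes into a shared segment and the parent reads the value back.

```c
#include <assert.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Create a System V shared memory segment, let a child process write
 * `value` (which must be non-negative) into it, and return what the
 * parent then reads from the same segment. Returns -1 on failure.
 */
int demo_share_int(int value)
{
    int result = -1;

    assert(value >= 0);
    int shmid = shmget(IPC_PRIVATE, sizeof(int), IPC_CREAT | 0600);
    if (shmid < 0)
        return -1;

    int *shared = shmat(shmid, NULL, 0);
    if (shared == (void *)-1) {
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }
    *shared = 0;

    pid_t pid = fork();
    if (pid == 0) {
        /* Child: the attached segment is inherited across fork(). */
        *shared = value;
        _exit(0);
    }
    if (pid > 0) {
        int status;
        /* Parent: wait for the child, then read through the shared segment. */
        if (waitpid(pid, &status, 0) == pid)
            result = *shared;
    }

    shmdt(shared);
    shmctl(shmid, IPC_RMID, NULL);  /* mark the segment for removal */
    return result;
}
```

Ordinary variables in the two processes remain private after fork(); only the attached segment is common to both, which is why the parent sees the value the child wrote.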
In addition to these dependencies, all subsystems in the kernel also depend on some common resources. These resources include the procedures used by all subsystems, for example the procedures for allocating and releasing memory, for printing warning or error messages, and the system debugging routines.
System Data Structure
Several data structures are used frequently throughout the Linux kernel implementation. They are:
task_struct
The Linux kernel uses a data structure, task_struct, to represent a process. Pointers to these structures form the task array (in Linux, the terms task and process are interchangeable); an array of pointers like this is also called a pointer vector. The size of the array is determined by NR_TASKS (512 by default), which is the maximum number of processes that can exist simultaneously in Linux. When a new process is created, Linux allocates a task_struct for it and stores the pointer in the task array. The scheduler maintains a current pointer that points to the currently running process.
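The task array and the current pointer can be pictured with the following sketch. It is a user-space illustration under simplifying assumptions, not kernel code: demo_task, demo_task_vec, demo_current, demo_task_create, and demo_task_switch are hypothetical names, and the descriptor holds only a pid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_NR_TASKS 512   /* mirrors the default quoted in the text */

/* Hypothetical, heavily reduced stand-in for task_struct. */
struct demo_task {
    int pid;
};

/* The task vector: one pointer slot per possible process. */
static struct demo_task *demo_task_vec[DEMO_NR_TASKS];

/* Points at the task that is currently running (NULL before any runs). */
static struct demo_task *demo_current;

/*
 * Allocate a descriptor for a new process and store its pointer in the
 * first free slot. Returns the slot index, or -1 if the vector is full
 * or memory is exhausted.
 */
int demo_task_create(int pid)
{
    for (int i = 0; i < DEMO_NR_TASKS; i++) {
        if (demo_task_vec[i] != NULL)
            continue;
        struct demo_task *t = malloc(sizeof(*t));
        if (t == NULL)
            return -1;
        t->pid = pid;
        demo_task_vec[i] = t;
        return i;
    }
    return -1;  /* all DEMO_NR_TASKS slots are in use */
}

/* Make the task in `slot` the current one, as a scheduler would on a switch. */
void demo_task_switch(int slot)
{
    assert(slot >= 0 && slot < DEMO_NR_TASKS && demo_task_vec[slot] != NULL);
    demo_current = demo_task_vec[slot];
}
```

Because the vector has a fixed size, creating a process fails once every slot is occupied, which is how the array size bounds the number of processes.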
mm_struct
The virtual memory of each process is represented by an mm_struct structure. It contains information about the image currently being executed and a set of pointers to vm_area_struct structures, each of which describes one region of the process's virtual memory.
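The relationship between the two structures can be sketched as a list of regions hanging off a per-process descriptor. The types demo_mm and demo_vma and the function demo_find_region below are hypothetical, reduced stand-ins; the kernel's own structures carry many more fields, and the sketch assumes the regions are kept sorted by address.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for vm_area_struct: one region [start, end). */
struct demo_vma {
    unsigned long start;        /* first address in the region */
    unsigned long end;          /* first address past the region */
    struct demo_vma *next;      /* next region, sorted by address */
};

/* Hypothetical, reduced stand-in for mm_struct. */
struct demo_mm {
    struct demo_vma *regions;   /* head of the region list */
};

/* Return the region containing addr, or NULL if addr is unmapped. */
const struct demo_vma *demo_find_region(const struct demo_mm *mm,
                                        unsigned long addr)
{
    assert(mm != NULL);
    for (const struct demo_vma *v = mm->regions; v != NULL; v = v->next) {
        if (addr < v->start)
            break;              /* list is sorted: addr falls in a hole */
        if (addr < v->end)
            return v;
    }
    return NULL;
}
```

A lookup of this kind is what lets memory management decide whether an address a process touches belongs to one of its regions at all.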
inode
Files and directories in the virtual file system (VFS) are each represented by a corresponding index node (inode). The content of each VFS inode is supplied by file-system-specific routines. VFS inodes exist only in kernel memory and are kept in the VFS inode cache. If two processes open the same file, they can share the inode data structure: the data blocks in both processes point to the same inode.
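The sharing of one in-memory inode by several openers can be sketched as a small cache with a use count. This is an illustration only: demo_inode, demo_inode_cache, demo_iget, and demo_iput are hypothetical names chosen for the sketch, and the fixed-size array stands in for the kernel's real inode cache.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_CACHE_SIZE 8   /* arbitrary size chosen for the sketch */

/* Hypothetical, reduced in-memory inode. */
struct demo_inode {
    unsigned long ino;      /* inode number; 0 marks a free cache slot */
    int count;              /* how many opens currently refer to it */
};

static struct demo_inode demo_inode_cache[DEMO_CACHE_SIZE];

/*
 * Look up inode number `ino` in the cache. If it is already there, the
 * existing entry is shared and its use count bumped; otherwise a free
 * slot is filled in. Returns NULL when the cache is full.
 */
struct demo_inode *demo_iget(unsigned long ino)
{
    struct demo_inode *free_slot = NULL;

    assert(ino != 0);
    for (int i = 0; i < DEMO_CACHE_SIZE; i++) {
        struct demo_inode *in = &demo_inode_cache[i];
        if (in->ino == ino) {
            in->count++;
            return in;
        }
        if (in->ino == 0 && free_slot == NULL)
            free_slot = in;
    }
    if (free_slot != NULL) {
        free_slot->ino = ino;
        free_slot->count = 1;
    }
    return free_slot;
}

/* Drop one reference; the slot becomes free when nobody uses it. */
void demo_iput(struct demo_inode *in)
{
    assert(in != NULL && in->count > 0);
    if (--in->count == 0)
        in->ino = 0;
}
```

Two lookups of the same inode number return the same pointer, so both openers see a single shared structure rather than two copies.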
Linux Structure
The concrete structure is the structure that the system actually implements.
The concrete structure of Linux is similar to the abstract structure. The correspondence arises because the abstract structure is derived from the concrete one. Our division does not strictly follow the directory structure of the source code, nor does it exactly match the grouping into subsystems, but it is very close to the directory structure of the source code.
Although the abstract structure discussed earlier suggests that there are only a few dependencies between subsystems, the five subsystems in the concrete structure are highly interdependent. Many dependencies in the concrete structure do not appear in the abstract structure.
Linux kernel source code
At present, the newer and more stable kernel versions are 2.0.x and 2.2.x. Because the versions differ slightly, a new driver that is to support both 2.0.x and 2.2.x must use conditional compilation based on the kernel version. This is done with the macro LINUX_VERSION_CODE. If the kernel version is a.b.c, the value of the macro is 2^16 * a + 2^8 * b + c. To obtain the value for a specified kernel version, we can use the KERNEL_VERSION macro, or we can define it ourselves.
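As a worked example of the formula, version 2.2.10 encodes as 2 * 65536 + 2 * 256 + 10 = 131594. The sketch below shows the conditional-compilation pattern in a form that builds on its own. The DEMO_ names are hypothetical stand-ins: in a real driver, LINUX_VERSION_CODE and KERNEL_VERSION come from <linux/version.h>, and the version is hard-coded here only for illustration.

```c
#include <assert.h>

/*
 * Stand-in for the KERNEL_VERSION macro; the kernel's own definition
 * lives in <linux/version.h>. Encodes a.b.c as 2^16*a + 2^8*b + c.
 */
#define DEMO_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/*
 * In a real driver LINUX_VERSION_CODE comes from <linux/version.h>;
 * it is hard-coded here (as 2.2.10) only so the sketch builds on its own.
 */
#define DEMO_LINUX_VERSION_CODE DEMO_KERNEL_VERSION(2, 2, 10)

/* Compile-time check of the worked arithmetic: 131072 + 512 + 10. */
static_assert(DEMO_KERNEL_VERSION(2, 2, 10) == 131594,
              "2.2.10 should encode as 131594");

/* Compile one of two code paths, depending on the kernel series. */
const char *demo_driver_flavor(void)
{
#if DEMO_LINUX_VERSION_CODE >= DEMO_KERNEL_VERSION(2, 2, 0)
    return "2.2.x code path";
#else
    return "2.0.x code path";
#endif
}
```

Because the comparison happens in the preprocessor, only one of the two branches is compiled into the driver, so code that uses interfaces missing from the other kernel series never reaches the compiler.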
Kernel modifications are released as patch files. The patch utility applies a series of modifications to the kernel source files. For example, if you have the 2.2.9 source code but want to move to 2.2.10, you can obtain the 2.2.10 patch file and apply it to the 2.2.9 source files. For example:
$ cd /usr/src/linux
$ patch -p1 < patch-2.2.10
Linux kernel source code structure
The Linux kernel source code is located in the /usr/src/linux directory.
The /include subdirectory contains most of the include (header) files needed to build the kernel code. The other modules use these files when the kernel is rebuilt.
The /init subdirectory contains the kernel initialization code, which is the starting point of the kernel's work.
The /arch subdirectory contains all of the architecture-specific kernel code, for example for i386 and Alpha.
The /drivers subdirectory contains all of the device drivers in the kernel, such as block devices and SCSI devices.
The /fs subdirectory contains the code of all the file systems, for example ext2 or vfat.
The /net subdirectory contains the kernel's networking code.
The /mm subdirectory contains all of the memory management code.
The /ipc subdirectory contains the inter-process communication code.
The /kernel subdirectory contains the main kernel code.
Where can I start reading the source code?
A source code navigator is available on the Internet and makes reading the source code much easier. The site is lxr.linux.no/source.
The following pointers indicate where to start reading the source code.
System startup and initialization:
On Intel-based systems, the kernel starts running once loadlin.exe or LILO has loaded it into memory and passed control to it. For this part, see arch/i386/kernel/head.S. head.S performs some architecture-specific setup and then jumps to the main() routine in init/main.c.
Memory Management:
The memory management code is mainly in /mm, but the architecture-specific code is in arch/*/mm. The page fault handling code is in mm/memory.c, and the memory mapping and page cache code is in mm/filemap.c. The buffer cache is implemented in mm/buffer.c, and the swap cache is implemented in mm/swap_state.c and mm/swapfile.c.
Kernel:
In the kernel, the architecture-specific code is in arch/*/kernel, and the scheduler is in kernel/sched.c. The fork code is in kernel/fork.c. The bottom-half handling code is in include/linux/interrupt.h. The task_struct data structure is in include/linux/sched.h.
PCI:
The PCI pseudo driver is in drivers/pci/pci.c, and its definitions are in include/linux/pci.h. Each architecture has some specific PCI BIOS code; the Alpha code is in arch/alpha/kernel/bios32.c.
Inter-process communication:
The permissions of all System V IPC objects are held in the ipc_perm data structure, which can be found in include/linux/ipc.h. System V messages are implemented in ipc/msg.c, shared memory in ipc/shm.c, semaphores in ipc/sem.c, and pipes in ipc/pipe.c.
Interrupt handling:
The kernel's interrupt handling code is almost entirely specific to each microprocessor. The Intel interrupt handling code is in arch/i386/kernel/irq.c, and its definitions are in include/asm-i386/irq.h.