[Repost] Understanding the Linux conceptual architecture


Ext.: http://mp.weixin.qq.com/s?__biz=MzA3NDcyMTQyNQ==&mid=400583492&idx=1&sn= 3b18c463dcc45103b76a3419ceabe84c&scene=2&srcid=1213z5cbo8w4jcmtsfi74uib&from=timeline& Isappinstalled=0#wechat_redirect

Understanding the Linux conceptual architecture
2015-12-12 | Translator: Johnnie | Qunar Technology Salon

Summary

The Linux kernel owes its success to two things:

    1. A flexible architecture that makes it easy for a large number of volunteer developers to join the development process;

    2. Good extensibility in every subsystem, especially in those that most need improvement.

Together, these two properties allow the Linux kernel to keep evolving and improving.

I. The position of the Linux kernel in the computer system

Fig 1-Hierarchical structure of computer systems

The principle of a hierarchical structure:

Dependencies between subsystems run only from the top down: a layer drawn higher in the diagram depends on the layers below it, but a subsystem nearer the bottom does not depend on the layers above it.
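The top-down rule can be checked mechanically. The following is a minimal sketch, not something taken from the kernel: the layer names and the dependency tables are illustrative assumptions, since the figure itself is not reproduced here.

```python
# Toy model of a layered system. Layer names are illustrative assumptions.
LAYERS = ["user applications", "OS services", "Linux kernel", "hardware controllers"]  # top to bottom


def violations(depends_on):
    """Return (user, used) pairs in which a layer depends on a layer above it."""
    rank = {name: i for i, name in enumerate(LAYERS)}  # 0 is the topmost layer
    return [
        (user, used)
        for user, deps in depends_on.items()
        for used in deps
        if rank[used] < rank[user]
    ]


well_layered = {
    "user applications": ["OS services"],
    "OS services": ["Linux kernel"],
    "Linux kernel": ["hardware controllers"],
    "hardware controllers": [],
}
print(violations(well_layered))  # []

# A bottom layer that reaches upward breaks the rule.
broken = {**well_layered, "hardware controllers": ["user applications"]}
print(violations(broken))  # [('hardware controllers', 'user applications')]
```

An empty result means every dependency points downward, which is the property the hierarchical structure requires.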

II. The role of the kernel
    1. Virtualization (abstraction): the kernel abstracts the computer hardware into a virtual machine for user processes. A process can run without knowing how the hardware works; it only needs to call the virtual interface provided by the Linux kernel.

    2. Multitasking: many tasks use the computer's hardware resources concurrently. The kernel arbitrates access to those resources and gives each process the illusion that it has the system to itself.

PS: A process context switch replaces the program status word, the contents of the page table base register, the current pointer to the running task_struct instance, and the program counter. As a consequence, the set of open files (reachable through task_struct) and the memory space the process executes in (reachable through the mm member of task_struct) change as well.

III. The overall architecture of the Linux kernel

The overall architecture of the Linux kernel

The central subsystem is the process scheduler (sched). Every other subsystem depends on it, because every other subsystem needs to block and resume processes. When a process has to wait for a hardware operation to complete, the subsystem concerned blocks the process; when the operation completes, the subsystem resumes it. Both the blocking and the resuming are carried out by the process scheduler.
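The block-and-resume service can be pictured with a toy model. This is only a sketch under simplifying assumptions: the class name `ToyScheduler`, the event name, and the task names are all hypothetical, and a real scheduler tracks far more state.

```python
from collections import deque


class ToyScheduler:
    """Toy scheduler: one run queue plus a wait queue per event."""

    def __init__(self):
        self.run_queue = deque()
        self.waiting = {}  # event name -> deque of blocked tasks

    def add(self, task):
        self.run_queue.append(task)

    def block(self, task, event):
        """Called by a subsystem when `task` must wait for `event`."""
        self.run_queue.remove(task)
        self.waiting.setdefault(event, deque()).append(task)

    def wake(self, event):
        """Called by a subsystem when the hardware operation has completed."""
        for task in self.waiting.pop(event, ()):
            self.run_queue.append(task)


sched = ToyScheduler()
sched.add("editor")
sched.add("backup")

sched.block("backup", "disk-read-done")  # e.g. a file system waiting for a disk read
print(list(sched.run_queue))             # ['editor']

sched.wake("disk-read-done")             # the hardware operation has finished
print(list(sched.run_queue))             # ['editor', 'backup']
```

The other subsystems only call `block` and `wake`; they never manipulate the queues themselves, which is why they all depend on the scheduler.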

Each dependency arrow has a reason:

    • The process scheduler depends on the memory manager: when a process resumes execution, the scheduler relies on the memory manager to allocate the memory the process runs in.

    • The IPC subsystem depends on the memory manager: shared memory is one method of interprocess communication, in which two running processes exchange information through the same block of shared memory.

    • The VFS depends on the network interface: this supports NFS, the network file system.

    • The VFS depends on the memory manager: this supports RAMDisk devices.

    • The memory manager depends on the VFS: this supports swapping, in which a process that is temporarily not running is swapped out to the swap partition on disk and enters the suspended state.
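The arrows listed above, together with the rule that every other subsystem depends on the process scheduler, can be written down as a small graph. The sketch below is only an illustration; the dictionary keys are shorthand names chosen here, not kernel identifiers.

```python
# "A depends on B" edges, as described in the text above.
DEPENDS_ON = {
    "process scheduler": {"memory manager"},
    "memory manager": {"process scheduler", "virtual file system"},
    "virtual file system": {"process scheduler", "memory manager", "network interface"},
    "network interface": {"process scheduler"},
    "ipc": {"process scheduler", "memory manager"},
}


def all_depend_on(graph, target):
    """True if every subsystem other than `target` depends on `target`."""
    return all(target in deps for name, deps in graph.items() if name != target)


def mutual_pairs(graph):
    """Pairs of subsystems that depend on each other."""
    return sorted(
        {tuple(sorted((a, b))) for a, deps in graph.items() for b in deps if a in graph.get(b, ())}
    )


print(all_depend_on(DEPENDS_ON, "process scheduler"))  # True
print(mutual_pairs(DEPENDS_ON))
# [('memory manager', 'process scheduler'), ('memory manager', 'virtual file system')]
```

The mutual pairs show that, unlike the layers of the whole computer system, the kernel subsystems are not strictly layered among themselves.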

IV. A highly modular design that favors division of labor
    1. Only a handful of programmers need to work across multiple modules, and they do so only where one subsystem depends on another;

    2. Four kinds of modules are the most extensible: hardware device drivers, logical file system modules, network device drivers, and network protocol modules.

V. Data structures in the system
    1. Task list

      The process scheduler maintains a task_struct data structure for each process. All of these structures are linked into a list, the task list, and the scheduler also maintains a current pointer to the process that currently holds the CPU.

    2. Memory map

      The memory manager stores the virtual-to-physical address mapping of each process, together with information on how to swap out particular pages and how to handle page faults. This information is kept in the mm_struct data structure. Each process has its own mm_struct, and the task_struct of the process holds a pointer to it. The mm_struct contains a pointer, pgd, to the page directory of the process (that is, the base address of the page directory). When the process is scheduled, this pointer is converted to a physical address and loaded into the control register CR3 (the page directory base register on the x86 architecture).

    3. Inodes

      The VFS represents the image of a file on disk with an inode, which records the physical properties of the file. Each process has a files_struct structure that represents the files the process has open, and task_struct holds a files pointer to it. Inodes make file sharing possible. Files can be shared in two ways: (1) two processes use the same system open-file entry, which points to one inode; this happens between parent and child processes; (2) different system open-file entries point to the same inode, for example through hard links, or when two unrelated processes open the same file.

    4. Data connection

      The root of all kernel data structures is the task list maintained by the process scheduler. The task_struct of each process holds a pointer to its memory mapping information, a files pointer to its open files (the per-process open file table), and a pointer to the network sockets the process has opened.
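How these structures hang together can be sketched with a few toy classes. This is a simplified model, not the kernel's definitions: the class names mirror task_struct, mm_struct and the inode, but the fields, the `fork` helper, and the pgd values are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Inode:
    number: int


@dataclass
class OpenFile:  # one entry in the system-wide open file table
    inode: Inode
    offset: int = 0


@dataclass
class MmStruct:
    pgd: int  # stand-in for the page directory base address


@dataclass
class TaskStruct:
    pid: int
    mm: MmStruct
    files: list = field(default_factory=list)  # stand-in for files_struct
    next: Optional["TaskStruct"] = None         # link in the task list


class TaskList:
    def __init__(self):
        self.head = None
        self.current = None  # the task that currently holds the CPU

    def add(self, task):
        task.next, self.head = self.head, task

    def __iter__(self):
        task = self.head
        while task is not None:
            yield task
            task = task.next


def fork(parent, pid, pgd):
    """Child gets its own mm but shares the parent's OpenFile entries (case 1)."""
    return TaskStruct(pid, MmStruct(pgd), files=list(parent.files))


inode = Inode(42)
parent = TaskStruct(1, MmStruct(pgd=0x1000), files=[OpenFile(inode)])
child = fork(parent, pid=2, pgd=0x2000)
# Case 2: an unrelated process with its own OpenFile entry for the same inode.
other = TaskStruct(3, MmStruct(pgd=0x9000), files=[OpenFile(inode)])

tasks = TaskList()
for t in (parent, child, other):
    tasks.add(t)
tasks.current = child

parent.files[0].offset = 100  # seen by the child, not by the unrelated process
print(child.files[0].offset, other.files[0].offset)  # 100 0
print([t.pid for t in tasks])                         # [3, 2, 1]
```

Starting from the task list, every other structure (memory map, open files, inodes) is reachable by following pointers, which is what the text means by the task list being the root.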

VI. Subsystem Architecture

1. Process Scheduler architecture

(1) Goals

The process scheduler is the most important subsystem in the Linux kernel. It controls access to the CPU, not only for user processes but also for the other kernel subsystems.

(2) Modules

Process Scheduler

    • Scheduling policy module

      Decides which process gets access to the CPU. The policy should let all processes share the CPU as fairly as possible.

    • Architecture-specific module

      Provides a uniform abstract interface that hides the hardware details of a particular chip. This module interacts with the CPU to suspend and resume processes. Its operations include determining which registers and state information must be saved for each process, and executing the assembly code that performs the suspend or resume.

    • Architecture-independent module

      Communicates with the scheduling policy module to determine the next process to run, then calls the architecture-specific code to resume that process. The module also calls the memory manager's interface to make sure that the memory mapping information of the resumed process is set up correctly.

    • System call interface module

      Lets user processes access only those resources that the Linux kernel explicitly exposes to them. A well-defined set of interfaces that rarely changes (the POSIX standard) decouples user applications from the Linux kernel, so that user processes are not affected by kernel changes.
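The split between policy, architecture-independent code and architecture-specific code can be sketched as follows. This is a toy illustration only: the round-robin policy, the `ToyArch` class, and the task names are assumptions chosen for the example and do not describe the scheduler Linux actually uses.

```python
def round_robin_policy(runnable, last):
    """Scheduling policy: pick the task after `last`, wrapping around."""
    if last not in runnable:
        return runnable[0]
    return runnable[(runnable.index(last) + 1) % len(runnable)]


class ToyArch:
    """Architecture-specific part. Here it only records each switch."""

    def __init__(self):
        self.log = []

    def switch_to(self, prev, nxt):
        # Real code would save and restore registers at this point.
        self.log.append((prev, nxt))


class CoreScheduler:
    """Architecture-independent part: asks the policy, then calls the arch code."""

    def __init__(self, policy, arch):
        self.policy = policy
        self.arch = arch
        self.current = None

    def schedule(self, runnable):
        nxt = self.policy(runnable, self.current)
        self.arch.switch_to(self.current, nxt)
        self.current = nxt
        return nxt


arch = ToyArch()
core = CoreScheduler(round_robin_policy, arch)
order = [core.schedule(["A", "B", "C"]) for _ in range(4)]
print(order)  # ['A', 'B', 'C', 'A']
```

Swapping in a different policy function or a different `ToyArch` leaves `CoreScheduler` untouched, which is the point of keeping the modules separate.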

(3) Data representation

The scheduler maintains one data structure, the task list, whose elements are the task_struct instances of the active processes. Each task_struct holds not only the information needed to suspend and resume the process but also additional accounting and state information. The task list is publicly accessible throughout the kernel layer.

(4) Dependencies, data flow, control flow

As mentioned earlier, the scheduler calls on the memory manager to set up the appropriate physical memory mapping for a process that is about to resume; for this reason the process scheduler subsystem depends on the memory management subsystem. When other kernel subsystems have to wait for a hardware request to complete, they rely on the process scheduler to block and resume processes. These dependencies take the form of function calls and of access to the shared task list. All kernel subsystems read or write the data structure that represents the currently running process, which creates a bidirectional data flow throughout the system.

Besides the data flow and control flow inside the kernel layer, the OS services layer gives user processes an interface for registering timers. This creates a control flow from the scheduler to user processes. The usual case of waking a sleeping process is not counted as control flow in the normal sense, because the user process cannot detect when it is woken. Finally, the scheduler interacts with the CPU to suspend and resume processes, which creates both a data flow and a control flow between them: the CPU is responsible for interrupting the currently running process and letting the kernel schedule another one.

2. Memory Manager architecture

(1) Goals

The memory management module controls how processes access physical memory. The mapping between a process's virtual memory and the machine's physical memory is managed with the help of the hardware memory management unit (MMU). Each process has its own independent virtual address space, so two processes may use the same virtual address while actually running in different areas of physical memory. The MMU provides memory protection, so that the physical memory of two processes does not interfere. The memory management module also supports swapping: memory pages that are temporarily unused are moved to the swap partition on disk, which allows the virtual address space of a process to be larger than physical memory. The size of the virtual address space is determined by the machine word length.
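The idea that two processes can use the same virtual address while touching different physical memory can be shown with a toy single-level page table. This is a sketch under stated assumptions: the 4096-byte page size, the frame numbers, and the flat dictionary are illustrative, and real hardware uses multi-level tables.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


def translate(page_table, vaddr):
    """Map a virtual address to a physical one through a per-process page table."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn not in page_table:
        # The page is not resident: the kernel would handle a page fault here.
        raise LookupError(f"page fault at {vaddr:#x}")
    return page_table[vpn] * PAGE_SIZE + offset


page_table_a = {0x10: 0x2A}  # process A: virtual page 0x10 -> physical frame 0x2A
page_table_b = {0x10: 0x7F}  # process B: same virtual page, different frame

vaddr = 0x10 * PAGE_SIZE + 0x123
print(hex(translate(page_table_a, vaddr)))  # 0x2a123
print(hex(translate(page_table_b, vaddr)))  # 0x7f123

# With a 32-bit word, the virtual address space holds 2**32 bytes (4 GiB).
print(2 ** 32)  # 4294967296
```

A lookup that misses the table stands in for a page fault, the event that lets the memory manager bring a swapped-out page back in.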

(2) Modules

Memory Management Subsystem

    • The architecture-specific module provides a virtual interface for accessing physical memory;

    • The architecture-independent module performs the address mapping and virtual memory swapping for each process. When a page fault occurs, this module decides which memory pages should be swapped out. Because the page replacement algorithm rarely needs to change, there is no separate policy module.

    • The system call interface gives user processes a strictly limited set of entry points (malloc and free; mmap and munmap). Through this module, processes allocate and free memory and perform memory-mapped file operations.
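A memory-mapped file operation can be tried from user space with Python's standard mmap module, which wraps the mmap system call. The sketch below is a small illustration; the temporary file and its contents are arbitrary choices made for the example.

```python
import mmap
import os
import tempfile

# Create a small file to map.
fd, path = tempfile.mkstemp()
os.write(fd, b"hello kernel")

# Map the whole file into this process's address space.
with mmap.mmap(fd, 0) as view:
    view[0:5] = b"HELLO"  # an ordinary memory write ...
    view.flush()          # ... pushed back to the underlying file
# Leaving the with-block closes the mapping.

os.lseek(fd, 0, os.SEEK_SET)
data = os.read(fd, 100)
os.close(fd)
os.remove(path)
print(data)  # b'HELLO kernel'
```

The write goes through memory rather than through a write call, yet it ends up in the file, which is the cooperation between the memory manager and the file system described in this section.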

(3) Data representation

The memory manager stores the mapping from each process's virtual memory to physical memory. This mapping information is kept in an mm_struct instance, and a pointer to that instance is stored in the task_struct of each process. Besides the mapping itself, the structure should also record how the memory manager fetches and stores pages. For example, executable code can use the executable image as its backing store, whereas dynamically allocated data must be backed by the system paging (swap) area.

Finally, the memory management module should also store access permission and accounting information to ensure the security of the system.

(4) Dependencies, data flow, and control flow

The memory manager controls physical memory and receives a notification from the hardware when a page fault occurs, so data and control flow in both directions between the memory management module and the memory management hardware. The memory manager also relies on the file system to support swapping and memory-mapped I/O: it has to make procedure calls to the file system to store memory pages on disk and to fetch them back. Because file system requests are slow, the memory manager puts the process to sleep while it waits for a page to be swapped in, which requires it to call the process scheduler. Because the memory map of each process is stored in the process scheduler's data structures, data and control also flow in both directions between the memory manager and the process scheduler. User processes can set up new address-space mappings and can be notified of page faults, which requires a control flow from the memory manager to user processes. In general there is no data flow from user processes to the memory manager, but a user process can obtain some information from the memory manager through certain system calls.

3. Virtual File System architecture

(1) Goals

The virtual file system provides a uniform interface for accessing data stored on hardware devices, and is compatible with different file systems (ext2, ext4, NTFS, and so on). Almost all hardware devices in a computer are presented through a common device driver interface. Logical file systems make it easier to interoperate with the standards of other operating systems and let developers implement file systems with different policies. The virtual file system goes further and lets the system administrator mount any logical file system on any device. It encapsulates the details of both the physical devices and the logical file systems, so that user processes access files through one uniform interface.

Besides the traditional file system goals, the VFS is also responsible for loading new executables. This task is carried out by the logical file system module, which is what allows Linux to support several executable formats.

(2) Modules

Virtual File System module

    • Device driver module
    • Device-independent interface module

      Provides the same view of all devices.

    • Logical file system module

      One for each supported file system.

    • System-independent interface

      Provides a view that is independent of both the hardware and the logical file system, and presents all resources through either a block device node or a character device node.

    • System call interface

      Gives user processes uniform, controlled access to the file system. The virtual file system hides all file-system-specific features from user processes.

(3) Data representation

Every file is represented by an inode. Each inode records where the file is located on the hardware device. The inode also holds pointers to the functions of the logical file system module and of the device driver that perform the actual read and write operations. Because function pointers are stored in this way (the same idea as virtual functions in object-oriented programming), specific logical file systems and device drivers can register themselves with the kernel, and the kernel does not have to depend on the features of any particular module.
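The function-pointer idea can be sketched with a table of callables attached to each inode. This is a toy model: the names `register_filesystem`, `vfs_read`, `toyfs` and `ramfs`, and the strings they return, are hypothetical and are not the kernel's API.

```python
class Inode:
    """Toy inode: location data plus a table of operations ("function pointers")."""

    def __init__(self, location, ops):
        self.location = location
        self.ops = ops  # dict: operation name -> callable


REGISTERED = {}  # file system name -> operations table


def register_filesystem(name, ops):
    """A file system module registers itself; the core names no module directly."""
    REGISTERED[name] = ops


def vfs_read(inode):
    """Generic read: dispatch through whatever operations the inode points at."""
    return inode.ops["read"](inode)


# Two hypothetical file system modules register their own read routines.
register_filesystem("toyfs", {"read": lambda ino: f"toyfs data at block {ino.location}"})
register_filesystem("ramfs", {"read": lambda ino: f"ramfs data at page {ino.location}"})

a = Inode(7, REGISTERED["toyfs"])
b = Inode(3, REGISTERED["ramfs"])
print(vfs_read(a))  # toyfs data at block 7
print(vfs_read(b))  # ramfs data at page 3
```

`vfs_read` never mentions a concrete file system, so a new one can be added by registration alone.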

(4) Dependencies, data flow, and control flow

One special device driver is RAMDisk, which sets aside an area of main memory and uses it as if it were a persistent storage device. This device driver does its work through the memory management module, so the VFS depends on the memory management module, with data flow and control flow between them. (Translator's note: the dependency in the diagram is drawn the wrong way round; it should show the VFS depending on the memory management module.)

Logical file systems include the network file system, which accesses files on another machine as if they were local. To do so, the logical file system works through the network subsystem, which introduces a dependency of the VFS on the network subsystem, along with control flow and data flow between them.

As mentioned earlier, the memory manager uses the VFS for swapping and memory-mapped I/O. In addition, when the VFS waits for a hardware request to complete, it uses the process scheduler to block the process, and when the request completes, it wakes the process through the process scheduler. Finally, the system call interface lets user processes call in to access data. Unlike the previous subsystems, the VFS provides no mechanism for users to register for implicit invocation, so there is no control flow from the VFS to user processes.

4. Network Interface architecture

(1) Goals

The network subsystem lets a Linux system connect to other systems over a network. It supports many hardware devices and many network protocols. The network subsystem hides the implementation details of both the hardware and the protocols behind an easy-to-use interface, so that user processes and the other subsystems do not need to know the details of any hardware device or protocol.

(2) Modules

Network protocol layer module diagram

    • Network device driver module
    • Device-independent interface module

      Provides a consistent access interface to all hardware devices, so that higher-level subsystems do not need to know the details of the hardware.

    • Network protocol modules

      Implement the individual network transport protocols, for example TCP, UDP, IP, HTTP, ARP, and so on.

    • Protocol-independent interface module

      Provides a consistent interface that is independent of specific protocols and specific hardware devices. This lets the remaining kernel subsystems access the network without depending on a particular protocol or device.

    • System call interface module

      Provides the network programming API that user processes can access.

(3) Data representation

Each network object is represented as a socket. A socket is associated with a process in the same way as an inode. Because two task_struct instances can point to the same socket, a socket can be shared by several processes.

(4) Data flow, control flow, and dependencies

When the network subsystem waits for a hardware request to complete, it has to block and wake processes through the process scheduler, which creates control flow and data flow between the network subsystem and the process scheduling subsystem. In addition, the virtual file system implements the network file system (NFS) through the network subsystem, which creates data flow and control flow between the VFS and the network subsystem.

VII. Conclusion

1. The Linux kernel is one layer of the overall Linux system. Conceptually, the kernel consists of five main subsystems: the process scheduler, the memory manager, the virtual file system, the network interface, and interprocess communication. These subsystems exchange data through function calls and shared data structures.

2. The architecture of the Linux kernel has contributed to its success: it lets a large number of volunteer developers work together, and it makes each specific module easy to extend.

    • Extensibility, first: the Linux architecture makes the subsystems extensible through data abstraction. Each specific hardware device driver is implemented as a separate module that supports the uniform interface provided by the kernel. As a result, an individual developer can add a new device driver to the Linux kernel with minimal interaction with other kernel developers.

    • Extensibility, second: the Linux kernel supports many different architectures. In each subsystem, the architecture-specific code is split out into a separate module. When a manufacturer releases its own chip, its kernel team only needs to re-implement the machine-specific parts of the kernel in order to port the kernel to the new chip.

Reference articles:

http://oss.org.cn/ossdocs/linux/kernel/a1/index.html

http://www.cs.cmu.edu/afs/cs/project/able/www/paper_abstracts/intro_softarch.html

http://www.fceia.unr.edu.ar/ingsoft/monroe00.pdf

Kernel source: http://lxr.oss.org.cn/

Source: Jianshu. Original article: http://oss.org.cn/ossdocs/linux/kernel/a1/index.html by Ivan Bowman

Translated article: http://www.jianshu.com/p/c5ae8f061cfe Translator: Johnnie
