Understanding the Linux Kernel, Chapter 1: Introduction


Unix File System Overview

Every process in Unix has a current working directory.

To identify a specific file, a process uses a pathname. If the first character of the pathname is a slash, it is an absolute path and its starting point is the root directory; if the first component is a directory or file name, it is a relative path and its starting point is the process's current working directory.

Restrictions on hard links

1) Users are not allowed to create hard links to directories, because doing so could turn the tree structure of the directory hierarchy into a cyclic graph.

2) Hard links can be created only between files in the same file system. This is a significant limitation, because a modern UNIX system may contain several file systems located on different disks and/or partitions, and users cannot be expected to know the physical boundaries between them.

To overcome these limitations, the soft link (symbolic link) was introduced. A symbolic link is a short file that contains an arbitrary pathname of another file; it can point to any file or directory located on any file system, or even to a nonexistent file.

File types

Unix files can be one of the following types:

1) Regular files

2) Directories

3) Symbolic links

4) Block-oriented device files

5) Character-oriented device files

6) Pipes and named pipes (FIFOs)

7) Sockets

File descriptors and Index nodes (inode)

UNIX makes a clear distinction between the contents of a file (its data) and the information that describes the file (its metadata).

Except for device files and files of special file systems, each file consists of a sequence of bytes; the file contents contain no control information such as the file length or an end-of-file (EOF) delimiter.

The information describing a file is stored in a data structure called an index node (inode). Each file has its own inode, and the file system uses the inode to identify the file.

Inode implementations vary widely across UNIX systems, but each must provide at least the following attributes, specified by the POSIX standard:

1) file type

2) Number of hard links associated with the file

3) file length in bytes

4) Device identifier

5) Index node number (inode number) that identifies the file in the file system

6) UID of the owner of the file

7) User group ID of the file

8) Three timestamps: inode status change time, last access time, last modified time

9) Access rights and file mode

Access rights

Three types of users:

File owner

Users in the same group as the owner, excluding the owner

All other users

Three kinds of permissions: Read, Write, execute

Combining the three user classes with the three permissions yields nine access-right bits.

File mode

That is, three additional flag bits:

SUID: when a process executes a file, it normally keeps the UID of the user running it; if this flag is set, the process acquires the UID of the file's owner (for example, sudo; in ls output the owner's x is shown as s).

SGID: if this flag is set, the executing process acquires the group ID of the file.

Sticky: an executable file with the sticky flag set was historically kept in memory after the program finished; on regular files the flag is now ignored. If it is set on a directory, users cannot delete files in that directory that they do not own, even if they have write access to the directory (for example /tmp; in ls output the others' x is shown as t).

Unix Kernel Overview

Process/kernel mode

The kernel itself is not a process but a process manager.

The CPU can run either in user mode or in kernel mode.

When a program runs in user mode, it cannot directly access kernel data structures or kernel code; it must enter the kernel through system calls.

A process is a dynamic entity with a limited lifetime within the system. The tasks of creating, destroying, and synchronizing processes are delegated to a group of routines in the kernel.

In addition to user processes, UNIX systems include a few privileged processes called kernel threads, which have the following characteristics:

1) run in kernel state in kernel address space

2) no direct interaction with users, so no terminal device is required

3) typically created at system startup and then active until the system shuts down

What the Unix kernel does:

1) System call

2) Exception handling

3) Interrupt Handling

4) Kernel thread execution

Process implementation

Each process is represented by a process descriptor, which contains information about the current state of the process.

When the kernel suspends the execution of a process, it saves the contents of several CPU registers in the process descriptor, including:

1) Program counter (PC) and stack pointer (SP) registers

2) General-purpose registers

3) Floating-point registers

4) Processor control registers containing CPU status information (the Processor Status Word, PSW)

5) Memory management registers used to keep track of the RAM accessed by the process

When the kernel decides to resume executing a process, it loads the CPU registers from the appropriate fields of the process descriptor.

When a process is not executing on the CPU, it is waiting for some event. The Unix kernel distinguishes many wait states, typically implemented by queues of process descriptors; each (possibly empty) queue corresponds to a set of processes waiting for a specific event.

Reentrant kernels

All UNIX kernels are reentrant: several processes may be executing in kernel mode at the same time.

A kernel control path denotes the sequence of instructions the kernel executes to handle a system call, an exception, or an interrupt.

Process address space

Each process runs in its private address space.

A process running in user mode references its private stack, data, and code areas.

When running in kernel mode, the process accesses the kernel's data and code areas but uses another, private stack. Because the kernel is reentrant, several kernel control paths (each associated with a different process) may execute in turn; in this case each kernel control path uses its own private kernel stack.

Although each process has its own private address space, parts of the address space can be shared between processes: in some cases the sharing is explicitly requested by the processes (such as mmap shared memory), and in other cases the kernel performs it automatically to save memory (for example, the code of several instances of the same program is loaded into memory only once).

Synchronization and Critical sections

Implementing a reentrant kernel requires synchronization mechanisms, because multiple kernel control paths may compete for the same kernel data structures.

Non-preemptive kernels: most traditional UNIX kernels are non-preemptive; a process running in kernel mode cannot be arbitrarily suspended or replaced by another process. This is acceptable on a single-processor system but inefficient on multiprocessor systems. (Simple, but not advisable.)

Interrupt disabling: another synchronization mechanism on single-processor systems is to disable all hardware interrupts before entering a critical section and re-enable them on leaving it. If the critical section is large, keeping interrupts disabled for a relatively long time may freeze all hardware activity. (Simple, but not advisable.)

A semaphore can be viewed as an object consisting of three parts:

1) An integer variable

2) A list of waiting processes

3) Two atomic methods: down() and up()

Each data structure to be protected has its own semaphore, with an initial value of 1. To use the data structure, a kernel control path executes down() on the corresponding semaphore; if the resulting value is not negative, access is granted, otherwise the process is added to the semaphore's list and blocked. When another process executes up() on that semaphore, one process on the list is allowed to continue. The disadvantage: when the data structure needs to be modified only briefly, the time spent checking the semaphore, inserting into the queue, and suspending the process dominates, which is inefficient.

Spin locks are very similar to semaphores but have no list of waiting processes. Compared with semaphores, they are more efficient when the data structure needs to be modified only briefly: when a process finds the lock held by another process, it "spins", executing a tight loop instruction until the lock is released, while keeping the processor busy. Consequently, spin locks are useless in a single-processor environment: the spinning process would prevent the lock holder from ever running.

Deadlock avoidance: a deadlock can leave the affected processes or kernel control paths frozen forever. In kernel design, deadlock becomes a prominent problem when many kernel semaphores are used, since it is hard to ensure that no deadlock state can occur under every possible interleaving of kernel control paths. Several operating systems, including Linux, avoid deadlocks by requiring that semaphores be requested in a prescribed order.

Signal

UNIX signals provide a mechanism for notifying processes of system events; each event has its own signal number.

The POSIX standard defines about 20 different signals, two of which are user-definable. In general, a process can respond to a received signal in two ways:

1) Ignore the signal

2) Asynchronously execute a specified procedure (the signal handler)

If the process specifies neither, the kernel performs a default action determined by the signal number; there are five possible default actions:

1) Terminate the process

2) Write the execution context and the contents of the process address space to a file (core dump) and terminate the process

3) Ignore the signal

4) Suspend process

5) If the process has been paused, resume its execution

The SIGKILL and SIGSTOP signals can be neither handled by the process nor ignored.

Inter-process communication

Unix System V introduced other kinds of user-mode interprocess communication mechanisms, which many UNIX kernels have adopted: semaphores, message queues, and shared memory. They are collectively referred to as System V IPC.

The kernel implements them as IPC resources. Like files, IPC resources are persistent: the creating process, the owner, or a privileged user must explicitly release them.

IPC semaphores differ from the kernel semaphores described earlier: IPC semaphores are used by user-mode processes.

The POSIX standard defines a message-queue-based IPC mechanism known as POSIX message queues, similar to System V IPC message queues but with a simpler interface.

Note the distinctions: IPC semaphores vs. kernel semaphores, and System V IPC vs. POSIX IPC.

Process Management

The most basic system calls

The fork() system call creates a new process (note the copy-on-write duplication, and how vfork() differs).

The _exit() system call terminates a process.

The exec()-like system calls load a new program (note the differences among the exec family).

Zombie Process

A parent process can call wait4() to wait until one of its children terminates, returning that child's process identifier. waitpid() is similar but waits for a single specific child.

Before the parent issues a wait4() call, a child that has finished running is in the zombie state: the kernel keeps information about the child (even though it has run to completion) so its exit status can still be collected. Orphaned children are reparented to the init process, which reaps them.

Process groups and login sessions

A process group is an abstraction that represents a job.

A login session, informally, contains all the descendant processes of the process that started a working session on a specific terminal.

Memory management

Virtual memory acts as a logical layer between the application's memory requests and the hardware Memory Management Unit (MMU), with the following purposes and benefits:

1) Several processes can execute concurrently

2) An application can run even when it needs more memory than the available physical memory

3) A process can execute a program with only part of its code loaded in memory

4) Each process is allowed to access a subset of the available physical memory

5) Processes can share a single memory image of a library function or program

6) Programs are relocatable: they can be placed anywhere in physical memory

7) Programmers can write machine-independent code, since they don't need to worry about the organization of physical memory

When a process uses a virtual address, the kernel and the MMU cooperate to locate the actual physical position in memory.

Random access memory (RAM): all UNIX operating systems divide RAM into two parts. A few megabytes are dedicated to storing the kernel image (kernel code and static kernel data structures); the remainder is usually handled by the virtual memory system and may be used in three ways:

1) To satisfy kernel requests for buffers, descriptors, and other dynamic kernel data structures

2) To satisfy process requests for generic memory areas and for memory mapping of files

3) To get better performance from disks and other buffered devices by means of caches

Because RAM is limited, a page-frame-reclaiming algorithm is invoked to free memory when the amount of available memory falls below a critical threshold.

The Kernel Memory Allocator (KMA) is a subsystem that tries to satisfy memory requests from all parts of the system. Requests may come from other kernel subsystems or, via system calls, from user programs. A good KMA should have the following characteristics:

1) It must be fast (this is the most important attribute)

2) It must minimize wasted memory

3) It must try to reduce the memory fragmentation problem

4) It must be able to cooperate with other memory management subsystems to borrow and release page frames

Several KMAs have been proposed:

1) Resource map allocator

2) Power-of-two free lists

3) McKusick-Karels allocator

4) Buddy system

5) Mach's zone allocator

6) Dynix allocator

7) Solaris's slab allocator

The Linux KMA uses a slab allocator on top of the buddy system.

Process virtual address space handling: the kernel typically stores a process's virtual address space as a set of memory area descriptors. All modern Unix operating systems adopt the demand-paging memory allocation strategy: when a process accesses a page that is not present in memory, the MMU raises an exception; the exception handler finds the affected memory area, allocates a free page frame, and initializes it with the appropriate data. Similarly, when a process dynamically requests memory with malloc() or the brk() system call, the kernel only updates the size of the process's heap area; a page frame is actually assigned only when the process raises an exception by trying to reference one of its virtual addresses. The virtual address space that the kernel allocates to a process consists of the following memory areas:

1) The program's executable code (.text)

2) The program's initialized data (.data)

3) The program's uninitialized data (.bss)

4) The initial program stack

5) The executable code and data of the shared libraries it needs

6) The heap

Caching: access to disks and other block devices is very slow compared with RAM access time, so UNIX kernels keep data read from block devices in RAM caches to avoid repeated slow device accesses.

Device drivers: the kernel interacts with I/O devices through device drivers. Device drivers are included in the kernel and consist of data structures and functions that control one or more devices, such as hard disks, keyboards, mice, monitors, network interfaces, and devices attached to a SCSI bus. Each driver interacts with the rest of the kernel through a specific interface, which has the following advantages:

1) Device-specific code can be encapsulated in a specific module

2) Vendors can add new devices without knowing the kernel source code; only the interface specification is needed

3) The kernel deals with all devices in a uniform way and accesses them through the same interface

4) Device drivers can be written as modules and dynamically loaded into the kernel without restarting the system; a module that is no longer needed can be dynamically unloaded, reducing the size of the kernel image stored in RAM
