Original article: http://blog.csdn.net/skywalkzf/article/details/5185442
Kernel mode and user mode are the two privilege levels at which an operating system runs code. Intel x86 CPUs provide four privilege levels, ring 0 through ring 3; ring 0 is the most privileged and ring 3 the least. Ring 0 is reserved for operating system code and device drivers, which run in kernel mode (also called the core state); ring 3 is used by ordinary user programs, which run in user mode. Code running in kernel mode is unrestricted: it can access any valid address and perform direct port I/O. Code running in user mode is subject to many checks by the processor. It can only access the virtual addresses in its own address space whose page table entries mark the page as accessible from user mode, and it can only perform direct I/O on the ports permitted by the I/O permission bitmap in the task state segment (TSS). (In this case the IOPL field in EFLAGS, the processor's status and control flags register, is usually 0, meaning that the lowest privilege level allowed to perform direct I/O unconditionally is ring 0.) This discussion applies only to operating systems running in protected mode. Under real-mode DOS these concepts do not exist, and all code can be regarded as running in kernel mode.
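The port I/O rule described above can be modeled in a few lines. This is only a sketch of the check, not code that touches real hardware: the function names, the sample EFLAGS value, and the 128-byte bitmap are assumptions made for illustration. It relies on two facts about x86 protected mode: IOPL occupies bits 12-13 of EFLAGS, and a clear bit in the I/O permission bitmap means the port is permitted.

```python
def iopl_from_eflags(eflags: int) -> int:
    """Extract the 2-bit IOPL field (bits 12-13) from an EFLAGS value."""
    return (eflags >> 12) & 0b11


def port_access_allowed(cpl: int, eflags: int, io_bitmap: bytes, port: int) -> bool:
    """Model the protected-mode check for a one-byte access to an I/O port.

    cpl       -- current privilege level (0 = ring 0, 3 = ring 3)
    io_bitmap -- hypothetical copy of the TSS I/O permission bitmap;
                 a clear bit means the port is permitted
    """
    if cpl <= iopl_from_eflags(eflags):
        return True  # privileged enough: the bitmap is not consulted
    byte_index, bit_index = divmod(port, 8)
    if byte_index >= len(io_bitmap):
        return False  # ports beyond the end of the bitmap are denied
    return (io_bitmap[byte_index] >> bit_index) & 1 == 0


# Illustrative setup: IOPL = 0, and a bitmap covering ports 0..0x3FF
# that permits only port 0x3F8.
bitmap = bytearray(b"\xff" * 128)
bitmap[0x3F8 // 8] &= ~(1 << (0x3F8 % 8)) & 0xFF
EFLAGS_IOPL0 = 0x0202  # interrupt flag set, IOPL field = 0

print(port_access_allowed(0, EFLAGS_IOPL0, bytes(bitmap), 0x60))   # ring 0
print(port_access_allowed(3, EFLAGS_IOPL0, bytes(bitmap), 0x3F8))  # ring 3, permitted port
print(port_access_allowed(3, EFLAGS_IOPL0, bytes(bitmap), 0x3F9))  # ring 3, denied port
```

With IOPL at 0, ring 0 code passes the first test for every port, while ring 3 code always falls through to the bitmap.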
When a task (process) makes a system call and is executing kernel code, the process is said to be in the kernel running state (kernel mode for short). The processor is then executing kernel code at the highest privilege level (level 0). While a process is in kernel mode, the kernel code it executes uses the kernel stack of the current process; each process has its own kernel stack. When a process executes its own user code, it is said to be in the user running state (user mode): the processor is executing user code at the lowest privilege level (level 3).
In kernel mode the CPU can execute any instruction; in user mode it can execute only non-privileged instructions. From kernel mode the CPU can switch to user mode at will, but from user mode it can enter kernel mode only through a system call or an interrupt (including exceptions). A program generally starts out running in user mode; when it needs to use system resources, it must enter kernel mode by executing a software interrupt, that is, by making a system call.
Linux runs user code in ring 3 and kernel code in ring 0; ring 1 and ring 2 are not used. Code in ring 3 cannot access the ring 0 address space, neither its code nor its data. On 32-bit x86, each Linux process has a 4 GB virtual address space. The region from 3 GB to 4 GB is the kernel address space, shared by all processes; it holds the entire kernel code, all kernel modules, and the data maintained by the kernel. When a user runs a program, the process created for it starts in user mode. To perform file operations, network transmission, and similar work, it must go through system calls such as write and send, which invoke kernel code to carry out the operation. The processor switches to ring 0, executes the code in the 3 GB-4 GB kernel address space, and when the operation is complete switches back to ring 3 and returns to user mode. User-mode programs therefore cannot manipulate the kernel address space at will, which provides security protection.
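As a small illustration of a user program asking the kernel to do I/O on its behalf, the sketch below uses Python's os module, whose pipe, write, and read functions are thin wrappers over the system calls of the same names. The pipe and the message are assumptions chosen so that the example has no outside dependencies; each call enters the kernel, does its work there, and returns to user mode with a result.

```python
import os

message = b"hello from user mode\n"

# os.pipe() asks the kernel for a pair of connected file descriptors.
read_fd, write_fd = os.pipe()
try:
    # os.write() issues the write system call; the kernel copies the bytes
    # into the pipe buffer and returns how many it accepted.
    written = os.write(write_fd, message)
    # os.read() issues the read system call to fetch the bytes back.
    data = os.read(read_fd, written)
finally:
    os.close(read_fd)
    os.close(write_fd)

print(written, data)
```

The program never touches the pipe buffer directly; it only sees the return values that the kernel hands back.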
The processor switches from ring 3 to ring 0 during a control transfer. There are two cases: a far CALL instruction that goes through a call gate, and an INT instruction that goes through an interrupt gate or trap gate. The details of the transfer involve complex protection checks and a stack switch; see the relevant reference material. Operating systems on x86 have traditionally provided system services through an interrupt gate, so that a single instruction performs the mode switch. On Intel x86 this instruction is INT: for example, INT 30h in Win9x (the protected-mode callback), INT 80h in Linux, and INT 2Eh in Windows NT/2000. A user-mode service routine (such as a system DLL) requests a system service by executing INT xx; the processor switches to kernel mode, the corresponding kernel-mode system code services the request, and the result is returned to the user program.
I. Interrupt handling process
Hardware interrupts: raised by the clock and by peripheral devices.
Programmed interrupts: raised by executing instructions that cause a software interrupt.
Exceptions: raised by conditions such as a page fault.
All of these are handled by the system. When an interrupt occurs and the CPU is running at a processor execution level lower than the interrupt level, the CPU accepts the interrupt before decoding the next instruction and raises the processor execution level. The kernel then handles the interrupt in the following sequence:
1. It saves the current register context of the running process and creates (pushes) a new context layer.
2. It determines the interrupt source and identifies the type of interrupt, such as clock or disk.
3. It looks up the interrupt vector. When the system receives an interrupt, it obtains a number from the machine and uses it as an offset into a table, usually called the interrupt vector table. Each entry contains the address of the interrupt handler for the corresponding interrupt source and the way the handler obtains its parameters.
4. The kernel calls the interrupt handler.
5. The interrupt handler completes its work and returns, and the kernel restores (pops) the previous context layer.
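The five steps above can be sketched as a toy model. Everything in it is hypothetical: the class name, the two vector numbers, the register values, and the handlers are invented for illustration, and a real kernel does this in assembly and C with hardware support. The point is only the shape of the sequence: push a context layer, index the vector table, call the handler, pop the layer.

```python
from typing import Callable, Dict, List

# Hypothetical vector numbers, for illustration only.
CLOCK_VECTOR = 0
DISK_VECTOR = 1


class ToyInterruptKernel:
    def __init__(self) -> None:
        self.registers: Dict[str, int] = {"pc": 0x1000, "sp": 0x7FFF}
        self.context_stack: List[Dict[str, int]] = []
        self.log: List[str] = []
        # Interrupt vector table: vector number -> handler.
        self.vector_table: Dict[int, Callable[[], None]] = {
            CLOCK_VECTOR: self.clock_handler,
            DISK_VECTOR: self.disk_handler,
        }

    def clock_handler(self) -> None:
        self.registers["pc"] = 0xC000  # the handler clobbers registers
        self.log.append("clock tick")

    def disk_handler(self) -> None:
        self.registers["pc"] = 0xC100
        self.log.append("disk transfer complete")

    def interrupt(self, vector: int) -> None:
        # Step 1: save the register context by pushing a new context layer.
        self.context_stack.append(dict(self.registers))
        try:
            # Steps 2-3: the number supplied with the interrupt identifies
            # the source and is used as an offset into the vector table.
            handler = self.vector_table[vector]
            # Step 4: call the interrupt handler.
            handler()
        finally:
            # Step 5: pop the context layer so the interrupted code resumes
            # with the registers it had before.
            self.registers = self.context_stack.pop()


kernel = ToyInterruptKernel()
saved = dict(kernel.registers)
kernel.interrupt(DISK_VECTOR)
print(kernel.log, kernel.registers == saved)
```

Because the context layers form a stack, a higher-level interrupt arriving inside a handler would simply push another layer on top of the current one.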
II. Soft interrupts
A soft interrupt signal (a signal, in UNIX terms) notifies a process that an asynchronous event has occurred.
The system keeps a process table with one entry per process. Each entry has a soft interrupt signal field that records all unhandled soft interrupt signals sent to that process.
The kernel checks whether a process has received a soft interrupt signal when the process is about to return from kernel mode to user mode, and when it enters or leaves the sleep state at a suitably low scheduling priority.
The kernel handles soft interrupt signals only when a process returns from kernel mode to user mode.
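The bookkeeping described above can be sketched as follows. This is a toy model under stated assumptions: the class names, the use of one bit per signal number, and the delivered list are illustrative choices, not the layout of any particular kernel. It shows that sending a signal only marks it in the process table entry, and that the pending signals are acted on later, at the return to user mode.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Illustrative signal numbers.
SIG_A = 2
SIG_B = 15


@dataclass
class ProcessEntry:
    pid: int
    pending: int = 0  # signal field: one bit per unhandled signal
    delivered: List[int] = field(default_factory=list)


class ToySignalKernel:
    def __init__(self) -> None:
        self.process_table: Dict[int, ProcessEntry] = {}

    def add_process(self, pid: int) -> None:
        self.process_table[pid] = ProcessEntry(pid)

    def send_signal(self, pid: int, signo: int) -> None:
        # Sending only records the signal in the target's table entry.
        self.process_table[pid].pending |= 1 << signo

    def return_to_user_mode(self, pid: int) -> None:
        # The kernel handles pending signals on the way back to user mode.
        entry = self.process_table[pid]
        for signo in range(entry.pending.bit_length()):
            if entry.pending & (1 << signo):
                entry.pending &= ~(1 << signo)
                entry.delivered.append(signo)


kernel = ToySignalKernel()
kernel.add_process(1)
kernel.send_signal(1, SIG_B)
kernel.send_signal(1, SIG_A)
kernel.send_signal(1, SIG_A)  # same bit again: still one pending signal
kernel.return_to_user_mode(1)
print(kernel.process_table[1].delivered)
```

With one bit per signal, a signal sent twice before it is handled is recorded only once, which is a consequence of this particular representation.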
III. System calls
In a C program, invoking a system call looks like an ordinary function call. In reality it causes a switch from user mode to kernel mode, so how is this done?
It turns out that the C compiler uses a predefined function library (the C library) whose functions have the same names as the system calls, which solves the problem of requesting a system call from a user program. These library functions generally execute an instruction that changes the running mode of the process to kernel mode and makes the kernel start executing the code for the system call. We call this instruction an operating system trap.
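One way to see the library layer is to call a C library wrapper directly. The sketch below assumes a Unix-like platform where ctypes.CDLL(None) exposes the C library already loaded into the process (it does not work this way on Windows). It calls the library function getpid, which carries the name of the system call it wraps, and compares the result with Python's own os.getpid.

```python
import ctypes
import os

# Assumption: a Unix-like system, where passing None loads the symbols of
# the running process, including the C library.
libc = ctypes.CDLL(None)

# getpid() in the C library is a small wrapper that requests the system
# call of the same name and returns its result.
pid_via_libc = libc.getpid()

# os.getpid() reaches the same system call through the interpreter.
pid_via_os = os.getpid()

print(pid_via_libc, pid_via_os)
```

From the caller's side both look like ordinary function calls; the switch to kernel mode is hidden inside the library.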
The system call interface is a special case of an interrupt handler.
When handling an operating system trap:
1. The kernel uses the system call number to look up the system call entry table and find the address of the corresponding kernel routine.
2. The kernel also determines the number of parameters the system call requires.
3. The kernel copies the parameters from the user address space to the u area (UNIX System V).
4. The kernel saves the current context and runs the system call code.
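The lookup and copy steps above can be sketched with a small dispatch table. The call numbers, the two routines, and the error value of -1 are hypothetical, and saving the context (part of step 4) is not modeled; the sketch only shows how a number selects a routine and a parameter count, and how exactly that many parameters are copied before the routine runs.

```python
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical system call numbers, for illustration only.
SYS_ADD = 0
SYS_STRLEN = 1


def sys_add(a: int, b: int) -> int:
    return a + b


def sys_strlen(s: str) -> int:
    return len(s)


# System call entry table: number -> (kernel routine, parameter count).
SYSCALL_TABLE: Dict[int, Tuple[Callable[..., int], int]] = {
    SYS_ADD: (sys_add, 2),
    SYS_STRLEN: (sys_strlen, 1),
}


def trap(number: int, user_args: Sequence[object]) -> int:
    # Step 1: use the call number to find the kernel routine.
    if number not in SYSCALL_TABLE:
        return -1  # unknown system call
    # Step 2: the table also says how many parameters the call takes.
    routine, nargs = SYSCALL_TABLE[number]
    if len(user_args) < nargs:
        return -1  # too few parameters supplied
    # Step 3: copy exactly that many parameters out of the caller's memory
    # into kernel-owned storage (standing in for the u area).
    u_area_args = tuple(user_args[:nargs])
    # Step 4: run the system call code (context saving is not modeled).
    return routine(*u_area_args)


print(trap(SYS_ADD, [2, 3]))
print(trap(SYS_STRLEN, ["kernel"]))
```

Copying the parameters before use means the routine works on the kernel's own copy rather than on memory the caller can still change.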
Kernel mode (core state): the CPU is executing kernel code (kernel code is shared).
User mode (user state): the CPU is executing user code.
In user mode, kernel space (addresses >= 0x80000000, the default boundary on 32-bit Windows) cannot be accessed.
In kernel mode, any valid virtual address can be accessed, including kernel space, and a thread can access the address space of any other thread.
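The address rule can be expressed as a simple comparison against the start of kernel space. In this sketch the constant and function names are assumptions made for illustration; the two boundaries are the ones mentioned in the text, 3 GB for the 32-bit Linux layout and the 0x80000000 boundary quoted above.

```python
# Start of kernel space under the two 32-bit layouts mentioned in the text.
KERNEL_BASE_3G = 0xC0000000  # 3 GB-4 GB is kernel space
KERNEL_BASE_2G = 0x80000000  # 2 GB-4 GB is kernel space


def is_kernel_address(addr: int, kernel_base: int) -> bool:
    """Return True if a 32-bit virtual address lies in kernel space."""
    if not 0 <= addr <= 0xFFFFFFFF:
        raise ValueError("not a 32-bit address")
    return addr >= kernel_base


def can_access(addr: int, kernel_base: int, in_kernel_mode: bool) -> bool:
    """Kernel mode may touch any address; user mode only user space."""
    return in_kernel_mode or not is_kernel_address(addr, kernel_base)


print(can_access(0x90000000, KERNEL_BASE_3G, in_kernel_mode=False))
print(can_access(0x90000000, KERNEL_BASE_2G, in_kernel_mode=False))
```

The same address, 0x90000000, is user space under the 3 GB split but kernel space under the 2 GB split, so the answer depends on the layout in use. Whether the address is actually mapped is a separate question that this sketch does not model.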
Kernel Mode and user mode [reprinted]