Understanding Linux user mode and kernel mode, process context and interrupt context, kernel space and user space

Last Update:2018-02-27 Source: Internet

Author: User

1. Privilege levels

Intel x86 CPUs provide four privilege levels, numbered 0 to 3, with 0 the highest and 3 the lowest; the ARM architecture defines its own, different set of privilege levels. For every instruction, the hardware checks the privilege level at which that instruction is being executed. The hardware supplies the privilege mechanism, and it is the job of the operating system to make good use of it. Unix/Linux uses only two of the levels: level 0, the highest, and level 3, the lowest. Code running at level 0 has all the power the CPU can provide, while code running at level 3 has only the most basic power. That is privilege seen from the point of view of the CPU executing instructions. Memory is protected in a related way: the mapping from virtual addresses to physical addresses is performed in hardware by the MMU (paging is supported by the hardware), and each process has a page table data structure that maps both user space and kernel space, so the memory that can be accessed in user mode differs from the memory that can be accessed in kernel mode.

2. User-mode and kernel-mode stacks

Each process in Linux has two stacks, one used while the process executes in user mode and one used while it executes in kernel mode. The kernel stack is the kernel-mode stack. It is stored together with the task_struct of the process (more precisely, its thread_info structure) in two contiguous page frames.
Now we can understand user mode and kernel mode in terms of privilege levels. A program running at privilege level 3 is said to run in user mode (the "user state"), because this is the lowest privilege level and the one at which ordinary user processes run. Most programs that users deal with directly run in user mode. Conversely, a program running at privilege level 0 is said to run in kernel mode (the "kernel state"). User-mode and kernel-mode programs differ in many ways, but the most important difference is the privilege level, that is, power. A program running in user mode cannot directly access the kernel's data structures or code. When we run a program, it spends most of its time in user mode and switches to kernel mode only when it needs the operating system to do work that it has neither the privilege nor the ability to do itself.

Of the 4 GB address space of a Linux process, the 3 GB-4 GB part is shared by all processes. It is the kernel-mode address space and holds the code of the kernel and of all kernel modules, together with the data the kernel maintains. When a user runs a program, the process it creates runs in user mode. To operate on a file or send data over the network, the process must go through system calls such as write and send, which invoke kernel code to do the work. At that point the CPU must switch to ring 0 and enter the 3 GB-4 GB kernel address space to execute that code; when the work is done, it switches back to ring 3, that is, to user mode. In this way a user-mode program cannot operate on the kernel address space at will, which provides a degree of protection.
In protected mode, the page table mechanism keeps the address spaces of different processes from conflicting with each other: an operation in one process cannot modify data in the address space of another process. In kernel mode the CPU can execute any instruction; in user mode it executes only non-privileged instructions. When the CPU is in kernel mode it can enter user mode at will, but when it is in user mode it can enter kernel mode only through an interrupt. A program normally starts in user mode, and when it needs system resources it must enter kernel mode by raising a software interrupt.
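The round trip from user mode to kernel mode and back can be observed from an ordinary C program. The following is a minimal sketch, assuming a Linux system with glibc; the helper name write_via_syscall is hypothetical and chosen only for this illustration. It invokes the write system call through the generic syscall() wrapper instead of the usual write() library function.

```c
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes from buf to fd by invoking the
 * write system call directly. syscall() places the call number and the
 * arguments in registers and executes the architecture's trap
 * instruction, so the CPU enters kernel mode, runs the kernel's write
 * code, and returns to user mode with the result. */
static long write_via_syscall(int fd, const void *buf, size_t len)
{
    return syscall(SYS_write, fd, buf, len);
}
```

Calling write_via_syscall(1, "hello\n", 6) from main() prints to standard output; the return value is the number of bytes written, or -1 with errno set on failure.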

The processor is always in one of the following states:

1. Kernel mode, process context: the kernel runs in kernel space on behalf of a process.

2. Kernel mode, interrupt context: the kernel runs in kernel space on behalf of the hardware.

3. User mode: the processor runs in user space.

3. Switching between user mode and kernel mode

1) Three ways to switch from user mode to kernel mode

a. System call

This is the way a user-mode process actively asks to switch to kernel mode. Through a system call, the user-mode process requests a service routine provided by the operating system to do work on its behalf. At its core, the system call mechanism is implemented with an interrupt that the operating system opens specifically for user programs, such as int 80h on x86 Linux or sc on PowerPC.

b. Exception

While the CPU is executing a program in user mode, an exception that could not be known in advance may occur. It makes the running process switch to the kernel routine that handles the exception, that is, to kernel mode. A page fault is an example.

c. Peripheral device interrupt

When a peripheral device completes an operation requested by the user, it sends an interrupt signal to the CPU. The CPU then suspends the instruction it was about to execute and runs the handler for that interrupt signal. If the instructions being executed belonged to a user-mode program, this is naturally a switch from user mode to kernel mode. For example, when a disk read or write completes, the system switches to the disk interrupt handler to carry out the follow-up work.

These three are the main ways in which the system goes from user mode to kernel mode at run time. The system call can be regarded as initiated by the user process, while exceptions and peripheral interrupts are passive.
4. The switching operation

In terms of what triggers them, the three ways above are clearly different. In terms of the operation that finally performs the switch from user mode to kernel mode, however, the key steps are exactly the same. Each is equivalent to responding to an interrupt, because system calls are ultimately implemented through the interrupt mechanism, and exceptions are handled in essentially the same way as interrupts. Without going into the details of interrupt handling, the steps for switching from user mode to kernel mode are mainly:

[1] Fetch the ss0 and esp0 of the kernel stack from the descriptor of the current process.

[2] Using the kernel stack that ss0 and esp0 point to, save the cs, eip, eflags, ss and esp of the current process. This step also completes the switch from the user stack to the kernel stack and preserves the address of the next instruction of the suspended program.

[3] Load the cs and eip of the interrupt handler, found earlier through the interrupt vector, into the corresponding registers and start executing the handler. Execution is now in kernel mode.

5. Process context and interrupt context

Process context and interrupt context are sub-concepts of kernel mode and user mode.

Kernel space and user space are important concepts in operating system theory. User programs run in user space and the functional modules of the kernel run in kernel space, and the two spaces cannot access each other freely. The terms refer to the memory in which the respective code and data are stored. A user-mode program must use system calls to access kernel space. When a user-space application enters kernel space through a system call, a context switch is involved. User space and kernel space have different address mappings, general-purpose and special-purpose register sets, and stack areas. The user-space process passes variables and parameters to the kernel, and the kernel saves some of the registers and variables of the user process so that execution can continue in user space when the system call returns.

The process context is the set of values that a process passes to the kernel: the contents of all CPU registers, the state of the process, and the contents of its stack, in other words the running environment of the process before it entered kernel mode. When switching to kernel mode, therefore, all the state of the current process (its context) must be saved, so that when the process runs again it can restore the state it had at the switch and continue. In the same way, a signal raised by hardware makes the kernel call an interrupt handler and enter kernel space. In this process the hardware passes some variables and parameters to the kernel, which uses them to handle the interrupt. The interrupt context can be understood as these parameters passed by the hardware plus the environment the kernel needs to save (mainly the environment of the interrupted process).

While a process executes, the values in all CPU registers, the state of the process, and the contents of its stack are together called the context of the process. When the kernel needs to switch to another process, it must save all the state of the current process, that is, its context, so that this state can be restored when the process runs again. In Linux, the context of the current process is stored in the task data structure of the process. When an interrupt occurs, the kernel executes the interrupt service routine in kernel mode, in the context of the interrupted process. At the same time it preserves every resource it needs to use, so that the interrupted process can resume when the interrupt service routine finishes.

Put simply, a context is an environment: for a process, it is the environment in which the process executes; for an interrupt, it is the environment in which the interrupt handler executes.

The context of a process can be divided into three parts: user-level context, register context, and system-level context.

(1) User-level context: text (code), data, user stack, and shared memory areas;
(2) Register context: general-purpose registers, program counter (IP), processor status register (EFLAGS), stack pointer (ESP);
(3) System-level context: the process control block task_struct, memory management information (mm_struct, vm_area_struct, pgd, pte), and the kernel stack.

Context switches happen in two situations, process scheduling and system calls, and the two differ in cost. On process scheduling, the process switch is a full context switch: the operating system must switch all the information listed above before the newly scheduled process can run. The mode switch of a system call is much simpler and cheaper than a process switch, because its main task is to switch the register context of the process. In process context, code can refer to the current process through the current macro, and it may sleep or call the scheduler.

Interrupt context does not support preemption, whereas kernel code running in process context can be preempted (Linux 2.6 supports kernel preemption), which means it supports process scheduling. Code in interrupt context normally holds the CPU until it finishes (interrupts can be nested, but in general they are not) and is not scheduled away. Because of this, code running in interrupt context is subject to some restrictions and must not do the following:

(1) Sleep or give up the CPU.

The consequences are catastrophic: the kernel disables process scheduling before it enters the interrupt, so once the handler sleeps or gives up the CPU, the kernel cannot schedule another process to run and the system hangs.

(2) Try to acquire a semaphore.

If the semaphore is not available the code sleeps, which leads to the same situation as above. Where locking is needed in interrupt context, a spinlock is used instead.

(3) Perform time-consuming tasks.

Interrupt handling should be as fast as possible, because the kernel has to respond to a large number of services and requests, and an interrupt context that holds the CPU for too long severely degrades the system.

(4) Access user-space virtual addresses.

Interrupt context is not tied to any particular process; it is the kernel running in kernel space on behalf of the hardware. User-space virtual addresses therefore cannot be accessed in interrupt context.

6. User space and kernel space

We know that operating systems now use virtual memory. For a 32-bit operating system, the address space (virtual address space) is 4 GB (2^32 bytes). The core of the operating system is the kernel, which is separate from ordinary applications, can access protected memory, and has full permission to access the underlying hardware. To keep user processes from manipulating the kernel directly, and so protect the kernel, the operating system divides the virtual address space into two parts: kernel space and user space. In Linux, the highest 1 GB (virtual addresses 0xC0000000 to 0xFFFFFFFF) is used by the kernel and is called kernel space, and the lower 3 GB (virtual addresses 0x00000000 to 0xBFFFFFFF) is used by each process and is called user space. Every process can enter the kernel through system calls, so the Linux kernel is shared by all processes in the system. From the point of view of a single process, therefore, each process has a 4 GB virtual address space, with user space in the lower 3 GB and kernel space in the top 1 GB.
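The 3 GB / 1 GB split described above amounts to a single comparison against the boundary address. The sketch below assumes the classic 32-bit layout from the text; PAGE_OFFSET is the name the i386 kernel uses for this boundary, while the two helper functions are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Start of kernel space in the classic 32-bit 3 GB / 1 GB split. */
#define PAGE_OFFSET 0xC0000000u

/* True for addresses in the top 1 GB (0xC0000000-0xFFFFFFFF). */
static bool is_kernel_address(uint32_t vaddr)
{
    return vaddr >= PAGE_OFFSET;
}

/* True for addresses in the lower 3 GB (0x00000000-0xBFFFFFFF). */
static bool is_user_address(uint32_t vaddr)
{
    return vaddr < PAGE_OFFSET;
}
```

Every 32-bit virtual address falls into exactly one of the two ranges, which is why the same check works for every process.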

With user space and kernel space, the internal structure of Linux can be divided into three layers, from bottom to top: hardware, kernel space, and user space.

[Figure: layered structure of Linux, showing the composition of the kernel]

Points to note:

(1) Kernel space holds the code and data of the kernel, and the user space of a process holds the code and data of the user program. Both kernel space and user space are in virtual address space.

(2) Linux uses only two protection levels: level 0 for the kernel and level 3 for user programs.

Why not allocate all of the address space to the kernel?

If all of the address space were given to the kernel, how would user processes use memory? And how could the memory used by the kernel be kept from conflicting with the memory used by user processes?

(1) Let us ignore Linux's support for segmented memory mapping. In protected mode, whether the CPU runs in user mode or kernel mode, every address the executing code accesses is a virtual address. The MMU reads the value of control register CR3 as a pointer to the current page directory and, using the paged memory mapping mechanism (see the relevant documentation), translates the virtual address into the real physical address that the CPU actually accesses.

(2) On 32-bit Linux each process has a 4 GB address space. When a process accesses an address in its virtual address space, how is it kept from being confused with the virtual address space of another process? Each process has its own page directory (PGD). Linux stores the pointer to this directory in the memory descriptor of the process, task_struct.mm (a struct mm_struct), in the field mm->pgd. Each time a process is scheduled (schedule()), the Linux kernel loads CR3 with the PGD pointer of that process (switch_mm()).
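The translation that the MMU performs from the page directory can be modelled in a few lines. This is a simplified sketch, assuming the classic two-level i386 scheme with 4 KB pages (10 bits of directory index, 10 bits of table index, 12 bits of offset). The types page_dir_t and page_table_t and the function translate are hypothetical names for this illustration; real hardware stores physical addresses in directory entries rather than C pointers.

```c
#include <stddef.h>
#include <stdint.h>

#define ENTRIES 1024u   /* entries per directory and per table */
#define PRESENT 0x1u    /* "present" bit in a page table entry */

typedef struct { uint32_t entries[ENTRIES]; } page_table_t;
/* The structure CR3 points to; one per process. */
typedef struct { page_table_t *tables[ENTRIES]; } page_dir_t;

static uint32_t pgd_index(uint32_t va)   { return va >> 22; }
static uint32_t pte_index(uint32_t va)   { return (va >> 12) & 0x3FFu; }
static uint32_t page_offset(uint32_t va) { return va & 0xFFFu; }

/* Walk the two levels starting from the directory in "CR3".
 * Returns 0 and stores the physical address on success, or -1 where
 * real hardware would raise a page fault. */
static int translate(const page_dir_t *cr3, uint32_t va, uint32_t *pa)
{
    const page_table_t *pt = cr3->tables[pgd_index(va)];
    if (pt == NULL)
        return -1;
    uint32_t pte = pt->entries[pte_index(va)];
    if (!(pte & PRESENT))
        return -1;
    *pa = (pte & ~0xFFFu) | page_offset(va);
    return 0;
}
```

Passing a different page_dir_t for the same virtual address models what loading CR3 does on a process switch: the same address now resolves through another process's tables.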

(3) When a new process is created, a new page directory (PGD) is created for it, and the page directory entries for the kernel range are copied from the kernel's page directory swapper_pg_dir to the corresponding positions in the new process's PGD, as follows:
do_fork() -> copy_mm() -> mm_init() -> pgd_alloc() -> set_pgd_fast() -> get_pgd_slow() -> memcpy(pgd + USER_PTRS_PER_PGD, swapper_pg_dir + USER_PTRS_PER_PGD, (PTRS_PER_PGD - USER_PTRS_PER_PGD) * sizeof(pgd_t))
The page directory of each process is thus divided into two parts. The first part, "user space", maps the process's own space (0x0000 0000-0xbfff ffff), that is, 3 GB of virtual addresses. The second part, "system space", maps 0xc000 0000-0xffff ffff, 1 GB of virtual addresses. The second part of the page directory is the same for every process in a Linux system. From the point of view of a process, therefore, each process has 4 GB of virtual address space: the lower 3 GB is its own user space and the highest 1 GB is the system space shared with all other processes and with the kernel.
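The effect of that memcpy can be reproduced in user space. The following is a toy model, assuming the 32-bit values PTRS_PER_PGD = 1024 and USER_PTRS_PER_PGD = 768 (0xC0000000 >> 22); pgd_t is simplified here to a plain integer, and init_process_pgd is a hypothetical name for this illustration, not a kernel function.

```c
#include <stdint.h>
#include <string.h>

#define PTRS_PER_PGD      1024
#define USER_PTRS_PER_PGD 768    /* entries covering the lower 3 GB */

typedef uint32_t pgd_t;          /* simplified; a struct in the real kernel */

/* Prepare the page directory of a new process: the user part starts
 * out empty, the kernel part is copied from the master directory so
 * that every process maps the top 1 GB identically. */
static void init_process_pgd(pgd_t *pgd, const pgd_t *swapper_pg_dir)
{
    memset(pgd, 0, USER_PTRS_PER_PGD * sizeof(pgd_t));
    memcpy(pgd + USER_PTRS_PER_PGD,
           swapper_pg_dir + USER_PTRS_PER_PGD,
           (PTRS_PER_PGD - USER_PTRS_PER_PGD) * sizeof(pgd_t));
}
```

After two processes have been initialised this way, entries 768 to 1023 of their directories compare equal, while entries 0 to 767 are free to diverge as each process maps its own user pages.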

(4) Now suppose we have the following scenario:
Process A sets the host name of the computer on the network with the system call sethostname(const char *name, size_t len).
This scenario necessarily involves transferring data from user space to kernel space: name is an address in user space, and the system call must copy what it points to into an address in the kernel. Let us look at some details of the process. A system call is made by putting its parameters into the registers ebx, ecx, edx, esi and edi (at most five parameters; this scenario has two, name and len), putting the system call number into the register eax, and then bringing process A into system space with the interrupt instruction "int 80". Because the privilege level of the process is numerically less than or equal to the level 3 set on the trap gate for the system call, the process can enter system space unimpeded and execute system_call(), the function set up for int 80. Since system_call() is in kernel space, it runs at level 0, and the CPU switches the stack to the kernel stack, that is, the system-space stack of process A. When the kernel creates the task_struct for a new process, it allocates two contiguous pages, 8 KB in all, and uses about 1 KB at the bottom for the task_struct (for example, #define alloc_task_struct() ((struct task_struct *) __get_free_pages(GFP_KERNEL,1))). The rest of that memory is the stack for system space, so when execution moves from user space to system space the stack pointer esp becomes (alloc_task_struct() + 8192). This is why kernel code can use the macro current (see its implementation) to obtain the task_struct address of the current process.
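Because the task_struct sits at the bottom of an 8 KB, 8 KB-aligned block and the kernel stack grows down from the top of the same block, the task_struct address can be recovered from any kernel stack pointer by masking off the low 13 bits. The sketch below shows only that arithmetic on 32-bit values; the function names are hypothetical and the real current macro reads the esp register directly.

```c
#include <stdint.h>

#define THREAD_SIZE 8192u   /* two contiguous 4 KB pages */

/* Round a kernel stack pointer down to the start of its 8 KB block,
 * which is where the task_struct is stored in this layout. */
static uint32_t task_base_from_esp(uint32_t esp)
{
    return esp & ~(THREAD_SIZE - 1u);
}

/* The empty kernel stack starts at the top of the block. */
static uint32_t initial_kernel_esp(uint32_t task_base)
{
    return task_base + THREAD_SIZE;
}
```

For a block at 0xC1234000, the stack starts at 0xC1236000, and any stack pointer inside the block, such as 0xC1235FFC, masks back to 0xC1234000.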
Each time the process enters system space from user space, the user stack segment SS, the user stack pointer ESP, EFLAGS, the user-space CS and EIP have already been pushed onto the system stack. system_call() then pushes eax, and SAVE_ALL pushes, in order, ES, DS, EAX, EBP, EDI, ESI, EDX, ECX and EBX. Finally the handler at sys_call_table + 4 * %eax is called, which in this scenario is sys_sethostname().
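The push order above determines how the saved registers are laid out on the kernel stack: the last value pushed (EBX) ends up at the lowest address. The following is a user-space simulation of that order, not kernel code; struct saved_regs, struct kstack and the function names are hypothetical, and the field order is simply the reverse of the push sequence described in the text.

```c
#include <stdint.h>
#include <string.h>

/* Saved registers as they sit on the kernel stack, lowest address first. */
struct saved_regs {
    uint32_t ebx, ecx, edx, esi, edi, ebp, eax;  /* pushed by SAVE_ALL */
    uint32_t ds, es;                             /* pushed by SAVE_ALL */
    uint32_t orig_eax;                           /* system call number */
    uint32_t eip, cs, eflags, esp, ss;           /* pushed on kernel entry */
};

/* A tiny downward-growing stack standing in for the kernel stack. */
struct kstack { uint32_t mem[64]; uint32_t *sp; };

static void kstack_init(struct kstack *s) { s->sp = s->mem + 64; }
static void push(struct kstack *s, uint32_t v) { *--s->sp = v; }

/* Replay the pushes in the order given in the text. */
static void enter_system_call(struct kstack *s, const struct saved_regs *u)
{
    push(s, u->ss); push(s, u->esp); push(s, u->eflags);
    push(s, u->cs); push(s, u->eip);
    push(s, u->orig_eax);                 /* eax saved by system_call() */
    push(s, u->es); push(s, u->ds); push(s, u->eax); push(s, u->ebp);
    push(s, u->edi); push(s, u->esi); push(s, u->edx);
    push(s, u->ecx); push(s, u->ebx);
}

/* Read the frame back from the top of the stack as one structure. */
static struct saved_regs frame_on_stack(const struct kstack *s)
{
    struct saved_regs r;
    memcpy(&r, s->sp, sizeof r);
    return r;
}
```

With ebx holding the first argument (name) and ecx the second (len), reading the frame back gives the handler its parameters at fixed offsets from the stack pointer, which is how a function such as sys_sethostname() can receive them.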

(5) In sys_sethostname(), after some protection checks, copy_from_user(to, from, n) is called, where to points to system_utsname.nodename in kernel space (for example 0xe625a000) and from points to user space (for example 0x8010fe00). Process A is now in the kernel, running in system space, and the MMU maps virtual addresses to physical addresses according to its PGD, which completes the copy of data from user space to system space. Before copying, the kernel checks that the user-space address and length are valid. It does not check whether the whole range of that length starting at the user-space address is actually mapped. If an address in the range is not mapped, or has a read/write permission problem, it is treated as a bad address and causes a page fault, which is left to the page fault handler. The call sequence is: copy_from_user() -> generic_copy_from_user() -> access_ok() + __copy_user_zeroing().
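The validity check made before the copy is essentially a range test against the top of user space. The sketch below illustrates the idea only; it is not the kernel's access_ok() implementation. USER_LIMIT and user_range_ok are hypothetical names, and the limit assumes the 3 GB user space described earlier.

```c
#include <stdbool.h>
#include <stdint.h>

#define USER_LIMIT 0xC0000000u   /* first address above user space */

/* True if the len bytes starting at addr lie entirely below
 * USER_LIMIT. Written so that addr + len is never computed, because
 * that sum could wrap around in 32 bits. Whether the pages are
 * actually mapped is deliberately not checked here. */
static bool user_range_ok(uint32_t addr, uint32_t len)
{
    return addr <= USER_LIMIT && len <= USER_LIMIT - addr;
}
```

With the example addresses from the text, a source buffer at 0x8010fe00 passes the check, while a kernel address such as 0xe625a000 passed in as a user pointer is rejected before any copying starts.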

(6) Summary:
* The address space of a process is 0-4 GB
* In user mode a process can access only 0-3 GB; the 3 GB-4 GB range is accessible only in kernel mode
* A process enters kernel mode through a system call
* The 3 GB-4 GB part of the virtual address space is the same for every process
* Moving from user mode to kernel mode does not change CR3, but it does change the stack

7. Introduction to the memory management unit (MMU)

The MMU is the basis for implementing virtual and physical address spaces, and for separating kernel space from user space.

MMU is the abbreviation of memory management unit, the device used to manage the virtual memory system. The MMU is usually part of the CPU and has a small amount of storage of its own, which holds a table matching virtual addresses to physical addresses together with a translation method (algorithm). This table is called the TLB (translation lookaside buffer). All data requests are sent to the MMU, which determines whether the data is in RAM or on a mass storage device. If the data is not in memory, the MMU raises a page fault. The external memory address space is organised in pages, rows, and columns.

The two main functions of the MMU are:

1. Translating virtual addresses into physical addresses.

2. Controlling memory access permissions. When the MMU is disabled, virtual addresses are output directly on the physical address bus, as in the early stage of U-Boot.

In practice, using the MMU solves the following problems:

① When DRAM is used as the main memory and its physical addresses are not contiguous, programming and debugging become very inconvenient. With a suitable MMU configuration, the discontiguous physical space can be turned into a contiguous virtual address space.

② The interrupt vector table of an ARM core must be placed at address 0. If ROM sits at address 0, the interrupt service routines cannot be debugged, so writable memory has to be mapped to address 0 during the debugging phase.

③ Some address ranges in the system must not be accessed, because doing so may have unpredictable consequences. To avoid such errors, the MMU match table can mark these address ranges as inaccessible to the user. This is the distinction between kernel space and user space.

The match tables generated by the startup code contain the address mappings, the page sizes (1 MB, 64 KB, or 4 KB), the access permissions and similar information. They are the basis for implementing these functions.

For example, suppose the 16 MB of DRAM on the target board occupies the physical address ranges 0xC000,0000-0xC07F,FFFF and 0xC100,0000-0xC17F,FFFF (logical addresses, with one address per 8 bits of data), and the 16 MB ROM occupies the virtual address range 0x0000,0000-0x00FF,FFFF. The match table (each entry gives the virtual address, the page size, the access permission, and the page table location) is configured as follows:

[Table: match table configuration]

On the left is a contiguous virtual address space and on the right a discontiguous physical address space, with the DRAM mapped to the range starting at address 0. The MMU takes the virtual address and the page table location, derives the corresponding physical address with its translation logic, and outputs it on the address bus.

Note that after the MMU is enabled the program keeps running, but from the programmer's point of view the program counter now holds a virtual address, here the virtual address of the ROM.

The MMU has two roles: address translation and address protection. Software is responsible for configuring the page table; hardware is responsible for performing translation and protection according to the page table. The three functions are used to access a page table. If the CPU has no hardware MMU, this table is meaningless. Memory mapping has to be understood from the point of view of the CPU. A memory mapping is not a function that is called and whose return value is then read. Instead, the CPU uses the MMU to translate the address accessed by an instruction into a physical address, which is then sent out on the bus. The book Understanding the Linux Kernel is very well written and is worth reading patiently.

The MMU appears once processors reach a certain level of complexity. It is best studied and understood together with the memory management of the operating system.

In embedded systems the memory system varies a great deal. It can contain many types of memory device, such as flash, SRAM, SDRAM and ROM, which differ in speed, width and other properties. Some memory can be accessed with a flat address mapping, while other memory needs to be read and written through virtual addresses, and the system also needs a memory protection mechanism to improve its security. To meet the requirements of such a complex memory system, ARM processors introduced a memory management unit to manage it. This is the purpose of the MMU.

Memory management unit (MMU) overview

In an ARM memory system, the MMU implements the mapping from virtual addresses to actual physical addresses. Why is this mapping wanted? Start from the basic composition and operation of an embedded system. At power-up, the program counter of the processor starts at 0x0 (or at 0xffff_0000 for a high-vector boot) and the program is executed in sequence. The start address in the program counter (PC) belongs to non-volatile memory such as ROM or flash. Compared with an embedded processor running at hundreds of megahertz, however, flash and ROM respond slowly and become a bottleneck for system performance. SDRAM responds quickly, so why not execute the program from SDRAM? To raise the overall speed of the system, one can imagine using flash or ROM only to configure the system and loading the real application into SDRAM to run. This idea meets another problem. When the ARM processor responds to an exception, the program counter jumps to a fixed location. For an IRQ interrupt, the PC points to 0x18 (0xffff_0018 for a high-vector boot), and 0x18 is still occupied by non-volatile memory, so part of the program still executes from flash or ROM. Can the program run entirely in SDRAM? The answer is yes, and this is where the MMU comes in. With the MMU, the SDRAM can be mapped in full to a contiguous address range starting at 0x0, and the flash or ROM that originally occupied this range can be mapped to some other location that does not conflict.
For example, if the flash occupies 0x0000_0000-0x00FF_FFFF and the SDRAM occupies 0x3000_0000-0x31FF_FFFF, the SDRAM can be mapped to 0x0000_0000-0x01FF_FFFF and the flash can be mapped to 0x9000_0000-0x90FF_FFFF (assuming this address range is free and not occupied). Once the mapping is in place, if the processor takes an exception, again say an IRQ interrupt, the PC points to address 0x18, and the instruction is actually fetched from physical address 0x3000_0018. With the MMU mapping, the program runs entirely in SDRAM. The physical address here is the address at which the device is actually attached to the bus.
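The remapping in this example can be checked with a small lookup. The sketch below is a simplified model, assuming the sizes implied by the ranges above (32 MB of SDRAM at physical 0x3000_0000 placed at virtual 0x0000_0000, and 16 MB of flash at physical 0x0000_0000 placed at virtual 0x9000_0000). struct mapping, board_map and virt_to_phys are hypothetical names; a real ARM MMU uses translation tables in memory rather than a linear search.

```c
#include <stddef.h>
#include <stdint.h>

struct mapping {
    uint32_t virt_base;   /* first virtual address of the region */
    uint32_t phys_base;   /* physical address it maps to */
    uint32_t size;        /* region size in bytes */
};

/* Assumed mapping, taken from the example in the text. */
static const struct mapping board_map[] = {
    { 0x00000000u, 0x30000000u, 0x02000000u },  /* SDRAM, 32 MB */
    { 0x90000000u, 0x00000000u, 0x01000000u },  /* flash, 16 MB */
};

/* Translate a virtual address; returns 0 on success, -1 if the
 * address is not mapped (where an MMU would raise a fault). */
static int virt_to_phys(uint32_t va, uint32_t *pa)
{
    for (size_t i = 0; i < sizeof board_map / sizeof board_map[0]; i++) {
        const struct mapping *m = &board_map[i];
        if (va >= m->virt_base && va - m->virt_base < m->size) {
            *pa = m->phys_base + (va - m->virt_base);
            return 0;
        }
    }
    return -1;
}
```

Translating the IRQ vector address 0x18 yields 0x3000_0018, which matches the statement that the instruction is fetched from SDRAM, while the flash remains reachable through virtual addresses starting at 0x9000_0000.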

In practical applications, SDRAM may be assigned two separate physical address ranges. An operating system prefers the SDRAM space to be contiguous, which makes memory management convenient and lets the kernel satisfy application requests for large blocks of memory easily. Through the MMU, a discontiguous physical address space can be mapped to a contiguous virtual address space.

The operating system kernel, and other critical code, is generally not meant to be accessed by user applications. The MMU can control access to the address space and so protect this code from being corrupted. This is what distinguishes kernel space from user space.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
