The process virtual address space is an important abstraction in Linux. On a 32-bit system, every running Linux process gets its own 4 GB virtual address space. Processes do not interfere with one another, and a process can access only addresses in its own virtual address space. The virtual address space of a Linux process is divided into two parts, user and kernel: 0 ~ 3G is the user address space of the process, and 3G ~ 4G is the kernel address space. In addition, the process virtual address space enables many advanced Linux techniques, such as memory mapping, deferred allocation, and shared memory. As a result, managing process virtual memory is much more complicated than managing the address space of the kernel part.
We hope to thoroughly understand the virtual address space of the process through the following sections.
1. Introduction to process virtual address space
2. Data structures involved in the process virtual address space
3. Auxiliary operation functions of the process virtual address space
- find_vma
- find_vma_intersection
- get_unmapped_area
- insert_vm_struct
4. Operation functions of the process virtual address space
- Allocate virtual address space: do_mmap
- Release virtual address space: do_munmap
5. Page fault handling
- Faults inside an address range that has been created
- Faults outside the address ranges that have been created
- Demand allocation of physical pages
- Copy on write
- Faults caused by access to the kernel vmalloc area
6. Heap Management
Basic Concepts
Managing the process virtual address space is complicated, largely because of the many technologies, terms, and concepts involved. We therefore start by explaining some of the terms and techniques used in process virtual address management.
Virtual Address
Also known as a linear address. On ia32 and other 32-bit architectures, the virtual address range is 0 ~ 4G, which is divided into two spaces: the user mode address space and the kernel mode address space. A process can only access virtual addresses. The MMU performs address translation through the process page table, which stores the mapping between virtual addresses and physical addresses. When the CPU accesses a virtual address, the MMU automatically translates it into the corresponding physical address and sends the request to that physical address.
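The translation step described above can be modeled in a few lines of user-space C. This is only a sketch: it assumes classic two-level ia32 paging without PAE (10-bit directory index, 10-bit table index, 12-bit offset), and the names toy_pgd_t, toy_pt_t, toy_map_page, and toy_translate, as well as the single "present" flag, are hypothetical simplifications rather than the kernel's or the hardware's real structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT  12
#define OFFSET_MASK 0xFFFu     /* low 12 bits: offset inside the page */
#define ENTRIES     1024       /* 2^10 entries per table level        */
#define PRESENT     0x1u       /* toy "present" flag kept in bit 0    */

/* Toy page table: each entry holds (frame base address | flags). */
typedef struct { uint32_t pte[ENTRIES]; } toy_pt_t;

/* Toy page directory: each slot points at a page table or is NULL. */
typedef struct { toy_pt_t *pt[ENTRIES]; } toy_pgd_t;

static uint32_t dir_index(uint32_t va)   { return va >> 22; }
static uint32_t table_index(uint32_t va) { return (va >> PAGE_SHIFT) & 0x3FFu; }
static uint32_t page_offset(uint32_t va) { return va & OFFSET_MASK; }

/* Install a mapping from the page containing va to the frame at frame_base. */
static void toy_map_page(toy_pgd_t *pgd, toy_pt_t *pt,
                         uint32_t va, uint32_t frame_base)
{
    assert((frame_base & OFFSET_MASK) == 0);   /* frames are page aligned */
    pgd->pt[dir_index(va)] = pt;
    pt->pte[table_index(va)] = frame_base | PRESENT;
}

/* Walk both levels the way an MMU would.
 * Returns 0 and stores the physical address on success, or -1 when no
 * mapping exists (a real CPU would raise a page fault instead). */
static int toy_translate(const toy_pgd_t *pgd, uint32_t va, uint32_t *pa)
{
    const toy_pt_t *pt = pgd->pt[dir_index(va)];
    if (pt == NULL)
        return -1;
    uint32_t entry = pt->pte[table_index(va)];
    if (!(entry & PRESENT))
        return -1;
    *pa = (entry & ~OFFSET_MASK) | page_offset(va);
    return 0;
}
```

With the page at 0x08048000 mapped to the frame at 0x00ABC000, translating 0x08048123 yields 0x00ABC123: the upper 20 bits are replaced by the frame base and the 12-bit offset is carried over unchanged.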
Physical address
Addresses of physical memory units, including the access addresses of RAM, ROM, device registers, and I/O units. Except for DMA operations, the operating system accesses memory, devices, and I/O through virtual addresses. Physical addresses are scattered across the 0 ~ 4G address range, but the kernel maps them into the 3G ~ 4G kernel virtual address space.
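For physical memory that the kernel maps directly, the mapping into the 3G ~ 4G window is a fixed offset. The sketch below models that arithmetic on plain integers, assuming the common 3G/1G split with PAGE_OFFSET at 0xC0000000. The functions toy_pa and toy_va are hypothetical stand-ins for the kernel's __pa() and __va() macros, not their actual definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 3G/1G split: kernel virtual addresses start at 3 GB. */
#define PAGE_OFFSET 0xC0000000u

/* Kernel virtual address -> physical address (direct mapping only). */
static uint32_t toy_pa(uint32_t kvaddr)
{
    assert(kvaddr >= PAGE_OFFSET);      /* not meaningful for user addresses */
    return kvaddr - PAGE_OFFSET;
}

/* Physical address -> kernel virtual address in the direct mapping. */
static uint32_t toy_va(uint32_t paddr)
{
    assert(paddr < 0x40000000u);        /* result must stay below 4 GB */
    return paddr + PAGE_OFFSET;
}
```

For example, physical address 0x00100000 appears at kernel virtual address 0xC0100000. On real ia32 kernels only part of the 1 GB window is a direct mapping of this kind; the rest is set aside for other kernel areas, so this offset rule does not apply to every kernel address.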
Page
Here a page means a physical page frame. For ease of management, physical memory is divided into 4 KB page frames.
User address space
The part of the virtual address space available to user code. Its size depends on how the space is split between user and kernel; generally it is 3 GB. A user process can only access data in the user address space. To reach kernel space, it must enter the kernel through a system call. Because each process has its own page tables for the user address space (there are exceptions: a child process created by fork initially shares the whole address space of its parent), accesses to the user address space of one process do not affect other processes.
Kernel address space
The part of the virtual (linear) address space used by the kernel, generally 1 GB. The page tables for the kernel address space are the same in all processes, that is, every process sees identical kernel space contents. Code running in kernel space has absolute control over the system and can access any system resource. In general (only in general, because the layout also depends on the architecture and configuration), the kernel space is divided as follows.
Anonymous mapping
An anonymous mapping is a mapped area that is not associated with any file on disk. It is commonly used for shared memory between processes.
Demand Paging
Deferred allocation (demand paging) is a dynamic memory allocation technique. When a virtual address range is created, no mapping between virtual and physical addresses is established. That step is postponed until the application actually accesses an address in the range. The access makes the processor raise a page fault, and the kernel then tries to allocate a physical page frame and establish the mapping.
Deferred allocation is absolutely necessary:
1. Although most processes establish mappings for only a small part of their 3G address space, the total across all processes is still far greater than the physical memory of the system.
2. A process may map a huge file for editing while the editing actually touches only part of it. Therefore, there is no need to create page mappings for the entire file.
If you look for it, you will find deferral in every corner of the operating system, such as delayed write-back of files, the file buffer cache, and even the write-back caches of the processor.
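The behavior described in this section can be illustrated with a toy model. It is a sketch under stated assumptions, not kernel code: the region has a fixed number of pages, frames are handed out from a simple counter, and the names toy_region_t, toy_region_init, and toy_access are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES  8      /* pages in the toy virtual region  */
#define NFRAMES 4      /* physical frames available to it  */

typedef struct {
    bool present[NPAGES];   /* is page i backed by a frame?     */
    int  frame[NPAGES];     /* frame number, valid when present */
    int  next_free;         /* next unused frame                */
    int  faults;            /* page faults taken so far         */
} toy_region_t;

/* Creating the region reserves the range but allocates no frame. */
static void toy_region_init(toy_region_t *r)
{
    for (size_t i = 0; i < NPAGES; i++) {
        r->present[i] = false;
        r->frame[i] = -1;
    }
    r->next_free = 0;
    r->faults = 0;
}

/* Access one page. Returns its frame number, or -1 if no frame is left. */
static int toy_access(toy_region_t *r, size_t page)
{
    assert(page < NPAGES);
    if (!r->present[page]) {            /* first touch: page fault path */
        r->faults++;
        if (r->next_free == NFRAMES)
            return -1;                  /* a real kernel would try to reclaim memory */
        r->frame[page] = r->next_free++;
        r->present[page] = true;
    }
    return r->frame[page];
}
```

After toy_region_init no frame is in use. The first toy_access to a page takes one fault and assigns a frame, and later accesses to the same page return that frame without faulting, which is the essence of allocating only what is actually touched.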
Heap
The heap is the memory area from which a process dynamically allocates variables and data. Its management is invisible to programmers: they use the standard malloc interface provided by the C library, and malloc in turn calls the brk system call to expand or shrink the heap.
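To see the mechanism underneath malloc, a program can move the program break itself. The sketch below uses the sbrk() library wrapper around brk and assumes a Linux system with glibc. The helper names toy_heap_grow and toy_heap_shrink are hypothetical, and real programs should keep using malloc, because moving the break behind the allocator's back can interfere with it.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE     /* expose sbrk() in glibc headers */
#endif
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Grow the heap by `bytes` by raising the program break.
 * Returns the start of the new area, or NULL on failure. */
static void *toy_heap_grow(intptr_t bytes)
{
    assert(bytes > 0);
    void *old_break = sbrk(bytes);      /* sbrk returns the previous break */
    return old_break == (void *)-1 ? NULL : old_break;
}

/* Give `bytes` back by lowering the program break again. */
static int toy_heap_shrink(intptr_t bytes)
{
    assert(bytes > 0);
    return sbrk(-bytes) == (void *)-1 ? -1 : 0;
}
```

Calling sbrk(0) before and after toy_heap_grow(4096) shows the break moving up by 4096 bytes, and the returned area is immediately usable as ordinary memory.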
Stack
The stack starts at STACK_TOP. If PF_RANDOMIZE is set, the starting point is lower than STACK_TOP by a random amount. Each architecture defines its own range for this random amount.
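The randomization rule can be sketched as follows. The values are assumptions for illustration: STACK_TOP is placed at 3 GB, the random amount is a whole number of pages limited by a mask of 0x7FF, and the caller supplies the random number. The function toy_randomize_stack_top is a hypothetical model, not the kernel's implementation.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT     12
#define PAGE_SIZE      (1u << PAGE_SHIFT)
#define STACK_TOP      0xC0000000u   /* assumed: stack begins just below 3 GB   */
#define STACK_RND_MASK 0x7FFu        /* assumed: at most 2047 pages of offset   */

/* Return the address where the stack starts.
 * `randomize` plays the role of the PF_RANDOMIZE flag and `rnd` is a
 * random number provided by the caller. */
static uint32_t toy_randomize_stack_top(uint32_t stack_top, int randomize,
                                        uint32_t rnd)
{
    uint32_t offset = 0;

    assert((stack_top & (PAGE_SIZE - 1)) == 0);  /* sketch expects an aligned top */
    if (randomize)
        offset = (rnd & STACK_RND_MASK) << PAGE_SHIFT;  /* whole pages only */
    return stack_top - offset;                   /* the stack grows downward */
}
```

With randomization off the result is STACK_TOP itself. With it on and rnd = 0x12345, the masked value is 0x345 pages, so the stack starts 0x345000 bytes below STACK_TOP and remains page aligned.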
Copy on write
Copy on write is a common technique in operating systems and other software systems: everyone shares a single initial copy, and when a writer needs to change the data, the data is copied for that writer, who then modifies its own new copy.
When a Linux process forks, the initial page table of the child process is copied from the parent process, but the contents of the page frames are not copied (that would take too much memory and be too inefficient). The child process shares these physical page frames with the parent and its siblings. Of course, these shared page frames cannot be modified directly: an attempt to write to one raises a page fault. The page fault handler copies the contents into a new page frame and marks it writable, while the original page frame stays write-protected.
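The fork and write-fault sequence above can be modeled with reference-counted frames. This is a simplified sketch: one page per process, a tiny fixed frame pool, and hypothetical names (toy_pool_t, toy_cow_pte_t, toy_fork_pte, toy_write) that do not correspond to kernel structures. One assumption goes beyond the text: when a write-protected frame has only one remaining user, the sketch simply makes it writable again instead of copying.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define PAGE_SIZE 4096
#define NFRAMES   4

typedef struct {
    unsigned char data[PAGE_SIZE];
    int refcount;                 /* page table entries pointing at this frame */
} toy_frame_t;

typedef struct {
    toy_frame_t frame[NFRAMES];
    int used;                     /* frames handed out so far */
} toy_pool_t;

typedef struct {
    int  frame;                   /* index into the frame pool */
    bool writable;
} toy_cow_pte_t;

static int toy_alloc_frame(toy_pool_t *pool)
{
    assert(pool->used < NFRAMES);
    int f = pool->used++;
    memset(pool->frame[f].data, 0, PAGE_SIZE);
    pool->frame[f].refcount = 1;
    return f;
}

/* fork: the child gets a copy of the parent's entry, and both entries
 * become read-only so that the first write by either side faults. */
static toy_cow_pte_t toy_fork_pte(toy_pool_t *pool, toy_cow_pte_t *parent)
{
    parent->writable = false;
    pool->frame[parent->frame].refcount++;
    return *parent;
}

/* Write one byte through an entry, resolving a write fault if needed. */
static void toy_write(toy_pool_t *pool, toy_cow_pte_t *pte,
                      size_t off, unsigned char val)
{
    assert(off < PAGE_SIZE);
    if (!pte->writable) {                          /* write-protection fault */
        toy_frame_t *old = &pool->frame[pte->frame];
        if (old->refcount > 1) {                   /* still shared: copy it  */
            int f = toy_alloc_frame(pool);
            memcpy(pool->frame[f].data, old->data, PAGE_SIZE);
            old->refcount--;
            pte->frame = f;
        }
        pte->writable = true;                      /* now the sole owner     */
    }
    pool->frame[pte->frame].data[off] = val;
}

static unsigned char toy_read(const toy_pool_t *pool,
                              const toy_cow_pte_t *pte, size_t off)
{
    return pool->frame[pte->frame].data[off];
}
```

If the parent writes 'A', forks, and the child then writes 'B' at the same offset, the child ends up on a new frame holding 'B' while the parent still reads 'A' from the original, still write-protected frame.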
Reverse mapping
The purpose of a virtual memory mapping is, given a virtual address, to find the corresponding physical page frame through the page table. Reverse mapping is the opposite lookup: given a physical page frame, find all the virtual addresses mapped to it. Because the mapping between virtual addresses and physical addresses is many-to-one, additional data structures and methods are needed to implement this reverse lookup.
Reverse mapping was introduced because, when a page is swapped out, every page table entry that maps that physical page must be updated.
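A minimal way to picture the extra bookkeeping is a per-frame list of the page table entries that point at the frame. The sketch below does exactly that with a small fixed array. It is only an illustration with hypothetical names (toy_rpte_t, toy_rmap_t, toy_rmap_add, toy_rmap_unmap_all); the real kernel does not keep a per-page array like this and uses more compact, object-based structures instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_MAPPERS 8

typedef struct {
    int  frame;        /* physical frame this entry maps, -1 if none */
    bool present;
} toy_rpte_t;

/* Reverse map of one frame: every page table entry that points at it. */
typedef struct {
    toy_rpte_t *mapper[MAX_MAPPERS];
    size_t      count;
} toy_rmap_t;

/* Establish the forward mapping and record it in the reverse map. */
static void toy_rmap_add(toy_rmap_t *rmap, toy_rpte_t *pte, int frame)
{
    assert(rmap->count < MAX_MAPPERS);
    pte->frame = frame;
    pte->present = true;
    rmap->mapper[rmap->count++] = pte;
}

/* Swap-out: clear every entry that maps the frame.
 * Returns the number of entries that were updated. */
static size_t toy_rmap_unmap_all(toy_rmap_t *rmap)
{
    size_t n = rmap->count;
    for (size_t i = 0; i < n; i++) {
        rmap->mapper[i]->present = false;
        rmap->mapper[i]->frame = -1;
    }
    rmap->count = 0;
    return n;
}
```

Without the reverse map, finding those entries would mean scanning the page tables of every process. With it, swapping out a frame shared by two processes touches exactly the two entries recorded for that frame and leaves unrelated entries alone.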