Down Tour (20): Process address space (i)

Source: Internet
Author: User

In addition to managing its own memory, the kernel must also manage the process address space: the memory as seen by each user-space process in the system. Linux is a virtual memory operating system, so physical memory is virtualized and shared among all processes. To each process it appears as though it alone has access to the system's memory, and the address space of even a single process can be much larger than the system's physical memory.

The process address space consists of the virtual memory a process can address and, more importantly, the addresses within that range that the kernel allows the process to use. Each process is given a flat 32-bit or 64-bit address space, with the exact size depending on the architecture. "Flat" means that the address space is a single contiguous range of addresses. Some operating systems provide a segmented address space, but modern virtual memory operating systems generally use a flat memory model rather than a segmented one.

A memory address is a value within the address space, such as 4021f000. This particular value identifies a specific byte in a process's 32-bit address space. A process does not have permission to access every address it can name, so what matters more are the intervals of addresses that the process is allowed to access, such as 08048000-0804c000. These intervals of legal addresses are called memory areas (or memory regions). Through the kernel, a process can dynamically add memory areas to its address space and remove them from it.

A process can access only memory addresses that lie in a valid memory area. Each memory area carries access permissions, such as readable, writable, and executable, that the process must respect. If a process accesses an address outside any valid memory area, or accesses a valid address in a way its permissions do not allow, the kernel kills the process with a "Segmentation Fault" message.
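The two failure cases above (address outside every area, or a disallowed kind of access) can be sketched from user space. The following is a minimal illustration, not kernel code: it assumes the textual layout that Linux uses in /proc/<pid>/maps, where each line starts with "start-end perms", and the names struct area, parse_area_line() and area_allows() are made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical user-space view of one memory area: [start, end) plus permissions. */
struct area {
    unsigned long start;  /* first valid address */
    unsigned long end;    /* first address past the area */
    char perms[5];        /* e.g. "r-xp": read, write, execute, private/shared */
};

/* Parse a line such as "08048000-0804c000 r-xp ...". Returns 0 on success, -1 on error. */
int parse_area_line(const char *line, struct area *out)
{
    assert(line != NULL && out != NULL);
    if (sscanf(line, "%lx-%lx %4s", &out->start, &out->end, out->perms) != 3)
        return -1;
    return (out->start < out->end && strlen(out->perms) == 4) ? 0 : -1;
}

/* Would an access of kind 'r', 'w' or 'x' at addr be legal in this area? */
int area_allows(const struct area *a, unsigned long addr, char kind)
{
    assert(a != NULL);
    if (addr < a->start || addr >= a->end)
        return 0;  /* outside the valid range */
    switch (kind) {
    case 'r': return a->perms[0] == 'r';
    case 'w': return a->perms[1] == 'w';
    case 'x': return a->perms[2] == 'x';
    default:  return 0;  /* unknown kind of access */
    }
}
```

On a Linux system, feeding the lines of /proc/self/maps to parse_area_line() lists the memory areas of the running process. A write to an address inside an "r-xp" area is the kind of access that would end in a segmentation fault.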

Memory areas can contain various memory objects, such as:

1. A memory map of the executable file's code, called the text section (code segment).

2. A memory map of the executable file's initialized global variables, called the data section (data segment).

3. A memory map of the zero page containing uninitialized global variables, called the BSS section (the zero page is a page consisting entirely of zeros, which makes it suitable for mapping the BSS).

4. A memory map of the zero page used for the process's user-space stack (not to be confused with the process's kernel stack, which exists separately and is maintained by the kernel).

5. An additional text, data, and BSS section for each shared library, such as the C library and the dynamic linker, loaded into the process's address space.

6. Any memory-mapped files.

7. Any shared memory segments.

8. Any anonymous memory mappings, such as those associated with malloc().

Every valid address in the process address space belongs to exactly one memory area, and memory areas do not overlap. Each distinct chunk of memory in a running process corresponds to its own memory area: the stack, the object code, global variables, a mapped file, and so on.

Memory Descriptor

The kernel represents a process's address space with a data structure called the memory descriptor, struct mm_struct. The fields that matter for this discussion are described below.


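The mm_users field is the number of processes using this address space. For example, if two processes share the address space, mm_users is equal to 2. The mm_count field is the primary reference count of the mm_struct. All of the users counted in mm_users together hold a single reference in mm_count, so as long as mm_users is greater than 0, mm_count is at least 1. When mm_users drops to 0, that reference is released and mm_count is decremented by one. When mm_count finally reaches 0, nothing references the mm_struct any longer and it is destroyed.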

All mm_struct structures are strung together in a doubly linked list through their mmlist field. The first element of the list is the init_mm memory descriptor, which describes the address space of the init process. Access to the list is protected from concurrent access by the mmlist_lock lock. The total number of memory descriptors is stored in the mmlist_nr global variable.

Allocating memory descriptors

The mm field of a process descriptor points to the memory descriptor of that process, so current->mm is the memory descriptor of the current process.

A process obtains its memory descriptor in one of two ways. In the first, the child does not share an address space with its parent, which is the normal result of fork(). In that case the child's mm_struct is allocated from the mm_cachep slab cache by the allocate_mm() macro in kernel/fork.c. In the second, the child is created by calling clone() with the CLONE_VM flag set, and the resulting process is called a thread. When CLONE_VM is specified, allocate_mm() is not called. Instead, copy_mm() simply points the child's mm field at the memory descriptor of its parent. Normally, each process has a unique mm_struct and therefore a unique process address space.
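The choice between the two paths can be modeled in a few lines of user-space C. This is only a sketch under stated assumptions: struct model_mm, struct model_task and model_copy_mm() are hypothetical stand-ins, malloc() takes the place of the slab cache, and copying the parent's memory areas is left out. MODEL_CLONE_VM uses the same numeric value as the real CLONE_VM flag.

```c
#include <assert.h>
#include <stdlib.h>

#define MODEL_CLONE_VM 0x00000100UL  /* same value as CLONE_VM */

/* Hypothetical, heavily reduced stand-ins for mm_struct and task_struct. */
struct model_mm {
    int mm_users;  /* tasks sharing this address space */
};

struct model_task {
    struct model_mm *mm;
};

/* Decide at fork/clone time whether the child shares or gets its own descriptor.
 * Returns 0 on success, -1 if allocation fails. */
int model_copy_mm(unsigned long clone_flags, const struct model_task *parent,
                  struct model_task *child)
{
    assert(parent != NULL && child != NULL && parent->mm != NULL);

    if (clone_flags & MODEL_CLONE_VM) {
        /* Thread: share the parent's descriptor and count one more user. */
        parent->mm->mm_users++;
        child->mm = parent->mm;
        return 0;
    }

    /* Ordinary fork: the child gets a descriptor of its own. The kernel
     * would take it from the mm_cachep slab cache rather than malloc(). */
    struct model_mm *mm = malloc(sizeof *mm);
    if (mm == NULL)
        return -1;
    mm->mm_users = 1;
    child->mm = mm;
    return 0;
}
```

After a call with MODEL_CLONE_VM, parent and child point at the same descriptor with mm_users equal to 2. After a call without it, each has a descriptor of its own.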

Destroying memory descriptors

When a process exits, the kernel calls the exit_mm() function, which performs some housekeeping and updates some statistics. It then calls mmput(), which decrements the memory descriptor's mm_users user count. If the user count reaches 0, mmdrop() is called to decrement the mm_count usage count. If mm_count also reaches 0, the memory descriptor has no remaining users, and the free_mm() macro is invoked to return the mm_struct to the mm_cachep slab cache via kmem_cache_free().
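The two-level counting behind mmput() and mmdrop() can be sketched as follows. This is a simplified user-space model, not the kernel implementation: the model_ names are hypothetical, free() stands in for returning the structure to the slab cache, and the freed_descriptors counter exists only so that the example can be checked.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reduced memory descriptor with the two counters. */
struct model_mm {
    int mm_users;  /* number of tasks using the address space */
    int mm_count;  /* primary reference count; all users together hold one */
};

int freed_descriptors;  /* example-only bookkeeping */

struct model_mm *model_mm_alloc(void)
{
    struct model_mm *mm = malloc(sizeof *mm);
    if (mm != NULL) {
        mm->mm_users = 1;
        mm->mm_count = 1;
    }
    return mm;
}

/* Drop one primary reference; free the descriptor when none remain. */
void model_mmdrop(struct model_mm *mm)
{
    assert(mm != NULL && mm->mm_count > 0);
    if (--mm->mm_count == 0) {
        free(mm);  /* the kernel returns it to the slab cache instead */
        freed_descriptors++;
    }
}

/* Drop one user; the last user gives up the reference held for all users. */
void model_mmput(struct model_mm *mm)
{
    assert(mm != NULL && mm->mm_users > 0);
    if (--mm->mm_users == 0)
        model_mmdrop(mm);
}
```

The separate mm_count lets something other than a user of the address space keep the descriptor alive after the last user has gone. In this model, only the final model_mmdrop() frees it.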

The mm_struct and kernel threads

Kernel threads have no process address space and therefore no associated memory descriptor, so the mm field of a kernel thread's process descriptor is NULL. This is, in effect, what defines a kernel thread: it has no user context. The lack of an address space is not a problem, because kernel threads never access user-space memory.

Whenever a process is scheduled, the address space referenced by its mm field is loaded, and the active_mm field in the process descriptor is updated to refer to the new address space. A kernel thread has no address space, so its mm field is NULL. When a kernel thread is scheduled, the kernel notices that mm is NULL, keeps the previous process's address space loaded, and updates the kernel thread's active_mm field to refer to the previous process's memory descriptor. The kernel thread can then use the previous process's page tables as needed. Because kernel threads do not access user-space memory, they use only the part of the address space that maps kernel memory, which is the same for all processes.
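The borrowing rule can be written down as a small model of what happens to active_mm on a context switch. This is a sketch only: struct model_task and model_switch_mm() are hypothetical names, and the real scheduler does considerably more (reference counting, switching page tables).

```c
#include <assert.h>
#include <stddef.h>

struct model_mm {
    int id;  /* example-only tag to tell descriptors apart */
};

struct model_task {
    struct model_mm *mm;         /* NULL for a kernel thread */
    struct model_mm *active_mm;  /* address space actually in use while running */
};

/* Update active_mm when the scheduler switches from prev to next. */
void model_switch_mm(const struct model_task *prev, struct model_task *next)
{
    assert(prev != NULL && next != NULL);
    if (next->mm == NULL)
        next->active_mm = prev->active_mm;  /* kernel thread borrows the previous space */
    else
        next->active_mm = next->mm;         /* ordinary process uses its own */
}
```

Switching from a user process to a kernel thread leaves the kernel thread's mm as NULL while its active_mm refers to the user process's descriptor. Switching on to another user process replaces active_mm with that process's own mm.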

Memory Area

A memory area is described by the vm_area_struct structure and is often called a virtual memory area, or VMA, in the kernel. A vm_area_struct describes a single memory area over a contiguous interval in a given address space. The kernel treats each memory area as a unique memory object with a consistent set of properties, such as permissions, and a consistent set of associated operations. As with the VFS, this object-oriented approach lets the VMA structure represent many kinds of memory areas, for example a memory-mapped file or the process's user-space stack. The fields that matter for this discussion are described below.


The vm_mm field points to the mm_struct that the VMA belongs to. Each VMA is unique to the mm_struct with which it is associated. Therefore, even if two separate processes map the same file into their respective address spaces, each has its own vm_area_struct to identify its memory area. Conversely, if two processes share an address space, they also share all of the vm_area_struct structures in it.

VMA flags

The VMA flags are bit flags stored in the vm_flags field. They specify the behavior of, and provide information about, the pages contained in the memory area. Unlike permissions attached to a specific physical page, the VMA flags describe behavior for which the kernel is responsible, not the hardware. They also carry information about each page in the memory area, or about the memory area as a whole. Common flags include VM_READ, VM_WRITE, and VM_EXEC, which give the permissions of the pages in the area, and VM_SHARED, which indicates whether the mapping is shared.
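Because the flags are plain bits, checking whether an access is permitted comes down to a mask test. The sketch below defines the four basic flags with the values they have in the kernel's <linux/mm.h>. The helper model_access_permitted() is a hypothetical name for this example, not a kernel function.

```c
#include <assert.h>

/* The four basic flags, with the values used in <linux/mm.h>. */
#define VM_READ   0x00000001UL
#define VM_WRITE  0x00000002UL
#define VM_EXEC   0x00000004UL
#define VM_SHARED 0x00000008UL

/* Returns 1 if every kind of access in `wanted` is allowed by `vm_flags`. */
int model_access_permitted(unsigned long vm_flags, unsigned long wanted)
{
    /* Only the three permission bits make sense as a requested access. */
    assert((wanted & ~(VM_READ | VM_WRITE | VM_EXEC)) == 0);
    return (vm_flags & wanted) == wanted;
}
```

For example, an area holding object code would typically carry VM_READ | VM_EXEC, so a write request fails the test, while a data area with VM_READ | VM_WRITE rejects execution.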


VMA operations

The vm_ops field in the vm_area_struct structure points to the table of operations associated with a given memory area, which the kernel can invoke to manipulate the VMA. The vm_area_struct acts as a generic object for representing any type of memory area, and the operations table describes the specific methods that apply to a particular instance of the object. The operations table is represented by the vm_operations_struct structure, whose methods are as follows:

void open(struct vm_area_struct *area)

This function is called when the given memory area is added to an address space.

void close(struct vm_area_struct *area)

This function is called when the given memory area is removed from an address space.

struct page *nopage(struct vm_area_struct *area, unsigned long address, int unused)

This function is called by the page fault handler when a page that is not present in physical memory is accessed.

int populate(struct vm_area_struct *area, unsigned long address, unsigned long len, pgprot_t prot, unsigned long pgoff, int nonblock)

This function is called by the remap_file_pages() system call to prefault a new mapping, so that the page faults that would otherwise occur on first access are avoided.
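The idea of a generic object with a per-instance method table can be illustrated outside the kernel. The following sketch is hypothetical throughout: struct model_vma and struct model_vm_ops only mimic the shape of vm_area_struct and vm_operations_struct, and just the open and close methods are modeled.

```c
#include <assert.h>
#include <stddef.h>

struct model_vma;

/* Per-type method table, in the spirit of vm_operations_struct. */
struct model_vm_ops {
    void (*open)(struct model_vma *area);
    void (*close)(struct model_vma *area);
};

struct model_vma {
    unsigned long vm_start;
    unsigned long vm_end;
    const struct model_vm_ops *vm_ops;  /* may be NULL: no special handling */
    int open_count;                     /* example-only bookkeeping */
};

/* One concrete set of methods; a file mapping might supply others. */
void counting_open(struct model_vma *area)  { area->open_count++; }
void counting_close(struct model_vma *area) { area->open_count--; }

const struct model_vm_ops counting_ops = {
    .open  = counting_open,
    .close = counting_close,
};

/* Generic code dispatches through the table without knowing the area's type. */
void model_add_area(struct model_vma *area)
{
    assert(area != NULL);
    if (area->vm_ops != NULL && area->vm_ops->open != NULL)
        area->vm_ops->open(area);
}

void model_remove_area(struct model_vma *area)
{
    assert(area != NULL);
    if (area->vm_ops != NULL && area->vm_ops->close != NULL)
        area->vm_ops->close(area);
}
```

An area whose vm_ops is NULL passes through model_add_area() and model_remove_area() untouched, which is how a generic caller can treat every area the same way while only some areas supply their own methods.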

The list and the tree of memory areas

Memory areas can be reached through either the mmap field or the mm_rb field of the memory descriptor. Each of the two data structures independently points to all of the memory area objects associated with the memory descriptor. They contain pointers to exactly the same vm_area_struct structures and differ only in how they are organized. The mmap field links all the memory area objects in a singly linked list, while the mm_rb field links them in a red-black tree.
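A lookup over the list form might look like the sketch below, written in the spirit of the kernel's find_vma(): it returns the first area whose end lies above the given address. It assumes the list is sorted by ascending address, and struct model_vma, model_find_vma() and model_addr_is_mapped() are hypothetical names for this example. The list costs time proportional to the number of areas per lookup, which is why a tree over the same structures is worth keeping for searches.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced VMA: an interval [vm_start, vm_end) and a list link. */
struct model_vma {
    unsigned long vm_start;     /* first address in the area */
    unsigned long vm_end;       /* first address after the area */
    struct model_vma *vm_next;  /* next area, assumed sorted by ascending address */
};

/* Return the first area whose vm_end is greater than addr, or NULL if none. */
struct model_vma *model_find_vma(struct model_vma *mmap, unsigned long addr)
{
    for (struct model_vma *vma = mmap; vma != NULL; vma = vma->vm_next) {
        assert(vma->vm_start < vma->vm_end);
        if (addr < vma->vm_end)
            return vma;
    }
    return NULL;
}

/* The returned area may start above addr, so a mapped address needs one more check. */
int model_addr_is_mapped(struct model_vma *mmap, unsigned long addr)
{
    struct model_vma *vma = model_find_vma(mmap, addr);
    return vma != NULL && addr >= vma->vm_start;
}
```

Note that model_find_vma() can return an area that does not contain the address, namely the next area above it. Callers that need containment compare against vm_start, as model_addr_is_mapped() does.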

Each memory area associated with a process corresponds to one vm_area_struct structure. A process also differs from a thread in this respect: the task_struct of a process refers to an mm_struct of its own, whereas threads share one.


Reference from: "Linux Kernel Development".
