Mastering Linux Kernel Design (13): The Process Address Space in Memory Management

Source: Internet
Author: User

"Copyright notice: please respect the original work. When reproducing it, retain the source: blog.csdn.net/shallnet. This article is for learning and exchange only; do not use it for commercial purposes."

The process address space consists of the virtual memory that a process can address. On Linux, the virtual address space spans 0 to 4 GB (note: this section uses a 32-bit system as the example throughout). The Linux kernel divides this 4 GB space into two parts. The highest 1 GB (virtual addresses 0xC0000000 to 0xFFFFFFFF) is used by the kernel and is called "kernel space". The lower 3 GB (virtual addresses 0x00000000 to 0xBFFFFFFF) is used by each process and is called "user space". Because every process can enter the kernel through system calls, kernel space is shared by all processes in the system. From the perspective of any single process, therefore, each process has a virtual address space of 4 GB.
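As a minimal sketch of the 3 GB / 1 GB split described above, the following C helpers classify a 32-bit virtual address as belonging to user space or kernel space. The boundary value 0xC0000000 comes from the text; the names KERNEL_SPACE_START, is_kernel_address(), is_user_address() and user_space_size() are illustrative assumptions, not kernel identifiers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed name for the user/kernel boundary of the 32-bit layout in the text. */
#define KERNEL_SPACE_START 0xC0000000u

/* True if addr lies in the top 1 GB (0xC0000000..0xFFFFFFFF). */
bool is_kernel_address(uint32_t addr)
{
    return addr >= KERNEL_SPACE_START;
}

/* True if addr lies in the lower 3 GB (0x00000000..0xBFFFFFFF). */
bool is_user_address(uint32_t addr)
{
    return addr < KERNEL_SPACE_START;
}

/* Size of user space in bytes: everything below the boundary. */
uint32_t user_space_size(void)
{
    return KERNEL_SPACE_START;
}
```

For example, is_user_address(0xBFFFFFFF) holds for the last byte of user space, while the next address, 0xC0000000, is the first kernel address.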

Although a process can address 4 GB of virtual memory, that does not mean it may access every address in that space. Virtual memory must be mapped to physical storage (memory or disk space) before it can actually be used. A process may only access legal addresses; if it accesses an illegal address, the kernel terminates the process and reports a "segmentation fault". So which parts of virtual memory are legal? Let's look at how the process virtual address space is laid out:

The stack sits at the top of the virtual address space, and the data segment and code segment sit at the bottom. The empty part in between is space that can be allocated dynamically while the process runs, including mappings of kernel address space content, dynamically requested address space, the code or data of shared libraries, and so on. Within the virtual address space, only addresses that are mapped to physical storage are legal. Each legal piece of the address space corresponds to a separate virtual memory area (VMA), and the address space of a process is made up of these memory areas.
Linux uses a fairly complex set of data structures to track the virtual addresses of a process. The process address space is represented by a memory descriptor, the mm_struct structure, which is defined in <include/linux/mm_types.h>:
    struct mm_struct {
        struct vm_area_struct * mmap;       /* list of VMAs */
        struct rb_root mm_rb;
        struct vm_area_struct * mmap_cache; /* last find_vma result */
        unsigned long (*get_unmapped_area) (struct file *filp,
                    unsigned long addr, unsigned long len,
                    unsigned long pgoff, unsigned long flags);
        void (*unmap_area) (struct mm_struct *mm, unsigned long addr);
        unsigned long mmap_base;            /* base of mmap area */
        unsigned long task_size;            /* size of task vm space */
        unsigned long cached_hole_size;     /* if non-zero, the largest hole below free_area_cache */
        unsigned long free_area_cache;      /* first hole of size cached_hole_size or larger */
        pgd_t * pgd;
        atomic_t mm_users;                  /* How many users with user space? */
        atomic_t mm_count;                  /* How many references to "struct mm_struct" (users count as 1) */
        int map_count;                      /* number of VMAs */
        struct rw_semaphore mmap_sem;
        spinlock_t page_table_lock;         /* Protects page tables and some counters */

        struct list_head mmlist;            /* List of maybe swapped mm's. These are globally strung
                                             * together off init_mm.mmlist, and are protected
                                             * by mmlist_lock
                                             */

        /* Special counters, in some configurations protected by the
         * page_table_lock, in other configurations by being atomic.
         */
        mm_counter_t _file_rss;
        mm_counter_t _anon_rss;

        unsigned long hiwater_rss;          /* High-watermark of RSS usage */
        unsigned long hiwater_vm;           /* High-water virtual memory usage */

        unsigned long total_vm, locked_vm, shared_vm, exec_vm;
        unsigned long stack_vm, reserved_vm, def_flags, nr_ptes;
        unsigned long start_code, end_code, start_data, end_data;
        unsigned long start_brk, brk, start_stack;
        unsigned long arg_start, arg_end, env_start, env_end;

        unsigned long saved_auxv[AT_VECTOR_SIZE]; /* for /proc/PID/auxv */

        struct linux_binfmt *binfmt;

        cpumask_t cpu_vm_mask;

        /* Architecture-specific MM context */
        mm_context_t context;

        /* Swap token stuff */
        /*
         * Last value of the global fault stamp as seen by this process.
         * In other words, this value gives an indication of how long
         * it has been since this task got the token.
         * Look at mm/thrash.c
         */
        unsigned int faultstamp;
        unsigned int token_priority;
        unsigned int last_interval;

        unsigned long flags; /* Must use atomic bitops to access the bits */

        struct core_state *core_state; /* coredumping support */
    #ifdef CONFIG_AIO
        spinlock_t ioctx_lock;
        struct hlist_head ioctx_list;
    #endif
    #ifdef CONFIG_MM_OWNER
        /*
         * "owner" points to a task that is regarded as the canonical
         * user/owner of this mm. All of the following must be true in
         * order for it to be changed:
         *
         * current == mm->owner
         * current->mm != mm
         * new_owner->mm == mm
         * new_owner->alloc_lock is held
         */
        struct task_struct *owner;
    #endif

    #ifdef CONFIG_PROC_FS
        /* store ref to file /proc/<pid>/exe symlink points to */
        struct file *exe_file;
        unsigned long num_exe_file_vmas;
    #endif
    #ifdef CONFIG_MMU_NOTIFIER
        struct mmu_notifier_mm *mmu_notifier_mm;
    #endif
    };

The first member of this struct, mmap, points to the list of memory areas. Each memory area is represented by a struct vm_area_struct:
    /*
     * This struct defines a memory VMM memory area. There is one of these
     * per VM-area/task. A VM area is any part of the process virtual memory
     * space that has a special rule for the page-fault handlers (ie a shared
     * library, the executable area etc).
     */
    struct vm_area_struct {
        struct mm_struct * vm_mm;       /* The address space we belong to. */
        unsigned long vm_start;         /* Our start address within vm_mm. */
        unsigned long vm_end;           /* The first byte after our end address
                                           within vm_mm. */

        /* linked list of VM areas per task, sorted by address */
        struct vm_area_struct *vm_next;

        pgprot_t vm_page_prot;          /* Access permissions of this VMA. */
        unsigned long vm_flags;         /* Flags, see mm.h. */

        struct rb_node vm_rb;

        /*
         * For areas with an address space and backing store,
         * linkage into the address_space->i_mmap prio tree, or
         * linkage to the list of like vmas hanging off its node, or
         * linkage of vma in the address_space->i_mmap_nonlinear list.
         */
        union {
            struct {
                struct list_head list;
                void *parent;   /* aligns with prio_tree_node parent */
                struct vm_area_struct *head;
            } vm_set;

            struct raw_prio_tree_node prio_tree_node;
        } shared;

        /*
         * A file's MAP_PRIVATE vma can be in both i_mmap tree and anon_vma
         * list, after a COW of one of the file pages. A MAP_SHARED vma
         * can only be in the i_mmap tree. An anonymous MAP_PRIVATE, stack
         * or brk vma (with NULL file) can only be in an anon_vma list.
         */
        struct list_head anon_vma_node; /* Serialized by anon_vma->lock */
        struct anon_vma *anon_vma;      /* Serialized by page_table_lock */

        /* Function pointers to deal with this struct. */
        const struct vm_operations_struct *vm_ops;

        /* Information about our backing store: */
        unsigned long vm_pgoff;         /* Offset (within vm_file) in PAGE_SIZE
                                           units, *not* PAGE_CACHE_SIZE */
        struct file * vm_file;          /* File we map to (can be NULL). */
        void * vm_private_data;         /* was vm_pte (shared mem) */
        unsigned long vm_truncate_count;/* truncate_count or restart_addr */

    #ifndef CONFIG_MMU
        struct vm_region *vm_region;    /* NOMMU mapping region */
    #endif
    #ifdef CONFIG_NUMA
        struct mempolicy *vm_policy;    /* NUMA policy for the VMA */
    #endif
    };

The vm_area_struct structure describes a single contiguous interval of the process address space. Every memory area is represented by one such struct, and the structs are linked through vm_next into a list sorted by address. In addition to the linked list, Linux also organizes the vm_area_struct structures in the red-black tree mm_rb. With this tree, Linux can quickly find the memory area that contains a given virtual address.
The members vm_start and vm_end mark the start of the memory interval and the first byte after its end, so vm_end - vm_start is the length of the interval.
The member vm_mm points to the process address space structure (mm_struct) that the area belongs to. So when two different processes map the same file into their own address spaces, each has its own vm_area_struct to identify its memory area. Two threads that share an address space are described by a single vm_area_struct, because they use the same process address space.
vm_flags describes the behavior and properties of the pages contained in the memory area; it reflects the rules the kernel must follow when handling those pages.
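The three fields just discussed can be modelled in user space. The sketch below treats vm_end as the first address past the area, as the comment in the struct says, and checks permission bits in vm_flags. The struct vma_model and the helper names are hypothetical; the flag values are assumed to mirror VM_READ, VM_WRITE and VM_EXEC from <include/linux/mm.h>.

```c
#include <stdbool.h>

/* Assumed to mirror VM_READ, VM_WRITE and VM_EXEC in <include/linux/mm.h>. */
#define MODEL_VM_READ  0x1UL
#define MODEL_VM_WRITE 0x2UL
#define MODEL_VM_EXEC  0x4UL

/* Hypothetical user-space stand-in for three vm_area_struct fields. */
struct vma_model {
    unsigned long vm_start; /* first address inside the area */
    unsigned long vm_end;   /* first address after the area */
    unsigned long vm_flags; /* permission bits */
};

/* Length of the area in bytes. */
unsigned long vma_length(const struct vma_model *vma)
{
    return vma->vm_end - vma->vm_start;
}

/* True if addr falls inside the half-open interval [vm_start, vm_end). */
bool vma_contains(const struct vma_model *vma, unsigned long addr)
{
    return addr >= vma->vm_start && addr < vma->vm_end;
}

/* True if every bit in `wanted` is set in vm_flags. */
bool vma_allows(const struct vma_model *vma, unsigned long wanted)
{
    return (vma->vm_flags & wanted) == wanted;
}
```

With vm_start = 0x00209000 and vm_end = 0x00399000, vma_length() gives 0x190000 bytes, that is 1600 KB, and the address 0x00399000 itself is already outside the area.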
You can use the cat /proc/<pid>/maps command or the pmap command to view the address space of a given process and the memory areas it contains. Take the process with PID 17192 on the author's system as an example.
    # cat /proc/17192/maps     // displays all memory areas in the process address space
    001e3000-00201000 r-xp 00000000 fd:00 789547     /lib/ld-2.12.so
    00201000-00202000 r--p 0001d000 fd:00 789547     /lib/ld-2.12.so
    00202000-00203000 rw-p 0001e000 fd:00 789547     /lib/ld-2.12.so
    00209000-00399000 r-xp 00000000 fd:00 789548     /lib/libc-2.12.so
    00399000-0039a000 ---p 00190000 fd:00 789548     /lib/libc-2.12.so
    0039a000-0039c000 r--p 00190000 fd:00 789548     /lib/libc-2.12.so
    0039c000-0039d000 rw-p 00192000 fd:00 789548     /lib/libc-2.12.so
    0039d000-003a0000 rw-p 00000000 00:00 0
    08048000-08049000 r-xp 00000000 fd:00 1191771    /home/allen/myprojects/blog/conn_user_kernel/test/a.out
    08049000-0804a000 rw-p 00000000 fd:00 1191771    /home/allen/myprojects/blog/conn_user_kernel/test/a.out
    b7755000-b7756000 rw-p 00000000 00:00 0
    b776d000-b776e000 rw-p 00000000 00:00 0
    b776e000-b776f000 r-xp 00000000 00:00 0          [vdso]
    bfc9f000-bfcb4000 rw-p 00000000 00:00 0

    # pmap 17192
    17192:   ./a.out
    001e3000    120K r-x--  /lib/ld-2.12.so    // this line and the next two: code segment, data segment, and BSS segment of the dynamic linker ld.so
    00201000      4K r----  /lib/ld-2.12.so
    00202000      4K rw---  /lib/ld-2.12.so
    00209000   1600K r-x--  /lib/libc-2.12.so    // this line and the following ones: code segment, data segment, and BSS segment of the C library
    00399000      4K -----  /lib/libc-2.12.so
    0039a000      8K r----  /lib/libc-2.12.so
    0039c000      4K rw---  /lib/libc-2.12.so
    0039d000     12K rw---    [anon]
    08048000      4K r-x--  /home/allen/myprojects/blog/conn_user_kernel/test/a.out    // code segment of the executable
    08049000      4K rw---  /home/allen/myprojects/blog/conn_user_kernel/test/a.out    // data segment of the executable
    b7755000      4K rw---    [anon]
    b776d000      4K rw---    [anon]
    b776e000      4K r-x--    [anon]
    bfc9f000     84K rw---    [stack]    // stack segment
     total     1860K

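Each line of /proc/<pid>/maps has the columns address range, permissions, file offset, device, inode and (optionally) path. As a rough sketch of reading that format from a program, the following C function splits one line into its fields with sscanf(). The struct maps_entry and the function parse_maps_line() are hypothetical names introduced for illustration, and the fixed buffer sizes are assumptions; a path containing spaces would be cut at the first space.

```c
#include <stdio.h>

/* Hypothetical container for the columns of one /proc/<pid>/maps line. */
struct maps_entry {
    unsigned long start;   /* start address of the area */
    unsigned long end;     /* first address after the area */
    char perms[5];         /* e.g. "r-xp" */
    unsigned long offset;  /* offset into the mapped file, in bytes */
    char dev[16];          /* major:minor, e.g. "fd:00" */
    unsigned long inode;   /* 0 for anonymous mappings */
    char path[256];        /* empty string when no path is shown */
};

/* Returns 0 on success, -1 if the line does not match the expected layout. */
int parse_maps_line(const char *line, struct maps_entry *out)
{
    int fields;

    out->path[0] = '\0';
    fields = sscanf(line, "%lx-%lx %4s %lx %15s %lu %255s",
                    &out->start, &out->end, out->perms,
                    &out->offset, out->dev, &out->inode, out->path);
    /* The path column is optional, so 6 or 7 converted fields are both valid. */
    return (fields >= 6) ? 0 : -1;
}
```

Feeding it the a.out code segment line from the listing yields start = 0x08048000 and end - start = 4096, which matches the 4K that pmap reports for the same area.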
The vm_ops field of vm_area_struct points to the table of operations associated with the memory area; the kernel uses the methods in this table to operate on the VMA. The table is represented by the vm_operations_struct structure, defined in <include/linux/mm.h>:
    /*
     * These are the virtual MM functions - opening of an area, closing and
     * unmapping it (needed to keep files on disk up-to-date etc), pointer
     * to the functions called when a no-page or a wp-page exception occurs.
     */
    struct vm_operations_struct {
        /* called when the given memory area is added to an address space */
        void (*open)(struct vm_area_struct * area);
        /* called when the given memory area is removed from an address space */
        void (*close)(struct vm_area_struct * area);
        /* called by the page fault handler when a page that is not present
         * in physical memory is accessed */
        int (*fault)(struct vm_area_struct *vma, struct vm_fault *vmf);

        /* notification that a previously read-only page is about to become
         * writable, if an error is returned it will cause a SIGBUS */
        int (*page_mkwrite)(struct vm_area_struct *vma, struct vm_fault *vmf);

        /* called by access_process_vm when get_user_pages() fails, typically
         * for use by special VMAs that can switch between memory and hardware
         */
        int (*access)(struct vm_area_struct *vma, unsigned long addr,
                      void *buf, int len, int write);
    #ifdef CONFIG_NUMA
        ...
    #endif
    };

To find the memory area that a given address belongs to, the kernel provides the find_vma() function, defined in <mm/mmap.c>:

    /* Look up the first VMA which satisfies  addr < vm_end,  NULL if none. */
    struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
    {
        struct vm_area_struct *vma = NULL;

        if (mm) {
            /* First check the most recently used memory area to see
             * whether the cached VMA contains the required address. */
            /* (The cache hit rate is close to 35%.) */
            vma = mm->mmap_cache;
            /* If the cached VMA does not contain the address,
             * search the red-black tree. */
            if (!(vma && vma->vm_end > addr && vma->vm_start <= addr)) {
                struct rb_node * rb_node;

                rb_node = mm->mm_rb.rb_node;
                vma = NULL;

                while (rb_node) {
                    struct vm_area_struct * vma_tmp;

                    vma_tmp = rb_entry(rb_node,
                            struct vm_area_struct, vm_rb);

                    if (vma_tmp->vm_end > addr) {
                        vma = vma_tmp;
                        if (vma_tmp->vm_start <= addr)
                            break;
                        rb_node = rb_node->rb_left;
                    } else
                        rb_node = rb_node->rb_right;
                }
                if (vma)
                    mm->mmap_cache = vma;
            }
        }
        return vma;
    }

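The search in find_vma() can be modelled in user space. The sketch below is not the kernel's implementation: it swaps the red-black tree for a binary search over a sorted array and omits the mmap_cache shortcut, but it keeps the same contract of returning the first area whose vm_end is greater than addr. The names struct vma_range and find_range() are hypothetical.

```c
#include <stddef.h>

/* Hypothetical stand-in for the two fields the lookup needs. */
struct vma_range {
    unsigned long vm_start; /* inclusive */
    unsigned long vm_end;   /* exclusive */
};

/*
 * Returns the first range with addr < vm_end, or NULL if there is none.
 * `areas` must be sorted by address and must not overlap.
 */
const struct vma_range *find_range(const struct vma_range *areas, size_t count,
                                   unsigned long addr)
{
    const struct vma_range *best = NULL;
    size_t lo = 0, hi = count;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (areas[mid].vm_end > addr) {
            best = &areas[mid];     /* candidate; an earlier one may exist */
            if (areas[mid].vm_start <= addr)
                break;              /* addr lies inside this range */
            hi = mid;               /* continue in the lower half */
        } else {
            lo = mid + 1;           /* this range ends at or below addr */
        }
    }
    return best;
}
```

As with find_vma(), a non-NULL result does not guarantee that addr is inside the returned area: when addr falls in a hole, the next area above it is returned, so a caller still has to compare addr with vm_start.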
When a program image starts executing, the executable image must be loaded into the virtual address space of the process. If the process uses any shared libraries, they must be loaded into its virtual address space as well. Linux does not actually load the image into physical memory at this point; the executable file is merely linked into the virtual address space of the process. As the program runs, the parts of it that are referenced are loaded into physical memory by the operating system. This way of linking an image into the process address space is called "memory mapping".
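The same idea is available to user programs through the mmap() system call. As a hedged illustration (a user-space sketch, not what the kernel does when loading an executable), the function below maps a file read-only and sums its bytes; the function name sum_file_via_mmap() is an assumption made for this example.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Maps the whole file read-only and returns the sum of its bytes,
 * or -1 on error (including an empty file, which cannot be mapped).
 */
long sum_file_via_mmap(const char *path)
{
    struct stat st;
    long sum = 0;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }

    /* This only sets up the mapping; file data is not copied here. */
    unsigned char *data = mmap(NULL, (size_t)st.st_size, PROT_READ,
                               MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (data == MAP_FAILED)
        return -1;

    /* Reading through the mapping is what brings pages into memory. */
    for (off_t i = 0; i < st.st_size; i++)
        sum += data[i];

    munmap(data, (size_t)st.st_size);
    return sum;
}
```

While the mapping exists, it appears as an additional line in /proc/<pid>/maps for the calling process, with the mapped file in the path column.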
When an executable image is mapped into the virtual address space of a process, a set of vm_area_struct structures is created to describe the start and end of each virtual memory interval. Each vm_area_struct represents one part of the executable image: it may be executable code, initialized variables, or uninitialized data. All of this is done in the function do_mmap(). As the vm_area_struct structures are created, Linux also initializes the standard operations for the virtual memory intervals they describe.
    static inline unsigned long do_mmap(struct file *file, unsigned long addr,
        unsigned long len, unsigned long prot,
        unsigned long flag, unsigned long offset)
    {
        unsigned long ret = -EINVAL;
        if ((offset + PAGE_ALIGN(len)) < offset)
            goto out;
        if (!(offset & ~PAGE_MASK))
            ret = do_mmap_pgoff(file, addr, len, prot, flag, offset >> PAGE_SHIFT);
    out:
        return ret;
    }

do_mmap() adds a new address range to the address space of the process. It is defined in <include/linux/mm.h>.
The parameters of the function have the following meanings:
file: the file to be mapped.
offset: the offset within the file. A file does not have to be mapped in its entirety; when only part of it is mapped, offset gives the starting position of that part.
len: the length of the portion of the file to be mapped.
addr: an address in the virtual address space; the search for a free virtual area starts from this address.
prot: the access permissions for the pages contained in this virtual area. Possible flags are PROT_READ, PROT_WRITE, PROT_EXEC and PROT_NONE. The first three have the same meaning as the flags VM_READ, VM_WRITE and VM_EXEC. PROT_NONE means that the process has none of these three access rights.
flag: additional flags for the virtual area.
do_mmap() calls do_mmap_pgoff(), which does the main work of memory mapping. That function is fairly long; see the <mm/mmap.c> file for the detailed implementation.
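Before handing over to do_mmap_pgoff(), do_mmap() performs two checks on offset and len and converts the byte offset into a page offset. The following user-space sketch reproduces just that arithmetic so it can be tried out in isolation. The MODEL_ macros and the function check_mmap_offset() are hypothetical names, and the 4 KB page size is an assumption that matches 32-bit x86.

```c
#include <errno.h>

/* Assumed 4 KB pages; these mirror the roles of the kernel's page macros. */
#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_SIZE  (1UL << MODEL_PAGE_SHIFT)
#define MODEL_PAGE_MASK  (~(MODEL_PAGE_SIZE - 1))
#define MODEL_PAGE_ALIGN(x) (((x) + MODEL_PAGE_SIZE - 1) & MODEL_PAGE_MASK)

/*
 * Returns 0 and stores the page offset in *pgoff when the arguments pass
 * both checks, or -EINVAL otherwise.
 */
int check_mmap_offset(unsigned long len, unsigned long offset,
                      unsigned long *pgoff)
{
    /* Reject ranges whose end would wrap around the unsigned type. */
    if (offset + MODEL_PAGE_ALIGN(len) < offset)
        return -EINVAL;
    /* The file offset must start on a page boundary. */
    if (offset & ~MODEL_PAGE_MASK)
        return -EINVAL;
    *pgoff = offset >> MODEL_PAGE_SHIFT;
    return 0;
}
```

For instance, a byte offset of 0x190000 passes both checks and becomes page offset 0x190, whereas an offset of 0x1001 is rejected because it is not a multiple of the page size.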
Mapping a file into virtual memory only establishes the mapping relationship; the mapping between virtual pages and physical pages has not yet been set up. When an executable image is mapped into the virtual memory of a process and begins to execute, only a very small part of the virtual memory interval has been loaded into physical memory, so the data being accessed is quite likely not in physical memory. At that point the processor reports a page fault, together with its cause, to Linux, and the kernel must load the page into physical memory from the disk image or from the swap file (if the page was swapped out). This is the demand paging mechanism.
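To load the right data on a page fault in a file-backed area, the kernel has to work out which page of the file backs the faulting address. A minimal sketch of that arithmetic follows, based on the vm_start and vm_pgoff fields of vm_area_struct, where vm_pgoff is the offset of the area within its file in page units. The function names, the DEMO_ macros and the 4 KB page size are assumptions for illustration.

```c
#define DEMO_PAGE_SHIFT 12                               /* assumed 4 KB pages */
#define DEMO_PAGE_MASK  (~((1UL << DEMO_PAGE_SHIFT) - 1))

/*
 * vm_start: start address of the area that contains fault_addr
 * vm_pgoff: offset of the area within its file, in pages
 * Returns the index of the file page that backs fault_addr.
 */
unsigned long fault_page_index(unsigned long fault_addr,
                               unsigned long vm_start,
                               unsigned long vm_pgoff)
{
    /* Round the faulting address down to the start of its page. */
    unsigned long page_addr = fault_addr & DEMO_PAGE_MASK;

    return ((page_addr - vm_start) >> DEMO_PAGE_SHIFT) + vm_pgoff;
}

/* Byte offset in the file at which that page begins. */
unsigned long fault_file_offset(unsigned long fault_addr,
                                unsigned long vm_start,
                                unsigned long vm_pgoff)
{
    return fault_page_index(fault_addr, vm_start, vm_pgoff) << DEMO_PAGE_SHIFT;
}
```

Taking the libc area 0039a000-0039c000 with file offset 00190000 from the maps listing (vm_pgoff = 0x190), a fault at address 0x0039b123 falls in the second page of the area, so it is backed by file page 0x191, which starts at byte offset 0x191000.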

Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
