Kernel space and user space

Source: Internet
Author: User
Tags: data structures

A. Introduction (32-bit system)

Linux simplifies the segmentation mechanism so that virtual addresses (logical addresses) and linear addresses are always the same. The virtual address space of a 32-bit Linux system is therefore also 0 to 4 GB (2^32 bytes).

The Linux kernel divides this 4 GB space into two parts. The highest 1 GB (virtual addresses 0xC0000000 to 0xFFFFFFFF) is used by the kernel and is called "kernel space". The lower 3 GB (virtual addresses 0x00000000 to 0xBFFFFFFF) is used by each process and is called "user space". Because every process can enter the kernel through system calls, the kernel is shared by all processes in the system. From the point of view of a single process, therefore, each process has a virtual space of 4 GB.

* The process address space is 0 to 4 GB
* In user mode a process can access only 0 to 3 GB; the 3 GB to 4 GB range can be accessed only in kernel mode
* A process enters kernel mode through a system call
* The 3 GB to 4 GB portion of every process's virtual space is the same
* Switching from user mode to kernel mode does not change CR3, but it does switch the stack

Mapping the kernel's virtual space (logical addresses) to physical memory:

When kernel module code or a kernel thread accesses memory, the addresses in the code are logical addresses. Each must be mapped one-to-one to a real physical address: logical address 0xC0000003 corresponds to physical address 0x3, 0xC0000004 corresponds to 0x4, and so on. The relationship is:

Physical address = logical address - 0xC0000000

Logical address    Physical address
0xC0000000         0x00000000
0xC0000001         0x00000001
0xC0000002         0x00000002
0xC0000003         0x00000003
...                ...
0xE0000000         0x20000000
...                ...
0xFFFFFFFF         0x3FFFFFFF

For kernel space the address mapping is a very simple linear mapping. 0xC0000000 is the offset between a physical address and its linear address; in the Linux source it is called PAGE_OFFSET.


I. Linux user space and kernel space

II. A step-by-step study of the Linux process address space


I. Linux user space and kernel space

The Linux operating system and its drivers run in kernel space, while applications run in user space. The two cannot simply pass data to each other through pointers: because Linux uses virtual memory, user-space data may have been swapped out, so when kernel space uses a user-space pointer the corresponding data may not be in memory.

The address space of every Linux process is independent. More precisely, what is independent is the physical memory behind it.

1. Linux kernel address mapping model

The x86 CPU uses a segmentation-plus-paging address mapping model. The addresses in process code are logical addresses; physical memory is actually accessed only after segment and page translation.

(Figure not reproduced: the segment-page translation mechanism.)

2. Linux kernel address space partitioning

Typically, 32-bit Linux gives 0 to 3 GB of the address space to user space and 3 GB to 4 GB to kernel space. Note that this is the partition for a 32-bit kernel; a 64-bit kernel partitions its address space differently.

3. Origin of Linux kernel high-end memory

When kernel module code or a kernel thread accesses memory, the addresses in the code are logical addresses. Each must be mapped one-to-one to a real physical address: logical address 0xC0000003 corresponds to physical address 0x3, 0xC0000004 corresponds to 0x4, and so on. The relationship is:

Physical address = logical address - 0xC0000000

Logical address    Physical address
0xC0000000         0x00000000
0xC0000001         0x00000001
0xC0000002         0x00000002
0xC0000003         0x00000003
...                ...
0xE0000000         0x20000000
...                ...
0xFFFFFFFF         0x3FFFFFFF

Under this simple mapping, the kernel's logical address space 0xC0000000 to 0xFFFFFFFF corresponds to the physical range 0x0 to 0x3FFFFFFF, so the kernel can reach only 1 GB of physical memory. If the machine has 8 GB of physical memory installed, the kernel can access only the first 1 GB; the remaining 7 GB is unreachable, because the entire kernel address space is already mapped to 0x0 to 0x3FFFFFFF. How would the kernel access physical address 0x40000001, for example? Code needs a logical address to reach memory, and the range 0xC0000000 to 0xFFFFFFFF is already used up, so memory at and above physical address 0x40000000 cannot be accessed.

Clearly, then, the kernel address space 0xC0000000 to 0xFFFFFFFF cannot all be used for simple linear mapping. The x86 architecture therefore divides the kernel address space into three parts: ZONE_DMA, ZONE_NORMAL, and ZONE_HIGHMEM. ZONE_HIGHMEM is high-end memory, and this is where the concept of high-end memory comes from.

On x86 the three zones are:

ZONE_DMA: the first 16 MB of memory

ZONE_NORMAL: 16 MB to 896 MB

ZONE_HIGHMEM: 896 MB to the end of memory

4. Understanding Linux kernel high-end memory

The previous section explained where high-end memory comes from. Linux divides the kernel address space into three parts, ZONE_DMA, ZONE_NORMAL, and ZONE_HIGHMEM, and the high-end memory address space (HIGH_MEM) ranges from 0xF8000000 to 0xFFFFFFFF (896 MB to 1024 MB). How does the kernel use this 128 MB of address space to reach all of physical memory?

When the kernel wants to access physical memory above 896 MB, it finds a free range of logical addresses of the required size within 0xF8000000 to 0xFFFFFFFF and borrows it for a while. It creates a mapping from this range to the physical memory it wants to access (that is, it fills in the kernel PTE page table), uses it temporarily, and returns it when done. Others can then use the same address range to reach other physical memory, so a limited address space gives access to all of physical memory. (Figure not reproduced: high-end memory mapping.)

For example, suppose the kernel wants to access 1 MB of physical memory starting at 2 GB, that is, physical addresses 0x80000000 to 0x800FFFFF. Before the access it finds a free 1 MB range of addresses; assume the range found is 0xF8700000 to 0xF87FFFFF. This 1 MB of logical address space is mapped to physical addresses 0x80000000 to 0x800FFFFF as follows:

Logical address    Physical address
0xF8700000         0x80000000
0xF8700001         0x80000001
0xF8700002         0x80000002
...                ...
0xF87FFFFF         0x800FFFFF

When the kernel has finished accessing physical memory 0x80000000 to 0x800FFFFF, it releases the kernel linear range 0xF8700000 to 0xF87FFFFF. Other processes or code can then use the addresses 0xF8700000 to 0xF87FFFFF to access other physical memory.

From this description we can see the basic idea of high-end memory: borrow a range of addresses, set up a temporary mapping, and release it after use, so that the address range can be reused to access all of physical memory.

At this point someone is bound to ask: what if a kernel thread or module holds on to a range of logical addresses and never releases it? In that case the kernel's high-end memory address space becomes tighter and tighter; if ranges are held and never released, physical memory that has no mapping cannot be accessed.

Some office buildings in Tsim Sha Tsui, Hong Kong, have very few toilets, and the toilets are locked. A customer who wants to use the restroom takes the key from the front desk and returns it when done. In this way a single washroom can serve all the customers. If one customer keeps occupying the washroom and does not return the key, the other customers cannot use it. The Linux kernel manages high-end memory on a similar principle.

5. Linux kernel high-end memory partitioning
The kernel divides the high-end memory address space into three parts: VMALLOC_START to VMALLOC_END, PKMAP_BASE to FIXADDR_START, and FIXADDR_START to 4 GB.

For high-end memory, the corresponding page can be obtained with alloc_page() or similar functions, but to access the actual physical memory the page must first be given a linear address. (Why? Consider how the MMU accesses physical memory.) In other words, we must find a linear address range for the page of high-end memory; this process is called high-end memory mapping.

Corresponding to the three parts, there are three ways to map high-end memory:
Mapping into the "kernel dynamic mapping space" (noncontiguous memory allocation)
This one is simple: when vmalloc() allocates memory in the "kernel dynamic mapping space", the pages it obtains may come from high-end memory (see the vmalloc implementation), so high-end memory may end up mapped into the "kernel dynamic mapping space".

Permanent kernel mapping
If a page of high-end memory was obtained with alloc_page(), how do we find a linear address for it?
The kernel sets aside a linear range for this purpose, from PKMAP_BASE to FIXADDR_START. On the 2.6 kernel this range is 4 GB - 8 MB to 4 GB - 4 MB. It is called the "kernel permanent mapping space" or "permanent kernel mapping space". It uses the same page directory as the rest of the address space: swapper_pg_dir for the kernel, and the directory that CR3 points to for an ordinary process. The space is normally 4 MB, so a single page table is enough; the kernel finds that page table through pkmap_page_table. kmap() maps a page into this space. Because the space is 4 MB, at most 1024 pages can be mapped at once, so pages that are no longer in use should be released from it (unmapped) with kunmap(), which frees the linear address that corresponded to the page.

Temporary kernel mapping
The kernel reserves some linear space between FIXADDR_START and FIXADDR_TOP for special needs. This is called the "fixed mapping space", and part of it is used for temporary mappings of high-end memory.

This space has the following characteristics:
(1) Each CPU has its own piece of the space.
(2) The piece belonging to each CPU is divided into several small slots, each one page in size and each used for one purpose. The purposes are defined by km_type in kmap_types.h.

To make a temporary mapping you specify its purpose; the purpose selects the corresponding slot, and the address of that slot is used as the mapping address. This means that a temporary mapping overwrites the previous mapping in the same slot. Temporary mappings are made with kmap_atomic().

6. Frequently Asked Questions:

1. Does user space (a process) have a concept of high-end memory?

User processes have no concept of high-end memory; it exists only in kernel space. A user process can access at most 3 GB of physical memory, while kernel code can access all physical memory.

2. Does a 64-bit kernel have high-end memory?

Currently a 64-bit Linux kernel has no high-end memory, because a 64-bit kernel can address more than 512 GB of memory. If a machine had more physical memory installed than the kernel address space can cover, high-end memory would exist.

3. How much physical memory can a user process access? How much can kernel code access?

On a 32-bit system a user process can access at most 3 GB, and kernel code can access all physical memory.

On a 64-bit system a user process can access more than 512 GB, and kernel code can access all physical memory.

4. How does high-end memory relate to physical, logical, and linear addresses?

High-end memory is related only to logical addresses; it has no direct relationship to linear addresses or physical addresses.

5. Why not give the whole address space to the kernel?

If the whole address space were given to the kernel, how would user processes use memory? And how could we ensure that the kernel's use of memory does not conflict with that of user processes?

(1) Let us set aside Linux's support for segmented memory mapping. In protected mode, whether the CPU is running in user mode or kernel mode, every address an executing program issues is a virtual address. The MMU reads the value of control register CR3 as a pointer to the current page directory and, following the paging mechanism (see the relevant documentation), translates the virtual address into a real physical address so that the CPU can actually access physical memory.

(2) On 32-bit Linux every process has a 4 GB address space. When a process accesses an address in its virtual space, how is it kept from being confused with the virtual space of other processes? Every process has its own page directory (PGD). Linux stores the pointer to that directory in mm->pgd of the memory structure (struct mm_struct) that belongs to the process's task_struct. Each time a process is scheduled (schedule()), the kernel loads CR3 with that process's PGD pointer (switch_mm()).

(3) When a new process is created, a new page directory (PGD) is created for it, and the page directory entries for the kernel range are copied from the kernel's page directory swapper_pg_dir to the corresponding positions in the new process's PGD, as follows:
do_fork() -> copy_mm() -> mm_init() -> pgd_alloc() -> set_pgd_fast() -> get_pgd_slow() -> memcpy(pgd + USER_PTRS_PER_PGD, swapper_pg_dir + USER_PTRS_PER_PGD, (PTRS_PER_PGD - USER_PTRS_PER_PGD) * sizeof(pgd_t))
In this way the page directory of every process is divided into two parts. The first part is "user space" and maps the process's own 3 GB of virtual addresses (0x00000000 to 0xBFFFFFFF). The second part is "system space" and maps the 1 GB of virtual addresses 0xC0000000 to 0xFFFFFFFF. The second part of the page directory is the same in every process in the system. From the point of view of a process, then, each process has 4 GB of virtual space: the lower 3 GB is its own user space, and the highest 1 GB is the system space shared with all other processes and the kernel.

(4) Now suppose we have the following scenario:
In process A, the host name of the computer on the network is set with the system call sethostname(const char *name, size_t len).
In this scenario data must be passed from user space to kernel space: name is an address in user space, and the system call copies what it points to to an address in the kernel. Let's look at some details of the process. A system call is made by placing its parameters in the registers ebx, ecx, edx, esi, and edi (at most five parameters; this scenario has two, name and len) and the system call number in eax. Process A is then taken into system space by the interrupt instruction int 0x80. Because the privilege level at which the process is running (3) is less than or equal to the privilege level (3) of the trap gate set up for system calls, the process enters system space unimpeded and executes system_call(), the handler installed for int 0x80. Since system_call() belongs to kernel space and runs at privilege level 0, the CPU switches the stack to the kernel stack, that is, the system-space stack of process A.

When the kernel creates the task_struct for a new process, it allocates two contiguous pages, 8 KB in all, and uses about 1 KB at the bottom for the task_struct (for example, #define alloc_task_struct() ((struct task_struct *) __get_free_pages(GFP_KERNEL, 1))). The rest is used as the system-space stack. When the process moves from user space to system space, the stack pointer esp therefore becomes (alloc_task_struct() + 8192). This is why kernel code can obtain the address of the current process's task_struct through a macro, current (see its implementation).

Each time a process enters system space from user space, the user stack segment SS, the user stack pointer ESP, EFLAGS, the user-space CS, and EIP have already been pushed onto the system stack. system_call() then pushes eax, SAVE_ALL pushes ES, DS, EAX, EBP, EDI, ESI, EDX, ECX, and EBX in turn, and finally the handler at sys_call_table + 4 * %eax is called, which in this scenario is sys_sethostname().

(5) In sys_sethostname(), after some protection checks, copy_from_user(to, from, n) is called, where to points to system_utsname.nodename in kernel space (for example 0xE625A000) and from points to user space (for example 0x8010FE00). Process A is now running in system space inside the kernel, and the MMU translates both virtual addresses to physical addresses through A's PGD, which completes the copy from user space to system space. Before copying, the kernel checks that the user-space address and length are valid. Whether the whole interval starting at the user-space address is actually mapped is not checked at this point; if some address in the interval turns out to be unmapped or lacks the required read/write permission, it is treated as a bad address and raises a page fault, which is handled by the page-fault handler. The call path is: copy_from_user() -> generic_copy_from_user() -> access_ok() + __copy_user_zeroing().

II. A step-by-step study of the Linux process address space

2012-05-03 00:00, China IT Lab, Anonymous

Keywords: Linux

We know that on a 32-bit machine the address space of a Linux process is 4 GB, of which 0 to 3 GB is user space and 3 GB to 4 GB is kernel space. This 4 GB address space does not physically exist; it is what we call the virtual memory space.


What is the virtual memory space, and how does it correspond to actual physical memory? Why does virtual memory let us run an application that is larger than physical memory, and how is that done?

These are the puzzles we will work through step by step below.

Let's see what happens when we write an application and compile it.

Running the command size a.out on the compiled program reports the sizes of its text, data, and bss sections. (The example program and the command output are not reproduced here.)

text holds the code, data holds initialized global and static variables, and bss holds uninitialized global and static variables.

For historical reasons, a C program has always been composed of the following parts:

A. Text segment. These are the machine instructions executed by the CPU. The text segment is usually shareable, so even frequently executed programs (text editors, C compilers, shells, and so on) need only one copy in memory. It is also often read-only, to prevent a program from accidentally modifying its own instructions.

B. Initialized data segment. This is usually just called the data segment. It contains the variables in the program that are given an initial value. For example, a declaration outside any function in a C program:

int maxcount = 99; (global variable)

C. Uninitialized data segment. This is often called the bss segment, named after an early assembler operator meaning "block started by symbol". The kernel initializes this segment to 0 before the program starts executing. A declaration outside any function such as

long sum[1000];

causes this variable to be stored in the uninitialized data segment.

D. Stack. Automatic variables, and the information that must be saved each time a function is called, are stored here. On every function call, the return address and the caller's environment (for example, some machine registers) are saved on the stack. The newly called function then allocates space on the stack for its automatic and temporary variables. Using the stack in this way is what allows C functions to be called recursively.

E. Heap. Dynamic storage allocation usually takes place in the heap. By long-standing convention, the heap lies between the top of the uninitialized data segment and the bottom of the stack.

The layout figure (not reproduced here) shows that the stack grows downward and the heap grows upward. Will they meet? Generally not, because the gap between them is very large. For example:



#include <stdio.h>
#include <stdlib.h>

int bss_var;            /* uninitialized global: bss segment */
int data_var0 = 1;      /* initialized global: data segment */

int main(void)
{
    printf("Test location:\n");
    printf("\tAddress of main (code segment): %p\n", (void *)main);
    printf("_____________________________________\n");

    int stack_var0 = 2;  /* automatic variables live on the stack */
    printf("Stack location:\n");
    printf("\tInitial end of stack: %p\n", (void *)&stack_var0);
    int stack_var1 = 3;
    printf("\tNew end of stack: %p\n", (void *)&stack_var1);
    printf("_____________________________________\n");

    printf("Data location:\n");
    printf("\tAddress of data_var (data segment): %p\n", (void *)&data_var0);
    static int data_var1 = 4;  /* initialized static: also in the data segment */
    printf("\tNew end of data_var (data segment): %p\n", (void *)&data_var1);
    printf("_____________________________________\n");

    printf("BSS location:\n");
    printf("\tAddress of bss_var: %p\n", (void *)&bss_var);
    printf("_____________________________________\n");

    printf("Heap location:\n");
    char *p = (char *)malloc(10);  /* dynamic allocation comes from the heap */
    printf("\tAddress of heap_var: %p\n", (void *)p);
    free(p);
    return 0;
}


The output of a run was shown at this point in the original (not reproduced here).

The addresses printed are virtual addresses. Where do they come from? In fact, they are fixed when the program is compiled and linked. (The original marked them in a figure that is not reproduced here.)

That is, no matter how many times we run a.out, these addresses are the same. This holds for an executable that is not position-independent, on a system with address space layout randomization disabled; with randomization enabled, at least the stack and heap addresses change from run to run. The address space of every Linux process is independent; more precisely, what is independent is the physical memory behind it. The same virtual address maps to different physical addresses in different processes. How are the two related? We continue to explore.

In Linux, each process is described by a task_struct, the address space of each process is described by an mm_struct, and each segment of the C program's address space is represented by a vm_area_struct. (Figure not reproduced: the relationship between task_struct, mm_struct, and vm_area_struct.)

To run a program, the operating system must create a process. What links the process to the program?

When a program is executed, its contents must be placed in the virtual address space of the process; the same applies to the shared libraries the executable uses. The executable is not actually read into physical memory; it is only linked into the process's virtual memory.

When an executable is mapped into a process's virtual address space, a set of vm_area_struct data structures is generated. Each vm_area_struct represents one part of the executable image: executable code, initialized data, or uninitialized data.

Linux maps and reads an executable file through sys_exec, in the following steps:

1. Create a set of vm_area_struct structures.

2. Delimit a range of user virtual space and save its start and end addresses (set in the ELF sections) in vm_start and vm_end.

3. Save the disk file handle in vm_file.

4. Save the offset of the corresponding segment within the disk file (set in the ELF sections) in vm_pgoff.

5. Save the functions that operate on the disk file in vm_ops.

Note: at this point no page table has been created for the corresponding page directory entry, and no page table entry has been set.

Now suppose an instruction in the program reads something between vm_start and vm_end above, for example:

mov 0x08000011, %eax

The following sequence is then executed:

1. The CPU uses CR3 (the current process's PGD) to find pgd[i], the entry for address 0x08000011. Because pgd[i] is still in its initial state, 0, a CPU exception is raised.

2. do_page_fault is called. It allocates a page table in memory for pgd[i] and points the entry at it. (Figure not reproduced.)

Note: here i is the high 10 bits of 0x08000011 and j is its middle 10 bits. At this point every entry in the page table is 0, including pte[j].

3. A real physical page is allocated for pte[j], and filemap_nopage is called through vm_ops to read the contents at offset vm_pgoff of the disk file vm_file (both fields of the vm_area_struct) into the physical page. (Figure not reproduced.) This involves:

(1) Allocating a physical memory page.

(2) Reading the content from the disk file into the physical memory page.

From the above we can see that when a process is created, the program's contents are only mapped into the process's virtual memory space. To let a large program run in a limited amount of physical memory, only the beginning of the program needs to be loaded into physical memory to start running. The operating system deals with the process's virtual addresses: if translating a virtual address finds no physical address for it, a page fault (nopage) occurs, and the operating system then loads the data that is still on disk into physical memory and updates the process's page table. You might ask: what does the operating system do if physical memory is full?

Let's look at how Linux handles this:

If a process needs to bring a virtual page into physical memory and no free physical page is available, the operating system must evict other pages from physical memory to make room.

In Linux, a physical page is described as follows:

struct MEM_MAP


1. count: the number of users of this page; the count is greater than 1 when the page is shared by several processes.

2. age: the age of this page, used to decide whether it is a good candidate for discarding or swapping out.

3. map_nr: the page frame number of the physical page.


If a page evicted from physical memory came from an image or data file and has not been written to, it does not need to be saved; it can simply be discarded. If a process needs the page again, it can be read back into memory from the image or data file.

If the page has been modified, however, the operating system must preserve its contents for later access. Such a page is called a "dirty page"; when it is removed from memory it is saved in a special file called the swap file.

Compared with the speed of the processor and physical memory, access to the swap file is very slow, and the operating system must spend that time both when writing the page to disk and when reading it back in for reuse.

If the algorithm that decides which pages to discard or swap out is not efficient enough, a condition called "thrashing" can occur. Pages are then constantly written to disk and read back, and the operating system is too busy to do any real work.

Linux uses a "least recently used" (LRU) page aging technique to choose fairly which pages to remove from the system. In this scheme every page in the system has an age, which changes as the page is accessed. The more a page is accessed, the younger it is; the less it is accessed, the older it becomes. Old pages are the best candidates for swapping.
