Linux memory management overview

Source: Internet
Author: User
First, let's look at the memory space in Linux from the perspective of a process.

********************************************************************************

Address Space in Linux (1)

You may have run into questions like these: What range of addresses does a user program occupy after it is compiled and linked? What range does the kernel occupy after it is compiled? What is the I/O address space used to access peripherals?

Let's answer the first question. The most common executable file format in Linux is ELF (Executable and Linkable Format). In an ELF executable on 32-bit x86, ld always places the program's "code segment" at addresses starting from 0x08000000, and this is true for every program. The physical memory addresses used while the program runs are assigned by the kernel when it sets up the memory mapping; the specific addresses depend on which physical pages are allocated at that time.

We can use the Linux utility objdump to disassemble a program and see its address range.

For example, suppose we have a simple C program, hello.c:

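#include <stdio.h>

/* Prints the greeting; kept as a separate function so main() has something to call. */
void greeting(void)
{
    printf("Hello, world!\n");
}

int main(void)
{
    greeting();
    return 0;
}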

This simple program is split into two functions to illustrate a control transfer between them. We use gcc and ld to compile and link it into the executable hello, and then use the Linux utility objdump to disassemble it:

$ objdump -d hello

The main snippets are as follows:

08048568 <greeting>:
 8048568: pushl %ebp
 8048569: movl %esp,%ebp
 804856b: pushl $0x809404
 8048570: call 8048474 <_init+0x84>
 8048575: addl $0x4,%esp
 8048578: leave
 8048579: ret
 804857a: movl %esi,%esi

0804857c <main>:
 804857c: pushl %ebp
 804857d: movl %esp,%ebp
 804857f: call 8048568 <greeting>
 8048584: leave
 8048585: ret
 8048586: nop
 8048587: nop

An address such as 08048568 is what we usually call a virtual address. (The address is real from the program's point of view; it is called "virtual" only in contrast to the physical address.)

Virtual Memory, kernel space, and user space

The Linux virtual address space is 2^32 bytes (4 GB) on a 32-bit x86 machine. The kernel divides this 4 GB space into two parts. The highest 1 GB (virtual addresses 0xC0000000 to 0xFFFFFFFF) is used by the kernel and is called the "kernel space". The lower 3 GB (virtual addresses 0x00000000 to 0xBFFFFFFF) is used by processes and is called the "user space". Because every process can enter the kernel through a system call, the kernel space is shared by all processes in the system. From the point of view of a single process, therefore, each process has 4 GB of virtual address space (also called virtual memory).

Each process has its own private user space (0 to 3 GB), which is invisible to other processes in the system. The highest 1 GB, the kernel space, is shared by all processes and the kernel. A process's "user space" is also called its "address space"; the rest of this article uses the two terms interchangeably.

User space is not shared between processes; it is isolated per process. Each process can have up to 3 GB of user space. One process's access to an address never conflicts with another process's access to the same address. For example, one process may read the integer 8 from its user space address 0x1234ABCD while another process reads the integer 20 from its own user space address 0x1234ABCD; what each one finds there depends only on that process's own logic.

At any moment, only one process runs on a given CPU. For that CPU, therefore, there is only one 4 GB virtual address space in the system at that moment, and it belongs to the running process. When the CPU switches to another process, the virtual address space changes with it. In other words, each process has its own virtual address space, but the CPU sees that address space only while the process is running; at all other times it is unknown to the CPU. So although every process can have a 4 GB virtual address space, the CPU sees only one at a time, and which one it sees changes with each process switch.

From the above we know that the address space formed when a program is compiled and linked is a virtual address space, but the program ultimately runs in physical memory. Any virtual address issued by an application must therefore be translated into a physical address, so the virtual address space must be mapped onto the physical memory space. This mapping is established through data structures defined by the hardware architecture: the segment descriptor tables and the page tables. Linux relies mainly on page tables for the mapping.

We can therefore conclude that, given different page tables, the CPU translates the same virtual address into different physical addresses. So the kernel creates a page table for each process and maps each process's virtual address space onto physical memory as needed. Since only one process runs on a given CPU at a time, a process switch also switches to the page table of the incoming process; this gives each process its own virtual address space without interference from the others. At any moment, a CPU needs only the page table of the current process to translate that process's virtual addresses into physical addresses.

Linux kernel space (2)

Mapping from kernel space to physical memory

The kernel space is shared by all processes and holds the kernel's code and data, while a process's user space holds the code and data of the user program. Whether the program is the kernel or a user program, the instruction and symbol addresses produced by compiling and linking are virtual addresses (see the example in section 2.5), not physical addresses in physical memory.

Although the kernel space occupies the highest 1 GB of every virtual address space, its mapping to physical memory always starts at the lowest physical address (0x00000000), as shown in 4.2. This establishes a simple linear mapping between the kernel space and physical memory. The 3 GB offset (0xC0000000) is the displacement between a physical address and its virtual address; in the Linux source code it is called PAGE_OFFSET.

Let's look at how the header file include/asm-i386/page.h describes and defines address mapping in the kernel space:

#define __PAGE_OFFSET (0xC0000000)

......

#define PAGE_OFFSET ((unsigned long)__PAGE_OFFSET)

#define __pa(x) ((unsigned long)(x)-PAGE_OFFSET)

#define __va(x) ((void *)((unsigned long)(x)+PAGE_OFFSET))

For kernel space, given a virtual address x, its physical address is "x - PAGE_OFFSET"; given a physical address x, its virtual address is "x + PAGE_OFFSET".

Note once more that the macro __pa() only converts a kernel space virtual address to a physical address; it does not apply to user space. Address mapping in user space is much more complex and is done through the paging mechanism.

The kernel space, from 3 GB to 4 GB, is divided into the following parts.

First, the meaning of each symbol:

PAGE_OFFSET: 0xC0000000, that is, 3 GB.

high_memory: the name of this variable literally means "high memory". What is high memory? In the Linux kernel, the first 896 MB of RAM is called low memory, while the range from 896 MB to 1 GB (128 MB in total) is high memory. If your machine has 512 MB of RAM, what is high_memory? It is 3 GB + 512 MB. That is, when the amount of physical memory x is at most 896 MB, high_memory = 0xc0000000 + x; otherwise, high_memory = 0xc0000000 + 896 MB.

In other words, the actual value of high_memory is 0xc0000000 + x, but it can never exceed 0xc0000000 + 896 MB.

In the source code, the mem_init() function contains the following line:

high_memory = (void *) __va(max_low_pfn * PAGE_SIZE);

max_low_pfn is the number of page frames of physical memory that are directly mapped, that is, the page frame number at which low memory ends.

Therefore, in the figure, the physical memory mapping lies between PAGE_OFFSET and high_memory. Only in this range are physical and virtual addresses related by a simple linear offset.

Note also that the kmalloc() function allocates memory from this range. Consequently, the physical pages of memory allocated through kmalloc() are contiguous.

void *kmalloc(size_t size, gfp_t flags);

 * The @flags argument may be one of:

 * %GFP_USER - Allocate memory on behalf of user. May sleep.

 * %GFP_KERNEL - Allocate normal kernel ram. May sleep.

 * %GFP_ATOMIC - Allocation will not sleep.

VMALLOC_START: the start address of the noncontiguous memory area.

VMALLOC_END: the end address of the noncontiguous memory area.

An 8 MB guard gap separates the end of the physical memory mapping from the first vmalloc area, in order to "catch" out-of-bounds memory accesses. For the same reason, 4 KB guard gaps are inserted between the individual noncontiguous areas.

The vmalloc() function allocates memory from the noncontiguous area.

void *vmalloc(unsigned long size);

Both vmalloc () and kmalloc () are functions used to allocate memory in kernel code, but what is the difference between them?

As the discussion above shows, the memory allocated by both functions lies in the kernel space, between 3 GB and 4 GB, but in different places. Memory allocated by kmalloc() lies between 3 GB and high_memory, where kernel addresses map linearly, one-to-one, onto physical memory. Memory allocated by vmalloc() lies between VMALLOC_START and 4 GB; this area is noncontiguous, and the physical memory it maps to may also be noncontiguous.

vmalloc() works much like kmalloc(). The main difference is that the memory returned by the former need not be physically contiguous (only its virtual addresses are contiguous), while the latter guarantees physically contiguous pages (whose virtual addresses are contiguous as well).

Although physically contiguous memory is required only in certain cases, much kernel code uses kmalloc() rather than vmalloc() to obtain memory, mainly for performance reasons. To make physically noncontiguous pages contiguous in the virtual address space, vmalloc() must set up page table entries specifically for them. Worse, pages obtained through vmalloc() must be mapped one by one (because they are not physically contiguous), which causes far more TLB (translation lookaside buffer) flushing than directly mapped memory does. For these reasons, vmalloc() is used only when absolutely necessary, typically to obtain large regions of memory. For example, when a module is dynamically inserted into the kernel, it is loaded into memory allocated by vmalloc().
