Deep understanding of Linux memory management

Source: Internet
Author: User

Reference: Live and learn

1. Memory address

On Intel processors, the basic unit of physical memory in a 32-bit Linux system is the byte, and 1 byte has 8 bits. Each memory address points to one byte, and adding 1 to a memory address gives the address of the next byte. The address that identifies an actual location in physical memory is what is usually called the physical address. The code, data, and stack data of the process that the CPU is executing are held temporarily in physical memory.

The linear address (also called the virtual address) is a middle layer added between the physical address and the program, for the following reasons:

(1) To isolate the memory address spaces used by different processes;
(2) To improve memory utilization;
(3) To determine the addresses of a program when it is run;
(4) To extend memory, that is, to run programs that need more memory than the physical memory available.

The virtual address range corresponds to the CPU's addressing ability. The virtual address range of a 32-bit CPU is 0x00000000 ~ 0xFFFFFFFF, so the maximum virtual memory is 2^32 bytes = 4 GB; for a 64-bit CPU the corresponding maximum is 2^64 bytes. However, most current operating systems and applications do not need such a large virtual address space, and full 64-bit addresses would increase system complexity and the cost of address translation. The current x86-64 architecture therefore uses only the low 48 bits (bits 0 ~ 47) as the virtual address and fills the high bits 48 ~ 63 with the value of bit 47, so the maximum virtual memory for a 64-bit CPU is 2^48 bytes = 256 TB. In general, the physical memory actually installed covers only part of this range.
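The rule that bits 48 ~ 63 repeat bit 47 can be illustrated with a minimal sketch. The function names `sign_extend_48` and `is_canonical` are made up for this example and do not come from any kernel API.

```python
def sign_extend_48(addr48: int) -> int:
    """Extend a 48-bit virtual address to 64 bits by copying bit 47 upward."""
    addr48 &= (1 << 48) - 1              # keep only bits 0..47
    if addr48 & (1 << 47):               # bit 47 set: bits 48..63 become 1
        return addr48 | (0xFFFF << 48)
    return addr48                        # bit 47 clear: bits 48..63 stay 0


def is_canonical(addr64: int) -> bool:
    """True if bits 48..63 of a 64-bit address are copies of bit 47."""
    addr64 &= (1 << 64) - 1
    return sign_extend_48(addr64) == addr64


# Size of the 48-bit virtual address space, in TB (1 TB = 2**40 bytes).
VIRTUAL_SPACE_TB = 2**48 // 2**40
```

With this rule, the usable addresses fall into a low half starting at 0 and a high half ending at 0xFFFFFFFFFFFFFFFF, with an unusable gap in between.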

To manage memory more efficiently and make full use of the virtual space, the highest bit of the CPU's CR0 register (PG, the paging flag) can be set to enable the paging mechanism, which divides the virtual space into pages and then manages and uses the virtual space page by page. Physical memory is divided with a page size such as 4096 bytes, 8192 bytes, 2 MB, or 4 MB, depending on the hardware; because the pages of the virtual space are ultimately stored in physical memory, the virtual space is paged with the same size. The usual page size is 4 KB.

In modern computer systems, programmers do not need to manipulate physical addresses directly. Instead, the operating system assigns virtual addresses, page by page, to the process to be executed, and each page can be mapped to any available physical page frame. When the CPU executes a program and issues a read or write to a virtual address, a hardware unit (the MMU, memory management unit, generally integrated on the CPU chip) looks up the page table for that virtual address and translates it into a physical address. The access then goes out over the memory bus (on older platforms through the north bridge chip, whose main function is to handle communication between the CPU and physical memory), so that the CPU can reach the process's code and data in physical memory.

The logical address (Logical address) refers to an address offset within the program. It is relative to the program entry address assigned by the operating system and specifies the address of an operand or instruction in the program. The logical address is the address that the programmer manipulates directly; for example, in C, if an int variable is defined, the address obtained with the address-of operator (&var) is a logical address.

The logical address consists of a segment selector (Segment Selector) and an intra-segment offset (Offset). The segment selector is a 16-bit (2-byte) unsigned number, and the offset is a 32-bit unsigned number. The fields of the segment selector are shown in Figure 1.

Figure 1 Segment Selector fields

2. Segmentation mechanism of memory management

Each segment is described by an 8-byte segment descriptor (Segment Descriptor), which is stored in either the Global Descriptor Table (GDT) or the Local Descriptor Table (LDT).

Figure 2 Segment Descriptor

The Base field has 32 bits in total and holds the linear address of the first byte of the segment. The Limit field has 20 bits in total and gives the offset of the last byte of the segment relative to its first byte, so it also represents the length of the segment. Unlike a page (fixed at 4 KB, for example), the length of a segment varies with the content it holds. In addition, if the flag bit G is 0, the limit is counted in bytes, so the maximum segment length is 1 byte * 2^20 = 1 MB; if G is 1, the limit is counted in units of 4 KB, and the maximum length is 4 KB * 2^20 = 4 GB.
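The effect of the G flag on the maximum segment length can be checked with a few lines of arithmetic. This is only a sketch; `max_segment_bytes` is a name invented for the example.

```python
LIMIT_BITS = 20  # width of the Limit field in a segment descriptor


def max_segment_bytes(granularity: int) -> int:
    """Largest segment length for G = 0 (byte units) or G = 1 (4 KB units)."""
    unit = 4096 if granularity else 1
    return unit * (1 << LIMIT_BITS)


MAX_WITH_G0 = max_segment_bytes(0)   # 1 MB
MAX_WITH_G1 = max_segment_bytes(1)   # 4 GB
```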

During the execution of a process, when an instruction that needs to access memory is encountered, the corresponding linear address is first obtained from the logical address, and the physical address is then obtained from the linear address. The translation from logical address to linear address is shown in Figure 3.

Figure 3 Translating a Logical Address

The CPU provides the CS register to hold the segment selector of the code segment of the executing process, the DS register to hold the segment selector of its data segment, and the SS register to hold the segment selector of its stack segment. When a logical address is translated, the CPU reads the segment selector from the relevant register, for example CS for an instruction fetch (see Figure 1). The TI flag determines whether the segment descriptor is located in the GDT or the LDT, and the Index field gives the position of the segment descriptor within that table, so the segment descriptor for the logical address can be found. The Base field of the descriptor gives the starting linear address of the segment, and adding the offset from the logical address to this start yields the linear address that the instruction refers to.
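The translation just described can be sketched in a few lines. The selector layout used here (Index in bits 3 ~ 15, TI in bit 2, requested privilege level in bits 0 ~ 1) follows the x86 selector format; the descriptor tables are modeled as plain dictionaries from index to base address, which is a simplification made for this example, and the function names are hypothetical.

```python
def decode_selector(selector: int):
    """Split a 16-bit segment selector into (index, ti, rpl)."""
    index = selector >> 3          # position of the descriptor in the table
    ti = (selector >> 2) & 1       # 0 = GDT, 1 = LDT
    rpl = selector & 0b11          # requested privilege level
    return index, ti, rpl


def logical_to_linear(selector: int, offset: int, gdt: dict, ldt: dict) -> int:
    """Translate (selector, offset) into a 32-bit linear address.

    gdt and ldt map a descriptor index to the segment base address;
    real descriptors also carry a limit and flags, omitted here.
    """
    index, ti, _rpl = decode_selector(selector)
    table = ldt if ti else gdt
    base = table[index]
    return (base + offset) & 0xFFFFFFFF


# Example: a segment whose base is 0, so the linear address equals the offset.
example = logical_to_linear(0x73, 0x08048000, gdt={14: 0x00000000}, ldt={})
```

When every segment base is 0, as in the example, segmentation leaves the offset unchanged.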

Because the segmentation mechanism is tied to Intel processors and segmented memory management may not be supported on other hardware, Linux prefers to manage memory through paging. In user mode, all processes share the user code segment and the user data segment: the Base field of the code segment descriptor points to linear address 0x00000000, and so does the Base field of the data segment descriptor. In kernel mode, all processes share the kernel code segment and the kernel data segment, and the Base fields of those descriptors also point to linear address 0x00000000. The G bit of all these segment descriptors is set to 1, so each segment spans 0 to 2^32 - 1, the entire virtual space of a 32-bit CPU.

This approach copes with hardware platforms that do not support segmentation and greatly simplifies address translation. However, because every process can in theory use the whole 4 GB linear range, that is, all processes share the same segments, segment boundaries can no longer be used to isolate the memory of different processes. Therefore, in Linux each process is given its own page tables, and memory protection and process isolation rely purely on the paging mechanism. The paging mechanism is described in detail next.

3. Paging mechanism of memory management

The paging mechanism divides the entire linear address space and the entire physical memory into storage blocks of the same size. A block of the linear address space is called a page, and a block of physical memory is called a page frame. Leaving memory access permissions aside, any page of the linear address space can in theory be mapped to any page frame of the physical address space. The most common scheme uses 4 KB pages with aligned boundaries, that is, the starting address of every page is divisible by 4 KB. With 4 KB pages, the virtual space of a 32-bit machine is divided into 2^20 pages. Because pages are mapped to page frames of the same size and page boundaries are aligned, the low 12 bits of a virtual address can be used directly as the low 12 bits of the physical address.
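As a small illustration of why the low 12 bits carry over unchanged, the following sketch splits an address into page number and offset and rebuilds the physical address from a page-to-frame mapping. The mapping is a plain dictionary chosen for the example, not a real page table.

```python
PAGE_SHIFT = 12
PAGE_SIZE = 1 << PAGE_SHIFT          # 4 KB


def split_address(addr: int):
    """Return (page number, offset within the page) for a 32-bit address."""
    return addr >> PAGE_SHIFT, addr & (PAGE_SIZE - 1)


def translate(addr: int, page_to_frame: dict) -> int:
    """Map a virtual address to a physical one; the offset is copied as-is."""
    page, offset = split_address(addr)
    return (page_to_frame[page] << PAGE_SHIFT) | offset


PAGES_IN_32_BIT_SPACE = (1 << 32) // PAGE_SIZE   # 2**20
```

For instance, `translate(0x08048123, {0x08048: 0x1A2B3})` keeps the offset 0x123 and swaps only the upper 20 bits.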

A single flat page table would need 2^20 * 4 B = 4 MB for every process. To save this memory, 32-bit operating systems usually use a two-level page table structure to record how the virtual address space is paged. Each virtual address is therefore made up of three parts: the high 10 bits are the index into the page directory, the middle 10 bits are the index into the page table, and the low 12 bits are the offset within the page frame.
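The 10/10/12 split can be written down directly with shifts and masks. `split_linear` is a name invented for this sketch.

```python
def split_linear(addr: int):
    """Split a 32-bit linear address into (directory index, table index, offset)."""
    directory_index = (addr >> 22) & 0x3FF   # high 10 bits
    table_index = (addr >> 12) & 0x3FF       # middle 10 bits
    offset = addr & 0xFFF                    # low 12 bits
    return directory_index, table_index, offset


FLAT_TABLE_BYTES = 2**20 * 4   # size of a single-level table: 4 MB
```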

Figure 4 Paging mechanism

The physical address of the page directory is held in the CR3 register, so the directory can be accessed directly. The high 10 bits of the linear address are used as an index to fetch the corresponding 32-bit page directory entry. As the page directory part of Figure 4 shows, the high 20 bits of a directory entry give the high 20 bits of the physical address of the page table that the entry refers to, so the 1024 directory entries can give the addresses of exactly 1024 page tables. The low 12 bits of a directory entry are flag bits: the P flag indicates whether the page table for this entry is present in memory, the U/S flag indicates the access privilege required for the pages under this entry, the PS flag indicates whether the page size is 4 KB or 4 MB, and so on. Because each page directory entry is 32 bits (4 bytes) long and the page directory has 1024 entries, the total size of the page directory is 4 KB.

The page table is stored in memory. Each page table entry is 32 bits long and each page table has 1024 entries, so a page table is 4 KB in size. A page table is stored in memory aligned to the physical page size (4 KB), so the low 12 bits of the physical address of every page table are 0. The high 20 bits of that physical address are given by the high 20 bits of the corresponding page directory entry, which is how the page table is located in physical memory. Once the page table is found, the middle 10 bits of the linear address are used as an index to fetch the corresponding 32-bit page table entry. As with page directory entries, the high 20 bits of a page table entry give the high 20 bits of the starting physical address of the page frame it refers to, and the low 12 bits of the entry are flag bits for the page.

The page frame is a block of physical memory. The previous two steps yield the high 20 bits of the starting physical address of the page frame; because physical memory is divided into 4 KB page frames, the low 12 bits of that starting address are 0. Combining the high 20 bits with 12 zero bits gives the starting physical address of the page frame. The low 12 bits of the linear address are then used as the offset from that start, which gives the physical address corresponding to the linear address. Note that the correspondence between page table entries and page frames is not fixed in advance: a page table entry first describes a virtual page, and the contents of that page may be placed in any suitable page frame.
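The three steps above (directory entry, table entry, frame plus offset) can be put together in a small software model of the walk. Physical memory is modeled as a dictionary from the physical address of a 4-byte entry to its value, and only the P flag (bit 0) is checked; both choices, and the names `walk` and `PageFault`, are assumptions made for the illustration rather than a description of real hardware or kernel code.

```python
PRESENT = 0x1            # P flag, bit 0 of a directory or table entry
ADDRESS_MASK = 0xFFFFF000  # high 20 bits of an entry


class PageFault(Exception):
    """Raised when a needed entry is not marked present."""


def walk(addr: int, cr3: int, memory: dict) -> int:
    """Translate a 32-bit linear address using a two-level page table."""
    dir_index = (addr >> 22) & 0x3FF
    tbl_index = (addr >> 12) & 0x3FF
    offset = addr & 0xFFF

    # Step 1: CR3 gives the page directory; each entry is 4 bytes.
    pde = memory.get(cr3 + 4 * dir_index, 0)
    if not pde & PRESENT:
        raise PageFault(f"page table for {addr:#010x} not present")

    # Step 2: the high 20 bits of the directory entry locate the page table.
    pte = memory.get((pde & ADDRESS_MASK) + 4 * tbl_index, 0)
    if not pte & PRESENT:
        raise PageFault(f"page for {addr:#010x} not present")

    # Step 3: the high 20 bits of the table entry locate the page frame.
    return (pte & ADDRESS_MASK) | offset


# One mapping: directory at 0x1000, page table at 0x2000, frame at 0x5000.
memory = {
    0x1000 + 4 * 32: 0x2000 | PRESENT,
    0x2000 + 4 * 72: 0x5000 | PRESENT,
}
physical = walk(0x08048123, cr3=0x1000, memory=memory)
```

Any address whose directory or table entry is missing from `memory` raises `PageFault` in this model.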

The operating system assigns virtual address ranges to each process page by page, and in theory a program can use up to 4 GB of virtual memory. However, because the operating system needs to protect kernel memory, kernel virtual memory and user virtual memory are kept separate: the former has 1 GB of virtual space and the latter 3 GB. When a process runs, the physical address of its page directory is loaded into the CR3 register, and its page tables are held in physical memory. While translating a linear address into a physical address, the paging unit checks whether the current process has permission to access the page and whether the page data for the linear address is present in physical memory. If either check fails, the paging unit raises a page fault exception, and the operating system either aborts the process or loads the required page data into physical memory.

4. Physical Address Extension

Physical Address Extension (PAE) is a paging mode specific to Intel 32-bit CPUs. In theory, a 32-bit CPU has 32 address lines and can access at most 4 GB of physical memory; in practice, because user-mode programs take up 3 GB of the linear address space in Linux, the kernel can directly address at most 1 GB of physical memory. As software grew, a 32-bit computer might run many processes at once whose combined memory demand exceeded 4 GB, so Intel increased the number of address lines on its 32-bit CPUs from 32 to 36, adding 4, which raises the physical memory the CPU supports to 2^36 bytes, or 64 GB. The physical memory is extended while the virtual address space stays at 4 GB. 32-bit applications can therefore keep using 32-bit addresses, and the maximum virtual memory each process can use is still 4 GB.

With 4 KB paging, 64 GB of physical memory is divided into 2^24 page frames. The low 12 bits of the starting physical address of each page frame are still 0, but the high 24 bits must now come from the page table. In regular paging a page table entry is 32 bits long and can supply only the high 20 bits of a page frame's physical address, which does not meet the needs of 36-bit addressing. The solution is to lengthen the page table entry. To keep tables aligned on 4 KB boundaries, the entry length is increased to 64 bits, or 8 bytes (not just the 36 bits strictly needed), and the page table size stays at 4 KB, so a page table holds only 512 entries (2^12 / 8).

The page directory is adapted to 36-bit physical addressing in the same way: each page directory entry also becomes 64 bits long, the page directory size stays at 4 KB, and a page directory holds only 512 entries. One such page directory covers 512 x 512 = 2^18 pages, while the virtual address space has 2^20 pages in total, so 4 page directories are required.
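The counts in the last two paragraphs follow from simple arithmetic, reproduced here as a quick check. The variable names are chosen for this sketch only.

```python
PAGE_SIZE = 4096            # bytes per page, page table, and page directory
ENTRY_SIZE = 8              # bytes per 64-bit entry under PAE
PHYSICAL_ADDRESS_BITS = 36

page_frames = 2**PHYSICAL_ADDRESS_BITS // PAGE_SIZE    # frames in 64 GB
frame_number_bits = PHYSICAL_ADDRESS_BITS - 12         # bits the entry must supply
entries_per_table = PAGE_SIZE // ENTRY_SIZE            # entries in a 4 KB table
pages_per_directory = entries_per_table ** 2           # pages one directory covers
virtual_pages = 2**32 // PAGE_SIZE                     # pages in a 4 GB space
directories_needed = virtual_pages // pages_per_directory
```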

A new level is therefore added between the CR3 register and the page directories: the page directory pointer table (Page Directory Pointer Table, PDPT). It holds four 64-bit entries that point to the 4 page directories described above. The page directory pointer table is placed in the first 4 GB of the 64 GB of memory (physical addresses 0x00000000 ~ 0xFFFFFFFF), and CR3 holds its starting physical address.

When physical address extension is turned on, converting a linear address to a physical address differs significantly from the previous scheme, as shown in Figure 5.

Figure 5 Linear Address translation with PAE

First, the physical address of the page directory pointer table is obtained from CR3, and bits 30 ~ 31 of the linear address are used as an index into it to select a page directory. Next, bits 21 ~ 29 (9 bits in total, exactly enough to index all 512 page directory entries) select the entry that gives the physical address of the page table. Bits 12 ~ 20 (9 bits in total, exactly enough to index all 512 page table entries) select the entry that gives the physical address of the page frame, and bits 0 ~ 11 are the offset within that page frame.
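The 2/9/9/12 split of a linear address under PAE can be expressed the same way as the regular 10/10/12 split. `split_pae` is a hypothetical helper written for this example.

```python
def split_pae(addr: int):
    """Split a 32-bit linear address into PAE indexes and offset."""
    pdpt_index = (addr >> 30) & 0x3        # bits 30..31: one of 4 directories
    directory_index = (addr >> 21) & 0x1FF  # bits 21..29: one of 512 entries
    table_index = (addr >> 12) & 0x1FF      # bits 12..20: one of 512 entries
    offset = addr & 0xFFF                   # bits 0..11
    return pdpt_index, directory_index, table_index, offset


fields = split_pae(0xC0100ABC)
```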

5. Paging mechanism for 64-bit operating systems

A 64-bit machine can in theory address 2^64 bytes, but in practice nothing needs that much virtual memory, and full 64-bit addressing would increase translation time and waste memory. In practice, therefore, 64-bit machines use 48-bit addressing (covering at most 256 TB). Memory is still divided into 4 KB pages, so the high 48 - 12 = 36 bits of an address are needed to identify the page. To reduce the physical memory needed to store page tables, two more levels of page directory are added to spread the page tables out. In Linux, this is done with 4-level paging.
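Under 4-level paging with 4 KB pages, the 36 page-selecting bits are divided into four 9-bit indexes, one per level. The sketch below shows that split; the level names in the comments (PGD, PUD, PMD, PTE) follow Linux kernel naming and are not taken from the text above, and `split_4level` is a name made up for the example.

```python
LEVEL_SHIFTS = (39, 30, 21, 12)   # PGD, PUD, PMD, PTE index positions


def split_4level(addr: int):
    """Split a 48-bit virtual address into four 9-bit indexes and an offset."""
    indexes = [(addr >> shift) & 0x1FF for shift in LEVEL_SHIFTS]
    offset = addr & 0xFFF
    return (*indexes, offset)


sample = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 5
parts = split_4level(sample)
```

Each table at every level then holds 512 entries of 8 bytes, which again fits exactly in one 4 KB page.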

The system creates a process to execute a program, loads the program or a fragment of it, and then starts executing its code segment in order. In doing so, the operating system generally does three things:

(1) Create a separate virtual address space (range) for the process

For example, under regular paging on a 32-bit system, if the operating system finds that the instructions and data of the program to be executed total 32 KB, it allocates 8 pages of virtual memory for the process, allocates the page directory and page tables, loads the address of the page directory into CR3, and places the page tables used by the process in memory. The instructions and data themselves are not yet loaded into memory.

(2) Read the header of the executable file and establish the mapping between the virtual space and the logical addresses of the code segment and data segment in the executable file

This step maps the program's instructions and data into the virtual memory space.

(3) Set the CPU's instruction pointer register to the entry address of the executable file and start running

While the program runs, if the current instruction or data lies in the virtual address space but is not actually in physical memory (no instructions or data were loaded into physical memory in the first two steps), a page fault occurs. The operating system then allocates a free page frame from physical memory, copies the data for that virtual page from disk into the page frame, and records the mapping between the page table entry and the page frame. As the process keeps running, page faults keep occurring, and the operating system handles each one by allocating a physical page frame to the process. Physical memory is limited, however, and the amount of physical memory allocated to one process may be capped. Once all the physical memory available to the process has been allocated, if the process raises further page faults, the operating system applies its page replacement algorithm: to keep the process running, it reclaims page frames previously allocated to the process and reassigns them to the pages now needed.
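The behavior described above can be imitated with a toy simulation. The text does not say which page replacement algorithm is used, so this sketch assumes the simplest one, first-in first-out (FIFO), purely for illustration; real kernels use more elaborate policies. The function name and the frame limit are likewise hypothetical.

```python
from collections import OrderedDict


def simulate_demand_paging(accesses, frame_count):
    """Count page faults for a sequence of page numbers with FIFO replacement.

    Returns (number of faults, final page -> frame mapping).
    """
    resident = OrderedDict()               # page -> frame, oldest first
    free_frames = list(range(frame_count))
    faults = 0
    for page in accesses:
        if page in resident:               # already in physical memory
            continue
        faults += 1                        # page fault: page must be loaded
        if free_frames:
            frame = free_frames.pop(0)     # use a free page frame
        else:
            # No free frame left: reclaim the frame of the oldest page.
            _victim, frame = resident.popitem(last=False)
        resident[page] = frame             # record the new mapping
    return faults, dict(resident)


faults, mapping = simulate_demand_paging([0, 1, 2, 0, 3, 0], frame_count=3)
```

In this run the first three accesses fault because nothing is loaded yet, and the later faults occur only after the three frames allowed to the process are all in use.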
