CPU memory management and Linux memory paging mechanism

Source: Internet
Author: User

1. Basic concepts

Physical address: the address used to select cells at the memory-chip level. It corresponds to the address bus that connects the processor to memory. This is probably the easiest of these concepts to grasp, but one caveat is worth mentioning. It is tempting to picture physical memory as one large byte array, numbered from byte 0 up to the last byte, and to call an index into that array a physical address. In fact, this is only an abstraction that the hardware presents to software; the memory chips themselves are not addressed that way. So "the address as it appears on the address bus" is the more accurate definition. Still, if you set aside how the memory hardware is actually addressed, it is acceptable to treat physical addresses as corresponding one to one with bytes of physical memory, and this slightly inaccurate picture may even make the abstraction easier to hold in mind.

Virtual memory: this is a description of memory as a whole (not the memory modules plugged into the machine). It is defined relative to physical memory and can be thought of as "unreal" or "fake" memory. For example, the memory address 0x08000000 does not refer to the element at that position in the large physical array. This is because modern operating systems provide an abstraction over memory management: virtual memory. A process uses addresses in virtual memory, and the operating system helps to "translate" them into real physical addresses. This translation is the key to everything discussed here. With this abstraction in place, a program can use an address space much larger than the real physical memory (robbing Peter to pay Paul; banks do the same), and multiple processes can even use the same address. That is not surprising, because the translated physical addresses are different.

You can disassemble a linked program and see that the linker has already assigned addresses to it. For example, to invoke a function A, the code is not "call A" but "call 0x08111111"; the address of function A has been fixed. Without such a translation, and without the concept of a virtual address, this would not work at all. Let us stop there; pursuing this question further would take us too far afield.

Logical address: for compatibility, Intel retained the older, segmented style of memory management. A logical address is the address that a machine-language instruction uses to specify an operand or an instruction. In the example above, the 0x08111111 assigned by the linker is a logical address. Strictly speaking, though, that contradicts what Intel's segmented memory management requires of a logical address: "a logical address is a segment identifier plus an offset relative to the start of that segment, written [segment identifier : offset within segment]". So the 0x08111111 in the example above should really be written [code segment identifier : 0x08111111] to be complete.

Linear address (also called a virtual address): like a logical address, it is an "unreal" address. If the logical address is the address before translation by the hardware platform's segmentation unit, then the linear address is the address before translation by the hardware's paging unit.

To translate an address in a virtual memory space into a physical address, the CPU needs two steps. First, given a logical address (in practice, the offset within the segment; this must be understood!), the CPU uses its segmentation unit to convert the logical address into a linear address. It then uses its paging unit to convert the linear address into the final physical address. Doing two translations is cumbersome and unnecessary, because the linear address could be handed to the process directly. The redundancy exists only because Intel wanted to remain fully compatible.

2. CPU segmented memory management: converting a logical address to a linear address

A logical address consists of two parts: a segment identifier and an offset within the segment. The segment identifier is a 16-bit field called the segment selector. Its upper 13 bits are an index. The next bit, TI, is a table indicator, and the lowest two bits are used for privilege checks.

An index is just an array subscript, so it must index into some collection. What does it index? The segment descriptor. A segment descriptor describes one segment (to picture a "segment", I imagine taking a knife and cutting virtual memory into pieces). Many segment descriptors together form an array called the segment descriptor table, so the upper 13 bits of the segment selector pick out one specific segment descriptor from that table. My picture of a segment is not very precise; to understand what a segment really is, look at what the descriptor contains. Each segment descriptor is 8 bytes long. Its layout is complicated, and although it can be defined as a data structure, only one field matters here: the Base field, which holds the linear address at which the segment starts.

Intel's design places global segment descriptors in the Global Descriptor Table (GDT), and local ones, such as each process's own, in a Local Descriptor Table (LDT). When is the GDT used, and when the LDT? This is indicated by the TI field of the segment selector: TI = 0 selects the GDT, and TI = 1 selects the LDT. The address and size of the GDT in memory are stored in the CPU's GDTR register, while the LDT is located through the LDTR register.

That is a lot of concepts, and they read like a tongue twister.
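The selector layout described above (13-bit index, 1-bit TI, 2-bit privilege field) can be illustrated with a few shifts and masks. This is only a sketch: the function name decode_selector and the sample values are made up for illustration and do not come from any real system.

```python
def decode_selector(selector):
    """Split a 16-bit segment selector into (index, ti, rpl).

    index: upper 13 bits, the subscript into the descriptor table
    ti:    table indicator, 0 = GDT, 1 = LDT
    rpl:   lowest 2 bits, used for privilege checks
    """
    index = (selector >> 3) & 0x1FFF
    ti = (selector >> 2) & 0x1
    rpl = selector & 0x3
    return index, ti, rpl


# Arbitrary sample values, chosen only to exercise each field.
print(decode_selector(0x73))  # (14, 0, 3): entry 14 of the GDT
print(decode_selector(0x0F))  # (1, 1, 3): entry 1 of the LDT
```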

Given a complete logical address [segment selector : offset within segment], the translation goes as follows:

1. Look at TI in the segment selector. TI = 0 means the segment is described in the GDT; TI = 1 means it is described in the LDT. Read the corresponding register to get that table's address and size. Now we have an array.
2. Take the upper 13 bits of the segment selector and use them to find the segment descriptor in that array. This gives the Base, that is, the segment's base address.
3. Base + offset is the linear address.

This is quite simple. For software, in principle, it is enough to prepare the information the hardware needs, and the hardware will carry out the translation. OK, let us see what Linux does.
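The three steps can be simulated in a few lines. This is a minimal sketch rather than a model of real hardware: the descriptor tables are plain Python lists, each descriptor carries only the Base field, limit and privilege checks are left out (the article ignores them too), and every name and base value here is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SegmentDescriptor:
    base: int  # linear address at which the segment starts


def logical_to_linear(selector, offset, gdt, ldt):
    """Translate [selector : offset] into a 32-bit linear address."""
    # Step 1: TI (bit 2) chooses the table.
    table = ldt if (selector >> 2) & 1 else gdt
    # Step 2: the upper 13 bits index the chosen table.
    descriptor = table[selector >> 3]
    # Step 3: base + offset, wrapped to 32 bits.
    return (descriptor.base + offset) & 0xFFFFFFFF


# Hypothetical tables. Entry 0 of a real GDT is the unused null descriptor.
gdt = [None, SegmentDescriptor(base=0), SegmentDescriptor(base=0x10000000)]
ldt = [SegmentDescriptor(base=0x2000)]

# A segment whose base is 0 leaves the offset unchanged.
print(hex(logical_to_linear(0x08, 0x08111111, gdt, ldt)))  # 0x8111111
print(hex(logical_to_linear(0x10, 0x1234, gdt, ldt)))      # 0x10001234
print(hex(logical_to_linear(0x04, 0x10, gdt, ldt)))        # 0x2010
```

The first call shows why a base of 0 makes the logical offset and the linear address coincide.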

3. Linux Segment Management

Intel requires two translations. That buys compatibility, but it is redundant; the hardware demands it, so software can only comply and go through the motions. On the other hand, some other hardware platforms have no notion of two translations, and Linux needs a higher-level abstraction that presents a uniform interface. So Linux's segment management really just humors the hardware.

Intel's original intent was a global GDT plus an LDT for each process. Linux instead uses the same segments for instruction and data addressing in all processes: a user data segment and a user code segment, and correspondingly a kernel data segment and a kernel code segment. There is nothing strange about this; it is a formality, like writing a year-end summary.

The important conclusion is this: under Linux, the logical address and the linear address always agree (they agree in value, which is not the same as being the same thing, as some people claim). That is, the offset field of the logical address always has the same value as the linear address, because these segments all have a base of 0. Many details, such as the permission checks on segments, are ignored here. In Linux, most processes do not use the LDT, unless you are running Windows programs under Wine.

4. CPU paged memory management

The CPU's paging unit is responsible for translating a linear address into a physical address. For the sake of management and efficiency, linear addresses are divided into fixed-length units called pages. On a 32-bit machine, for example, the linear address space is at most 4 GB; dividing it into 4 KB pages turns the whole linear address space into one large array, total_page[2^20], with 2^20 pages. We call this large array the page directory. Each entry in the directory is an address: the address of the corresponding page.

There is another kind of "page", which we call a physical page, or page frame. The paging unit divides all of physical memory into fixed-length management units, and their length normally matches that of the memory pages one to one.

Note that this total_page array has 2^20 members, and each member is an address (on a 32-bit machine an address is 4 bytes), so representing the array takes 4 MB of memory. To save space, a two-level scheme was introduced for organizing paging.
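The 4 MB figure can be checked with a little arithmetic. The constant names below are chosen for this sketch only.

```python
ADDRESS_BITS = 32      # width of a linear address
PAGE_SIZE = 4 * 1024   # 4 KB pages
ENTRY_SIZE = 4         # one 32-bit address per entry

# Number of pages in the 4 GB linear address space.
num_pages = 2**ADDRESS_BITS // PAGE_SIZE

# Size of a single flat array with one entry per page.
flat_table_bytes = num_pages * ENTRY_SIZE

print(num_pages == 2**20)                  # True
print(flat_table_bytes // 2**20, "MB")     # 4 MB
```

Every process would need its own copy of this array, which is why the flat layout is considered too expensive.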

1. In the paging unit, the page directory is unique, and its address is held in the CPU's CR3 register; this is the starting point of address translation. The long march begins here.

2. Every active process has its own virtual memory (and its own unique page directory), so it also has its own page directory address. To run a process, its page directory address must be loaded into the CR3 register, while the addresses belonging to other processes are kept saved.

3. Each 32-bit linear address is divided into three parts, page directory index (10 bits) : page table index (10 bits) : offset (12 bits), and is translated by the following steps:

1. Fetch the process's page directory address from CR3 (when the operating system schedules a process, it loads this address into the register).
2. Use the upper 10 bits of the linear address as an index into that array to find the corresponding entry. Because of the two-level scheme, an entry in the page directory is no longer the address of a page but the address of a page table (a second level of array has been introduced, and the page addresses now live in the page tables).
3. Use the middle 10 bits of the linear address as an index into the page table (also an array) to find the starting address of the page.
4. Add the starting address of the page to the lower 12 bits of the linear address; the result is the physical address we were after.

The translation is quite simple. It is done entirely by hardware, and although it takes an extra step, it saves a lot of memory, so it is worthwhile. A quick check:

1. The two-level scheme can still address 4 GB. The page directory has 2^10 entries, so there can be that many page tables; each page table maps 2^10 pages; each page holds 2^12 bytes. That still gives 2^32 bytes = 4 GB.
2. The two-level scheme really does save space. The page directory takes 2^10 * 4 bytes = 4 KB, and each page table also takes 2^10 * 4 bytes = 4 KB. If all 2^10 page tables existed, the tables would total 4 KB + 4 MB, slightly more than the flat array. The saving comes from the fact that a page table only has to be allocated for the parts of the address space that a process actually uses. A process that needs a single page table pays only 4 KB + 4 KB = 8 KB.

It is worth noting that although page directory entries and page table entries are each 4 bytes (32 bits), only the upper 20 bits hold an address; the lower 12 bits are treated as 0 when the address is formed. For a page table entry this is easy to understand: pages are 4 KB in size and aligned on 4 KB boundaries, so the lower 12 bits of a page's starting address are always 0. The same holds for a page directory entry: a page table has 2^10 entries of 4 bytes each, so it occupies exactly 4 KB and is likewise aligned on a 4 KB boundary. This also lets page directory entries and page table entries share the same data structure, which is convenient.
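The walk through the page directory and page table, and the space check, can be sketched in software. This is a toy model, not the hardware algorithm: the tables are Python lists instead of memory pages, entries hold only a frame address with no flag bits, and the names AddressSpace, map_page, translate and table_bytes are invented for this example, as are the addresses used.

```python
PAGE_SIZE = 4096   # 2^12 bytes per page
ENTRIES = 1024     # 2^10 entries per directory or table
ENTRY_SIZE = 4     # bytes per entry on a 32-bit machine


def split(linear):
    """Return (directory index, table index, offset) of a 32-bit address."""
    return (linear >> 22) & 0x3FF, (linear >> 12) & 0x3FF, linear & 0xFFF


class AddressSpace:
    def __init__(self):
        # The page directory plays the role of what CR3 points at.
        self.page_directory = [None] * ENTRIES

    def map_page(self, linear, frame_base):
        if frame_base % PAGE_SIZE:
            raise ValueError("frame must be 4 KB aligned")
        dir_index, table_index, _ = split(linear)
        # A page table is allocated only when its region is first used.
        if self.page_directory[dir_index] is None:
            self.page_directory[dir_index] = [None] * ENTRIES
        self.page_directory[dir_index][table_index] = frame_base

    def translate(self, linear):
        dir_index, table_index, offset = split(linear)
        page_table = self.page_directory[dir_index]         # step 2
        if page_table is None or page_table[table_index] is None:
            raise LookupError("address is not mapped")
        return page_table[table_index] + offset              # steps 3 and 4

    def table_bytes(self):
        """Memory taken by the directory plus the allocated page tables."""
        tables = sum(1 for t in self.page_directory if t is not None)
        return (1 + tables) * ENTRIES * ENTRY_SIZE


space = AddressSpace()
space.map_page(0x08147000, 0x00ABC000)   # hypothetical frame address
print(hex(space.translate(0x08147258)))  # 0xabc258
print(space.table_bytes())               # 8192
```

With a single page mapped, only the directory and one page table exist, so the bookkeeping costs 8 KB instead of the 4 MB a flat array would take.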
This post covers only the general principle of the translation. Extended paging, the page protection mechanism, other paging modes and similar troublesome matters are not discussed here; consult a specialist book for those.

5. Linux paged memory management

In principle, Linux only needs to allocate the required data structures for each process, place them in memory, and switch the CR3 register when it schedules a process; the hardware does the rest. (In reality it is far more complicated, but I am analyzing only the most basic flow.)

The above described the two-level page management of the i386, but some CPUs use three-level or even four-level schemes. To provide a higher level of abstraction, Linux offers a uniform interface across CPUs: a four-level page management architecture that accommodates CPUs with two-, three- and four-level schemes. The four levels are:

1. Page Global Directory, PGD (corresponds to the page directory above)
2. Page Upper Directory, PUD (newly introduced)
3. Page Middle Directory, PMD (also newly introduced)
4. Page Table, PT (corresponds to the page table above)

The overall translation follows the same principle as the hardware translation, with just two more array lookups. So how do 32-bit hardware with a two-level scheme and the four-level software translation work together?

Consider how the linear address is divided in this situation. From the hardware's point of view, the 32-bit address has three parts; no matter what the software does, the hardware that finally performs the translation knows only those three. From the software's point of view, two more parts have been introduced, for five in total. Making this fit two-level hardware is easy: in the address layout, set the lengths of the page upper directory and page middle directory fields to 0. The operating system then sees five parts while the hardware keeps its rigid three-part division, and nothing goes wrong. We have built a harmonious computer system.
This looks superfluous on 32-bit hardware, but it pays off for 64-bit addresses on CPUs with a four-level translation architecture: there the middle two fields are simply no longer set to 0, and software and hardware are in harmony once more. Abstraction is a powerful thing!

For example, suppose a logical address has been converted to the linear address 0x08147258. In binary this is:

0000100000 0101000111 001001011000

The kernel divides this address as:

PGD = 0000100000
PUD = 0
PMD = 0
PT = 0101000111
offset = 001001011000

Now we can understand the trick Linux plays on the hardware. Because the hardware cannot see the so-called PUD and PMD, the PGD index must in essence correspond directly to the address of a PT, rather than going through lookups in PUD and PMD arrays. (Both fields have length 0 in the linear address, and 2^0 = 1, so each is an array with exactly one element.) How does the kernel arrange the addresses to make this work?

From the software's point of view, since each of these arrays has only one 32-bit entry, that entry can hold an address pointer of exactly the same length as the one in the PGD. So the mapping "first through the PUD, then through the PMD" turns into keeping the original value unchanged and simply passing it along. This achieves the abstraction of "logically pointing to a PUD, which points to a PMD, but physically pointing directly at the corresponding PT", because the hardware has no idea that the PUD and PMD exist.

Then the address goes to the hardware, which divides it as:

page directory = 0000100000
PT = 0101000111
offset = 001001011000

First, it uses 0000100000 (32) as an index into the page directory array to find the entry, and takes its upper 20 bits to get the address of the page table (the page table's address is allocated dynamically by the kernel). Next, it uses 0101000111 (327) as an index into the page table to get the starting address of the page. Adding the offset to that gives the final physical address.
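The division of 0x08147258 can be reproduced with a small helper that takes the field widths as a parameter, so the same code yields both the kernel's five-part view (with PUD and PMD folded to width 0) and the hardware's three-part view. The helper split_linear and the two layout constants are names invented for this sketch; they are not kernel identifiers.

```python
def split_linear(address, widths):
    """Split an address into fields, given bit widths from the most
    significant field to the least significant one."""
    fields = []
    shift = sum(widths)
    for width in widths:
        shift -= width
        # A width of 0 gives a mask of 0, so a folded level is always 0.
        fields.append((address >> shift) & ((1 << width) - 1))
    return tuple(fields)


SOFTWARE_VIEW = (10, 0, 0, 10, 12)  # PGD, PUD, PMD, PT, offset
HARDWARE_VIEW = (10, 10, 12)        # page directory, page table, offset

pgd, pud, pmd, pt, offset = split_linear(0x08147258, SOFTWARE_VIEW)
print(pgd, pud, pmd, pt, hex(offset))            # 32 0 0 327 0x258
print(split_linear(0x08147258, HARDWARE_VIEW))   # (32, 327, 600)
```

Both views agree on the directory index, the table index and the offset; the two folded levels contribute nothing to the lookup.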
