Linux kernel: segmented and paged memory management

Source: Internet
Author: User

1. Concepts
Physical address
The address used for cell-level addressing of the memory chips; it corresponds to the address bus that connects the processor to memory.
--This should be the easiest of these concepts to understand, but one point is worth mentioning. A physical address can be pictured as the memory plugged into the machine, viewed as one large array numbered byte by byte from 0 up to the highest byte, with the array index being the physical address. In fact this is only an abstraction that the hardware presents to software; memory is not really addressed that way. So it is more accurate to say that the physical address "corresponds to the address bus". Setting aside how physical memory is actually addressed, though, it is acceptable to map physical addresses one-to-one onto physical memory. The slightly wrong picture is perhaps the more convenient abstraction.
Virtual memory
This describes memory as a whole (not the sticks plugged into the machine). It is defined relative to physical memory and can be understood as "indirect" or "fake" memory. For example, the memory address 0x08000000 does not refer to element 0x08000000-1 of the large physical-address array.
This is because modern operating systems provide an abstraction for memory management: virtual memory. A process uses addresses in virtual memory, and the operating system, assisted by the relevant hardware, "converts" them into real physical addresses. This "conversion" is the key to everything discussed here.
With this abstraction, a program can use an address space much larger than real physical memory (robbing Peter to pay Paul; banks do the same), and multiple processes can even use the same address. That is not surprising, since the physical addresses they are converted to are different.
--If you disassemble a linked program, you will see that the linker has already assigned addresses to it. For example, to call a function A, the code is not "call A" but "call 0x08111111"; the address of function A is already fixed. Without such a "conversion", that is, without the concept of a virtual address, this would not work at all.
Enough; if this goes on, it will never end.
Logical address
For compatibility, Intel retained the segmented memory management of its earlier processors. A logical address is the address used in a machine-language instruction to specify an operand or an instruction. In the example above, the address 0x08111111 that the linker assigned to A is a logical address.
--Strictly speaking, though, this does not match Intel's definition of a logical address: "a logical address consists of a segment identifier plus an offset within that segment, written as [segment identifier: offset within segment]". So the 0x08111111 in the example above should really be written as [code segment identifier of A: 0x08111111] to be complete.
Linear address, also called virtual address
Like a logical address, it is not a real address. If the logical address is the address before conversion by the hardware platform's segment management unit, then the linear address is the address before conversion by the hardware's paging unit.
-------------------------------------------------------------
The CPU converts an address in the virtual address space into a physical address in two steps. First, given a logical address (in practice, an offset within a segment; be sure to understand this!), the CPU uses its segment memory management unit to convert the logical address into a linear address. It then uses its paging memory management unit to convert the linear address into the final physical address.
Doing two conversions is cumbersome and unnecessary, because the linear address could simply be handed to the process directly. The redundancy exists purely because Intel maintains backward compatibility.
2. CPU segment memory management: how a logical address is converted to a linear address
A logical address consists of two parts, [segment identifier: offset within segment]. The segment identifier is a 16-bit field called the segment selector. Its high 13 bits are an index number. The low 3 bits hold some hardware details:

The lowest two bits are used for privilege checks, which this post does not cover.
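As a rough illustration, the 16-bit selector layout just described can be packed and unpacked with a few shifts and masks. This is only a sketch: the struct and function names are made up for the example, and the single bit between the index and the lowest two bits is the table indicator (TI).

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative names; not taken from any kernel header. */
struct selector_fields {
    unsigned index;  /* bits 15..3: index into a descriptor table */
    unsigned ti;     /* bit 2: table indicator */
    unsigned rpl;    /* bits 1..0: used for privilege checks */
};

/* Build a 16-bit selector from its three fields. */
uint16_t make_selector(unsigned index, unsigned ti, unsigned rpl)
{
    assert(index < (1u << 13) && ti < 2u && rpl < 4u);
    return (uint16_t)((index << 3) | (ti << 2) | rpl);
}

/* Split a 16-bit selector back into its three fields. */
struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;
    f.index = sel >> 3;
    f.ti    = (sel >> 2) & 0x1u;
    f.rpl   = sel & 0x3u;
    return f;
}
```

For instance, make_selector(14, 0, 3) gives 14 * 8 + 3 = 0x73.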

The index number can be understood directly as an array subscript, so it must correspond to an array. What does it index? The "segment descriptor". A segment descriptor describes one segment in detail (for the word "segment", I picture taking a knife to virtual memory and cutting it into several pieces). A number of segment descriptors together form an array called the "segment descriptor table", so the high 13 bits of the segment selector locate a specific segment descriptor directly in the segment descriptor table. That descriptor describes a segment. My picture of a segment above is not very accurate; look at what the descriptor actually contains, that is, how it describes a segment, to understand what a segment really is.

Each segment descriptor consists of 8 bytes, as shown below:



These fields are complicated, and a data structure could be defined to hold them, but only one matters here: the Base field, which gives the linear address at which the segment starts.
Intel's design places global segment descriptors in the Global Descriptor Table (GDT), and local ones, such as those belonging to a single process, in a Local Descriptor Table (LDT). When is the GDT used and when the LDT? This is indicated by the TI field in the segment selector: TI=0 selects the GDT, TI=1 selects the LDT.
The address and size of the GDT in memory are held in the CPU's GDTR register, and those of the LDT in the LDTR register.
That is a lot of concepts, almost a tongue twister. The figure below is more intuitive.


First, given a complete logical address [segment selector: offset within segment]:

1. Check whether TI in the segment selector is 0 or 1 to learn whether the segment to convert is in the GDT or the LDT, then read the table's address and size from the corresponding register. This gives us an array.
2. Take the high 13 bits of the segment selector and look up the corresponding segment descriptor in that array. Its Base field, the base address, is now known.
3. Base + offset is the linear address.
This is quite simple. For software, in principle, it only needs to prepare the information the hardware requires and let the hardware do the conversion. OK, let's see what Linux does.
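The three steps can be sketched in C roughly as follows. This is a simplified model, not how any real CPU or kernel is written: the descriptor keeps only its Base field, the two tables are ordinary arrays standing in for whatever GDTR and LDTR point at, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified descriptor: only the Base field is modeled. */
struct seg_desc {
    uint32_t base;   /* linear address where the segment starts */
};

/* Hypothetical stand-in for a table located through GDTR or LDTR. */
struct desc_table {
    const struct seg_desc *entries;
    size_t count;
};

/* Translate [selector : offset] into a linear address. */
uint32_t logical_to_linear(const struct desc_table *gdt,
                           const struct desc_table *ldt,
                           uint16_t selector, uint32_t offset)
{
    /* Step 1: TI (bit 2) chooses the table: 0 = GDT, 1 = LDT. */
    const struct desc_table *table = (selector & 0x4u) ? ldt : gdt;

    /* Step 2: the high 13 bits index into that table. */
    size_t index = (size_t)(selector >> 3);
    assert(index < table->count);  /* real hardware raises a fault instead */

    /* Step 3: base + offset is the linear address. */
    return table->entries[index].base + offset;
}
```

Privilege and limit checks, which the hardware also performs at this point, are left out on purpose.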
3. Linux Segment Management
Intel requires two conversions. That preserves compatibility but is redundant; still, the hardware demands it, so software can only comply, however much of a formality it is.
On the other hand, some other hardware platforms have no notion of a second conversion, and Linux has to provide a higher-level abstraction with a unified interface. So Linux's segment management really just "humors" the hardware.
Intel's intent was for global data to use the GDT and for each process to use its own LDT. Linux, however, uses the same segments for instruction and data addressing in all processes: the user data segment and user code segment in user mode, and correspondingly the kernel data segment and kernel code segment in the kernel. There is nothing strange about this; it is a formality, like writing a year-end summary.
include/asm-i386/segment.h
#define GDT_ENTRY_DEFAULT_USER_CS 14
#define __USER_CS (GDT_ENTRY_DEFAULT_USER_CS * 8 + 3)
#define GDT_ENTRY_DEFAULT_USER_DS 15
#define __USER_DS (GDT_ENTRY_DEFAULT_USER_DS * 8 + 3)
#define GDT_ENTRY_KERNEL_BASE 12
#define GDT_ENTRY_KERNEL_CS (GDT_ENTRY_KERNEL_BASE + 0)
#define __KERNEL_CS (GDT_ENTRY_KERNEL_CS * 8)
#define GDT_ENTRY_KERNEL_DS (GDT_ENTRY_KERNEL_BASE + 1)
#define __KERNEL_DS (GDT_ENTRY_KERNEL_DS * 8)

Replacing the macros with their values:
#define __USER_CS 115 [0000000001110 0 11]
#define __USER_DS 123 [0000000001111 0 11]
#define __KERNEL_CS 96 [0000000001100 0 00]
#define __KERNEL_DS 104 [0000000001101 0 00]
The square brackets hold the 16-bit binary representation of each of the four segment selectors, split into index, TI and the low two bits, so their index numbers and TI field values can be read off:
__USER_CS index=14 TI=0
__USER_DS index=15 TI=0
__KERNEL_CS index=12 TI=0
__KERNEL_DS index=13 TI=0
TI is 0 in all four, which means the GDT is used. Entries 12-15 of the GDT are initialized in arch/i386/head.S:
.quad 0x00cf9a000000ffff /* 0x60 kernel 4GB code at 0x00000000 */
.quad 0x00cf92000000ffff /* 0x68 kernel 4GB data at 0x00000000 */
.quad 0x00cffa000000ffff /* 0x73 user 4GB code at 0x00000000 */
.quad 0x00cff2000000ffff /* 0x7b user 4GB data at 0x00000000 */
Following the segment descriptor layout described earlier, expanding these values shows that the Base bits (bits 16-39 and bits 56-63) are all 0, that is, the base address of all four segments is 0.
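To check this by hand, the Base and limit fields can be pulled out of each 8-byte value. The sketch below assumes the standard x86 descriptor layout (limit in bits 0-15 and 48-51, Base in bits 16-39 and 56-63, granularity flag in bit 55); the function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Base: bits 16..39 hold base[23:0], bits 56..63 hold base[31:24]. */
uint32_t desc_base(uint64_t d)
{
    uint32_t low  = (uint32_t)((d >> 16) & 0xFFFFFFu);
    uint32_t high = (uint32_t)((d >> 56) & 0xFFu);
    return (high << 24) | low;
}

/* Limit: bits 0..15 hold limit[15:0], bits 48..51 hold limit[19:16]. */
uint32_t desc_limit(uint64_t d)
{
    uint32_t low  = (uint32_t)(d & 0xFFFFu);
    uint32_t high = (uint32_t)((d >> 48) & 0xFu);
    return (high << 16) | low;
}

/* Segment size in bytes; bit 55 (G) switches the limit unit to 4KB. */
uint64_t desc_size_bytes(uint64_t d)
{
    uint64_t units = (uint64_t)desc_limit(d) + 1u;
    return ((d >> 55) & 1u) ? units * 4096u : units;
}

/* Self-check against the four GDT entries quoted above:
 * each should start at linear address 0 and span 4GB. */
void check_linux_gdt_entries(void)
{
    static const uint64_t entries[4] = {
        0x00cf9a000000ffffULL, 0x00cf92000000ffffULL,
        0x00cffa000000ffffULL, 0x00cff2000000ffffULL
    };
    for (int i = 0; i < 4; i++) {
        assert(desc_base(entries[i]) == 0u);
        assert(desc_size_bytes(entries[i]) == (1ULL << 32));
    }
}
```

The four entries differ only in their access byte (bits 40-47), which is why Base and limit come out identical for all of them.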
So, given an offset within a segment, the earlier conversion formula gives 0 + offset = linear address, which leads to an important conclusion: "under Linux, the logical address and the linear address always agree (agree in value; they are not, as some people say, the same thing), that is, the value of the offset field of the logical address is always equal to the value of the linear address."
Many details, such as segment permission checks, have been ignored here.
In Linux, most processes do not use an LDT, unless you are using Wine to run Windows programs.
4. CPU paged memory management
The CPU's paging unit is responsible for translating a linear address into a physical address. For ease of management and efficiency, the linear address space is divided into fixed-length units called pages. On a 32-bit machine, for example, the linear address space is at most 4GB; with 4KB pages, the whole linear address space becomes one large array total_page[2^20] with 2^20 pages in total. We call this large array the page directory. Each entry in the directory is an address: the address of the corresponding page.
There is another kind of "page", which we call a physical page, or page frame. The paging unit divides all of physical memory into fixed-length management units, and their length generally corresponds one-to-one with that of a memory page.
Note that this total_page array has 2^20 members and each member is an address (4 bytes on a 32-bit machine), so the array alone would take up 4MB of memory. To save space, a two-level scheme for organizing the paging unit was introduced. Describing it in words is tiring; the figure is more intuitive:


As the figure shows:
1. In the paging unit the page directory is unique. Its address is held in the CPU's CR3 register, which is the starting point of address translation. The long march begins here.
2. Each active process has its own virtual memory (and its own unique page directory), so each also has its own page directory address. To run a process, its page directory address must be loaded into the CR3 register, and the other process's address is saved.
3. Each 32-bit linear address is divided into three parts, page directory index (10 bits): page table index (10 bits): offset (12 bits).
The conversion proceeds in the following steps:
1. Take the process's page directory address from CR3 (when the operating system schedules a process, it loads this address into the register);
2. Use the high 10 bits of the linear address as an index into the page directory array. Because of the two-level scheme, the entry found there is no longer the address of a page but the address of a page table. (A second array, the page table, was introduced, and the addresses of pages are stored in it.)
3. Use the middle 10 bits of the linear address as an index into the page table (also an array) to find the starting address of the page;
4. Add the page's starting address to the low 12 bits of the linear address to get the physical address we were after.
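The four steps can be simulated in a few lines of C. This is only a model for illustration: a directory slot is a C pointer standing in for the physical address of a page table, present bits and protection flags are ignored, and the type and function names are invented here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ENTRIES   1024u
#define ADDR_MASK 0xFFFFF000u   /* high 20 bits of an entry */

/* Simplified model: an entry keeps a page start address in its high 20 bits. */
struct page_table {
    uint32_t entry[ENTRIES];
};

/* Simplified model: NULL means no page table covers this 4MB region. */
struct page_directory {
    const struct page_table *table[ENTRIES];
};

/* Walk the two levels. Returns 0 and stores the physical address on success,
 * or -1 if the directory has no page table for this address. */
int translate(const struct page_directory *cr3, uint32_t linear, uint32_t *phys)
{
    assert(cr3 != NULL && phys != NULL);

    uint32_t dir_index   = linear >> 22;             /* high 10 bits */
    uint32_t table_index = (linear >> 12) & 0x3FFu;  /* middle 10 bits */
    uint32_t offset      = linear & 0xFFFu;          /* low 12 bits */

    const struct page_table *pt = cr3->table[dir_index];       /* step 2 */
    if (pt == NULL)
        return -1;

    uint32_t page_start = pt->entry[table_index] & ADDR_MASK;  /* step 3 */
    *phys = page_start + offset;                               /* step 4 */
    return 0;
}
```

Passing the directory in as a parameter plays the role of step 1, reading CR3.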
This conversion process is quite simple and is done entirely in hardware. It takes an extra step, but it saves a lot of memory, so it is worth it. A quick check:
1. The two-level scheme can still address 4GB:
Entries in the page directory: 2^10, which means there are that many page tables;
Pages per page table: 2^10;
Bytes addressable within each page: 2^12.
That is still 2^32 = 4GB.
2. The two-level scheme really does save space:
The page directory occupies 2^10 * 4 bytes = 4KB and one page table occupies 2^10 * 4 bytes = 4KB, so the directory plus a single page table take 8KB. With all 2^10 page tables present the total would be 4KB + 4MB, slightly more than the single 4MB array; the saving comes from allocating page tables only for the parts of the address space that are actually in use.
The figure originally given here, (2^10 * 4 + 2^10) = 8KB, was wrong; I have marked it, and it is discussed later in the thread.
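The space figures are easy to get wrong, so here is a small calculation as a sanity check. It assumes 4-byte entries and 2^10 entries per table as described above; the macro and function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

#define ENTRY_BYTES 4u                       /* one address on a 32-bit machine */
#define ENTRIES     (1u << 10)               /* entries per directory or page table */
#define TABLE_BYTES (ENTRIES * ENTRY_BYTES)  /* 4KB */

/* One flat array with an entry for each of the 2^20 pages. */
uint32_t flat_table_bytes(void)
{
    return (1u << 20) * ENTRY_BYTES;
}

/* The page directory plus however many page tables are actually allocated. */
uint32_t two_level_bytes(uint32_t page_tables_in_use)
{
    assert(page_tables_in_use <= ENTRIES);
    return TABLE_BYTES + page_tables_in_use * TABLE_BYTES;
}
```

With one page table in use the two-level layout costs 8KB against 4MB for the flat array; only when every one of the 1024 page tables is allocated does it cost slightly more (4MB + 4KB).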
It is worth mentioning that although page directory entries and page table entries are both 4 bytes (32 bits), only the high 20 bits are used as an address; the low 12 bits are masked to 0 when the entry is used as an address. Masking the low 12 bits of a page table entry is easy to understand: the result then lines up with the page size, every page address is a whole multiple of 4KB, and the arithmetic is much simpler. But why mask the low 12 bits of a page directory entry as well? By the same reasoning it might seem enough to mask only its low 10 bits, but I think that, since 12 > 10, masking 12 lets the page directory and the page table use the same data structure, which is convenient. (A page table holds 2^10 entries of 4 bytes each, exactly 4KB, and is itself aligned on a 4KB boundary, so the low 12 bits of its address are 0 in any case.)
This post only covers the general principle of the conversion. Extended paging, the page protection mechanism, PAE paging and other such complications are not covered here; refer to other professional books.
5. Linux paged memory management
In principle, Linux only needs to allocate the required data structures for each process, place them in memory, and switch the CR3 register when it schedules a process; the hardware does the rest.
(Hehe, it is actually much more complicated, but I only analyze the most basic flow.)
The above describes the two-level page management architecture of the i386, but some CPUs use three-level or even four-level architectures. To provide a higher-level abstraction with a unified interface for every CPU, Linux provides a four-level page management architecture that is compatible with two-, three- and four-level hardware. The four levels are:
Page Global Directory, PGD (corresponds to the page directory above)
Page Upper Directory, PUD (newly introduced)
Page Middle Directory, PMD (also newly introduced)
Page Table, PT (corresponds to the page table above).
The whole conversion follows the same principle as the hardware conversion, with just two more array lookups, as shown in the figure:


So how do 32-bit hardware with a two-level architecture and this four-level conversion work together? Well, let's see how the linear address is divided in that case!
From the hardware's point of view, a 32-bit address is divided into three parts; whatever the software does, the hardware that finally performs the translation recognizes only these three.
From the software's point of view, two more parts have been introduced, for five parts in total. Making two-level hardware cope with five parts is easy: when the address is divided, the lengths of the Page Upper Directory and Page Middle Directory fields are simply set to 0.
This way the operating system sees five parts while the hardware keeps its rigid three-part division, and nothing goes wrong. In other words, we have built a harmonious computer system.
Although this is superfluous here, on a CPU with 64-bit addresses and a four-level conversion architecture the middle two fields are no longer set to 0, and software and hardware are in harmony once again. Abstraction is powerful!
For example, suppose a logical address has been converted to the linear address 0x08147258. In binary that is:
0000100000 0101000111 001001011000
The kernel divides this address as:
PGD = 0000100000
PUD = 0
PMD = 0
PT = 0101000111
offset = 001001011000
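The same split can be reproduced with shifts and masks. In this sketch the five field widths are written out explicitly, with the PUD and PMD widths set to 0 as described for two-level hardware; the constant and function names are hypothetical and are not the kernel's own macros.

```c
#include <assert.h>
#include <stdint.h>

/* Field widths for two-level 32-bit hardware: the middle two are folded to 0. */
enum { PGD_BITS = 10, PUD_BITS = 0, PMD_BITS = 0, PT_BITS = 10, OFFSET_BITS = 12 };

struct linear_parts {
    uint32_t pgd, pud, pmd, pt, offset;
};

/* Take `bits` bits of addr starting at bit `shift`; a 0-bit field yields 0. */
static uint32_t field(uint32_t addr, unsigned shift, unsigned bits)
{
    return (addr >> shift) & ((1u << bits) - 1u);
}

struct linear_parts split_linear(uint32_t addr)
{
    struct linear_parts p;
    unsigned shift = 0;

    assert(PGD_BITS + PUD_BITS + PMD_BITS + PT_BITS + OFFSET_BITS == 32);

    p.offset = field(addr, shift, OFFSET_BITS); shift += OFFSET_BITS;
    p.pt     = field(addr, shift, PT_BITS);     shift += PT_BITS;
    p.pmd    = field(addr, shift, PMD_BITS);    shift += PMD_BITS;
    p.pud    = field(addr, shift, PUD_BITS);    shift += PUD_BITS;
    p.pgd    = field(addr, shift, PGD_BITS);
    return p;
}
```

For 0x08147258 this gives pgd = 32, pud = 0, pmd = 0, pt = 327 (0x147) and offset = 0x258, matching the binary fields listed above.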
Now for the trick Linux plays on the hardware. Because the hardware cannot see the so-called PUD and PMD, an entry found through the PGD index must in essence correspond directly to the address of a PT, rather than leading to array lookups in a PUD and a PMD (although both have fields in the linear address, their length is 0 and 2^0 = 1, so each is an array with a single element). How, then, does the kernel arrange the addresses properly?
From the software's point of view, since each has only one 32-bit entry, that entry can hold an address pointer of exactly the same length as the one in the PGD. So the mapping steps "through the PUD, then through the PMD" leave the value unchanged; it is simply passed from hand to hand. This achieves the effect of "logically pointing to a PUD and then to a PMD, while physically pointing directly to the corresponding PT, because the hardware does not know that any such thing as a PUD or PMD exists".
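One way to picture this hand-to-hand passing is a set of lookup functions in which the two middle levels simply return the entry they were given. This is a sketch of the idea only; the names and the single entry type are invented for the example, and the kernel's real types and helpers differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint32_t val; } entry_t;

/* PGD level: a real array lookup using the high 10 bits. */
entry_t *pgd_lookup(entry_t *pgd_table, uint32_t addr)
{
    assert(pgd_table != NULL);
    return &pgd_table[addr >> 22];
}

/* PUD level, folded: its index field is 0 bits wide, so the one-element
 * "table" is the PGD entry itself and the lookup hands it straight back. */
entry_t *pud_lookup(entry_t *pgd_entry, uint32_t addr)
{
    (void)addr;
    return pgd_entry;
}

/* PMD level, folded in the same way. */
entry_t *pmd_lookup(entry_t *pud_entry, uint32_t addr)
{
    (void)addr;
    return pud_entry;
}

/* The high 20 bits of the final entry are the address of the page table. */
uint32_t page_table_address(const entry_t *pmd_entry)
{
    return pmd_entry->val & 0xFFFFF000u;
}
```

Software still makes four calls, one per level, but the value that comes out of the PMD step is exactly the PGD entry that the hardware reads as its page directory entry.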
Then it is handed to the hardware, which divides the address and sees:
Page directory = 0000100000
PT = 0101000111
offset = 001001011000
First, it uses 0000100000 (32) as an index into the page directory array, finds that element, and takes its high 20 bits to get the address of the page table (page table addresses are allocated dynamically by the kernel). Next, it uses 0101000111 (327) as an index into the page table to find the starting address of the page. Adding the offset then gives the final physical address.
