The Linux kernel embodies the programming wisdom of some of the world's best programmers. In operating systems class, the teacher describes the four major functions of an operating system: process scheduling, memory management, device drivers, and networking. Working in embedded software development, I deal with device drivers and networking a lot, but I rarely touch process scheduling or memory management, which I have mostly admired from a distance.
As I understand it, making deeper technical progress in kernel development requires some understanding of core technologies such as memory management and process scheduling. But these two areas are the heart of the kernel. Day-to-day kernel development rarely touches them, so there are few practical problems to use as an entry point, and I have not found systematic material on the web that explains them, so learning them is not easy.
In the spirit of "if I don't go into hell, who will?", I have spent some time recently studying memory management. Combined with memory management problems I ran into in earlier work, I now have a rough understanding of the memory management framework. All I can say is that this small amount of understanding has made me respect the kernel even more. Never mind process scheduling: memory management alone would be enough for a 500-page book!
As I understand it, memory management can be divided into two large parts: the page table mechanism and the memory allocation mechanisms. So far my study has only given me some understanding of the page table mechanism. I will first write a few articles covering the page table mechanism in detail, and then move on to the memory allocation mechanisms, such as bootmem, the buddy system, and slab.
Process scheduling is an even deeper pit than memory management. If I am lucky enough to climb out of the memory management pit, I will jump into the process scheduling pit next.
Suddenly the burden on my shoulders feels heavy. Perhaps this is what blogging is for: sharing what I know with everyone, and letting everyone push me to keep studying.
I am only a beginner with a few years of work experience, and in trying to analyze such advanced material I will certainly make mistakes. But the point of sharing is for all of us to separate the true from the false together and improve. So I hope you will offer plenty of comments, and that we can make progress together!
Because memory management is fairly abstract, I study it according to three principles:
(1) Study with questions in mind: raise some questions and use them as the entry point.
(2) Draw diagrams of abstract concepts to make the framework clearer.
(3) Use concrete examples: study memory management on real devices wherever possible.
This series of articles is based on the ARM architecture and Linux kernel version 3.4.55.
Put simply, the Linux kernel page table mechanism manages the mappings, dynamic or static, between a device's real physical addresses and virtual addresses. It is built on the hardware MMU (memory management unit): the processor must provide an MMU for the Linux page table mechanism to work, and the two complement each other.
Trying to learn kernel memory management from the software logic alone, without the hardware principles of the MMU, is really hard. In the end, software logic serves the hardware and exists to make full use of what the hardware can do. So when learning Linux memory management, the first step is to learn how the MMU works on the processor architecture in question, which helps a great deal in understanding the logic of the page table mechanism. (For a low-level software engineer, flipping through datasheets in spare moments is very useful, as is thinking about problems from the hardware side.)
The MMU is hardware logic inside the processor core, so it is documented in detail only in the core's datasheet. The MMU logic is similar across ARM processor versions. I have an ARM920T manual on hand, and after reading its MMU chapter carefully I had the following questions to resolve:
Question 1: The MMU uses the TLB to translate a VA (virtual address) into a PA (physical address), and the addresses issued by the processor are matched directly in the TLB. But the kernel sets up its page table in memory during initialization. How is that page table related to the TLB?
The ARM920T MMU has a 64-entry instruction TLB and a 64-entry data TLB, and virtual-to-physical translation for processor accesses is done by matching against TLB entries. During initialization, however, the kernel creates the page table swapper_pg_dir in memory (this process is described in another of my blog posts: http://blog.csdn.net/skyflying2012/article/details/41447843) and writes its address to a CP15 register. What does this page table have to do with ARM's TLB?
I found the answer in the MMU chapter of the ARM920T manual:
When the CPU accesses a VA (virtual address), the TLB hardware translates the VA into a PA (physical address). If the TLB has no entry for that VA, the MMU's translation table walk hardware looks up the in-memory page table whose base address is held in CP15 register c2, performs the translation, and obtains the PA for the access. The page table information it found is then loaded into the TLB. So the page table and the TLB are not the same thing: the TLB is a hardware cache of the in-memory page table!
This means the ARM MMU does not translate addresses with the TLB alone. It can also parse the page table in memory and translate from it, while the TLB holds the translations the CPU has used recently. The TLB is fast, which speeds up address translation.
If the page table has no mapping for the VA either, the MMU raises an abort exception to the CPU (a data abort or a prefetch abort, depending on whether the access was a data access or an instruction fetch), and the page table is populated in the exception handler.
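To make the relationship concrete, here is a minimal toy model in C of the lookup order described above: TLB first, then a walk of the in-memory table, then an abort. It is only a sketch under my own assumptions. The names toy_mmu and toy_translate are hypothetical, only 1MB section entries are modeled, and the round-robin TLB replacement is a simplification I chose, not something taken from the manual.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TLB_ENTRIES   64u    /* the ARM920T has 64 entries per TLB */
#define TABLE_ENTRIES 4096u  /* one first-level entry per 1MB of VA space */

/* Hypothetical toy model, not kernel or hardware code. */
struct tlb_entry {
    bool     valid;
    uint32_t vbase;  /* virtual base of the cached 1MB mapping */
    uint32_t pbase;  /* physical base of the cached 1MB mapping */
};

struct toy_mmu {
    struct tlb_entry tlb[TLB_ENTRIES];
    uint32_t table[TABLE_ENTRIES];  /* the "in-memory page table" */
    unsigned next_victim;           /* round-robin refill (assumption) */
    unsigned walks;                 /* number of table walks performed */
};

enum xlat_result { XLAT_TLB_HIT, XLAT_WALKED, XLAT_ABORT };

enum xlat_result toy_translate(struct toy_mmu *mmu, uint32_t va, uint32_t *pa)
{
    assert(mmu != NULL && pa != NULL);

    uint32_t vbase  = va & 0xFFF00000u;
    uint32_t offset = va & 0x000FFFFFu;

    /* 1. Try the TLB first. */
    for (unsigned i = 0; i < TLB_ENTRIES; i++) {
        if (mmu->tlb[i].valid && mmu->tlb[i].vbase == vbase) {
            *pa = mmu->tlb[i].pbase | offset;
            return XLAT_TLB_HIT;
        }
    }

    /* 2. TLB miss: walk the page table in memory. */
    mmu->walks++;
    uint32_t desc = mmu->table[va >> 20];
    if ((desc & 0x3u) != 0x2u)
        return XLAT_ABORT;  /* 3. no mapping: the CPU would take an abort */

    /* 4. Cache the result in the TLB so the next access hits. */
    struct tlb_entry *e = &mmu->tlb[mmu->next_victim];
    mmu->next_victim = (mmu->next_victim + 1u) % TLB_ENTRIES;
    e->valid = true;
    e->vbase = vbase;
    e->pbase = desc & 0xFFF00000u;

    *pa = e->pbase | offset;
    return XLAT_WALKED;
}
```

In this model software only ever writes mmu->table, never mmu->tlb, which mirrors the point that the kernel maintains the page table and the hardware refills the TLB from it.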
This also explains why create_mapping (the key memory mapping function used during kernel initialization) and the page fault handler do_page_fault both update the in-memory page table rather than the TLB: the ARM MMU itself reads the in-memory page table!
Besides, operating on the TLB directly on ARM is more complicated than manipulating the in-memory page table, so the kernel leaves it to the MMU to refill the TLB from the in-memory page table.
Of course, the in-memory page table that the ARM MMU walks has a fixed format, which brings us to the next question.
Question 2: The kernel code shows that ARM Linux uses a two-level page table mapping. How does the ARM MMU hardware perform the VA-to-PA translation?
How does the ARM MMU use the in-memory page table to translate addresses? The manual has a diagram that lists the several ways the MMU can walk the page table, as follows:
If you fully understand this diagram, you fully understand address translation in the ARM MMU hardware. It shows that the ARM MMU can translate addresses in several ways, which fall into two groups: section-mapping and page-mapping. The Linux two-level page table belongs to page-mapping, but the Linux kernel uses both kinds of mapping, as later articles will show.
The page table address we give the CPU (written to CP15 c2) is the address of the first-level page table, also called the page directory. This table has 4,096 entries of 4 bytes each, and each entry maps 1MB of address space, so a 16KB page table covers the full 4GB that a 32-bit CPU can address.
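The arithmetic behind these sizes can be checked at compile time. The following is a small sketch in C. The macro names are my own and are not taken from the kernel headers.

```c
#include <assert.h>

#define L1_ENTRIES      4096u         /* 2^12 entries, indexed by VA[31:20] */
#define L1_ENTRY_BYTES  4u            /* one 32-bit descriptor per entry */
#define L1_ENTRY_SPAN   (1ull << 20)  /* each entry maps 1MB */

#define L1_TABLE_BYTES  (L1_ENTRIES * L1_ENTRY_BYTES)
#define L1_TOTAL_SPAN   (L1_ENTRIES * L1_ENTRY_SPAN)

/* 4096 entries x 4 bytes = 16KB of table. */
static_assert(L1_TABLE_BYTES == 16u * 1024u, "first-level table is 16KB");

/* 4096 entries x 1MB = 4GB, the whole 32-bit address space. */
static_assert(L1_TOTAL_SPAN == (1ull << 32), "4096 x 1MB covers 4GB");
```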
As you would expect, indexing 4,096 entries takes only the upper 12 bits of the 32-bit virtual address. The CPU takes the page directory base address (TTB) and adds the upper 12 bits of the virtual address, used as an index scaled by the 4-byte entry size, to get the address of the page directory entry for that virtual address. This step is the same for section-mapping and page-mapping. The two are distinguished by the lowest 2 bits of the page directory entry, as follows:
The MMU decides what to do next from the lowest 2 bits of the page directory entry. If they are 00, the entry is invalid and the MMU raises a fault to the CPU. A value of 10 means section-mapping, while 01 and 11 mean page-mapping through a coarse page table or a fine page table respectively. The two differ in the page sizes the second-level table can map: 64KB or 4KB pages for a coarse page table, and additionally 1KB pages for a fine page table. The Linux kernel uses 4KB pages, so this article concentrates on 4KB pages in a coarse page table.
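The first-level lookup and the type check on the low 2 bits can be sketched in C as follows. This is an illustration under my own assumptions: the function and enum names are hypothetical, and the example addresses in the usage note are made-up values for a board whose RAM starts at 0x30000000.

```c
#include <assert.h>
#include <stdint.h>

/* Encoding of bits [1:0] of a first-level descriptor. */
enum l1_type {
    L1_FAULT   = 0,  /* 00: invalid entry, access faults */
    L1_COARSE  = 1,  /* 01: points to a coarse second-level table */
    L1_SECTION = 2,  /* 10: 1MB section mapping */
    L1_FINE    = 3   /* 11: points to a fine second-level table */
};

/* Address of the first-level descriptor for va: TTB + VA[31:20] * 4. */
uint32_t l1_desc_addr(uint32_t ttb, uint32_t va)
{
    assert((ttb & 0x3FFFu) == 0);  /* the 16KB table is 16KB aligned */
    return ttb + ((va >> 20) << 2);
}

/* Classify a first-level descriptor by its lowest 2 bits. */
enum l1_type l1_desc_type(uint32_t desc)
{
    return (enum l1_type)(desc & 0x3u);
}
```

For example, with a TTB of 0x30004000 and a VA of 0xC0008000, the index is 0xC00 and the descriptor address is 0x30004000 + 0xC00 * 4 = 0x30007000.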
Next, let's look at how address translation works in detail for section-mapping and page-mapping.
1 section-mapping
The flow is shown in the following diagram:
This diagram shows clearly how section-mapping works. The MMU adds the index formed from the upper 12 bits of the VA to the TTB to find the page directory entry. It sees that the lowest 2 bits of the entry are 10, which means section-mapping, so it takes the upper 12 bits of the page directory entry and concatenates them with the lower 20 bits of the virtual address. The result is the physical address, and the translation is complete.
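The final concatenation step of section-mapping fits in a few lines of C. This is only a sketch. The function name section_va_to_pa is hypothetical, and the descriptor value in the usage note is a made-up example.

```c
#include <assert.h>
#include <stdint.h>

/* Translate va through a first-level section descriptor.
 * PA = descriptor[31:20] concatenated with VA[19:0]. */
uint32_t section_va_to_pa(uint32_t l1_desc, uint32_t va)
{
    assert((l1_desc & 0x3u) == 0x2u);  /* low bits 10: section-mapping */
    return (l1_desc & 0xFFF00000u) | (va & 0x000FFFFFu);
}
```

For example, with a descriptor of 0x30000C1E and a VA of 0xC0012345, the section base is 0x30000000 and the offset is 0x12345, so the PA is 0x30012345.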
2 page-mapping
The flow is shown in the following diagram:
This diagram shows how a 4KB page is translated with page-mapping, a two-level page table method. It can be broken down into 5 steps:
(1) The MMU takes the TTB (page directory base address) from CP15 c2 and adds the index formed from the upper 12 bits of the VA (virtual address), which gives the address of the page directory entry for the VA. It reads the entry's value from there.
(2) The MMU checks the lowest 2 bits of the page directory entry. They are 01, which indicates that this 1MB region is mapped by page-mapping through a coarse page table, here with 4KB small pages.
(3) The MMU takes the upper 22 bits of the page directory entry as the second-level page table base address (the page table is 256 x 4 bytes = 1KB, so its base is 1KB aligned). It adds the index formed from the middle 8 bits of the VA, bits [19:12], to get the address of the page table entry for the VA, and reads the page table entry value from there.
(4) The MMU takes the upper 20 bits of the page table entry value, which are the physical base address of the 4KB page, and combines them with the lower 12 bits of the VA (the offset within the 4KB page) to get the physical address for the VA.
(5) The MMU accesses that physical address to perform the read or write requested by the CPU.
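The five steps above can be sketched as a small software walk over a simulated block of physical memory. This is a toy model under my own assumptions, not how the hardware or the kernel is written. The toy_ram array, the helper names, and the addresses in the usage note are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_RAM_BASE  0x30000000u           /* assumed start of physical RAM */
#define TOY_RAM_WORDS (32u * 1024u / 4u)    /* 32KB of simulated RAM */

uint32_t toy_ram[TOY_RAM_WORDS];

/* Read one 32-bit word of simulated physical memory. */
uint32_t toy_read32(uint32_t pa)
{
    assert(pa >= TOY_RAM_BASE && (pa - TOY_RAM_BASE) / 4u < TOY_RAM_WORDS);
    return toy_ram[(pa - TOY_RAM_BASE) / 4u];
}

/* Write one 32-bit word of simulated physical memory. */
void toy_write32(uint32_t pa, uint32_t value)
{
    assert(pa >= TOY_RAM_BASE && (pa - TOY_RAM_BASE) / 4u < TOY_RAM_WORDS);
    toy_ram[(pa - TOY_RAM_BASE) / 4u] = value;
}

/* Walk a coarse page table for a 4KB small page.
 * Returns false where the real MMU would raise an abort. */
bool small_page_walk(uint32_t ttb, uint32_t va, uint32_t *pa)
{
    /* (1) First-level descriptor at TTB + VA[31:20] * 4. */
    uint32_t l1 = toy_read32(ttb + ((va >> 20) << 2));

    /* (2) Low bits 01 select a coarse second-level table. */
    if ((l1 & 0x3u) != 0x1u)
        return false;

    /* (3) Table base is descriptor[31:10]; index is VA[19:12]. */
    uint32_t l2 = toy_read32((l1 & 0xFFFFFC00u) + (((va >> 12) & 0xFFu) << 2));
    if ((l2 & 0x3u) != 0x2u)
        return false;  /* this sketch handles only 4KB small pages */

    /* (4) PA = entry[31:12] concatenated with VA[11:0]. */
    *pa = (l2 & 0xFFFFF000u) | (va & 0x00000FFFu);

    /* (5) The real MMU would now perform the access at *pa. */
    return true;
}
```

As a usage example, place the page directory at 0x30004000 and a coarse table at 0x30000400. Writing 0x30000401 to 0x30007000 (the directory entry for VA 0xC0001234) and 0x33FF0002 to 0x30000404 (the second-level entry with index 1) makes small_page_walk translate 0xC0001234 to 0x33FF0234.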
That is the translation logic for the two mapping methods. As you can see, for both section-mapping and page-mapping each first-level page table entry covers 1MB of address space, while with page-mapping each second-level page table entry maps one 4KB page.
Consequently, both first-level and second-level page table entries have many bits left over besides the physical address. These spare bits control access permissions and attributes for the mapped range, chiefly the AP (access permission) bits and the cache attribute bits. For section-mapping, the control bits are in the first-level page table entry (because there is only one level). The bits of a section-mapping first-level page table entry are defined as follows:
For page-mapping, the control bits are mainly in the second-level page table entry, which is defined as follows:
I will not describe these bits in detail here. They will be analyzed when specific problems come up later.
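As a rough guide in the meantime, the control fields can be pulled out of a descriptor with shifts and masks. The sketch below uses the field positions as I understand them for ARMv4-style descriptors such as those of the ARM920T (AP at bits [11:10] and domain at bits [8:5] for a section; four 2-bit AP fields at bits [11:4] for a small page; C at bit 3 and B at bit 2 in both). Treat these positions as an assumption to verify against the manual. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct section_attrs {
    uint32_t base;        /* bits [31:20]: physical section base */
    unsigned ap;          /* bits [11:10]: access permission */
    unsigned domain;      /* bits [8:5]: domain number */
    bool     cacheable;   /* bit 3: C */
    bool     bufferable;  /* bit 2: B */
};

struct small_page_attrs {
    uint32_t base;        /* bits [31:12]: physical page base */
    unsigned ap[4];       /* AP0 at [5:4] up to AP3 at [11:10], one per 1KB subpage */
    bool     cacheable;   /* bit 3: C */
    bool     bufferable;  /* bit 2: B */
};

/* Decode a first-level section descriptor (assumed layout, see above). */
struct section_attrs decode_section(uint32_t desc)
{
    assert((desc & 0x3u) == 0x2u);  /* low bits 10: section */
    struct section_attrs a;
    a.base       = desc & 0xFFF00000u;
    a.ap         = (desc >> 10) & 0x3u;
    a.domain     = (desc >> 5) & 0xFu;
    a.cacheable  = ((desc >> 3) & 0x1u) != 0;
    a.bufferable = ((desc >> 2) & 0x1u) != 0;
    return a;
}

/* Decode a second-level small page descriptor (assumed layout, see above). */
struct small_page_attrs decode_small_page(uint32_t desc)
{
    assert((desc & 0x3u) == 0x2u);  /* low bits 10: 4KB small page */
    struct small_page_attrs a;
    a.base = desc & 0xFFFFF000u;
    for (unsigned i = 0; i < 4; i++)
        a.ap[i] = (desc >> (4u + 2u * i)) & 0x3u;
    a.cacheable  = ((desc >> 3) & 0x1u) != 0;
    a.bufferable = ((desc >> 2) & 0x1u) != 0;
    return a;
}
```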
That completes the explanation of how the ARM MMU performs virtual address translation. With this hardware foundation, learning the Linux kernel page table mechanism will be much easier.
Next, we will dive into the Linux code and analyze it!
ARM-Linux Memory Management Study Notes (1): Hardware Principles of the Memory Page Table