ARM MMU page table framework
Let's start with a general block diagram of the ARM MMU page table structure (the discussion below expands on this diagram step by step):
The figure above shows ARM's typical page table organization, a two-level page table structure:
The first-level (L1) page table is indexed by 12 bits of the virtual address (bits[31:20]). It has 4096 entries of 4 bytes each, so the table is 16KB in size. The lowest 2 bits of each L1 entry distinguish the entry type; 2 bits can encode 4 entry types. Each entry has the following structure:
In short, there are two major categories of L1 page table entries:
the first category holds the base address of a second-level (L2) page table;
the second category points directly to a 1MB section of physical memory.
Each L1 entry can cover 1MB of memory, and since there are 4096 entries, the table can cover 4096 * 1MB = 4GB of address space in total.
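As a rough illustration (not taken from the article's figure, and simplified), the low two bits of an ARMv6/ARMv7 first-level descriptor select the entry type along these lines; consult the ARM Architecture Reference Manual for the authoritative encodings of your core:

/* Illustrative sketch of ARM first-level (L1) descriptor types, bits[1:0]. */
#define L1_TYPE_MASK        0x3
#define L1_TYPE_FAULT       0x0   /* translation fault                     */
#define L1_TYPE_PAGE_TABLE  0x1   /* points to a second-level (L2) table   */
#define L1_TYPE_SECTION     0x2   /* maps a 1MB section directly           */
/* 0x3 is reserved on ARMv6/v7 (fine page table on some earlier cores).    */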
Specific to Linux: the Linux software architecture supports a three-level page table structure, while the ARM architecture really only has two levels, so the middle level in the Linux ARM code is folded away (empty). In the Linux code, first-level page directory entries are represented by pgd, middle-level entries by pud (not actually needed on ARM), and the next level by pmd (because the middle pud level is empty, pgd and pmd effectively coincide). In addition, since the RAM page size on current ARM mobile devices is generally 4KB, the L1 page table entries point to second-level (fine-grained, 4KB-page) page tables.
However, during the initialization phase of Linux kernel boot, temporary page tables (the initial page tables) are created first to provide an execution environment for kernel initialization. At that stage the L1 entries use the second category of entry (section entries), each mapping 1MB of memory directly. For details see the __create_page_tables function in arch/arm/kernel/head.S; for reasons of space it is not expanded here.
For such section entries, the MMU hardware performs the virtual-to-physical address translation (section translation) as follows:
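As a rough illustration of that translation, here is a minimal C sketch; read_l1_descriptor() is a hypothetical stand-in for the hardware's fetch of the L1 descriptor, not a kernel function:

#include <stdint.h>

/* Hypothetical helper: fetch the 32-bit L1 descriptor for index 0..4095. */
extern uint32_t read_l1_descriptor(uint32_t index);

/* Sketch of a section (1MB) translation as the MMU hardware performs it. */
uint32_t section_translate(uint32_t va)
{
    uint32_t index = va >> 20;                  /* bits[31:20] index the L1 table    */
    uint32_t desc  = read_l1_descriptor(index); /* bits[1:0] == 0b10 means "section" */

    uint32_t section_base = desc & 0xFFF00000;  /* physical base of the 1MB section  */
    uint32_t offset       = va   & 0x000FFFFF;  /* bits[19:0] pass through unchanged */

    return section_base | offset;               /* resulting physical address        */
}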
The temporary page tables used during initialization (the initial page tables) are overwritten later in the kernel boot: in paging_init() ---> map_lowmem(), the page tables are created again. This function establishes a linear (one-to-one) mapping for physical memory from address 0 up to low memory (lowmem_limit). A linear mapping means the physical and virtual addresses differ only by a fixed offset, generally 0xC0000000 (why 0xC0000000, by the way?).
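That fixed offset is exactly what the kernel's virt/phys conversion helpers express. A simplified sketch of the macros in arch/arm/include/asm/memory.h, assuming RAM starts at physical address 0 (i.e. PHYS_OFFSET = 0), purely for illustration:

/* Simplified sketch of the lowmem linear mapping; not a drop-in kernel header. */
#define PAGE_OFFSET  0xC0000000UL   /* where lowmem starts in kernel virtual space */
#define PHYS_OFFSET  0x00000000UL   /* where RAM starts physically (assumed 0)     */

#define __virt_to_phys(x)  ((x) - PAGE_OFFSET + PHYS_OFFSET)
#define __phys_to_virt(x)  ((x) - PHYS_OFFSET + PAGE_OFFSET)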
Here I need to introduce an important concept: high memory, as opposed to low memory. What is high memory, and why is it needed? Suppose we have 2GB of physical memory, and the kernel address space covers 3GB-4GB. As mentioned earlier, the kernel's low memory is linearly mapped; if we did not introduce high memory and relied only on the linear mapping, the kernel could access at most 1GB of physical memory. In fact, though, we need the kernel to be able to access, from kernel space, all of the physical memory (up to 4GB). So how is that done?
The approach is to stop using the linear mapping for the whole 3GB-4GB space. Instead, physical addresses [0x00, fix_addr] (fix_addr < 1GB) are mapped to kernel virtual addresses [0x00+3G, fix_addr+3G], and the remaining range [fix_addr+3G, 4G] is reserved for dynamic mappings, through which we can reach the physical memory from fix_addr up to 4GB. How?
For example, to access any piece of the physical address range [fix_addr, 4GB], we borrow part of the precious kernel virtual address range [fix_addr+3G, 4G] to map it, build the page tables the MMU hardware needs, and after the access tear the mapping down again, freeing that kernel virtual address range for the next access to some other physical memory. In this way the kernel can reach all 4GB of physical memory.
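This borrow/use/release pattern is what the kernel's highmem API (kmap()/kunmap(), or kmap_atomic() in atomic context) exposes to kernel code. A minimal, hedged usage sketch; zero_one_page() is a hypothetical example, not taken from the article's code:

#include <linux/highmem.h>
#include <linux/string.h>

/* Temporarily map a (possibly highmem) page into kernel virtual address
 * space, touch it, then release the mapping so the virtual range can be
 * reused for other physical pages.
 */
static void zero_one_page(struct page *page)
{
	void *vaddr = kmap(page);     /* set up the temporary kernel mapping */
	memset(vaddr, 0, PAGE_SIZE);  /* access the physical page through it */
	kunmap(page);                 /* tear the mapping down again         */
}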
So how does the kernel code build the mapping table?
We focus our analysis on the create_mapping function in arch/arm/mm/mmu.c. Before diving in, let's look at how the ARM MMU hardware translates a virtual address to a physical address with the two-level page table structure.
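Here is a minimal C sketch of that two-level walk down to a 4KB small page, from the hardware's point of view; read_l1_descriptor() and read_l2_descriptor() are hypothetical stand-ins for the table fetches the MMU performs in memory:

#include <stdint.h>

/* Hypothetical helpers standing in for the MMU's reads of the page tables. */
extern uint32_t read_l1_descriptor(uint32_t l1_index);
extern uint32_t read_l2_descriptor(uint32_t l2_table_base, uint32_t l2_index);

/* Sketch of a two-level translation down to a 4KB small page. */
uint32_t page_translate(uint32_t va)
{
    uint32_t l1_index = va >> 20;                     /* bits[31:20]              */
    uint32_t l1_desc  = read_l1_descriptor(l1_index); /* type bits[1:0] == 0b01   */

    uint32_t l2_base  = l1_desc & 0xFFFFFC00;         /* 1KB-aligned L2 table     */
    uint32_t l2_index = (va >> 12) & 0xFF;            /* bits[19:12], 256 entries */
    uint32_t l2_desc  = read_l2_descriptor(l2_base, l2_index);

    uint32_t page_base = l2_desc & 0xFFFFF000;        /* 4KB page frame base      */
    return page_base | (va & 0xFFF);                  /* bits[11:0] page offset   */
}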
First, the original code (arch/arm/mm/mmu.c):
The comment describing the function reads as follows:
/*
 * Create the page directory entries and any necessary
 * page tables for the mapping specified by `md'.  We
 * are able to cope here with varying sizes and address
 * offsets, and we take full advantage of sections and
 * supersections.
 */
line737-line742: parameter validity checks. This function does not create mappings for user-space virtual addresses (ask yourself why that is).
line744-line750: if this were I/O memory, the mapped virtual address range would have to lie in the high memory region; since what we have here is regular memory (type is MT_MEMORY), we do not enter this branch.
line775: gets the entry in the first-level (L1) page table that covers this virtual address addr. If you trace the pgd_offset_k function in detail (defined in arch/arm/include/asm/pgtable.h), you will find that the base address of the kernel's L1 page directory is at 0xC0004000, while the kernel code itself starts at 0xC0008000. The range from 0xC0004000 to 0xC0008000 is 16KB, exactly the size of the L1 page table (see the description at the beginning of this article).
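The lookup itself is only pointer arithmetic. Roughly as defined in arch/arm/include/asm/pgtable.h of kernels from that era (PGDIR_SHIFT is 21 here for the reason discussed below):

#define PGDIR_SHIFT		21
#define pgd_index(addr)		((addr) >> PGDIR_SHIFT)
#define pgd_offset(mm, addr)	((mm)->pgd + pgd_index(addr))
/* init_mm.pgd is swapper_pg_dir, the kernel L1 page directory at 0xC0004000. */
#define pgd_offset_k(addr)	pgd_offset(&init_mm, addr)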
A concept worth noting here: kernel page directory entries versus process page directory entries. The kernel's page directory entries are shared by every process in the system, whereas a process's page directory entries belong to that particular process; each application process has its own page directory. The kernel-space portion of every process's page directory, however, is identical. It is precisely because each process has its own page directory entries that each process can independently own its own [0, 3GB] memory space.
line778: pgd_addr_end() ensures that [addr, next] does not cross a 2MB boundary, 2MB being the maximum memory a single L1 entry can map here (why 2MB rather than 1MB? That is a Linux implementation trick, expanded on in detail below).
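For reference, the generic pgd_addr_end() helper looks roughly like this (from include/asm-generic/pgtable.h; on ARM's 2-level layout PGDIR_SHIFT is 21, so PGDIR_SIZE is 2MB):

#define PGDIR_SHIFT	21
#define PGDIR_SIZE	(1UL << PGDIR_SHIFT)           /* 2MB per pgd entry      */
#define PGDIR_MASK	(~(PGDIR_SIZE - 1))

/* Advance to the next pgd boundary or to 'end', whichever comes first. */
#define pgd_addr_end(addr, end)						\
({	unsigned long __boundary = ((addr) + PGDIR_SIZE) & PGDIR_MASK;	\
	(__boundary - 1 < (end) - 1) ? __boundary : (end);		\
})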
line780: the alloc_init_pud() function builds the second-level (L2) page table mappings for the L1 page directory entry pgd.
line784: pgd++ advances to the next L1 page directory entry, so the next 2MB of virtual address space can be mapped to its corresponding 2MB of physical space.
So why can a single L1 page directory entry (pgd) map 2MB of virtual address space?
The first figure of this article shows ARM's typical MMU mapping framework, but that is not what Linux uses as-is; Linux adjusts and optimizes it on that foundation.
The adjustments Linux makes are described below (excerpted from the note in arch/arm/include/asm/pgtable-2level.h in the Linux kernel):
/*
 * Hardware-wise, we have a two level page table structure, where the first
 * level has 4096 entries, and the second level has 256 entries.  Each entry
 * is one 32-bit word.  Most of the bits in the second level entry are used
 * by hardware, and there aren't any "accessed" and "dirty" bits.
 *
 * Linux on the other hand has a three level page table structure, which can
 * be wrapped to fit a two level page table structure easily - using the PGD
 * and PTE only.  However, Linux also expects one "PTE" table per page, and
 * at least a "dirty" bit.
 *
 * Therefore, we tweak the implementation slightly - we tell Linux that we
 * have 2048 entries in the first level, each of which is 8 bytes (iow, two
 * hardware pointers to the second level.)  The second level contains two
 * hardware PTE tables arranged contiguously, preceded by Linux versions
 * which contain the state information Linux needs.  We, therefore, end up
 * with 512 entries in the "PTE" level.
 *
 * This leads to the page tables having the following layout:
 *
 *    pgd             pte
 * |        |
 * +--------+
 * |        |       +------------+ +0
 * +- - - - +       | Linux pt 0 |
 * |        |       +------------+ +1024
 * +--------+ +0    | Linux pt 1 |
 * |        |-----> +------------+ +2048
 * +- - - - + +4    |  h/w pt 0  |
 * |        |-----> +------------+ +3072
 * +--------+ +8    |  h/w pt 1  |
 * |        |       +------------+ +4096
 */
Important adjustments are described below:
The L1 page table changes from 4096 entries to 2048 entries, but each entry grows from 4 bytes to 8 bytes.
Each page holds 2 hardware L2 page tables, each with 256 entries of 4 bytes, for a total of 256*2*4 = 2KB; these occupy the second half of the page, while the first half holds the corresponding Linux PTE tables used by the Linux memory-management code (the MMU hardware never reads those). Together they fill exactly one 4KB page, so no space is wasted.
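These numbers map directly onto constants in arch/arm/include/asm/pgtable-2level.h (quoted here from a 3.x-era kernel; verify against your own tree):

/* Non-LPAE, 2-level layout. */
#define PTRS_PER_PTE		512      /* Linux-visible PTEs per pgd entry (2 x 256)  */
#define PTRS_PER_PMD		1        /* the pmd level is folded away                */
#define PTRS_PER_PGD		2048     /* 2048 entries of 8 bytes = 16KB L1 table     */

#define PTE_HWTABLE_PTRS	(PTRS_PER_PTE)
#define PTE_HWTABLE_OFF		(PTE_HWTABLE_PTRS * sizeof(pte_t))  /* 512*4 = 2KB */
#define PTE_HWTABLE_SIZE	(PTRS_PER_PTE * sizeof(u32))        /* 512*4 = 2KB */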
With the above foundation in place, let's look in detail at the alloc_init_pud function called at line780 above; it eventually calls down into the alloc_init_pte function:
line598: the early_pte_alloc function checks whether the L2 page table that the corresponding pmd points to exists; if it does not, it allocates one, and if it does, it returns the virtual address of the page holding that L2 table.
line572: checks whether the L2 page table the pmd points to exists; if not, early_alloc allocates PTE_HWTABLE_OFF (512*4 = 2KB) + PTE_HWTABLE_SIZE (512*4 = 2KB), one 4KB physical page in total, to hold the 2 Linux PTE tables plus the 2 hardware PTE tables.
line574: returns the virtual address of this physical page.
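A hedged sketch of this allocation path, reconstructed to match the flow described above; the exact code differs between kernel versions (some versions install the pmd entry inside early_pte_alloc itself rather than afterwards):

/* Sketch only: allocate the 4KB page (2KB Linux PTEs + 2KB hardware PTEs)
 * backing one pgd entry if it does not exist yet, from the early allocator.
 */
static pte_t * __init early_pte_alloc(pmd_t *pmd)
{
	if (pmd_none(*pmd) || pmd_bad(*pmd))
		return early_alloc(PTE_HWTABLE_OFF + PTE_HWTABLE_SIZE);
	return pmd_page_vaddr(*pmd);   /* table already exists: return its vaddr */
}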
Back to line599 of the alloc_init_pte function:
line183: pte_index is used to determine the offset of the virtual address within the L2 page table. Virtual address bits [20:12], 9 bits in total, are exactly enough to index across the two L2 page tables (512 entries in total).
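pte_index() itself is a one-line macro, roughly:

/* Index of 'addr' within the 512-entry Linux PTE array hanging off one pgd
 * entry: bits [20:12] of the virtual address (9 bits -> 0..511).
 */
#define pte_index(addr)		(((addr) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1))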
Back in the alloc_init_pte function: line605 sets the page table entry (the pte) that addr selects within the L2 page table; its main job is to fill in the physical address of the corresponding physical page so the MMU hardware can later translate the address.
The loop at line604~line607 fills both hardware PTE tables, completing the mapping table for 2MB of physical memory.
line608 eventually calls the following function: static inline void __pmd_populate(pmd_t *pmdp, phys_addr_t pte, pmdval_t prot)
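Its body, roughly as found in arch/arm/include/asm/pgalloc.h for the non-LPAE case in kernels of that era (quoted from memory, so check your own tree):

static inline void __pmd_populate(pmd_t *pmdp, phys_addr_t pte,
				  pmdval_t prot)
{
	pmdval_t pmdval = (pte + PTE_HWTABLE_OFF) | prot;
	pmdp[0] = __pmd(pmdval);                        /* first 4 bytes -> hwpte0  */
#ifndef CONFIG_ARM_LPAE
	pmdp[1] = __pmd(pmdval + 256 * sizeof(pte_t));  /* last 4 bytes  -> hwpte1  */
#endif
	flush_pmd_entry(pmdp);                          /* make the entry visible   */
}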
By the time this function runs, the two L2 page tables have already been filled in; its job is to set the corresponding L1 page table entry to point at the two freshly built L2 page tables (hwpte0, hwpte1). As mentioned earlier, a Linux L1 page table entry is 8 bytes, so:
line133: the first 4 bytes point to the hwpte0 page table;
line135: the last 4 bytes point to the hwpte1 page table, and with that the L1 -> L2 page table linkage is established;
line137: flushes the newly written entry so that the CPU's MMU sees the updated mapping.
This completes the mapping from the virtual addresses to the physical addresses specified by the struct map_desc *md structure, and the hardware MMU can now translate those virtual addresses to physical addresses automatically.
The walkthrough above deliberately omits some details for reasons of space; once you understand the overall flow, it is well worth reading the relevant code in detail. For example, the set_pte_ext function above calls an assembly routine to actually write the PTE entries.