Linux Memory Paging

Source: Internet
Author: User

Starting with Linux 2.6.11, the kernel uses a four-level page table model that is independent of the hardware architecture. But the number of page table levels is fixed by what the hardware supports, so how does Linux implement a four-level page table on top of it?

Let's look at the code that initializes a page table.
Pkmap page table initialization (permanent kernel mappings)

The following analysis is based on the macro values for an x86 configuration with more than 1 GB of memory and PAE disabled.
The figures and diagrams below assume the same configuration.

Call stack:
page_table_range_init (arch/x86/mm/init_32.c)
permanent_kmaps_init (arch/x86/mm/init_32.c)
pagetable_init (arch/x86/mm/init_32.c)
paging_init (arch/x86/mm/init_32.c)
setup_arch (arch/x86/kernel/setup.c)
start_kernel (init/main.c)
startup_32 (head_32.S)

static void __init
page_table_range_init(unsigned long start, unsigned long end, pgd_t *pgd_base)
{
    int pgd_idx, pmd_idx;
    unsigned long vaddr;
    pgd_t *pgd;
    pmd_t *pmd;

    vaddr = start;
    pgd_idx = pgd_index(vaddr);
    pmd_idx = pmd_index(vaddr);
    pgd = pgd_base + pgd_idx;

    /* Outer loop: one iteration per page global directory entry. */
    for (; (pgd_idx < PTRS_PER_PGD) && (vaddr != end); pgd++, pgd_idx++) {
        pmd = one_md_table_init(pgd);
        pmd = pmd + pmd_index(vaddr);
        /* Inner loop: one iteration per page middle directory entry. */
        for (; (pmd_idx < PTRS_PER_PMD) && (vaddr != end); pmd++, pmd_idx++) {
            one_page_table_init(pmd);
            vaddr += PMD_SIZE;
        }
        pmd_idx = 0;
    }
}

+------------------------------+
| PGD | PUD | PMD | PTE | PAGE |
+------------------------------+

Each level corresponds to a bit field of the virtual address: shifting the address right by the level's shift value and applying a mask yields the index at that level.
PGDIR_SHIFT 22
PUD_SHIFT 22
PMD_SHIFT 22
PAGE_SHIFT 12

Number of virtual address bits used by each level:
PGD   PUD   PMD   PTE   PAGE
10    0     0     10    12

Number of entries at each level (the index at each level lies in [0, X)):
PTRS_PER_PGD 1024 (2^10)
PTRS_PER_PUD 1 (2^0)
PTRS_PER_PMD 1 (2^0)
PTRS_PER_PTE 1024 (2^10)
For example, the PGD takes 10 bits of the address, so it can index 2^10 entries; the PUD takes 0 bits, so each page upper directory has exactly one entry.
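The shifts and entry counts listed above are enough to split any virtual address into its per-level indexes. The following is a minimal user-space sketch, not kernel code: the struct vaddr_parts and the function split_vaddr are hypothetical names introduced here for illustration, and the macro values are the ones quoted above for 32-bit x86 without PAE.

```c
#include <stdint.h>

/* Values quoted above for 32-bit x86 without PAE. */
#define PAGE_SHIFT    12
#define PMD_SHIFT     22
#define PUD_SHIFT     22
#define PGDIR_SHIFT   22

#define PTRS_PER_PGD  1024
#define PTRS_PER_PUD  1
#define PTRS_PER_PMD  1
#define PTRS_PER_PTE  1024

/* Hypothetical container for the result; not a kernel type. */
struct vaddr_parts {
    uint32_t pgd, pud, pmd, pte, offset;
};

/* Split a 32-bit virtual address into its index at each level. */
static struct vaddr_parts split_vaddr(uint32_t vaddr)
{
    struct vaddr_parts p;

    p.pgd    = (vaddr >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1);
    p.pud    = (vaddr >> PUD_SHIFT)   & (PTRS_PER_PUD - 1); /* mask is 0 */
    p.pmd    = (vaddr >> PMD_SHIFT)   & (PTRS_PER_PMD - 1); /* mask is 0 */
    p.pte    = (vaddr >> PAGE_SHIFT)  & (PTRS_PER_PTE - 1);
    p.offset = vaddr & ((1u << PAGE_SHIFT) - 1);
    return p;
}
```

For the address 0xC0123456, split_vaddr yields PGD index 768, PUD index 0, PMD index 0, PTE index 0x123 and page offset 0x456. The PUD and PMD indexes are 0 for every address because their masks are 0.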

To initialize the page table for a virtual address, the kernel first locates the entry of the PGD array that corresponds to the address. This is done by pgd_index(vaddr), which is a macro:
#define pgd_index(address) (((address) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1))
The result is the index of the address's entry in the PGD. That page global directory entry records the physical address of the next level down, the page upper directory.

So here is the problem: x86 in this configuration only understands a two-level page table, while the Linux memory management code is written against four levels. How are the two reconciled? Code always has to work with what the hardware implements, so the four-level page table is a virtual one: the code is written as if there were four levels, but what it actually fills in is the hardware's two-level table. The definition of pmd_index shows how:
#define pmd_index(address) (((address) >> PMD_SHIFT) & (PTRS_PER_PMD - 1))
The index at every level is computed with the same formula; only the shift and the number of entries differ. The address is shifted so that the level's bit field sits at the bottom, then masked with the entry count minus one, which gives the entry's index within that level. Because PTRS_PER_PMD is 1, the index must be less than 1, so it is always 0. Knowing the indexes is not enough to see how the two-level table is built, so read on.
Once the index at each level is known, the whole page table can be initialized. The outer loop walks the page global directory. For each PGD entry it sets up the corresponding PUD and PMD through the function one_md_table_init, which takes the virtual address of the PGD entry and returns the virtual address of the page middle directory. Because the layout is at most equivalent to a three-level page table, the PUD initialization is effectively omitted, and in this configuration the implementation comes down to two statements:
pud = pud_offset(pgd, 0);
pmd_table = pmd_offset(pud, 0);
pud_offset looks up the PUD entry. It is defined as #define pud_offset(pgd, start) (pgd), so it simply returns the address of the PGD entry.
pmd_offset likewise returns the virtual address of the PUD entry it is given, which is again the address of the PGD entry, and one_md_table_init returns that address.
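The folding described above can be reproduced in user space. The following is only a sketch under stated assumptions: pgd_t, pud_t and pmd_t are mock one-word types rather than the kernel's definitions, pud_offset and pmd_offset are written as functions instead of macros, and one_md_table_init_sketch is a hypothetical name that mirrors just the two statements quoted above.

```c
/* Mock entry types: each level is modelled as a single word. */
typedef struct { unsigned long val; } pgd_t;
typedef struct { unsigned long val; } pud_t;
typedef struct { unsigned long val; } pmd_t;

/* Folded PUD: the "PUD entry" is the PGD entry itself. */
static pud_t *pud_offset(pgd_t *pgd, unsigned long addr)
{
    (void)addr;              /* the address selects nothing: one entry only */
    return (pud_t *)pgd;
}

/* Folded PMD: the "PMD entry" is the PUD entry, hence the PGD entry. */
static pmd_t *pmd_offset(pud_t *pud, unsigned long addr)
{
    (void)addr;
    return (pmd_t *)pud;
}

/* Mirrors the two statements quoted above. */
static pmd_t *one_md_table_init_sketch(pgd_t *pgd)
{
    pud_t *pud = pud_offset(pgd, 0);
    pmd_t *pmd_table = pmd_offset(pud, 0);

    return pmd_table;
}
```

Whatever PGD entry is passed in, the pointer that comes back has the same address, so writing a page table address "into the PMD" in this model writes it into the PGD entry.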
Next the page middle directory is initialized. The inner loop visits each PMD entry (there is only one) and points it at a page table: one_page_table_init allocates a page table and stores its address in the PMD entry. With that, the page tables for the virtual address range passed to page_table_range_init are set up.
To summarize: because the architecture supports only two-level page tables, looking up the PUD and the PMD simply returns the virtual address of the parent entry that was passed in, which is the virtual address of the PGD entry.
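To see the two loops at work, the range initialization can be modelled in user space. This is a rough sketch, not the kernel implementation: the entry types are mocks that hold a host pointer instead of a physical address, calloc stands in for the kernel's page table allocation, the function returns a count purely for illustration, and both start and end are assumed to be aligned to PMD_SIZE (4 MB here).

```c
#include <stdint.h>
#include <stdlib.h>

#define PGDIR_SHIFT   22
#define PMD_SHIFT     22
#define PTRS_PER_PGD  1024
#define PTRS_PER_PMD  1
#define PTRS_PER_PTE  1024
#define PMD_SIZE      (1u << PMD_SHIFT)

typedef uint32_t pte_t;
/* Mock directory entry: a host pointer to a page table, NULL if absent. */
typedef struct { pte_t *table; } pgd_t;
typedef pgd_t pmd_t;    /* folded: a PMD entry is the PGD entry itself */

static unsigned pgd_index(uint32_t a)
{
    return (a >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1);
}

static unsigned pmd_index(uint32_t a)
{
    return (a >> PMD_SHIFT) & (PTRS_PER_PMD - 1);   /* always 0 */
}

static pmd_t *one_md_table_init(pgd_t *pgd)
{
    return pgd;          /* PUD and PMD lookups fold away */
}

/* Allocate a zeroed page table if the entry has none; 1 if allocated now. */
static int one_page_table_init(pmd_t *pmd)
{
    if (pmd->table)
        return 0;
    pmd->table = calloc(PTRS_PER_PTE, sizeof(pte_t));
    return pmd->table != NULL;
}

/* Assumes start and end are PMD_SIZE aligned. Returns tables allocated. */
static int page_table_range_init_sketch(uint32_t start, uint32_t end,
                                        pgd_t *pgd_base)
{
    uint32_t vaddr = start;
    unsigned pgd_idx = pgd_index(vaddr);
    unsigned pmd_idx = pmd_index(vaddr);
    pgd_t *pgd = pgd_base + pgd_idx;
    int allocated = 0;

    for (; pgd_idx < PTRS_PER_PGD && vaddr != end; pgd++, pgd_idx++) {
        pmd_t *pmd = one_md_table_init(pgd);

        pmd = pmd + pmd_index(vaddr);
        for (; pmd_idx < PTRS_PER_PMD && vaddr != end; pmd++, pmd_idx++) {
            allocated += one_page_table_init(pmd);
            vaddr += PMD_SIZE;
        }
        pmd_idx = 0;
    }
    return allocated;
}
```

With a zeroed 1024-entry PGD, initializing the 4 MB range from 0xFF800000 to 0xFFC00000 allocates one page table and stores it in PGD entry 1022. The inner loop runs exactly once per PGD entry because PTRS_PER_PMD is 1.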

Viewed as four levels, the structure looks like this:

PGD        PUD                   PMD                 PT
+-----+    +-----+               +-----+             +-----+
|  a  |--->|  a  |-------------->|  a  |------------>| t0  |
+-----+    +-----+  +-----+      +-----+  +-----+    +-----+
|  b  |------------>|  b  |-------------->|  b  |    | t1  |
+-----+    +-----+  +-----+      +-----+  +-----+    +-----+
|  c  |--->|  c  |-------------->|  c  |             | t2  |
+-----+    +-----+               +-----+             +-----+
|                                              ...         |

Each PUD and each PMD has only one entry, and its value is the same as the corresponding PGD entry.


The PUD address is obtained from the PGD entry, and the PMD base is obtained from the PUD, but in both cases what is returned is the address of the top-level directory entry (see pud_offset and pmd_offset). In other words, the PUD is really the PGD entry, and so is the PMD. The resulting page table structure is therefore:

PGD                           PT
+-------------+               +-----+         +------+
|a(PUD)(PMD)  |-------------->| t0  |-------->| Page |
+-------------+               +-----+         +------+
|  b          |               | t1  |
+-------------+               +-----+
|  c          |               | t2  |
+-------------+               +-----+
|               ...                 |
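A lookup through this collapsed structure touches only two tables. The sketch below models that walk in user space under several assumptions: the entry types are mocks (a PGD entry holds a host pointer, a PTE holds a frame number and a present flag), and translate is a hypothetical helper, not a kernel function. The PUD and PMD steps are kept as explicit no-op assignments to show where the four-level code path folds.

```c
#include <stdint.h>
#include <stddef.h>

#define PAGE_SHIFT    12
#define PGDIR_SHIFT   22
#define PTRS_PER_PGD  1024
#define PTRS_PER_PTE  1024

typedef struct { uint32_t frame; int present; } pte_t;  /* mock PTE */
typedef struct { pte_t *table; } pgd_t;                 /* mock PGD entry */

/* Translate vaddr; returns 0 and fills *paddr, or -1 if unmapped. */
static int translate(const pgd_t *pgd_base, uint32_t vaddr, uint32_t *paddr)
{
    const pgd_t *pgd = pgd_base +
                       ((vaddr >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1));
    const pgd_t *pud = pgd;   /* folded PUD: same entry */
    const pgd_t *pmd = pud;   /* folded PMD: same entry */
    const pte_t *pte;

    if (pmd->table == NULL)
        return -1;            /* no page table behind this entry */

    pte = pmd->table + ((vaddr >> PAGE_SHIFT) & (PTRS_PER_PTE - 1));
    if (!pte->present)
        return -1;            /* page not mapped */

    *paddr = (pte->frame << PAGE_SHIFT) |
             (vaddr & ((1u << PAGE_SHIFT) - 1));
    return 0;
}
```

If PGD entry 768 points at a page table whose entry 0x123 maps frame 0xABC, then translating 0xC0123456 gives 0x00ABC456: two table lookups plus the page offset.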

Each page global directory entry is also the PUD and the PMD.
Linux writes its code against a four-level page table and uses macros to configure it down to the real two-level table. When the hardware does have more levels, only the per-level shifts (XXX_SHIFT) and the per-level entry counts (PTRS_PER_XXX) need to be configured, and the same code is reused.
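The reuse can be illustrated by turning the shifts and entry counts into data. This is a sketch only: struct paging_layout and the helper functions are hypothetical, and the kernel selects these values at compile time through macros rather than at run time. The two-level values are the ones quoted in this article; the PAE values (PGDIR_SHIFT 30, PMD_SHIFT 21, 4/512/512 entries) are not from this article and are included as the usual x86 three-level configuration.

```c
#include <stdint.h>

/* Hypothetical descriptor for one paging layout; not a kernel structure. */
struct paging_layout {
    unsigned pgdir_shift, pmd_shift, page_shift;
    unsigned ptrs_per_pgd, ptrs_per_pmd, ptrs_per_pte;
};

/* Two-level layout from this article (x86, PAE off). */
static const struct paging_layout x86_2level = { 22, 22, 12, 1024, 1, 1024 };
/* Three-level layout (x86 with PAE); values assumed, not from the article. */
static const struct paging_layout x86_pae = { 30, 21, 12, 4, 512, 512 };

/* The one formula shared by every level. */
static unsigned level_index(uint32_t vaddr, unsigned shift, unsigned ptrs)
{
    return (vaddr >> shift) & (ptrs - 1);
}

static unsigned layout_pgd_index(const struct paging_layout *l, uint32_t v)
{
    return level_index(v, l->pgdir_shift, l->ptrs_per_pgd);
}

static unsigned layout_pmd_index(const struct paging_layout *l, uint32_t v)
{
    return level_index(v, l->pmd_shift, l->ptrs_per_pmd);
}

static unsigned layout_pte_index(const struct paging_layout *l, uint32_t v)
{
    return level_index(v, l->page_shift, l->ptrs_per_pte);
}
```

For the address 0xC0A23456 the two-level layout gives PGD index 770, PMD index 0 and PTE index 0x223, while the PAE layout gives PGD index 3, PMD index 5 and PTE index 0x23. The code is identical; only the configuration differs.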

