First, consider an x86 processor with the normal 4KB page size in non-PAE mode. CR3 holds the physical address of the page directory page, and PDEs and PTEs share the same format and the same size of 4 bytes. In this configuration each page table page contains 1024 PTEs and can therefore map 1024 pages. The 4GB x86 address space contains 1M pages of 4KB each, so mapping all of them requires 1024 page table pages, and mapping those 1024 page table pages requires 1 page directory page. At first glance we need 1025 pages to manage the mapping of 1M pages, but in fact we do not.
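The counts above follow directly from the page and entry sizes. A minimal sketch of the arithmetic, with constant names chosen here purely for illustration:

```python
PAGE_SIZE = 4 * 1024            # normal 4KB page
ENTRY_SIZE = 4                  # size of one PDE or PTE in non-PAE mode
ADDRESS_SPACE = 4 * 1024 ** 3   # 4GB linear address space

entries_per_page = PAGE_SIZE // ENTRY_SIZE           # PTEs held by one page table page
total_pages = ADDRESS_SPACE // PAGE_SIZE             # pages in the whole address space
page_table_pages = total_pages // entries_per_page   # page table pages needed to map them
naive_total = page_table_pages + 1                   # plus one separate page directory page

print(entries_per_page, total_pages, page_table_pages, naive_total)
```

The last value is the naive figure of 1025 pages that the self-mapping design avoids.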
Let us straighten out the idea. The address space contains 1M = 1024 x 1024 pages, and one page table page can map 1024 pages, so we divide the 1M pages of the 4GB address space into 1024 groups of 1024 pages each. Mapping these 1024 groups requires exactly 1024 page table pages, so we set aside one of the 1024 groups to serve as the page tables. The 1023 groups that remain consume 1023 of the pages in the page table group for their mapping. That leaves one page in the page table group. We use it to map the 1023 page table pages, which consumes 1023 of its entries. And the one remaining entry? It maps this last page itself.
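The bookkeeping in this paragraph can be checked with a few lines of arithmetic. This is only a sketch of the counting argument; the variable names are made up for the illustration:

```python
ENTRIES_PER_PAGE = 1024
TOTAL_PAGES = 1024 * 1024

groups = TOTAL_PAGES // ENTRIES_PER_PAGE   # 1024 groups of 1024 pages each
ordinary_groups = groups - 1               # one group is set aside to hold the page tables

# Each ordinary group needs one page table page taken from the reserved group.
table_pages_for_ordinary = ordinary_groups
leftover_pages = ENTRIES_PER_PAGE - table_pages_for_ordinary   # pages left in the reserved group

# The leftover page maps the other page table pages plus itself.
entries_used_in_leftover = table_pages_for_ordinary + leftover_pages
pages_spent_on_mapping = table_pages_for_ordinary + leftover_pages
```

Under this accounting the leftover page is completely filled, and 1024 pages rather than 1025 are spent on mapping.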
The allocation now sounds much clearer, but can it actually be implemented? Following the idea above, the last page table page is undoubtedly the one to use as the page directory. In non-PAE mode the x86 MMU translates a linear address into a physical address with two lookups plus an offset. Accessing the 1023 groups of ordinary pages through page directory, page table, and offset is easy to understand, but how are the 1023 page table pages and the page directory page themselves accessed by linear address? In other words, how are the 1023 page table pages and the page directory page mapped to linear addresses through the page directory page? This is exactly the page table self-mapping mechanism of 32-bit Windows.
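The "two lookups plus an offset" correspond to a 10/10/12-bit split of the 32-bit linear address. A minimal sketch of that split follows; the function name `split_linear_address` is hypothetical and not taken from any Windows source:

```python
def split_linear_address(va):
    """Split a 32-bit linear address into the fields used by non-PAE paging."""
    pde_index = (va >> 22) & 0x3FF   # high 10 bits: index into the page directory
    pte_index = (va >> 12) & 0x3FF   # middle 10 bits: index into the page table
    offset = va & 0xFFF              # low 12 bits: byte offset inside the 4KB page
    return pde_index, pte_index, offset

pde_index, pte_index, offset = split_linear_address(0xE4321000)
print(hex(pde_index), hex(pte_index), hex(offset))
```

For the address 0xE4321000 the fields are 0x390, 0x321 and 0x0.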
In 32-bit Windows the page tables always start at linear address 0xC0000000, and the whole page table area occupies a contiguous 4MB of linear address space. The most critical part of the design is that entry 0x300 of the page directory points to the page directory itself, which in turn fixes the linear address of the page directory page at 0xC0300000. With this design, when the MMU translates any linear address in the range 0xC0000000~0xC03FFFFF, it uses the page directory page as a page table page, which means the page directory page is accessed twice. For example, the Windows Kernel Internals lecture that Microsoft held at the University of Tokyo in 2004 included a diagram explaining the principle of GetPteAddress.
GetPteAddress takes a linear address and returns the address of the PTE that maps the page containing that linear address. The principle is clear: the 4GB address space contains 1M pages, and the 4MB page table area contains 1M PTEs, which correspond to those pages one by one, in order. If the whole address space is viewed as an array of pages, the index of the page containing a linear address is easy to compute, for example 0xE4321000 >> 12 = 0xE4321, and the linear address of the corresponding PTE in the PTE array is 0xC0000000 + (0xE4321 << 2) = 0xC0390C84.
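The calculation just described can be written as a one-line function. This is a sketch of the arithmetic only, not the actual Windows code; the names `PTE_BASE` and `get_pte_address` are chosen here for illustration:

```python
PTE_BASE = 0xC0000000   # start of the 4MB page table area in 32-bit Windows
PAGE_SHIFT = 12         # 4KB pages
PTE_SHIFT = 2           # each PTE is 4 bytes

def get_pte_address(va):
    """Return the linear address of the PTE that maps the page containing va."""
    page_index = va >> PAGE_SHIFT                 # index of the page in the "array of pages"
    return PTE_BASE + (page_index << PTE_SHIFT)   # index into the array of 4-byte PTEs

print(hex(get_pte_address(0xE4321000)))
```

The first and last pages of the address space map to the first and last PTE slots of the 4MB area, 0xC0000000 and 0xC03FFFFC.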
As the diagram shows, the high 10 bits of the resulting PTE address are 1100000000 = 0x300. In the first step of translating this linear address, the MMU therefore uses index 0x300 to look up the page table pointer in the page directory page, and what it gets is the physical address of the page directory page itself. In the second step, when the MMU looks up the final page pointer in a page table page, it is the page directory page that serves as the page table page. The 1024 entries of the page directory page store the physical addresses of the page directory itself and of the other 1023 page table pages, so after two lookups the MMU arrives at the physical address of a page table page. When the high 10 bits of a linear address are fixed at 1100000000, the remaining low 22 bits can only span 4MB, so the linear addresses that can be expressed are limited to the range 0xC0000000~0xC03FFFFF, which is also the range of linear address space occupied by all the page table pages. Because Microsoft's diagram is meant to explain PTEs, and a PTE is 4 bytes in size with a 4-byte-aligned address, the final 12-bit in-page offset can also be read as a 10-bit PTE array index in its high bits followed by 2 low bits that are always 0.
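The two-step walk can be simulated with a toy model of physical memory. The sketch below is an illustration under stated assumptions: the frame numbers 0x100, 0x200 and 0x777 are invented, entries hold bare frame numbers with no flag bits, and only one page table is populated.

```python
SELF_MAP_INDEX = 0x300

# Toy physical memory: frame number -> {entry index: frame number it points to}.
DIRECTORY_FRAME = 0x100   # hypothetical frame holding the page directory
TABLE_FRAME = 0x200       # hypothetical frame holding the page table for PDE index 0x390
DATA_FRAME = 0x777        # hypothetical frame backing linear page 0xE4321

memory = {
    DIRECTORY_FRAME: {0x390: TABLE_FRAME, SELF_MAP_INDEX: DIRECTORY_FRAME},
    TABLE_FRAME: {0x321: DATA_FRAME},
}

def translate(va, cr3_frame=DIRECTORY_FRAME):
    """Walk the two-level tables the way the MMU does in non-PAE mode."""
    pde_index = (va >> 22) & 0x3FF
    pte_index = (va >> 12) & 0x3FF
    offset = va & 0xFFF
    table_frame = memory[cr3_frame][pde_index]    # first lookup: page directory
    page_frame = memory[table_frame][pte_index]   # second lookup: page table
    return (page_frame << 12) | offset

data_phys = translate(0xE4321000)   # ordinary page: lands in DATA_FRAME
pte_phys = translate(0xC0390C84)    # PTE address: lands inside TABLE_FRAME
```

In this model the PTE address 0xC0390C84 translates to offset 0xC84 of the page table frame, which is 0x321 * 4, the slot of the PTE that maps 0xE4321000.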
It is not hard to see that if the second step of the MMU also uses index 0x300, that is, if the high 20 bits of the linear address are 11000000001100000000, then the physical address the MMU reaches after two lookups is that of the page directory page itself.
Adding the low 12-bit page offset, the address the MMU finally produces falls inside the page directory page. In other words, the linear address range 0xC0300000~0xC0300FFF is the range the page directory page occupies in the linear address space. If the high 10 bits of the 12-bit page offset are again used as a PTE index, then the PTE that points to the page directory itself, which is also a PDE, has linear address 0xC0300C00.
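The same arithmetic gives the linear address of any PDE, and applying the PTE formula to the page table area itself reproduces the addresses quoted above. A minimal sketch, with `PDE_BASE`, `get_pte_address` and `get_pde_address` as illustrative names rather than actual Windows definitions:

```python
PTE_BASE = 0xC0000000   # start of the page table area
PDE_BASE = 0xC0300000   # linear address of the page directory page

def get_pte_address(va):
    """Linear address of the PTE that maps the page containing va."""
    return PTE_BASE + ((va >> 12) << 2)

def get_pde_address(va):
    """Linear address of the PDE that covers the 4MB region containing va."""
    return PDE_BASE + ((va >> 22) << 2)

pde_of_example = get_pde_address(0xE4321000)   # PDE index 0x390, so offset 0x390 * 4
directory_page = get_pte_address(PTE_BASE)     # PTEs that map the page table area
self_map_entry = get_pte_address(PDE_BASE)     # the entry that maps the directory itself

print(hex(pde_of_example), hex(directory_page), hex(self_map_entry))
```

The PTE of the page directory page and the PDE covering it are the same 4-byte entry, so both formulas give 0xC0300C00 for the address 0xC0300000.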
In summary, one page is used as the page directory, with its entry at index 0x300 pointing to itself and its remaining 1023 entries pointing to the other 1023 page table pages. In this way the page table area is cleverly mapped to the linear address range 0xC0000000~0xC03FFFFF.