Detailed paging mechanism for memory addressing in Linux systems

Source: Internet
Author: User

Paging is applied after segmentation and completes the conversion from linear to physical addresses. Segmentation converts a logical address into a linear address, and paging then converts that linear address into a physical address.

Paging in hardware

Paging is enabled by the PG bit in CR0. If PG=1, paging is enabled and the mechanism described in this section converts linear addresses to physical addresses. If PG=0, paging is disabled and the linear address produced by segmentation is used directly as the physical address. The objects that paging manages are fixed-size blocks of memory called pages. Paging treats both the whole linear address space and the whole physical address space as being made up of pages, and any page in the linear address space can be mapped to any page in the physical address space (a page in the physical space is called a physical page or page frame).
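As a small illustration of the PG check, the sketch below tests bit 31 of a CR0 value, which is where the PG flag sits on the 80386. The function name and the sample values are assumptions made for this example; real code would have to read CR0 with a privileged instruction.

```python
CR0_PE = 1 << 0    # PE: protected mode enable, bit 0 of CR0
CR0_PG = 1 << 31   # PG: paging enable, bit 31 of CR0


def paging_enabled(cr0: int) -> bool:
    """Return True when the PG bit is set in the given CR0 value."""
    return bool(cr0 & CR0_PG)


if __name__ == "__main__":
    # Hypothetical CR0 values, chosen only to exercise both cases.
    print(paging_enabled(CR0_PE))           # protected mode, paging off
    print(paging_enabled(CR0_PE | CR0_PG))  # protected mode, paging on
```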

The 80386 uses 4K-byte pages. Each page is 4K bytes long and is aligned on a 4K-byte boundary, i.e. the starting address of each page is divisible by 4K. The 80386's 4G-byte linear address space is therefore divided into 1M pages of 4K bytes each. Paging works by relocating pages of the linear address space into the physical address space. Because each 4K-byte page is mapped as a unit and is aligned on a 4K-byte boundary, the low 12 bits of the linear address are used directly as the low 12 bits of the physical address.
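The arithmetic in this paragraph can be checked with a few lines of Python. This is only a sketch; the constant and function names are chosen for the example.

```python
PAGE_SHIFT = 12                    # 4K = 2**12
PAGE_SIZE = 1 << PAGE_SHIFT        # 4096 bytes
PAGE_OFFSET_MASK = PAGE_SIZE - 1   # low 12 bits


def split_page(linear: int):
    """Split a linear address into (page number, offset within the page)."""
    return linear >> PAGE_SHIFT, linear & PAGE_OFFSET_MASK


def pages_in_space(address_bits: int = 32) -> int:
    """Number of 4K pages in an address space of the given width."""
    return (1 << address_bits) // PAGE_SIZE


if __name__ == "__main__":
    page, offset = split_page(0x12345678)
    print(hex(page), hex(offset))
    print(pages_in_space())
```

The offset passes through translation unchanged; only the page number is looked up.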

Why use a two-level page table

Suppose each process uses the full 4G of linear address space. A single page table would then contain 1M entries of 4 bytes each, so the page table of every process would occupy 4M of memory. To save this space, a two-level page table is used. Each process is given a page directory, but a page table is allocated in memory only when it is actually used. A single-level page table has to be allocated in full up front, whereas with a two-level table the page tables can be allocated as they are needed.
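To make the saving concrete, the following sketch compares the memory taken by a single-level table with that of a two-level table in which only some page tables exist. The function names and the example of three page tables in use are assumptions for illustration.

```python
ENTRY_SIZE = 4                                # bytes per table entry
PAGE_SIZE = 4 * 1024                          # one table fills one 4K page
ENTRIES_PER_TABLE = PAGE_SIZE // ENTRY_SIZE   # 1024 entries


def single_level_bytes(address_space: int = 4 * 1024**3) -> int:
    """Size of one flat table with an entry for every 4K page."""
    return (address_space // PAGE_SIZE) * ENTRY_SIZE


def two_level_bytes(page_tables_in_use: int) -> int:
    """Size of the page directory plus the page tables actually allocated."""
    if not 0 <= page_tables_in_use <= ENTRIES_PER_TABLE:
        raise ValueError("a page directory points to at most 1024 page tables")
    return PAGE_SIZE + page_tables_in_use * PAGE_SIZE


if __name__ == "__main__":
    print(single_level_bytes())   # flat table for the whole 4G space
    print(two_level_bytes(3))     # directory plus three page tables
```

Note that when all 1024 page tables are in use, the two-level scheme costs one extra page for the directory; the saving comes from processes that touch only a small part of the address space.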

Two-level page table structure

The first level of the two-level structure is called the page directory and is stored in one 4K-byte page. The page directory has 1K entries of 4 bytes each, and each entry points to a second-level table. The highest 10 bits of the linear address (bits 31 to 22) are the index into the first level; the entry selected by this index identifies one of the 1K second-level tables.

The second level is called the page table. It is also stored in one 4K-byte page and contains 1K entries of 4 bytes, each holding the physical base address of a page. The page table is indexed by the middle 10 bits of the linear address (bits 21 to 12), which selects the page table entry holding the physical address of the page. The high 20 bits of that physical address are combined with the low 12 bits of the linear address to form the final physical address, which is the output of the page translation.
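The 10/10/12 split of a linear address described above can be sketched as follows. The function name and the sample address are assumptions for the example.

```python
def split_linear(linear: int):
    """Split a 32-bit linear address into its three fields.

    Returns (directory index, table index, offset):
      bits 31..22 index the page directory,
      bits 21..12 index the page table,
      bits 11..0  are the offset inside the page.
    """
    directory_index = (linear >> 22) & 0x3FF
    table_index = (linear >> 12) & 0x3FF
    offset = linear & 0xFFF
    return directory_index, table_index, offset


if __name__ == "__main__":
    d, t, o = split_linear(0x12345678)
    print(hex(d), hex(t), hex(o))
```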

Page directory entry

Bits 31 to 12 hold the 20-bit page table address. Because page tables are 4K-aligned, the low 12 bits of a page table address are always 0, so the high 20 bits are enough to express the full 32-bit address. A page directory therefore holds up to 1024 page table addresses.

Bit 0 is the present bit (P). If P=1, the page table that the entry points to is in memory; if P=0, it is not in memory.

Bit 1 is the read/write bit and bit 2 is the user/supervisor bit. These two bits provide hardware protection for the page directory entry. When code running at privilege level 3 accesses a page, the page protection check is applied; on the 80386, code running at privilege level 0 bypasses page protection.

Bit 3 is the PWT (page write-through) bit, which selects the caching policy for writes: 1 selects write-through, where data is written to memory (RAM) as well as to the cache, and 0 selects write-back.

Bit 4 is the PCD (page cache disable) bit: 1 means caching is disabled and 0 means caching is enabled.

Bit 5 is the accessed bit (A). It is set to 1 when the page directory entry is used in an access.

Bit 7 is the page size flag, which applies only to page directory entries. If it is set to 1, the page directory entry refers to a 4MB page; see the section on extended paging below.

Bits 9 to 11 are available for use by the operating system; Linux makes no special use of them.
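The bit layout of a page directory entry can be summarized in a small decoder. This is a sketch only: the dictionary keys and the sample entry value are invented for the example, and the bit positions follow the standard 80386 layout (in particular, PCD=1 means caching is disabled).

```python
def decode_pde(entry: int) -> dict:
    """Decode the fields of a 32-bit page directory entry."""
    return {
        "present": bool(entry & (1 << 0)),         # P
        "writable": bool(entry & (1 << 1)),        # R/W
        "user": bool(entry & (1 << 2)),            # U/S
        "write_through": bool(entry & (1 << 3)),   # PWT
        "cache_disabled": bool(entry & (1 << 4)),  # PCD
        "accessed": bool(entry & (1 << 5)),        # A
        "page_size_4mb": bool(entry & (1 << 7)),   # PS
        "os_bits": (entry >> 9) & 0x7,             # bits 9..11, free for the OS
        "page_table_address": entry & 0xFFFFF000,  # bits 31..12
    }


if __name__ == "__main__":
    # Hypothetical entry: page table at 0x00ABC000, present, writable,
    # user-accessible and already accessed.
    for name, value in decode_pde(0x00ABC027).items():
        print(name, value)
```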

Page table entry

Each 80386 page directory entry points to a page table, which holds up to 1024 page table entries of 4 bytes each. An entry contains the starting address of a page together with information about the page. The starting address of a page is also a multiple of 4K, so the low 12 bits of the entry are likewise free to hold this information.

Bits 31 to 12 hold the 20-bit physical page address. Bits 0 to 5 and bits 9 to 11 are used in the same way as in a page directory entry. Bit 6 is specific to page table entries: it is the dirty bit (D), which is set to 1 when the page is written to.

A 4GB address space has a single page directory with up to 1024 page directory entries, and each of these points to a page table with 1024 entries, so memory can be divided into 1024 x 1024 = 1M pages in total. Since each page is 4K bytes, this covers exactly 4GB.
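The sketch below builds a page table entry from a page address and flag bits, and models what happens to the accessed and dirty bits on a write. On real hardware the processor sets these bits itself; the helper names and the sample address are assumptions for the example.

```python
PTE_PRESENT = 1 << 0
PTE_RW = 1 << 1
PTE_USER = 1 << 2
PTE_ACCESSED = 1 << 5
PTE_DIRTY = 1 << 6       # bit 6, meaningful only in page table entries
FRAME_MASK = 0xFFFFF000  # bits 31..12 hold the page address


def make_pte(page_address: int, flags: int) -> int:
    """Combine a 4K-aligned page address with flag bits."""
    if page_address & ~FRAME_MASK:
        raise ValueError("page address must be 4K-aligned and fit in 32 bits")
    return page_address | (flags & 0xFFF)


def write_to_page(pte: int) -> int:
    """Model a write: the accessed and dirty bits both become 1."""
    return pte | PTE_ACCESSED | PTE_DIRTY


def is_dirty(pte: int) -> bool:
    return bool(pte & PTE_DIRTY)


if __name__ == "__main__":
    pte = make_pte(0x5000, PTE_PRESENT | PTE_RW)
    print(hex(pte), is_dirty(pte))
    pte = write_to_page(pte)
    print(hex(pte), is_dirty(pte))
```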
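The coverage figures in this paragraph follow directly from the table sizes, as this short sketch shows (the function names are chosen for the example):

```python
PAGE_SIZE = 4 * 1024       # bytes mapped by one page table entry
ENTRIES_PER_TABLE = 1024   # entries in a page table or page directory


def bytes_mapped_by_page_table() -> int:
    """One page table maps 1024 pages of 4K each."""
    return ENTRIES_PER_TABLE * PAGE_SIZE


def bytes_mapped_by_page_directory() -> int:
    """One page directory points to 1024 page tables."""
    return ENTRIES_PER_TABLE * bytes_mapped_by_page_table()


if __name__ == "__main__":
    print(bytes_mapped_by_page_table())      # 4M
    print(bytes_mapped_by_page_directory())  # 4G
```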

Conversion of linear address to physical address

A 32-bit linear address is converted to a physical address in three steps:

1. CR3 holds the starting address of the page directory. Take the top 10 bits of the 32-bit linear address (A31 to A22) as the index of the page directory entry, multiply it by 4, and add the page directory's starting address from CR3 to form the address of the page directory entry.

2. Fetch the 32-bit page directory entry from that address. With its low 12 bits treated as 0, it gives the starting address of the page table. Take bits A21 to A12 of the linear address as the index of the page table entry, multiply it by 4, and add the starting address of the page table to form the address of the page table entry.

3. Fetch the 32-bit page table entry from that address. With its low 12 bits treated as 0, it gives the 32-bit page address. Add bits A11 to A0 of the linear address as the offset to form the 32-bit physical address.
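The three steps can be sketched as a small software walk. This is a simulation under stated assumptions, not how any kernel does it: physical memory is modeled as a dictionary from address to 32-bit word, the names `translate` and `PageFault` are invented here, and the check of the present bit is added from the description of the P bit earlier in the article.

```python
PRESENT = 0x1
FRAME_MASK = 0xFFFFF000  # high 20 bits of an entry


class PageFault(Exception):
    """Raised when a needed entry has P=0 (hypothetical helper)."""


def translate(cr3: int, linear: int, memory: dict) -> int:
    # Step 1: address of the page directory entry.
    pde_address = (cr3 & FRAME_MASK) + ((linear >> 22) & 0x3FF) * 4
    pde = memory.get(pde_address, 0)
    if not pde & PRESENT:
        raise PageFault(f"page table not present for {linear:#010x}")

    # Step 2: address of the page table entry.
    pte_address = (pde & FRAME_MASK) + ((linear >> 12) & 0x3FF) * 4
    pte = memory.get(pte_address, 0)
    if not pte & PRESENT:
        raise PageFault(f"page not present for {linear:#010x}")

    # Step 3: page address plus the 12-bit offset.
    return (pte & FRAME_MASK) | (linear & 0xFFF)


if __name__ == "__main__":
    # Hypothetical layout: directory at 0x1000, one page table at 0x2000,
    # which maps linear page 0x00403000 to the page at 0x5000.
    memory = {0x1004: 0x2000 | PRESENT, 0x200C: 0x5000 | PRESENT}
    print(hex(translate(0x1000, 0x00403025, memory)))
```

In the usage example, the linear address 0x00403025 has directory index 1, table index 3 and offset 0x025, so the walk reads the words at 0x1004 and 0x200C.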

Extended Paging

Starting with the Pentium, Intel processors support extended paging, which allows a page size of 4MB.

With extended paging, the 32-bit linear address is divided into two fields: a directory field in the highest 10 bits and an offset in the remaining 22 bits.
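A sketch of the 10/22 split follows. The text above does not say where the page base sits in the directory entry; the code assumes the usual layout, in which the 4MB-aligned base occupies bits 31 to 22 of the entry. The function names and sample values are invented for the example.

```python
PS_4MB = 1 << 7              # page size flag in the page directory entry
BASE_4MB_MASK = 0xFFC00000   # assumed: bits 31..22 hold the 4MB page base
OFFSET_4MB_MASK = 0x003FFFFF # low 22 bits of the linear address


def split_4mb(linear: int):
    """Split a linear address into (directory index, 22-bit offset)."""
    return (linear >> 22) & 0x3FF, linear & OFFSET_4MB_MASK


def translate_4mb(pde: int, linear: int) -> int:
    """Translate through a directory entry that maps a 4MB page."""
    if not pde & PS_4MB:
        raise ValueError("entry does not describe a 4MB page")
    return (pde & BASE_4MB_MASK) | (linear & OFFSET_4MB_MASK)


if __name__ == "__main__":
    print(split_4mb(0x00C12345))
    print(hex(translate_4mb(0x40000000 | PS_4MB | 0x1, 0x00C12345)))
```

Only one table lookup is needed in this mode, because the directory entry maps the page directly.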

Page cache

With paging, every memory access would first have to read the two levels of page tables, which would greatly reduce access speed. To avoid this, the 386 has a hardware cache of recently used page translations, the translation lookaside buffer (TLB). The processor automatically keeps the 32 most recently used page addresses in it, so it covers 128K bytes of memory (32 x 4K). On a memory access, the processor first checks whether the page is in the cache. If it is, the two-level lookup is skipped; if not, the two-level lookup is performed. On average, the page cache has a hit rate of 98%, which means that only 2% of memory accesses have to go through the two-level page tables. This greatly improves speed.
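The lookup order (cache first, page tables only on a miss) can be modeled in software. The class below is a simplification: it evicts the least recently used entry, which is an assumption made for the sketch and not a description of the replacement policy of the real hardware. `TinyTLB` and the `walk` callback are hypothetical names.

```python
from collections import OrderedDict

PAGE_SIZE = 4 * 1024


class TinyTLB:
    """Toy page cache mapping page numbers to page frame numbers."""

    def __init__(self, capacity: int = 32):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def coverage_bytes(self) -> int:
        """Amount of memory reachable without a page table walk."""
        return self.capacity * PAGE_SIZE

    def lookup(self, page: int, walk) -> int:
        """Return the frame for a page, calling walk(page) only on a miss."""
        if page in self.entries:
            self.hits += 1
            self.entries.move_to_end(page)    # mark as most recently used
            return self.entries[page]
        self.misses += 1
        frame = walk(page)                    # the slow two-level lookup
        self.entries[page] = frame
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # drop the least recently used
        return frame


if __name__ == "__main__":
    tlb = TinyTLB()
    fake_walk = lambda page: page + 100       # stand-in for the table walk
    for page in (1, 2, 1, 1, 3):
        tlb.lookup(page, fake_walk)
    print(tlb.hits, tlb.misses, tlb.coverage_bytes())
```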

The paging mechanism in Linux

Linux uses a single paging model for both 32-bit and 64-bit systems, with four levels of page tables:

Page global directory

Page upper directory

Page middle directory

Page table

The page global directory contains the addresses of several page upper directories, which in turn contain the addresses of several page middle directories, and each page middle directory contains the addresses of several page tables. Each page table entry points to a page frame. The linear address is therefore divided into five parts. The number of bits in each part is not given here, because it depends on the specific computer architecture.

For a 32-bit system without Physical Address Extension (PAE), a two-level page table is sufficient. Linux essentially eliminates the page upper directory and page middle directory fields by treating them as containing zero bits. However, both levels keep their position in the sequence of pointers, so that the same code works on 32-bit and 64-bit systems. The kernel keeps a place for the page upper directory and the page middle directory by setting the number of entries in each to 1 and mapping that single entry onto the appropriate entry of the page global directory.
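The effect of giving two levels a width of zero bits can be illustrated with a generic splitter. This is a Python model, not kernel code: the short names pgd, pud, pmd and pte are the abbreviations Linux uses for the four levels, while the function names and the widths dictionary are assumptions for the sketch.

```python
LEVELS = ("pgd", "pud", "pmd", "pte", "offset")

# 32-bit x86 without PAE: the two middle levels are zero bits wide.
NON_PAE_32 = {"pgd": 10, "pud": 0, "pmd": 0, "pte": 10, "offset": 12}


def split_linear(linear: int, widths: dict) -> dict:
    """Split a linear address into the five parts, lowest field first."""
    fields = {}
    shift = 0
    for name in reversed(LEVELS):
        bits = widths[name]
        fields[name] = (linear >> shift) & ((1 << bits) - 1)
        shift += bits
    return fields


def entries_per_table(widths: dict) -> dict:
    """A level that is n bits wide has 2**n entries, so 0 bits gives 1 entry."""
    return {name: 1 << widths[name] for name in LEVELS[:-1]}


if __name__ == "__main__":
    print(split_linear(0x12345678, NON_PAE_32))
    print(entries_per_table(NON_PAE_32))
```

A folded level always yields index 0 and has exactly one entry, which is why the same four-level walk still works when the hardware has only two levels.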

A 32-bit system with PAE enabled uses a three-level page table. The Linux page global directory corresponds to the 80x86 page directory pointer table (PDPT), the page upper directory is eliminated, the page middle directory corresponds to the 80x86 page directory, and the Linux page table corresponds to the 80x86 page table.

Finally, 64-bit systems use three-level or four-level paging, depending on how the hardware divides up the bits of the linear address.
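The configurations discussed in this section can be compared side by side. The bit widths below are not given in the article: they are the usual values for x86 without PAE, x86 with PAE, and x86-64 with 48-bit linear addresses, and are stated here as assumptions. Each tuple lists the widths of the page global directory, page upper directory, page middle directory, page table and offset.

```python
LAYOUTS = {
    "32-bit, no PAE": (10, 0, 0, 10, 12),
    "32-bit, PAE": (2, 0, 9, 9, 12),
    "x86-64, 48-bit": (9, 9, 9, 9, 12),
}


def describe(widths: tuple) -> dict:
    """Summarize one layout: address width, table sizes, levels in use."""
    *levels, offset = widths
    return {
        "address_bits": sum(widths),
        "entries": [1 << bits for bits in levels],
        "levels_in_use": sum(1 for bits in levels if bits > 0),
        "page_size": 1 << offset,
    }


if __name__ == "__main__":
    for name, widths in LAYOUTS.items():
        print(name, describe(widths))
```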

Summary

We have not discussed the code implementation here and have focused only on the principles. As the discussion above shows, paging relies mainly on the hardware implementation. Linux uses a four-level page table simply to be compatible with as many hardware implementations as possible; the IA-32 architecture alone has several paging implementations, including regular paging and PAE.

Although the subject is the paging mechanism of Linux, most of the space has gone to the paging implementation of Intel CPUs. This is because Linux's paging is built on the hardware, and different platforms need different implementations. The virtual addresses that Linux constructs in software are ultimately translated to physical addresses by the MMU. In other words, however Linux implements paging, the CPU interprets linear addresses only according to its own paging implementation, so the linear addresses Linux gives to the CPU must match what the hardware expects. For example, on a 32-bit CPU the Linux four-level page table structure is made compatible with the hardware's two-level page table structure. Linux thus adds a layer of abstraction in software, using a four-level page table to accommodate the different hardware implementations of memory addressing on 32-bit and 64-bit CPUs.
