Page Table, TLB, and Cache Introduction (x86)
Paging Model
Taking x86 as an example, the Intel processor's paging unit handles 4 KB pages. A 32-bit linear address (virtual address) is divided into three fields:
- Directory: the upper 10 bits
- Table (page table): the middle 10 bits
- Offset: the lower 12 bits
The physical address of the page directory in use is held in control register CR3. The Directory field selects an entry in the page directory, which points to a page table. The Table field selects an entry in that page table, which contains the physical address of the page frame. Offset gives the position within that page frame.
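The 10/10/12 split described above can be sketched with a few shifts and masks. This is a minimal illustration only; the function name `split_linear_address` and the sample address are assumptions made for the example, not part of any real API.

```python
# Field widths for 32-bit x86 paging with 4 KB pages, as given in the text.
DIR_BITS, TABLE_BITS, OFFSET_BITS = 10, 10, 12


def split_linear_address(addr):
    """Return (directory index, table index, offset) for a 32-bit linear address."""
    if not 0 <= addr < 2 ** 32:
        raise ValueError("expected a 32-bit address")
    offset = addr & ((1 << OFFSET_BITS) - 1)                  # bits 11..0
    table = (addr >> OFFSET_BITS) & ((1 << TABLE_BITS) - 1)   # bits 21..12
    directory = addr >> (OFFSET_BITS + TABLE_BITS)            # bits 31..22
    return directory, table, offset


if __name__ == "__main__":
    d, t, o = split_linear_address(0x08049F24)  # arbitrary sample address
    print(f"directory={d:#x} table={t:#x} offset={o:#x}")
```

For the sample address 0x08049F24 this yields directory index 0x20, table index 0x49, and offset 0xF24.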
The fields of a page directory entry and a page table entry are:
- Address (bits 31-12): the base physical address of the page table (in a page directory entry) or of the page frame (in a page table entry).
- P (bit 0), Present: indicates whether the page table or page is in physical memory. 0 means it is not in main memory. If the P bit is 0 in the page directory entry or page table entry needed to translate a virtual address, the paging unit stores that virtual address in control register CR2 and raises a page fault exception.
- R/W (bit 1), Read/Write: indicates whether the page is read-only (0) or readable and writable (1). R/W write protection applies only when the processor runs at the user privilege level. At the supervisor privilege level the R/W bit is ignored, so the page can be read, written, or executed (on the 80486 and later, setting the WP flag in CR0 makes supervisor-level writes honor R/W as well).
- U/S (bit 2), User/Supervisor: 1 means the page can be used by both user-level (privilege level 3) and supervisor-level programs; 0 means it can be used only by supervisor-level (privilege levels 0-2) programs.
- PWT (bit 3), Page-level Write-Through: selects a write-through or write-back cache write policy.
- PCD (bit 4), Page-level Cache Disable: disables or enables caching.
- A (bit 5), Accessed: the processor sets this bit when the page table or page is read or written. Once set, the processor never clears it; only software can clear it.
- D (bit 6), Dirty: used only in page table entries. The processor sets it when the page is written. As with the Accessed bit, only software can clear it.
- PS (bit 7), Page Size: used only in page directory entries. If it is 1, the page directory entry points to a 2 MB/4 MB page frame.
- PAT (bit 7), Page Attribute Table (introduced with the Pentium III): together with PCD and PWT, selects a PAT entry, which in turn determines the memory type of the page.
- G (bit 8), Global (introduced with the Pentium Pro): 1 marks a global page, which prevents frequently used pages from being flushed from the TLB. It takes effect only when the page global enable (PGE) flag in CR4 is set.
- AVL (bits 9-11): available for the system programmer's use.
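As an illustration of the bit layout listed above, the following sketch decodes a 32-bit page table entry. The names `FLAG_BITS` and `decode_pte` and the sample value are hypothetical, chosen only for this example. Bit 7 is labelled PAT here because the entry is treated as a page table entry; in a page directory entry the same bit is PS.

```python
# Bit positions of the flags in a 32-bit page table entry (from the list above).
FLAG_BITS = {
    "P": 0, "R/W": 1, "U/S": 2, "PWT": 3, "PCD": 4,
    "A": 5, "D": 6, "PAT": 7, "G": 8,
}


def decode_pte(entry):
    """Split a 32-bit page table entry into frame base, AVL bits, and flags."""
    flags = {name: bool((entry >> bit) & 1) for name, bit in FLAG_BITS.items()}
    return {
        "frame_base": entry & 0xFFFFF000,  # bits 31..12
        "avl": (entry >> 9) & 0b111,       # bits 11..9
        "flags": flags,
    }


if __name__ == "__main__":
    info = decode_pte(0x0009F067)  # made-up entry: P, R/W, U/S, A, D set
    print(hex(info["frame_base"]), info["avl"], info["flags"])
```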
- The paging unit first compares the upper 20 bits of the linear address with all 32 entries in the TLB.
• On a match (TLB hit), the page base address is obtained directly, and adding the low 12-bit offset of the linear address to it gives the 32-bit physical address.
• If nothing matches (TLB miss), the processor performs the two-level lookup through the page directory and page table. The upper 20 bits of the linear address are then stored in the TLB together with the page base address read from the page table.
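The hit and miss paths above can be modelled in software. The sketch below is a toy model, not a description of real hardware: the class `Mmu`, the dictionary-based page directory layout, and the FIFO eviction policy are all assumptions made for illustration (the text does not say how the TLB chooses an entry to replace).

```python
from collections import OrderedDict

PAGE_SHIFT = 12    # 4 KB pages
TLB_ENTRIES = 32   # TLB size quoted in the text
PRESENT = 0x1      # P flag, bit 0


class PageFault(Exception):
    """Raised when a directory or table entry has P = 0."""


class Mmu:
    def __init__(self, page_directory):
        # Hypothetical layout: page_directory maps a directory index to a
        # (flags, page_table) pair; each page_table maps a table index to a
        # page table entry (frame base | flags).
        self.page_directory = page_directory
        self.tlb = OrderedDict()  # upper 20 bits of linear address -> frame base
        self.hits = self.misses = 0

    def translate(self, linear):
        vpn, offset = linear >> PAGE_SHIFT, linear & 0xFFF
        if vpn in self.tlb:                    # TLB hit
            self.hits += 1
            return self.tlb[vpn] + offset
        self.misses += 1                       # TLB miss: two-level lookup
        frame = self._walk(linear)
        if len(self.tlb) >= TLB_ENTRIES:       # assumed policy: evict oldest
            self.tlb.popitem(last=False)
        self.tlb[vpn] = frame
        return frame + offset

    def _walk(self, linear):
        dir_index, table_index = linear >> 22, (linear >> 12) & 0x3FF
        flags, table = self.page_directory.get(dir_index, (0, {}))
        pte = table.get(table_index, 0)
        if not (flags & PRESENT and pte & PRESENT):
            raise PageFault(hex(linear))       # hardware would load CR2 here
        return pte & 0xFFFFF000


if __name__ == "__main__":
    mmu = Mmu({0x20: (PRESENT, {0x49: 0x0009F000 | PRESENT})})
    print(hex(mmu.translate(0x08049F24)))  # miss, then page table walk
    print(hex(mmu.translate(0x08049F28)))  # hit on the same page
    print(mmu.hits, mmu.misses)
```

The second access falls in the same 4 KB page as the first, so it is served from the TLB without walking the page tables again.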
TLB
- The TLB (translation lookaside buffer) is a cache of page table entries held in memory; it speeds up the translation of virtual addresses to physical addresses.
Without a TLB, every data access would need at least two memory accesses: one to query the page table for the physical address, and one to fetch the data itself.
32-bit address division and cache
A memory address is divided into three parts: tag, index, and block offset.
- The index selects the cache set, the tag corresponds to the high-order bits of the memory address, and the block offset gives the position of the requested data within the data block of a cache line.
- Index length = log2(number of cache sets)
- Block offset length = log2(bytes per data block). By the same rule, a 4 KB page has a 12-bit offset.
- Tag length = 32 - index length - block offset length
Taking a 32 KB, 4-way set-associative cache with a 16-byte line size as an example, the address divides in normal operation into a 19-bit tag, a 9-bit index, and a 4-bit block offset.
- The cache is divided into lines of Line Size bytes. The line is the cache's unit of data storage and management. A typical cache line consists of a Tag field, a Status field, and a Data field.
- The Tag field stores the high-order address bits of the data held in the line. After the CPU has indexed into a set, it compares the address against the Tags of all lines in that set to pick out the right line.
- The number of bytes the Data field holds is the Line Size, which is the unit of data exchange between the cache and memory.
- The Status field holds control bits (such as Valid, Lock, and Parity check bits). Its contents differ slightly between cache implementations.
- For example, a 32 KB, 4-way set-associative cache with 16-byte lines has 32*1024/(16*4) = 512 sets.
- When the CPU accesses a set-associative cache, it first selects the set using the index and then matches the Tag within that set to select the way.
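The formulas and the 32 KB example above can be checked with a short sketch. The helper names `cache_geometry`, `split_address`, and `lookup`, as well as the list-of-dicts representation of a set, are assumptions made for illustration. The code also assumes that the line size and the number of sets are powers of two.

```python
def cache_geometry(cache_bytes, ways, line_bytes, addr_bits=32):
    """Return (sets, tag_bits, index_bits, offset_bits) for a set-associative cache."""
    sets = cache_bytes // (line_bytes * ways)
    offset_bits = line_bytes.bit_length() - 1   # log2(line size), power of two assumed
    index_bits = sets.bit_length() - 1          # log2(number of sets)
    tag_bits = addr_bits - index_bits - offset_bits
    return sets, tag_bits, index_bits, offset_bits


def split_address(addr, index_bits, offset_bits):
    """Split an address into (tag, index, block offset)."""
    offset = addr & ((1 << offset_bits) - 1)
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (index_bits + offset_bits)
    return tag, index, offset


def lookup(cache_sets, addr, index_bits, offset_bits):
    """Select the set by index, then match the tag; return the way or None."""
    tag, index, _ = split_address(addr, index_bits, offset_bits)
    for way, line in enumerate(cache_sets[index]):
        if line["valid"] and line["tag"] == tag:
            return way
    return None


if __name__ == "__main__":
    sets, tag_bits, index_bits, offset_bits = cache_geometry(32 * 1024, 4, 16)
    print(sets, tag_bits, index_bits, offset_bits)

    # Build an empty 4-way cache and install one line by hand.
    cache = [[{"valid": False, "tag": 0} for _ in range(4)] for _ in range(sets)]
    tag, index, _ = split_address(0x0009FF24, index_bits, offset_bits)
    cache[index][2] = {"valid": True, "tag": tag}
    print(lookup(cache, 0x0009FF24, index_bits, offset_bits))
```

For the 32 KB, 4-way, 16-byte-line cache this gives 512 sets, a 19-bit tag, a 9-bit index, and a 4-bit block offset, matching the calculation in the text.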