There are two basic ways to manage memory in a computer: segment management and page management. On an 80x86 microprocessor, three kinds of memory address are distinguished: logical address, linear address, and physical address. How are they related, and how is memory actually addressed? This article is mainly about memory addressing.
1. Basic concepts:
CPU Segment Management: The basic principle of segment management is to divide a program into several segments, each of which is a logical entity. The segments of a user job or process form a two-dimensional virtual address space (segment, offset within segment). Because the program is divided into multiple modules (segments), each module can be written and compiled separately. The segment manager allocates memory in units of segments and then translates segment virtual addresses into actual memory addresses through an address mapping mechanism.
CPU Page Management: The basic principle of page management is to divide the virtual space of each process into pages of equal length (typically 4 KB). Memory is likewise divided into blocks of the same size, called page frames, and a page table records the one-to-one correspondence between page virtual addresses and memory addresses. A hardware address translation mechanism then uses the page table to solve the problem of translating addresses that are scattered discretely in memory.
Logical address: The address included in a machine language instruction to specify an operand or an instruction. Each logical address consists of a segment and an offset, written as [segment identifier: offset within segment]. For example, the address of a variable that we manipulate through a pointer in a C++ program is a logical address (more precisely, the offset-within-segment part of the logical address). In terms of the segment management described above, the logical address is the program address before segment translation.
Linear address: Also known as the virtual address. It is a 32-bit unsigned integer, so it can address up to 4 GB. Like the logical address, the linear address is not a real memory address. In terms of the page management described above, the linear address is the address before page translation.
Physical address: Used to address memory cells at the memory-chip level; it corresponds to the signals on the address bus that connects the processor to memory. Loosely speaking (though not quite accurately), this is the address of the memory on the computer's memory modules.
With these basic concepts in place, it is clear that the CPU needs two steps to translate an address in the virtual memory space into a physical address. First, given a logical address, the CPU uses its segmentation unit to convert it into a linear address. Second, it uses its paging unit to convert the linear address into the final physical address. This is what is known as segmented paging; performing two translations combines the two schemes so that each offsets the drawbacks of the other.
2. CPU Segment Memory Management: logical address translated to linear address
A logical address consists of two parts: a segment identifier and an offset that specifies the relative address within the segment (the offset for short), written as [segment identifier: offset within segment].
A segment identifier is a 16-bit field called a segment selector. The processor provides segment registers to hold segment selectors, and there are 6 of them:
(1) CS, the code segment register, pointing to the segment containing the program instructions;
(2) SS, the stack segment register, pointing to the segment containing the current program stack;
(3) DS, the data segment register, pointing to the segment containing static or global data;
(4) ES, FS and GS, the three additional segment registers, which are general purpose and can point to arbitrary data segments.
The offset indicates the distance from the beginning of the segment to the actual address; it is 32 bits long.
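As shown in the figure above, the segment identifier (segment selector) is made up of a 13-bit index number, a 1-bit table indicator (TI), and a 2-bit requested privilege level (RPL). The most critical part is the index number.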
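To make the selector layout concrete, here is a minimal Python sketch that splits a 16-bit selector into its fields. It assumes the standard protected-mode layout (index in bits 15..3, TI in bit 2, RPL in bits 1..0); the function name `decode_selector` is made up for illustration.

```python
def decode_selector(selector: int) -> dict:
    """Split a 16-bit segment selector into its three fields."""
    if not 0 <= selector <= 0xFFFF:
        raise ValueError("a segment selector is 16 bits wide")
    return {
        "index": selector >> 3,       # bits 15..3: entry number in the descriptor table
        "ti": (selector >> 2) & 0x1,  # bit 2: 0 = GDT, 1 = LDT
        "rpl": selector & 0x3,        # bits 1..0: requested privilege level
    }


print(decode_selector(0x0010))  # {'index': 2, 'ti': 0, 'rpl': 0}
```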
As its name suggests, the segment identifier identifies a segment. But how is the segment itself represented?
This leads to another concept, the segment descriptor: each segment is represented by an 8-byte segment descriptor that describes the characteristics of the segment (so the segment descriptor stands for the segment). The segment identifier locates its corresponding segment (segment descriptor) through the 13-bit index number. The format of the segment descriptor is shown in the following figure:
The field we need to focus on in the segment descriptor is Base, which holds the linear address at which the segment begins.
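One detail that is easy to miss is that the 32-bit Base field is not stored contiguously inside the 8-byte descriptor. The sketch below assumes the standard x86 descriptor layout (Base bits 15..0 in bytes 2-3, bits 23..16 in byte 4, bits 31..24 in byte 7) and ignores every other field; `descriptor_base` is an illustrative name only.

```python
def descriptor_base(descriptor: bytes) -> int:
    """Reassemble the 32-bit Base field from an 8-byte segment descriptor."""
    if len(descriptor) != 8:
        raise ValueError("a segment descriptor is 8 bytes long")
    base_low = int.from_bytes(descriptor[2:4], "little")  # Base bits 15..0
    base_mid = descriptor[4]                              # Base bits 23..16
    base_high = descriptor[7]                             # Base bits 31..24
    return base_low | (base_mid << 16) | (base_high << 24)


# A hand-built descriptor whose Base is 0x12345678 (the other bytes are arbitrary here).
example = bytes([0xFF, 0xFF, 0x78, 0x56, 0x34, 0x92, 0xCF, 0x12])
print(hex(descriptor_base(example)))  # 0x12345678
```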
Segment descriptors are stored in the Global Descriptor Table (GDT, whose location is held in the GDTR register) or in a Local Descriptor Table (LDT, whose location is held in the LDTR register). Usually only one GDT is defined; in addition to the segments described in the GDT, each process can have its own LDT if it needs to create additional segments.
With the above explained, we can now look at how a logical address is translated into the corresponding linear address. The steps are as follows:
(1) Check the TI field of the segment selector of the logical address to determine which descriptor table holds the segment descriptor (TI=0 means the descriptor is in the GDT, TI=1 means it is in the LDT).
(2) Compute the address of the segment descriptor from the index number of the segment selector: descriptor address = index number * 8 + the table address held in the GDTR or LDTR register. Then read the Base field from the descriptor at that address.
(3) Add the offset of the logical address to the value of the Base field obtained in step (2) to get the corresponding linear address.
The conversion diagram for the logical address is shown in the following figure:
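The three steps can be sketched in Python as follows. This is a simplified model, not how any real system is written: memory is a plain `bytearray`, `gdtr_base` and `ldtr_base` are assumed to hold the table start addresses directly, limit and privilege checks are left out, and all names (`make_descriptor`, `read_base`, `logical_to_linear`) are hypothetical.

```python
def make_descriptor(base: int) -> bytes:
    """Build an 8-byte descriptor with only the Base field filled in."""
    d = bytearray(8)
    d[2:4] = (base & 0xFFFF).to_bytes(2, "little")  # Base bits 15..0
    d[4] = (base >> 16) & 0xFF                      # Base bits 23..16
    d[7] = (base >> 24) & 0xFF                      # Base bits 31..24
    return bytes(d)


def read_base(descriptor: bytes) -> int:
    """Extract the Base field from an 8-byte descriptor."""
    low = int.from_bytes(descriptor[2:4], "little")
    return low | (descriptor[4] << 16) | (descriptor[7] << 24)


def logical_to_linear(memory, gdtr_base, ldtr_base, selector, offset):
    ti = (selector >> 2) & 0x1                      # step (1): which table?
    table_base = ldtr_base if ti else gdtr_base
    index = selector >> 3
    desc_addr = table_base + index * 8              # step (2): locate the descriptor
    base = read_base(bytes(memory[desc_addr:desc_addr + 8]))
    return (base + offset) & 0xFFFFFFFF             # step (3): Base + offset


# Toy setup: a GDT at 0x40 and an LDT at 0x80 inside a 256-byte "memory".
memory = bytearray(256)
gdtr_base, ldtr_base = 0x40, 0x80
memory[0x40 + 2 * 8:0x40 + 3 * 8] = make_descriptor(0x00100000)  # GDT entry 2
memory[0x80 + 1 * 8:0x80 + 2 * 8] = make_descriptor(0x00002000)  # LDT entry 1

linear = logical_to_linear(memory, gdtr_base, ldtr_base, 0x0010, 0x1234)
print(hex(linear))  # 0x101234
```

Selector 0x0010 has TI=0 and index 2, so the descriptor is read from the GDT at 0x40 + 2 * 8, and its Base 0x00100000 is added to the offset 0x1234.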
3. CPU Page Memory Management: linear address translated to physical address
First, we need to know something about the linear address.
(1) Linear addresses are grouped into fixed-length intervals called pages. Contiguous linear addresses within a page are mapped to contiguous physical addresses.
(2) The paging unit divides all physical memory into fixed-length page frames, also called physical pages.
(3) A data structure that maps a linear address to a physical address is called a page table.
(4) A 32-bit linear address is divided into 3 fields: the directory (high 10 bits), the page table (middle 10 bits), and the offset (low 12 bits). Because the offset is 12 bits wide, each page contains 4096 bytes of data.
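The 10/10/12 split in point (4) comes down to a few shifts and masks. A minimal Python sketch (the function name `split_linear` and the sample address are chosen only for illustration):

```python
def split_linear(addr: int) -> tuple:
    """Split a 32-bit linear address into (directory, table, offset)."""
    directory = (addr >> 22) & 0x3FF  # high 10 bits
    table = (addr >> 12) & 0x3FF      # middle 10 bits
    offset = addr & 0xFFF             # low 12 bits, 0..4095
    return directory, table, offset


print(split_linear(0x08049ABC))  # (32, 73, 2748)
```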
The translation of a linear address is done in two steps, each based on a translation table: the first uses the page directory and the second uses a page table. The purpose of this two-level scheme is to reduce the amount of RAM required for each process's page tables: a single-level table covering the whole 4 GB space would need 2^20 entries per process, whereas with two levels a page table is needed only for the regions the process actually uses. It works like the table of contents of a book, convenient and quick. The exact translation is shown in the following figure:
Conversion steps:
(1) Read the address of the process's page directory from CR3 (the operating system is responsible for loading this address into the register when it schedules the process);
(2) Use the high 10 bits of the linear address as an index into the page directory (an array) and find the corresponding entry. Because of the two-level scheme, an entry in the page directory is not the address of a page but the address of a page table (itself another array, which is where the page addresses are stored);
(3) Use the middle 10 bits of the linear address as an index into the page table (also an array) and find the starting address of the page;
(4) Add the starting address of the page to the low 12 bits of the linear address to get the physical address we are after.
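The four steps can be simulated with a small Python sketch. Several things here are assumptions rather than something taken from the text: physical memory is modelled as a dict from address to 32-bit word, each entry is 4 bytes wide, the low 12 bits of an entry hold flags with bit 0 meaning "present", and the names `read_entry` and `translate` are invented.

```python
PRESENT = 0x1           # assumed flag bit: entry is valid
ADDR_MASK = 0xFFFFF000  # high 20 bits of an entry hold a 4 KB-aligned address


def read_entry(phys_mem: dict, table_base: int, index: int) -> int:
    """Read one 4-byte entry from a page directory or page table."""
    entry = phys_mem.get(table_base + index * 4, 0)
    if not entry & PRESENT:
        raise LookupError("entry not present (a real CPU would raise a page fault)")
    return entry & ADDR_MASK


def translate(phys_mem: dict, cr3: int, linear: int) -> int:
    directory = (linear >> 22) & 0x3FF
    table = (linear >> 12) & 0x3FF
    offset = linear & 0xFFF
    page_dir_base = cr3 & ADDR_MASK                                 # step (1)
    page_table_base = read_entry(phys_mem, page_dir_base, directory)  # step (2)
    page_base = read_entry(phys_mem, page_table_base, table)        # step (3)
    return page_base + offset                                       # step (4)


# Toy setup: page directory at 0x1000, one page table at 0x2000, one page at 0x5000.
cr3 = 0x1000
phys_mem = {
    0x1000 + 32 * 4: 0x2000 | PRESENT,  # directory entry 32 -> page table
    0x2000 + 73 * 4: 0x5000 | PRESENT,  # table entry 73 -> page frame
}
physical = translate(phys_mem, cr3, 0x08049ABC)
print(hex(physical))  # 0x5abc
```

For the linear address 0x08049ABC the directory index is 32, the table index is 73 and the offset is 0xABC, so the walk ends at page 0x5000 plus 0xABC.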
Reference:
(1) Understanding the Linux Kernel
(2) Lonely nine cheap http://bbs.chinaunix.net/thread-2083672-1-1.html