Logical, linear, physical, and virtual addresses as I understand them

Source: Internet
Author: User

[Repost] http://bbs.chinaunix.net/thread-2083672-1-1.html

With the Chinese New Year coming, here is a year-end summary post. It is only my personal understanding, so it may well contain mistakes.
The hardware platform discussed in this post is x86. On other platforms the details may not map one-to-one, but I think the general ideas carry over.

1. Concepts

Physical address

Used for addressing at the level of memory chips; it corresponds to the address bus that connects the processor to memory.

-- This is probably the easiest of these concepts to understand, but one thing is worth pointing out. You can think of physical addresses as the memory modules plugged into the machine: treat the memory as one large array numbered byte by byte from 0 up to the highest byte, and call the index into that array the physical address. In fact this is only an abstraction that the hardware presents to software; memory is not really addressed that way. So "corresponding to the address bus" is the more accurate description. Still, if you set aside how physical memory is actually addressed, mapping physical addresses one-to-one onto physical memory is acceptable. Perhaps a slightly wrong understanding is more useful for forming a mental abstraction.

Virtual memory

This is an abstract description of memory as a whole (do not equate it with the memory modules plugged into the machine). It is defined relative to physical memory and can be understood as "indirect" or "fake" memory. For example, the memory address 0x08000000 does not refer to the element at that position in the large physical-address array.

This is because modern operating systems provide an abstraction for memory management, namely virtual memory. A process uses addresses in virtual memory, and the operating system, assisted by the relevant hardware, "converts" them into real physical addresses. This "conversion" is the key to the whole discussion.

With this abstraction, a program can use an address space much larger than real physical memory (robbing Peter to pay Paul, just as banks do), and several processes can even use the same address. That is not surprising, because the physical addresses after conversion are not the same.

-- You can disassemble a linked program and see that the linker has already assigned addresses. For example, to call a function A, the code is not "call A" but call 0x08111111; that is, the address of function A is already fixed. Without such a "conversion", that is, without the concept of virtual addresses, this would simply not be feasible.

The problem persists.

Logical address

Intel kept the segmented memory management of earlier days for compatibility. A logical address is the address used in a machine language instruction to specify an operand or an instruction. In the example above, the address 0x08111111 that the linker assigned to A is a logical address.

-- Sorry, that does not quite meet Intel's requirements for logical addresses under segmented management: "a logical address consists of a segment identifier plus an offset relative to the start of that segment, written as [segment identifier: intra-segment offset]". That is, the 0x08111111 in the example above should be written as [A's code segment identifier: 0x08111111] to be complete.

Linear address (also called virtual address)

Like the logical address, it is not a real address. If the logical address is the address before the hardware's segmentation unit does its conversion, then the linear address is the address before the hardware's paging unit does its conversion.


The CPU takes two steps to convert an address in a virtual memory space into a physical address. First, given a logical address (in fact an offset within a segment; this must be understood!), the CPU uses its segmentation unit to convert the logical address into a linear address, and then uses its paging unit to convert that into the final physical address.

Two conversions are indeed cumbersome and unnecessary, because linear addresses could be exposed to processes directly. The redundancy exists only because Intel wants to stay fully compatible.

2. CPU segmentation: how logical addresses are converted to linear addresses

A logical address consists of two parts: a segment identifier and an intra-segment offset. The segment identifier is a 16-bit field called the segment selector. The upper 13 bits are an index; the lower 3 bits hold some hardware details: a 1-bit table indicator (TI) and a 2-bit requested privilege level (RPL).

The lowest two bits are used for privilege checks, which this post does not cover.
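The selector layout described above can be sketched in C. This is only an illustration: the function names are hypothetical, and the bit positions follow the 13-bit index / 1-bit TI / 2-bit RPL split described in the text.

```c
#include <stdint.h>

/* Field layout of a 16-bit x86 segment selector:
 *   bits 15..3  index into the descriptor table
 *   bit  2      TI, table indicator (0 = GDT, 1 = LDT)
 *   bits 1..0   RPL, requested privilege level
 * The function names are hypothetical, chosen for this sketch. */
static inline unsigned selector_index(uint16_t sel) { return sel >> 3; }
static inline unsigned selector_ti(uint16_t sel)    { return (sel >> 2) & 0x1; }
static inline unsigned selector_rpl(uint16_t sel)   { return sel & 0x3; }
```

For instance, the selector 0x73 decodes to index 14, TI 0 and RPL 3.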

The index can be understood directly as an array subscript, so there must be an array that it indexes. What does it index? The "segment descriptor". A segment descriptor describes one segment (my picture of a "segment" is taking a knife and cutting virtual memory into several pieces). Many segment descriptors are grouped into an array called the "segment descriptor table", so the upper 13 bits of the segment selector locate a specific segment descriptor in that table directly. That picture of a segment is not quite accurate, so let's look at what is actually in a descriptor, that is, how it describes a segment, to understand what a segment really is. Each segment descriptor consists of 8 bytes, holding fields such as the base address, the limit, and access attributes.

These fields are quite complicated. Although a data structure could be defined for them, here I only care about one: the base field, which gives the linear address at which a segment starts.
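As a minimal sketch of reading the base field, the helper below reassembles it from a raw 8-byte descriptor. The name descriptor_base is hypothetical; the bit positions are those of the standard x86 descriptor format, where the base is split into two pieces.

```c
#include <stdint.h>

/* Reassemble the 32-bit base address scattered across an 8-byte
 * x86 segment descriptor:
 *   descriptor bits 16..39 hold base bits 0..23
 *   descriptor bits 56..63 hold base bits 24..31
 * descriptor_base is a hypothetical helper name. */
static inline uint32_t descriptor_base(uint64_t desc)
{
    uint32_t low24 = (uint32_t)((desc >> 16) & 0xFFFFFF);
    uint32_t high8 = (uint32_t)((desc >> 56) & 0xFF);
    return (high8 << 24) | low24;
}
```

A made-up descriptor value such as 0x12cf9a345678ffff would yield the base 0x12345678.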

In Intel's design, global segment descriptors are placed in the "global descriptor table (GDT)", while descriptors local to a process can be placed in a "local descriptor table (LDT)". So when is the GDT used and when the LDT? This is indicated by the TI field of the segment selector: TI = 0 means GDT, TI = 1 means LDT.

The address and size of the GDT in memory are stored in the CPU's GDTR register, and those of the LDT in the LDTR register.

So many concepts, like a tongue twister. Step by step it is more intuitive:

First, we are given a complete logical address [segment selector: intra-segment offset].

1. Check whether TI in the segment selector is 0 or 1 to learn whether the segment to convert is in the GDT or in the LDT, then obtain that table's address and size from the corresponding register. Now we have an array.

2. Take the upper 13 bits of the segment selector and use them to find the corresponding segment descriptor in this array. Now we know the base address.

3. base + offset is the linear address we want.
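The three steps above can be sketched as a small C function. This is a toy model under stated assumptions, not how the hardware is implemented: struct desc_table stands in for what GDTR/LDTR describe, all names are hypothetical, and the limit and privilege checks that real hardware performs are left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy model of a descriptor table: an array of 8-byte descriptors plus
 * its entry count, standing in for what GDTR/LDTR describe.
 * All names here are hypothetical. */
struct desc_table {
    const uint64_t *entries;
    size_t count;
};

/* Base = descriptor bits 56..63 (high 8) and bits 16..39 (low 24). */
static uint32_t descriptor_base(uint64_t desc)
{
    return (uint32_t)(((desc >> 56) & 0xFF) << 24) |
           (uint32_t)((desc >> 16) & 0xFFFFFF);
}

/* Returns 0 on success and stores the linear address in *linear;
 * returns -1 if the selector indexes past the end of the table.
 * Limit and privilege checks are deliberately omitted. */
static int logical_to_linear(uint16_t selector, uint32_t offset,
                             const struct desc_table *gdt,
                             const struct desc_table *ldt,
                             uint32_t *linear)
{
    /* Step 1: TI (bit 2) selects the GDT or the LDT. */
    const struct desc_table *table = (selector & 0x4) ? ldt : gdt;

    /* Step 2: the upper 13 bits index the descriptor array. */
    size_t index = selector >> 3;
    if (index >= table->count)
        return -1;

    /* Step 3: linear address = base + offset. */
    *linear = descriptor_base(table->entries[index]) + offset;
    return 0;
}
```

With a descriptor whose base is 0, the function simply returns the offset unchanged.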

Quite simple. For software, in principle, the job is to prepare the information the hardware needs so that the hardware can do the conversion. OK, let's see how Linux does it.

3. Segment management in Linux

Intel requires two conversions. This keeps compatibility but is redundant. Well, there is no way around it: if the hardware demands it, the software has to go along, even if it is only a formality.

On the other hand, some other hardware platforms have no such two-stage conversion at all, and Linux needs to provide a higher-level abstraction with a unified interface. So Linux's segment management really just "fools" the hardware.

As Intel intended, the GDT is global and each process uses its own LDT. Linux, however, uses the same segments for all processes to address instructions and data: the user data segment and user code segment, and correspondingly the kernel data segment and kernel code segment. There is nothing odd about going through the motions; it is just like writing a year-end summary.

include/asm-i386/segment.h

#define GDT_ENTRY_DEFAULT_USER_CS 14
#define __USER_CS (GDT_ENTRY_DEFAULT_USER_CS * 8 + 3)

#define GDT_ENTRY_DEFAULT_USER_DS 15
#define __USER_DS (GDT_ENTRY_DEFAULT_USER_DS * 8 + 3)

#define GDT_ENTRY_KERNEL_BASE 12

#define GDT_ENTRY_KERNEL_CS (GDT_ENTRY_KERNEL_BASE + 0)
#define __KERNEL_CS (GDT_ENTRY_KERNEL_CS * 8)

#define GDT_ENTRY_KERNEL_DS (GDT_ENTRY_KERNEL_BASE + 1)
#define __KERNEL_DS (GDT_ENTRY_KERNEL_DS * 8)


Replacing the macros with their numeric values:

#define __USER_CS 115 [0000000001110 0 11]
#define __USER_DS 123 [0000000001111 0 11]
#define __KERNEL_CS 96 [0000000001100 0 00]
#define __KERNEL_DS 104 [0000000001101 0 00]


The square brackets show the 16-bit binary form of the four segment selectors, split into index, TI and RPL. Their index and TI values can be read off directly:

__USER_CS index = 14 TI = 0
__USER_DS index = 15 TI = 0
__KERNEL_CS index = 12 TI = 0
__KERNEL_DS index = 13 TI = 0
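The arithmetic behind these values can be checked with a small helper that builds a selector from its parts. make_selector is a hypothetical name used only for this sketch, not a kernel function; multiplying the index by 8 in the macros is the same as shifting it left by 3 bits.

```c
#include <stdint.h>

/* Build a selector from a table index, TI bit and RPL.
 * index * 8 is the same as index << 3; make_selector is a
 * hypothetical helper, not a kernel function. */
static inline uint16_t make_selector(unsigned index, unsigned ti, unsigned rpl)
{
    return (uint16_t)((index << 3) | ((ti & 0x1) << 2) | (rpl & 0x3));
}
```

make_selector(14, 0, 3) gives 115 and make_selector(12, 0, 0) gives 96, matching the user and kernel code segment selectors.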


TI is 0 in all four, so the GDT is used. Now look at entries 12 to 15 of the initialized GDT (arch/i386/kernel/head.S):

.quad 0x00cf9a000000ffff /* 0x60 kernel 4GB code at 0x00000000 */
.quad 0x00cf92000000ffff /* 0x68 kernel 4GB data at 0x00000000 */
.quad 0x00cffa000000ffff /* 0x73 user 4GB code at 0x00000000 */
.quad 0x00cff2000000ffff /* 0x7b user 4GB data at 0x00000000 */

Expanding them according to the segment descriptor layout described earlier, you find that the bits holding the base (bits 16-39 and 56-63) are all 0; that is, the base address of all four segments is 0.
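The "4GB" and "kernel"/"user" notes in the comments can also be read out of the same quads. The sketch below uses hypothetical helper names and the standard x86 descriptor bit positions: a 20-bit limit, the G (granularity) flag, and the DPL (descriptor privilege level).

```c
#include <stdint.h>

/* Hypothetical helpers that pick fields out of an 8-byte descriptor. */

/* 20-bit limit: descriptor bits 0..15 plus bits 48..51. */
static uint32_t desc_limit(uint64_t d)
{
    return (uint32_t)(d & 0xFFFF) | (uint32_t)(((d >> 48) & 0xF) << 16);
}

/* G flag (bit 55): when set, the limit counts 4 KB units, not bytes. */
static int desc_granularity(uint64_t d)
{
    return (int)((d >> 55) & 0x1);
}

/* DPL (bits 45..46): 0 is the most privileged level, 3 the least. */
static unsigned desc_dpl(uint64_t d)
{
    return (unsigned)((d >> 45) & 0x3);
}

/* Segment size in bytes; 64-bit so that 4 GB does not overflow. */
static uint64_t desc_size_bytes(uint64_t d)
{
    uint64_t units = (uint64_t)desc_limit(d) + 1;
    return desc_granularity(d) ? units * 4096 : units;
}
```

For all four entries the limit is 0xFFFFF with G set, so the size is 2^20 * 4 KB = 4 GB; the two kernel entries have DPL 0 and the two user entries have DPL 3.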

So, given an intra-segment offset, the conversion formula above gives 0 + offset = linear address, which leads to an important conclusion: "in Linux, the logical address and the linear address are always identical (identical, not merely 'similar' as some people say); that is, the offset field of a logical address always equals the linear address!"

Many details have been ignored here, such as segment permission checks. Haha.

In Linux, most processes do not use an LDT, except when Wine is used to run Windows programs.

4. CPU page-based memory management

The CPU's paging unit translates a linear address into a physical address. For ease of management and efficiency, linear addresses are divided into fixed-length groups called pages. For example, a 32-bit machine has at most 4 GB of linear addresses; with 4 KB pages, the whole linear address space is divided into one large array total_page[2^20], 2^20 pages in all. This large array is called the page directory. Each entry in the directory is an address: the address of the corresponding page.

The other kind of "page" is the physical page, also called a page frame. The paging unit divides all physical memory into fixed-length units for management, and their length generally matches that of a memory page one-to-one.

Note that the total_page array has 2^20 entries, each of which is an address (on a 32-bit machine an address is 4 bytes), so representing such an array takes 4 MB of memory. To save space, a two-level scheme was introduced to organize the paging unit. It works as follows:

1. Within the paging unit, the page directory is unique, and its address is held in the CPU's CR3 register, which is the starting point of address translation. The long journey begins here.

2. Every active process has its own virtual memory (and its own unique page directory), so each corresponds to a separate page directory address. To run a process, its page directory address must be loaded into CR3, and the previous process's address must be saved.

3. Each 32-bit linear address is divided into three parts: page directory index (10 bits) : page table index (10 bits) : offset (12 bits)

The translation proceeds as follows:

1. Take the process's page directory address from CR3 (the operating system loads it into that register when it schedules the process);

2. Use the upper 10 bits of the linear address to index the array and find the corresponding entry. Because of the two-level scheme, an entry in the page directory is no longer the address of a page but the address of a page table (another array has been introduced); the page addresses are stored in the page tables.

3. Use the middle 10 bits of the linear address to look up the start address of the page in the page table (also an array);

4. Add the page's start address and the lower 12 bits of the linear address to get the physical address we are after;
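The four steps above can be sketched in C against a toy "physical memory". This is a simplified model: phys_mem, read_phys and translate are hypothetical names, physical memory is modelled as an array of 32-bit words, and the present-bit and protection checks that real hardware performs are omitted.

```c
#include <stdint.h>

#define FRAME_MASK 0xFFFFF000u   /* upper 20 bits of an entry hold an address */

/* Toy physical memory: an array of 32-bit words, so the word at
 * physical address a is phys_mem[a / 4]. */
static uint32_t read_phys(const uint32_t *phys_mem, uint32_t addr)
{
    return phys_mem[addr / 4];
}

/* Walk the two-level structure; cr3 holds the page directory address. */
static uint32_t translate(const uint32_t *phys_mem, uint32_t cr3, uint32_t linear)
{
    uint32_t dir_index   = linear >> 22;            /* upper 10 bits  */
    uint32_t table_index = (linear >> 12) & 0x3FF;  /* middle 10 bits */
    uint32_t offset      = linear & 0xFFF;          /* lower 12 bits  */

    /* Steps 1-2: CR3 locates the page directory; the selected entry
     * holds the address of a page table. Each entry is 4 bytes. */
    uint32_t dir_entry =
        read_phys(phys_mem, (cr3 & FRAME_MASK) + dir_index * 4);

    /* Step 3: the page table entry holds the start address of the page. */
    uint32_t table_entry =
        read_phys(phys_mem, (dir_entry & FRAME_MASK) + table_index * 4);

    /* Step 4: page start address + offset. */
    return (table_entry & FRAME_MASK) + offset;
}
```

To try it, fill a small word array so that one directory entry points at a page table and one page table entry points at a page, then call translate with the directory's address as cr3.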

The translation process is quite simple and is done entirely by hardware. There is one extra step, but it is worth it to save a lot of memory. Let's quickly verify:

1. Can this two-level scheme still represent 4 GB of addresses?

Page directory: 2^10 entries, that is, that many page tables.

Each page table maps 2^10 pages;

Each page addresses 2^12 bytes.

That is still 2^32 = 4 GB.

2. Does this two-level scheme really save space?

Counting the total space taken by the page directory entries and the page table entries gives (2^10 * 4 + 2^10 * 4) = 8 KB. Ah... how can that be!!!

That calculation is wrong, so I am flagging it here: it counts only a single page table. With all 2^10 page tables present, the total is 2^10 * 4 + 2^10 * 2^10 * 4 bytes, that is 4 KB + 4 MB, slightly more than the single-level array. The saving comes from elsewhere.

According to <Computer Systems: A Programmer's Perspective>, the two-level scheme saves space in two ways:

A. If an entry in the level-1 page table is empty, the corresponding level-2 page table does not need to exist at all. This is a huge potential saving, because for a typical program most of the 4 GB virtual address space is unallocated;

B. Only the level-1 page table must always be in main memory. The virtual memory system can create level-2 page tables on demand and page them in and out, which reduces the pressure on main memory; only the most frequently used level-2 page tables need to be kept in main memory. -- However, Linux does not fully enjoy this benefit: its page directory and the page tables for allocated pages are all memory-resident.
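Point A can be made concrete with a little arithmetic. The sketch below compares the single-level array with the two-level scheme, assuming 4 KB pages and 4-byte entries as in the text; the function names are hypothetical.

```c
#include <stdint.h>

#define PAGE_SIZE          4096u
#define ENTRY_SIZE         4u
#define ENTRIES_PER_TABLE  1024u

/* Single-level scheme: one entry per page of the 4 GB linear space. */
static uint64_t flat_table_bytes(void)
{
    uint64_t pages = (1ULL << 32) / PAGE_SIZE;   /* 2^20 pages */
    return pages * ENTRY_SIZE;
}

/* Two-level scheme: the page directory is always present; a level-2
 * page table exists only for each directory entry that is in use. */
static uint64_t two_level_bytes(unsigned tables_in_use)
{
    uint64_t table_size = (uint64_t)ENTRIES_PER_TABLE * ENTRY_SIZE;  /* 4 KB */
    return table_size + (uint64_t)tables_in_use * table_size;
}
```

With all 1024 page tables in use the two-level scheme costs 4 MB + 4 KB, slightly more than the flat array's 4 MB. A process that touches only three 4 MB regions needs three page tables plus the directory, 16 KB in total.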

It is worth mentioning that although entries in the page directory and the page tables are both 4 bytes (32 bits), only the upper 20 bits are used as an address; the lower 12 bits are masked to 0. For page table entries this is easy to understand: the address then lines up with the page size, everything is a whole multiple of a page, and the arithmetic is much easier. But why do page directory entries also mask off the lower 12 bits? By the same reasoning it seems that masking the lower 10 bits would be enough, but I think that, since 12 > 10, masking 12 lets the page directory and the page tables share one data structure, which is convenient. (Note also that a page table is 2^10 entries of 4 bytes, exactly 4 KB, so a 4 KB-aligned page table address has 12 zero low bits as well.)

This post only covers the general principle of translation. To keep it from getting too long, extended paging, page protection, PAE paging and so on are not covered; please refer to other specialized books.
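A minimal sketch of that masking is shown below. The macro and function names are hypothetical; on x86 the lower 12 bits of an entry are not wasted but carry status flags such as the present bit, which is why they have to be masked off before the value is used as an address.

```c
#include <stdint.h>

#define ENTRY_ADDR_MASK  0xFFFFF000u  /* upper 20 bits: 4 KB-aligned address */
#define ENTRY_LOW_MASK   0x00000FFFu  /* lower 12 bits */

/* Address part of a page directory or page table entry. */
static inline uint32_t entry_address(uint32_t entry)
{
    return entry & ENTRY_ADDR_MASK;
}

/* Remaining low bits of the entry. */
static inline uint32_t entry_low_bits(uint32_t entry)
{
    return entry & ENTRY_LOW_MASK;
}

/* A 4 KB-aligned address always has its lower 12 bits clear. */
static inline int is_page_aligned(uint32_t addr)
{
    return (addr & ENTRY_LOW_MASK) == 0;
}
```

The same two masks work for both directory entries and page table entries, which is the convenience the paragraph above points out.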

5. Page-based memory management in Linux

In principle, Linux only needs to allocate the required data structures for each process and keep them in memory, then switch the CR3 register when the process is scheduled; the rest is left to the hardware (haha, in reality it is much more complicated, but I am only analyzing the most basic flow).

Above I described the two-level paging architecture of the i386. However, some CPUs have three-level or even four-level architectures. To present a unified interface over every CPU, Linux provides a higher-level abstraction: a four-level paging architecture that accommodates CPUs with two-, three-, and four-level hardware. The four levels are:

Page global directory, PGD (corresponds to the page directory above)

Page upper directory, PUD (new)

Page middle directory, PMD (new)

Page table, PT (corresponds to the page table above).

The translation follows the same principle as the hardware translation above; it just needs two more levels of array indexing.

So how is 32-bit hardware, with its two-level architecture, reconciled with this four-level translation? Well, let's see how linear addresses are divided in this case!

From the hardware's point of view, a 32-bit address is divided into three parts. That is, no matter what the software does, in the end the hardware only recognizes these three.

From the software's point of view, two more parts are introduced, making five. -- Making this fit two-level hardware is easy: when dividing the address, set the lengths of the page upper directory and page middle directory fields to 0.

This way the operating system sees five parts while the hardware still divides the address into its rigid three parts, and nothing goes wrong; we have built a harmonious computer system.

Likewise, with 64-bit addresses and a CPU with four-level translation, we simply do not set the two middle fields to 0, and software and hardware are in harmony again. Abstraction is powerful!!!

For example, suppose a logical address has been converted to the linear address 0x08147258, which in binary is:

0000100000 0101000111 001001011000

The kernel divides this address as follows:

PGD = 0000100000

PUD = 0

PMD = 0

PT = 0101000111

Offset = 001001011000
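The five-way split with two zero-width fields can be sketched as follows. The macro, struct and function names are hypothetical and only loosely modelled on kernel conventions; the point is that a 0-bit field always produces index 0 and does not shift the fields above it.

```c
#include <stdint.h>

/* Field widths for a two-level 32-bit layout seen through a four-level
 * interface. The names are hypothetical, not the kernel's own macros. */
#define PGD_BITS     10u
#define PUD_BITS      0u   /* folded: the field is 0 bits wide */
#define PMD_BITS      0u   /* folded: the field is 0 bits wide */
#define PT_BITS      10u
#define OFFSET_BITS  12u

struct split_addr {
    uint32_t pgd, pud, pmd, pt, offset;
};

/* Extract a field of `bits` width starting at bit `shift`;
 * a 0-bit field always yields index 0. */
static uint32_t field(uint32_t addr, unsigned shift, unsigned bits)
{
    return (addr >> shift) & ((1u << bits) - 1u);
}

static struct split_addr split_linear(uint32_t addr)
{
    struct split_addr s;
    unsigned shift = 0;

    s.offset = field(addr, shift, OFFSET_BITS); shift += OFFSET_BITS;
    s.pt     = field(addr, shift, PT_BITS);     shift += PT_BITS;
    s.pmd    = field(addr, shift, PMD_BITS);    shift += PMD_BITS;
    s.pud    = field(addr, shift, PUD_BITS);    shift += PUD_BITS;
    s.pgd    = field(addr, shift, PGD_BITS);
    return s;
}
```

For 0x08147258 this gives pgd = 32, pud = 0, pmd = 0, pt = 0x147 and offset = 0x258, the same values as in the list above.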

Now we can understand the trick Linux plays on the hardware. Because the hardware cannot see the so-called PUD and PMD, in essence a PGD entry has to lead directly to the address of a PT, without really indexing arrays in a PUD and a PMD (although both have a field in the linear address, its length is 0, and 2^0 = 1, so each is an array with a single element). So how should the kernel arrange the addresses?

From the software's point of view, each of these tables has only one entry, and that entry is 32 bits wide, so it can hold an address pointer of the same length as a PGD entry. The mapping from PGD to PUD, and then from PUD to PMD, therefore just passes the original value through unchanged. In this way we get "logically pointing to a PUD and then to a PMD, but physically pointing straight at the corresponding PT", because the hardware has no idea what a PUD or a PMD is.
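The pass-through idea can be sketched as below. This is a toy illustration with hypothetical names, not the kernel's actual code: every level's entry is modelled as a plain 32-bit value, and the two folded lookups simply hand back the pointer they were given.

```c
#include <stdint.h>

/* Toy entry types: in this sketch every level's entry is a 32-bit value.
 * All names are hypothetical. */
typedef uint32_t pgd_entry;
typedef uint32_t pud_entry;
typedef uint32_t pmd_entry;

/* Real lookup: index the page global directory with the upper 10 bits. */
static pgd_entry *pgd_lookup(pgd_entry *pgd_base, uint32_t addr)
{
    return &pgd_base[addr >> 22];
}

/* Folded level: the one-entry "PUD" is the PGD entry itself. */
static pud_entry *pud_lookup(pgd_entry *pgd, uint32_t addr)
{
    (void)addr;   /* the PUD index field is 0 bits wide */
    return pgd;
}

/* Folded level: the one-entry "PMD" is the PUD entry itself. */
static pmd_entry *pmd_lookup(pud_entry *pud, uint32_t addr)
{
    (void)addr;   /* the PMD index field is 0 bits wide */
    return pud;
}

/* Page table address as seen at the end of the software walk. */
static uint32_t page_table_address(pgd_entry *pgd_base, uint32_t addr)
{
    pmd_entry *pmd = pmd_lookup(pud_lookup(pgd_lookup(pgd_base, addr), addr), addr);
    return *pmd & 0xFFFFF000u;
}
```

The software goes through all the levels, yet the value it ends up reading is the very PGD entry the hardware would read, so both agree on where the page table is.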

Then it is handed to the hardware, which divides the address and sees:

Page directory = 0000100000

PT = 0101000111

Offset = 001001011000

So the hardware first indexes the page directory array with 0000100000 (32), reads the entry there, and takes its upper 20 bits to find the address of the page table (page tables are allocated dynamically by the kernel). It then indexes the page table with 0101000111 to find the start address of the page, and finally adds the offset to get the final physical address.
