Linux memory management: the difference between virtual, logical, linear, and physical addresses (Part 1) [Repost]

Source: Internet
Author: User

This article was reproduced from: http://blog.csdn.net/yusiguyuan/article/details/9664887

Any analysis of the Linux memory management mechanism depends on the concepts named above. Before introducing them, here are a few passages from the book "Understanding the Linux Kernel" that explain the terms:

I. First explanation: "Understanding the Linux Kernel"

Logical address

Included in machine language instructions to specify the address of an operand or of an instruction (a bit esoteric). This type of address embodies the well-known segmented architecture of the 80x86, which forces Windows programmers to divide their programs into segments. Each logical address consists of a segment and an offset; the offset denotes the distance from the start of the segment to the actual address.

Linear address (also known as virtual address)

A single 32-bit unsigned integer that can address up to 4 GB. Linear addresses are usually written in hexadecimal and range from 0x00000000 to 0xFFFFFFFF.

Physical address

Used to address memory cells at the memory-chip level. Physical addresses correspond to the electrical signals sent from the address pins of the microprocessor to the memory bus. A physical address is represented as a 32-bit or 36-bit unsigned integer. (This one is the easiest to understand: it is the real address.)

PS: As the explanations below show, logical addresses are sometimes treated as virtual addresses, but in "Understanding the Linux Kernel" it is the linear address that is called the virtual address.

To begin with: memory addressing on Linux can be divided into stages, first the segmentation mechanism and then the paging mechanism.

Paging is performed after segmentation and completes the linear-to-physical address conversion: the segmentation mechanism translates the logical address into a linear address, and the paging mechanism then translates the linear address into a physical address.
What follows is material I found on the Internet, with my own understanding added.
II. Second explanation

Logical address

The offset part, relative to a segment, of an address generated by a program. For example, in C pointer programming you can read the value of a pointer variable itself (the & operation); that value is a logical address. It is relative to the base of the current process's data segment and has no direct relation to the absolute physical address. Only in Intel real mode does a logical address correspond directly to a physical address (real mode has no protected-mode segmentation or paging mechanism, so the CPU performs no table-based address translation). In Intel protected mode, the logical address is the offset within the code segment (assuming the code segment and the data segment are identical). Application programmers deal only with logical addresses; the segmentation and paging mechanisms are completely transparent to them and concern only system programmers. Although application programmers can manipulate memory directly, they can only operate on the memory segments the operating system has assigned to them. (In other words, the addresses we see in an application are logical addresses.)
If you are a programmer, the logical address should be easy to understand. When we write C code we often talk about the offset of a member from the start address of a struct, the entry offset of a function, the start address of an array, and so on. These concepts are all relative to your program, not to the operating system as a whole. That is, a logical address is relative to the specific program you compiled (or rather the process, since a program runs as a process). The entry address of the compiled program can be regarded as the start address, and a logical address can be thought of as the offset the compiler assigned relative to that start address, that is, a relative address with the start address as its origin. (PS: In this sense the logical address is an offset within a segment, which contradicts the strict definition. In Intel segmented management, a logical address is a segment identifier plus an offset within that segment, written [segment identifier: offset within segment].)

When we double-click an executable, the program supplies the operating system with its entry address. The shell passes the location of the executable file to the kernel. Once in the kernel, a new process is forked, and the new process is first given its memory areas. A well-known concept here is copy on write, which is not explained in detail here. In short, after the fork the new process has a complete PCB (process control block) structure of its own; it then calls the exec function to load the code from disk into its memory areas. At that point the process's PCB is added to the queue of runnable processes, and the program actually executes once the CPU is scheduled to run this process.

We can regard the program's entry address as the origin of its logical addresses, that is, the start address of the program. The locations of the data and code the program later uses, relative to this start address (as laid out in advance by the compiler), make up what we call logical addresses. A logical address is relative to a specific program (in fact a process, so it is a relative address of the program as it actually runs). The details are not exact, but this is enough to get the idea.
In a word, the logical address is relative to the application.

Historical background of the logical address:

Tracing back to the origin: Intel's 8-bit 8080 CPU had an 8-bit data bus (DB) and a 16-bit address bus (AB). The 16-bit address information also had to be transferred over the 8-bit data bus, held in temporary registers in the data path, and stored in CPU registers and memory. Because the AB width is an exact integer multiple of the DB width, this caused no conflict.

However, with the move to 16-bit machines, the design of the Intel 8086/8088 CPU was constrained by the IC integration, packaging, and pin technology of the time and could not exceed 40 pins. The 2^16 = 64 KB addressing capacity of the 8-bit machines was also felt to be too small, yet widening the address bus to an integer multiple of 16, that is, AB = 32 bits, was not feasible. So the AB was widened by just 4 lines, to 20.

With 2^20 = 1 MB, the addressing capacity grew 16-fold. This, however, created a mismatch between the 20-bit AB and the 16-bit DB: 20-bit address information cannot be transferred over the DB, nor stored in 16-bit CPU registers and memory cells. This is how the CPU's segmented structure came about. For compatibility, Intel kept this early style of memory management, and that is why logical addresses exist.
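
As a minimal sketch of how segmentation lets two 16-bit values name a 20-bit address: on the 8086 the segment value is shifted left by 4 bits (multiplied by 16) and the offset is added. The function name real_mode_physical is made up for this illustration.

```python
def real_mode_physical(segment: int, offset: int) -> int:
    """Form a 20-bit 8086 address from a 16-bit segment and a 16-bit offset."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # The segment is multiplied by 16 (shifted left 4 bits), the offset is
    # added, and the result is truncated to the 20 address lines of the 8086.
    return ((segment << 4) + offset) & 0xFFFFF


print(hex(real_mode_physical(0x1234, 0x0010)))  # 0x12350
```

Note that several segment:offset pairs name the same byte; 0x1234:0x0010 and 0x1235:0x0000 both yield 0x12350.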

Linear address
The intermediate layer in the translation from logical address to physical address. Program code generates a logical address, that is, an offset within a segment; adding the base address of the corresponding segment yields a linear address. If paging is enabled, the linear address is then translated into a physical address. If paging is not enabled, the linear address is the physical address. The linear address space of the Intel 80386 is 4 GB (2^32, addressed with a 32-bit address bus).
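
The segmentation step described above can be sketched roughly as follows. This is a toy model, not the real descriptor format: SegmentDescriptor, its two fields, and the example base and limit values are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentDescriptor:
    base: int   # linear address at which the segment starts
    limit: int  # largest valid offset inside the segment


def logical_to_linear(segment: SegmentDescriptor, offset: int) -> int:
    """Segmentation step: linear address = segment base + offset."""
    if not 0 <= offset <= segment.limit:
        raise ValueError("offset exceeds the segment limit")
    return (segment.base + offset) & 0xFFFFFFFF  # stay inside the 32-bit space


def linear_to_physical_no_paging(linear: int) -> int:
    """With paging disabled, the linear address is used as the physical one."""
    return linear


data_segment = SegmentDescriptor(base=0x00100000, limit=0x000FFFFF)
linear = logical_to_linear(data_segment, 0x2345)
print(hex(linear), hex(linear_to_physical_no_paging(linear)))  # 0x102345 0x102345
```

With a segment whose base is 0, the linear address is simply the offset, which is why a flat segment layout makes logical and linear addresses look identical.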

We know that every computer has a CPU (we consider a single CPU here; multiple CPUs should be similar). Ultimately every instruction and every piece of data is processed by this CPU, and the CPU's registers are storage devices that hold related information. From the CPU's point of view, then, the devices and components of a computer fall into two categories: storage devices that hold data or instructions (registers, memory, and so on), and paths that carry data or instructions (address lines, data lines, and so on). The essence of a linear address is "the address the CPU sees". Tracing it back, the linear address is a product of the development of Intel's x86 architecture. When 32-bit CPUs appeared, their addressable range reached 4 GB, a huge number compared with the memory sizes of the time and far more than was normally installed. So a gap opened between the 4 GB space the CPU can see and the actual memory capacity. The linear address describes this 4 GB space visible to the CPU. In a multi-process operating system each process has its own address space and its own resources, but at any given moment only one process runs on the CPU. At that moment the CPU sees the 4 GB space occupied by that process, and the addresses in it are linear addresses; everything the CPU does is done in this linear space. It is presumably called a linear space because a contiguous space laid out in a line is easy to picture. In effect, it is the addressable range of the CPU.
On Linux, this 4 GB is divided into two parts: 0 to 3 GB is user space, and 3 to 4 GB is kernel space. Operating system code and data, that is, the kernel, are mapped into kernel space, and user processes are mapped into user space. How the system translates linear addresses into actual physical memory comes down to segment management and page management, which the next article explains.

Physical address
The address signal that appears on the CPU's external address bus to address physical memory; it is the final result of address translation. If paging is enabled, the linear address is translated into a physical address using the entries in the page directory and the page tables. If paging is not enabled, the linear address is used directly as the physical address.
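
To make the page directory and page table lookup concrete, here is a rough sketch assuming the classic 80386 layout with 4 KB pages: the top 10 bits of the linear address index the page directory, the next 10 bits index a page table, and the low 12 bits are the offset within the page. The nested dictionaries stand in for the real tables, and the function names and table contents are invented for the example.

```python
def split_linear(linear: int):
    """Split a 32-bit linear address into (directory index, table index, offset)."""
    return (linear >> 22) & 0x3FF, (linear >> 12) & 0x3FF, linear & 0xFFF


def walk_page_tables(page_directory: dict, linear: int) -> int:
    """Look up the page directory, then the page table, then add the offset."""
    dir_index, table_index, offset = split_linear(linear)
    page_table = page_directory[dir_index]  # KeyError models "not present"
    frame_base = page_table[table_index]    # physical base of the 4 KB frame
    return frame_base + offset


# Toy tables: directory entry 32 -> a page table whose entry 73 -> frame 0x00ABC000.
page_directory = {32: {73: 0x00ABC000}}
print(split_linear(0x08049123))                           # (32, 73, 291)
print(hex(walk_page_tables(page_directory, 0x08049123)))  # 0xabc123
```

Only the page frame changes during translation; the 12-bit offset (0x123 here) is carried over unchanged.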

III. Third explanation

Virtual memory
Refers to the amount of memory a computer presents, which is much larger than its actual memory. It allows programmers to write and run programs that need far more memory than the system actually has, so that many large projects can run on systems with limited memory. A fitting analogy: you do not need a full-length track to run a train from Shanghai to Beijing; a sufficiently long stretch of rail (say 3 km) is enough. The trick is to take the rails from behind the train and immediately lay them in front of it. As long as you work fast enough, the train runs as if on a complete track. This is the job virtual memory management has to do. In the Linux 0.11 kernel, each program (process) is given a virtual memory space with a total capacity of 64 MB, so a program's logical addresses range from 0x0000000 to 0x4000000.


Sometimes logical addresses are also called virtual addresses because, like the virtual memory space, a logical address is independent of the actual physical memory capacity. (This differs slightly from the explanation above; the explanation continues as follows.)
For kernel addresses on 32-bit Linux, the "gap" between a logical address and a physical address is 0xC0000000: the virtual address -> linear address -> physical address mapping differs by exactly this value, which is specified by the operating system.
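
The fixed 0xC0000000 gap can be illustrated with a small sketch. It assumes the usual 3 GB / 1 GB split on 32-bit x86 and only models the kernel's directly mapped region, similar in spirit to what the kernel's __pa() and __va() macros compute; the Python names below are made up, and high memory and other special regions are ignored.

```python
PAGE_OFFSET = 0xC0000000  # assumed start of kernel space (3 GB)


def kernel_virt_to_phys(virtual: int) -> int:
    """Subtract the fixed gap to get the physical address."""
    if not PAGE_OFFSET <= virtual <= 0xFFFFFFFF:
        raise ValueError("not a kernel direct-mapped address")
    return virtual - PAGE_OFFSET


def phys_to_kernel_virt(physical: int) -> int:
    """Add the fixed gap to get the kernel virtual address."""
    if not 0 <= physical <= 0xFFFFFFFF - PAGE_OFFSET:
        raise ValueError("physical address outside the direct-mapped range")
    return physical + PAGE_OFFSET


print(hex(kernel_virt_to_phys(0xC0100000)))  # 0x100000
print(hex(phys_to_kernel_virt(0x00100000)))  # 0xc0100000
```

User-space addresses (below 0xC0000000) have no such fixed relationship to physical memory; they go through per-process page tables.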

How a virtual address is converted to a physical address depends on the architecture. In general there are two schemes, segmentation and paging. Current x86 CPUs, for example, support both segmentation and paging. The memory management unit (MMU) is responsible for translating logical addresses into physical addresses. A logical address has the form segment identifier + offset within the segment, and the MMU translates it into a linear address by looking up the segment table. If paging is not enabled, the linear address is the physical address; if paging is enabled, the MMU must also look up the page tables to translate the linear address into a physical address:

Logical address --(segment table)--> Linear address --(page table)--> Physical address
Different logical addresses can map to the same linear address, and different linear addresses can map to the same physical address, so the relationship is many-to-one. In addition, after a page has been swapped out and back in, the same linear address may be mapped to a different physical address, so this many-to-one mapping also changes over time.
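
Here is a small sketch of the many-to-one relationship and of how it can change over time. The single-level dictionary page_map, mapping a page number to a frame number, is a deliberate simplification invented for this example, as are the page and frame numbers.

```python
PAGE_SIZE = 4096


def translate(page_map: dict, linear: int) -> int:
    """Map a linear address to a physical one via {page number: frame number}."""
    page, offset = divmod(linear, PAGE_SIZE)
    return page_map[page] * PAGE_SIZE + offset


# Two different linear pages share physical frame 7 (many-to-one).
page_map = {0x10: 7, 0x20: 7}
a = translate(page_map, 0x10004)
b = translate(page_map, 0x20004)
print(hex(a), hex(b))  # 0x7004 0x7004

# After a page is swapped out and brought back in, it may land in another frame.
page_map[0x10] = 9
print(hex(translate(page_map, 0x10004)))  # 0x9004
```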

IV. Fourth explanation
    1. Virtual address and logical address of the program (process)

The logical address is the offset within a segment generated by the program. Applications deal only with logical addresses; segmentation and paging are transparent to them. In other words, the & operator in C, symbol addresses in assembly language, and the "m" operand in C inline assembly all denote logical addresses.

The logical address is a form of memory management that Intel has preserved from earlier times for the sake of compatibility. A logical address is the address that a machine language instruction uses to specify an operand or an instruction. In the linker example, where a call to function A becomes a call to 0x08111111, we say that 0x08111111, the address assigned by the linker, is a logical address. Strictly speaking, though, this goes against Intel's requirements for logical addresses under segmented management: "a logical address is a segment identifier plus an offset within that segment, written [segment identifier: offset within segment]". That is, 0x08111111 in the example should be written [A's code segment identifier: 0x08111111] to be complete.
Linear address, or virtual address: like the logical address, it is not a real address. If the logical address is the address before translation by the hardware platform's segmentation unit, then the linear address is the address before translation by the hardware paging unit.

    2. Actual physical memory address

The physical address is the addressing signal on the CPU's external address bus and the final result of address translation; a physical address always corresponds to a storage cell in real memory. In 80386 protected mode, if paging is enabled, the linear address is translated into the physical address through the page tables. If paging is not enabled, the linear address corresponds directly to the physical address. Page directory entries and page table entries hold physical addresses.

The physical address is used for addressing at the memory-chip level and corresponds to the address bus that connects the processor to memory. Of all these concepts it should be the easiest to understand, but one point is worth making. It is tempting to equate physical addresses with the memory sticks plugged into the machine: picture memory as one large array numbered byte by byte from 0 up to the highest byte, and call the array index the physical address. In fact this is only an abstraction that the hardware presents to software; memory is not actually addressed this way. So saying that the physical address "corresponds to the address bus" is more accurate. Still, setting aside how physical memory is really addressed, treating physical addresses as matching physical memory one-to-one is acceptable. Perhaps a slightly wrong understanding is more helpful as an abstraction.

In Linux 0.11, the base address of both the kernel data segment and the kernel code segment is 0, so for the kernel, logical addresses are linear addresses. And because one page directory and four page tables map the 16 MB of physical memory one-to-one, linear addresses are also physical addresses. Therefore, for the Linux 0.11 kernel, logical, linear, and physical addresses coincide.
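
The arithmetic behind the 16 MB figure, and the reason all three addresses coincide, can be checked with a short sketch. It assumes 4 KB pages and 1024 entries per page table (the standard 80386 values); the flattened frames list and the function kernel_translate are simplifications made up for this example.

```python
PAGE_SIZE = 4096          # bytes per page
ENTRIES_PER_TABLE = 1024  # entries in one page table
NUM_PAGE_TABLES = 4       # page tables set up for the kernel

mapped_bytes = NUM_PAGE_TABLES * ENTRIES_PER_TABLE * PAGE_SIZE
print(mapped_bytes // (1024 * 1024), "MB")  # 16 MB

# Identity mapping: entry i of the flattened page tables points at frame i.
frames = [i * PAGE_SIZE for i in range(NUM_PAGE_TABLES * ENTRIES_PER_TABLE)]


def kernel_translate(offset: int, segment_base: int = 0) -> int:
    """Logical -> linear (base 0) -> physical (identity-mapped pages)."""
    linear = segment_base + offset
    return frames[linear // PAGE_SIZE] + linear % PAGE_SIZE


print(hex(kernel_translate(0x00ABCDEF)))  # 0xabcdef
```

Because the segment base is 0 and every page maps to the frame with the same number, the address that goes in is the address that comes out.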

========================================================

A virtual address is an abstract description of the whole of memory (not of the memory sticks plugged into the machine). In contrast to physical memory, it can be understood as "unreal" or "fake" memory. For example, the memory address 0x08000000 does not correspond to element 0x08000000 - 1 of the large array of physical addresses. This is because modern operating systems provide an abstraction for memory management, namely virtual memory. A process uses addresses in virtual memory, and the operating system helps "translate" them into real physical addresses. This "translation" is the key to everything discussed here. With this abstraction, a program can use an address space much larger than real physical memory (robbing Peter to pay Paul, as banks do), and multiple processes can even use the same address. That is not surprising, because the translated physical addresses differ. You can disassemble a linked program and see that the linker has already assigned addresses: for example, to call a function A, the code is not "call A" but "call 0x08111111", which means the address of function A is already fixed. Without such a "translation" and without the concept of virtual addresses, this could not work at all. Let us stop here; going any further into this question would take us too far.
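
To see how several processes can use the same virtual address without colliding, here is a toy model in which each process carries its own page table. The Process class, the frame numbers, and the single-level table are all invented for illustration; a real kernel keeps this state in per-process page directories.

```python
PAGE_SIZE = 4096


class Process:
    def __init__(self, name: str, page_table: dict):
        self.name = name
        self.page_table = page_table  # {virtual page number: physical frame number}

    def translate(self, virtual: int) -> int:
        """Translate a virtual address using this process's own page table."""
        page, offset = divmod(virtual, PAGE_SIZE)
        return self.page_table[page] * PAGE_SIZE + offset


vpn = 0x08000000 // PAGE_SIZE          # the same virtual page in both processes
proc_a = Process("A", {vpn: 0x100})    # backed by frame 0x100
proc_b = Process("B", {vpn: 0x2A0})    # backed by frame 0x2A0

print(hex(proc_a.translate(0x08000010)))  # 0x100010
print(hex(proc_b.translate(0x08000010)))  # 0x2a0010
```

Both processes issue the same address, 0x08000010, yet each reaches its own physical memory.
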
V. Summary

To convert an address in a virtual memory space into a physical address, the CPU takes two steps. First, given a logical address (in fact an offset within a segment; this must be understood!), the CPU uses its segmentation unit to convert the logical address into a linear address. Then it uses its paging unit to convert the linear address into the final physical address.
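
The two steps can be put together in one rough sketch. Everything here is a simplified model: the segment table is keyed by a string instead of a real selector, the page map is single-level, and all names and values are assumptions made for the example.

```python
PAGE_SIZE = 4096


def translate(selector, offset, segment_table, page_map, paging=True):
    """Logical (selector:offset) -> linear -> physical."""
    # Step 1: the segmentation unit adds the segment base to the offset.
    base, limit = segment_table[selector]
    if not 0 <= offset <= limit:
        raise ValueError("offset exceeds the segment limit")
    linear = (base + offset) & 0xFFFFFFFF
    if not paging:
        return linear  # paging disabled: the linear address is the physical one
    # Step 2: the paging unit replaces the page number with a frame number.
    page, page_offset = divmod(linear, PAGE_SIZE)
    return page_map[page] * PAGE_SIZE + page_offset


segment_table = {
    "code": (0x00000000, 0xFFFFFFFF),  # (base, limit)
    "data": (0x10000000, 0x0000FFFF),
}
page_map = {0x10001: 0x00055}  # linear page 0x10001 -> physical frame 0x55

print(hex(translate("data", 0x1234, segment_table, page_map)))         # 0x55234
print(hex(translate("data", 0x1234, segment_table, page_map, False)))  # 0x10001234
```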


Linear address: the space, or range, that the CPU is capable of addressing.
Physical address: the actual memory address in the machine; in other words, the range of memory installed in the machine.
Logical address: relative to the program, generally written seg:offset. (It is the address that programmers themselves see.)
Therefore, comparing the three by the size of the space each covers gives the following relationship: the linear address space is greater than or equal to the physical address space (PS: although the addressing range is the same), and the logical address space is greater than the linear address space. The logical address is translated into a linear address through the segment table. If paging is not enabled, this linear address is used directly as the physical address; if paging is enabled, the translation from linear address to physical address goes through the page tables.
So the most accurate description of the relationship is this: the logical address is mapped to the physical address by way of the linear address, and the linear address acts purely as a "bridge" between the other two.

All of the explanations amount to roughly the same thing; what remains open is which of the three the term "virtual address" belongs to.
