This analysis of Linux memory management does not dive straight into the concepts named above. Before introducing them, here are a few paragraphs excerpted from the book "Understanding the Linux Kernel" that explain these terms.
I. Explanation from "Understanding the Linux Kernel"
Logical address
An address contained in a machine language instruction to specify an operand or an instruction (a somewhat obscure definition). This kind of address is most visible in the well-known segmented architecture of the 80x86, which forced Windows programmers to divide their programs into segments. Each logical address consists of a segment and an offset, where the offset gives the distance from the start of the segment to the actual address.
Linear address (also known as virtual address)
A 32-bit unsigned integer that can address up to 4 GB. Linear addresses are usually written in hexadecimal, and their values range from 0x00000000 to 0xFFFFFFFF.
Physical address
Used to address memory cells at the memory-chip level. Physical addresses correspond to the electrical signals sent from the microprocessor's address pins to the memory bus. A physical address is represented as a 32-bit or 36-bit unsigned integer. (This one is the easiest to understand: it is the real address.)
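To make the sizes above concrete, here is a small sketch of how many bytes each address width can reach. The function name address_space_bytes is only an illustration of mine, not something from the book.

```python
def address_space_bytes(address_bits):
    """Number of distinct byte addresses for a given address width."""
    return 2 ** address_bits

GIB = 2 ** 30  # bytes in one gigabyte (binary)

# A 32-bit linear address reaches 4 GB, a 36-bit physical address 64 GB.
print(address_space_bytes(32) // GIB)      # 4
print(address_space_bytes(36) // GIB)      # 64
# The highest 32-bit address is the upper bound quoted for linear addresses.
print(hex(address_space_bytes(32) - 1))    # 0xffffffff
```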
(PS: As you will see in the explanations below, the logical address is sometimes regarded as the virtual address, but "Understanding the Linux Kernel" treats the linear address as the virtual address.)
First of all: memory addressing on Linux goes through two stages, the segmentation mechanism first and the paging mechanism second. Segmentation converts the logical address into a linear address, and paging, which runs after segmentation, converts the linear address into a physical address.
What follows is material I found on the Internet, with my own understanding added.
II. Second explanation
Logical address
A logical address is the segment-relative offset part of an address generated by a program. For example, in C pointer programming you can read the value of a pointer variable itself (the & operation); that value is a logical address. It is relative to the base of the current process's data segment and has no fixed relation to an absolute physical address. Only in Intel real mode does a logical address map directly onto a physical address: real mode has no descriptor-based segmentation and no paging, so the CPU consults no translation tables and simply combines the segment register with the offset. In Intel protected mode, the logical address is the offset, within the code segment limit, at which the program executes (assuming the code segment and the data segment are identical). Application programmers only need to deal with logical addresses; the segmentation and paging mechanisms are completely transparent to them and concern only system programmers. Although application programmers can manipulate memory directly, they can only operate on the memory segments that the operating system has assigned to them. (In other words, the addresses we see in our applications are logical addresses.)
If you are a programmer, the logical address should be easy to understand. When writing C code we often talk about the offset of a member from the start of a structure, the entry offset of a function, the starting address of an array, and so on. All of these are relative to your own program, not to the operating system as a whole. In other words, a logical address is relative to the specific program you compiled (or rather the process, since the program runs as a process). The entry address of the compiled program can be regarded as the starting address, and a logical address can usually be thought of as an offset from that starting address assigned by the compiler, that is, a relative address measured from it. PS: on this view a logical address is just an offset within a segment, but that contradicts the definition of a logical address. In Intel segmented memory management, a logical address consists of a segment identifier plus an offset within the specified segment, written as [segment identifier: offset within segment].
When we double-click an executable program, we give the operating system the entry point from which to run it. The shell then passes the location of the executable file to the kernel. Inside the kernel a new process is created with fork, and the new process is first given the corresponding memory areas. This involves a well-known technique called Copy on Write, which is not covered in detail here. In short, the forked process receives a complete PCB structure and then calls an exec function to load the code from disk into its memory area. The PCB of the process is then added to the run queue, and the process actually executes once the CPU scheduler selects it. We can regard the program's entry address as the starting point of its logical addresses, that is, the address where the program begins. The data the program uses later, and the positions of its code relative to that starting address (arranged in advance by the compiler), make up what we call logical addresses. A logical address is relative to a specific program (strictly, a process: it is a relative address of the program as it actually runs). Some details here may be slightly off, but the general idea is what matters.
In short, a logical address is relative to the application.
Historical background of the logical address:
Going back to the origins: on Intel's 8-bit 8080 CPU, the data bus (DB) was 8 bits wide and the address bus (AB) was 16 bits wide. The 16-bit address information also had to travel over the 8-bit data bus, pass through the data paths inside the CPU, and be stored in CPU registers and in memory, but because the width of AB was an exact integer multiple of the width of DB, this caused no conflict. With the move to 16-bit machines, however, the design of the Intel 8086/8088 CPU was limited to 40 pins by the IC integration, packaging, and pin technology of the time. The 2^16 = 64 KB addressing capability of the 8-bit machine was also felt to be too small, yet widening AB to 32 bits, the next integer multiple of 16, was out of reach. So AB could only be widened by 4 bits for the time being, to 20 bits.
With 2^20 = 1 MB, the addressing capability grew 16-fold. This, however, created a mismatch between the 20-bit AB and the 16-bit DB: 20-bit address information could neither be transmitted over DB nor stored in the 16-bit CPU registers and memory cells. The segmented structure of the CPU arose to resolve this. For the sake of compatibility, Intel later kept this old segmented memory management, and with it the logical address.
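The way the 8086 squeezes a 20-bit address out of two 16-bit values can be sketched as follows. This is my own illustration of the well-known real-mode rule (segment shifted left by 4 bits, plus offset); the function name real_mode_address is hypothetical.

```python
def real_mode_address(segment, offset):
    """20-bit address formed from a 16-bit segment and a 16-bit offset."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # Shift the segment left by 4 bits (multiply by 16), add the offset,
    # and keep only the 20 bits that the address bus can carry.
    return ((segment << 4) + offset) & 0xFFFFF

# Two different segment:offset pairs can name the same memory cell.
print(hex(real_mode_address(0x1234, 0x0010)))  # 0x12350
print(hex(real_mode_address(0x1235, 0x0000)))  # 0x12350
```

Both registers stay 16 bits wide, yet together they cover the whole 1 MB range, which is exactly the mismatch the segment structure was meant to bridge.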
Linear address
The linear address is the intermediate layer in the translation from logical address to physical address. Program code produces a logical address, that is, an offset within a segment; adding the base address of the corresponding segment yields a linear address. If paging is enabled, the linear address is translated further to produce a physical address. If paging is not enabled, the linear address is the physical address. The Intel 80386 has a linear address space of 4 GB (a 32-bit address bus addresses 2^32 bytes).
Every computer has a CPU (consider the single-CPU case; multiple CPUs should behave the same way). Every instruction and every operation on data is ultimately carried out by the CPU, and the CPU's registers are storage that temporarily holds related information. From the CPU's point of view, then, the components of a computer fall into two categories: devices that store data or instructions (such as registers and memory) and paths that carry data or instructions (such as address lines and data lines). In essence, a linear address is "the address that the CPU sees".
Tracing its origin, we find that the linear address came with the development of Intel's x86 architecture. When 32-bit CPUs appeared, their addressable range reached 4 GB, which was very large compared with memory sizes, and machines generally did not have that much memory. A gap therefore opened between the 4 GB space visible to the CPU and the actual memory capacity. The linear address describes this 4 GB space visible to the CPU. In a multi-process operating system, each process has its own address space and its own resources, but at any given moment only one process is running on the CPU. At that moment the CPU sees the 4 GB space of that process, which is the linear address space, and the CPU's operations are carried out in this linear space. It is presumably called linear because a single contiguous line of space is easy to picture. It is in fact the addressable range of the CPU.
On Linux, the 4 GB seen by the CPU is divided into two parts: 0 to 3 GB is user space (also called the space outside the kernel), and 3 to 4 GB is kernel space (also called the space inside the kernel). Operating-system code, that is, the kernel's code and data, is mapped into kernel space, and user processes are mapped into user space. How the system translates linear addresses into actual physical memory is left to the next article; it comes down to nothing more than segmented management and paged management.
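A minimal sketch of the 3 GB / 1 GB split described above, assuming a 32-bit linear address. The constant KERNEL_BASE and the function names are illustrative names of mine, not kernel identifiers.

```python
KERNEL_BASE = 3 * 2 ** 30   # the 3 GB boundary, 0xC0000000 (illustrative name)
ADDRESS_LIMIT = 2 ** 32     # one past the highest 32-bit linear address

def is_user_address(linear):
    """True if the linear address lies in the 0 to 3 GB user range."""
    return 0 <= linear < KERNEL_BASE

def is_kernel_address(linear):
    """True if the linear address lies in the 3 to 4 GB kernel range."""
    return KERNEL_BASE <= linear < ADDRESS_LIMIT

print(hex(KERNEL_BASE))               # 0xc0000000
print(is_user_address(0x08048000))    # True
print(is_kernel_address(0xC0100000))  # True
```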
Physical address
The physical address is the address signal that appears on the CPU's external address bus to address physical memory; it is the final result of address translation. If paging is enabled, linear addresses are translated into physical addresses using the entries in the page directory and the page tables. If paging is not enabled, the linear address is the physical address.
III. Third explanation
Virtual memory
Virtual memory means that the computer presents an amount of memory much larger than the memory it actually has. It therefore lets programmers write and run programs much larger than the memory the system really has, which makes many large projects possible on systems with limited memory. A fitting analogy: you do not need track all the way from Shanghai to Beijing to run a train between them. A sufficiently long stretch of track (say 3 km) is enough. The trick is to move the rails from behind the train to the front; as long as you do this fast enough, the train runs as if on a complete track. This is the job of virtual memory management. In the Linux 0.11 kernel, each program (process) is given a virtual memory space with a total capacity of 64 MB, so the logical address range of a program is 0x0000000 to 0x4000000.
Sometimes we also call the logical address a virtual address, because, like the concept of virtual memory space, a logical address is unrelated to the actual physical memory capacity. (This differs slightly from the explanation above; the explanation continues below.)
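The train-track analogy can be turned into a toy simulation: a program touches many pages, but only a few physical frames exist, so the oldest frame is lifted from behind and reused in front. This is only a sketch of the idea with a simple first-in, first-out policy that I chose for illustration; it is not how any particular kernel picks pages, and all names are hypothetical.

```python
from collections import OrderedDict

def run_with_limited_frames(page_refs, frame_count):
    """Touch pages in order using only frame_count frames; return the fault count."""
    resident = OrderedDict()               # page -> frame, oldest entry first
    free_frames = list(range(frame_count))
    faults = 0
    for page in page_refs:
        if page in resident:
            continue                       # track is already laid here
        faults += 1
        if free_frames:
            frame = free_frames.pop(0)
        else:
            # Lift the oldest piece of track from behind the train and reuse it.
            _, frame = resident.popitem(last=False)
        resident[page] = frame
    return faults

# A 1000-page journey completes on only 3 frames of "track".
print(run_with_limited_frames(range(1000), 3))  # 1000
```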
For the kernel's own addresses on Linux, the "gap" between the logical address and the physical address is 0xC0000000, because the virtual address -> linear address -> physical address mapping differs by exactly this value. The value is chosen by the operating system. How a virtual address is translated into a physical address depends on the architecture; in general there are two methods, segmentation and paging. Today's x86 CPUs, for example, support both. The memory management unit (MMU) is responsible for translating the logical address into a physical address. A logical address has the form segment identifier + offset within the segment, and the MMU converts it into a linear address by looking up the segment table. If the CPU has paging turned off, the linear address is the physical address; if paging is turned on, the MMU must also look up the page table to convert the linear address into a physical address:
Logical address --(segment table)--> Linear address --(page table)--> Physical address
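The chain above can be sketched in a few lines. This is a deliberately simplified model, assuming a segment table that maps a selector to a (base, limit) pair and a single-level page table that maps a page number to a frame number. Real x86 hardware uses descriptor tables and multi-level page tables, and every name below is hypothetical.

```python
PAGE_SIZE = 4096  # bytes per page

def logical_to_linear(segment_table, selector, offset):
    """Segment stage: add the segment base to the offset."""
    base, limit = segment_table[selector]
    if offset > limit:
        raise ValueError("offset exceeds the segment limit")
    return (base + offset) & 0xFFFFFFFF

def linear_to_physical(page_table, linear, paging_enabled=True):
    """Paging stage: swap the page number for a frame number."""
    if not paging_enabled:
        return linear  # without paging, the linear address is the physical address
    page, page_offset = divmod(linear, PAGE_SIZE)
    frame = page_table[page]  # a missing entry (KeyError) stands in for a page fault
    return frame * PAGE_SIZE + page_offset

segment_table = {"data": (0x10000000, 0xFFFF)}
page_table = {0x10001: 0x00042}

linear = logical_to_linear(segment_table, "data", 0x1234)
print(hex(linear))                                  # 0x10001234
print(hex(linear_to_physical(page_table, linear)))  # 0x42234
```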
Different logical addresses can map to the same linear address, and different linear addresses can map to the same physical address, so the relationship is many-to-many. In addition, after a page has been swapped out and back in, the same linear address may be mapped to a different physical address, so this many-to-many mapping also changes over time.
IV. Fourth explanation
Virtual address and logical address of a program (process)
A logical address is the offset within a segment generated by the program. Applications deal only with logical addresses; segmentation and paging are transparent to them. C pointers, symbol addresses in assembly language, and the "m" operand of inline assembly in C all correspond to logical addresses.
The logical address exists because Intel, for the sake of compatibility, kept the old segmented memory management. A logical address is an address used in a machine language instruction to specify an operand or an instruction. In the function-call example, where the linker assigns the address 0x08111111 to function A, that address is a logical address. Unfortunately, saying so seems to violate the requirement of Intel's segmented management that "a logical address consists of a segment identifier plus an offset within the specified segment, written as [segment identifier: offset within segment]". In other words, the 0x08111111 in that example should be written as [code segment identifier of A: 0x08111111] to be complete.
A linear address, also called a virtual address, resembles a logical address in that it is not a real address either. If the logical address is the address before the hardware platform's segment translation, then the linear address is the address before the hardware's page translation.
Actual physical memory address
The physical address is the addressing signal on the CPU's external address bus. It is the final result of address translation, and a physical address always corresponds to a storage cell in actual memory. In 80386 protected mode, if paging is turned on, the linear address goes through page translation to produce the physical address. If paging is not turned on, the linear address is the physical address. The addresses held in page directory entries and page table entries are physical addresses.
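For the page translation just mentioned, here is a sketch of how a 32-bit linear address is split under classic two-level x86 paging with 4 KB pages: 10 bits of page directory index, 10 bits of page table index, and 12 bits of offset. The nested dictionaries only stand in for the real in-memory tables, and the function names are hypothetical.

```python
def split_linear_address(linear):
    """Split a 32-bit linear address into (directory index, table index, offset)."""
    directory_index = (linear >> 22) & 0x3FF  # top 10 bits
    table_index = (linear >> 12) & 0x3FF      # middle 10 bits
    page_offset = linear & 0xFFF              # low 12 bits
    return directory_index, table_index, page_offset

def walk(page_directory, linear):
    """Translate a linear address through a toy two-level table.

    page_directory maps a directory index to a page table (a dict), and each
    page table maps a table index to the physical base address of a 4 KB page.
    A missing entry raises KeyError, standing in for a page fault.
    """
    directory_index, table_index, page_offset = split_linear_address(linear)
    page_table = page_directory[directory_index]
    frame_base = page_table[table_index]
    return frame_base + page_offset

toy_directory = {32: {73: 0x00345000}}
print(split_linear_address(0x08049123))      # (32, 73, 291)
print(hex(walk(toy_directory, 0x08049123)))  # 0x345123
```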
The physical address is used to address cells at the memory-chip level and corresponds to the address bus that connects the processor to memory. It is probably the easiest of these concepts to understand, but one point is worth making. One can think of physical addresses directly as the memory installed in the machine: picture memory as one large array indexed byte by byte, from byte 0 up to the highest byte, and call the index into that array the physical address. In fact this is only an abstraction that the hardware presents to software, and memory is not really addressed this way. It is therefore more accurate to say that the physical address is the one that "corresponds to the address bus". Still, setting aside how physical memory is actually addressed, treating physical addresses as corresponding one-to-one with physical memory is acceptable. Perhaps this not-quite-right understanding is even more helpful for thinking in the abstract.
In the Linux 0.11 kernel, the base address of the kernel code segment is 0, so for the kernel the logical address is the linear address. And because 1 page directory and 4 page tables map all 16 MB of physical memory, the linear address is also the physical address. So for the Linux 0.11 kernel, the logical address, the linear address, and the physical address coincide.
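The 16 MB figure can be checked with a little arithmetic, assuming 4 KB pages and 1024 entries per page table as on the 80386. The sketch also models the kernel case with a segment base of 0 and an identity mapping (page n backed by frame n), so all three addresses come out equal. The names are illustrative.

```python
PAGE_SIZE = 4096          # bytes per page
ENTRIES_PER_TABLE = 1024  # entries in one page table

def bytes_mapped(page_table_count):
    """Memory covered by the given number of fully populated page tables."""
    return page_table_count * ENTRIES_PER_TABLE * PAGE_SIZE

def kernel_translate(offset, segment_base=0):
    """Return (linear, physical) for a kernel logical offset under identity mapping."""
    linear = segment_base + offset
    if linear >= bytes_mapped(4):
        raise ValueError("address lies outside the identity-mapped 16 MB")
    page, page_offset = divmod(linear, PAGE_SIZE)
    frame = page  # identity mapping: page n is backed by frame n
    return linear, frame * PAGE_SIZE + page_offset

print(bytes_mapped(4) // 2 ** 20)  # 16 (MB)
print(kernel_translate(0x5678))    # (22136, 22136), that is 0x5678 twice
```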
========================================================
A virtual address is an abstract description of memory as a whole (not of the memory installed in the machine). It stands in contrast to physical memory and can be understood as "unreal" or "fake" memory. For example, the memory address 0x08000000 does not refer to element 0x08000000 - 1 of the large array of physical addresses. This is because modern operating systems provide an abstraction of memory management: virtual memory. A process uses addresses in virtual memory, and the operating system, assisted by the relevant hardware, translates them into real physical addresses. This "translation" is the key to everything discussed here. With this abstraction, a program can use an address space much larger than the real physical address space (robbing Peter to pay Paul; banks do the same), and multiple processes can even use the same address. That is not surprising, because the translated physical addresses are different. If you disassemble a linked program, you will find that the linker has already assigned addresses. To call a function A, for example, the code is not call A but call 0x08111111; that is, the address of function A is already fixed. Without this "translation", without the concept of a virtual address, that would not be feasible. Let us stop here; followed any further, the question would never end.
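The point that several processes can use the same address is easy to show with one page table per process. This is a toy model, with dictionaries from page number to frame number and made-up frame values, not how a kernel stores its tables.

```python
PAGE_SIZE = 4096  # bytes per page

def translate(page_table, virtual_address):
    """Translate a virtual address through one process's page table."""
    page, page_offset = divmod(virtual_address, PAGE_SIZE)
    return page_table[page] * PAGE_SIZE + page_offset

# Both processes use virtual page 0x08048, but each has its own frame.
process_a = {0x08048: 0x00100}
process_b = {0x08048: 0x00200}

print(hex(translate(process_a, 0x08048010)))  # 0x100010
print(hex(translate(process_b, 0x08048010)))  # 0x200010
```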
V. Summary
The CPU converts an address in a virtual memory space into a physical address in two steps. First, given a logical address (in fact the offset within a segment; this must be understood ...), the CPU uses its segmentation unit to convert the logical address into a linear address, and then uses its paging unit to convert that into the final physical address.
Linear address: the space or range that the CPU can address.
Physical address: the actual memory address in the machine; in other words, the range covered by the memory installed in the machine.
Logical address: an address from the program's point of view, generally written as seg:offset. (The address the programmer sees.)
Therefore, if we really compare the three as address spaces, the following should hold: the linear address space is greater than or equal to the physical address space (PS: although the address width may be the same), and the logical address space is greater than the linear address space. A logical address is translated into a linear address through the segment table. If paging is not turned on, the resulting linear address is used directly as the address that the CPU puts on the bus. If paging is turned on, the translation from linear address to physical address is done through the page tables.
Therefore, the most accurate way to state the relationship is this: the logical address is mapped to a physical address by way of the linear address, and the linear address acts purely as the "bridge" among the three.
Whichever explanation you follow, they come to much the same thing; the only remaining question is which of the three the term virtual address is attached to.