Physical address and linear address

Last Update:2018-12-03 Source: Internet

Author: User

1. Linear space & physical space

To hardware engineers and ordinary users, memory is the memory module plugged into or soldered onto the motherboard, with a fixed capacity such as 64 MB. Application programmers, however, care less about how much memory is installed than about how much memory space they can use: they can write a program that needs 1 GB of memory and run it on an OS even if the host has only 128 MB of physical memory. OS developers sit between the two. They must know the details of the physical memory, and they must also provide a mechanism that gives application programmers a separate memory space whose size is independent of the actual physical memory size.

We define the memory space provided by the physical memory modules on the motherboard as the physical memory space, and the memory space seen by application programmers as the linear space. The physical memory space varies from host to host, depending on how much memory is installed on the motherboard. The linear space provided to application programmers, however, is fixed and does not change with the physical memory, which ensures the portability of applications. The amount of physical memory does affect application performance, and in many cases an OS has a minimum physical memory requirement, but these factors concern only whether the OS runs normally; they do not change the linear space.

On a 32-bit platform the linear space is fixed at 4 GB, and this is true for every process (an application may consist of several processes; the OS manages it in units of processes). In other words, the linear space is not shared among processes but isolated per process. Each process has its own 4 GB linear space. One process's access to a memory address never conflicts with another process's access to the same address. For example, one process may read the integer 8 from linear address 1234abcdh while another process reads the integer 20 from linear address 1234abcdh; what each one sees depends on that process's own logic.

At any time, only one process is running on one CPU. So for this CPU, at this moment, the entire system has only one linear space, which is the one belonging to this process. When a process switch occurs, the linear space changes too. The conclusion is that each process has its own linear space, but the CPU knows a process's linear space only while that process is running; at other times, it is unknown to the CPU. So although each process can have a 4 GB linear space, the CPU sees only one linear space at a time, and that linear space changes with each process switch.

Although the size of the linear space is unrelated to the size of the physical memory, applications that use the linear space ultimately run in physical memory. Any linear address issued by an application must be converted to a physical address before physical memory can actually be accessed. The linear memory space must therefore be mapped to the physical memory space. This mapping must be established using data structures specified by the hardware architecture; for now, let us call such a structure a mapping table. Its content is the mapping between the linear memory space and the physical memory space. Once the OS kernel tells the CPU where the mapping table is located, then whenever the CPU needs to access a linear address, it converts that address into a physical address according to the mapping table and puts it on the address lines. After all, the address lines understand only physical addresses.

It follows that if we supply different mapping tables, the CPU will convert the same linear address into different physical addresses. We therefore create one mapping table per process, mapping that process's linear space to the physical space as needed. Since only one process can be running on a given CPU at a time, on a task switch we replace the mapping table with the one belonging to the incoming process, so that each process has its own linear space without affecting the others. At any time, then, a CPU needs only one mapping table to convert the linear space of the current process to the physical space.
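The idea of one mapping table per process can be illustrated with a toy model. This is only a sketch: the `ToyCPU` class, the process names, and the table contents are hypothetical, and the table is assumed to map fixed 4 KB blocks purely to keep the arithmetic simple.

```python
BLOCK_SIZE = 4096  # assumed mapping granularity for this toy model

# Hypothetical per-process mapping tables: linear block number -> physical block number.
mapping_tables = {
    "process_a": {0x1234A: 0x00010},
    "process_b": {0x1234A: 0x00777},
}

class ToyCPU:
    """Holds a reference to exactly one mapping table at a time."""

    def __init__(self):
        self.current_table = None

    def switch_to(self, process_name):
        # A task switch replaces the active mapping table.
        self.current_table = mapping_tables[process_name]

    def translate(self, linear_address):
        block, offset = divmod(linear_address, BLOCK_SIZE)
        return self.current_table[block] * BLOCK_SIZE + offset

cpu = ToyCPU()
cpu.switch_to("process_a")
addr_a = cpu.translate(0x1234ABCD)
cpu.switch_to("process_b")
addr_b = cpu.translate(0x1234ABCD)
print(hex(addr_a), hex(addr_b))
```

The same linear address 1234abcdh lands at two different physical addresses, which is why the two processes can read different values from it.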

--------------------------------------------------------------------------------

2. OS kernel space & process space

Because the OS kernel must be in memory at all times while processes can be switched, memory always holds two parts: the OS kernel and a user process. At any time there is only one linear space for a CPU, so this linear space must be divided into two parts, one for the OS kernel and the other for the user process. Since the OS kernel occupies its part of the linear space at all times, the linear space reserved for the kernel can be identical in the linear spaces of all processes. That is, each process's mapping table is also divided into two parts: a process-private part, and an OS kernel part whose content is identical across all processes.

In this sense, all processes share the linear space occupied by the OS kernel, while each process has its own private linear space. If we divide every 4 GB linear space into 1 GB of OS kernel space and 3 GB of process space, then the 1 GB of kernel space is shared across the 4 GB linear spaces of all processes, while the remaining 3 GB of process space is private to each process. Linux does this, while Windows NT gives the OS kernel and the process 2 GB of linear space each.
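As a rough illustration of the two splits, the following sketch classifies a 32-bit linear address as kernel or process space. The function name and constants are hypothetical, and it assumes the kernel occupies the top of the linear space in both layouts.

```python
KERNEL_BASE_3G_1G = 0xC0000000  # 3 GB process space, 1 GB kernel space
KERNEL_BASE_2G_2G = 0x80000000  # 2 GB process space, 2 GB kernel space

def region(linear_address, kernel_base=KERNEL_BASE_3G_1G):
    """Return which part of the 4 GB linear space an address falls in."""
    if not 0 <= linear_address <= 0xFFFFFFFF:
        raise ValueError("not a 32-bit linear address")
    if linear_address >= kernel_base:
        return "kernel (shared by all processes)"
    return "process (private)"

print(region(0x1234ABCD))
print(region(0xC0001000))
print(region(0x90000000, KERNEL_BASE_2G_2G))
```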

--------------------------------------------------------------------------------

3. Segment mapping & page mapping

Content in a linear space can be run or operated on only once it has been placed in physical memory. So although the OS kernel and the processes all live in linear space, they must ultimately be placed in physical memory, and the OS kernel and all processes end up sharing it. At present, physical memory is far smaller than the linear space: the linear space is 4 GB, while physical memory is usually only a few hundred megabytes or even less. Moreover, even if physical memory were 4 GB, each process can have 3 GB of linear space (if the process-private linear space is 3 GB), so keeping the linear spaces of all processes in physical memory is clearly unrealistic. The OS kernel must therefore move data or code that some processes do not currently need out of physical memory, and give the limited memory to the processes that need it most. In addition, the kernel may run at any time, so it is best to keep the OS kernel in physical memory permanently and swap only process data.

Mapping from a linear space to a physical space requires a mapping table, which maps a region of linear space to a region of physical memory of the same size. In theory there are two mapping methods: variable-length mapping and fixed-length mapping. Variable-length mapping maps variable-length segments to physical memory according to need. An entry can have the format (linear space segment start address, physical space segment start address, segment length). If a process has three segments, a 10 MB data segment, a 5 MB code segment, and an 8 KB stack segment, we create three entries in the mapping table, one per segment. This looks fine. But suppose the machine has only 32 MB of memory, of which the kernel occupies 10 MB, leaving 22 MB of physical space for processes. While the process runs, it occupies 10 MB + 5 MB + 8 KB. When a process switch occurs and the other process has the same memory requirements, the remaining 22 MB - (10 MB + 5 MB + 8 KB) is clearly not enough. The only option is to swap out some segments of the original process, and a segment must be swapped out whole. That means swapping out at least the 10 MB data segment, which is very costly: 10 MB must be copied to disk, and disk I/O is very slow.
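The three-segment example can be sketched as a small table of (linear start, physical start, length) entries. The concrete start addresses below are invented for illustration; only the segment sizes and the 32 MB / 10 MB figures come from the text.

```python
from typing import NamedTuple

KB = 1024
MB = 1024 * KB

class Segment(NamedTuple):
    linear_start: int
    physical_start: int
    length: int

# Hypothetical layout: the kernel holds the first 10 MB of physical memory,
# and the process's segments are packed right after it.
segments = [
    Segment(0x00000000, 0x00A00000, 5 * MB),   # code segment
    Segment(0x10000000, 0x00F00000, 10 * MB),  # data segment
    Segment(0xBFFFE000, 0x01900000, 8 * KB),   # stack segment
]

def translate(linear_address):
    """Find the segment containing the address and apply its offset."""
    for seg in segments:
        if seg.linear_start <= linear_address < seg.linear_start + seg.length:
            return seg.physical_start + (linear_address - seg.linear_start)
    raise LookupError("address is not inside any mapped segment")

physical_total = 32 * MB
kernel_size = 10 * MB
used_by_process = sum(seg.length for seg in segments)
free_after_process = physical_total - kernel_size - used_by_process

print(hex(translate(0x10000010)))
print(free_after_process, "bytes left; a second identical process needs", used_by_process)
```

Because the free space is smaller than what a second identical process needs, at least one whole segment would have to be written to disk before that process could run.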

So with variable-length segment mapping, a segment is either entirely swapped in or entirely swapped out. In practice, though, not all of a program's code and data are accessed frequently; the frequently accessed part is only a fraction of the whole, often a small one. A more effective strategy is therefore to swap out only what is rarely used and keep what is frequently used, rather than swapping entire segments. This avoids large, slow disk operations.

This is the fixed-length mapping strategy. We divide the memory space into fixed-length blocks, each called a page. The basic format of a mapping table entry is (starting address of the physical page). Because the page size is fixed, no length needs to be specified. Nor do linear addresses need to be stored in the mapping table: the linear address itself serves as the index for looking up the corresponding physical address. With pages, the policy is as follows. When swapping out, we evict only inactive pages, that is, pages that are rarely used, and keep the active ones. When swapping in, we bring in only the pages that are actually accessed; pages that are never accessed are never brought into physical memory. This is the core idea of demand paging.
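A minimal sketch of this policy follows. The text only says that inactive pages are evicted; choosing the least recently used page as the "inactive" one is an assumption made here, and the `DemandPager` class is hypothetical.

```python
from collections import OrderedDict

PAGE_SIZE = 4096

class DemandPager:
    """Toy demand pager: pages enter physical memory only when accessed."""

    def __init__(self, num_frames):
        self.free_frames = list(range(num_frames))
        # linear page number -> physical frame number, oldest access first
        self.resident = OrderedDict()
        self.faults = 0

    def translate(self, linear_address):
        page, offset = divmod(linear_address, PAGE_SIZE)
        if page in self.resident:
            self.resident.move_to_end(page)  # mark as recently used
        else:
            self.faults += 1
            if self.free_frames:
                frame = self.free_frames.pop(0)
            else:
                # Evict the least recently used page (assumed policy).
                _, frame = self.resident.popitem(last=False)
            self.resident[page] = frame
        return self.resident[page] * PAGE_SIZE + offset

pager = DemandPager(num_frames=2)
results = [pager.translate(a) for a in (0x0000, 0x1004, 0x0008, 0x2000)]
print([hex(r) for r in results], "faults:", pager.faults)
```

With two frames, the fourth access evicts page 1 rather than page 0, because page 0 was touched more recently.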

This raises the question of page size. First, the unit cannot be a byte: the mapping table would then need one entry per byte of linear space, making it at least as large as the linear space itself, and we cannot spend the whole linear space on storing its own mapping table. This also tells us that the smaller the page, the larger the mapping table, and the mapping table must not take up too much space. If the page is too large, on the other hand, we face the same problem as with variable-length segment mapping: every page swapped out requires a large amount of disk I/O. In addition, since the page is the smallest unit of memory allocated to a process, with a 4 MB page size a process that needs only 4 KB of memory would still occupy an entire 4 MB page, which is very wasteful. We must therefore compromise between the two. Platforms generally specify a page size between 1 KB and 8 KB; IA-32 specifies 4 KB. (IA-32 also supports 4 MB pages; the OS can choose according to its needs, and 4 KB pages are the usual choice.)
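The trade-off can be made concrete with a little arithmetic. The sketch below assumes 4 bytes per mapping table entry and a table that covers the whole 4 GB linear space; the function names are hypothetical.

```python
LINEAR_SPACE = 4 * 1024**3  # 4 GB linear space on a 32-bit platform
ENTRY_SIZE = 4              # assumed bytes per mapping table entry

def flat_table_bytes(page_size):
    """Size of a one-level mapping table that covers the whole linear space."""
    return LINEAR_SPACE // page_size * ENTRY_SIZE

def wasted_bytes(needed, page_size):
    """Memory allocated but unused when `needed` bytes are rounded up to whole pages."""
    pages = -(-needed // page_size)  # ceiling division
    return pages * page_size - needed

for page_size in (1, 1024, 4096, 8192, 4 * 1024**2):
    print(page_size, flat_table_bytes(page_size), wasted_bytes(4096, page_size))
```

Byte-sized "pages" give a table four times larger than the linear space itself, while 4 MB pages shrink the table to 4 KB but waste almost 4 MB for a process that needs only 4 KB.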

--------------------------------------------------------------------------------

4. Page table

With 4 KB pages, a 4 GB linear space requires 1,048,576 page table entries. At 4 bytes per entry, that is 4,194,304 bytes, so a page table occupies 4 MB. If a process really needs the entire linear space, this 4 MB investment in the page table is necessary.

In reality, however, few programs need that much space. Processes are generally small, from a few KB to a few MB, and a page table this large is wasteful for them. What should we do?

One strategy is a variable-length page table: we create a page table only as long as required. But this strategy comes with many restrictions and still wastes a lot of space, because the page table mechanism uses the linear address as an index into the table. If we want the OS kernel to use C0000000h-FFFFFFFFh, that is, the linear space between 3 GB and 4 GB, then the page table must extend past the 3 GB mark and occupy more than 3 MB, even if the process itself uses only 1 KB of linear space. The alternative is to place the OS kernel at 0h-3FFFFFFFh, the first 1 GB of linear space. Even then the page table must occupy at least 1 MB, because the process's private linear space starts at 40000000h, even though a 4 MB kernel by itself needs only 1024 entries, that is, 4 KB of table space. Shared libraries add to the problem. A shared library is generally placed at one location in physical memory and mapped to one location in linear space; this mapping is the same for all processes, and each process puts the mappings of the shared libraries it needs into its own page table. To leave enough room for the user process, shared libraries are usually mapped high in the linear space, for example at 2 GB, and the page table then needs at least 2 MB. In short, variable-length page tables do not really solve the problem of wasted page table space.
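The sizes quoted above follow from the fact that a table indexed by linear address must reach the entry of the highest mapped page. A small sketch, assuming 4 KB pages and 4-byte entries (the function name is hypothetical, and 40000000h is taken as the first address above a kernel region of 0h-3FFFFFFFh):

```python
PAGE_SIZE = 4096
ENTRY_SIZE = 4
MB = 1024 * 1024

def variable_table_bytes(highest_mapped_address):
    """A variable-length table must reach the entry for the highest mapped page."""
    return (highest_mapped_address // PAGE_SIZE + 1) * ENTRY_SIZE

kernel_at_3gb = variable_table_bytes(0xC0000000)      # first kernel page at 3 GB
kernel_4mb_at_0 = variable_table_bytes(4 * MB - 1)    # a 4 MB kernel alone at address 0
process_above_1gb = variable_table_bytes(0x40000000)  # first process page above 1 GB
shared_lib_at_2gb = variable_table_bytes(0x80000000)  # a shared library mapped at 2 GB

print(kernel_at_3gb, kernel_4mb_at_0, process_above_1gb, shared_lib_at_2gb)
```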

Another strategy is to use a multi-level page table. The following uses a two-level page table with 4 KB pages as an example to describe how a multi-level page table works.

In a two-level page table, the first level is called the page directory and the second level is called the page table.

With a one-level page table and 4 KB pages, since 4 KB = 2^12, the low 12 bits of the linear address are used for addressing within the page, and the high 20 bits are used as the page table index. With a two-level page table, the high 20 bits are split into two parts, for example two 10-bit fields: the high 10 bits index the page directory to locate a page table, and the next 10 bits index that page table to locate a page. The lowest 12 bits give the offset within the page. Combining these three parts, a 32-bit linear address is eventually converted into a 32-bit physical address.
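The 10/10/12 split is plain shifting and masking, as this short sketch shows (the function name is hypothetical):

```python
def split_linear_address(linear_address):
    """Split a 32-bit linear address into (directory index, table index, offset)."""
    directory_index = (linear_address >> 22) & 0x3FF  # high 10 bits
    table_index = (linear_address >> 12) & 0x3FF      # middle 10 bits
    offset = linear_address & 0xFFF                   # low 12 bits
    return directory_index, table_index, offset

print(split_linear_address(0x1234ABCD))
print(split_linear_address(0xC0000000))
```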

In the whole two-level architecture there is only one page directory. Because the page directory index is 10 bits wide, the page directory contains 2^10 = 1024 directory entries. A directory entry occupies 4 bytes, so the page directory is 1024 * 4 = 4 KB, which fits exactly in one page. Each directory entry points to a page table, so at most 1024 page tables can exist, but not all of them need to exist. The page table index is also 10 bits wide, so a page table contains 2^10 = 1024 page-table entries; a page-table entry occupies 4 bytes, so a page table is 1024 * 4 = 4 KB, which also fits exactly in one page. Taking the high 20 bits of a directory entry and setting the low 12 bits to 0 gives the starting address of the page that holds the page table it points to. Likewise, taking the high 20 bits of a page-table entry and setting the low 12 bits to 0 gives the starting address of the page it points to.

So, given a 32-bit linear address, we first take its high 10 bits as the index into the page directory and find the corresponding directory entry. From the high 20 bits of the directory entry we find the page that holds the corresponding page table. We then use the middle 10 bits of the linear address as the index into this page table and find the corresponding page-table entry, whose high 20 bits locate the page. Finally, we take the low 12 bits of the linear address as the offset and add it to the page base address. The linear address has now been converted into a 32-bit physical address.
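The walk just described can be sketched as follows. Physical memory is modeled as a dictionary from page base address to a list of 1024 entries, which is purely an illustrative device. The present flag in bit 0 follows IA-32 but is not discussed in this article, and all concrete addresses are made up.

```python
PRESENT = 0x001          # bit 0 marks a valid entry (IA-32 convention)
FRAME_MASK = 0xFFFFF000  # high 20 bits of an entry: page base address

def translate(memory, directory_base, linear_address):
    """Walk the page directory and one page table to get a physical address."""
    directory_index = (linear_address >> 22) & 0x3FF
    table_index = (linear_address >> 12) & 0x3FF
    offset = linear_address & 0xFFF

    directory_entry = memory[directory_base][directory_index]
    if not directory_entry & PRESENT:
        raise LookupError("no page table for this linear address")
    table_base = directory_entry & FRAME_MASK

    table_entry = memory[table_base][table_index]
    if not table_entry & PRESENT:
        raise LookupError("page is not present")
    return (table_entry & FRAME_MASK) | offset

# Toy physical memory: a page directory at 1000h and one page table at 2000h.
memory = {0x1000: [0] * 1024, 0x2000: [0] * 1024}
memory[0x1000][72] = 0x2000 | PRESENT       # directory entry -> page table
memory[0x2000][842] = 0x00ABC000 | PRESENT  # page-table entry -> page

physical = translate(memory, 0x1000, 0x1234ABCD)
print(hex(physical))
```

Any linear address whose directory entry is empty raises an error here, which corresponds to a region for which no page table was ever created.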

This is the linear-to-physical address mapping mechanism of a two-level page table. The page directory occupies 4 KB, which is the minimum memory requirement of a two-level page table; page tables are created only as needed. If the OS kernel is placed at 3 GB and is 4 MB in size, we need to create only one page table and set directory entry 768 (3 GB / 4 MB) in the page directory to the base address of the page holding that page table. If the process occupies 4 MB at linear addresses 0-4 MB, we again need to create only one page table and set the first directory entry to the base address of the page holding that page table. No other linear addresses are used, so no other page tables need to be created; the remaining directory entries are simply left empty. In this case the page tables occupy 4 KB * 3 = 12 KB in total. A one-level page table, even a variable-length one, would need 3 MB + 4 KB. A two-level page table therefore saves a great deal of page table space.
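The 12 KB versus 3 MB + 4 KB comparison can be checked with a short calculation. This is a sketch assuming 4 KB pages, 4-byte entries, and the two 4 MB regions from the example; the function names are hypothetical.

```python
KB = 1024
MB = 1024 * KB
PAGE_SIZE = 4 * KB
ENTRY_SIZE = 4

# (linear start, length): a 4 MB process at 0 and a 4 MB kernel at 3 GB.
regions = [(0x00000000, 4 * MB), (0xC0000000, 4 * MB)]

def two_level_bytes(regions):
    """One page directory plus one page table per 4 MB slot that is touched."""
    tables = set()
    for start, length in regions:
        first = start >> 22
        last = (start + length - 1) >> 22
        tables.update(range(first, last + 1))
    return PAGE_SIZE * (1 + len(tables))

def one_level_variable_bytes(regions):
    """A one-level table must extend to the end of the highest mapped region."""
    highest_end = max(start + length for start, length in regions)
    entries = -(-highest_end // PAGE_SIZE)  # ceiling division
    return entries * ENTRY_SIZE

print(two_level_bytes(regions), one_level_variable_bytes(regions))
```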

Given that most programs are a few hundred KB to one or two MB in size, page tables with more levels can save even more page table space. However, more levels also mean more steps in converting a linear address to a physical address, and for large programs more levels may actually take up more page table space. For these reasons, few systems use page tables with more than three levels.

This article is an English version of an article originally published in Chinese on aliyun.com and is provided for information purposes only.
