Introduction to memory management by the operating system

Introduction

Memory is one of the most important resources in a computer, and physical memory generally cannot accommodate all processes at once. Although physical memory capacities have grown to many gigabytes, programs grow even faster, so no matter how much physical memory increases, it cannot keep up with the growth of programs. Effective memory management by the operating system is therefore especially important. This article describes the past and present of operating system memory management, along with several page replacement algorithms.

 

Brief Introduction to processes

Before starting, let's briefly introduce the process from the operating system's perspective. A process is the smallest unit of resource allocation, including memory. In modern operating systems, the memory that each process can access is independent of every other process's (except for certain shared areas). The threads within a process share the memory space allocated to that process.

From the operating system's perspective, process = program + data + PCB (Process Control Block). This concept is a little abstract, so here is an analogy. Suppose you are cooking in the kitchen, following a recipe to turn ingredients into a dish. Your son runs in and tells you he has scraped his leg. You stop your work, set the recipe face down, find the first-aid book, and apply a Band-Aid to your son according to its instructions. Afterwards, you pick the recipe back up and continue cooking. In this scenario, you are like a CPU, the recipe is like a program, and the raw ingredients are like data: you process data according to program instructions. The first-aid work is like a higher-priority process. It interrupts your current cooking work; you set aside the recipe (saving the context), attend to the higher-priority task, and once it is finished, resume reading the recipe from where you left off (restoring the context) and continue cooking.

After briefly introducing the concept of a process, let's turn to memory.

 

Era without memory abstraction

In early operating systems, there was no concept of memory abstraction. Programs accessed and operated on physical memory directly. For example, when the following instruction is executed:

 
MOV reg1, 1000

This instruction simply assigns the content at physical address 1000 to the register. It is not hard to see that this kind of memory access makes it impossible for the operating system to run multiple processes: as in MS-DOS, you have to finish executing one program before you can execute the next. If multiple processes operated directly on physical memory, then when one process writes a value to memory address 1000 and another process later writes to the same address, the second process's value overwrites the first's, causing both processes to crash.

Memory management without abstraction is usually very simple: aside from the memory used by the operating system itself, everything belongs to the user program. Alternatively, another region of memory may be set aside for device drivers, as shown in Figure 1.

Figure 1. memory usage when there is no memory abstraction

In the first case, the operating system resides in RAM at the low end of the address space. In the second case, the operating system resides in ROM at the high end; older mobile phone operating systems were generally designed this way.

In this setting, if you want the operating system to execute multiple processes, the only solution is to exchange with the hard disk: when one process has run for a while, it is written out to disk and another process is executed in its place; when the first process needs to run again, it is read back from disk into memory. As long as only one process is in memory at a time, this works; the technique is called swapping. However, programs still operate on physical memory directly, so processes can still crash one another.

Therefore, such direct memory operation nowadays generally survives only in simple embedded chips, such as those in washing machines and microwave ovens, where there is never a second process competing for memory.

 

Memory abstraction

In modern operating systems, running multiple processes at the same time is the norm. To solve the various problems caused by direct memory operations, the address space was introduced, giving each process its own range of addresses. This requires two registers on the hardware: a base register and a limit register. The base register stores the starting physical address of the process, and the limit register stores the upper bound, preventing a process from reaching outside its own memory. With this memory abstraction, when the following executes:

 
MOV reg1, 20

the physical address accessed is not actually 20; instead, the physical address is calculated from the base register plus the offset. The operation actually performed might be:

 
MOV reg1, 16245

 

In this scheme, every operation on a virtual address is converted into an operation on a physical address. Since each process's physical addresses are completely separate, running multiple processes becomes possible.
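As a rough illustration of this translation, here is a minimal Python sketch. The base value 16225 and limit 4096 are invented for the example (16225 + 20 gives the 16245 seen above):

```python
# Hypothetical base/limit register values; both are illustrative only.
BASE = 16225    # physical address where the process was loaded
LIMIT = 4096    # size of the process's address space

def translate(virtual_addr):
    """Map a process-relative address to a physical address, checking the limit."""
    if virtual_addr >= LIMIT:
        raise MemoryError("address beyond limit register: protection fault")
    return BASE + virtual_addr

print(translate(20))  # -> 16245, matching the MOV example above
```

Any access at or beyond the limit raises a protection fault instead of silently touching another process's memory.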

However, another problem arises: memory generally cannot accommodate all concurrent processes, and so swapping technology reappears. This swapping is similar to the earlier kind, but now it operates under multi-process conditions. The basic idea is to swap idle processes out of memory onto the hard disk and swap them back in when they need to run. For example, in the following scenario, only process A is in memory when the system starts; processes B and C arrive gradually. Then process D is created, but there is not enough free space for it, so process B is swapped out of memory and its space is allocated to process D, as shown in Figure 2.

Figure 2. Swapping

 

As Figure 2 also shows, the space between process D and process C is too small to be used by any other process. Such unusable gaps are called external fragmentation. One remedy is memory compaction, which eliminates these external fragments by moving processes' contents together in memory. Some memory-tidying utilities use a clever variant of the same idea: they request one large block of memory, forcing all other processes to be swapped out, then release the block so that the processes are loaded back in contiguously, eliminating external fragments. This is also why the hard disk churns after running such a utility. The drawback is that compaction consumes a great deal of CPU time: a machine that copies 4 bytes every 10 ns would need several seconds to compact 2 GB of memory.

The discussion above assumes that the memory a process occupies is fixed, but in practice processes tend to grow dynamically, so how much memory to allocate at process creation is a problem. Allocate too much and internal fragmentation wastes memory; allocate too little and the process soon runs out of room. One solution is to allocate somewhat more memory than the process initially needs, reserving room for growth. This can be done in two ways: either allocate a single extra region for the process to grow into, or reserve one gap between the data segment and the stack (which stores return addresses and local variables), letting the two grow toward each other, as shown in Figure 3.

Figure 3. Reserved space for growth during Process Creation

 

When the reserved space is no longer enough, the operating system first checks whether the adjacent memory is free; if so, it is allocated to the process. If not, the entire process is moved to a region of memory large enough to accommodate its growth. If no such region exists, some idle process is swapped out to make room.

Once processes are allowed to grow dynamically, the operating system must manage memory more carefully. It typically tracks memory usage with one of two structures: 1) bitmaps, or 2) linked lists.

With a bitmap, memory is divided into blocks of equal size. For example, a 32 KB memory divided into 1 KB blocks yields 32 blocks, so 32 bits (4 bytes) suffice to record their usage: a block in use is marked 1 and a free block 0. With a linked list, memory is instead represented as a chain of segments, each marked as used or free. Both schemes are illustrated in Figure 4.

Figure 4. bitmap and linked list indicate memory usage

 

In the linked list, P marks a segment occupied by a process and H marks a hole (free memory); the numbers give each segment's extent, so P 0-2 denotes a process occupying blocks 0 through 2 and H 3-4 a hole spanning blocks 3 through 4.

A bitmap represents memory usage simply and clearly, but allocation requires searching the bitmap for a long enough run of consecutive 0 bits, which is an expensive operation. A linked list handles this better. Some operating systems use a doubly linked list, because when a process is destroyed, each of its neighbors is either a hole or another process, and a doubly linked list makes it easy to merge adjacent holes.
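To make the run-of-zeros search concrete, here is a minimal sketch; the bitmap contents are invented for the example:

```python
def find_free_run(bitmap, k):
    """Return the index of the first run of k consecutive free (0) blocks, or -1."""
    run_start, run_len = 0, 0
    for i, bit in enumerate(bitmap):
        if bit == 0:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == k:
                return run_start
        else:
            run_len = 0        # a used block breaks the run
    return -1

bitmap = [1, 1, 0, 1, 0, 0, 0, 1]   # 1 = block in use, 0 = free
print(find_free_run(bitmap, 3))     # -> 4 (blocks 4-6 are free)
print(find_free_run(bitmap, 4))     # -> -1 (no run of 4 free blocks)
```

Even in this toy form, the cost is a linear scan of the whole bitmap; with real memory sizes that scan is what makes bitmap allocation expensive.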

In addition, when using the linked list to manage memory, it is also a problem to allocate idle space when creating a process. Generally, there are several algorithms to allocate space when a process is created.

    • Next fit: searches for the first hole that satisfies the request, starting from where the previous search stopped.
    • Best fit: searches the entire linked list for the smallest hole that satisfies the request.
    • Worst fit: searches for the largest hole currently free in memory.
    • First fit: searches from the beginning of the list for the first hole that satisfies the request.
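The two simplest strategies can be sketched over a hypothetical hole list; the `(start, length)` holes below are invented for illustration:

```python
def first_fit(holes, size):
    """holes: list of (start, length). Return the start of the first hole that fits."""
    for start, length in holes:
        if length >= size:
            return start
    return None

def best_fit(holes, size):
    """Return the start of the smallest hole that is still large enough."""
    fitting = [(length, start) for start, length in holes if length >= size]
    return min(fitting)[1] if fitting else None

holes = [(0, 5), (8, 2), (14, 3)]   # free segments: start block, length in blocks
print(first_fit(holes, 3))  # -> 0  (first hole large enough)
print(best_fit(holes, 3))   # -> 14 (smallest hole that fits)
```

Note the trade-off visible even here: first fit stops early, while best fit must scan every hole before deciding.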

 

Virtual Memory

Virtual memory is a technology widely used in modern operating systems. The abstraction described above satisfies the needs of multiple processes, but in many cases even a single large process cannot fit in the available memory (many games, for example, exceed 10 GB). Early operating systems solved this with overlays, dividing a program into multiple blocks: block 0 is loaded into memory first, and when it finishes executing, block 1 is loaded in its place, and so on. The biggest problem with this approach is that the programmer has to divide the program into blocks by hand, which is a time-consuming, laborious, and painful process. The refined successor of this scheme is virtual memory.

The basic idea of virtual memory is that each process has its own independent logical address space, divided into multiple blocks of equal size called pages, each a range of contiguous addresses. To the process, logical memory appears plentiful. Some pages map to blocks of physical memory (called page frames, normally the same size as a page), while others correspond to data on the hard disk that has not yet been loaded into memory, as shown in Figure 5.

Figure 5. Mapping between virtual memory, physical memory, and disk

 

Figure 5 shows that virtual memory can actually be larger than physical memory. When virtual memory is accessed, the MMU (Memory Management Unit) translates the virtual address to the corresponding physical address (pages 0, 1, and 2 in Figure 5). If the requested page does not exist in physical memory (pages 3 and 4 in Figure 5), a page fault occurs and the missing page is fetched from disk into memory; if memory is full, some algorithm is used to choose a page to swap out to disk first.

The mapping between virtual and physical memory is realized through the page table, which the MMU consults. Each page table entry is typically 32 bits (4 bytes); in addition to the page frame address, it stores several flag bits, such as whether the page is present, whether it has been modified, and write protection. You can think of the MMU as a unit that receives virtual addresses and returns physical addresses.
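A toy model of this lookup, with an invented page table and 4 KB pages (real MMUs do this in hardware, with flag bits this sketch omits):

```python
PAGE_SIZE = 4096

# Hypothetical page table: virtual page number -> physical frame number,
# with None meaning the page is not resident (a page fault would occur).
page_table = {0: 2, 1: 5, 2: None, 3: 0}

def mmu_translate(virtual_addr):
    """Split the address into page number and offset, then map page to frame."""
    page = virtual_addr // PAGE_SIZE
    offset = virtual_addr % PAGE_SIZE
    frame = page_table.get(page)
    if frame is None:
        raise LookupError(f"page fault on virtual page {page}")
    return frame * PAGE_SIZE + offset

print(mmu_translate(4100))  # virtual page 1, offset 4 -> frame 5 -> 20484
```

The offset within a page never changes during translation; only the page number is replaced by a frame number.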

Because each page table entry is 4 bytes and a 32-bit operating system has a virtual address space of 2^32 bytes, even with 4 KB pages the page table needs 2^20 entries x 4 bytes = 4 MB. Creating a 4 MB page table for every process is unwise. Therefore the page table idea is taken one level further, producing two-level page tables. Each second-level page table covers 4 MB of virtual address space, and the first-level page table indexes these second-level tables, so a 32-bit system needs up to 1024 second-level page tables. Although the total number of page table entries is not reduced, only the first-level table and the second-level tables actually in use need to be kept in memory, which greatly reduces memory consumption.
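The way a 32-bit virtual address is carved up for such a two-level walk can be sketched as follows (the sample address is arbitrary): 10 bits index the first-level table, 10 bits index a second-level table, and 12 bits are the offset within a 4 KB page.

```python
def split_address(vaddr):
    """Split a 32-bit virtual address into two-level page-table indices + offset."""
    top    = (vaddr >> 22) & 0x3FF   # index into the first-level table (1024 entries)
    second = (vaddr >> 12) & 0x3FF   # index into a second-level table (1024 entries)
    offset = vaddr & 0xFFF           # offset within the 4 KB page
    return top, second, offset

print(split_address(0x00403004))  # -> (1, 3, 4)
```

Since 10 + 10 + 12 = 32, each field exactly accounts for the address: 1024 first-level entries times 1024 second-level entries times 4 KB pages covers the full 4 GB space.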

 

Page Replacement Algorithm

In a computer system, reading even a small amount of data from the hard disk usually takes several milliseconds, while reading from memory takes only nanoseconds; a CPU instruction also takes on the order of nanoseconds. If executing an instruction triggers several page faults, the performance impact is easy to imagine, so minimizing reads from the hard disk undoubtedly improves performance greatly. As we saw earlier, physical memory is extremely limited: when a page requested through virtual memory is not in physical memory, some page already in physical memory must be replaced, and choosing which page to replace becomes especially important. If the algorithm is poor and evicts pages that will soon be needed again, they will just have to be brought back in later, which undoubtedly reduces efficiency. Let's look at several page replacement algorithms.

Optimal Page Replacement Algorithm

The optimal replacement algorithm replaces the page that will not be used for the longest time in the future. It sounds simple, but it cannot be implemented, because future references are unknown. However, this algorithm serves as a benchmark for measuring other algorithms.
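Although unimplementable online, the optimal algorithm can be simulated when the whole reference string is known in advance; here is a sketch with an invented reference string and frame count:

```python
def optimal(frames_count, refs):
    """Simulate OPT: evict the page whose next use is farthest in the future."""
    frames, faults = [], 0
    for i, page in enumerate(refs):
        if page in frames:
            continue
        faults += 1
        if len(frames) < frames_count:
            frames.append(page)
        else:
            # Evict the resident page not needed for the longest time (or never again).
            def next_use(p):
                future = refs[i + 1:]
                return future.index(p) if p in future else float("inf")
            frames.remove(max(frames, key=next_use))
            frames.append(page)
    return faults

print(optimal(3, [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3]))  # -> 7 page faults
```

Running any other algorithm over the same reference string can only produce at least as many faults, which is what makes OPT a useful yardstick.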

 

Not Recently Used algorithm (NRU page replacement algorithm)

This algorithm gives each page two flag bits: R, meaning the page has been accessed recently, and M, meaning it has been modified. R is cleared periodically. The idea is to evict first the pages with R = 0 (not accessed recently), then pages with R = 1 and M = 0 (accessed but not modified), and finally pages with R = 1 and M = 1.

 

First-in, first-out Page Replacement Algorithm (FIFO)

The idea of this algorithm is to evict the page that has been in memory the longest. Its performance is close to random eviction, which is to say, not good.

 

Improved FIFO algorithm (Second Chance page replacement algorithm)

This algorithm builds on FIFO. To avoid replacing frequently used pages, a flag bit R is added. When the oldest page is considered for eviction, its R bit is checked: if R is 1, the page is not evicted; instead R is cleared to 0 and the page is moved to the back of the queue, getting a "second chance". Pages with R = 0 are evicted directly. This avoids evicting pages that are in active use.
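A minimal sketch of second chance, with an invented reference string; each queue entry carries its own R bit, set on every access:

```python
from collections import deque

def second_chance(frames_count, refs):
    """Each queue entry is [page, R]; R is set to 1 on every access."""
    queue, faults = deque(), 0
    for page in refs:
        hit = False
        for entry in queue:
            if entry[0] == page:
                entry[1] = 1          # set the R bit on access
                hit = True
                break
        if hit:
            continue
        faults += 1
        if len(queue) < frames_count:
            queue.append([page, 0])
            continue
        # Give pages with R=1 a second chance: clear R and requeue them.
        while queue[0][1] == 1:
            old = queue.popleft()
            old[1] = 0
            queue.append(old)
        queue.popleft()               # head now has R=0: evict it
        queue.append([page, 0])
    return faults

print(second_chance(3, [1, 2, 3, 1, 4, 5]))  # -> 5 page faults
```

In the example, page 1 survives the fault on page 4 because its R bit was set by the second access, while page 2 is evicted.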

 

Clock replacement algorithm (clock page replacement algorithm)

Although the improved FIFO algorithm avoids replacing frequently used pages, constantly moving pages to the back of the queue is inefficient. The clock algorithm therefore links the FIFO queue into a circle, with a "hand" pointing at the oldest page. When a page fault occurs, the hand advances from its current position looking for a page with R = 0; each page with R = 1 that it passes has its R bit cleared to 0, and nothing ever needs to be moved. See Figure 6.
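The same policy as second chance, expressed with a circular array and a hand instead of queue rotation; the reference string is invented:

```python
def clock(frames_count, refs):
    """Frames form a circle; the hand advances past R=1 pages, clearing R."""
    frames = [None] * frames_count    # each slot holds [page, R] or None
    hand, faults = 0, 0
    for page in refs:
        entry = next((e for e in frames if e and e[0] == page), None)
        if entry:
            entry[1] = 1              # hit: set the R bit
            continue
        faults += 1
        while frames[hand] is not None and frames[hand][1] == 1:
            frames[hand][1] = 0       # clear R and advance the hand
            hand = (hand + 1) % frames_count
        frames[hand] = [page, 0]      # empty slot or R=0 victim: load here
        hand = (hand + 1) % frames_count
    return faults

print(clock(3, [1, 2, 3, 1, 4, 5]))  # -> 5 page faults, same as second chance
```

Because clock and second chance implement the same eviction policy, they produce identical fault counts; clock just replaces list movement with hand movement.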

Figure 6. Clock replacement algorithm

 

Least Recently Used algorithm (LRU page replacement algorithm)

The LRU algorithm evicts the page that has gone unused for the longest time. It performs well but is difficult to implement exactly, since it requires tracking the recency of every memory access.
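In software, LRU is commonly approximated with an ordered map keyed by recency; a minimal sketch with an invented reference string:

```python
from collections import OrderedDict

def lru(frames_count, refs):
    """LRU via an ordered map: the most recently used page moves to the end."""
    frames, faults = OrderedDict(), 0
    for page in refs:
        if page in frames:
            frames.move_to_end(page)    # refresh recency on every access
            continue
        faults += 1
        if len(frames) == frames_count:
            frames.popitem(last=False)  # evict the least recently used page
        frames[page] = True
    return faults

print(lru(3, [1, 2, 3, 1, 4, 5]))  # -> 5 page faults
```

What makes true LRU hard in hardware is exactly the `move_to_end` step: it must happen on every single memory reference, not just on page faults.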

 

The following table compares the preceding algorithms:

Algorithm                                  Description
Optimal replacement algorithm              Cannot be implemented; used as the benchmark for other algorithms
Not recently used (NRU) algorithm          Performance similar to LRU
First-in, first-out (FIFO) algorithm       May replace frequently used pages
Improved FIFO (second chance) algorithm    A great improvement over plain first-in, first-out
Least recently used (LRU) algorithm        Very good performance, but difficult to implement
Clock replacement algorithm                A very practical algorithm

 

All of the algorithms above rely to some degree on the principle of locality, which comes in temporal and spatial forms.

1. Temporal locality: pages accessed recently are likely to be accessed again in the near future.

2. Spatial locality: pages near a recently accessed page are also likely to be accessed.

 

Summary

This article has briefly introduced operating system memory management; these basic concepts are very helpful for many developers. Memory management also includes segmentation, in which a process can have multiple independent logical address spaces; that will be covered in a later article.
