Operating System Concepts Learning Note 15: Memory Management (I), Background
Memory is central to the operation of a modern computer system. Memory consists of a large array of words or bytes, each with its own address. The CPU fetches instructions from memory according to the value of the program counter (PC); these instructions may cause additional loads from and stores to specific memory addresses.
A typical instruction-execution cycle first fetches an instruction from memory. The instruction is then decoded and may cause operands to be fetched from memory. After the instruction has been executed on the operands, results may be stored back into memory. The memory unit sees only a stream of memory addresses; it does not know how they are generated (by the instruction counter, indexing, indirection, literal addresses, and so on) or what they are for (instructions or data).
Basic hardware:
The only storage that the CPU can access directly is main memory (RAM) and the registers built into the processor itself. Machine instructions can take memory addresses as arguments, but not disk addresses; any data not in memory must be moved into memory before the CPU can operate on it.
Registers built into the CPU can typically be accessed within a single CPU clock cycle. Most CPUs can decode and execute one or more instructions per clock cycle on register contents, but the same cannot be said of main memory: completing a memory access may take many CPU clock cycles, during which the CPU normally has to stall, since it lacks the data needed to complete the instruction it is executing. Because memory accesses are so frequent, this situation is intolerable. The remedy is to add fast memory between the CPU and main memory; this memory buffer, used to bridge the speed difference, is called a cache.
Besides ensuring that physical memory is accessed quickly enough, we must also ensure that the operating system is protected from access by user processes and that user processes are protected from one another.
One possible scheme works as follows:
First, we must make sure that each process has a separate memory space. To do this, we need to determine the range of legal addresses that the process may access and ensure that the process can access only these addresses. This protection can be provided by a base register and a limit register. The base register holds the smallest legal physical memory address, and the limit register specifies the size of the range. For example, if the base register holds 300040 and the limit register holds 120900, then the program can legally access all addresses from 300040 up to (but not including) 420940.
Memory-space protection is implemented by having the CPU hardware compare every address generated in user mode against these registers. Any attempt by a user-mode program to access memory it should not access is trapped by the operating system and treated as a fatal error.
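As a rough illustration, the hardware check can be modeled in C as follows. The function and parameter names are hypothetical; the real check is performed by hardware on every memory reference generated in user mode.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of the base/limit protection check.
 * base  - smallest legal physical address for this process
 * limit - size of the legal range
 * Legal addresses are [base, base + limit), e.g. [300040, 420940). */
bool address_is_legal(uint32_t addr, uint32_t base, uint32_t limit) {
    return addr >= base && addr - base < limit;
}

/* On an illegal access, real hardware traps to the operating system,
 * which treats the attempt as a fatal error. */
```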
The operating system, executing in kernel mode, has unrestricted access to both operating-system memory and user memory. It can therefore load user programs into user memory, dump those programs when they fail, access and modify the parameters of system calls, and so on.
Address bindings:
Typically, a program is stored on disk as a binary executable file. To be executed, the program must be brought into memory and placed within a process.
Depending on the memory-management scheme in use, the process may be moved between disk and memory during its execution. The processes on disk that are waiting to be brought into memory for execution form the input queue.
The usual procedure is to select a process from the input queue and load it into memory. As the process executes, it accesses instructions and data in memory. Eventually, the process terminates, and its memory space is freed.
Most systems allow a user process to reside anywhere in physical memory, and this affects the addresses that the user program can use. In most cases, a user program goes through several steps before being executed, and addresses may be represented in different ways during these steps. Addresses in the source program are generally symbolic (such as count). A compiler typically binds these symbolic addresses to relocatable addresses (such as "14 bytes from the beginning of this module"). The linker or loader in turn binds the relocatable addresses to absolute addresses (such as 74014). Each binding is a mapping from one address space to another.
In general, instructions and data can be bound to memory addresses at any of the following times:
Compile time: If it is known at compile time where the process will reside in memory, absolute code can be generated. If the starting location changes at some later time, the code must be recompiled.
Load time: If it is not known at compile time where the process will reside, the compiler must generate relocatable code. Final binding is then delayed until load time; if the starting address changes, only the user code needs to be reloaded to incorporate the changed value (a sketch of load-time relocation follows this list).
Execution time: If the process can be moved from one memory segment to another during its execution, binding must be delayed until run time. Most general-purpose operating systems use this method.
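As a minimal sketch of load-time binding, assume a toy module format in which a relocation table lists the positions of all relocatable addresses; the loader patches each one by adding the load address. All names here are illustrative:

```c
#include <stddef.h>
#include <stdint.h>

/* Patch a loaded module image in place. Each entry of reloc_offsets
 * is the index of a word in image[] that holds an address relative
 * to the start of the module. */
void relocate_module(uint32_t *image, const size_t *reloc_offsets,
                     size_t nrelocs, uint32_t load_address) {
    for (size_t i = 0; i < nrelocs; i++) {
        /* Turn "14 bytes from the start of this module" into an
         * absolute address such as 74014. */
        image[reloc_offsets[i]] += load_address;
    }
}
```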
Logical address space and physical address space:
An address generated by the CPU is commonly referred to as a logical address, whereas the address seen by the memory unit (that is, the address loaded into the memory-address register) is commonly referred to as a physical address.
The compile-time and load-time address-binding methods generate identical logical and physical addresses. The execution-time binding scheme, however, results in differing logical and physical addresses; in this case, the logical address is usually called a virtual address. The set of all logical addresses generated by a program is its logical address space, and the set of physical addresses corresponding to those logical addresses is its physical address space.
The run-time mapping from virtual addresses to physical addresses is done by a hardware device called the memory-management unit (MMU). There are many ways to accomplish this mapping. A simple MMU scheme, which is a generalization of the base-register scheme, uses a relocation register: the value in the relocation register is added to every address generated by a user process before the address is sent to memory.
For example, if the relocation register holds 14000, a user access to address 346 is mapped to physical address 14346.
The user program never sees the real physical addresses. The program can create a pointer to location 346, store it in memory, manipulate it, and compare it with other addresses, all as the number 346. Only when a logical address is used as a memory reference does the memory-mapping hardware convert it to a physical address; the final location of a referenced memory address is not determined until the reference is made.
The user program deals with logical addresses only; the logical address space is bound to a separate physical address space.
Dynamic loading:
If the entire program and data of a process must be in physical memory for the process to execute, the size of a process is limited by the size of physical memory.
To obtain better memory-space utilization, we can use dynamic loading, in which a routine is not loaded until it is called.
All routines are kept on disk in a relocatable load format. The main program is loaded into memory and executed. When a routine needs to call another routine, the calling routine first checks whether the other routine has been loaded. If not, the relocatable linking loader is called to load the desired routine into memory and to update the program's address tables to reflect this change. Control is then passed to the newly loaded routine.
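On POSIX systems, a program can perform this kind of dynamic loading itself through the dlopen/dlsym interface. A minimal sketch follows; the library name and the handle_error routine are hypothetical, and on many systems the program must be linked with -ldl:

```c
#include <dlfcn.h>
#include <stdio.h>

int main(void) {
    /* Load the routine's library only when it is actually needed. */
    void *lib = dlopen("./libhandler.so", RTLD_LAZY);
    if (!lib) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return 1;
    }

    /* Look up the routine by name and call it through a pointer. */
    void (*handle_error)(int) = (void (*)(int))dlsym(lib, "handle_error");
    if (handle_error)
        handle_error(42);

    dlclose(lib);
    return 0;
}
```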
The advantage of dynamic loading is that a routine is loaded only when it is needed. This method is particularly useful when large amounts of code are needed to handle infrequently occurring cases, such as error handling: although the total program may be large, the portion that is actually used (and hence loaded) can be much smaller.
Dynamic loading does not require special support from the operating system. It is the responsibility of the user to design programs that take advantage of this method.
Dynamic linking and shared libraries:
Some operating systems support only static linking, in which system language libraries are treated like any other object module and are combined by the loader into the binary program image.
The concept of dynamic linking is similar to that of dynamic loading, except that linking, rather than loading, is postponed until run time. This feature is usually used with system libraries, such as language subroutine libraries. Without it, every program on the system would need its own copy of the language library, a requirement that wastes both disk space and main memory.
With dynamic linking, a stub is included in the binary image for each library-routine reference. The stub is a small piece of code that indicates how to locate the appropriate memory-resident library routine, or how to load the library if the routine is not already in memory. Either way, the stub replaces itself with the address of the routine and executes the routine. Thus, the next time that routine is called, it executes directly, with no cost for dynamic linking. Under this scheme, all processes that use a language library need only one copy of the library code.
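The stub idea can be imitated in plain C with a function pointer that initially points at a resolver and, on its first call, replaces itself with the real routine's address. Everything named below is illustrative; real dynamic linkers do this with a procedure linkage table:

```c
#include <stdio.h>

static void library_routine(double x) {   /* the memory-resident routine */
    printf("library routine called with %f\n", x);
}

static void stub(double x);

/* All calls go through this pointer; it starts out aimed at the stub. */
static void (*lib_call)(double) = stub;

static void stub(double x) {
    /* First call: "locate" the routine, then replace the stub
     * with the routine's address. */
    lib_call = library_routine;
    lib_call(x);                          /* complete the original call */
}

int main(void) {
    lib_call(2.0);   /* goes through the stub, which binds the routine */
    lib_call(3.0);   /* now calls the routine directly, no stub cost */
    return 0;
}
```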
Dynamic linking is also useful for library updates. A library may be replaced by a new version, and all programs that reference the library will automatically use the new version; without dynamic linking, all such programs would have to be relinked to gain access to the new library.
So that programs will not accidentally execute a new, incompatible version of a library, version information is included in both the program and the library. More than one version of a library may be loaded into memory, and each program uses its version information to decide which copy of the library to use.
Thus, only programs compiled with the new library version are affected by any incompatible changes in it. Programs linked before the new library was installed continue to use the older version. This system is also known as shared libraries.
Unlike dynamic loading, dynamic linking generally requires help from the operating system. If the processes in memory are protected from one another, then only the operating system can check whether the needed routine is in another process's memory space, or can allow multiple processes to access the same memory addresses.
Swapping:
A process must be in memory to be executed. A process can, however, be temporarily swapped out of memory to a backing store and then brought back into memory when it needs to execute again.
A variant of this swapping policy is used with priority-based scheduling algorithms. If a higher-priority process arrives and wants service, the memory manager can swap out a lower-priority process so that the higher-priority process can be loaded and executed. When the higher-priority process finishes, the lower-priority process can be swapped back into memory and continued. This variant is sometimes called roll out, roll in (swap out/swap in).
Normally, a process that is swapped out must be swapped back into the same memory space it occupied previously. This restriction is dictated by the method of address binding. If binding is done at assembly or load time, the process cannot be moved to a different location; if execution-time binding is used, the process can be swapped into a different memory space, because the physical addresses are computed at run time.
Swapping requires a backing store, commonly a fast disk. It must be large enough to accommodate copies of all memory images for all users, and it must provide direct access to these memory images. The system keeps a ready queue consisting of all processes whose memory images are on the backing store or in memory and that are ready to run. Whenever the CPU scheduler decides to execute a process, it calls the dispatcher. The dispatcher checks whether the next process in the queue is in memory; if it is not, and there is no free memory region, the dispatcher swaps out a process currently in memory and swaps in the desired process. It then reloads the registers and transfers control to the selected process.
The context-switch time in such a swapping system is fairly high. To use the CPU efficiently, the execution time of each process should be long relative to the swap time.
Note that the major part of the swap time is transfer time, and the total transfer time is directly proportional to the amount of memory swapped. Swap time can therefore be reduced by swapping only the memory that is actually in use. For this method to work well, the user must keep the system informed of any changes in memory requirements: a process with dynamically changing memory needs notifies the operating system through system calls (request memory and release memory).
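As an illustration with made-up numbers: assume a 100 MB process and a backing store that transfers 50 MB per second. Swapping the process out takes 100 / 50 = 2 seconds, and swapping another 100 MB process in takes 2 more, so one complete swap costs about 4 seconds of pure transfer time before any disk latency is counted. This is why swapping only the memory actually in use pays off.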
Swapping is constrained by other factors as well. To swap a process, we must be sure that it is completely idle. Of particular concern is pending I/O: if an I/O operation is asynchronously accessing an I/O buffer in the user's memory, the process cannot be swapped. Suppose the I/O operation is queued because the device is busy; if we swapped out process P1 and swapped in process P2, the I/O operation might then attempt to use memory that now belongs to P2.
There are two ways to solve this problem:
First, never swap out a process with pending I/O.
Second, execute I/O operations only into operating-system buffers. Data is then transferred between the operating-system buffers and the process's memory only when the process is swapped in.
Swap space is often allocated as a separate chunk of disk, independent of the file system, so that its use can be as fast as possible.
Normally, swapping is disabled. It is started when many processes are running and memory space is tight, and it is paused again when the system load falls.
Contiguous memory allocation (contiguous allocation):
Memory must accommodate both the operating system and the various user processes, so the parts of memory should be allocated as efficiently as possible.
Memory is usually divided into two partitions: one for the resident operating system and one for the user processes. The operating system can be placed in either low or high memory; the major factor affecting this decision is the location of the interrupt vector. Since the interrupt vector is often in low memory, programmers usually place the operating system in low memory as well.
Several user processes generally need to reside in memory at the same time, so we must consider how to allocate memory to the processes in the input queue that are waiting to be brought in.
With contiguous memory allocation, each process is contained in a single contiguous section of memory.
Memory mapping and protection:
Protection can be achieved by using a relocation register together with a limit register.
The relocation register contains the value of the smallest physical address; the limit register contains the range of legal logical addresses.
Every logical address must be less than the value in the limit register. The MMU maps a logical address dynamically by first checking it against the limit register and then adding the value in the relocation register; the resulting physical address is sent to the memory unit.
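A minimal C model of this dynamic mapping, complementing the earlier base/limit sketch; the names are hypothetical, and real hardware performs the comparison and addition on every memory reference:

```c
#include <stdint.h>
#include <stdlib.h>

/* Map a logical address to a physical address. The logical address
 * must be smaller than the limit register; the relocation register
 * is then added to form the physical address. */
uint32_t mmu_map(uint32_t logical, uint32_t relocation, uint32_t limit) {
    if (logical >= limit)
        abort();   /* real hardware traps to the operating system */
    return logical + relocation;   /* e.g. 346 + 14000 = 14346 */
}
```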
When the CPU scheduler selects a process for execution, the dispatcher loads the relocation and limit registers with the correct values as part of the context switch. Because every address generated by the CPU is checked against these registers, this scheme protects the operating system and the other users' programs and data from being modified by the running process.
The relocation-register scheme also provides an effective way to allow the size of the operating system to change dynamically. If a device driver (or other operating-system service) is not used frequently, there is no need to keep it in memory; such code is sometimes called transient operating-system code, and it is brought in and removed as needed. Using such code thus changes the size of the operating system during program execution.
Memory allocation:
One of the simplest methods for allocating memory is to divide memory into several fixed-sized partitions, where each partition may contain exactly one process. The degree of multiprogramming is bounded by the number of partitions. In this multiple-partition method, when a partition is free, a process is selected from the input queue and loaded into the free partition; when the process terminates, the partition becomes available for another process. This method is no longer in use. The scheme described next is a generalization of the fixed-partition scheme (known as MVT); it was used primarily in batch environments, but the same ideas apply to time-sharing operating systems that use pure segmented memory management.
In the variable-partition scheme, the operating system keeps a table indicating which parts of memory are available and which are occupied. Initially, all memory is available for user processes and is considered one large block of available memory, a hole. When a new process needs memory, the system searches for a hole large enough for it; if one is found, the needed memory is allocated to the process from that hole, and the rest of the hole remains available for later requests.
As processes enter the system, they are put into the input queue, which the operating system orders according to its scheduling algorithm. Memory is allocated to processes until, eventually, the memory requirements of the next process cannot be satisfied, that is, no available hole is large enough to hold it. The operating system can then either wait until a large enough block becomes available or skip down the input queue to see whether the smaller memory requirements of some other process can be met.
In general, at any given time there is a set of holes of various sizes scattered throughout memory. When a new process needs memory, the system searches this set for a hole that is large enough. If the hole is too large, it is split in two: one part is allocated to the new process, and the other is returned to the set of holes. When a process terminates, it releases its block of memory, which is returned to the set of holes; if the released block is adjacent to other holes, the adjacent holes are merged into one larger hole. At this point, the system may check whether there are processes waiting for memory and whether the newly freed and merged memory can satisfy any of them.
This procedure is an instance of the general dynamic storage-allocation problem: how to satisfy a request of size n from a list of free holes. There are many solutions to this problem. The most common strategies for selecting a free hole from the set of available holes are first fit (the first hole that is large enough), best fit (the smallest hole that is large enough), and worst fit (the largest hole); a sketch of the first two follows the list below.
First fit (first-fit): Allocate the first hole that is big enough. Searching can start either at the beginning of the list or where the previous first-fit search ended, and it can stop as soon as a large enough free hole is found.
Best fit (best-fit): Allocate the smallest hole that is big enough. The entire list must be searched, unless it is kept ordered by size. This strategy produces the smallest leftover hole.
Worst fit (worst-fit): Allocate the largest hole. Again, the entire list must be searched, unless it is sorted by size. This strategy produces the largest leftover hole, which may be more useful than the small leftover holes produced by the best-fit approach.
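A minimal sketch in C of the first two strategies, assuming a toy free list (the struct and function names are illustrative, not from the book):

```c
#include <stddef.h>

/* Toy free list: each hole is described by its start and size. */
struct hole { size_t start, size; };

/* First fit: scan from the beginning and take the first hole that
 * is large enough. Returns the hole's index, or -1 if none fits. */
int first_fit(const struct hole *holes, int nholes, size_t request) {
    for (int i = 0; i < nholes; i++)
        if (holes[i].size >= request)
            return i;
    return -1;
}

/* Best fit: scan the entire list and take the smallest hole that is
 * still large enough, which leaves the smallest leftover fragment. */
int best_fit(const struct hole *holes, int nholes, size_t request) {
    int best = -1;
    for (int i = 0; i < nholes; i++)
        if (holes[i].size >= request &&
            (best < 0 || holes[i].size < holes[best].size))
            best = i;
    return best;
}
```

Worst fit would be the same full scan with the size comparison reversed. On a successful allocation, the chosen hole is split: request bytes go to the process and the remainder stays in the free list.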
Simulations have shown that both first fit and best fit are better than worst fit in terms of both execution time and storage utilization. Neither first fit nor best fit is clearly better in storage utilization, but first fit is generally faster.
Fragmentation:
Both the first-fit and best-fit strategies suffer from external fragmentation: as processes are loaded into and removed from memory, the free memory space is broken into small pieces.
External fragmentation exists when the total amount of free memory is enough to satisfy a request but the free memory is not contiguous. In the worst case, there is a block of free (wasted) memory between every two processes; if all of these pieces were one big free block, several more processes could be run.
Whether first fit or best fit is used can affect the amount of fragmentation, as can which end of a free block is allocated from. No matter which algorithm is used, however, external fragmentation remains a problem.
How serious external fragmentation is depends on the total amount of memory and the average process size. For example, statistical analysis of first fit reveals that, even with some optimization, given N allocated blocks, another 0.5N blocks will be lost to fragmentation. That is, 0.5N of the N + 0.5N total blocks, or one-third of memory, may be unusable. This property is known as the 50-percent rule.
Memory fragmentation can be internal as well as external. If memory is allocated in fixed-size blocks, a process may be given slightly more memory than it requested; the difference between these two numbers is internal fragmentation, memory that is internal to a partition but not being used. For instance, if blocks are 4 KB and a process needs 9 KB, it receives three blocks (12 KB), leaving 3 KB of internal fragmentation.
One solution to the problem of external fragmentation is compaction: shuffle the memory contents so that all free memory is merged into one large block. Compaction is not always possible, however. If relocation is static and performed at assembly or load time, memory cannot be compacted; compaction is possible only if relocation is dynamic and done at execution time. If addresses are relocated dynamically, compaction is accomplished by moving a program and its data and then changing the base register to reflect the new base address. When compaction is possible, its cost must also be assessed. The simplest compaction algorithm moves all processes toward one end of memory and all holes toward the other end, producing one large hole of available memory; this scheme can be expensive.
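A toy sketch of that simple algorithm, assuming dynamic relocation (the structures and names here are illustrative):

```c
#include <stddef.h>
#include <string.h>

struct region { size_t base, size; };   /* one process's allocation */

/* Slide all allocated regions toward address 0 so the free space
 * coalesces into one large hole at the high end. Assumes regions[]
 * is sorted by base and that relocation is dynamic, so updating
 * base (the relocation-register value) keeps each process correct. */
size_t compact(unsigned char *memory, struct region *regions, int n) {
    size_t next = 0;                     /* next free physical address */
    for (int i = 0; i < n; i++) {
        if (regions[i].base != next) {
            memmove(memory + next, memory + regions[i].base,
                    regions[i].size);
            regions[i].base = next;
        }
        next += regions[i].size;
    }
    return next;   /* start of the single merged hole */
}
```

Even in this simplified form, the cost is clear: every byte of every live process may have to be copied.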
Another solution to external fragmentation is to permit a process's logical address space to be noncontiguous, allowing the process to be allocated physical memory wherever such memory is available. Two complementary techniques implement this approach: paging and segmentation. The two techniques can also be combined.