Chapter 9: Virtual Memory
Virtual memory is one of the most important concepts in computer systems.
Virtual memory is an elegant interaction of hardware exceptions, hardware address translation, main memory, disk files, and kernel software that provides each process with a large, uniform, private address space. With one clean mechanism, virtual memory provides three important capabilities:
- It treats main memory as a cache for an address space stored on disk, keeping only the active areas in main memory and transferring data back and forth between disk and main memory as needed. In this way it uses main memory efficiently.
- It provides each process with a uniform address space, which simplifies memory management.
- It protects the address space of each process from corruption by other processes.
9.1 Physical and virtual addressing
1. The main memory of a computer system is organized as an array of M contiguous byte-sized cells. Each byte has a unique physical address.
2. Early PCs used physical addressing, and systems such as digital signal processors, embedded microcontrollers, and Cray supercomputers continue to use it. Modern processors designed for general-purpose computing use virtual addressing. With virtual addressing, the CPU accesses main memory by generating a virtual address, which is converted to the appropriate physical address before being sent to memory. This conversion is called address translation, and the dedicated hardware that performs it is the memory management unit (MMU).
9.2 Address Space
1. An address space is an ordered set of non-negative integer addresses: {0, 1, 2, ...}. If the integers are consecutive, it is called a linear address space. In a system with virtual memory, the CPU generates virtual addresses from an address space of N = 2^n addresses called the virtual address space: {0, 1, 2, ..., N-1}.
2. The size of an address space is described by the number of bits needed to represent its largest address. For example, a virtual address space with N = 2^n addresses is called an n-bit address space.
3. A system also has a physical address space that corresponds to the M bytes of physical memory in the system: {0, 1, 2, ..., M-1}. M is not required to be a power of 2.
4. Each byte in main memory has a virtual address chosen from the virtual address space and a physical address chosen from the physical address space.
9.3 Virtual memory as a caching tool
Conceptually, virtual memory is organized as an array of N contiguous byte-sized cells stored on disk. Each byte has a unique virtual address that serves as an index into the array. The contents of the array on disk are cached in main memory. As with any other cache in the memory hierarchy, the data on disk (the lower level) is partitioned into blocks that serve as the transfer units between the disk and main memory (the upper level). The VM system handles this by partitioning virtual memory into fixed-size blocks called virtual pages (VPs). Each virtual page is P = 2^p bytes in size. Similarly, physical memory is partitioned into physical pages (PPs), also P bytes in size. Physical pages are also called page frames.
At any point in time, the set of virtual pages is partitioned into three disjoint subsets:
- Unallocated: pages that the VM system has not yet allocated (or created). They occupy no space on disk.
- Cached: allocated pages that are currently cached in physical memory.
- Uncached: allocated pages that are not cached in physical memory.
1. DRAM cache organization
- The miss penalty is very large, because a miss is served from disk.
- The cache is fully associative: any virtual page can be placed in any physical page.
- Replacement algorithms are more sophisticated, because replacing the wrong page is so costly.
- The cache always uses write-back instead of write-through.
2. Page tables
A page table is a data structure stored in physical memory that maps virtual pages to physical pages.
The page table is an array of page table entries (PTEs). Each page in the virtual address space has a PTE at a fixed offset within the page table. For example:
- A PTE holding NULL: the page is unallocated.
- VP3, VP5: allocated but not yet cached.
- VP1: allocated and cached.
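The three PTE states can be sketched in C as follows. The struct layout, the field names, and the use of a zero disk address to stand for NULL are illustrative assumptions, not a real hardware format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical PTE layout: a valid bit plus an address field. */
typedef struct {
    unsigned valid;   /* 1: the page is cached in DRAM */
    uint64_t addr;    /* valid = 1: physical page number;
                         valid = 0: disk address, where 0 stands for NULL */
} pte_t;

typedef enum { PAGE_UNALLOCATED, PAGE_UNCACHED, PAGE_CACHED } page_state_t;

/* Classify the virtual page described by table[vpn]. */
page_state_t page_state(const pte_t *table, size_t num_pages, size_t vpn)
{
    assert(vpn < num_pages);
    if (table[vpn].valid)
        return PAGE_CACHED;
    return table[vpn].addr == 0 ? PAGE_UNALLOCATED : PAGE_UNCACHED;
}
```

A reference to a page in the PAGE_CACHED state is a page hit; a reference to a PAGE_UNCACHED page would trigger a page fault.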
3. Page hits and page faults: terminology
- Page fault: a DRAM cache miss.
- Page fault exception: transfers control to the page fault handler in the kernel, which selects a victim page.
- Page: the name virtual memory customarily uses for a block.
- Swapping (paging): the activity of transferring a page between disk and memory.
- Demand paging: the strategy of not swapping a page in until a miss occurs. All modern systems use it.
4. Page faults
A DRAM cache miss is called a page fault.
The principle of locality ensures that at any point in time a program tends to work on a smaller set of active pages known as the working set or resident set.
Thrashing: occurs when the working set size exceeds the size of physical memory, so pages are swapped in and out continuously.
9.4 Virtual memory as a tool for memory management
- The operating system provides a separate page table, and thus a separate virtual address space, for each process.
- Multiple virtual pages can be mapped to the same shared physical page.
- Memory mapping: the notion of mapping a set of contiguous virtual pages to an arbitrary location in an arbitrary file.
VM simplifies linking and loading, the sharing of code and data, and memory allocation for applications.
9.5 Virtual memory as a tool for memory protection
Three permission bits in a PTE:
- SUP: indicates whether the process must be running in kernel (supervisor) mode to access the page.
- READ: read permission.
- WRITE: write permission.
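A minimal sketch of how these three bits could gate an access. The bit positions and the function name access_allowed are assumptions made for illustration; real PTE formats differ by architecture.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit positions for the three permission bits. */
enum { PTE_SUP = 1u << 0, PTE_READ = 1u << 1, PTE_WRITE = 1u << 2 };

/* Return true if the access is allowed by the PTE's permission bits.
 * kernel_mode: the CPU is running in kernel (supervisor) mode.
 * is_write:    the access is a store rather than a load. */
bool access_allowed(unsigned pte_bits, bool kernel_mode, bool is_write)
{
    if ((pte_bits & PTE_SUP) && !kernel_mode)
        return false;                       /* user code touching a kernel-only page */
    if (is_write)
        return (pte_bits & PTE_WRITE) != 0;
    return (pte_bits & PTE_READ) != 0;
}
```

When a check like this fails, the hardware raises a protection fault and the kernel typically delivers a segmentation fault to the process.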
9.6 Address Translation
1. Address translation
Address translation is a mapping between the elements of an N-element virtual address space (VAS) and an M-element physical address space (PAS).
2. Page table base register
A control register in the CPU, the page table base register (PTBR), points to the current page table. An n-bit virtual address has two parts: a p-bit virtual page offset (VPO) and an (n - p)-bit virtual page number (VPN). The MMU uses the VPN to select the appropriate PTE; for example, VPN 0 selects PTE 0. Because physical and virtual pages are both P bytes, the physical page offset (PPO) is identical to the VPO. The physical address is therefore the physical page number (PPN) from the PTE concatenated with the VPO from the virtual address.
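The VPN/VPO split and the PPN/VPO concatenation are plain bit operations. The sketch below assumes p = 12 (4 KB pages) and a flat table in which entry vpn holds the PPN directly; both are assumptions chosen to keep the example small.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_BITS 12u   /* assumed p = 12, so P = 4096 bytes */

/* Split a virtual address into its VPN and VPO. */
static inline uint64_t vpn_of(uint64_t va) { return va >> PAGE_BITS; }
static inline uint64_t vpo_of(uint64_t va) { return va & ((1ull << PAGE_BITS) - 1); }

/* Translate va using a flat table where ppn_table[vpn] holds the PPN.
 * The caller guarantees that the page is cached (a page hit). */
uint64_t translate(const uint64_t *ppn_table, uint64_t num_pages, uint64_t va)
{
    uint64_t vpn = vpn_of(va);
    assert(vpn < num_pages);
    /* PPO equals VPO, so the physical address is PPN concatenated with VPO. */
    return (ppn_table[vpn] << PAGE_BITS) | vpo_of(va);
}
```

For example, with ppn_table = {5, 9}, virtual address 0x1abc has VPN 1 and VPO 0xabc, so it translates to physical address 0x9abc.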
3. Page hits are handled entirely by hardware, whereas handling a page fault requires cooperation between the hardware and the OS kernel.
4. Integrating caches and virtual memory
Most systems use physical addresses to access the cache. With physical addressing, multiple processes can easily have blocks in the cache at the same time and share blocks from the same virtual pages. In addition, the cache does not have to deal with protection issues, because access rights are checked as part of address translation.
5. Speeding up address translation with a TLB
The MMU includes a small cache of PTEs called the translation lookaside buffer (TLB). The TLB is a small, virtually addressed cache in which each line holds a block consisting of a single PTE.
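As a rough illustration, a TLB lookup can be sketched as a cache indexed by the low-order bits of the VPN and tagged with the remaining bits. The sketch assumes a direct-mapped TLB with 16 entries; real TLBs are usually set associative, and all type and function names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TLB_INDEX_BITS 4u                       /* assumed: 16 entries */
#define TLB_ENTRIES    (1u << TLB_INDEX_BITS)

typedef struct {
    bool     valid;
    uint64_t tag;   /* high-order VPN bits */
    uint64_t ppn;   /* cached translation */
} tlb_entry_t;

/* Look up vpn. On a hit, store the PPN in *ppn and return true. */
bool tlb_lookup(const tlb_entry_t tlb[TLB_ENTRIES], uint64_t vpn, uint64_t *ppn)
{
    uint64_t index = vpn & (TLB_ENTRIES - 1);   /* low bits select the entry */
    uint64_t tag   = vpn >> TLB_INDEX_BITS;     /* remaining bits must match */
    if (tlb[index].valid && tlb[index].tag == tag) {
        *ppn = tlb[index].ppn;
        return true;
    }
    return false;                               /* miss: the MMU must fetch the PTE */
}

/* Install a translation after a miss, overwriting the old entry. */
void tlb_insert(tlb_entry_t tlb[TLB_ENTRIES], uint64_t vpn, uint64_t ppn)
{
    tlb_entry_t *e = &tlb[vpn & (TLB_ENTRIES - 1)];
    e->valid = true;
    e->tag   = vpn >> TLB_INDEX_BITS;
    e->ppn   = ppn;
}
```

On a hit the translation completes without touching the page table in memory, which is where the speedup comes from.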
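6. Multi-level page tables
A multi-level page table is a hierarchy of page tables used to compact the page table.
(1) Taking a two-level page table hierarchy as an example, the benefits are:
- If a PTE in the level 1 table is null, the corresponding level 2 page table does not have to exist at all.
- Only the level 1 table needs to be in main memory at all times. The VM system can create level 2 page tables and page them in or out as needed, so only the most heavily used level 2 page tables are cached in main memory.
(2) Address translation with multi-level page tables: the virtual address is partitioned into k VPNs and a VPO. VPN i is an index into a page table at level i, and the MMU must access k PTEs before it can determine the PPN.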
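A two-level walk can be sketched as follows. The 10-bit index widths, the struct names, and the use of a NULL pointer to represent a missing level 2 table are assumptions for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define L1_BITS 10u   /* assumed: 1024 level 1 entries */
#define L2_BITS 10u   /* assumed: 1024 PTEs per level 2 table */

typedef struct { bool valid; uint32_t ppn; } pte_t;

/* A level 1 entry is just a pointer; NULL means no level 2 table exists. */
typedef struct { pte_t *l2[1u << L1_BITS]; } page_dir_t;

/* Walk both levels for vpn. Returns true and sets *ppn on a hit. */
bool walk(const page_dir_t *dir, uint32_t vpn, uint32_t *ppn)
{
    uint32_t i1 = vpn >> L2_BITS;                 /* VPN 1: index into level 1 */
    uint32_t i2 = vpn & ((1u << L2_BITS) - 1);    /* VPN 2: index into level 2 */
    assert(i1 < (1u << L1_BITS));
    const pte_t *l2 = dir->l2[i1];
    if (l2 == NULL || !l2[i2].valid)
        return false;                             /* unallocated or not cached */
    *ppn = l2[i2].ppn;
    return true;
}
```

The memory saving comes from the NULL entries: a sparse address space needs level 2 tables only for the regions that are actually in use.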
9.7 Case Study: The Intel Core i7/Linux Memory System
Processor package: four cores, a large L3 cache shared by all cores, and a DDR3 memory controller.
I. Core i7 address translation
II. Linux virtual memory system
Linux maintains a separate virtual address space for each process. Kernel virtual memory contains the code and data structures in the kernel. Some regions of kernel virtual memory are mapped to physical pages that are shared by all processes; other regions contain data that differs for each process.
1. Linux virtual memory areas
Area: a contiguous chunk of allocated virtual memory whose pages are related in some way.
Every virtual page that exists belongs to some area. The kernel maintains a distinct task structure (task_struct) for each process in the system.
The area struct for a particular area includes:
- vm_start: points to the beginning of the area
- vm_end: points to the end of the area
- vm_prot: describes the read/write permissions for all the pages in the area
- vm_flags: describes, among other things, whether the pages are shared or private
- vm_next: points to the next area struct in the list
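2. Linux page fault exception handling
(1) Is virtual address A legal?
If not, the handler triggers a segmentation fault and terminates the process. If so, go to the next step.
(2) Is the attempted memory access legal? That is, does the process have permission?
If not, the handler triggers a protection exception and terminates the process. If so, go to the next step.
(3) At this point the fault resulted from a legal operation on a legal virtual address. The handler selects a victim page, writes it back to disk if it has been modified, swaps in the new page, and updates the page table.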
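The three checks can be sketched as a walk over a list of areas. The area_t struct below is a simplified stand-in that borrows the vm_start, vm_end, vm_prot, and vm_next field names; it is not the kernel's definition. The PROT_R and PROT_W constants and the choice to treat vm_end as exclusive are also assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { PROT_R = 1, PROT_W = 2 };                  /* hypothetical permission bits */
typedef enum { FAULT_SEGV, FAULT_PROTECTION, FAULT_PAGE_IN } fault_result_t;

/* Simplified stand-in for an area struct; not the kernel's definition. */
typedef struct area {
    uintptr_t    vm_start;   /* first address of the area */
    uintptr_t    vm_end;     /* one past the last address (assumed exclusive) */
    int          vm_prot;    /* PROT_R / PROT_W bits */
    struct area *vm_next;    /* next area in the list */
} area_t;

/* Decide how a fault on addr should be handled. */
fault_result_t classify_fault(const area_t *areas, uintptr_t addr, int wanted)
{
    for (const area_t *a = areas; a != NULL; a = a->vm_next) {
        if (addr >= a->vm_start && addr < a->vm_end) {
            if ((a->vm_prot & wanted) != wanted)
                return FAULT_PROTECTION;          /* step 2: access not permitted */
            return FAULT_PAGE_IN;                 /* step 3: legal, swap the page in */
        }
    }
    return FAULT_SEGV;                            /* step 1: address lies in no area */
}
```

A linear list scan keeps the sketch short; a production kernel would use a faster lookup structure for processes with many areas.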
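9.8 Memory mapping
Linux initializes the contents of a virtual memory area by associating it with an object on disk, a process known as memory mapping.
Objects that can be mapped:
1. Regular files in the Unix file system
2. Anonymous files (created by the kernel and containing all binary zeros)
I. Shared objects and private objects
Shared objects
- A shared object is visible to every process that has mapped it into its virtual memory.
- Even if the object is mapped into multiple shared areas, only a single copy of it needs to be stored in physical memory.
Private objects
- Private objects rely on a technique called copy-on-write.
- Only one copy of a private object is kept in physical memory until a process writes to it.
The fork function is an application of copy-on-write. The execve function, by contrast, replaces the virtual memory areas of the current process with mappings for the new program.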
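II. User-level memory mapping with the mmap function
1. Creating a new virtual memory area
#include <unistd.h>
#include <sys/mman.h>
void *mmap(void *start, size_t length, int prot, int flags, int fd, off_t offset);
Returns: a pointer to the mapped area on success, MAP_FAILED (-1) on error.
Parameter meanings:
- start: the new area preferably begins at address start (the address is only a hint)
- prot: access permission bits for the new area:
  - PROT_EXEC: pages consist of instructions that may be executed by the CPU
  - PROT_READ: pages may be read
  - PROT_WRITE: pages may be written
  - PROT_NONE: pages cannot be accessed
- flags: bits that describe the type of the mapped object:
  - MAP_ANON: an anonymous object whose virtual pages are binary zeros
  - MAP_PRIVATE: a private copy-on-write object
  - MAP_SHARED: a shared object
2. Deleting a virtual memory area
#include <unistd.h>
#include <sys/mman.h>
int munmap(void *start, size_t length);
Returns: 0 on success, -1 on error.
munmap deletes the area that starts at virtual address start and consists of the next length bytes.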
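A small usage sketch of the two calls, mapping private anonymous memory and then unmapping it. The function name anon_map_roundtrip is hypothetical, and the sketch assumes a Linux-like system where MAP_ANONYMOUS is available as the modern spelling of MAP_ANON.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Map `length` bytes of private, zero-filled memory, use it, then unmap it.
 * Returns 0 on success and -1 on failure. */
int anon_map_roundtrip(size_t length)
{
    char *buf = mmap(NULL, length, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    int ok = (buf[0] == 0 && buf[length - 1] == 0);  /* anonymous pages start as zeros */
    memset(buf, 'x', length);                        /* writes go to our private copy */
    ok = ok && buf[length - 1] == 'x';

    if (munmap(buf, length) != 0)
        return -1;
    return ok ? 0 : -1;
}
```

Passing NULL for start lets the kernel choose the address, and fd is -1 because an anonymous mapping has no backing file.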
9.9 Dynamic Memory allocation
Heap: a dynamic memory allocator maintains an area of a process's virtual memory known as the heap.
We assume the heap is an area of demand-zero memory that begins immediately after the uninitialized bss area and grows upward (toward higher addresses). For each process, the kernel maintains a variable brk that points to the top of the heap.
Allocators come in two basic styles:
a. Explicit allocators: malloc and free
b. Implicit allocators, also known as garbage collectors
I. The malloc and free functions
A program calls the malloc function to allocate a block from the heap:
#include <stdlib.h>
void *malloc(size_t size);
Returns: a pointer to a memory block of at least size bytes on success, NULL on error.
A program calls the free function to release an allocated heap block:
#include <stdlib.h>
void free(void *ptr);
Returns: nothing.
The ptr argument must point to the beginning of an allocated block that was obtained from malloc, calloc, or realloc.
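A short usage sketch of the malloc and free pairing. The helper name make_squares is hypothetical; the point is the NULL check after malloc and the fact that the caller owns the block until it calls free.

```c
#include <assert.h>
#include <stdlib.h>

/* Return a heap-allocated array holding the squares 0, 1, 4, ..., (n-1)^2,
 * or NULL if n is 0 or the allocation fails. The caller must free() it. */
int *make_squares(size_t n)
{
    if (n == 0)
        return NULL;
    int *a = malloc(n * sizeof *a);   /* at least n * sizeof(int) bytes */
    if (a == NULL)
        return NULL;                  /* malloc reports failure with NULL */
    for (size_t i = 0; i < n; i++)
        a[i] = (int)(i * i);          /* malloc does not initialize the block */
    return a;
}
```

A caller would write int *a = make_squares(4); use a[0] through a[3]; and then call free(a) exactly once.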
II. Why use dynamic memory allocation
Programs often do not know the sizes of certain data structures until the program actually runs.
III. Allocator requirements and goals
Constraints:
- Handle arbitrary request sequences
- Respond to requests immediately
- Use only the heap
- Align blocks
- Do not modify allocated blocks
Goals:
- Maximize throughput (throughput: the number of requests completed per unit time)
- Maximize memory utilization, measured as peak utilization
IV. Fragmentation
Fragmentation occurs when otherwise unused memory is not available to satisfy allocation requests.
Internal fragmentation
Occurs when an allocated block is larger than its payload.
Easy to quantify.
External fragmentation
Occurs when there is enough aggregate free memory to satisfy an allocation request, but no single free block is large enough to handle it.
Difficult to quantify and unpredictable.
V. Implementation issues
To strike a balance between throughput and utilization, a practical allocator must consider:
Free block organization, placement, splitting, and coalescing.
VI. Implicit free lists
Heap block format: a one-word header, the payload, and possibly some additional padding.
The heap is organized as a sequence of contiguous allocated and free blocks.
The free blocks are linked implicitly by the size fields in the headers. The allocator can traverse the entire set of free blocks indirectly by traversing all the blocks in the heap.
Required: a specially marked end block.
The system's alignment requirement and the allocator's choice of block format impose a minimum block size on the allocator.
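To make the effect of header and alignment on block size concrete, the sketch below computes the block size for a given payload. It assumes a 4-byte header and 8-byte (double-word) alignment; both constants and the function names are illustrative.

```c
#include <assert.h>
#include <stddef.h>

#define HEADER_SIZE 4u   /* assumed: one 4-byte header word */
#define ALIGNMENT   8u   /* assumed: double-word alignment */

/* Total block size needed for a payload: header plus payload, rounded up. */
size_t block_size(size_t payload)
{
    assert(payload > 0);
    size_t raw = payload + HEADER_SIZE;
    return (raw + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
}

/* Bytes in the block that do not hold payload (header and padding). */
size_t internal_fragmentation(size_t payload)
{
    return block_size(payload) - payload;
}
```

Under these assumptions a 5-byte request occupies a 16-byte block, so 11 bytes are lost to the header and padding, and no block can be smaller than 8 bytes.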
VII. Placing allocated blocks: the placement policy
First fit
Search the free list from the beginning and choose the first free block that fits.
Next fit
Start each search where the previous search left off.
Best fit
Examine every free block and choose the smallest free block that fits the requested size.
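The three policies differ only in where the search starts and when it stops. The sketch below runs them over an array of block records instead of a real heap; the block_t type and the rover parameter are assumptions made to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { size_t size; int allocated; } block_t;  /* hypothetical block record */

/* First fit: index of the first free block that is large enough, or -1. */
int first_fit(const block_t *blocks, size_t n, size_t want)
{
    for (size_t i = 0; i < n; i++)
        if (!blocks[i].allocated && blocks[i].size >= want)
            return (int)i;
    return -1;
}

/* Next fit: like first fit, but start at *rover, where the last search ended. */
int next_fit(const block_t *blocks, size_t n, size_t want, size_t *rover)
{
    for (size_t k = 0; k < n; k++) {
        size_t i = (*rover + k) % n;
        if (!blocks[i].allocated && blocks[i].size >= want) {
            *rover = i;
            return (int)i;
        }
    }
    return -1;
}

/* Best fit: index of the smallest free block that is large enough, or -1. */
int best_fit(const block_t *blocks, size_t n, size_t want)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (blocks[i].allocated || blocks[i].size < want)
            continue;
        if (best < 0 || blocks[i].size < blocks[(size_t)best].size)
            best = (int)i;
    }
    return best;
}
```

First fit and next fit can stop at the first match, while best fit must always scan every block, which is the usual trade-off between throughput and utilization.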
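VIII. Splitting free blocks
Once the allocator has found a free block that fits, it can either use the whole block or split it into an allocated block and a new free block.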
IX. Getting additional heap memory
Use the sbrk function:
#include <unistd.h>
void *sbrk(intptr_t incr);
Returns: the old brk pointer on success, -1 on error.
sbrk grows or shrinks the heap by adding incr to the kernel's brk pointer.
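A hedged sketch of wrapping sbrk so that callers see NULL instead of (void *)-1 on failure. The wrapper name grow_heap is hypothetical, and the example assumes a Linux system where sbrk is still available; it is a legacy interface on some platforms.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* exposes sbrk on glibc */
#endif
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Grow the heap by incr bytes. Returns the start of the new region
 * (the old brk), or NULL if the kernel refuses the request. */
void *grow_heap(intptr_t incr)
{
    assert(incr > 0);
    void *old_brk = sbrk(incr);
    if (old_brk == (void *)-1)
        return NULL;
    return old_brk;
}
```

Calling sbrk(0) returns the current brk without changing it, which is a convenient way to check where the heap currently ends.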
X. Coalescing free blocks
Coalescing combats false fragmentation, where adjacent free blocks chop free memory into pieces too small to use. Any practical allocator must merge adjacent free blocks.
There are two strategies:
- Immediate coalescing
- Deferred coalescing
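XI. Coalescing with boundary tags
Merging with the next block is simple because its header is easy to find, but merging with the previous block is awkward. A footer, which is a copy of the header, is therefore added at the end of each block so that the previous block can be located and merged in constant time. There are four cases:
- The previous and next blocks are both allocated.
- The previous block is allocated and the next block is free.
- The previous block is free and the next block is allocated.
- The previous and next blocks are both free.
A free block always needs a footer.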
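The four coalescing cases can be sketched on a heap modeled as an array of 32-bit words. As simplifying assumptions, sizes are counted in words rather than bytes, the allocated flag lives in the low bit of each tag, and an allocated sentinel word sits at each end of the heap so the edge cases need no special handling. All function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tag word: block size in words (header and footer included) and an allocated bit. */
static uint32_t pack(uint32_t size, uint32_t alloc) { return (size << 1) | alloc; }
static uint32_t tag_size(uint32_t tag)  { return tag >> 1; }
static uint32_t tag_alloc(uint32_t tag) { return tag & 1u; }

/* Write matching header and footer for the block whose header is at index h. */
static void set_block(uint32_t *heap, size_t h, uint32_t size, uint32_t alloc)
{
    heap[h] = pack(size, alloc);
    heap[h + size - 1] = pack(size, alloc);
}

/* Free the block at header index h and merge it with any free neighbors.
 * Assumes allocated sentinel words before the first and after the last block.
 * Returns the header index of the resulting free block. */
size_t free_and_coalesce(uint32_t *heap, size_t h)
{
    uint32_t size = tag_size(heap[h]);
    uint32_t prev_tag = heap[h - 1];          /* footer of the previous block */
    uint32_t next_tag = heap[h + size];       /* header of the next block */

    if (!tag_alloc(next_tag))                 /* next block free: absorb it */
        size += tag_size(next_tag);
    if (!tag_alloc(prev_tag)) {               /* previous block free: absorb it */
        h -= tag_size(prev_tag);
        size += tag_size(prev_tag);
    }
    set_block(heap, h, size, 0);              /* both allocated: size is unchanged */
    return h;
}
```

Each neighbor check is a single word read, so freeing and coalescing take constant time regardless of heap size.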
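XII. Implementing a simple allocator
Notes:
- Prologue and epilogue blocks: the prologue block is created during initialization and is never freed; the epilogue block is a special block that always ends the heap.
- A useful trick is to define macros for operations that are repeated often and are intricate, which makes them easier to use and to change.
- Take care with casts, especially those involving pointers, which get complicated.
- Because alignment is specified as double word, block sizes are always a multiple of a double word; sizes that are not are rounded up.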
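XIII. Explicit free lists
Differences from implicit free lists:
(1) Allocation time
With an implicit list, allocation time is linear in the total number of blocks. With an explicit list, it is linear in the number of free blocks.
(2) List structure
Implicit: an implicit free list. Explicit: a doubly linked list with predecessor and successor pointers, which is more convenient than headers and footers alone.
Ordering policies:
- Last-in first-out (LIFO)
- Address order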
XIV. Segregated free lists
Segregated storage is a popular approach for reducing allocation time. The general idea is to partition the set of all possible block sizes into equivalence classes called size classes.
The allocator maintains an array of free lists, one per size class, ordered by increasing size.
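One common way to define size classes is by powers of two. The sketch below maps a request size to an index into the array of free lists; the class boundaries and the number of classes are assumptions, and other partitions are equally valid.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_CLASSES 16u   /* assumed number of free lists */

/* Map a request size to a size class using power-of-two classes:
 * class 0 = {1}, class 1 = {2}, class 2 = {3-4}, class 3 = {5-8}, ...
 * Sizes beyond the last boundary all fall into the final class. */
size_t size_class(size_t size)
{
    assert(size > 0);
    size_t cls = 0;
    size_t limit = 1;                 /* largest size held by class cls */
    while (limit < size && cls < NUM_CLASSES - 1) {
        limit <<= 1;
        cls++;
    }
    return cls;
}
```

An allocator would start its search in the list at index size_class(n) and move to higher classes only if that list has no suitable block.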
There are two basic approaches:
1. Simple segregated storage
The free list for each size class contains same-size blocks, each the size of the largest element of the size class.
(1) Operation
If the list is not empty: allocate the first block in its entirety.
If the list is empty: the allocator requests a fixed-size chunk of additional memory from the operating system, divides the chunk into equal-size blocks, and links them together to form the new free list.
(2) Advantages and disadvantages
Advantages: fast, with low overhead.
Disadvantages: susceptible to internal and external fragmentation.
2. Segregated fits
Each free list is associated with a size class and is organized as some kind of explicit or implicit list. Each list contains blocks of potentially different sizes that are members of the size class.
This approach is fast and uses memory efficiently.
3. Buddy systems: a special case of segregated fits
Each size class is a power of 2.
Given the address and size of a block, it is therefore easy to compute the address of its buddy: the addresses of a block and its buddy differ in exactly one bit position.
Advantages: fast searching and fast coalescing.
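Because a block and its buddy differ in exactly one address bit, the buddy can be found with a single XOR. The sketch below works on offsets from the start of the heap, which is an assumption that sidesteps the alignment of the heap base itself; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Offset of the buddy of a block, measured from the start of the heap.
 * size must be a power of two and offset must be a multiple of size. */
uintptr_t buddy_of(uintptr_t offset, uintptr_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);   /* power of two */
    assert(offset % size == 0);                      /* block is size-aligned */
    return offset ^ size;                            /* flip the one differing bit */
}
```

For 32-byte blocks, the block at offset 64 has its buddy at offset 96, and applying the function twice returns the original offset.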
9.10 Garbage collection
A garbage collector is a dynamic storage allocator that automatically frees allocated blocks that the program no longer needs. These blocks are called garbage, and the process of automatically reclaiming heap storage is called garbage collection.
I. Basics
A garbage collector views memory as a directed reachability graph. A node p is reachable if there is a directed path from any root node to p. Unreachable nodes are garbage.
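II. Mark&Sweep garbage collectors
There are two phases:
- Mark: mark all reachable and allocated descendants of the root nodes.
- Sweep: free each unmarked allocated block.
Related functions:
ptr is defined as typedef void *ptr.
- ptr isPtr(ptr p): if p points to some word in an allocated block, returns a pointer b to the beginning of that block; otherwise returns NULL.
- int blockMarked(ptr b): returns true if block b is already marked.
- int blockAllocated(ptr b): returns true if block b is allocated.
- void markBlock(ptr b): marks block b.
- int length(ptr b): returns the length in words (excluding the header) of block b.
- void unmarkBlock(ptr b): changes the status of block b from marked to unmarked.
- ptr nextBlock(ptr b): returns the successor of block b in the heap.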
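The mark and sweep phases can be sketched on a toy heap in which blocks refer to each other by array index rather than by pointer. The block_t layout and the fixed limit of two references per block are assumptions that keep the example short; this is not the pointer-scanning collector described by the functions listed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_REFS 2

/* Hypothetical heap block: an allocated flag, a mark bit, and outgoing references. */
typedef struct {
    bool allocated;
    bool marked;
    int  refs[MAX_REFS];   /* indices of blocks this one points to, -1 if unused */
} block_t;

/* Mark phase: mark block i and everything reachable from it. */
void mark(block_t *heap, size_t n, int i)
{
    if (i < 0 || (size_t)i >= n || !heap[i].allocated || heap[i].marked)
        return;
    heap[i].marked = true;
    for (int k = 0; k < MAX_REFS; k++)
        mark(heap, n, heap[i].refs[k]);
}

/* Sweep phase: free unmarked allocated blocks and clear marks on the rest.
 * Returns the number of blocks freed. */
size_t sweep(block_t *heap, size_t n)
{
    size_t freed = 0;
    for (size_t i = 0; i < n; i++) {
        if (heap[i].marked) {
            heap[i].marked = false;       /* reset for the next collection */
        } else if (heap[i].allocated) {
            heap[i].allocated = false;    /* unreachable: reclaim it */
            freed++;
        }
    }
    return freed;
}
```

The collector calls mark once per root and then sweep once over the whole heap. The marked check in mark is what keeps cyclic references from recursing forever.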
III. Conservative Mark&Sweep for C
The isPtr function can be implemented by maintaining the set of allocated blocks as a balanced binary tree.
A Mark&Sweep collector for C must be conservative. The root cause is that the C language does not tag memory locations with type information.
9.11 Common memory-related bugs in C programs
Dereferencing bad pointers
Common error: the classic scanf bug (passing a value where scanf expects an address)
Reading uninitialized memory
Common error: assuming that heap memory is initialized to zero
Allowing stack buffer overflows
Common error: buffer overflow bugs
Assuming that pointers and the objects they point to are the same size
Symptom: action at a distance, where the failure appears far from the code that caused it
Making off-by-one errors
Referencing a pointer instead of the object it points to
Misunderstanding pointer arithmetic
Referencing nonexistent variables
Referencing data in free heap blocks
Introducing memory leaks
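As one concrete case, the sketch below shows a correct way to allocate an array of row pointers, the situation in which the "pointers and objects are the same size" bug usually appears. The function names are hypothetical. It also uses calloc so that no code depends on uninitialized heap memory, and it frees partial work on failure to avoid a leak.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate an n-by-m matrix of ints as an array of row pointers.
 * Returns NULL on failure. Release it with free_matrix(). */
int **make_matrix(size_t n, size_t m)
{
    /* Correct: the outer array holds pointers, so use sizeof(int *).
     * The bug is writing sizeof(int) here, which under-allocates
     * wherever pointers are wider than int. */
    int **rows = malloc(n * sizeof(int *));
    if (rows == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++) {
        rows[i] = calloc(m, sizeof(int));      /* calloc zeroes; malloc does not */
        if (rows[i] == NULL) {
            while (i > 0)
                free(rows[--i]);               /* undo partial work to avoid a leak */
            free(rows);
            return NULL;
        }
    }
    return rows;
}

/* Free every row and then the array of row pointers. */
void free_matrix(int **rows, size_t n)
{
    for (size_t i = 0; i < n; i++)
        free(rows[i]);
    free(rows);
}
```

After free_matrix returns, the caller must not touch rows again; doing so would be an instance of referencing data in free heap blocks.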