Chapter 9
Virtual Memory
I. Virtual memory provides three key capabilities:
1. It uses main memory as a cache for an address space stored on disk, keeping only the active areas in main memory and transferring data back and forth between disk and main memory as needed;
2. It provides each process with a uniform address space, thereby simplifying memory management;
3. It protects the address space of each process from corruption by other processes.
II. Why it is worth understanding virtual memory:
1. Virtual memory is central: it sits at the intersection of hardware exceptions, hardware address translation, main memory, disk files, and kernel software;
2. Virtual memory is powerful: it can create and destroy chunks of memory, map chunks of memory to portions of disk files, and so on;
3. Virtual memory is dangerous if used improperly.
9.1 Physical and Virtual Addressing
I. Physical addressing
Physical addressing (PA): each byte of main memory has a unique physical address; accessing memory using these addresses is called physical addressing.
II. Virtual addressing
Virtual addressing (VA): the CPU generates a virtual address and accesses main memory with it; the virtual address is converted to the appropriate physical address before being sent to memory (this process is called address translation).
9.2 Address Space
I. Address space: an ordered set of non-negative integer addresses: {0, 1, 2, ...}
II. Linear address space: an address space whose integers are consecutive.
III. Virtual address space: the CPU generates virtual addresses from an address space of N = 2^n addresses, called the virtual address space.
IV. Size of an address space: described by the number of bits needed to represent the largest address. An address space with N = 2^n addresses is an n-bit address space.
Each byte of main memory has a virtual address chosen from the virtual address space and a physical address chosen from the physical address space.
9.3 Virtual Memory as a Tool for Caching
I. A virtual memory is organized as an array of N contiguous byte-size cells stored on disk. Each byte has a unique virtual address that serves as an index into the array.
II. The VM system partitions the virtual memory into fixed-size blocks called virtual pages, each P = 2^p bytes in size; similarly, physical memory is partitioned into physical pages (also called page frames), also P bytes in size.
At any point in time, the set of virtual pages is partitioned into three disjoint subsets:
1. Unallocated: pages the VM system has not yet allocated (created); they occupy no disk space.
2. Cached: allocated pages that are currently cached in physical memory.
3. Uncached: allocated pages that are not cached in physical memory.
III. Page tables
1. Role: map virtual pages to physical pages. Each time the address translation hardware converts a virtual address to a physical address, it reads the page table. The operating system is responsible for maintaining the contents of the page table.
2. Structure: the page table is an array of page table entries (PTEs); each page in the virtual address space has a PTE at a fixed offset in the page table. For our purposes, assume each PTE consists of a valid bit and an n-bit address field. The valid bit indicates whether the virtual page is currently cached in DRAM; if it is set, the address field gives the start of the physical page where the virtual page is cached.
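As an aside, a toy model of such a PTE in C might look as follows. This layout is an illustrative assumption, not the format of any real MMU:

```c
#include <stdint.h>

/* A toy page-table entry: one valid bit plus a physical page number.
 * This bitfield layout is an illustrative assumption, not a real
 * hardware format. */
typedef struct {
    uint64_t valid : 1;   /* 1 if the virtual page is cached in DRAM       */
    uint64_t ppn   : 40;  /* start (page number) of the caching phys. page */
} pte_t;
```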
IV. Page faults
1. A DRAM cache miss is called a page fault.
2. Handling process:
- The page fault causes the kernel to invoke the page fault handler, which selects a victim page and swaps it out of memory;
- The kernel copies the required page from disk into the slot formerly occupied by the victim page, then returns;
- When the exception handler returns, it restarts the faulting instruction, which resends the faulting virtual address to the address translation hardware;
- This time, the page hits.
3. Supplementary terminology:
- In virtual memory parlance, blocks are called pages;
- The activity of transferring pages between disk and memory is called swapping or paging;
- Pages are swapped in to DRAM from disk and swapped out of DRAM to disk; waiting until a miss occurs before swapping in a page is a strategy called demand paging.
V. Locality in virtual memory
1. The principle of locality ensures that at any point in time a program tends to work on a smaller set of active pages, called the working set (or resident set).
2. Thrashing: occurs when the working set size exceeds the size of physical memory.
9.4 Virtual Memory as a Tool for Memory Management
1. The operating system provides each process with a separate page table, and thus a separate virtual address space.
2. Multiple virtual pages can be mapped to the same shared physical page.
3. Memory mapping: the notion of mapping a contiguous set of virtual pages to an arbitrary location in an arbitrary file.
VM simplifies linking and loading, sharing of code and data, and memory allocation for applications.
9.5 Virtual Memory as a Tool for Memory Protection
Three permission bits in a PTE:
1. SUP: indicates whether the process must be running in kernel (supervisor) mode to access the page
2. READ: read permission
3. WRITE: write permission
9.6 Address Translation
I. Address translation: formally, address translation is a mapping between the elements of an N-element virtual address space (VAS) and an M-element physical address space (PAS);
II. The process:
- A control register in the CPU, the page table base register (PTBR), points to the current page table;
- An n-bit virtual address has two components: a p-bit virtual page offset (VPO) and an (n-p)-bit virtual page number (VPN);
- The MMU uses the VPN to select the appropriate PTE, then concatenates the physical page number (PPN) from the PTE with the VPO from the virtual address to obtain the physical address;
- Because both physical and virtual pages are P bytes, the physical page offset (PPO) is identical to the VPO (a sketch of this arithmetic follows this list).
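A minimal sketch of that VPN/VPO arithmetic, assuming 4 KiB pages and a flat single-level table (the names P_BITS, ppn_of, and translate are illustrative, not from the text):

```c
#include <stdint.h>

#define P_BITS   12                      /* assume 4 KiB pages: P = 2^12 */
#define VPO_MASK ((1ULL << P_BITS) - 1)

/* Toy single-level translation: ppn_of[vpn] holds the physical page
 * number for each virtual page (validity checks omitted). */
uint64_t translate(const uint64_t *ppn_of, uint64_t va) {
    uint64_t vpn = va >> P_BITS;     /* upper (n - p) bits: page number   */
    uint64_t vpo = va & VPO_MASK;    /* lower p bits: page offset         */
    uint64_t ppn = ppn_of[vpn];      /* the MMU would read this from PTE  */
    return (ppn << P_BITS) | vpo;    /* concatenate PPN with VPO (= PPO)  */
}
```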
III. CPU execution steps (page hit)
- The processor generates a virtual address and passes it to the MMU;
- The MMU generates a PTE address and requests it from the cache/main memory;
- The cache/main memory returns the PTE to the MMU;
- The MMU constructs the physical address and passes it to the cache/main memory;
- The cache/main memory returns the requested data word to the processor.
IV. CPU execution steps (page fault)
- The processor generates a virtual address and passes it to the MMU;
- The MMU generates a PTE address and requests it from the cache/main memory;
- The cache/main memory returns the PTE to the MMU;
- The valid bit in the PTE is 0, so the MMU triggers an exception, transferring control in the CPU to the page fault handler in the operating system kernel;
- The page fault handler identifies a victim page in physical memory and, if that page has been modified, swaps it out to disk;
- The page fault handler pages in the new page and updates the PTE in memory;
- The page fault handler returns to the original process, re-executing the faulting instruction. The CPU resends the faulting virtual address to the MMU; because the virtual page is now cached in physical memory, a hit occurs.
9.7 Case Studies
I. Core i7 address translation
PTEs have three privilege bits:
1. R/W bit: determines whether the contents of the page are read-only or read-write
2. U/S bit: determines whether the page can be accessed in user mode
3. XD bit: the execute-disable bit, introduced in 64-bit systems; it can be used to disable instruction fetches from certain memory pages
There are also two bits involved in fault handling:
1. A bit (reference bit): used by the kernel to implement its page replacement algorithm
2. D bit (dirty bit): tells the kernel whether a victim page must be written back before it is evicted
II. Linux virtual memory system
1. Linux maintains a separate virtual address space for each process, in which the kernel virtual memory lies above the user stack;
2. Kernel virtual memory contains the code and data structures of the kernel; some of it is mapped to a contiguous set of physical pages (mainly for convenient access to specific locations, for example where the kernel needs to perform I/O operations);
3. Linux organizes the virtual memory as a collection of areas (also called segments). An area is a contiguous chunk of existing (allocated) virtual memory;
(1) Significance: this allows the virtual address space to have gaps; the kernel need not keep track of pages that do not exist, so such pages occupy no memory;
(2) Area structure (see the sketch after this list):
vm_start: points to the beginning of the area
vm_end: points to the end of the area
vm_prot: describes the read/write permissions for all pages contained in the area
vm_flags: whether the pages in the area are shared or private
vm_next: points to the next area
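A simplified sketch of these fields, loosely modeled on the kernel's struct vm_area_struct (the real structure carries many more members):

```c
/* Simplified area descriptor, loosely modeled on Linux's
 * struct vm_area_struct; real kernels have many more fields. */
struct area {
    unsigned long vm_start;  /* first address within the area        */
    unsigned long vm_end;    /* first address after the area         */
    unsigned long vm_prot;   /* read/write permissions for its pages */
    unsigned long vm_flags;  /* shared or private, among other flags */
    struct area  *vm_next;   /* next area in the list                */
};
```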
9.8 Memory Mapping
I. Concept: Linux (and some other forms of Unix) initializes the contents of a virtual memory area by associating it with an object on disk;
II. Types:
1. Regular files in the Unix file system: an area can be mapped to a contiguous section of a regular disk file, such as an executable object file;
2. Anonymous files: an area can also be mapped to an anonymous file, created by the kernel, that contains all binary zeros (the first time the CPU touches a virtual page in such an area, the kernel finds an appropriate victim page in physical memory and overwrites it with binary zeros)
III. Shared objects and private objects
1. Shared objects
(1) A shared object is visible to all processes that map it into their virtual memory
(2) Even if it is mapped into multiple shared areas, only one copy of the shared object needs to be stored in physical memory.
2. Private objects
(1) Technique used for private objects: copy-on-write
(2) Only one copy of a private object is stored in physical memory
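A small usage sketch of these two mapping types with the standard POSIX mmap call (the file name "data.bin" is hypothetical; error handling is abbreviated):

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    /* Private file mapping: writes are copy-on-write, so the file on
     * disk ("data.bin" is a hypothetical name) is never modified. */
    int fd = open("data.bin", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }
    char *priv = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE, fd, 0);

    /* Anonymous mapping: a demand-zero area with no backing file. */
    char *anon = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (priv == MAP_FAILED || anon == MAP_FAILED) {
        perror("mmap"); return 1;
    }

    anon[0] = 'x';    /* first touch faults in a zero-filled page */
    munmap(priv, 4096);
    munmap(anon, 4096);
    close(fd);
    return 0;
}
```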
9.9 Dynamic Memory Allocation
I. The heap: an area of demand-zero memory that begins immediately after the uninitialized data (bss) area and grows upward (toward higher addresses). A variable brk points to the top of the heap.
II. Two basic styles of allocators:
1. Explicit allocators: malloc and free
2. Implicit allocators, also known as garbage collectors
III. Why use dynamic memory allocation?
Because programs often do not know the sizes of certain data structures until the program actually runs.
IV. Requirements and goals of an allocator:
1. Requirements:
(1) Handle arbitrary request sequences
(2) Respond immediately to requests
(3) Use only the heap
(4) Align blocks
(5) Do not modify allocated blocks
2. Goals:
(1) Maximize throughput (throughput: the number of requests completed per unit of time)
(2) Maximize memory utilization, specifically peak utilization
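In the textbook's notation: if $P_i$ denotes the aggregate payload after request $i$ and $H_k$ the heap size after request $k$, then the peak utilization over the first $k$ requests is

$$U_k = \frac{\max_{i \le k} P_i}{H_k}$$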
V. Fragmentation
Fragmentation occurs when otherwise unused memory cannot be used to satisfy allocation requests.
1. Internal fragmentation
(1) Occurs when an allocated block is larger than its payload
(2) Easy to quantify.
2. External fragmentation
(1) Occurs when there is enough aggregate free memory to satisfy an allocation request, but no single free block is large enough to handle the request.
(2) Difficult to quantify and unpredictable.
VI. Implicit free lists
Heap block format: consists of a one-word header, the payload, and possibly some additional padding.
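A sketch of the header encoding in the style of the CS:APP textbook allocator, where a block size (a multiple of the alignment) and an allocated bit are packed into one word:

```c
/* Header encoding in the style of the CS:APP textbook allocator:
 * the block size and an allocated bit share one 4-byte word. */
#define WSIZE  4                              /* word / header size (bytes) */
#define PACK(size, alloc)  ((size) | (alloc))
#define GET(p)        (*(unsigned int *)(p))
#define PUT(p, val)   (*(unsigned int *)(p) = (val))
#define GET_SIZE(p)   (GET(p) & ~0x7)         /* size lives in upper bits   */
#define GET_ALLOC(p)  (GET(p) & 0x1)          /* allocated bit: lowest bit  */
```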
VII. Placing allocated blocks: placement policies
1. First fit
Search the free list from the beginning and choose the first free block that fits (see the sketch after this list)
2. Next fit
Start the search where the previous search left off
3. Best fit
Examine every free block and choose the smallest free block that fits the requested size
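As a sketch, first fit over an implicit free list might look like this, reusing the header macros from the sketch in section VI; HDRP, NEXT_BLKP, and heap_listp are textbook-style helper names assumed here:

```c
#include <stddef.h>

#define HDRP(bp)       ((char *)(bp) - WSIZE)               /* block header */
#define NEXT_BLKP(bp)  ((char *)(bp) + GET_SIZE(HDRP(bp)))  /* next block   */

static char *heap_listp;   /* payload of the first block after the prologue */

/* First fit: scan the implicit list from the start; a header whose
 * size field is 0 (the epilogue block) terminates the search. */
static void *find_fit(size_t asize) {
    void *bp;
    for (bp = heap_listp; GET_SIZE(HDRP(bp)) > 0; bp = NEXT_BLKP(bp)) {
        if (!GET_ALLOC(HDRP(bp)) && GET_SIZE(HDRP(bp)) >= asize)
            return bp;     /* first free block that is big enough */
    }
    return NULL;           /* no fit: caller must extend the heap */
}
```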
VIII. Requesting additional heap memory
Use the sbrk function:
#include <unistd.h>
void *sbrk(intptr_t incr);
Returns the old brk pointer on success, or -1 on error.
It expands or shrinks the heap by adding incr to the kernel's brk pointer.
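A minimal usage sketch (real allocators would go on to format the newly acquired space as a free block):

```c
#include <stdio.h>
#include <unistd.h>

int main(void) {
    void *old_brk = sbrk(4096);      /* grow the heap by 4 KiB */
    if (old_brk == (void *)-1) {
        perror("sbrk");
        return 1;
    }
    /* old_brk is the previous break: the new 4 KiB begins here. */
    printf("new space begins at %p\n", old_brk);
    sbrk(-4096);                     /* shrink the heap back    */
    return 0;
}
```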
IX. Coalescing free blocks
Coalescing addresses the problem of false fragmentation: any practical allocator must merge adjacent free blocks.
There are two strategies:
1. Immediate coalescing
2. Deferred coalescing
X. Implementing a simple allocator
A few points to note when implementing a simple allocator design:
1. Prologue block and epilogue block: the prologue block is created at initialization and never freed; the epilogue block is a special block that always marks the end of the heap.
2. A useful technique: operations that are complex, repetitive, and heavily reused can be defined as macros, which makes them both easy to use and easy to modify.
3. Be careful with type casts, especially those involving pointers; they can get quite intricate.
4. Because the specified alignment is double words, block sizes must be multiples of a double word; sizes that are not are rounded up.
XI. Explicit free lists
1. Differences from implicit lists
(1) Allocation time
Implicit: allocation time is linear in the total number of blocks
Explicit: allocation time is linear in the number of free blocks
(2) List structure
Implicit: the implicit free list itself
Explicit: a doubly linked list with predecessor and successor pointers, which works better than headers and footers alone (see the sketch after this list)
2. Ordering policies:
Last-in first-out (LIFO)
Address order
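A sketch of a free block carrying those predecessor and successor pointers; they live in the payload area, which is unused while the block is free (field names are illustrative):

```c
#include <stddef.h>

/* Free-block layout for an explicit free list: the list pointers
 * occupy the payload, which is unused while the block is free. */
struct free_block {
    size_t             header;  /* size and allocated bit              */
    struct free_block *pred;    /* previous free block in the list     */
    struct free_block *succ;    /* next free block in the list         */
    /* ... remainder of payload, then footer (boundary tag) ...        */
};
```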
XII. Segregated free lists
Segregated storage is a popular way to reduce allocation time. The general idea is to partition the set of all possible block sizes into equivalence classes called size classes.
The allocator maintains an array of free lists, one per size class, ordered by increasing size.
There are two basic approaches:
1. Simple segregated storage
The free list for each size class contains blocks of equal size, each the size of the largest element of the size class.
2. Segregated fits
Each free list is associated with a size class and organized as some kind of explicit or implicit list; each list contains potentially different-sized blocks whose sizes are members of the size class.
This method is fast and memory-efficient.
A special case of segregated fits: the buddy system
Each size class is a power of 2. Thus, given the address and size of a block, it is easy to compute the address of its buddy; that is, the address of a block and the address of its buddy differ in exactly one bit.
Advantages: fast searching and fast coalescing.
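A sketch of that buddy computation: because block sizes are powers of 2 and blocks are aligned to their own size, flipping the single differing bit yields the buddy's address (buddy_of is an illustrative name; addr is taken relative to the start of the managed pool):

```c
#include <stddef.h>
#include <stdint.h>

/* Address of a block's buddy in a buddy-system allocator.
 * addr is the block's offset within the managed pool and must be
 * aligned to size; size must be a power of 2. */
static inline uintptr_t buddy_of(uintptr_t addr, size_t size) {
    return addr ^ (uintptr_t)size;   /* flip the single differing bit */
}
```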
9.10 Garbage Collection
I. A garbage collector is a dynamic memory allocator that automatically frees allocated blocks the program no longer needs; such blocks are called garbage, and the process of automatically reclaiming heap storage is called garbage collection. The collector periodically identifies garbage blocks and calls free on them, returning those blocks to the free list.
II. The garbage collector views memory as a directed reachability graph: a node p is reachable only if there is a path from some root node to p, and unreachable nodes are garbage.
9.11 Common Memory-Related Errors in C Programs
1. Dereferencing a bad pointer
2. Reading uninitialized memory
3. Allowing stack buffer overflows
4. Assuming that pointers and the objects they point to are the same size
5. Making off-by-one errors
6. Referencing a pointer instead of the object it points to
7. Misunderstanding pointer arithmetic
8. Referencing nonexistent variables
9. Referencing data in free heap blocks
10. Introducing memory leaks
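Two of these errors illustrated with a minimal sketch (the buggy forms are shown only in comments; the code itself is the corrected version):

```c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int val;
    /* Error 1 above: scanf("%d", val) would dereference a bad pointer;
     * the correct call passes the address of val. */
    if (scanf("%d", &val) != 1)
        return 1;

    /* Error 10 above: dropping the last pointer to a malloc'd block
     * leaks it; every allocation needs a matching free. */
    int *buf = malloc(100 * sizeof(int));
    if (buf == NULL)
        return 1;
    buf[0] = val;
    free(buf);     /* without this line, the block would leak */
    return 0;
}
```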
Learning Progress Bar

| | Lines of code (new/cumulative) | Blog posts (new/cumulative) | Study hours (new/cumulative) | Important growth |
| --- | --- | --- | --- | --- |
| Goal | 5000 lines | 30 articles | 400 hours | |
| Week 1 | 0/0 | 1/2 | 20/20 | Learned about virtual machine installation and basic Ubuntu operations |
| Week 2 | 56/56 | 1/3 | 20/40 | Learned to write C under the Ubuntu terminal |
| Week 3 | 110/166 | 1/4 | 30/70 | Became familiar with basic GDB operations; learned how computers represent and process information |
| Week 4 | 0/166 | 1/5 | 10/80 | Reviewed the material from the previous weeks |
| Week 5 | 42/208 | 2/6 | 30/110 | Learned assembly language under Linux |
| Week 6 | 216/424 | 1/7 | 30/140 | The Y86 instruction set under Linux |
| Week 7 | 71/495 | 1/8 | 20/160 | Learned the principle of locality and applications of the caching idea |
| Week 8 | 0/495 | 2/10 | 20/180 | Reviewed and summarized the earlier material |
| Week 9 | 133/628 | 2/12 | 20/200 | Learned system-level I/O and the related built-in functions |
| Week 10 | 407/1035 | 1/13 | 30/230 | Analyzed and debugged code; system-level I/O content |
| Week 11 | 714/1749 | 2/15 | 40/270 | Learned about exceptions and their types |
| Week 12 | 0/1749 | 4/19 | 30/300 | Reviewed the contents of the previous weeks |
| Week 13 | 797/2728 | 2/21 | 20/320 | Learned the basics of multithreading and concurrent programming |
| Week 14 | 0/2728 | 1/22 | 15/335 | Virtual memory |
20145235 "Fundamentals of Information Security System Design" Week 14 Study Summary