20135306 Week 14 Study Summary

Last Update:2015-12-13 Source: Internet

Author: User

Tags intel core i7


Chapter 9: Virtual Memory

Virtual memory is one of the most important concepts in computer systems.

Virtual memory (VM) is an elegant interaction of hardware exceptions, hardware address translation, main memory, disk files, and kernel software that provides each process with a large, uniform, and private address space. With one clean mechanism, virtual memory provides three important capabilities:

It treats main memory as a cache for an address space stored on disk, keeping only the active areas in main memory and transferring data back and forth between disk and main memory as needed. In this way it uses main memory efficiently.
It provides each process with a uniform address space, which simplifies memory management.
It protects the address space of each process from corruption by other processes.

9.1 Physical and Virtual Addressing

1. Physical Address: The main memory of a computer system is organized as an array of M contiguous byte-sized cells. Each byte has a unique physical address. The first byte has address 0, the next byte address 1, the next 2, and so on. Given this simple organization, the most natural way for the CPU to access memory is to use physical addresses.

2. Virtual Address: Early PCs used physical addressing, and systems such as digital signal processors, embedded microcontrollers, and Cray supercomputers continue to use it. Modern processors designed for general-purpose computing use virtual addressing. With virtual addressing, the CPU accesses main memory by generating a virtual address, which is converted to the appropriate physical address before being sent to memory. This conversion is called address translation, and the dedicated hardware that performs it is the memory management unit (MMU).

9.2 Address Spaces

Address space: An ordered set of non-negative integer addresses.

Linear address space: An address space whose integers are consecutive.

Virtual address space: In a system with virtual memory, the CPU generates virtual addresses from an address space of N = 2^n addresses, called the virtual address space.

Size: The size of an address space is described by the number of bits needed to represent the largest address.
Modern systems typically support 32-bit or 64-bit virtual address spaces.

The concept of an address space makes a clean distinction between data objects (bytes) and their attributes (addresses).

The basic idea of virtual memory: each data object may have multiple independent addresses, each chosen from a different address space. Each byte of main memory has a virtual address chosen from the virtual address space and a physical address chosen from the physical address space.

9.3 Virtual Memory as a Tool for Caching

Conceptually, a virtual memory is organized as an array of N contiguous byte-sized cells stored on disk. Each byte has a unique virtual address that serves as an index into the array.

The VM system partitions virtual memory into fixed-size blocks called virtual pages, which serve as the transfer units between disk and main memory. Each virtual page is P = 2^p bytes in size. Physical memory is likewise partitioned into physical pages, also called page frames, of P bytes each.

At any point in time, the set of virtual pages is partitioned into three disjoint subsets:

Unallocated
Cached
Uncached

The organization of the DRAM cache is driven by the enormous cost of misses:

Fully associative: any virtual page can be placed in any physical page.
Because disk access time is so long, write-back is always used instead of write-through.
Replacement algorithms are more sophisticated and precise.

Page table:

An array of page table entries (PTEs).
It maps virtual pages to physical pages.

Each page in the virtual address space has a PTE at a fixed offset in the page table.

Assume that each PTE consists of a valid bit and an n-bit address field. The valid bit indicates whether the virtual page is currently cached in DRAM.

If the valid bit is set, the address field gives the start of the physical page in DRAM where the virtual page is cached.
If the valid bit is not set, a null address indicates that the virtual page has not been allocated; otherwise the address points to the start of the virtual page on disk.

1. Organization of the DRAM Cache

The miss penalty is very large.
The cache is fully associative: any virtual page can be placed in any physical page.
The replacement algorithm must be precise.
Write-back is always used instead of write-through.

2. Page Tables

A page table is a data structure stored in physical memory that maps virtual pages to physical pages.

The page table is an array of page table entries (PTEs). Each page in the virtual address space has a PTE at a fixed offset in the page table.

In the example page table:

NULL: Not allocated.
VP3, VP5: Allocated, but not yet cached.
VP1: Allocated and cached.
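The three PTE states can be told apart with a lookup like the following. This is a minimal sketch: the `pte_t` layout, the use of 0 as the null address, and the names `page_state` and `page_state_t` are assumptions made for illustration, not a real MMU or kernel format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical PTE layout for illustration: a valid bit plus an address field. */
typedef struct {
    int valid;       /* 1: the page is cached in DRAM */
    uint64_t addr;   /* valid: physical page number; otherwise disk address, 0 = null */
} pte_t;

typedef enum { PAGE_UNALLOCATED, PAGE_UNCACHED, PAGE_CACHED } page_state_t;

/* Classify the virtual page whose PTE sits at index vpn in the page table. */
page_state_t page_state(const pte_t *table, size_t npages, size_t vpn)
{
    assert(vpn < npages);             /* each virtual page has exactly one PTE */
    const pte_t *pte = &table[vpn];   /* fixed offset: the VPN indexes the array */
    if (pte->valid)
        return PAGE_CACHED;
    return pte->addr == 0 ? PAGE_UNALLOCATED : PAGE_UNCACHED;
}
```

A reference to a page in the `PAGE_UNCACHED` state is the case that leads to a page fault.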

3. Page Hits

Page hit: the referenced virtual page is cached in DRAM, so the access is served from memory. The related terms are:

Page fault: a DRAM cache miss.
Page fault exception: invokes the page fault handler in the kernel, which selects a victim page.
Page: the term virtual memory uses for a block.
Swapping (paging): the activity of transferring pages between disk and memory.
Demand paging: the policy of waiting until a miss occurs before swapping in a page. All modern systems use it.

4. Page Faults

A DRAM cache miss is called a page fault.

The principle of locality ensures that at any point in time a program tends to work on a smaller set of active pages called the working set or resident set.

Thrashing: occurs when the size of the working set exceeds the size of physical memory, so that pages are swapped in and out continuously.

9.4 Virtual Memory as a Tool for Memory Management

The operating system provides a separate page table, and therefore a separate virtual address space, for each process.

The combination of demand paging and separate virtual address spaces has a profound impact on how memory is used and managed in a system. In particular, VM simplifies linking and loading, the sharing of code and data, and memory allocation for applications.

Simplifying linking: A separate address space allows each process to use the same basic format for its memory image, regardless of where the code and data actually reside in physical memory. This uniformity greatly simplifies the design and implementation of linkers, allowing them to produce fully linked executables that are independent of the final location of the code and data in physical memory.
Simplifying loading: Virtual memory makes it easy to load executables and shared object files into memory. The loader allocates a contiguous chunk of virtual pages for the code and data segments, marks them as invalid (not cached), and points their page table entries to the appropriate locations in the object file.
Simplifying sharing: Through the per-process page tables, the operating system maps each process's private code and data to different physical pages. For shared code and data, it maps the appropriate virtual pages in different processes to the same physical pages, so that multiple processes share a single copy.
Simplifying memory allocation: Because of the way page tables work, the operating system only needs to allocate an appropriate number of contiguous virtual pages, which can be mapped to physical pages scattered anywhere in physical memory.
A related concept: the notion of mapping a set of contiguous virtual pages to an arbitrary location in an arbitrary file is known as memory mapping.

9.5 Virtual Memory as a Tool for Memory Protection

Three permission bits in a PTE:

SUP: Indicates whether the process must be running in kernel mode to access the page
READ: Read permission
WRITE: Write permission

Adding permission bits to the PTE controls access to the contents of a virtual page. For example, the SUP bit indicates whether the process must be running in kernel mode to access the page, and the READ and WRITE bits control read and write access. If an instruction violates these permissions, the CPU triggers a general protection fault and transfers control to an exception handler in the kernel.
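The permission check can be sketched as a small predicate. The bit positions and the names `PTE_SUP`, `PTE_READ`, `PTE_WRITE`, and `access_allowed` are hypothetical; real hardware defines its own PTE layout and performs this check in the MMU.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit positions; real hardware defines its own PTE layout. */
enum { PTE_SUP = 1 << 0, PTE_READ = 1 << 1, PTE_WRITE = 1 << 2 };

typedef enum { ACCESS_READ, ACCESS_WRITE } access_t;

/* Return true if the access is permitted; false models a protection fault. */
bool access_allowed(unsigned pte_bits, access_t kind, bool kernel_mode)
{
    if ((pte_bits & PTE_SUP) && !kernel_mode)
        return false;                        /* page requires kernel mode */
    if (kind == ACCESS_READ)
        return (pte_bits & PTE_READ) != 0;   /* READ bit gates loads */
    return (pte_bits & PTE_WRITE) != 0;      /* WRITE bit gates stores */
}
```

For example, a user-mode store to a page whose PTE has only `PTE_READ` set is rejected, which corresponds to the fault described above.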

9.6 Address Translation

1. Address Translation

Address translation is a mapping between the elements of an N-element virtual address space (VAS) and an M-element physical address space (PAS).

2. Page Table Base Register

A control register in the CPU, the page table base register (PTBR), points to the current page table. An n-bit virtual address has two components: a p-bit virtual page offset (VPO) and an (n-p)-bit virtual page number (VPN). The MMU uses the VPN to select the appropriate PTE; for example, VPN 0 selects PTE 0. Because physical and virtual pages are both P bytes, the physical page offset (PPO) is identical to the VPO. The corresponding physical address is therefore the physical page number (PPN) from the page table entry concatenated with the VPO from the virtual address.
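The split and the concatenation are plain shifts and masks. The sketch below assumes p = 12 (4 KB pages); `PAGE_BITS`, `vpn_of`, `vpo_of`, and `make_pa` are illustrative names, not part of any real API.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_BITS 12u   /* assumed p = 12, i.e. 4 KB pages */

/* Virtual page number: the upper n-p bits of the virtual address. */
uint64_t vpn_of(uint64_t va) { return va >> PAGE_BITS; }

/* Virtual page offset: the lower p bits of the virtual address. */
uint64_t vpo_of(uint64_t va) { return va & ((1ull << PAGE_BITS) - 1); }

/* Physical address: the PPN concatenated with the offset (PPO == VPO). */
uint64_t make_pa(uint64_t ppn, uint64_t vpo) { return (ppn << PAGE_BITS) | vpo; }
```

With these definitions, virtual address 0x12345 has VPN 0x12 and VPO 0x345; if the PTE for VPN 0x12 held PPN 0x7, the physical address would be 0x7345.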

3. Page hits are handled entirely by hardware, whereas handling a page fault requires cooperation between the hardware and the OS kernel.

4. Integrating Caches and Virtual Memory

Most systems use physical addresses to access the cache. With physical addressing, it is straightforward for multiple processes to have blocks in the cache at the same time and to share blocks from the same virtual pages. In addition, the cache does not have to deal with protection issues, because access rights are checked as part of address translation.

5. Speeding Up Address Translation with a TLB

The MMU includes a small cache of PTEs called the translation lookaside buffer (TLB). The TLB is a small, virtually addressed cache in which each line holds a block consisting of a single PTE.
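A TLB lookup can be sketched as a tiny cache indexed by bits of the VPN. The example assumes a direct-mapped TLB with 4 sets, where the low bits of the VPN select the set and the remaining bits form the tag; the sizes and the names `tlb_entry_t`, `tlb_lookup`, and `tlb_insert` are illustrative assumptions, and real TLBs are usually set associative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TLB_SET_BITS 2u                  /* assumed t = 2, i.e. 4 sets */
#define TLB_SETS (1u << TLB_SET_BITS)

typedef struct { bool valid; uint64_t tag; uint64_t ppn; } tlb_entry_t;

/* Look up vpn; on a hit store the PPN in *ppn and return true. */
bool tlb_lookup(const tlb_entry_t tlb[TLB_SETS], uint64_t vpn, uint64_t *ppn)
{
    uint64_t index = vpn & (TLB_SETS - 1);   /* low t bits select the set */
    uint64_t tag = vpn >> TLB_SET_BITS;      /* remaining bits are the tag */
    if (tlb[index].valid && tlb[index].tag == tag) {
        *ppn = tlb[index].ppn;
        return true;
    }
    return false;                            /* miss: the PTE must come from memory */
}

/* Install a translation after a miss, replacing whatever was in the set. */
void tlb_insert(tlb_entry_t tlb[TLB_SETS], uint64_t vpn, uint64_t ppn)
{
    uint64_t index = vpn & (TLB_SETS - 1);
    tlb[index] = (tlb_entry_t){ true, vpn >> TLB_SET_BITS, ppn };
}
```

Two VPNs that share the same low bits compete for the same set, which is why the tag comparison is needed before a cached PPN can be trusted.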

6. Multi-Level Page Tables

A multi-level page table is a hierarchy of page tables used to compact the page table.

(1) With a two-level page table hierarchy, the benefits are:

If a PTE in the level 1 table is null, the corresponding level 2 page table does not have to exist at all.

Only the level 1 table needs to be in main memory at all times. The virtual memory system can create level 2 page tables, and page them in or out, as needed, so that only the most heavily used level 2 page tables are kept in main memory.

(2) Address translation with a multi-level page table: the VPN is partitioned into one index per level, and the PTE at each level points to the base of the table at the next level.
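A two-level walk might be sketched as follows. The example assumes a 20-bit VPN split into two 10-bit indices; the table types, the `NO_PAGE` marker, and the function name `walk` are hypothetical and chosen only to show how a missing level 2 table short-circuits the lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LEVEL_BITS 10u               /* assumed: 10 index bits per level */
#define ENTRIES (1u << LEVEL_BITS)
#define NO_PAGE UINT32_MAX           /* hypothetical marker for "not mapped" */

typedef struct { uint32_t ppn[ENTRIES]; } l2_table_t;
typedef struct { l2_table_t *l2[ENTRIES]; } l1_table_t;   /* NULL: no level 2 table */

/* Walk both levels for a 20-bit VPN; return the PPN or NO_PAGE. */
uint32_t walk(const l1_table_t *l1, uint32_t vpn)
{
    uint32_t i1 = (vpn >> LEVEL_BITS) & (ENTRIES - 1);   /* level 1 index */
    uint32_t i2 = vpn & (ENTRIES - 1);                   /* level 2 index */
    const l2_table_t *l2 = l1->l2[i1];
    if (l2 == NULL)
        return NO_PAGE;      /* whole chunk unallocated: no level 2 table exists */
    return l2->ppn[i2];
}
```

Only the level 2 tables that are actually referenced need to be allocated, which is where the space saving comes from.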

9.7 Case Study: The Intel Core i7/Linux Memory System

Processor package: four cores, a large L3 cache shared by all cores, and a DDR3 memory controller.

First, Core i7 Address Translation

Second, the Linux Virtual Memory System

Linux maintains a separate virtual address space for each process. Kernel virtual memory contains the code and data structures of the kernel. Some regions of kernel virtual memory are mapped to physical pages that are shared by all processes; other regions contain data that differs for each process.

1. Linux Virtual Memory Areas

Area: a contiguous chunk of allocated virtual memory whose pages are related in some way.

Every existing virtual page is contained in some area. The kernel maintains a distinct task structure (task_struct) for each process in the system.

The area structure for a particular area includes:

vm_start: Points to the beginning of the area

vm_end: Points to the end of the area

vm_prot: Describes the read/write permissions for all pages contained in the area

vm_flags: Describes, among other things, whether the area's pages are shared or private

vm_next: Points to the next area structure in the list

2. Linux Page Fault Exception Handling

(1) Is virtual address A legal?

If not, the handler triggers a segmentation fault and terminates the process.

If so, go to the next step.

(2) Is the attempted memory access legal? That is, does the process have permission?

If not, the handler triggers a protection exception and terminates the process.

If so, go to the next step.

(3) At this point the fault resulted from a legal operation on a legal virtual address. The handler selects a victim page, swaps it out if it has been modified, swaps in the new page, and updates the page table.
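The three checks can be sketched as a walk over a list of areas. The `struct area` below is a simplified stand-in, not the real kernel definition: it keeps the field names vm_start, vm_end, and vm_next from the list of fields given for the area structure, and replaces vm_prot with a single hypothetical `can_write` flag. The result names are also assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's area structure; not the real definition. */
struct area {
    uintptr_t vm_start, vm_end;   /* area covers [vm_start, vm_end) */
    int can_write;                /* hypothetical stand-in for vm_prot */
    struct area *vm_next;
};

typedef enum { FAULT_SEGV, FAULT_PROTECTION, FAULT_SWAP_IN } fault_result_t;

/* Decide how a fault at addr should be handled, following steps (1) to (3). */
fault_result_t classify_fault(const struct area *list, uintptr_t addr, int is_write)
{
    const struct area *a;
    for (a = list; a != NULL; a = a->vm_next)
        if (addr >= a->vm_start && addr < a->vm_end)
            break;
    if (a == NULL)
        return FAULT_SEGV;         /* (1) the address lies in no area */
    if (is_write && !a->can_write)
        return FAULT_PROTECTION;   /* (2) the access is not permitted */
    return FAULT_SWAP_IN;          /* (3) legal: page the data in */
}
```

A linear scan keeps the sketch short; it says nothing about how a production kernel searches its areas.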

9.8 Memory Mapping

Linux initializes the contents of a virtual memory area by associating it with an object on disk, a process known as memory mapping.

An area can be mapped to one of two kinds of objects:

1. A regular file in the Unix file system

2. An anonymous file (created by the kernel, containing all binary zeros)

First, Shared Objects and Private Objects

Shared objects

A shared object is visible to every process that has mapped it into its virtual memory.

Even if the object is mapped into multiple shared areas, only one copy of it needs to be stored in physical memory.

Private objects

Private objects are mapped using a technique called copy-on-write.

Initially, only one copy of a private object is stored in physical memory; a page is copied only when a process writes to it.

The fork function is an application of copy-on-write. The execve function likewise relies on memory mapping to load and run programs.

Second, User-Level Memory Mapping with the mmap Function

1. Creating a new virtual memory area

#include <unistd.h>

#include <sys/mman.h>

void *mmap(void *start, size_t length, int prot, int flags, int fd, off_t offset);

Returns a pointer to the mapped area on success, or MAP_FAILED (-1) on error.

Parameter meanings:

start: The preferred starting address of the new area (a hint to the kernel)

fd: File descriptor of the object to map

length: Size in bytes of the contiguous chunk of the object to map

offset: Offset of the chunk from the beginning of the file

prot: Access permission bits, as follows:

PROT_EXEC: Pages consist of instructions that may be executed by the CPU

PROT_READ: Pages may be read

PROT_WRITE: Pages may be written

PROT_NONE: Pages cannot be accessed

flags: Bits that describe the type of the mapped object, as follows:

MAP_ANON: Anonymous object; the corresponding virtual pages are demand-zero

MAP_PRIVATE: Private copy-on-write object

MAP_SHARED: Shared object

2. Deleting a virtual memory area:

#include <unistd.h>

#include <sys/mman.h>

int munmap(void *start, size_t length);

Returns 0 on success, -1 on error.

Deletes the area that starts at virtual address start and consists of the next length bytes.
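As a usage sketch, the two calls can be wrapped to obtain and release a private, demand-zero area. The helper names `map_zeroed` and `unmap` are hypothetical; passing -1 as the file descriptor for an anonymous mapping and defining `_DEFAULT_SOURCE` so that `MAP_ANON` is visible are assumptions that hold on Linux with glibc.

```c
#define _DEFAULT_SOURCE   /* assumption: exposes MAP_ANON under strict glibc modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Map `length` bytes of private, demand-zero memory; NULL on failure. */
void *map_zeroed(size_t length)
{
    void *p = mmap(NULL, length, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANON, -1, 0);
    return p == MAP_FAILED ? NULL : p;   /* errors are signalled by MAP_FAILED */
}

/* Remove the mapping again; returns 0 on success, -1 on error. */
int unmap(void *start, size_t length)
{
    return munmap(start, length);
}
```

The mapped bytes read as zero until the program writes to them, which matches the demand-zero behavior described for anonymous objects.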

9.9 Dynamic Memory Allocation

When a C program needs additional virtual memory at run time, it typically uses a dynamic memory allocator. The allocator maintains an area of a process's virtual memory known as the heap. The heap is a demand-zero area that begins immediately after the uninitialized bss area and grows upward. For each process, the kernel maintains a variable brk that points to the top of the heap.

The allocator maintains the heap as a collection of blocks of various sizes. Each block is a contiguous chunk of virtual memory that is either allocated or free. An allocated block has been explicitly reserved for use by the application; a free block is available to be allocated.

There are two basic styles of allocators:

Explicit allocator: requires the application to explicitly free any allocated blocks.
Implicit allocator (also called a garbage collector): requires the allocator to detect when an allocated block is no longer being used by the program and then free the block.

The malloc and free Functions

The C standard library provides an explicit allocator known as the malloc package, which programs call to allocate blocks from the heap. malloc does not initialize the memory it returns. free releases an allocated heap block; its argument must point to the beginning of an allocated block obtained from malloc (or from calloc or realloc).

Why use dynamic memory allocation:

The most important reason programs use dynamic memory allocation is that they often do not know the sizes of certain data structures until the program actually runs.
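A short sketch of that situation: the array size n is only known at run time, so the array is obtained from malloc. The helper name `make_sequence` is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate an array of n ints, sized at run time, and fill it with 0..n-1. */
int *make_sequence(size_t n)
{
    int *a = malloc(n * sizeof *a);   /* contents are uninitialized */
    if (a == NULL)
        return NULL;                  /* malloc reports failure with NULL */
    for (size_t i = 0; i < n; i++)
        a[i] = (int)i;
    return a;                         /* caller must pass this pointer to free */
}
```

The caller releases the block with `free(a)`, passing exactly the pointer that malloc returned.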

Requirements for allocators:

Handling arbitrary request sequences
Making immediate responses to requests
Using only the heap
Aligning blocks (alignment requirement)
Not modifying allocated blocks

Goals for allocators:

Maximizing throughput: the number of requests completed per unit time, counting both allocate and free requests
Maximizing memory utilization: measured by the peak utilization U_k, the maximum aggregate payload over the first k requests divided by the current heap size H_k.

Fragmentation

The primary cause of poor heap utilization is fragmentation, which occurs when otherwise unused memory is not available to satisfy allocate requests.

Internal fragmentation: occurs when an allocated block is larger than its payload.
External fragmentation: occurs when there is enough aggregate free memory to satisfy an allocate request, but no single free block is large enough to handle it.

Implicit free list:

Blocks are linked implicitly by the size fields in their headers, so the allocator can walk the whole heap from block to block. The list ends with a terminating header that has the allocated bit set and a size of zero.

Placing allocated blocks:

When an application requests a block of k bytes, the allocator searches the free list for a free block large enough to hold the requested block. Common placement policies are first fit, next fit, and best fit.
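A first-fit search over an implicit free list can be sketched as below. The header layout is an assumption: one 32-bit word per header holding the block size in bytes (a multiple of 8, header included) with the allocated flag in bit 0, and a size-0 allocated header as the terminator. The macro and function names are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed header word: block size in bytes (multiple of 8) plus an
   allocated flag in bit 0. */
#define PACK(size, alloc) ((uint32_t)(size) | (uint32_t)(alloc))
#define SIZE(h)  ((h) & ~0x7u)
#define ALLOC(h) ((h) & 0x1u)

/* First fit: return the word index of the first free block whose size is at
   least `need` bytes (header included), or -1 if the scan reaches the
   terminating header. */
long first_fit(const uint32_t *heap, uint32_t need)
{
    size_t i = 0;
    while (SIZE(heap[i]) != 0) {                 /* size 0 marks the end */
        if (!ALLOC(heap[i]) && SIZE(heap[i]) >= need)
            return (long)i;
        i += SIZE(heap[i]) / sizeof(uint32_t);   /* jump to the next header */
    }
    return -1;
}
```

Next fit would resume the scan where the previous search ended, and best fit would examine every free block and keep the smallest one that fits.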

Splitting free blocks:

Once the allocator has located a free block that fits, it must decide how much of that block to allocate.

Two options:

Use the entire free block: simple, but tends to introduce internal fragmentation.
Split the free block: one part becomes the allocated block and the remainder becomes a new free block.

Getting additional heap memory

When the allocator cannot find a suitable free block for a request, it either coalesces free blocks to create larger ones or asks the kernel for additional heap memory.

Coalescing free blocks

False fragmentation: adjacent free blocks that have not been combined. Any practical allocator must merge adjacent free blocks, a process known as coalescing.

Immediate coalescing
Deferred coalescing

Coalescing with boundary tags

Boundary tags: a footer added at the end of each block serves as a boundary tag. The footer is a replica of the header, so the allocator can determine the starting location and status of the previous block by inspecting its footer. Note that the footer of the previous block is always one word before the start of the current block.
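The footer trick can be sketched on a heap modeled as an array of 32-bit words. The layout is an assumption: header and footer each hold the block size in bytes (a multiple of 8) with the allocated flag in bit 0. The helper names are hypothetical, and the caller is assumed to guarantee that a block exists before index i.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WORD sizeof(uint32_t)
#define PACK(size, alloc) ((uint32_t)(size) | (uint32_t)(alloc))
#define SIZE(w)  ((w) & ~0x7u)
#define ALLOC(w) ((w) & 0x1u)

/* Write matching header and footer for a block starting at word index i. */
void set_block(uint32_t *heap, size_t i, uint32_t size, uint32_t alloc)
{
    heap[i] = PACK(size, alloc);                     /* header */
    heap[i + size / WORD - 1] = PACK(size, alloc);   /* footer: copy of header */
}

/* Index of the previous block's header, found through its footer at i - 1. */
size_t prev_block(const uint32_t *heap, size_t i)
{
    return i - SIZE(heap[i - 1]) / WORD;
}

/* Merge the free block at i with a free predecessor; return the merged index. */
size_t coalesce_prev(uint32_t *heap, size_t i)
{
    if (ALLOC(heap[i - 1]))
        return i;                    /* predecessor is in use: nothing to do */
    size_t p = prev_block(heap, i);
    set_block(heap, p, SIZE(heap[p]) + SIZE(heap[i]), 0);
    return p;
}
```

Because the footer sits one word before the current header, the predecessor is found in constant time instead of by scanning the list from the start.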

Explicit free list:

Free blocks are organized into some form of explicit data structure. Because the body of a free block is not needed by the program, the pointers that implement the data structure can be stored inside the bodies of the free blocks. This avoids having to traverse allocated blocks, as an implicit free list must. The list can be maintained either in LIFO order or in address order.

Segregated storage:

The heap is managed by maintaining multiple free lists, where the blocks on each list are roughly equal in size. The set of possible block sizes is partitioned into equivalence classes called size classes.

Simple segregated storage: the free list for each size class contains same-size blocks, each the size of the largest element of the size class.
Segregated fits: the allocator maintains an array of free lists, each associated with a size class. To allocate a block, it determines the size class of the request and does a first-fit search of the appropriate free list. If it finds a block that fits, it optionally splits the block and inserts the remainder in the appropriate free list. Otherwise it searches the free list for the next larger size class, and if no list yields a fit it requests additional heap memory from the system. To free a block, it coalesces and places the result on the appropriate free list.
Buddy system: a special case of segregated fits in which each size class is a power of 2. To allocate a block, a larger free block is split in half repeatedly until a block of the requested size is obtained.
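For the buddy system, two small computations carry most of the idea: rounding a request up to a power of 2, and locating a block's buddy. The sketch assumes that blocks are identified by their byte offset from the start of the managed region and that every block is aligned to its own size; `buddy_of` and `round_up_pow2` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset of the buddy of the block at `offset` with power-of-two `size`.
   Offsets are measured from the start of the managed region (an assumption). */
size_t buddy_of(size_t offset, size_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);   /* size is a power of 2 */
    assert(offset % size == 0);                      /* blocks are size-aligned */
    return offset ^ size;                            /* flip exactly one bit */
}

/* Smallest power of two that is >= n: the block size a request rounds up to. */
size_t round_up_pow2(size_t n)
{
    assert(n <= SIZE_MAX / 2 + 1);   /* otherwise the result would overflow */
    size_t size = 1;
    while (size < n)
        size <<= 1;
    return size;
}
```

Since a block and its buddy differ in a single address bit, coalescing on free needs only one XOR to find the candidate to merge with.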

9.10 Garbage Collection

A garbage collector is a dynamic storage allocator that automatically frees allocated blocks that the program no longer needs. Such blocks are known as garbage, and the process of automatically reclaiming heap storage is known as garbage collection.

First, Basic Concepts

A garbage collector views memory as a directed reachability graph. A node p is reachable if there is a directed path from some root node to p; unreachable nodes are garbage.

Second, Mark&Sweep Garbage Collectors

There are two phases:
• Mark: marks all reachable and allocated descendants of the root nodes.

• Sweep: frees each unmarked allocated block.

Related functions:

ptr is defined as typedef void *ptr

ptr isPtr(ptr p): If p points to some word in an allocated block, returns a pointer b to the beginning of that block; otherwise returns NULL

int blockMarked(ptr b): Returns true if block b is already marked

int blockAllocated(ptr b): Returns true if block b is allocated

void markBlock(ptr b): Marks block b

int length(ptr b): Returns the length of block b in words, excluding the header

void unmarkBlock(ptr b): Changes the status of block b from marked to unmarked

ptr nextBlock(ptr b): Returns the successor of block b in the heap
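The two phases can be sketched on a toy heap. This is not built on the functions listed above: blocks are array slots, each holding at most two references to other blocks, which stands in for scanning the words of a real block. The type `block_t`, the constants, and the functions `mark` and `sweep` are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NBLOCKS 5
#define NO_REF (-1)

/* Toy heap block: at most two references to other blocks, by index. */
typedef struct {
    bool allocated, marked;
    int ref[2];               /* block indices, or NO_REF */
} block_t;

/* Mark phase: mark block b and everything reachable from it. */
void mark(block_t *heap, int b)
{
    if (b == NO_REF || !heap[b].allocated || heap[b].marked)
        return;               /* nothing to do, or already visited */
    heap[b].marked = true;
    for (int i = 0; i < 2; i++)
        mark(heap, heap[b].ref[i]);
}

/* Sweep phase: free unmarked allocated blocks and unmark the survivors.
   Returns the number of blocks freed. */
int sweep(block_t *heap)
{
    int freed = 0;
    for (int b = 0; b < NBLOCKS; b++) {
        if (heap[b].marked) {
            heap[b].marked = false;
        } else if (heap[b].allocated) {
            heap[b].allocated = false;
            freed++;
        }
    }
    return freed;
}
```

Calling `mark` once per root and then `sweep` once reclaims every block that no root can reach; the check on the mark bit is what keeps cycles from recursing forever.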

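Third, Conservative Mark&Sweep for C Programs

Allocated blocks can be kept in a balanced binary tree so that the isPtr function can determine whether a value points into an allocated block.

A Mark&Sweep collector for C must be conservative. The root cause is that the C language does not tag memory locations with type information.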

9.11 Common Memory-Related Bugs in C Programs

Dereferencing bad pointers

Common error: the classic scanf bug, passing a value where scanf expects a pointer

Reading uninitialized memory

Common error: assuming that heap memory is initialized to zero

Allowing stack buffer overflows

Common error: buffer overflow bugs

Assuming that pointers and the objects they point to are the same size

Effect: action at a distance, where the damage shows up far from the code that caused it

Making off-by-one errors
Referencing a pointer instead of the object it points to
Misunderstanding pointer arithmetic
Referencing nonexistent variables
Referencing data in free heap blocks
Introducing memory leaks


