Information Security System Design Foundation: Seventh Week Study Summary

Last Update:2015-10-25 Source: Internet

Author: User

Chapter 6: The Memory Hierarchy

Overview of this chapter: learn about the types and characteristics of storage devices. Focus on understanding the principle of locality and the application of the caching idea in the storage hierarchy.

6.0 Preface

1. A memory system is a hierarchy of storage devices with different capacities, costs, and access times. CPU registers hold the most frequently used data. Small, fast cache memories near the CPU act as staging areas for a subset of the data and instructions stored in the relatively slow main memory. Main memory in turn stages data stored on large, slow disks, which often serve as staging areas for data stored on the disks or tapes of other machines connected over a network.

2. If data is stored in a CPU register, it can be accessed in 0 cycles during the execution of the instruction. If it is stored in a cache, 1 to 30 cycles are required. If it is stored in main memory, access takes 50 to 200 cycles, and if it is stored on disk, roughly tens of millions of cycles.

3. A fundamental and enduring idea in computer systems: if you understand how the system moves data up and down the memory hierarchy, you can write your programs so that their data items are stored higher in the hierarchy, where the CPU can access them more quickly. This idea centers on a fundamental property of computer programs called locality.

4. This chapter covers the basic storage technologies: SRAM, DRAM, ROM, and rotating and solid-state disks.

5. The chapter pays particular attention to cache memories, which act as staging areas between the CPU and main memory, because they have the greatest impact on application performance.

Memory mountain: an interesting way to characterize the performance of the memory hierarchy on a particular machine. It shows read access time as a function of locality.

6.1 Storage Technology

This section requires: learn about three common storage technologies: RAM, ROM, and disk. For RAM, know the features and applications of SRAM and DRAM. ROM includes PROM, EPROM, EEPROM (E2PROM), and flash. Disk is the focus because it underlies the later material on I/O and file systems, so it needs thorough practice: disk structure (platters, tracks, sectors, gaps, cylinders); disk drives; disk capacity; access time (seek, rotation, transfer); and logical disk blocks. This last point is important: memory can be viewed as an array of bytes, and a disk can be viewed as an array of blocks.

1. Random-access memory

Random-access memory (RAM) comes in two varieties: static and dynamic. Static RAM (SRAM) is faster than dynamic RAM (DRAM), but also much more expensive. SRAM is used for cache memories, both on and off the CPU chip. DRAM is used for main memory and for the frame buffer of a graphics system.

1) Static RAM

A. SRAM stores each bit in a bistable memory cell. Each cell is implemented with a six-transistor circuit that can stay indefinitely in either of two different voltage configurations, or states. Any other state is unstable: starting from an unstable state, the circuit quickly moves to one of the two stable states.

B. The principle is similar to an inverted pendulum.

C. Because of this bistable property, an SRAM memory cell keeps its value as long as it is powered.

2) Dynamic RAM

DRAM stores each bit as charge on a capacitor. DRAM memory cells are very sensitive to disturbance.

3) Conventional DRAM

The cells (bits) in a DRAM chip are partitioned into d supercells, each consisting of w DRAM cells.

A d×w DRAM stores a total of dw bits of information. The supercells are organized as a rectangular array with r rows and c columns, where rc = d. Each supercell has an address of the form (i, j), where i denotes the row and j the column.

Information flows into and out of the chip through external connectors called pins, each of which carries a 1-bit signal.

Each DRAM chip is connected to a circuit called the memory controller, which can transfer w bits at a time to and from each DRAM chip. To read the contents of supercell (i, j), the memory controller sends row address i to the DRAM, followed by column address j. The DRAM responds by sending the contents of supercell (i, j) back to the controller.

One reason circuit designers organize DRAMs as two-dimensional arrays rather than linear arrays is to reduce the number of address pins on the chip.

The disadvantage of the two-dimensional organization is that the address must be sent in two distinct steps, which increases the access time.
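To make the pin-count argument concrete, here is a minimal sketch that compares the address pins needed by a linear array with those needed by a two-dimensional one. The function names and the 16-supercell chip are illustrative assumptions, and the sketch assumes the row and column addresses travel over the same pins in two steps.

```python
def bits_to_index(n: int) -> int:
    """Number of bits needed to give each of n items a distinct address."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return (n - 1).bit_length()


def address_pins_linear(d: int) -> int:
    """Pins needed when the d supercells form one linear array."""
    return bits_to_index(d)


def address_pins_2d(r: int, c: int) -> int:
    """Pins needed when row and column addresses share the same pins."""
    return max(bits_to_index(r), bits_to_index(c))


# Hypothetical chip: d = 16 supercells arranged as r = 4 rows by c = 4 columns.
print(address_pins_linear(16))  # pins for a linear array
print(address_pins_2d(4, 4))    # pins for the two-step 2D organization
```

With these numbers the linear array needs 4 address pins and the 4 by 4 array needs 2, which is the saving that is paid for with the two-step address.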

4) Memory modules

DRAM chips are packaged in memory modules that plug into expansion slots on the motherboard. Common packages include the 168-pin dual inline memory module (DIMM), which transfers data to and from the memory controller in 64-bit chunks, and the 72-pin single inline memory module (SIMM), which transfers data in 32-bit chunks.

Main memory can be aggregated by connecting multiple memory modules to the memory controller.

5) Enhanced DRAM

>> Fast page mode DRAM (FPM DRAM)

>> Extended data out DRAM (EDO DRAM)

>> Synchronous DRAM (SDRAM)

>> Double data-rate synchronous DRAM (DDR-SDRAM)

>> Rambus DRAM (RDRAM)

>> Video RAM (VRAM)

6) Nonvolatile memory

A. RAM loses its data when power is turned off, so it is volatile. Nonvolatile memories keep their data when powered off. They are collectively referred to as read-only memories (ROMs), even though some types can be written as well as read. There are several types:

>> PROM (programmable ROM): can be programmed only once

>> EPROM (erasable programmable ROM): can be erased and reprogrammed on the order of 1,000 times

>> EEPROM (electrically erasable PROM): can be reprogrammed on the order of 100,000 times

B. Flash memory

Based on EEPROMs, flash memory provides fast and durable nonvolatile storage for a wide range of electronic devices. Applications: digital cameras, mobile phones, music players, PDAs, and laptop, desktop, and server computer systems.

C. Programs stored in ROM devices are often referred to as firmware. When a computer system is powered on, it runs the firmware stored in ROM.

7) Accessing main memory

A. A bus is a set of parallel wires that carry address, data, and control signals.

B. Types of bus:

① System bus: connects the CPU to the I/O bridge

② Memory bus: connects the I/O bridge to main memory

③ I/O bus: the I/O bridge translates the electrical signals of the system bus into the electrical signals of the memory bus, and also connects the system bus and the memory bus to the I/O bus.

2. Disk storage

Disks are widely used storage devices that hold large amounts of data.

1) Disk construction

>> Platters

>> Surfaces: each platter has two surfaces

>> Spindle: at the center of the platter; spins the platter

>> Rotational rate: typically 5,400 to 15,000 revolutions per minute (RPM)

>> Tracks: concentric rings on each surface

>> Sectors: each track is partitioned into a set of sectors

>> Data bits: each sector contains an equal number of data bits, typically 512 bytes

>> Gaps: store the formatting bits used to identify sectors

>> Disk drive: commonly called a disk, or a rotating disk

>> Cylinder: the set of tracks on all surfaces that are equidistant from the center of the spindle

2) Disk capacity: the maximum number of bits that can be recorded on a disk. The determining factors are:

>> Recording density (bits/inch)

>> Track density (tracks/inch)

>> Areal density (bits/square inch): the product of the recording density and the track density

Increasing the areal density increases capacity.

Disk capacity formula: capacity = (bytes/sector) × (average sectors/track) × (tracks/surface) × (surfaces/platter) × (platters/disk)
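As a worked example, capacity is the product of the disk's geometry parameters: bytes per sector, average sectors per track, tracks per surface, surfaces per platter, and platters per disk. The sketch below multiplies them out; the function name and the sample geometry (5 platters, 512-byte sectors, 20,000 tracks per surface, 300 sectors per track on average) are illustrative assumptions, not figures from this summary.

```python
def disk_capacity_bytes(bytes_per_sector, avg_sectors_per_track,
                        tracks_per_surface, surfaces_per_platter, platters):
    """Capacity as the product of the disk's geometry parameters."""
    return (bytes_per_sector * avg_sectors_per_track * tracks_per_surface
            * surfaces_per_platter * platters)


# Hypothetical disk geometry, chosen only for illustration.
capacity = disk_capacity_bytes(
    bytes_per_sector=512,
    avg_sectors_per_track=300,
    tracks_per_surface=20_000,
    surfaces_per_platter=2,
    platters=5,
)
print(capacity)          # capacity in bytes
print(capacity / 10**9)  # capacity in GB, taking 1 GB = 10^9 bytes
```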

3) Disk operation

A disk reads and writes data in sector-size blocks.

Access time has three components: the average time to access the contents of a disk sector is the sum of the average seek time, the average rotational latency, and the average transfer time.

A. Seek time

The time needed to move the drive arm. It depends on the position of the read/write head and on the speed at which the arm moves across the surface. Typically 3 to 9 ms, with a maximum of up to 20 ms.

B. Rotational latency

The time the drive waits for the first bit of the target sector to rotate under the read/write head. It depends on the position of the surface and on the rotational speed. Maximum rotational latency = (1/RPM) × (60 s/1 min). The average rotational latency is half the maximum.

C. Transfer time

Depends on the rotational speed and on the number of sectors per track. Average transfer time = (1/RPM) × 1/(average sectors/track) × (60 s/1 min).
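The three components can be added up in a few lines. This is a minimal sketch of the formulas above; the function name and the sample drive (7,200 RPM, 9 ms average seek, 400 sectors per track on average) are assumptions made for illustration.

```python
def disk_access_time_ms(rpm, avg_seek_ms, avg_sectors_per_track):
    """Average time to access one sector, in milliseconds."""
    max_rotation_ms = 60_000 / rpm          # one full revolution
    avg_rotation_ms = max_rotation_ms / 2   # half a revolution on average
    avg_transfer_ms = max_rotation_ms / avg_sectors_per_track
    return {
        "seek": avg_seek_ms,
        "rotation": avg_rotation_ms,
        "transfer": avg_transfer_ms,
        "total": avg_seek_ms + avg_rotation_ms + avg_transfer_ms,
    }


# Hypothetical drive parameters, for illustration only.
times = disk_access_time_ms(rpm=7200, avg_seek_ms=9, avg_sectors_per_track=400)
for part, ms in times.items():
    print(f"{part}: {ms:.4f} ms")
```

For a drive like this the seek time and rotational latency dominate, while the transfer time for a single sector is negligible by comparison.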

4) Logical disk blocks

A (surface, track, sector) triple uniquely identifies the corresponding physical sector. The disk controller maps logical block numbers to these triples, so the rest of the system can treat the disk as an array of blocks.

5) Connecting I/O devices (I/O bus)

The I/O bus connects I/O devices to the CPU and main memory.

>> Universal Serial Bus (USB): USB 2.0 has a maximum bandwidth of 60 MB/s and USB 3.0 a maximum bandwidth of 600 MB/s

>> Graphics card (adapter)

>> Host bus adapter

6) Accessing the disk

DMA (direct memory access): the device performs a read or write bus transaction on its own, without any involvement of the CPU.

The CPU uses a technique called memory-mapped I/O to issue commands to I/O devices.

3. Solid-state disks

An SSD is a storage technology based on flash memory. Unlike rotating disks, solid-state disks have no moving parts.

1) Structure

An SSD package consists of one or more flash memory chips and a flash translation layer:

>> Flash memory chips: take the place of the mechanical drive in a rotating disk

>> Flash translation layer (a hardware/firmware device): plays the role of the disk controller

2) Features

>> Built from semiconductor memory, with no moving parts

>> Random access time is faster than that of a rotating disk

>> Lower power consumption

>> More rugged

>> Can wear out, and is more expensive

4. Storage technology trends

>> Different storage technologies have different price and performance trade-offs.

>> The price and performance properties of different storage technologies change at very different rates: it is easier to increase density, and thereby reduce cost, than to decrease access time.

>> DRAM and disk performance lag behind CPU performance.

6.2 Locality

This section requires: the principle of locality: temporal locality and spatial locality (p429; the last section, "Memory Mountain"), locality of references to program data, and locality of instruction fetches.

Principle of locality: well-written computer programs tend to reference data items that are near other recently referenced data items, or that were recently referenced themselves. Locality comes in two forms: spatial locality and temporal locality.

>> Applications at different levels

① Hardware level: cache memories hold the most recently referenced instructions and data items, which speeds up accesses to main memory.

② Operating system level: the system uses main memory as a cache of the most recently referenced chunks of the virtual address space, and as a cache of the most recently used disk blocks in the disk file system.

③ Application level: web browsers keep the most recently referenced documents on the local disk.

1. Locality of references to program data

1) Stride-k reference pattern: visiting every k-th element of a contiguous vector is called a stride-k reference pattern.

Stride-1 reference pattern: visiting each element of a vector in order, sometimes called a sequential reference pattern. It is a common and important source of spatial locality in programs.

>> In general, as the stride increases, the spatial locality decreases.
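The effect of the stride can be seen with a small simulation. This is a minimal sketch under stated assumptions: 4-byte elements, 16-byte blocks, an array that starts on a block boundary, and a cache large enough that only cold misses occur. The function name and these sizes are illustrative.

```python
def cold_miss_rate(n_elems, stride, elem_size=4, block_size=16):
    """Fraction of stride-k accesses that touch a block for the first time."""
    seen_blocks = set()
    accesses = misses = 0
    for index in range(0, n_elems, stride):
        block = (index * elem_size) // block_size
        accesses += 1
        if block not in seen_blocks:
            misses += 1
            seen_blocks.add(block)
    return misses / accesses


# Walk a hypothetical 64-element array with increasing strides.
for k in (1, 2, 4, 8):
    print(f"stride {k}: miss rate {cold_miss_rate(64, k):.2f}")
```

With stride 1, one access in four misses because four elements share a block. Once the stride reaches four elements, every access lands in a new block and every access misses.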

2) Multidimensional arrays: see the examples in the book.

2. Locality of instruction fetches

Program instructions are stored in memory and must be fetched (read) by the CPU. An important property of code that distinguishes it from program data is that it is rarely modified at run time.

3. Summary of locality

Simple rules for qualitatively evaluating the locality of a program:

① Programs that repeatedly reference the same variables have good temporal locality.

② For programs with stride-k reference patterns, the smaller the stride, the better the spatial locality.

③ Loops have good temporal and spatial locality with respect to instruction fetches. The smaller the loop body and the greater the number of loop iterations, the better the locality.

6.3 Memory Hierarchy

This section requires: a systems view in which "1+1>2" (by analogy: symmetric and asymmetric encryption combine into a hybrid cryptosystem; hybrid vehicles); central idea: each level of storage serves as a cache for the level below it.

1. Cache

Cache: a small, fast storage device that acts as a staging area for data objects stored in a larger, slower device.

Caching: the process of using a cache.

Data is always copied back and forth between level k and level k+1 in block-size transfer units. The block size is fixed between any particular pair of adjacent levels, but other pairs of levels can have different block sizes.

Feature: the lower the level, the larger the block.

1) Cache hits

When a program needs a data object d from level k+1, it first looks for d in one of the blocks currently stored at level k. If d happens to be cached at level k, this is called a cache hit. The program reads d directly from level k, which is faster than reading it from level k+1.

2) Cache misses

A miss means that d is not cached at level k. The level k cache then fetches the block containing d from level k+1. If the level k cache is already full, this may overwrite an existing block.

>> Replacement (overwrite, eviction) policies:

Random replacement policy: choose a random victim block.

Least recently used (LRU) replacement policy: choose as the victim the block whose last access lies furthest in the past.
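The LRU policy can be sketched with an ordered dictionary that keeps blocks in order of last use. This is a minimal illustration, not how hardware implements it; the function name, the block numbers in the trace, and the capacity of three blocks are assumptions.

```python
from collections import OrderedDict


def simulate_lru(block_refs, capacity):
    """Return (hits, misses) for a trace of block numbers under LRU."""
    cache = OrderedDict()  # oldest entry first, most recently used last
    hits = misses = 0
    for block in block_refs:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            misses += 1
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used
            cache[block] = True
    return hits, misses


# Hypothetical trace of block numbers through a 3-block cache.
print(simulate_lru([1, 2, 3, 1, 4, 2], capacity=3))
```

In this trace the reference to block 4 evicts block 2, because block 1 was touched more recently, so the final reference to block 2 misses again.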

3) Types of cache misses

A. Compulsory miss (cold miss)

If the level k cache is empty (a cold cache), any access to a data object misses.

B. Conflict miss

A restrictive placement policy limits each block from level k+1 to a small subset of the block positions at level k. As a result the cache may not be full, yet a reference still misses because the positions its block maps to are occupied by other blocks.

C. Capacity miss

When the size of the working set exceeds the size of the cache, the cache experiences capacity misses: the cache is simply too small to hold the working set.

4) Cache management

Some form of logic must manage the cache. The logic that manages the cache can be hardware, software, or a combination of the two.

2. Summary of memory hierarchy concepts

6.4 Cache Memory

This section requires: cache organization (S, E, B, m), cache sets, cache lines, blocks, mapping, hits, cache management.

① L1 cache: sits between the CPU register file and main memory; access time of 2 to 4 clock cycles

② L2 cache: sits between the L1 cache and main memory; access time of about 10 clock cycles

③ L3 cache: sits between the L2 cache and main memory; access time of 30 or 40 clock cycles

1. Generic cache memory organization

A cache is an array of cache sets whose organization can be described by the tuple (S, E, B, m):

>> S: the array contains S = 2^s cache sets

>> E: each set contains E cache lines

>> B: each line holds a data block of B = 2^b bytes

>> m: each memory address has m bits, which gives M = 2^m distinct addresses

In addition, each line has tag bits and a valid bit:

>> Valid bit: each line has one valid bit that indicates whether the line contains meaningful information

>> Tag bits: t = m - (b + s) bits that uniquely identify the block stored in the cache line

>> Set index bits: s

>> Block offset bits: b

The cache organization partitions the m address bits into t tag bits, s set index bits, and b block offset bits.

1) Cache size (capacity) C: the sum of the sizes of all blocks, not counting the tag bits and valid bits, so:

C = S × E × B
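The derived quantities follow directly from the tuple. Below is a minimal sketch that computes s, b, t, and C from (S, E, B, m); the class name, helper name, and the sample cache (256 sets, 1 line per set, 4-byte blocks, 32-bit addresses) are illustrative assumptions.

```python
from dataclasses import dataclass


def log2_exact(n: int) -> int:
    """Base-2 logarithm of n, which must be a power of two."""
    if n < 1 or n & (n - 1):
        raise ValueError(f"{n} is not a power of two")
    return n.bit_length() - 1


@dataclass(frozen=True)
class CacheGeometry:
    S: int  # number of sets
    E: int  # lines per set
    B: int  # bytes per block
    m: int  # address bits

    @property
    def s(self) -> int:  # set index bits
        return log2_exact(self.S)

    @property
    def b(self) -> int:  # block offset bits
        return log2_exact(self.B)

    @property
    def t(self) -> int:  # tag bits
        return self.m - (self.s + self.b)

    @property
    def C(self) -> int:  # capacity in bytes, data blocks only
        return self.S * self.E * self.B


# Hypothetical cache, for illustration only.
g = CacheGeometry(S=256, E=1, B=4, m=32)
print(g.s, g.b, g.t, g.C)
```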

2) How it works

The parameters s and b partition the m address bits into three fields:

>> First, the s set index bits tell us which set the word must be stored in.

>> Then, the t tag bits tell us which line in the set, if any, contains the word. A line contains the word if and only if its valid bit is set and its tag bits match the tag bits in the address.

>> Finally, the b block offset bits give the offset of the word within the B-byte data block.
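Splitting an address into its three fields is a matter of shifts and masks. The following is a minimal sketch; the function name and the tiny 4-bit address space with s = 2 and b = 1 are assumptions chosen to keep the numbers readable.

```python
def split_address(addr: int, s: int, b: int, m: int):
    """Split an m-bit address into (tag, set index, block offset)."""
    if not 0 <= addr < (1 << m):
        raise ValueError("address does not fit in m bits")
    offset = addr & ((1 << b) - 1)            # low b bits
    set_index = (addr >> b) & ((1 << s) - 1)  # next s bits
    tag = addr >> (s + b)                     # remaining t = m - (s + b) bits
    return tag, set_index, offset


# Hypothetical 4-bit addresses with 2 set index bits and 1 offset bit.
for addr in (0, 1, 8, 13):
    print(format(addr, "04b"), split_address(addr, s=2, b=1, m=4))
```

Address 13 is 1101 in binary, so its tag is 1, its set index is 10 (set 2), and its block offset is 1.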

2. Direct-mapped caches

Caches are grouped into classes according to E, the number of cache lines per set. A cache with E = 1 is called a direct-mapped cache.

The process a cache goes through to determine whether a request is a hit, and then to extract the requested word, has three steps:

1. Set selection

2. Line matching

3. Word extraction

1) Set selection

The cache extracts the s set index bits from the middle of the address of word w and interprets them as an unsigned integer that corresponds to a set number. Analogy: if the cache is viewed as an array of sets, the set index bits form an index into this array.

2) Line matching

There are two necessary and sufficient conditions for a cache hit: the valid bit of the line is set, and the tag in the cache line matches the tag in the address of w.

3) Word selection

Analogy: if the block is viewed as an array of bytes, the byte offset is an index into this array.

4) Line replacement on a miss

In a direct-mapped cache each set contains exactly one line, so the newly fetched line simply replaces the current one.

5) A direct-mapped cache in action

The tag bits and index bits together uniquely identify each block in memory. Blocks that map to the same cache set are distinguished by their tag bits.

>> Steps performed when the CPU reads a word

1. Use the index bits to determine which set the address maps to.

2. Check whether the line in that set is valid:

1) If it is not valid, the cache misses. It fetches the block from memory or from the next lower level, stores it in the set, sets the valid bit to 1, and returns the requested value.

2) If it is valid, compare the tag in the line with the tag in the address:

① If they match, it is a cache hit, and the requested value is returned.

② If they do not match, the line is replaced and the requested value is then returned.
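The read steps can be traced with a small simulator. This is a minimal sketch that tracks only valid bits and tags, not the data itself; the class name, the (S, E, B, m) = (4, 1, 2, 4) cache, and the address trace are assumptions chosen for illustration.

```python
class DirectMappedCache:
    """Tracks valid bits and tags for a cache with one line per set."""

    def __init__(self, S: int, B: int):
        self.s = S.bit_length() - 1  # set index bits, S is a power of two
        self.b = B.bit_length() - 1  # block offset bits, B is a power of two
        self.valid = [False] * S
        self.tags = [0] * S

    def read(self, addr: int) -> bool:
        """Return True on a hit; on a miss, load the block and return False."""
        set_index = (addr >> self.b) & ((1 << self.s) - 1)
        tag = addr >> (self.s + self.b)
        if self.valid[set_index] and self.tags[set_index] == tag:
            return True
        # Miss: the fetched block replaces whatever the set held before.
        self.valid[set_index] = True
        self.tags[set_index] = tag
        return False


# Hypothetical cache with 4 sets and 2-byte blocks, reading 4-bit addresses.
cache = DirectMappedCache(S=4, B=2)
results = ["hit" if cache.read(a) else "miss" for a in (0, 1, 13, 8, 0)]
print(results)
```

Addresses 0 and 8 both map to set 0 but carry different tags, so the read of address 8 evicts the block holding address 0, and the final read of address 0 misses again.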

6) Conflict misses in direct-mapped caches

Thrashing: the cache repeatedly loads and evicts the same cache blocks.

Cause: these blocks map to the same cache set.

Workaround: put B bytes of padding at the end of each array so that the arrays map to different sets.
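Thrashing and the padding workaround can be reproduced in a simulation. This sketch assumes a hypothetical dot product over two arrays x and y of eight 4-byte elements each, read alternately as x[0], y[0], x[1], y[1], and so on, through a direct-mapped cache with 2 sets of 16-byte blocks. All names and sizes are illustrative.

```python
def count_misses(addresses, num_sets, block_size):
    """Count misses in a direct-mapped cache for a trace of byte addresses."""
    lines = [None] * num_sets  # tag held by each set, None when empty
    misses = 0
    for addr in addresses:
        block = addr // block_size
        set_index = block % num_sets
        tag = block // num_sets
        if lines[set_index] != tag:
            misses += 1
            lines[set_index] = tag
    return misses


def dot_product_trace(x_base, y_base, n, elem_size=4):
    """Addresses touched when reading x[i] and then y[i] for each i."""
    trace = []
    for i in range(n):
        trace.append(x_base + i * elem_size)
        trace.append(y_base + i * elem_size)
    return trace


BLOCK, SETS, N = 16, 2, 8
unpadded = dot_product_trace(0, N * 4, N)       # y starts right after x
padded = dot_product_trace(0, N * 4 + BLOCK, N)  # B bytes of padding after x

print(count_misses(unpadded, SETS, BLOCK))
print(count_misses(padded, SETS, BLOCK))
```

Without padding, x[i] and y[i] always fall in the same set and evict each other, so all 16 reads miss. With one block of padding they fall in different sets and only 4 reads miss.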

3. Set-associative caches

E-way set-associative cache: 1 < E < C/B

1) Set selection: identical to set selection in a direct-mapped cache.

2) Line matching and word selection: each set can be viewed as a small memory of (key, value) pairs. The key is the tag together with the valid bit, and a lookup that matches the key returns the value, which is the contents of the block.

Key idea: any line in the set can hold any of the memory blocks that map to that set, so the cache must search every line in the set.

The two necessary and sufficient conditions for a match are unchanged: ① the line is valid ② the tag matches

3) Line replacement: if the set has an empty line, the new block goes into the empty line. If there is no empty line, a replacement policy is applied:

>> Random replacement

>> Least frequently used (LFU) policy: replace the line that has been referenced the fewest times over some past time window.

>> Least recently used (LRU) policy: replace the line whose last access lies furthest in the past.
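The following sketch models an E-way set-associative cache with LRU replacement and compares it with a direct-mapped cache of the same capacity. It tracks tags only; the class name, the 64-byte capacity, the 16-byte blocks, and the trace are assumptions for illustration.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """E-way set-associative cache with LRU replacement, tags only."""

    def __init__(self, num_sets, lines_per_set, block_size):
        self.num_sets = num_sets
        self.lines_per_set = lines_per_set
        self.block_size = block_size
        # One ordered dict per set: least recently used tag comes first.
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def access(self, addr) -> bool:
        """Return True on a hit; on a miss, load the block and return False."""
        block = addr // self.block_size
        set_index = block % self.num_sets
        tag = block // self.num_sets
        lines = self.sets[set_index]
        if tag in lines:               # search every line in the set
            lines.move_to_end(tag)
            return True
        if len(lines) >= self.lines_per_set:
            lines.popitem(last=False)  # no empty line: evict the LRU line
        lines[tag] = True
        return False


# Two hypothetical 64-byte caches with 16-byte blocks.
direct = SetAssociativeCache(num_sets=4, lines_per_set=1, block_size=16)
two_way = SetAssociativeCache(num_sets=2, lines_per_set=2, block_size=16)

trace = [0, 64] * 4  # two blocks that map to the same set in both caches
direct_hits = sum(direct.access(a) for a in trace)
two_way_hits = sum(two_way.access(a) for a in trace)
print(direct_hits, two_way_hits)
```

The direct-mapped cache thrashes on this trace and never hits, while the 2-way cache keeps both blocks in one set and misses only on the first access to each.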

4. Fully associative caches (E = C/B)

1) Set selection: there is only one set, set 0 by default, so there are no set index bits. The address is partitioned into only a tag and a block offset.

2) Line matching and word selection: the same as in a set-associative cache.

>> Only appropriate for small caches.

5. Writes

1) On a write hit, there are two ways to update the copy in the lower level:

A. Write-through: immediately write w's cache block to the next lower level. The drawback is that every write causes bus traffic.

B. Write-back: write the updated block to the lower level only when the replacement algorithm evicts it. This exploits locality and can significantly reduce bus traffic, at the cost of extra complexity: the cache must maintain an additional dirty bit for each cache line.

2) How to handle write misses

A. Write-allocate (usually paired with write-back): load the corresponding block from the lower level into the cache, then update the cache block.

B. No-write-allocate (usually paired with write-through): bypass the cache and write the word directly to the lower level.
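The difference in traffic between the two policies can be estimated with a simplified model. The sketch below is an illustration under several assumptions: a direct-mapped cache that sees only writes, write-through paired with no-write-allocate, write-back paired with write-allocate, and a final flush of dirty lines counted as writes. The function name and the trace are hypothetical.

```python
def lower_level_writes(block_writes, num_sets, policy):
    """Count writes sent to the lower level for a trace of written blocks."""
    if policy == "write-through":
        # Every write is propagated to the lower level immediately.
        return len(block_writes)
    if policy != "write-back":
        raise ValueError("policy must be 'write-through' or 'write-back'")

    resident = [None] * num_sets  # block number held by each set
    dirty = [False] * num_sets    # dirty bit for each line
    writes = 0
    for block in block_writes:
        idx = block % num_sets
        if resident[idx] != block:
            if dirty[idx]:
                writes += 1       # evicting a dirty block writes it back
            resident[idx] = block  # write-allocate: load the new block
        dirty[idx] = True
    return writes + sum(dirty)    # flush the remaining dirty lines


# Hypothetical trace: eight writes to block 0, then one write to block 2.
trace = [0] * 8 + [2]
print(lower_level_writes(trace, num_sets=2, policy="write-through"))
print(lower_level_writes(trace, num_sets=2, policy="write-back"))
```

Repeated writes to the same block are where write-back pays off: the eight writes to block 0 reach the lower level once, when the block is evicted.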

6. Anatomy of a real cache hierarchy

Caches can hold instructions as well as data:

>> Holds instructions only: i-cache

>> Holds program data only: d-cache

>> Holds both instructions and data: unified cache

7. Performance impact of cache parameters

1) Performance metrics:

>> Miss rate = number of misses / number of references

>> Hit rate = 1 - miss rate

>> Hit time: the time to deliver a word held in the cache to the CPU

>> Miss penalty: the additional time required because of a miss
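These metrics are often combined into an average access time, computed as hit time plus miss rate times miss penalty. That formula is a common rule of thumb rather than something stated in this summary, and the function names and sample numbers below are assumptions.

```python
def miss_rate(misses: int, references: int) -> float:
    """Fraction of references that miss."""
    if references <= 0:
        raise ValueError("references must be positive")
    return misses / references


def average_access_time(hit_time, rate, miss_penalty):
    """Average cycles per access: hit time plus the expected miss cost."""
    return hit_time + rate * miss_penalty


# Hypothetical workload: 3 misses in 100 references,
# a 1-cycle hit time, and a 100-cycle miss penalty.
rate = miss_rate(3, 100)
print(rate, 1 - rate)
print(average_access_time(hit_time=1, rate=rate, miss_penalty=100))
```

Even a 3% miss rate roughly quadruples the average access time in this example, which is why a small change in miss rate can matter more than a small change in hit time.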

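2) Specific effects

>> Cache size: a larger cache raises the hit rate, but also increases the hit time.

>> Block size: larger blocks exploit spatial locality and can raise the hit rate, but for a given cache size they mean fewer cache lines, which hurts temporal locality, and they increase the miss penalty.

>> Associativity: a larger E reduces thrashing, but increases cost, hit time, miss penalty, and the complexity of the control logic. The trade-off: use lower associativity where the miss penalty is low, and higher associativity where the miss penalty is high.

>> Write strategy: the further down the hierarchy, the more likely a cache is to use write-back rather than write-through.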

6.5 Writing Cache-friendly Code

1. Basic approach

>> Make the common case go fast.

>> Minimize the number of cache misses in each inner loop.

2. Important points

>> Repeated references to local variables are good (temporal locality).

>> Stride-1 reference patterns are good (spatial locality).
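To see why stride-1 access matters, the sketch below counts misses for two ways of summing a two-dimensional array stored in row-major order: row by row and column by column. It assumes a hypothetical 16 by 16 array of 4-byte elements and a small direct-mapped cache with 4 sets of 16-byte blocks. All names and sizes are illustrative.

```python
def traversal_addresses(rows, cols, order, elem_size=4):
    """Byte offsets visited when scanning a row-major rows x cols array."""
    if order == "row":   # inner loop over columns: stride-1
        return [(i * cols + j) * elem_size
                for i in range(rows) for j in range(cols)]
    if order == "col":   # inner loop over rows: stride of one full row
        return [(i * cols + j) * elem_size
                for j in range(cols) for i in range(rows)]
    raise ValueError("order must be 'row' or 'col'")


def direct_mapped_misses(addresses, num_sets, block_size):
    """Count misses in a direct-mapped cache for a trace of byte addresses."""
    lines = [None] * num_sets  # tag held by each set, None when empty
    misses = 0
    for addr in addresses:
        block = addr // block_size
        set_index = block % num_sets
        tag = block // num_sets
        if lines[set_index] != tag:
            misses += 1
            lines[set_index] = tag
    return misses


row_misses = direct_mapped_misses(traversal_addresses(16, 16, "row"), 4, 16)
col_misses = direct_mapped_misses(traversal_addresses(16, 16, "col"), 4, 16)
print(row_misses, col_misses)
```

The row-wise scan misses once per block, on 64 of its 256 accesses, while the column-wise scan misses on all 256, even though both visit exactly the same elements.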

6.6 Memory Mountain

The memory mountain is an important concept to understand. It helps us understand the structure of the underlying storage in a computer system.

======================================================================

Problems encountered

I could not really follow the solution approach for problems 6.11, 6.12, and 6.13, and was left somewhat puzzled. The earlier problems were easier to understand: they can be worked out by checking them against the graphs, formulas, and tables in my summary. Mathematical, calculation-style problems like these still suit me reasonably well. I have always believed that knowledge you understand sticks better than knowledge you memorize. But I still do not understand some of the more abstract low-level concepts, I rarely come across them in everyday use, and I do not know what the point of learning them is. As a result I keep getting distracted while studying and my efficiency is especially low; I am learning purely because the course requires it. I hope there will be some code to practice with, so that hands-on work can get me into the right frame of mind.

In addition, the teacher touched on buses when introducing assembly earlier, and the difference between ROM and RAM was briefly mentioned in the final chapter of last semester's digital logic circuits course. Because it was not a focus of the exams, I forgot it soon after learning it. Studying it again this time gave me some understanding of how the different parts of computer science fit together. I also realized that material not being examined does not make it unimportant: if you do not learn it solidly, the debt always has to be repaid. There is still hard work ahead.

Reference documents

1. "In-Depth Understanding of Computer Systems" (the Chinese edition of "Computer Systems: A Programmer's Perspective"), PDF

2. Fellow student Shang's blog, for the diagram in the bus section and the knowledge summary of the two sections that follow (blog link: http://www.cnblogs.com/20135202yjx/p/4907828.html)

3. "Digital Logic Circuits" (for reference)
