Part IV: Storage Management

Last Update:2016-06-07 Source: Internet

Author: User


Chapter 10: File System Interface

File system: Provides the mechanism for online storage of, and access to, the data and programs of the operating system and of all its users. A file system consists of two parts: a collection of files and a directory structure.

10.1 File Concept

File: A uniform logical view of stored information provided by the operating system. The operating system abstracts from the physical properties of its storage devices to define a logical storage unit (the file) and then maps files onto physical devices.

A file is a named collection of related information that is recorded on secondary storage. From the user's perspective, a file is the smallest allotment of logical secondary storage.

10.1.1 File Attributes

File attributes:

Name
Identifier
Type
Location
Size
Protection
Time, date, and user ID

10.1.2 File Operations

A file is an abstract data type. The basic file operations are:

Create a file
Write a file
Read a file
Reposition within a file
Delete a file
Truncate a file

Information associated with an open file:

File pointer
File-open count
Disk location of the file
Access rights

File locks: Allow one process to lock a file and prevent other processes from gaining access to it.

Shared lock: Similar to a reader lock; several processes can acquire it concurrently.

Exclusive lock: Similar to a writer lock; only one process at a time can acquire it.

An operating system may provide either mandatory or advisory file-locking mechanisms.
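
The difference between shared and exclusive locks can be tried out with advisory locks. The following is a minimal sketch, assuming a Unix-like system where Python's `fcntl.flock` is available; the helper `try_lock` and the use of three separate opens of one temporary file to stand in for three processes are illustrative assumptions, not part of the original notes.

```python
import fcntl
import os
import tempfile


def try_lock(fd, exclusive):
    """Try to take an advisory lock without blocking; report success."""
    mode = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
    try:
        fcntl.flock(fd, mode | fcntl.LOCK_NB)
        return True
    except OSError:
        # The lock is held in a conflicting mode by another open file.
        return False


# Create a scratch file, then open it three times independently.
handle = tempfile.NamedTemporaryFile(delete=False)
handle.close()
path = handle.name
reader_a = os.open(path, os.O_RDONLY)
reader_b = os.open(path, os.O_RDONLY)
writer = os.open(path, os.O_RDWR)

# Two shared (reader) locks can be held at the same time.
shared_a = try_lock(reader_a, exclusive=False)
shared_b = try_lock(reader_b, exclusive=False)

# An exclusive (writer) lock is refused while readers hold the file.
exclusive_while_shared = try_lock(writer, exclusive=True)

# Once the readers release their locks, the writer can proceed.
fcntl.flock(reader_a, fcntl.LOCK_UN)
fcntl.flock(reader_b, fcntl.LOCK_UN)
exclusive_after_release = try_lock(writer, exclusive=True)

for fd in (reader_a, reader_b, writer):
    os.close(fd)
os.remove(path)
```

Because these locks are advisory, a process that never calls `flock` can still read or write the file; a mandatory scheme would have the operating system enforce the lock itself.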

10.2 Access Methods

1. Sequential access
2. Direct access (relative access): A file is made up of fixed-length logical records, which lets programs read and write records rapidly in no particular order. The direct-access method is based on a disk model of a file, since disks allow random access to any file block. Direct-access files are of great use for immediate access to large amounts of information.
3. Other access methods: Build an index for the file that contains pointers to the various blocks. To find a record, first search the index, then use the pointer to access the file directly and find the desired record.
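
As an illustration of direct and indexed access, here is a minimal sketch that stores fixed-length records in an in-memory file. The record length `RECORD_SIZE`, the helper names, and the sample records are assumptions chosen for the example.

```python
import io

RECORD_SIZE = 16  # assumed fixed record length, in bytes


def write_record(f, n, payload):
    """Overwrite record n in place, padding the payload to RECORD_SIZE."""
    if len(payload) > RECORD_SIZE:
        raise ValueError("payload longer than one record")
    f.seek(n * RECORD_SIZE)          # jump straight to record n
    f.write(payload.ljust(RECORD_SIZE, b"\x00"))


def read_record(f, n):
    """Read record n without touching the records before it."""
    f.seek(n * RECORD_SIZE)
    return f.read(RECORD_SIZE).rstrip(b"\x00")


f = io.BytesIO()
names = [b"alpha", b"bravo", b"charlie", b"delta"]
for number, name in enumerate(names):
    write_record(f, number, name)

# Indexed access: search the index first, then follow the pointer
# (here, a record number) directly into the file.
index = {name: number for number, name in enumerate(names)}
found = read_record(f, index[b"charlie"])
```

Sequential access would instead read records 0, 1, 2 in order to reach record 2; direct access computes the byte offset and seeks there in one step.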

10.3 Directory Structure

Directory overview: A directory can be viewed as a symbol table that translates file names into their directory entries.

Common schemes for defining the logical structure of a directory:

Single-level directory
Two-level directory: A separate directory is created for each user; that is, each user has his or her own user file directory (UFD). This solves the name-collision problem, but it still has a drawback: the structure effectively isolates one user from another. Isolation is an advantage when the users are completely independent, but a disadvantage when users want to cooperate on a task and access one another's files.
Tree-structured directory: The tree is the most common directory structure. The tree has a root directory, and every file in the system has a unique path name. Path names can be absolute or relative.
Acyclic-graph directory: Allows directories to share subdirectories and files. There are two ways to implement shared files: (1) as in UNIX, create a new directory entry called a link, which is a pointer to another file or subdirectory; (2) duplicate all information about the shared file in each sharing directory.
General graph directory: The main advantage of an acyclic graph is the relative simplicity of the algorithms needed to traverse the graph and to determine when there are no more references to a file. When cycles exist, self-referencing becomes possible, so a garbage-collection scheme is needed to determine when the last reference has been deleted.
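
To make the idea of absolute and relative path names concrete, the sketch below resolves both kinds of name against a small tree-structured directory. The nested-dictionary representation, the function `resolve`, and the sample tree are assumptions made for illustration only.

```python
def resolve(root, cwd, path):
    """Return the entry named by path.

    Directories are dicts, files are strings. cwd is the current
    directory given as a list of component names from the root.
    """
    # Absolute names start at the root, relative names at the
    # current directory.
    trail = [] if path.startswith("/") else list(cwd)
    for part in path.split("/"):
        if part in ("", "."):
            continue
        if part == "..":
            if trail:
                trail.pop()      # the parent of the root is the root
        else:
            trail.append(part)

    node = root
    for name in trail:
        if not isinstance(node, dict) or name not in node:
            raise FileNotFoundError(path)
        node = node[name]
    return node


tree = {
    "home": {
        "ann": {"notes.txt": "ann's notes"},
        "bob": {"notes.txt": "bob's notes"},
    }
}
ann_home = ["home", "ann"]
```

With this tree, `resolve(tree, ann_home, "notes.txt")` and `resolve(tree, ann_home, "/home/ann/notes.txt")` name the same file, while the two users can each keep a `notes.txt` without a name collision.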

10.5 File Sharing

10.5.1 Multiple Users

To implement sharing and protection, most systems have adopted the concepts of file (or directory) owner (or user) and group. The owner is the user who has the most control over the file and who can change its attributes and grant access. The group attribute defines a subset of users who can share access to the file.

10.5.2 Remote File Systems

Ways of sharing remote files:
1. Users transfer files between machines manually, using programs such as FTP.
2. Distributed file system (DFS): remote directories are visible from the local machine.
3. World Wide Web

FTP can be used for both anonymous and authenticated access. Anonymous access allows a user to transfer files without having an account on the remote system.
1. Client-server model
A remote file system allows a computer to mount one or more file systems from one or more remote machines. A common example is NFS (Network File System).
2. Distributed information systems
To make client-server systems easier to manage, distributed information systems, also known as distributed naming services, provide unified access to the information needed for remote computing. The Domain Name System (DNS) provides host-name-to-network-address translation for the entire Internet.
3. Failure modes

10.5.3 Consistency Semantics

Consistency semantics: An important criterion for evaluating any file system that supports file sharing.

UNIX semantics
Session semantics: used by the Andrew File System (AFS)
Immutable-shared-files semantics

10.6 Protection

10.6.1 Types of Access

Protection mechanisms provide controlled access by limiting the types of file access that can be made.

10.6.2 Access Control

The most common approach to the protection problem is to make access dependent on the identity of the user. The most general way to implement identity-dependent access is to associate an access-control list (ACL) with each file and directory, specifying user names and the types of access allowed for each user.
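
An access-control list can be pictured as a per-file table from user names to permitted access types. The sketch below is one minimal way to model that check; the file name, the users, and the function `is_allowed` are hypothetical.

```python
# Hypothetical ACLs: file name -> user name -> set of allowed access types.
acl = {
    "report.txt": {
        "ann": {"read", "write"},
        "bob": {"read"},
    },
}


def is_allowed(acl, filename, user, access):
    """Grant the request only if the file's ACL lists this access
    type for this user; anything not listed is denied."""
    return access in acl.get(filename, {}).get(user, set())
```

For example, `is_allowed(acl, "report.txt", "bob", "write")` is denied because bob's entry lists only read access, and a user with no entry at all is denied everything.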

Chapter 11: File System Implementation

The file system provides the mechanism for online storage of and access to file contents, including data and programs. File systems reside permanently on secondary storage, which is designed to hold a large amount of data permanently.

11.1 File System Structure

Disks provide most of the secondary storage on which file systems are maintained. Two characteristics make them a convenient medium for storing multiple files:
1. A disk can be rewritten in place.
2. A disk can directly access any block of information it contains.

To improve I/O efficiency, I/O transfers between memory and disk are performed in units of blocks. Each block has one or more sectors. Depending on the disk drive, sector size varies from 32 bytes to 4,096 bytes; the usual size is 512 bytes.

The file system is composed of several layers: application programs -> logical file system -> file-organization module -> basic file system -> I/O control -> devices

I/O control is the lowest level. It consists of device drivers and interrupt handlers that transfer information between main memory and the disk system.

Basic file system: Needs only to issue generic commands to the appropriate device driver to read and write physical blocks on the disk.

File-organization module: Knows about files and their logical blocks, as well as physical blocks.

Logical file system: Manages metadata, which includes all of the file-system structure except the actual data (the contents of the files). The logical file system maintains file structure through file-control blocks. A file-control block (FCB) contains information about the file, such as its owner, its permissions, and the location of the file contents.

11.2 File System Implementation

11.2.1 Overview

On-disk file system structures:

Boot control block (per volume): Contains the information needed by the system to boot an operating system from that volume. In UFS it is called the boot block; in NTFS (New Technology File System, the file system of Windows NT) it is called the partition boot sector.
Volume control block (per volume): Contains volume (or partition) details, such as the number of blocks in the partition, the size of the blocks, the free-block count, and free-block pointers. In UFS it is called the superblock; in NTFS it is stored in the master file table.
Directory structure (per file system): Used to organize the files. In UFS it includes the file names and the associated inode numbers; in NTFS it is stored in the master file table.

11.2.2 Partitions and Mounting

Disk array (RAID, redundant array of independent disks): A disk array combines many inexpensive disks into one large disk group, so that the disks working together improve the performance of the whole disk system. With this technique, data is split into many segments, which are stored across the individual drives.

A partition can be "raw", containing no file system, or "cooked", containing a file system. A raw disk is used where no file system is appropriate.

Root partition: Contains the operating-system kernel and sometimes other system files, and is mounted at boot time.

11.2.3 Virtual File Systems

Data structures and procedures are used to isolate the basic system-call functionality from the implementation details. The file system implementation therefore consists of three major layers:

First layer: the file-system interface, based on the open(), read(), write(), and close() calls and on file descriptors.

Second layer: the virtual file system (VFS), which serves two purposes:

It separates file-system-generic operations from their implementation by defining a clean VFS interface. Several implementations of the VFS interface may coexist on the same machine, allowing transparent access to different types of file systems mounted locally.
It provides a mechanism for uniquely representing a file throughout a network. The VFS is based on a file-representation structure called a vnode.

Third layer: the implementation of a specific file-system type or remote-file-system protocol.

The four main object types defined by the Linux VFS are:

Inode object: represents an individual file
File object: represents an open file
Superblock object: represents an entire file system
Dentry object: represents an individual directory entry
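
The way a VFS layer separates a generic interface from concrete implementations can be sketched with an abstract base class and a mount table. Everything here (the class names `FileSystem`, `LocalFS`, `RemoteFS`, and `VFS`, the single `read` operation, and the dictionary-backed storage) is a simplifying assumption and not the Linux VFS API.

```python
from abc import ABC, abstractmethod


class FileSystem(ABC):
    """The operation every concrete file system must implement."""

    @abstractmethod
    def read(self, path):
        ...


class LocalFS(FileSystem):
    """Hypothetical local file system backed by a dict."""

    def __init__(self, files):
        self.files = files

    def read(self, path):
        return self.files[path]


class RemoteFS(FileSystem):
    """Hypothetical remote file system; counts simulated requests."""

    def __init__(self, files):
        self.files = files
        self.requests = 0

    def read(self, path):
        self.requests += 1  # stands in for a network round trip
        return self.files[path]


class VFS:
    """Routes each generic call to whichever file system is mounted
    at the longest matching mount point."""

    def __init__(self):
        self.mounts = {}

    def mount(self, point, fs):
        self.mounts[point.rstrip("/")] = fs  # "/" is stored as ""

    def read(self, path):
        for point in sorted(self.mounts, key=len, reverse=True):
            if path == point or path.startswith(point + "/"):
                return self.mounts[point].read(path[len(point):] or "/")
        raise FileNotFoundError(path)


local = LocalFS({"/etc/motd": b"hello"})
remote = RemoteFS({"/data.txt": b"42"})
vfs = VFS()
vfs.mount("/", local)
vfs.mount("/mnt/remote", remote)
```

The caller uses the same `vfs.read(...)` call for `/etc/motd` and `/mnt/remote/data.txt`; only the mount table decides which implementation handles the request.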

11.3 Directory Implementation

The choice of directory-allocation and directory-management algorithms significantly affects the efficiency, performance, and reliability of the file system.

Linear list: The simplest method is a linear list of file names with pointers to the data blocks. It is simple to program but time-consuming to execute.
Hash table: The hash table takes a value computed from the file name and returns a pointer to the file name's entry in the linear list, which greatly decreases directory search time. Insertion and deletion are also fairly straightforward, although some provision must be made for collisions (situations in which two file names hash to the same location). Drawbacks: the table generally has a fixed size, and the hash function depends on that size.
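
A hashed directory with collision handling can be sketched as follows. The class `HashedDirectory`, its chained-bucket layout, and the deliberately simple byte-sum hash are assumptions chosen so that a collision is easy to reproduce; a real file system would use a different hash and on-disk layout.

```python
class HashedDirectory:
    """Directory entries (name, first block) stored in a fixed-size
    hash table; colliding names share a chain in the same slot."""

    def __init__(self, size=8):
        self.table = [[] for _ in range(size)]

    def _slot(self, name):
        # Toy hash. Note that it depends on the table size.
        return sum(name.encode("utf-8")) % len(self.table)

    def insert(self, name, first_block):
        chain = self.table[self._slot(name)]
        for i, (existing, _) in enumerate(chain):
            if existing == name:
                chain[i] = (name, first_block)  # replace existing entry
                return
        chain.append((name, first_block))

    def lookup(self, name):
        for existing, first_block in self.table[self._slot(name)]:
            if existing == name:
                return first_block
        return None

    def delete(self, name):
        slot = self._slot(name)
        before = len(self.table[slot])
        self.table[slot] = [e for e in self.table[slot] if e[0] != name]
        return len(self.table[slot]) != before


directory = HashedDirectory()
directory.insert("ab", 10)
directory.insert("ba", 20)  # same byte sum as "ab", so the names collide
```

A lookup only scans the one chain selected by the hash rather than the whole directory, which is where the reduction in search time comes from.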

11.4 Allocation Methods

Methods of allocating disk space:

Contiguous allocation: Requires each file to occupy a set of contiguous blocks on the disk. Contiguous allocation supports both sequential and direct access, but it suffers from external fragmentation and from the difficulty of estimating how much space a file will need.
Linked allocation: Solves all the problems of contiguous allocation. Each file is a linked list of disk blocks. The directory contains a pointer to the first and last blocks of the file, and each block contains a pointer to the next block. Advantages: no external fragmentation, no need to declare the file size, and no need to compact disk space. Disadvantages: first, it is effective only for sequential access and cannot efficiently support direct access. Second, the pointers require space; the usual remedy is to collect blocks into clusters and allocate clusters rather than blocks, which improves disk throughput but increases internal fragmentation. Third, reliability: a pointer may be lost or damaged, which can be partially addressed with doubly linked lists.
A variation on linked allocation is the file-allocation table (FAT), used by the MS-DOS and OS/2 operating systems. A section of disk at the beginning of each volume is set aside for the FAT. The table has one entry for each disk block and is indexed by block number.
Indexed allocation: Solves the direct-access problem by bringing all the pointers together into one location. Each file has its own index block, which is an array of disk-block addresses. Indexed allocation supports direct access without external fragmentation, but it wastes space because every file needs an index block.
Mechanisms for handling the size of the index block:
- Linked scheme: An index block is normally one disk block, so it can be read and written directly by itself. To allow for large files, several index blocks can be linked together.
- Multilevel index: A variant of the linked representation uses a first-level index block to point to a set of second-level index blocks, which in turn point to the file blocks.
- Combined scheme: In UFS, the first 15 pointers of the index block are kept in the file's inode. The first 12 of these point to direct blocks; that is, they contain the addresses of blocks that hold file data. A small file of no more than 12 blocks therefore needs no separate index block. The other 3 pointers point to indirect blocks: the first is the address of a single indirect block, the second a double indirect block, and the last a triple indirect block.
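
The three allocation methods differ mainly in how the i-th logical block of a file is translated to a disk block. The sketch below contrasts them under simplifying assumptions: the FAT is modeled as a dictionary, `END_OF_FILE` is a made-up sentinel, and the block numbers are sample values.

```python
END_OF_FILE = -1  # assumed sentinel marking the last block in a chain


def contiguous_block(start, i):
    """Contiguous allocation: one addition, so direct access is cheap."""
    return start + i


def linked_block(fat, start, i):
    """Linked (FAT) allocation: follow i pointers from the first block."""
    block = start
    for _ in range(i):
        block = fat[block]
        if block == END_OF_FILE:
            raise IndexError("logical block beyond end of file")
    return block


def indexed_block(index_block, i):
    """Indexed allocation: a single lookup in the file's index block."""
    return index_block[i]


# A sample file stored in disk blocks 9 -> 16 -> 1 -> 10 -> 25.
fat = {9: 16, 16: 1, 1: 10, 10: 25, 25: END_OF_FILE}
index_block = [9, 16, 1, 10, 25]
```

Finding logical block 3 takes three pointer hops with `linked_block(fat, 9, 3)` but one array access with `indexed_block(index_block, 3)`; both give disk block 10.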

11.5 Free Space Management

The free-space list can be implemented in several ways:

Bit vector: the free-space list is implemented as a bit map, also called a bit vector
Linked list
Grouping
Counting
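
For the bit-vector approach, finding a free block amounts to skipping fully allocated words and then locating the first set bit. This is a minimal sketch assuming one bit per block, with 1 meaning free, 0 meaning allocated, and the most significant bit of each byte standing for the lowest-numbered block; the function name is an assumption.

```python
def first_free_block(bitmap):
    """Return the number of the first free block, or -1 if none is free."""
    for byte_index, byte in enumerate(bitmap):
        if byte == 0:
            continue  # all eight blocks in this byte are allocated
        for offset in range(8):
            if byte & (0x80 >> offset):
                # block number = bits per byte * bytes skipped + bit offset
                return byte_index * 8 + offset
    return -1


# Blocks 2-5, 8-13, 17-18, and 25-27 are free in this sample disk.
sample = bytes([0b00111100, 0b11111100, 0b01100000, 0b01110000])
```

The bit vector is compact and makes it easy to find runs of consecutive free blocks, but it is efficient only if the whole vector can be kept in main memory.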

11.9 NFS

NFS is both an implementation and a specification of a software system for accessing remote files across a local area network (or even a wide area network).
One of the design goals of NFS is to operate in a heterogeneous environment of different machines, operating systems, and network architectures. The NFS specification is independent of these, which allows other implementations. RPC (remote procedure call): a protocol for requesting a service from a program on a remote computer over a network without needing to know the underlying network technology.

The NFS specification distinguishes between two types of services:
1. Services provided by a mount mechanism (the mount protocol)
2. Remote file access (the NFS protocol)

11.9.2 The Mount Protocol

Mount protocol: Establishes the initial logical connection between a client and a server.

Chapter 12: Mass-Storage Structure

12.1 Overview of Mass-Storage Structure

12.1.1 Magnetic Disks

Magnetic disks: Provide the bulk of secondary storage for modern computer systems.

A disk drive is attached to a computer by a set of wires called an I/O bus. Several kinds of buses are available, including EIDE (enhanced integrated drive electronics), ATA (advanced technology attachment), serial ATA (SATA),
USB (universal serial bus).

12.6 Swap Space Management

Swap-space management: Virtual memory uses disk space as an extension of main memory. The main goal in designing and implementing swap space is to provide the best throughput for the virtual memory system.

12.6.1 Swap-Space Use

Swap space can be used to hold an entire process image, including the code and data segments.
A paging system may instead use swap space only to store pages that have been pushed out of main memory.

12.6.2 Swap-Space Location

Swap space can reside in one of two places:
1. Within the normal file system: simple to implement but inefficient, because navigating the directory structure and the disk-allocation data structures takes time and extra disk accesses.
2. In a separate raw disk partition: a separate swap-space storage manager is used to allocate and deallocate the blocks.

12.7 RAID Structure

RAID (redundant array of independent disks): A disk array combines many inexpensive disks into one large disk group, so that the disks working together improve the performance of the whole disk system. Data is split into segments that are spread across the individual drives, and the same data can also be stored redundantly in different places on multiple drives. Placing data on multiple drives lets input and output operations overlap in a balanced way, which improves performance. Because an array of several drives is more likely than a single drive to experience a drive failure, storing data redundantly is what gives the array its fault tolerance.

Reliability is improved through redundancy.
Performance is improved through parallelism.

12.7.3 RAID Levels

Mirroring provides high reliability but is expensive; striping provides high data-transfer rates but does not improve reliability.

Disk striping combined with parity bits yields a variety of schemes that provide redundancy at lower cost.

RAID level 0: Disk arrays with striping at the level of blocks but without any redundancy.
RAID level 1: Disk mirroring.
RAID level 2: Memory-style error-correcting-code (ECC) organization. Memory systems have long detected errors by using parity bits.
RAID level 3: Bit-interleaved parity organization, an improvement on level 2. Unlike memory systems, disk controllers can detect whether a sector has been read correctly, so a single parity bit can be used for error correction as well as for detection.
RAID level 4: Block-interleaved parity organization.
RAID level 5: Block-interleaved distributed parity. Data and parity are spread among all N+1 disks.
RAID level 6: P+Q redundancy scheme, which stores extra redundant information to guard against multiple disk failures. Instead of parity, error-correcting codes such as Reed-Solomon codes are used: 2 bits of redundant data are stored for every 4 bits of data, and the system can tolerate two disk failures.
RAID level 0+1: Stripe first, then mirror.
RAID level 1+0: Mirror first, then stripe.
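
The parity used by RAID levels 3 to 5 can be illustrated with bytewise XOR: the parity block is the XOR of the data blocks, and any single lost block is the XOR of everything that survives. The sketch below assumes equal-length blocks; the function names and sample data are illustrative.

```python
from functools import reduce


def parity(blocks):
    """Bytewise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(surviving_blocks, parity_block):
    """Recover the one missing data block from the survivors and parity."""
    return parity(list(surviving_blocks) + [parity_block])


# Three data blocks, each on its own (hypothetical) disk.
data = [b"disk", b"zero", b"four"]
p = parity(data)

# Suppose the disk holding data[1] fails.
recovered = rebuild([data[0], data[2]], p)
```

This works because XOR-ing a value with itself cancels it out, so XOR-ing the parity with the surviving blocks leaves exactly the missing block. One parity block protects against a single disk failure only, which is why level 6 stores additional redundancy.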

Chapter 13: I/O Systems

The two main jobs of a computer are I/O and processing.

13.2 I/O Hardware

Interrupts: The basic interrupt mechanism works as follows. The CPU hardware has a wire called the interrupt-request line (IRL), which the CPU senses after executing every instruction. When the CPU detects that a controller has asserted a signal on the interrupt-request line, it saves its current state and jumps to the interrupt-handler routine at a fixed address in memory. The interrupt handler determines the cause of the interrupt, performs the necessary processing, restores the state, and executes a return-from-interrupt instruction to return the CPU to the execution state it was in before the interrupt.

The interrupt mechanism accepts an address, which selects a specific interrupt-handling routine from a small set. In most architectures, this address is an offset in a table called the interrupt vector, which contains the memory addresses of specialized interrupt handlers. The purpose of a vectored interrupt mechanism is to reduce the need for a single interrupt handler to search all possible sources of interrupts to determine which one needs service. With interrupt chaining, each element in the interrupt vector points to the head of a list of interrupt handlers.

The interrupt mechanism also implements a system of interrupt priority levels. This lets the CPU defer the handling of low-priority interrupts without masking all interrupts, and lets a high-priority interrupt preempt the handling of a low-priority interrupt.
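
Vectored dispatch with interrupt chaining can be modeled in a few lines. This is a software analogy only, not real interrupt hardware: the vector is a list indexed by interrupt number, each entry is a chain of handlers, and the device names and `pending` flags are hypothetical.

```python
def make_vector(size):
    """One chain (list) of handlers per interrupt number."""
    return [[] for _ in range(size)]


def register(vector, number, handler):
    vector[number].append(handler)


def dispatch(vector, number):
    """Index the vector with the interrupt number, then walk only that
    chain until a handler reports that its device needed service."""
    for handler in vector[number]:
        result = handler()
        if result is not None:
            return result
    return None


# Two hypothetical devices share interrupt number 3.
pending = {"disk": False, "nic": True}


def disk_handler():
    if pending["disk"]:
        pending["disk"] = False
        return "disk serviced"
    return None


def nic_handler():
    if pending["nic"]:
        pending["nic"] = False
        return "nic serviced"
    return None


vector = make_vector(8)
register(vector, 3, disk_handler)
register(vector, 3, nic_handler)
outcome = dispatch(vector, 3)
```

Only the handlers chained at entry 3 are polled, rather than every device in the system, which is the saving the vectored mechanism is meant to provide.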

Direct memory access (DMA) controller: For a device that transfers large amounts of data, it is wasteful to use the general-purpose CPU to watch status bits and to feed data into a controller register one byte at a time. This work is offloaded to a special-purpose processor, the DMA controller. To start a DMA transfer, the host writes a DMA command block into memory. The block contains a pointer to the source of the transfer, a pointer to the destination of the transfer, and a count of the number of bytes to be transferred. The CPU writes the address of this command block to the DMA controller and then goes on with other work. The DMA controller proceeds to operate the memory bus directly, placing addresses on the bus to perform the transfer without the help of the main CPU.
Hardware differences are hidden.

13.3 Application I/O Interface

The purpose of the device-driver layer is to hide the differences among device controllers from the I/O subsystem of the kernel, much as the I/O system calls encapsulate the behavior of devices in a few generic classes that hide hardware differences from applications.

13.4 Kernel I/O Subsystem

The kernel provides many services related to I/O. Several of them (scheduling, buffering, caching, spooling, device reservation, and error handling) are provided by the kernel's I/O subsystem and build on the hardware and device-driver infrastructure. The I/O subsystem is also responsible for protecting itself from errant processes and malicious users.

I/O scheduling: The operating system implements scheduling by maintaining a wait queue of requests for each device. A kernel that supports asynchronous I/O must also be able to keep track of many I/O requests at the same time; for this purpose the operating system may attach the wait queue to a device-status table.
Buffering: A buffer is a memory area that stores data being transferred between two devices or between a device and an application. Buffering is done for three reasons: (1) to cope with a speed mismatch between the producer and the consumer of a data stream, (2) to adapt between devices that have different data-transfer sizes, and (3) to support copy semantics for application I/O.
Caching: A cache is a region of fast memory that holds copies of data.
Spooling and device reservation: A spool is a buffer that holds output for a device, such as a printer, that cannot accept interleaved data streams. It lets several applications issue output concurrently without their output being mixed together.
Error handling: An I/O system call generally returns one bit of information about the status of the call, signifying either success or failure.
I/O protection: To prevent users from performing illegal I/O, all I/O instructions are defined to be privileged instructions, so users cannot issue them directly and must go through the operating system. The operating system, executing in monitor mode, checks that the request is valid. In addition, all memory-mapped and I/O port memory locations must be protected from user access by the memory-protection system.

13.5 Transforming I/O Requests to Hardware Operations

Modern operating systems obtain the port address or memory-mapped address of the device controller through multiple stages of lookup tables on the path between a request and the physical device controller.


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
