[Modern Operating Systems notes] [Chapter 4: File Systems]

Source: Internet
Author: User
Tags: file system

Chapter 4 File Systems

4.1 Files

This section studies files from the user's perspective: how users work with files and what properties files have.

4.1.1 File naming

A file is an abstraction mechanism that provides a way to store information on disk and read it back later.

    • Some file systems are case-sensitive; others are not.

      • UNIX is case-sensitive; MS-DOS is not.
    • Windows file systems: FAT-16, FAT-32, and NTFS.

      • FAT-16 (File Allocation Table): Windows 95
      • FAT-32: Windows 98
      • NTFS (New Technology File System): the file system of all later Windows versions
4.1.2 File Structure

Files can be structured in many ways. Three common ones are listed below.

    • Byte sequence

      • The operating system sees only bytes; any meaning is imposed by user programs.
      • Both UNIX and Windows use this approach.
    • Record sequence

      • Reads and writes both operate in units of one record.
      • This model dates from the time when 80-column punched cards were mainstream.

        • 80 characters = one record.
      • No current general-purpose system works this way.

    • Tree

      • Similar to a map of key-value pairs, with a balanced tree maintaining the keys.
        • This makes it fast to find a particular key.
      • Popular on some large mainframe computers used for commercial data processing.
4.1.3 File types

Regular files: contain user information.

    • ASCII files

      • Consist of lines of text.

        • On UNIX systems each line ends with a line feed character ( \n ).
        • On Windows systems each line ends with a carriage return ( \r ) followed by a line feed ( \n ).
      • Advantages

        • Can be displayed and printed as is, and edited with any text editor.
        • Can serve as the input or output of other programs.
    • Binary files
      • Have some internal structure that only the programs using the file understand.

Directory: A system file that manages the structure of a file system.

4.1.4 File access
    • Sequential access

      • Early operating systems provided only this kind of access.
        • For example, to read the 7th byte of adsadasdsaxdas, you must read from the beginning until you reach the 7th byte.
        • Very inefficient, but well suited to the way magnetic tape works. It has now largely been superseded.
    • Random access files

      • The bytes or records of the file can be read in any order.
      • There are two ways to specify where to start reading.
        • First: every read operation gives the position in the file at which to start reading.
        • Second: a current file position is maintained, and a seek operation sets it.
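
The two ways of specifying the read position can be sketched as follows. This is a minimal illustration, assuming a POSIX system (os.pread is not available everywhere); the function names and the temporary file are made up for the example.

```python
import os
import tempfile

def read_at_explicit_offset(path, offset, count):
    """Way 1: the position is passed along with the read itself (POSIX pread)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.pread(fd, count, offset)
    finally:
        os.close(fd)

def read_after_seek(path, offset, count):
    """Way 2: set the current file position with seek, then read from there."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(count)

def demo():
    # Write the sample string from the notes to a scratch file.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"adsadasdsaxdas")
        path = tmp.name
    try:
        # The 7th byte sits at offset 6 because offsets start at 0.
        return read_at_explicit_offset(path, 6, 1), read_after_seek(path, 6, 1)
    finally:
        os.remove(path)
```

Either way, the 7th byte is reached directly, without reading the six bytes in front of it.
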
4.1.5 File Attributes

Besides its name and data, every file has associated information such as its creation date and size. These items are called the file's attributes; some people call them metadata.

    • The first four attributes (in the textbook's attribute table) relate to file protection.
    • Flags
    • Time fields
    • Size fields
4.1.6 File Operations
    • create
    • delete
    • open

      • Load the file attributes and the disk address table into memory.
    • close

      • Close the file to free the internal table space.
    • read

    • write
    • append
    • seek
    • get attributes
    • set attributes
    • rename
4.1.7 An Example Program Using File System Calls

Running copyfile abc xyz copies the file abc to xyz.
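
The notes do not reproduce the program itself. As a rough sketch of what such a copy looks like when built on open, read, write, and close, here is a Python version using the thin system-call wrappers in the os module. The buffer size and the output file mode are assumptions made for this example, not details from the textbook.

```python
import os

BUF_SIZE = 4096  # bytes per read; an arbitrary choice for this sketch

def copyfile(src, dst):
    """Copy src to dst with open/read/write/close-style calls."""
    in_fd = os.open(src, os.O_RDONLY)
    try:
        out_fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        try:
            while True:
                chunk = os.read(in_fd, BUF_SIZE)
                if not chunk:              # an empty read means end of file
                    break
                view = memoryview(chunk)
                while view:                # os.write may write fewer bytes than asked
                    written = os.write(out_fd, view)
                    view = view[written:]
        finally:
            os.close(out_fd)
    finally:
        os.close(in_fd)
```

Calling copyfile("abc", "xyz") then corresponds to the command line shown above.
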

4.2 Directories

A directory is itself a file.

4.2.1 Single-Level Directory Systems

The simplest form of directory system is a single directory that contains all the files.

    • Used in devices such as telephones, digital cameras, and some portable music players.
4.2.2 Hierarchical Directory Systems

4.2.3 Path Names
    • Absolute path

      • Must begin with the separator character
        • Windows: \
        • UNIX: /
        • MULTICS: >
    • Relative path

      • Always interpreted relative to the working directory.
      • A few special symbols
        • .: the current directory
        • ..: the parent directory
        • ~: your own home directory (Linux)
        • ~user: the home directory of user (Linux)
        • -: the most recently visited directory (Linux)
4.2.4 Directory Operations
    • create
    • delete
    • opendir
    • closedir
    • readdir
    • rename
    • link

      • Linking allows the same file to appear in more than one directory.
      • Increments the counter in the file's i-node (which records how many directory entries contain the file).
      • Sometimes called a hard link.
        • A hard link consumes no extra disk space for file data.
        • Hard links can only be made to files and cannot span partitions.
        • A hard link is indistinguishable from the original name; both share one i-node number.
        • All hard links are completely equal: each one refers to the real file rather than being a shortcut.
    • unlink

      • Removes a directory entry (link) for the file.
      • In UNIX, the system call for deleting a file is in fact unlink.
    • Symbolic link: similar to a shortcut; also called a soft link

      • Can cross disk boundaries
      • Slower to resolve
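
The difference between the two kinds of link can be observed with a small experiment. The sketch below assumes a UNIX-like system on which hard and symbolic links are permitted; the file names are invented for the example.

```python
import os
import tempfile

def link_demo():
    """Create one file, a hard link to it, and a symbolic link to it."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "original.txt")
        hard = os.path.join(d, "hard.txt")
        soft = os.path.join(d, "soft.txt")
        with open(original, "w") as f:
            f.write("data")

        os.link(original, hard)      # new directory entry, same i-node
        os.symlink(original, soft)   # small file that stores a path name

        same_inode = os.stat(original).st_ino == os.stat(hard).st_ino
        links_before = os.stat(original).st_nlink

        os.unlink(original)          # removes one name, not the data
        with open(hard) as f:
            hard_still_readable = f.read() == "data"
        links_after = os.stat(hard).st_nlink
        soft_dangling = not os.path.exists(soft)  # exists() follows the link
        return (same_inode, links_before, hard_still_readable,
                links_after, soft_dangling)
```

After the original name is unlinked, the hard link still reaches the data because the i-node's link count has only dropped from 2 to 1, while the symbolic link is left pointing at a path that no longer exists.
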
4.3 Implementation of the file system

This section looks at files from the implementer's perspective.

4.3.1 File system layout

File systems are stored on disks.

    • Most disks are divided into one or more partitions.

      • Each partition has an independent file system.
    • Sector 0 of the disk is called the Master Boot Record (MBR) and is used to boot the computer.

    • The end of the MBR contains the partition table.

      • The table gives the starting and ending addresses of each partition.
      • One partition in the table is marked as the active partition.

When the computer is booted:

    • The BIOS reads in and executes the MBR.
    • The first thing the MBR program does is locate the active partition, read in its first block, called the boot block, and execute it.
    • The program in the boot block loads the operating system contained in that partition.
      • For uniformity, every partition starts with a boot block, even if it does not contain an operating system.

Beyond the boot block, the layout of a partition varies from file system to file system. One possible layout is as follows.

    • First comes the superblock

      • The superblock contains all the key parameters of the file system.

        • A magic number that identifies the file system type.
        • The number of blocks in the file system.
        • Other key administrative information.
      • The superblock is read into memory when the computer is booted or when the file system is first used.

    • Information about the free blocks in the file system.

      • Can be given as a bitmap or as a list of pointers.
    • Possibly followed by the i-nodes
      • An array of data structures, one per file.
      • An i-node tells everything about the file.
    • The root directory
    • All the other files and directories
4.3.2 Implementing Files

The key issue in implementing file storage:

    • Keeping track of which disk blocks belong to which file.
1. Contiguous allocation

This is the same model as dynamic storage allocation with malloc. (For malloc, however, the shortcomings below are tolerable.)

Advantages:

    • Simple to implement

      • Only the disk address of the first block and the number of blocks in the file need to be recorded.
    • Excellent read performance

      • Only one seek is needed (to the first block); after that there are no more seeks or rotational delays.
      • Data come in at the full bandwidth of the disk.

Disadvantages:

    • Over time the disk becomes externally fragmented, so a list of free holes must be maintained in order to place new files.

      • But to pick a suitable hole from that list, the final size of the file must be known.

    • Most of the time (for a text document, say) the final size of a file is not known in advance, so this is a serious obstacle.

Usage scenarios

    • Cases where file sizes are known in advance and never change during later use:

      • CD-ROM file systems
      • DVD file systems

Years ago, contiguous allocation was actually used on magnetic-disk file systems because of its simplicity and high performance. It was later abandoned because of the nuisance of having to specify the final file size at creation time. But with the advent of CD-ROMs, DVDs, and other write-once optical media, contiguous allocation suddenly became a good idea again. It is therefore important to study old systems and ideas that were conceptually clean and simple, because they may be applicable to future systems in surprising ways.

2. Linked-list allocation

Advantages:

    • Every disk block can be used; no space is lost to external fragmentation.
    • The directory entry only needs to store the address of the first block.

Disadvantages:

    • Random access costs O(n) block reads, which is too slow.
    • The pointer takes up a few bytes of each block, so the amount of data stored in a block is no longer a power of two.

      • Programs read and write in blocks whose size is a power of two, so reading one full block of data may require stitching together parts of two disk blocks, which slows things down.

3. Linked-list allocation using a table in memory

Take the pointer word out of each disk block and put it in a table in memory. This table is called a File Allocation Table (FAT).

Advantages:

    • It removes both disadvantages of the linked list above.
      • Random access: the chain must still be followed, but it is entirely in memory, so even O(n) is acceptable.
      • The entire block is available for data, so the data per block is again a power of two.

Disadvantages:

    • For a 200-GB disk with 1-KB blocks, the table needs 200 million entries. At 3 to 4 bytes per entry, the table occupies 600 MB to 800 MB of memory, which is not very practical.

    • Therefore, the FAT approach does not scale well to large disks.
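
To make the table idea concrete, here is a minimal sketch of following a chain through an in-memory table, together with the size arithmetic from the disadvantage above. The table contents, the end-of-chain marker, and the function names are hypothetical; a real FAT is an array indexed by block number, not a Python dict.

```python
END_OF_CHAIN = -1  # marker chosen for this sketch

def file_blocks(fat, first_block):
    """Follow the chain in the in-memory table and list a file's blocks."""
    blocks = []
    block = first_block
    while block != END_OF_CHAIN:
        blocks.append(block)
        block = fat[block]
    return blocks

def nth_block(fat, first_block, n):
    """Random access: still n steps, but every step is a memory lookup."""
    block = first_block
    for _ in range(n):
        block = fat[block]
        if block == END_OF_CHAIN:
            raise IndexError("file has fewer blocks than requested")
    return block

def fat_size_bytes(disk_bytes, block_bytes, entry_bytes):
    """Memory needed for a table with one entry per disk block."""
    return (disk_bytes // block_bytes) * entry_bytes

# Hypothetical table: a file starts at block 4 and continues 4 -> 7 -> 2 -> 10.
fat = {4: 7, 7: 2, 2: 10, 10: END_OF_CHAIN}
```

With decimal units, fat_size_bytes(200 * 10**9, 10**3, 4) gives 800 million bytes, and 3-byte entries give 600 million, which matches the 600 MB to 800 MB figure.
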

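4. I-nodes

Each file is given a data structure called an i-node (index-node)

    • which lists the file's attributes and the disk addresses of the file's blocks.

Advantages

    • An i-node needs to be in memory only while its file is open.

      • The memory required is independent of the total disk size; it depends only on the number of open files.
    • The addressing scheme resembles the multilevel page tables of a virtual address space.

      • The addresses in the i-node play the role of the first-level page table.
      • The lowest-level addresses are the actual disk addresses.
      • Easy to index and economical with space.

Disadvantages: apparently none.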
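
The page-table analogy can be illustrated with a toy i-node that holds a few direct block addresses plus one single-indirect block. This is only a sketch: the field names, the number of direct addresses, and the dict standing in for the disk are assumptions for the example, not the layout of any particular file system.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class Inode:
    size: int                                        # one of the file attributes
    direct: List[int] = field(default_factory=list)  # addresses of the first blocks
    single_indirect: Optional[int] = None            # block that holds more addresses

def disk_address(inode: Inode, disk: Dict[int, List[int]], n: int) -> int:
    """Return the disk address of the file's n-th block (0-based)."""
    if n < len(inode.direct):
        return inode.direct[n]
    if inode.single_indirect is None:
        raise IndexError("block index beyond end of file")
    pointers = disk[inode.single_indirect]  # one extra disk read in a real system
    return pointers[n - len(inode.direct)]
```

Small files are resolved from the i-node alone; only larger files pay for the extra level, much as a multilevel page table only allocates second-level tables where they are needed.
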

4.3.3 Implementing Directories

Directory entries

To open a file, the operating system uses the path name supplied by the user to locate the corresponding directory entry.

    • The directory entry provides the information needed to find the file's disk blocks.

      • What it holds depends on the system
        • The disk address of the whole file plus its length (contiguous allocation)
        • The address of the first block (linked-list schemes)
        • The i-node number
    • In every case, the main function of the directory system is to map the ASCII name of a file onto the information needed to locate its data.

File attributes

A closely related question is where to store the file attributes.

    • One obvious option is to store the attributes directly in the directory entry. (Windows does this.)
    • Systems that use i-nodes can store the attributes in the i-node rather than in the directory entry.
File name length

Modern operating systems support long, variable-length file names. There are several ways to implement them.

    • First, and simplest: reserve a fixed 255 bytes for every file name. This wastes too much space and is not considered further.

    • Second: the scheme of Fig. 4-15(a) in the textbook. The third approach beats it outright, so it is not discussed here.
    • Third: keep the file names in a heap.

      • Each directory entry itself has a fixed length.
      • The heap is maintained much as malloc maintains memory.
      • Disadvantages

        • A directory entry and its name may be spread over multiple pages,

          so a page fault may occur while a file name is being processed.

Looking up file names
    • Using a hash table

      • Hash tables are familiar territory for me, so I will not go into them here.
    • Using a cache

4.3.4 Sharing files

With shared files, the file system becomes a directed acyclic graph (DAG) rather than a tree.

There are two ways to share a file (links):

    • Hard links

      • Directory entries point to the file's i-node, which keeps a count of them.

      • The file is really deleted only when the count drops to 0.

    • Soft links (symbolic links)

      • Create a new file of type LINK and enter it in the directory of B, the user who is linking to the file.

        • The LINK file stores the path name of the file it is linked to.
      • Accessing the file through the link incurs extra overhead.

Problems with copying

If links are not preserved when copying ( cp without -d ):

    • Symbolic link: the file it points to is copied.

    • Hard link: the file is copied, and the copy gets its own i-node with a fresh link count.

4.3.5 Log-Structured File Systems (I could not read this part through; the translation of the book is too poor)

The main motivations for designing the log-structured file system (LFS) are:

    • CPUs keep getting faster, RAM keeps getting larger, and disk caches keep growing.
      • A large fraction of read requests can be satisfied from the cache with no disk access at all.
      • So in the future, most disk accesses will be writes.
        • Writes are often small and scattered.
        • A 50-µs disk write is often preceded by a 10-ms seek and a 4-ms rotational delay, so disk efficiency drops to well below 1% (50 µs out of about 14 ms).

Where do the small writes come from? Suppose a new file is created. The following must be written:

    • The directory's i-node and the directory block.
    • The file's i-node and the file itself.

These writes can be delayed, but if the system crashes before they have all been done,

    • the file system can be left seriously inconsistent.
4.3.6 Journaling File Systems

Keep a log (journal) of what the system is about to do.

    • After a crash, the system can look in the log and finish the job.

    • Microsoft's NTFS, ext3, and ReiserFS are all journaling file systems.
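
The log-then-act idea can be sketched with a toy model of removing a file, which takes three steps: remove the directory entry, release the i-node, and free the disk blocks. Everything here is hypothetical (the dict-based state, the operation names, the simulated crash). The one property the sketch relies on is that each logged operation is idempotent, so replaying it after a crash does no harm.

```python
def apply(state, op):
    """Apply one idempotent operation to the toy file system state."""
    kind, name = op
    if kind == "remove_dir_entry":
        state["directory"].discard(name)
    elif kind == "free_inode":
        state["inodes"].discard(name)
    elif kind == "free_blocks":
        state["blocks"].pop(name, None)

def remove_file(state, journal, name, crash_after=None):
    """Log all steps first, then carry them out; optionally 'crash' part-way."""
    ops = [("remove_dir_entry", name), ("free_inode", name), ("free_blocks", name)]
    journal.extend(ops)                      # 1. write the log entry
    for done, op in enumerate(ops):
        if crash_after is not None and done == crash_after:
            return False                     # simulated crash: work left unfinished
        apply(state, op)                     # 2. perform the operations
    journal.clear()                          # 3. erase the log entry
    return True

def recover(state, journal):
    """After a reboot, redo everything in the log; redoing is harmless."""
    for op in journal:
        apply(state, op)
    journal.clear()
```

If remove_file is interrupted after the first step, the directory entry is gone but the i-node and blocks are still allocated; calling recover replays the log and brings the state back to a consistent one.
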

4.3.7 Virtual File Systems (the book's translation is too poor; I could not read to the end)

Many different file systems are in use, often on the same computer and even under the same operating system.

Windows

Windows handles these different file systems by identifying each with a different drive letter, such as C: , D: , and so on.

The drive letter is given explicitly, so Windows knows which file system to pass a request to.

There is no attempt to integrate the different file systems into a unified whole.

UNIX

In contrast, all modern UNIX systems make a very serious attempt to integrate multiple file systems into a single structure.

Most UNIX operating systems use a virtual file system (VFS) to unify them.
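
The core idea of a VFS, one common interface in front of several concrete file systems, can be sketched as below. This is not how any real kernel structures its VFS; the class names, the single read operation, and the mount-by-prefix rule are simplifications invented for illustration.

```python
from abc import ABC, abstractmethod

class FileSystem(ABC):
    """Interface every concrete file system must implement (hypothetical)."""
    @abstractmethod
    def read(self, path: str) -> bytes: ...

class DictFS(FileSystem):
    """Toy file system backed by a dict; stands in for ext2, FAT, and so on."""
    def __init__(self, files):
        self.files = files

    def read(self, path):
        return self.files[path]

class VFS:
    """Routes each path to the file system mounted at the longest matching prefix."""
    def __init__(self):
        self.mounts = {}

    def mount(self, mount_point: str, fs: FileSystem):
        self.mounts[mount_point.rstrip("/") or "/"] = fs

    def read(self, path: str) -> bytes:
        for mount_point in sorted(self.mounts, key=len, reverse=True):
            prefix = mount_point if mount_point == "/" else mount_point + "/"
            if path.startswith(prefix):
                # Hand the concrete file system a path relative to its own root.
                return self.mounts[mount_point].read("/" + path[len(prefix):])
        raise FileNotFoundError(path)
```

A caller only ever talks to the VFS object and uses one path namespace, which is the contrast with the drive-letter approach described for Windows.
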

4.4 File System Management and Optimization

4.4.1 Disk space management

Nearly all file systems chop files up into fixed-size blocks that need not be adjacent (compare paging).

Block size

Once it has been decided to store files in fixed-size blocks, the question arises of how big a block should be.

Block too large (poor space utilization)

    • Small files waste a lot of disk space; internal fragmentation is severe.

Block too small (poor time efficiency)

    • Most files span multiple blocks, so reading a file needs multiple seeks and rotational delays, and performance suffers.

Conclusions

    • Historically, file systems have chosen block sizes in the range of 1 KB to 4 KB. (Disks were small in those days, so space utilization mattered more.)

    • With disks now exceeding 1 TB, it may be better to raise the block size to 64 KB and accept the wasted disk space in exchange for better time performance.

      • Disk space is hardly in short supply any more.
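
The trade-off can be made concrete with a little arithmetic. The helper names below are made up for this sketch, and the file sizes are arbitrary examples.

```python
def blocks_needed(file_size, block_size):
    """Number of fixed-size blocks needed to hold file_size bytes."""
    return (file_size + block_size - 1) // block_size  # integer ceiling

def wasted_bytes(file_size, block_size):
    """Internal fragmentation: allocated space that holds no data."""
    return blocks_needed(file_size, block_size) * block_size - file_size

KB = 1024
```

A 5-KB file fits exactly into five 1-KB blocks with nothing wasted, but needs five separate block reads. With 64-KB blocks it takes a single read, at the cost of 59 KB of internal fragmentation.
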
Keeping track of free blocks

    • Linked list of disk blocks

      • The free list is usually stored in the free blocks themselves.
      • Only one block of pointers is kept in memory.
        • When a file is created, the blocks it needs are taken from the pointer block.
          • When the pointer block runs out, a new block of pointers is read in from disk.
        • When a file is deleted, its blocks are freed and added to the pointer block in memory.
          • When the pointer block fills up, it is written to disk.
        • Avoiding unnecessary I/O (this helps both allocation and freeing)
          • Keep the pointer block in memory about half full, and keep most of the pointer blocks on disk full.
    • Bitmap

      • A disk with n blocks requires a bitmap with n bits.

        • Free blocks are represented by 1
        • Allocated blocks are represented by 0
      • Bitmaps require little space.

      • A hidden benefit: blocks tend to be allocated close together, which reduces disk arm movement.

      • Caching

        • Since the bitmap is a fixed-size data structure, a kernel that supports paging

          can put it in virtual memory and have its pages brought in as needed.
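
A bitmap of this kind can be sketched in a few lines. The class and method names are invented for the example, and the first-fit search is just the simplest possible allocation policy; the bit convention follows the notes above (1 = free, 0 = allocated).

```python
class FreeBlockBitmap:
    """One bit per disk block: 1 = free, 0 = allocated."""
    def __init__(self, n_blocks):
        self.n_blocks = n_blocks
        self.bits = bytearray([0xFF] * ((n_blocks + 7) // 8))  # all blocks free

    def is_free(self, block):
        return bool(self.bits[block // 8] & (1 << (block % 8)))

    def allocate(self):
        """Return the lowest-numbered free block and mark it allocated."""
        for block in range(self.n_blocks):
            if self.is_free(block):
                self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF
                return block
        raise OSError("no free blocks")

    def free(self, block):
        """Mark a block as free again."""
        self.bits[block // 8] |= 1 << (block % 8)
```

Because the search always starts from the low end, consecutive allocations tend to land on neighbouring blocks, which is the hidden benefit mentioned in the list.
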

Disk quotas

Probably hardly anyone uses this technique any more ...

4.4.2 File System Backups (skipped)

Several issues

    • One: which data should be backed up?
    • Two: backing up files that have not changed since the previous backup is wasteful.
    • Three: should the backup be compressed?
    • Four: it is difficult to back up a file system that is in active use.
    • Five: backups introduce nontechnical problems (spies?).
4.4.3 File System Consistency

Another issue that affects file system reliability is file system consistency.

    • If the system crashes before all the modified disk blocks have been written back, the file system can be left in an inconsistent state.

To deal with inconsistent file systems, most computers have a utility program that checks file system consistency.

    • UNIX has fsck; Windows has scandisk.
    • It is typically run when the system is rebooted, especially after a crash.

The principle of UNIX fsck, in outline:

    • Consistency check on blocks
    • Consistency check on files
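
The notes stop at the outline. As a hedged sketch of the block check, the usual textbook description keeps two counters per block: how many times the block appears in some file, and how many times it appears in the free list. In a consistent file system exactly one of the two counters is 1 and the other is 0. The function name, input format, and problem labels below are made up for this example.

```python
def check_blocks(n_blocks, files, free_list):
    """Count each block's uses in files and its entries in the free list."""
    in_use = [0] * n_blocks
    free = [0] * n_blocks
    for blocks in files.values():
        for b in blocks:
            in_use[b] += 1
    for b in free_list:
        free[b] += 1

    problems = {}
    for b in range(n_blocks):
        if in_use[b] == 0 and free[b] == 0:
            problems[b] = "missing block"           # fix: add to the free list
        elif in_use[b] == 0 and free[b] > 1:
            problems[b] = "duplicate in free list"  # fix: rebuild the free list
        elif in_use[b] > 1:
            problems[b] = "shared by files"         # fix: give each file its own copy
        elif in_use[b] == 1 and free[b] >= 1:
            problems[b] = "in use and free"         # fix: remove from the free list
    return problems
```

An empty result means every block is accounted for exactly once.
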

4.4.4 File System Performance
    1. Caching

      • Block cache
      • Buffer cache
      • Memory mapping
    2. Block read-ahead

      • Only speeds up sequential access.
      • For random access it does not help and lowers efficiency.
    3. Reducing disk arm motion

      • Allocate free blocks so as to reduce arm movement

        • Using a bitmap
        • With a free list, allocate blocks in clusters.
      • Optimizing i-node placement

        • Reading even a very short file requires two disk accesses

          • One to read the i-node
          • One to read the data block
          • The distance between the two adds seek time.
        • Improvements

          • First: put the i-nodes in the middle of the disk.
          • Second: divide the disk into cylinder groups, each with its own i-nodes.

4.4.5 Defragmenting Disks

Windows

Disk performance can be restored as follows:

    • Move files around so that each is contiguous, and gather the free space into one or a few large contiguous regions.
    • Windows has the defrag program for this.
      • Windows users should run it regularly.
      • NTFS appears to fragment much less.
Linux
    • Linux file systems ( ext2 , ext3 ) rarely need manual defragmentation because of the way they use i-nodes.
4.5 Example File Systems

4.5.1 CD-ROM File Systems

4.5.2 The MS-DOS File System
    • FAT and FAT32 are still popular in embedded systems.
Reading a file
    • An MS-DOS program first makes an open system call to get a file handle.

      • The open system call specifies a path. The path is looked up component by component

        until the final directory is located and read into memory; it is then searched for the file to be opened.

Directory entries

4.5.3 UNIX V7

Even early versions of UNIX had a fairly sophisticated multiuser file system, since it was derived from MULTICS.


