Chapter 4 File Systems
4.1 Files
This section looks at files from the user's perspective: how users work with files and what properties files have.
4.1.1 File naming
A file is an abstraction mechanism that provides a way to keep information on disk and to read it back later.
4.1.2 File Structure
Files can be structured in many ways. Three common ones are listed here.
Byte sequence
- The operating system sees only bytes; any meaning is imposed by user programs. Both Unix and Windows take this approach.
Record sequence
Tree
- Similar to a map: the file is a set of key-value records, with a balanced tree maintained over the keys.
- This makes it fast to look up the record for a given key.
- Popular on some mainframe computers used for commercial data processing.
4.1.3 File types
Regular files: contain user information.
ASCII files
Binary files
- Have some internal structure that only the programs using the file understand.
Directories: system files that maintain the structure of the file system.
4.1.4 File access
4.1.5 File Properties
Information associated with a file, such as its creation date and size. These items are called file attributes; some people call them metadata.
- The first 4 attributes relate to file protection.
- Flag bit
- Time field
- Size field
4.1.6 File Operations
create
delete
open
- Load the file attributes and the disk address table into memory.
close
- Close the file to free the internal table space.
read
write
append
seek
get attributes
set attributes
rename
4.1.7 A sample program that uses file system calls
Running copyfile abc xyz copies the file abc to the file xyz.
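A minimal sketch of such a copy program, written in Python on top of the low-level open, read, write, and close calls listed in 4.1.6. The buffer size and the protection mode of the output file are assumptions chosen for illustration, not values taken from these notes.

```python
import os

BUF_SIZE = 4096       # bytes per read; an arbitrary choice for this sketch
OUTPUT_MODE = 0o700   # protection bits for the new file (assumed)

def copyfile(src, dst):
    """Copy src to dst using open/read/write/close system calls."""
    in_fd = os.open(src, os.O_RDONLY)
    try:
        out_fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, OUTPUT_MODE)
        try:
            while True:
                chunk = os.read(in_fd, BUF_SIZE)
                if not chunk:          # empty result means end of file
                    break
                while chunk:           # os.write may write fewer bytes than asked
                    written = os.write(out_fd, chunk)
                    chunk = chunk[written:]
        finally:
            os.close(out_fd)
    finally:
        os.close(in_fd)
```

Calling copyfile("abc", "xyz") corresponds to the command line shown above.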
4.2 Directories
A directory is itself a file.
4.2.1 Single-level directory systems
The simplest form of directory system is a single directory that contains all the files.
- Used in devices such as telephones, digital cameras, and some portable music players.
4.2.2 Hierarchical directory systems
4.2.3 Path name
Absolute path
Relative path
- Must be used together with the working directory.
- A few special symbols:
. : the current directory
.. : the parent directory
~ : your own home directory (Linux)
~user : the home directory of user (Linux)
- : the most recently visited directory (Linux)
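As a rough illustration of how a relative path combines with the working directory, the sketch below uses Python's posixpath module. The helper name resolve and the example directories are hypothetical. The normalization is purely lexical, so it ignores symbolic links, and the ~ forms are not handled here (os.path.expanduser covers those).

```python
import posixpath

def resolve(path, working_dir):
    """Turn a Unix-style path into a normalized absolute path."""
    if not posixpath.isabs(path):
        # A relative path is interpreted starting from the working directory.
        path = posixpath.join(working_dir, path)
    # normpath collapses "." and ".." components.
    return posixpath.normpath(path)
```

With the working directory /usr/ast, resolve("../lib/./dict", "/usr/ast") gives /usr/lib/dict, while an absolute path such as /etc/passwd is returned unchanged.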
4.2.4 Directory Operations
4.3 Implementation of the file system
This section looks at files from the implementer's perspective.
4.3.1 File system layout
File systems are stored on disks.
Most disks are divided into one or more partitions.
- Each partition has an independent file system.
Sector 0 of the disk is called the Master Boot Record (MBR) and is used to boot the computer.
The end of the MBR contains the partition table.
- The table gives the start and end addresses for each partition.
- One partition in the table is marked as the active partition.
When the computer is booted, the BIOS reads in and executes the MBR.
The first thing the MBR program does is locate the active partition, read in its first block, called the boot block, and execute it.
The program in the boot block loads the operating system contained in that partition.
- For uniformity, every partition starts with a boot block, even if it does not contain an operating system.
The layout of the rest of a disk partition varies from file system to file system. The layout described above is one possibility.
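To make the partition table concrete, here is a sketch that decodes the four table entries from a 512-byte MBR sector. It assumes the classic PC MBR layout (table at byte offset 446, 16 bytes per entry, signature 0x55 0xAA in the last two bytes), where each entry stores a starting sector and a sector count alongside the legacy cylinder/head/sector start and end fields. The function name parse_mbr is hypothetical.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446    # start of the partition table in the classic layout
ENTRY_SIZE = 16
ACTIVE_FLAG = 0x80    # status byte of the active (bootable) partition

def parse_mbr(sector):
    """Return (index, is_active, type, first_sector, sector_count) per used entry."""
    if len(sector) != SECTOR_SIZE or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    partitions = []
    for i in range(4):
        start = TABLE_OFFSET + i * ENTRY_SIZE
        entry = sector[start:start + ENTRY_SIZE]
        # status (1), CHS start (3, skipped), type (1), CHS end (3, skipped),
        # first sector as LBA (4), number of sectors (4); little-endian.
        status, ptype, first, count = struct.unpack("<B3xB3xII", entry)
        if ptype != 0:                 # type 0 marks an unused slot
            partitions.append((i, status == ACTIVE_FLAG, ptype, first, count))
    return partitions
```

The end address of a partition follows from the entry as first_sector + sector_count - 1.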
4.3.2 Implementing files
The key issue in implementing file storage:
- keeping track of which disk blocks go with which file.
1. Contiguous allocation
This is the same model as dynamic storage allocation with malloc. (For malloc, though, the shortcomings below can be tolerated.)
Advantages:
Simple to implement
- Only the disk address of the first block and the number of blocks in the file need to be recorded.
Good read performance
- Only one seek is needed (to the first block); after that there are no more seeks or rotational delays.
- Data comes in at the full bandwidth of the disk.
Disadvantages:
Over time the disk suffers from external fragmentation, and a list of free holes must be maintained in order to place new files.
The final size of a file, such as a text document, is usually not known when it is created, so this is a serious nuisance.
Usage scenarios
Contiguous allocation was actually used in magnetic-disk file systems years ago because of its simplicity and high performance, and was later dropped because of the nuisance of having to specify the final file size at creation time. But with the advent of CD-ROMs, DVDs, and other write-once optical media, contiguous allocation suddenly became a good idea again. So it is important to study old systems and ideas that were conceptually clean and simple, because they may be applicable to future systems in surprising ways.
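The trade-offs of contiguous allocation can be seen in a small simulation. This is only a sketch: the class name, the first-fit placement policy, and the decision not to merge adjacent holes are all assumptions made to keep the example short.

```python
class ContiguousDisk:
    def __init__(self, total_blocks):
        self.holes = [(0, total_blocks)]   # free runs as (start, length)
        self.files = {}                    # name -> (start, length)

    def create(self, name, nblocks):
        """Place the file in the first hole that is large enough."""
        for i, (start, length) in enumerate(self.holes):
            if length >= nblocks:
                if length == nblocks:
                    del self.holes[i]
                else:
                    self.holes[i] = (start + nblocks, length - nblocks)
                self.files[name] = (start, nblocks)
                return start
        raise OSError("no hole large enough")

    def delete(self, name):
        self.holes.append(self.files.pop(name))
        self.holes.sort()                  # adjacent holes are not merged here

    def block_address(self, name, n):
        """Disk address of the n-th block: just start + n, no lookups."""
        start, length = self.files[name]
        if not 0 <= n < length:
            raise IndexError("block number out of range")
        return start + n
```

On a 10-block disk holding files of 3, 3, and 4 blocks, deleting the first and last files leaves 7 free blocks, yet a new 5-block file cannot be created because no single hole is big enough. That is the external fragmentation problem noted above.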
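2. Linked-list allocation
Advantages:
- Every disk block can be used; no space is lost to external fragmentation.
- The directory entry only needs to store the address of the first block.
Disadvantages:
- Random access is slow: reaching block n requires reading all the blocks before it.
- The pointer takes up a few bytes of each block, so the amount of data stored in a block is no longer a power of two.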
3. Linked-list allocation using a table in memory
Take the pointer word out of each disk block and put it in a table in memory. This table is called the File Allocation Table (FAT).
Advantages:
- Solves both disadvantages of the linked list above.
- For random reads and writes the chain must still be followed, but because the table is in memory, even O(n) is acceptable.
- The whole block is available for data, so the amount of data per block is again a power of two.
Disadvantages:
For a 200 GB disk and a 1 KB block size, the table needs 200 million entries, and each table entry is 4 bytes (3 bytes at minimum).
The table then takes up 600 MB to 800 MB of memory, which is not very practical.
Therefore the FAT approach is not suitable for large disks.
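A short sketch of both points: following a file's chain through an in-memory table, and the arithmetic behind the table size. The end-of-chain marker, the function names, and the example block numbers are assumptions for illustration, and the size calculation uses decimal units (1 KB = 1000 bytes) to match the 200-million-entry figure.

```python
END_OF_CHAIN = -1   # marker for the last block of a file (assumed value)

def file_blocks(fat, first_block):
    """Follow the chain in the table and return the file's blocks in order."""
    blocks = []
    block = first_block
    while block != END_OF_CHAIN:
        blocks.append(block)
        block = fat[block]       # table lookup in memory, no disk access
    return blocks

def fat_size_bytes(disk_bytes, block_bytes, entry_bytes):
    """Memory needed for a table with one entry per disk block."""
    return (disk_bytes // block_bytes) * entry_bytes
```

For example, with fat = {4: 7, 7: 2, 2: 10, 10: 12, 12: END_OF_CHAIN}, a file starting at block 4 occupies blocks 4, 7, 2, 10, 12. For the disk above, fat_size_bytes(200 * 10**9, 1000, 4) is 800 million bytes, and 600 million with 3-byte entries.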
4. I-nodes
Each file is given a data structure called an i-node (index node),
- which lists the file's attributes and the disk addresses of the file's blocks.
Advantages
Disadvantages: apparently none
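A minimal sketch of an i-node and of how a block number within the file maps to a disk address. The field names are hypothetical. The single indirect block, used once the direct addresses run out, is an assumption borrowed from common Unix designs and is not described in these notes.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Inode:
    size: int                        # attribute: file size in bytes
    mode: int                        # attribute: protection bits
    direct: List[int]                # disk addresses of the first few blocks
    indirect: Optional[int] = None   # address of a block holding more addresses

def disk_address(inode, n, read_pointer_block):
    """Return the disk address of the n-th block of the file."""
    if n < len(inode.direct):
        return inode.direct[n]
    if inode.indirect is None:
        raise IndexError("block number past end of file")
    # read_pointer_block stands in for a disk read of the indirect block.
    pointers = read_pointer_block(inode.indirect)
    return pointers[n - len(inode.direct)]
```

Only the i-nodes of open files need to be looked at, so the cost depends on the number of open files rather than on the size of the disk.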
4.3.3 Implementing directories
Directory entries
To open a file, the operating system uses the path name supplied by the user to locate the corresponding directory entry.
The directory entry provides the information needed to find the file's disk blocks.
- This varies by system.
- It may be the disk address plus the length (for contiguous allocation).
- It may be the address of the first block (for a linked list).
- It may be the i-node number.
In all cases, the main function of the directory system is to map the ASCII name of the file onto the information needed to locate the file's data.
File attributes
A closely related question is where the file attributes should be stored.
- One obvious option is to store the file attributes directly in the directory entry (as on Windows).
- For systems that use i-nodes, the file attributes can instead be stored in the i-node rather than in the directory entry.
File name length
Modern operating systems support long, variable-length file names; the question is how to implement them.
- The first and simplest approach reserves a fixed 255 bytes for every file name, which wastes too much space to be worth considering.
Looking up file names
Using a hash table
- Not covered here; this is familiar ground.
Using a cache
4.3.4 Sharing files
With shared files, the file system itself is a directed acyclic graph (DAG) rather than a tree.
Sharing is done through links.
Problems with copying
If links are not preserved when copying (cp without -d):
A soft link: the file it points to is copied.
A hard link: the file is copied, and the i-node link counter is reset for the copy.
4.3.5 Log-structured file systems (could not read on; the translation is too poor)
The main motivations for the design of the Log-structured File System (LFS) are:
- CPUs keep getting faster, RAM keeps getting larger, and disk caches are growing.
- A large fraction of read requests can be satisfied without any disk access.
- So in the future most disk accesses will be writes.
- Writes are often done in small pieces.
- A 50 μs disk write is typically preceded by a 10 ms seek and a 4 ms rotational delay, so disk efficiency drops to below 1%.
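The efficiency figure can be checked with a few lines of arithmetic. The function name is hypothetical, and efficiency is taken here to mean the fraction of the total time spent actually transferring data.

```python
def write_efficiency(transfer_us, seek_ms, rotation_ms):
    """Fraction of the total time that is spent transferring data."""
    total_us = transfer_us + (seek_ms + rotation_ms) * 1000
    return transfer_us / total_us

efficiency = write_efficiency(transfer_us=50, seek_ms=10, rotation_ms=4)
```

With the numbers above this is 50 / 14050, roughly 0.36%, which is indeed below 1%.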
Where do small writes come from? Consider creating a new file.
- Write the i-node of the directory and the directory block.
- Write the i-node of the file and the file itself.
These writes can be delayed, but if a crash occurs before they have all been done,
- the file system can be left seriously inconsistent.
4.3.6 Journaling file systems
Keep a log (journal) of what the system is about to do before doing it.
Then, even after a crash, the system can look at the log and finish the work.
Microsoft's NTFS, ext3, and ReiserFS are all journaling file systems.
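The idea can be sketched with a toy in-memory model. Everything here is hypothetical: the state dictionary, the operation names, and the three-step file removal are illustrative assumptions, and the sketch relies on each logged operation being safe to repeat (idempotent) so that recovery can simply redo the whole log.

```python
def remove_file_ops(name, inode, blocks):
    """The steps of removing a file, each one safe to repeat."""
    return [("unlink", name), ("free_inode", inode), ("free_blocks", tuple(blocks))]

def apply(state, op):
    kind, arg = op
    if kind == "unlink":
        state["directory"].pop(arg, None)    # no error if already removed
    elif kind == "free_inode":
        state["free_inodes"].add(arg)
    elif kind == "free_blocks":
        state["free_blocks"].update(arg)

def run(state, journal, ops, crash_after=None):
    journal.extend(ops)                      # 1. write the log entry first
    for i, op in enumerate(ops):
        if crash_after is not None and i == crash_after:
            return False                     # simulated crash; the journal survives
        apply(state, op)                     # 2. carry out the operations
    journal.clear()                          # 3. all done: erase the log entry
    return True

def recover(state, journal):
    for op in journal:                       # redo every logged operation
        apply(state, op)
    journal.clear()
```

If run is interrupted after the first step, calling recover after the restart replays the journal and leaves the state as if the removal had completed normally.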
4.3.7 Virtual file systems (the translation is too poor to read to the end)
Different file systems are often in use on the same computer, even under the same operating system.
Windows
Windows handles these different file systems by assigning each one its own drive letter, such as C:, D:, and so on.
The drive letter is given explicitly, so Windows knows which file system to pass the request to.
There is no attempt to integrate the different file systems into a unified whole.
Unix
In contrast, all modern UNIX systems make a very serious attempt to integrate multiple file systems into a single structure.
Most UNIX operating systems use a Virtual File System (VFS) to unify the file systems.
4.4 File system management and optimization
4.4.1 Disk space management
Almost all file systems split files into fixed-size blocks for storage, and the blocks need not be adjacent (as with paging).
Block size
Once it has been decided to store files in fixed-size blocks, the question arises of how big the block should be.
Block too large (poor space utilization)
- Small files waste a lot of disk space; internal fragmentation is severe.
Block too small (poor time utilization)
- Most files span multiple blocks, so reading a file takes multiple seeks and rotational delays, which hurts performance.
View
Historically, file systems have chosen block sizes in the 1 KB to 4 KB range. (Disks were smaller then, so space utilization mattered more.)
With disks now exceeding 1 TB, it may be better to raise the block size to 64 KB and accept the wasted disk space in exchange for better time performance.
- Disk space is hardly in short supply these days.
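The space side of this trade-off is simple arithmetic, sketched below. The function names and the example file sizes are assumptions, and sizes are in bytes with 1 KB taken as 1024 bytes.

```python
def blocks_needed(file_size, block_size):
    """Number of whole blocks required to hold file_size bytes."""
    return (file_size + block_size - 1) // block_size

def wasted_bytes(file_size, block_size):
    """Internal fragmentation: allocated space minus the file's real size."""
    return blocks_needed(file_size, block_size) * block_size - file_size
```

A 2 KB file in 64 KB blocks wastes 63488 bytes, while in 1 KB blocks it wastes none. In the other direction, a 1 MB file needs 1024 blocks of 1 KB but only 16 blocks of 64 KB, so far fewer seeks and rotational delays if the blocks are scattered.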
Keeping track of free blocks
Linked list of disk blocks
- Usually, the free blocks themselves are used to hold the free list.
Keep one block of pointers in memory
- When a file is created, the needed blocks are taken from the pointer block.
- If the pointer block runs out, a new pointer block is read in from disk.
- When a file is deleted, its disk blocks are freed and added to the pointer block.
- When the pointer block fills up, it is written to disk.
- To avoid unnecessary disk I/O, keep the pointer block in memory about half full and most of the pointer blocks on disk full, so that both file creation and file deletion can proceed without disk I/O.
Bitmap
A disk with n blocks requires a bitmap with n bits.
- Free blocks are represented by 1.
- Allocated blocks are represented by 0.
Bitmaps require little space.
A hidden benefit: allocation tends to keep blocks close together, which reduces disk arm motion.
Caching
Since the bitmap is a fixed-size data structure, a kernel that supports paging can place it in virtual memory and have its pages brought in as needed.
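A small sketch of a free-block bitmap, using the convention above that 1 means free and 0 means allocated. The class name and the policy of always handing out the lowest-numbered free block are assumptions. That policy is one simple way to get the "blocks close together" effect mentioned above.

```python
class FreeBitmap:
    def __init__(self, nblocks):
        self.nblocks = nblocks
        # One bit per block, all initially 1 (free); padding bits are never read.
        self.bits = bytearray(b"\xff" * ((nblocks + 7) // 8))

    def is_free(self, block):
        return bool(self.bits[block >> 3] & (1 << (block & 7)))

    def allocate(self):
        """Mark the lowest-numbered free block as allocated and return it."""
        for block in range(self.nblocks):
            if self.is_free(block):
                self.bits[block >> 3] &= ~(1 << (block & 7)) & 0xFF
                return block
        raise OSError("disk full")

    def free(self, block):
        self.bits[block >> 3] |= 1 << (block & 7)
```

A 10-block disk needs only 2 bytes of bitmap here, which illustrates how little space the structure takes.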
Disk quotas
Probably hardly anyone uses this technique nowadays ...
4.4.2 File system backups (skipped)
Several issues
- One: which data should be backed up?
- Two: backing up files that have not changed since the previous backup is a waste of time.
- Three: should the data be compressed?
- Four: it is difficult to back up a file system that is in active use.
- Five: backups introduce non-technical problems (spies?)
4.4.3 File System Consistency
Another issue that affects file system reliability is file system consistency.
- If the system crashes before all the modified disk blocks have been written back, the file system can be left in an inconsistent state.
To deal with inconsistent file systems, most computers have a utility program that checks file system consistency.
Unix has fsck and Windows has scandisk.
- It can be run when the system is booted, especially after a crash.
The following describes how Unix fsck works.
- Consistency check for blocks
- Consistency check for files
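A sketch of the block consistency check in the usual textbook formulation: count how often each block appears in some file and how often it appears in the free list, then flag any block whose counts do not add up to exactly one. The function name, the input format, and the problem labels are assumptions for illustration. This is not the actual fsck code.

```python
def check_blocks(nblocks, files, free_list):
    """files maps a file name to its list of blocks; returns {block: problem}."""
    in_use = [0] * nblocks
    free = [0] * nblocks
    for blocks in files.values():
        for b in blocks:
            in_use[b] += 1
    for b in free_list:
        free[b] += 1

    problems = {}
    for b in range(nblocks):
        if in_use[b] > 1:
            problems[b] = "duplicate data block"     # shared by two files
        elif in_use[b] == 1 and free[b] >= 1:
            problems[b] = "in use and also free"
        elif in_use[b] == 0 and free[b] > 1:
            problems[b] = "duplicate in free list"
        elif in_use[b] == 0 and free[b] == 0:
            problems[b] = "missing block"            # in no file, not free
    return problems
```

A consistent file system yields an empty result: every block is counted exactly once, either in a file or in the free list.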
4.4.4 File System Performance
Cache
- Block cache
- Buffer cache
- Memory mapping
Block read-ahead
- Only speeds up sequential access.
- For random access it is less efficient.
Reducing disk arm motion
4.4.5 Defragmenting disks
Disk performance can be restored as follows:
- Move the files so that they are adjacent, placing all the free space in one or more large contiguous regions.
Windows
The defrag program does this job.
Windows users should run it regularly.
NTFS
Seems to fragment a lot less.
Linux
- Linux file systems (ext2, ext3) rarely require manual defragmentation because of the way they use i-nodes.
4.5 Example file systems
4.5.1 CD-ROM
4.5.2 MS-DOS
FAT and FAT32 are still popular in embedded systems today.
When reading a file
Directory entries
4.5.3 Unix V7
Even early versions of UNIX had a fairly sophisticated multi-user file system, since it was derived from MULTICS.
[Modern Operating Systems notes] [Chapter 4: File Systems]