Thoughts on the operating system's file system

Source: Internet
Author: User

The file system is the part of the operating system whose purpose is, ultimately, to manage files.

The operating system introduces files to make it easier for multiple processes to share data. The data is stored on disk, where multiple processes can access it.

Think of a file as an address space on disk.

To the computer, the content of a file is just a sequence of bytes. What the user sees is lines of data.
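As a small illustration of the two views, the sketch below writes a few lines to a temporary file and reads them back both ways. The sample names are made up for the example.

```python
import os
import tempfile

# Write three lines of sample data (hypothetical content) to a temporary file.
with tempfile.NamedTemporaryFile("wb", delete=False, suffix=".txt") as f:
    f.write(b"alice\nbob\ncarol\n")
    path = f.name

# The computer's view: one flat sequence of bytes.
with open(path, "rb") as f:
    raw = f.read()

# The user's view: the same bytes interpreted as lines of text.
with open(path, "r") as f:
    lines = f.read().splitlines()

os.remove(path)

print(raw)    # b'alice\nbob\ncarol\n'
print(lines)  # ['alice', 'bob', 'carol']
```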

What is the key problem a file system has to solve?

It has to record which disk blocks belong to which file, so that once a file has been found, the system knows which disk blocks to read.

Different operating systems use different approaches to achieve this goal.

Broadly, they fall into three categories:

1, contiguous allocation.

Question: where exactly does the difference lie?

A file occupies a number of disk blocks, and those blocks are contiguous, that is, physically adjacent on disk.

Question: is each file given a pre-allocated size? How many disk blocks are requested in advance?

The disadvantage of this approach is that, as files are deleted and added, the disk ends up with many scattered runs of free blocks (called disk fragmentation). To place a new file in these free blocks, the file's size must be known in advance so that a free run of suitable size can be found. The key problem is that a file's size is hard to determine: new data will be written to it later, or data inside it will be deleted, so the size keeps changing. Because the size is hard to fix up front, this method is better suited to storing files on CDs, where file sizes are fixed and never change.

Disk fragmentation is one of its many drawbacks.

In general, this allocation method is only suitable for files whose size is fixed.
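The fragmentation problem can be made concrete with a toy model. This is a minimal sketch, not how any particular file system does it: the disk is a Python list of 10 blocks, and `allocate` uses a first-fit search, which is an assumption chosen for simplicity. All names here are hypothetical.

```python
FREE = None  # marker for an unused disk block

def allocate(disk, name, size):
    """First fit: place `name` in the first run of `size` free contiguous blocks.

    Returns the starting block number, or -1 if no free run is long enough.
    """
    run_start, run_len = 0, 0
    for i, owner in enumerate(disk):
        if owner is FREE:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == size:
                for j in range(run_start, run_start + size):
                    disk[j] = name
                return run_start
        else:
            run_len = 0
    return -1

def delete(disk, name):
    """Free every block owned by `name`."""
    for i, owner in enumerate(disk):
        if owner == name:
            disk[i] = FREE

disk = [FREE] * 10
a = allocate(disk, "A", 3)  # occupies blocks 0-2
b = allocate(disk, "B", 3)  # occupies blocks 3-5
c = allocate(disk, "C", 3)  # occupies blocks 6-8
delete(disk, "A")
delete(disk, "C")

# 7 blocks are free now, but they are split into runs of 3 (0-2) and 4 (6-9),
# so a 5-block file cannot be placed.
d = allocate(disk, "D", 5)
print(disk.count(FREE), d)  # 7 -1
```

Seven blocks are free, yet the 5-block request fails because no single free run is long enough. That is the cost of requiring contiguity.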

2, linked list allocation (the FAT scheme; FAT is short for file allocation table)

There are two variants: the linked list is stored on disk, or the linked list is kept in memory. Keeping it in memory makes lookups fast.

Linked list allocation is best understood in contrast to contiguous allocation: a file no longer has to sit in one contiguous run of disk blocks. Take file A under contiguous allocation, for example: as its contents grow, it expands into disk block 1, disk block 2, disk block 3, and these must be contiguous (adjacent positions forming one continuous region).

Pros: avoids the disk fragmentation of contiguous allocation.

With a linked list, contiguous disk blocks are not required. The first word of each disk block stores a pointer to the address of the next disk block, so the file can be read by following the pointers.

To increase speed, the pointers are taken out of the disk blocks and kept as a table in memory (the file allocation table).
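A minimal sketch of the in-memory table follows. The block numbers, the 16-block disk, and the `END` marker are all made up for illustration; a real FAT has its own on-disk entry format.

```python
END = -1         # marks the last block of a file
NUM_BLOCKS = 16  # assumed size of the toy disk

# fat[i] holds the number of the block that follows block i (None = unused).
fat = [None] * NUM_BLOCKS
links = [(4, 7), (7, 2), (2, 10), (10, 12), (12, END),  # file A starts at block 4
         (6, 3), (3, 11), (11, 14), (14, END)]          # file B starts at block 6
for block, next_block in links:
    fat[block] = next_block

def blocks_of(fat, first_block):
    """Follow the chain from `first_block` and return the file's blocks in order."""
    chain = []
    block = first_block
    while block != END:
        chain.append(block)
        block = fat[block]
    return chain

file_a = blocks_of(fat, 4)
file_b = blocks_of(fat, 6)
print(file_a)  # [4, 7, 2, 10, 12]
print(file_b)  # [6, 3, 11, 14]
```

Neither file's blocks are adjacent on disk, and the whole chain is resolved from memory without reading any data block.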

3, i-node

The disadvantage of linked list allocation is that it takes up a lot of memory (the linked list is kept in memory to increase speed). However many files a directory contains, that many entries have to be maintained in memory.

With n more directories, there are even more.

The larger the disk, the larger the linked list that has to be maintained, so the in-memory list takes up more memory. For example, with a 200 GB disk and 1 KB disk blocks, there are 200 GB / 1 KB entries in total, about 200 million.

The purpose of each entry is to record the location of a disk block.

A table of 200 million entries needs roughly 600 to 800 MB of memory (assuming 3 to 4 bytes per entry), which is a waste of memory.
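The arithmetic can be checked directly. The entry size of 3 or 4 bytes is an assumption consistent with the 600 to 800 MB figure; binary units (1 GB = 2^30 bytes) are also assumed.

```python
KB = 2**10
MB = 2**20
GB = 2**30

disk_size = 200 * GB
block_size = 1 * KB
entries = disk_size // block_size  # one table entry per disk block

table_sizes_mb = {}
for bytes_per_entry in (3, 4):     # assumed entry sizes
    table_sizes_mb[bytes_per_entry] = entries * bytes_per_entry / MB
    print(f"{entries:,} entries x {bytes_per_entry} B = "
          f"{table_sizes_mb[bytes_per_entry]:.0f} MB")
# 209,715,200 entries x 3 B = 600 MB
# 209,715,200 entries x 4 B = 800 MB
```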

An improved method was invented: each file has an i-node that records its information, and a file's i-node is loaded into memory only when the user opens that file. This takes up far less memory.
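The idea can be sketched as follows. The `Inode` fields, the dictionary standing in for the disk, and the function names are all hypothetical; real i-nodes hold more attributes and use indirect blocks for large files.

```python
from dataclasses import dataclass, field

@dataclass
class Inode:
    """Hypothetical i-node: file size plus the addresses of the file's blocks."""
    size: int
    blocks: list = field(default_factory=list)

# Stand-in for the i-nodes stored on disk, keyed by i-node number.
inodes_on_disk = {
    1: Inode(size=3000, blocks=[4, 7, 2]),
    2: Inode(size=1500, blocks=[6, 3]),
    3: Inode(size=500, blocks=[9]),
}

open_inodes = {}  # in-memory table: only the i-nodes of currently open files

def open_file(ino):
    """Load the i-node into memory on open and return it."""
    if ino not in open_inodes:
        open_inodes[ino] = inodes_on_disk[ino]
    return open_inodes[ino]

def close_file(ino):
    """Drop the i-node from memory when the file is closed."""
    open_inodes.pop(ino, None)

node = open_file(2)
in_memory_while_open = sorted(open_inodes)
close_file(2)
in_memory_after_close = sorted(open_inodes)

print(node.blocks)            # [6, 3]
print(in_memory_while_open)   # [2]
print(in_memory_after_close)  # []
```

Memory use now scales with the number of open files rather than with the size of the disk.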

The implementation of the directory:

Each directory has a directory table. Each item in the table is called a directory entry and corresponds to one file in that directory. Put simply, all the files in a directory are recorded in its directory table.

To find a file in a directory, or to add one, the system searches the entries in that directory's table.

A directory is itself a file, but a special kind of file, because it contains multiple files. A directory therefore includes these items: the directory name, the directory's starting disk block number, and its ending disk block number.

There are two implementation approaches: linear tables and hash tables. With a hash table, choosing the table length is a problem.
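The two lookup strategies can be compared in a short sketch. The entry layout (file name, first disk block), the file names, and the bucket count of 8 are assumptions made for the example.

```python
# Hypothetical directory entries: (file name, first disk block).
entries = [("notes.txt", 4), ("photo.jpg", 6), ("report.doc", 9)]

def lookup_linear(entries, name):
    """Linear table: scan entries one by one; cost grows with directory size."""
    for entry_name, first_block in entries:
        if entry_name == name:
            return first_block
    return None

def build_hash_table(entries, num_buckets):
    """Hash table with a fixed number of buckets; colliding names share a bucket."""
    buckets = [[] for _ in range(num_buckets)]
    for entry_name, first_block in entries:
        buckets[hash(entry_name) % num_buckets].append((entry_name, first_block))
    return buckets

def lookup_hash(buckets, name):
    """Hash the name to pick one bucket, then scan only that bucket."""
    for entry_name, first_block in buckets[hash(name) % len(buckets)]:
        if entry_name == name:
            return first_block
    return None

buckets = build_hash_table(entries, num_buckets=8)
print(lookup_linear(entries, "photo.jpg"))  # 6
print(lookup_hash(buckets, "photo.jpg"))    # 6
print(lookup_hash(buckets, "missing.txt"))  # None
```

The `num_buckets` parameter is where the table length problem shows up: too few buckets and each bucket degenerates into a long linear scan, too many and space is wasted on empty buckets.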

Understand these three concepts of the operating system and you are well on the way to becoming an operating system expert:

1, processes (threads): a model of the CPU.

2, address spaces: the operating system's abstract model of memory.

3, files. No wonder that in the Linux operating system everything is treated as a file.

The operating system has its own file system. So how does a database system deal with the disk? Does it do so in its own way, without using the file system the operating system provides?

Keep in mind, though, that a database system ultimately runs on top of an operating system, so to operate on disk data it can hardly avoid using the file system.

There are two ways for a database to manage physical storage:

1, use the operating system's file system to organize the data.

The file system is responsible for interacting with the disk, requesting and allocating disk blocks.

2, implement its own management scheme that handles requesting and allocating disk blocks. This can be understood as the database implementing a file system of its own.

In fact, most database systems initially request a fixed amount of disk space, which they then allocate and manage themselves.
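A minimal sketch of that last pattern is shown below, assuming a 4096-byte page and a 16-page data file. The `PagedFile` class and its methods are hypothetical and not taken from any real database. Note that `truncate` only sets the file's size; a real system may take extra steps to make the operating system actually reserve the blocks.

```python
import os
import tempfile

PAGE_SIZE = 4096  # assumed page size
NUM_PAGES = 16    # assumed initial size of the data file, in pages

class PagedFile:
    """Hypothetical storage manager: one preallocated file, pages handed out by us."""

    def __init__(self, path):
        self.f = open(path, "w+b")
        self.f.truncate(PAGE_SIZE * NUM_PAGES)  # fix the file size once, up front
        self.free_pages = list(range(NUM_PAGES))

    def alloc_page(self):
        """Hand out the lowest-numbered free page."""
        return self.free_pages.pop(0)

    def write_page(self, page_no, data):
        if len(data) > PAGE_SIZE:
            raise ValueError("data does not fit in one page")
        self.f.seek(page_no * PAGE_SIZE)
        self.f.write(data.ljust(PAGE_SIZE, b"\0"))  # pad to a full page

    def read_page(self, page_no):
        self.f.seek(page_no * PAGE_SIZE)
        return self.f.read(PAGE_SIZE)

    def close(self):
        self.f.close()

path = os.path.join(tempfile.mkdtemp(), "data.db")
db = PagedFile(path)
p0 = db.alloc_page()
p1 = db.alloc_page()
db.write_page(p1, b"row: id=1, name=alice")
payload = db.read_page(p1).rstrip(b"\0")
file_size = os.path.getsize(path)
db.close()

print(p0, p1)     # 0 1
print(payload)    # b'row: id=1, name=alice'
print(file_size)  # 65536
```

The file system still sits underneath, but it sees only one file of fixed size. Which page holds which data is decided entirely by the database.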

Note: the disk controller handles bad disk blocks transparently; even the operating system is unaware of it.
