Implementation of the Linux file system

Author: Vamei. Source: http://www.cnblogs.com/vamei. Reprints are welcome; please keep this statement. Thank you!

The article on Linux file management describes how Linux manages files from the user's point of view. Linux organizes files in a tree structure. The top of the tree is the root directory (/), the internal nodes are directories, and the leaves at the ends are files that contain data. The full path of a file starts at the root directory and passes through each directory along the way until it reaches the file.

We can do many things with files, such as opening, reading, and writing them. The article on Linux file management commands covers many commands for manipulating files. Most of them are built on opening and reading files. For example, cat opens a file, reads its data, and displays it on the terminal:

$ cat Test.txt
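The open, read, and display steps can be sketched in a few lines of Python. This is only an illustration of the idea, not how the real cat is implemented; the function name cat and the chunk_size parameter are assumptions made for this example.

```python
import sys

def cat(path, out=None, chunk_size=4096):
    """Open a file, read it chunk by chunk, and write the bytes to `out`."""
    if out is None:
        out = sys.stdout.buffer  # raw byte stream behind the terminal
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # an empty read means end of file
                break
            out.write(chunk)

if __name__ == "__main__":
    for name in sys.argv[1:]:
        cat(name)
```

Saved under a hypothetical name such as cat.py, it could be run as $ python3 cat.py Test.txt.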

For programmers working on Linux, understanding the underlying organization of the file system is essential for in-depth system programming. Ordinary Linux users can also use this knowledge to design better system maintenance plans.

Storage device partitions

The ultimate goal of the file system is to organize large amounts of data on persistent storage devices, such as hard disks and other disks. These devices differ from memory: their contents are persistent and do not disappear when power is lost, and their capacity is large, but they are slow to read.

Consider a common storage device. Its first area is the MBR (master boot record), which is used when Linux boots (see the article on Linux boot). The remaining space may be divided into several partitions. Each partition has a corresponding entry in the partition table, which records the starting position of the partition and its size. The partition table is stored outside the partitions themselves.
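To make the start and size fields concrete, here is a minimal sketch that reads the partition table from a 512-byte MBR sector. It assumes the classic MBR layout (not GPT): four 16-byte entries starting at byte offset 446 and the signature bytes 0x55 0xAA at the end. The helper make_demo_mbr and its values are made up so the example can run without touching a real disk.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446        # the four partition entries start here
ENTRY_SIZE = 16
SIGNATURE = b"\x55\xaa"   # last two bytes of a valid MBR
# status(1) CHS-first(3, skipped) type(1) CHS-last(3, skipped)
# start sector(4) sector count(4), all little endian
ENTRY_FORMAT = "<B3xB3xII"

def parse_mbr(sector):
    """Return type, start sector and sector count of each used entry."""
    if len(sector) != SECTOR_SIZE or sector[510:512] != SIGNATURE:
        raise ValueError("not a valid MBR sector")
    partitions = []
    for i in range(4):
        offset = TABLE_OFFSET + i * ENTRY_SIZE
        status, ptype, start, count = struct.unpack_from(ENTRY_FORMAT, sector, offset)
        if ptype != 0:  # type 0 marks an unused entry
            partitions.append({"type": ptype, "start": start, "sectors": count})
    return partitions

def make_demo_mbr():
    """Build a synthetic MBR holding one made-up Linux partition (type 0x83)."""
    sector = bytearray(SECTOR_SIZE)
    struct.pack_into(ENTRY_FORMAT, sector, TABLE_OFFSET, 0x80, 0x83, 2048, 204800)
    sector[510:512] = SIGNATURE
    return bytes(sector)

if __name__ == "__main__":
    print(parse_mbr(make_demo_mbr()))
```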

We often see a C drive, a D drive, and so on in Windows systems. A Linux system can also have multiple partitions, but they are all mounted into a single file system tree.

Data is stored in partitions. A typical Linux partition contains the following sections:

The first part of the partition is the boot block, which is used when the computer starts. When Linux boots, the MBR is loaded first, and the MBR then loads a program from the boot block of a partition on the hard disk. That program is responsible for loading and starting the operating system. For ease of administration, Linux reserves the boot block in every partition, even one that has no operating system installed.

After the boot block comes the super block. It stores information about the file system, including the file system type, the number of inodes, and the number of data blocks.
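Some of these totals can be queried from a running system. The sketch below uses os.statvfs, which reports what the kernel exposes about a mounted file system rather than the raw on-disk super block; the function name filesystem_summary and the dictionary keys are choices made for this example. Some file systems report 0 for the inode counts.

```python
import os

def filesystem_summary(path="/"):
    """Return block and inode totals for the file system containing `path`."""
    st = os.statvfs(path)
    return {
        "block_size": st.f_frsize,    # size of one block in bytes
        "total_blocks": st.f_blocks,  # number of blocks, in units of f_frsize
        "free_blocks": st.f_bfree,
        "total_inodes": st.f_files,   # number of inodes
        "free_inodes": st.f_ffree,
    }

if __name__ == "__main__":
    for key, value in filesystem_summary("/").items():
        print(key, value)
```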

It is followed by multiple inodes, which are the key to implementing file storage. In a Linux system, a file can be stored in several data blocks scattered across the partition, like the Dragon Balls scattered around the world. To collect the Dragon Balls we need a "radar" to guide us, and for a file that guide is its inode. Each file corresponds to one inode, which contains pointers to the individual data blocks that belong to the file. When the operating system needs to read the file, it follows the "map" in the inode, collects the scattered data blocks, and assembles the file.

The last part is the data blocks, which actually store the data.

About Inode

Above we saw the overall structure of the storage device. Now we drill down into the structure of a partition, especially how files are stored in it.

A file is the unit into which the file system divides data. The file system uses directories to organize files, giving them a hierarchical structure. The key to implementing this hierarchy on a hard disk is to use inodes to represent both regular files and directory files.

From Linux file management, we know that a file has, in addition to its own data, a set of accompanying information: the file's metadata. The metadata records information about the file, such as its size, owner, group, and modification date. Metadata is not part of the file's data; it is maintained by the operating system. In fact, this metadata is stored in the inode. We can use $ ls -l filename to view the metadata. As we saw above, the area occupied by the inodes is separate from the area of the data blocks. Each inode has a unique integer number (inode number).
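The same metadata can be read from a program. A minimal sketch using os.stat follows; the function name describe and the dictionary keys are illustrative choices, while the st_* fields are the ones Python exposes.

```python
import os
import stat
import time

def describe(path):
    """Collect the inode metadata that ls -l would display for `path`."""
    st = os.stat(path)
    return {
        "inode": st.st_ino,                  # inode number
        "mode": stat.filemode(st.st_mode),   # e.g. '-rw-r--r--'
        "links": st.st_nlink,
        "uid": st.st_uid,                    # owner
        "gid": st.st_gid,                    # group
        "size": st.st_size,                  # size in bytes
        "modified": time.ctime(st.st_mtime), # modification date
    }

if __name__ == "__main__":
    print(describe("/etc"))
```

Note that none of these values come from the file's contents; they are all stored with the inode.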

Besides storing metadata, the inode is the key that turns the abstract notion of a "file" into something concrete. As mentioned in the previous section, the inode stores pointers to data blocks on the storage device, and the contents of the file are stored in those blocks. When Linux wants to open a file, it only needs to find the file's inode and follow the pointers to collect all the data blocks; it can then assemble the file's data in memory.
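A toy model can make the pointer-following step concrete. Everything here is hypothetical: the "disk" is a Python list of 4-byte blocks, and ToyInode holds only a size and a list of block numbers, which is far simpler than a real inode.

```python
BLOCK_SIZE = 4  # unrealistically small, only for illustration

class ToyInode:
    def __init__(self, size, block_pointers):
        self.size = size                      # file length in bytes
        self.block_pointers = block_pointers  # block numbers, in file order

def read_file(disk, inode):
    """Follow the inode's pointers and join the scattered blocks."""
    data = b"".join(disk[p] for p in inode.block_pointers)
    return data[:inode.size]  # drop padding in the last block

def make_demo_disk():
    """Return a 40-block disk with one file scattered over blocks 1, 32, 0."""
    disk = [bytes(BLOCK_SIZE)] * 40
    disk[1] = b"Linu"
    disk[32] = b"x in"
    disk[0] = b"ode\x00"
    return disk, ToyInode(size=11, block_pointers=[1, 32, 0])

if __name__ == "__main__":
    disk, inode = make_demo_disk()
    print(read_file(disk, inode))
```

The blocks are deliberately out of order on the disk; only the pointer list in the inode records the order in which they form the file.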

(Figure: an inode whose pointers refer to data blocks at 1, 32, 0, ...)

Inodes are not the only way to organize files. The simplest way is to place the files one after another on the storage device; DVDs take a similar approach. However, if files are deleted, the freed space ends up scattered between the remaining files, which makes it hard to use and manage.

A more complex way is to use a linked list, in which each data block holds a pointer to the next data block of the same file. The advantage is that scattered free space can be used; the downside is that a file can only be accessed sequentially. For random access, you must traverse the list until you reach the target position. Because this traversal happens on the storage device rather than in memory, it is very slow.
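The cost of random access in the linked-list scheme can be sketched with a toy disk. The dictionary layout (block number mapped to a payload and a next pointer) and the function name read_block_n are assumptions made for this example.

```python
def read_block_n(disk, first, n):
    """Return the payload of the n-th block (0-based) of a file and the
    number of blocks that had to be visited first to reach it."""
    current = first
    hops = 0
    for _ in range(n):
        current = disk[current][1]  # each hop stands for one slow device read
        if current is None:
            raise IndexError("file has fewer blocks than requested")
        hops += 1
    return disk[current][0], hops

# block number -> (payload, number of the next block or None)
DEMO_DISK = {
    7: (b"A", 2),
    2: (b"B", 9),
    9: (b"C", None),
}

if __name__ == "__main__":
    print(read_block_n(DEMO_DISK, first=7, n=2))
```

Reaching block n always costs n hops, and in a real system each hop is a read from the storage device.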

The FAT system takes the pointers out of the linked list above and puts them into an array in memory. In this way, FAT can quickly locate any block of a file through the in-memory index. The main problem is that the index array has as many entries as there are data blocks. Therefore, the larger the storage device, the larger the index array.
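A minimal sketch of the idea: the next-block pointers live in one in-memory array with an entry for every data block. The markers END and FREE and the function names are made up for this illustration and do not match the real FAT encoding.

```python
END = -1    # hypothetical marker: last block of a file
FREE = -2   # hypothetical marker: unused block

def block_chain(fat, first):
    """List a file's blocks by following the in-memory table only."""
    chain = []
    current = first
    while current != END:
        chain.append(current)
        current = fat[current]
    return chain

def make_demo_fat(total_blocks=10):
    """One table entry per data block; a single file occupies 7 -> 2 -> 9."""
    fat = [FREE] * total_blocks
    fat[7] = 2
    fat[2] = 9
    fat[9] = END
    return fat

if __name__ == "__main__":
    fat = make_demo_fat()
    print(block_chain(fat, first=7), len(fat))
```

The traversal no longer touches the storage device, but the table has as many entries as the device has blocks, no matter how few files exist.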

Inodes both make full use of the space and occupy an amount of memory that does not depend on the size of the storage device, which solves the problems above. But inodes have a problem of their own: the number of data block pointers each inode can store is fixed. If a file requires more data blocks than that, the inode needs extra space to store more pointers.
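The article does not say how the extra pointers are stored. One well-known scheme, used by ext2, keeps 12 direct pointers in the inode plus one single, one double, and one triple indirect pointer, where an indirect block is a data block filled with further pointers. The arithmetic below is a sketch under that assumption, with the block size and the 4-byte pointer size as adjustable parameters.

```python
def max_file_blocks(block_size=4096, pointer_size=4, direct=12):
    """Number of data blocks one inode can address with direct pointers
    plus single, double and triple indirect blocks (ext2-style assumption)."""
    per_block = block_size // pointer_size  # pointers that fit in one block
    return direct + per_block + per_block ** 2 + per_block ** 3

if __name__ == "__main__":
    blocks = max_file_blocks()
    print(blocks, "blocks,", blocks * 4096, "bytes")
```

With these assumptions the fixed-size inode can still describe a very large file, at the cost of extra block reads to reach the indirect pointers.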

Inode Example

In Linux, we find a file by parsing its path and consulting the directory files along the way. Each entry in a directory contains a file name and the corresponding inode number. When we enter $ cat /var/test.txt, Linux finds the inode number of the var directory file in the root directory file and assembles the data of var from that inode. Then, from the entries in var, it finds the inode number of test.txt and, following the pointers in that inode, collects the data blocks and assembles the data of test.txt. Throughout the process we use three inodes: those of the root directory file, the var directory file, and the test.txt file.
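The lookup can be sketched with a toy inode table in which a directory is just a mapping from names to inode numbers. The inode numbers 2, 11, and 42 and the dictionary layout are invented for the example.

```python
ROOT_INODE = 2  # hypothetical inode number of the root directory

DEMO_INODES = {
    2: {"type": "dir", "entries": {"var": 11}},
    11: {"type": "dir", "entries": {"test.txt": 42}},
    42: {"type": "file", "data": b"hello\n"},
}

def resolve(inodes, path, root=ROOT_INODE):
    """Walk `path` one component at a time; return every inode number used."""
    visited = [root]
    current = root
    for name in [part for part in path.split("/") if part]:
        node = inodes[current]
        if node["type"] != "dir":
            raise NotADirectoryError(name)
        current = node["entries"][name]  # directory entry: name -> inode number
        visited.append(current)
    return visited

if __name__ == "__main__":
    path_inodes = resolve(DEMO_INODES, "/var/test.txt")
    print(path_inodes, DEMO_INODES[path_inodes[-1]]["data"])
```

For /var/test.txt the walk touches exactly three inodes, matching the description above.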

Under Linux, you can use $ stat filename to query the inode number of a file.
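From a program, the inode number is available as st_ino. The sketch below prints it for every directory on the way to a file; the function name inode_numbers_along is an assumption, and the path in the usage line is only an example that may not exist on every system.

```python
import os

def inode_numbers_along(path):
    """Return (path, inode number) pairs from the root down to `path`."""
    path = os.path.abspath(path)
    current = os.sep
    result = [(current, os.stat(current).st_ino)]
    for part in [p for p in path.split(os.sep) if p]:
        current = os.path.join(current, part)
        result.append((current, os.stat(current).st_ino))
    return result

if __name__ == "__main__":
    for name, ino in inode_numbers_along("/etc/hostname"):
        print(ino, name)
```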

(Figure: how these files are actually stored on the storage device.)

When we read a file, we actually look up the file's inode number in the directory and then, following the inode's pointers, gather the data blocks and load them into memory for further processing. When we write a new file, the system allocates a free inode to the file, records its inode number in the directory the file belongs to, selects free data blocks, makes the inode's pointers refer to those blocks, and stores the data from memory in them.
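The write path can be sketched as a toy file system with a single flat directory. The class ToyFS, its first-free-slot allocation, and the 4-byte block size are all invented for illustration; real file systems track free inodes and blocks with dedicated structures.

```python
BLOCK_SIZE = 4  # unrealistically small, only for illustration

class ToyFS:
    def __init__(self, n_inodes=8, n_blocks=16):
        self.inodes = [None] * n_inodes  # None marks a free inode
        self.blocks = [None] * n_blocks  # None marks a free data block
        self.directory = {}              # file name -> inode number

    @staticmethod
    def _first_free(table):
        for index, slot in enumerate(table):
            if slot is None:
                return index
        raise OSError("no free slot left")

    def create(self, name, data):
        """Allocate an inode, fill free blocks, then record the name."""
        ino = self._first_free(self.inodes)
        pointers = []
        self.inodes[ino] = {"size": len(data), "pointers": pointers}
        for offset in range(0, len(data), BLOCK_SIZE):
            block = self._first_free(self.blocks)
            self.blocks[block] = data[offset:offset + BLOCK_SIZE]
            pointers.append(block)       # the inode now points at this block
        self.directory[name] = ino
        return ino

    def read(self, name):
        """Directory lookup, then follow the inode's pointers."""
        inode = self.inodes[self.directory[name]]
        return b"".join(self.blocks[p] for p in inode["pointers"])

if __name__ == "__main__":
    fs = ToyFS()
    fs.create("a.txt", b"hello world")
    print(fs.directory, fs.inodes[0], fs.read("a.txt"))
```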

File sharing

In a Linux process, when we open a file, a file descriptor is returned. This file descriptor is an index into an array whose elements are pointers. Interestingly, such a pointer does not point directly to the file's inode. Instead it points to a file table, which in turn points to the in-memory inode of the target file. Consider, for example, a process that has two files open.

Each file table records the open status of the file (status flags), such as read-only or write, and also records the current read/write position (offset) in the file. When two processes open the same file, there can be two file tables, each with its own status flags and current position, which supports some forms of file sharing, such as simultaneous reading.
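The independent offsets can be observed directly. For brevity this sketch opens the same file twice within one process rather than from two processes; each os.open call still gets its own file table with its own offset. The function name and the temporary file are choices made for the example.

```python
import os
import tempfile

def independent_offsets():
    """Open one file twice and show that each open has its own offset."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"abcdefgh")
        os.close(fd)
        fd1 = os.open(path, os.O_RDONLY)
        fd2 = os.open(path, os.O_RDONLY)  # second, separate file table
        try:
            first = os.read(fd1, 3)
            second = os.read(fd2, 5)      # starts at 0, not where fd1 stopped
            pos1 = os.lseek(fd1, 0, os.SEEK_CUR)
            pos2 = os.lseek(fd2, 0, os.SEEK_CUR)
            return first, second, pos1, pos2
        finally:
            os.close(fd1)
            os.close(fd2)
    finally:
        os.remove(path)

if __name__ == "__main__":
    print(independent_offsets())
```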

Note that after a process forks, the child process copies only the array of file descriptors; it shares with the parent the file tables and inodes maintained by the kernel. Be especially careful when writing programs in this situation.
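A small experiment shows why care is needed: because the file table is shared, a read in the child moves the offset that the parent sees. This sketch assumes a Unix-like system where os.fork is available; the function name and the temporary file are choices made for the example.

```python
import os
import tempfile

def shared_offset_after_fork():
    """Let a child read 3 bytes, then report the offset seen by the parent."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"abcdefgh")
        os.lseek(fd, 0, os.SEEK_SET)     # rewind before forking
        pid = os.fork()
        if pid == 0:                     # child: same file table as the parent
            try:
                os.read(fd, 3)
            finally:
                os._exit(0)              # leave without running parent cleanup
        os.waitpid(pid, 0)               # parent: wait for the child to finish
        return os.lseek(fd, 0, os.SEEK_CUR)
    finally:
        os.close(fd)
        os.remove(path)

if __name__ == "__main__":
    print(shared_offset_after_fork())
```

The parent never read from the file, yet its offset is no longer 0. If both processes read or write without coordination, their operations interleave on the same offset.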

Summary

This article gave a general overview of the Linux file system. Linux uses inodes to organize data into files.

Understanding the Linux file system is an important step toward understanding how Linux works.
