Linux disk and file system principles

Last Update:2018-12-06 Source: Internet

Author: User


This chapter describes the operating principles of the Linux file system. I have forgotten a lot about computer organization and operating system principles, so I am reviewing them here.
For more detail, see Sections 1 and 2 of this chapter.

------------------------------------------------------------

1. Physical composition of a hard disk // Principle

Head: reads and writes data on the platter surface.
Track: one circle of a given radius on a platter.
Cylinder: the tracks at the same radius on all platters, stacked on top of each other.
Sector: a wedge-shaped piece of a track bounded by two radii; it is the smallest physical storage unit of the disk.

------------------------------------------------------------

2. Disk partitioning // Principle

The cylinder is the smallest unit of partitioning.
Partitioning a disk means declaring that a partition runs from cylinder A to cylinder B.

The partition information for the whole disk is stored in the MBR (Master Boot Record, the primary boot sector), which sits on track 0 of the hard disk. The computer reads this area as soon as it starts.
Given what the MBR holds, if the MBR of a hard disk is damaged, the hard disk is effectively dead.

MBR restrictions: the MBR is too small to hold much partition information. It can record at most four partition entries (a primary partition and an extended partition each count as one entry), and at most one of them can be an extended partition.
It follows that a hard disk can have at most four primary-plus-extended partitions, of which only one may be extended. For example, once you have created 3 P + 1 E, no further entry can be added; any additional partitions have to be logical partitions inside the extended one.
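The four-entry limit is easier to see from the on-disk layout. The sketch below is only an illustration, assuming the classic 512-byte MBR: the partition table starts at byte 446, each of the four entries is 16 bytes, and the sector ends with the signature 0x55 0xAA. It reads the LBA fields of each entry and skips the cylinder/head/sector fields. The sample layout produced by `build_sample_mbr` is made up for the demonstration.

```python
import struct

SECTOR_SIZE = 512                    # classic MBR sector size (assumed)
TABLE_OFFSET = 446                   # partition table follows the boot code
ENTRY_SIZE = 16                      # four entries of 16 bytes each
ENTRY_FORMAT = "<B3xB3xII"           # flag, CHS start (skipped), type, CHS end (skipped), LBA start, sector count
EXTENDED_TYPES = {0x05, 0x0F, 0x85}  # partition type codes used for extended partitions


def parse_mbr(sector: bytes):
    """Return the non-empty partition entries found in one MBR sector."""
    if len(sector) < SECTOR_SIZE or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    entries = []
    for index in range(4):           # the table has room for exactly four entries
        offset = TABLE_OFFSET + index * ENTRY_SIZE
        flag, part_type, lba_start, sectors = struct.unpack_from(ENTRY_FORMAT, sector, offset)
        if part_type == 0:           # type 0 marks an unused slot
            continue
        entries.append({
            "slot": index + 1,
            "bootable": flag == 0x80,
            "type": part_type,
            "extended": part_type in EXTENDED_TYPES,
            "lba_start": lba_start,
            "sectors": sectors,
        })
    return entries


def build_sample_mbr() -> bytes:
    """Build a made-up MBR with 3 primary partitions and 1 extended partition."""
    sector = bytearray(SECTOR_SIZE)
    layout = [(0x83, 2048, 204800), (0x83, 206848, 409600),
              (0x83, 616448, 409600), (0x05, 1026048, 1000000)]
    for index, (part_type, start, count) in enumerate(layout):
        flag = 0x80 if index == 0 else 0x00
        struct.pack_into(ENTRY_FORMAT, sector, TABLE_OFFSET + index * ENTRY_SIZE,
                         flag, part_type, start, count)
    sector[510:512] = b"\x55\xaa"
    return bytes(sector)


for entry in parse_mbr(build_sample_mbr()):
    kind = "extended" if entry["extended"] else "primary"
    print(entry["slot"], kind, entry["lba_start"], entry["sectors"])
```

With all four slots filled, as in the 3 P + 1 E sample, there is simply no fifth slot left in the table.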

------------------------------------------------------------

3. File system // Principle

After telling the system the start and end cylinders of a partition, you still need to format the partition as a "filesystem" that the operating system recognizes.

We can say that every partition is a filesystem.

We just mentioned that the smallest physical storage unit of a hard disk is the sector, but data is not actually stored sector by sector, because that would be too inefficient. To overcome this, the logical block was introduced. The logical block is the "minimum storage unit" fixed when a partition is formatted as a filesystem. A block is built out of sectors (the sector being the smallest physical storage unit of the disk), so the block size is the sector size multiplied by a power of 2.

Considerations when planning the block size: file read efficiency, and the space wasted because even a small file occupies a whole block.
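The space-waste side of that trade-off is simple arithmetic. The sketch below is an illustration only; the helper names `blocks_needed` and `wasted_bytes` are made up, and the block sizes are the three ext2 values discussed in this article.

```python
def blocks_needed(file_size: int, block_size: int) -> int:
    """Number of whole blocks a file of file_size bytes occupies."""
    return -(-file_size // block_size)      # ceiling division


def wasted_bytes(file_size: int, block_size: int) -> int:
    """Bytes allocated to the file but not used by its content."""
    return blocks_needed(file_size, block_size) * block_size - file_size


for block_size in (1024, 2048, 4096):
    print(block_size, blocks_needed(100, block_size), wasted_bytes(100, block_size))
```

A 100-byte file wastes most of its single block whatever the block size, and the waste grows as the block gets larger. This is why many small files favor small blocks.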

Superblock: as mentioned above, each partition is a filesystem, and the block at the very beginning of each filesystem is called the superblock. The superblock stores information about the filesystem as a whole, such as the block size, the number of free and used blocks, and their totals. This means that whenever you access data on this partition (filesystem), the first block consulted is the superblock. So if the superblock is damaged, the data on that partition is probably lost!

Note: the MBR and the superblock belong to two different levels! The MBR describes how the disk is divided into partitions; the superblock describes the filesystem inside one partition.

Disk partitioning: when you choose a filesystem type for a partition, formatting lays out that filesystem's structures on the partition.

------------------------------------------------------------

4. Linux ext2 file system // Principle

A Linux file consists not only of its content but also of its attributes. The Linux filesystem stores the attributes (in an inode) and the content (in blocks) separately.

When a partition is formatted as an ext2 filesystem, it contains two regions: the inode table and the block area. Inodes live in the inode table, and blocks live in the block area.

A block is the smallest unit of data storage. So what is an inode? In short, a block is where the content of a file is recorded, while an inode records the attributes of the file and which blocks hold its content. In other words, an inode not only records a file's attributes but also acts as a pointer to the blocks where the content is stored, so that the operating system can retrieve the content correctly.

How does Linux read the content of a file? Directories and files are described separately below:

Directory: when we create a directory on an ext2 filesystem, ext2 allocates one inode and at least one block to it. The inode records the directory's attributes and points to the allocated block. The block records the entries (file name and inode number) of the files and subdirectories under this directory!
File: when we create a regular file on ext2, ext2 allocates one inode and as many blocks as the file size requires. For example, if a block is 4 Kbytes and I create a 100 Kbytes file, Linux allocates one inode and 25 blocks to store it!

Note that the inode does not record the file name, only the file's attributes. The file name is recorded in the block area of the directory! So what is the relationship between files and directories? As described under "Directory" above, the entry for each file is recorded in the data block of its directory. Therefore, to read a file, Linux starts from the root directory /, reads the inode of each directory along the path, finds in that directory's block the entry that gives the inode of the next component, and finally follows the file's own inode to the blocks that hold the content. For /etc/crontab, the chain is: inode of / → block of / (entry for etc) → inode of /etc → block of /etc (entry for crontab) → inode of crontab → data blocks of crontab.
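The lookup chain for /etc/crontab can be sketched as a toy in-memory model. This is not how ext2 is implemented; the inode numbers, block numbers, and file content below are invented for illustration (the only real-world detail borrowed is that ext2 uses inode 2 for the root directory). The point is that names live in directory blocks, while inodes hold only attributes and block pointers.

```python
# Toy inode table: inode number -> attributes and block pointers (no file names here).
inodes = {
    2:  {"type": "dir",  "blocks": [10]},      # the root directory /
    12: {"type": "dir",  "blocks": [20]},      # /etc
    34: {"type": "file", "blocks": [30, 31]},  # /etc/crontab
}

# Toy block area: directory blocks map names to inode numbers, file blocks hold content.
blocks = {
    10: {"etc": 12},
    20: {"crontab": 34},
    30: b"SHELL=/bin/sh\n",
    31: b"# example job line\n",
}

ROOT_INODE = 2


def read_file(path: str) -> bytes:
    """Resolve path one component at a time, then read the file's blocks."""
    inode_number = ROOT_INODE
    for name in (part for part in path.split("/") if part):
        inode = inodes[inode_number]
        if inode["type"] != "dir":
            raise NotADirectoryError(path)
        entries = {}
        for block_number in inode["blocks"]:   # a directory may span several blocks
            entries.update(blocks[block_number])
        if name not in entries:
            raise FileNotFoundError(path)
        inode_number = entries[name]           # the directory entry gives the next inode
    inode = inodes[inode_number]
    if inode["type"] != "file":
        raise IsADirectoryError(path)
    return b"".join(blocks[block_number] for block_number in inode["blocks"])


print(read_file("/etc/crontab").decode())
```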

In addition, here are a few more things to keep in mind about the ext2 filesystem (for understanding):

• When an ext2 or ext3 filesystem is created (formatted), the number of inodes and blocks is fixed;

• The block sizes allowed by ext2 are 1024, 2048, and 4096 bytes;

• The maximum number of files in a partition (filesystem) is limited by the number of inodes, because every file occupies at least one inode!

• If a directory contains so many files that one block cannot hold all the entries, Linux gives the directory an additional block to continue recording them;

• The number of inodes is usually set to (partition capacity) divided by (the capacity one inode is expected to manage). For example, if the block size is planned as 4 Kbytes and one inode is assumed to manage two blocks, that is, a typical file is about 8 Kbytes, then for a 1 Gbyte partition the total number of inodes is (1 G * 1024 M/G * 1024 K/M) / (8 K) = 131072. An inode occupies 128 bytes, so formatting creates an inode table of 131072 * 128 bytes = 16777216 bytes = 16384 Kbytes. In other words, this 1 GB partition has already given up 16 Mbytes before it stores any data!

• Because an inode records the attributes of only one file, it makes no sense to have more inodes than blocks! In the example above, the block size is 4 Kbytes, so 1 GB holds 262144 blocks of 4 Kbytes. Even if every file used just one block, any inodes beyond 262144 could never be used and would simply waste disk space. Conversely, if files are large, each file takes one inode and many blocks, so far fewer inodes are needed!

• With a smaller block size and a larger number of inodes, less space is wasted, but writing large files is less efficient. This suits systems with many small files, such as BBS or news systems;

• With a larger block size and fewer inodes, writing large files is more efficient, but more disk space may be wasted. This suits systems that hold large files!
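The inode arithmetic in the list above can be checked with a few lines. The function name `inode_plan` and its parameter names are made up for this sketch; the 128-byte inode size and the 8 Kbytes-per-inode ratio are the figures used in the example above.

```python
KIB = 1024
GIB = 1024 ** 3


def inode_plan(partition_bytes, bytes_per_inode, inode_size=128, block_size=4 * KIB):
    """Return (inode count, inode table size in bytes, block count)."""
    inode_count = partition_bytes // bytes_per_inode
    table_bytes = inode_count * inode_size
    block_count = partition_bytes // block_size
    return inode_count, table_bytes, block_count


inode_count, table_bytes, block_count = inode_plan(GIB, 8 * KIB)
print(inode_count)           # 131072 inodes
print(table_bytes // KIB)    # 16384 Kbytes taken by the inode table
print(block_count)           # 262144 blocks of 4 Kbytes
```

Halving `bytes_per_inode` doubles both the inode count and the size of the inode table, which is the cost of supporting more small files.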

------------------------------------------------------------

5. ext2 file system storage architecture and principle // Principle

When an ext2 filesystem is created, it contains the following areas: superblock, group description, block bitmap, inode bitmap, inode table, and data blocks. Note that when an ext2 filesystem is created, it is divided into several block groups according to the partition size, and each block group has these parts. The whole filesystem is therefore a sequence of block groups, each containing the areas listed above.

To simplify, suppose the filesystem has only one block group. Each of the areas then means the following:

• Superblock: as mentioned above, the superblock records information about the entire filesystem. Without the superblock, the filesystem is unusable. It records the following:

o the total number of blocks and inodes;
o the number of unused and used inodes and blocks;
o the size of a block and of an inode;
o the filesystem mount time, the time of the last write, the time of the last disk check (fsck), and other filesystem-related information;
o a valid bit. If the filesystem is currently mounted, the valid bit is 0; if not, it is 1.

• Group description: records where the blocks of this group start;

• Block bitmap: records which blocks are in use;

• Inode bitmap: records which inodes are in use;

• Inode table: the storage area for the inodes;

• Data blocks: the storage area for file data.

When we add a new file (or directory):

1. Based on the inode bitmap and block bitmap, find an unused inode and unused blocks, then record the file's attributes in the inode and its data in the blocks;

2. Report the inode and block numbers just used to the superblock, inode bitmap, and block bitmap so that this metadata is updated. In general, the inode table and block area are called the data storage area, while the other records such as the superblock, block bitmap, and inode bitmap are called metadata. So whenever a piece of data is written to the hard disk, these two actions take place.
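The two actions above can be sketched with a toy in-memory model of a single block group. The class `TinyFS` and everything in it are hypothetical and greatly simplified (no group description, no directories); it only shows the order of operations: find free entries in the bitmaps, write the inode and data, then update the metadata.

```python
class TinyFS:
    """Toy single-block-group filesystem, for illustration only."""

    def __init__(self, inode_count: int, block_count: int, block_size: int):
        self.block_size = block_size
        self.inode_bitmap = [False] * inode_count   # True means "in use"
        self.block_bitmap = [False] * block_count
        self.inode_table = [None] * inode_count
        self.data_blocks = [b""] * block_count
        self.superblock = {"free_inodes": inode_count, "free_blocks": block_count}

    def create(self, attributes: dict, data: bytes) -> int:
        needed = -(-len(data) // self.block_size)   # ceiling division
        free_inodes = [i for i, used in enumerate(self.inode_bitmap) if not used]
        free_blocks = [b for b, used in enumerate(self.block_bitmap) if not used]
        if not free_inodes or len(free_blocks) < needed:
            raise OSError("no space left in the toy filesystem")
        inode_number, chosen = free_inodes[0], free_blocks[:needed]

        # Action 1: record attributes in the inode and content in the blocks.
        self.inode_table[inode_number] = dict(attributes, size=len(data), blocks=chosen)
        for position, block_number in enumerate(chosen):
            start = position * self.block_size
            self.data_blocks[block_number] = data[start:start + self.block_size]

        # Action 2: update the metadata (bitmaps and superblock counters).
        self.inode_bitmap[inode_number] = True
        for block_number in chosen:
            self.block_bitmap[block_number] = True
        self.superblock["free_inodes"] -= 1
        self.superblock["free_blocks"] -= needed
        return inode_number


fs = TinyFS(inode_count=4, block_count=8, block_size=4)
number = fs.create({"mode": 0o644}, b"hello world")
print(number, fs.inode_table[number]["blocks"], fs.superblock)
```

If the system stops between action 1 and action 2, the data area and the metadata disagree, which is the kind of inconsistency a disk check after an unclean shutdown has to look for.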

------------------------------------------------------------

6. Operation of the file system // Principle

So far we know that ext2/ext3 handles data access through the journal (log, in ext3), the metadata, and the data storage area. But when the Linux filesystem is running, is every piece of data really written straight to the hard disk?

To speed up access for the whole system, Linux works asynchronously. What does asynchronous mean? When the system reads a file, the blocks that hold the file are loaded into memory, so those disk blocks sit in a buffer cache in main memory. If the data in these blocks is modified, at first only the copy in main memory changes, and the buffered block is marked "dirty". At that point the physical block on disk has not yet been updated! The data in a dirty block must therefore be written back to the disk to keep the physical disk blocks consistent with the block data in main memory.

Why do it this way? Because main memory is much faster than the hard disk. If the system holds large files that are accessed constantly, the slow hard disk would drag down the entire Linux system, so asynchronous processing is used!

However, because the data on the hard disk and in main memory may be out of sync, if Linux is not shut down properly (for example, after a power failure or a crash), the next boot can spend quite a long time checking the disks, and the data on the disk may even be damaged!
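From a program's point of view, the gap between "written" and "on disk" can be closed explicitly. The sketch below is a minimal illustration using Python's standard `os.fsync`; the function name `write_durably` and the temporary file are assumptions made for the example.

```python
import os
import tempfile


def write_durably(path: str, data: bytes) -> None:
    """Write data and ask the kernel to push it to the storage device."""
    with open(path, "wb") as handle:
        handle.write(data)          # data sits in the program's own buffer
        handle.flush()              # data moves to the kernel cache (blocks are now "dirty")
        os.fsync(handle.fileno())   # request that the dirty data be written to the device


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "note.txt")
    write_durably(target, b"hello\n")
    with open(target, "rb") as handle:
        print(handle.read())
```

Without the `os.fsync` call the write still succeeds, but the data may stay only in memory for a while, which is exactly the window in which a power failure loses it.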

------------------------------------------------------------

7. Significance of mount points

A mount point must be a directory, not a file! That is, the mount point is the entry point into the filesystem.

------------------------------------------------------------

8. Disk and directory capacity // Practice

1) Show the total disk capacity and the remaining available capacity

df [options] directory or file name

2) List the space used by each file and directory under a directory

du [options] directory or file name
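For scripting, rough equivalents of the two commands can be written with Python's standard library. This is not how df and du are implemented; the helper names `df_like` and `du_like` are made up, and `du_like` sums file sizes rather than allocated blocks, so its numbers will usually differ somewhat from what du prints.

```python
import os
import shutil


def df_like(path: str = "/") -> dict:
    """Total, used, and free bytes of the filesystem that contains path."""
    usage = shutil.disk_usage(path)
    return {"total": usage.total, "used": usage.used, "free": usage.free}


def du_like(path: str) -> int:
    """Sum of the sizes of all regular files under path (symlinks skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full_path = os.path.join(root, name)
            if not os.path.islink(full_path):
                total += os.lstat(full_path).st_size
    return total


print(df_like(os.getcwd()))
print(du_like(os.getcwd()))
```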
　　　　

------------------------------------------------------------

9. Link files

1) Hard link

A hard link only adds one more entry for the file in some directory!

For example, suppose /root/crontab is a hard link to /etc/crontab. Then /root/crontab and /etc/crontab are in fact the same file; it is just that two directories (/etc and /root) each record an entry for it! From the entry recorded in /etc, I learn that the inode of crontab is at location A, and the entry under /root points to the same inode at A! So the inode and blocks of crontab are unchanged; the only difference is that two directories now record an entry for it. What is the benefit? The biggest benefit is "safety"! Whichever of /root/crontab and /etc/crontab is deleted, only the entry in that one directory is removed; the inode and block data of the file itself are untouched! And whichever directory entry you go through, you reach the same inode and blocks of crontab, so you can modify the data correctly!

In general, creating a hard link does not change the disk space used or the inode count! As described above, a hard link only writes one more entry into a directory's block, so it consumes no additional inode or data block!

TIPS: strictly speaking, it can change: if the directory's block is already full, a new block may be added to hold the entry, which changes the disk space used! But the entry a hard link adds is very small, so it usually does not change the inode count or disk usage!

Because a hard link is an entry created within the same partition, hard links have restrictions:
• They cannot span filesystems;
• They cannot link to a directory.
The first restriction is easy to understand, since a hard link is by nature an entry created within one partition. But why can't a hard link point to a directory? Because linking to a directory would require creating links for everything under that directory as well, which makes the environment considerably more complex. For now, hard links to directories are not supported.

2) Symbolic link

Compared with a hard link, a symbolic link is easier to understand. Basically, a symbolic link is an independent file whose content tells the system to go and read the file it points to. Because it only points to another file, once the source file is deleted the symbolic link can no longer be opened! A symbolic link is roughly equivalent to a Windows shortcut. The file created by a symbolic link is a new, independent file, so it uses its own inode and block!

From the description above, hard links look safer, because even if the entry under one directory is removed, it does not matter: as long as an entry remains under any directory, the file does not disappear! For example, my /etc/crontab and /root/crontab point to the same file. If I delete /etc/crontab, the deletion only removes the entry for crontab under /etc. The inode and blocks of crontab are unchanged!

Unfortunately, hard links have so many restrictions, including that they cannot link to a directory, that their use is limited, whereas symbolic links are used widely! If this all seems a bit dizzying, do not worry: it becomes clear once you try it. To create a link file, use the ln command (see its help for detailed usage).
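The difference between the two kinds of link can be observed directly. The sketch below uses Python's standard `os.link` and `os.symlink` (the programmatic counterparts of ln and ln -s) in a temporary directory; the file names and the function `link_demo` are arbitrary choices for the example.

```python
import os
import tempfile


def link_demo(workdir: str) -> dict:
    """Create a file, a hard link, and a symbolic link, then delete the original."""
    original = os.path.join(workdir, "crontab")
    hard = os.path.join(workdir, "crontab.hard")
    soft = os.path.join(workdir, "crontab.soft")
    with open(original, "w") as handle:
        handle.write("job\n")

    os.link(original, hard)       # second directory entry for the same inode
    os.symlink(original, soft)    # new file that stores the path of the target

    result = {
        "same_inode": os.stat(original).st_ino == os.stat(hard).st_ino,
        "nlink_after_link": os.stat(original).st_nlink,
        "symlink_has_own_inode": os.lstat(soft).st_ino != os.stat(original).st_ino,
    }

    os.remove(original)           # removes one directory entry only
    with open(hard) as handle:
        result["hard_content"] = handle.read()
    result["nlink_after_delete"] = os.stat(hard).st_nlink
    result["symlink_dangling"] = not os.path.exists(soft)   # exists() follows the link
    return result


with tempfile.TemporaryDirectory() as workdir:
    for key, value in link_demo(workdir).items():
        print(key, repr(value))
```

After the original name is removed, the hard link still reads the content because the inode and blocks are untouched, while the symbolic link points at a path that no longer exists.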

Link count of directories:

You may have noticed that when we hard-link a file, the second field shown by ls -l (the link count) increases by 1.
So what is the default link count when a directory is created? Think about what an "empty directory" contains at the very least: two entries, "." and ".."! When we create a new directory named /tmp/testing, there are basically three things:
• /tmp/testing
• /tmp/testing/.
• /tmp/testing/..
Of these, /tmp/testing and /tmp/testing/. are the same thing: both refer to the directory itself. /tmp/testing/.. refers to the directory /tmp. Therefore, when we create a new directory, the link count of the new directory is 2, and the link count of its parent directory increases by 1.
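The link counts can be inspected with `os.stat`. The sketch below assumes a filesystem that counts directory links the way ext2 does, as described above; some filesystems report directory link counts differently, so the numbers printed may not be 2 and +1 everywhere. The function name `dir_link_counts` is made up for the example.

```python
import os
import tempfile


def dir_link_counts(parent: str) -> dict:
    """Create parent/testing and report link counts before and after."""
    before = os.stat(parent).st_nlink
    child = os.path.join(parent, "testing")
    os.mkdir(child)
    return {
        "parent_before": before,
        "parent_after": os.stat(parent).st_nlink,   # ".." inside the child points back here
        "child": os.stat(child).st_nlink,           # the name in the parent plus "." inside itself
    }


with tempfile.TemporaryDirectory() as workdir:
    print(dir_link_counts(workdir))
```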

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
