10 Questions About the Linux File System: a deep look at how files are stored
-- How well do you know file systems? Source: "File System 10 Questions"
I believe everyone is familiar with file systems. As engineers we deal with them almost every day, but how deep does our understanding really go? Let's look at ten questions about the Linux file system:
1. Random reads and writes on a mechanical disk are very slow. What tricks does the operating system use to improve random read/write performance?
2. Does touching a new empty file occupy disk space? If so, how much?
3. Does creating an empty directory occupy disk space? How much? Which occupies more, a new directory or a new file?
4. Do you know where on the disk a file's name is recorded?
5. How long can a file name be? What imposes the limit?
6. Will an overly long file name hurt system performance? Why?
7. How many files can be created in one directory?
8. How much disk space does a new file with 1 KB of content actually occupy?
9. When I ask the operating system to read just 2 bytes of a file, how much does it actually read?
10. How can we increase disk I/O speed when working with files?
If you can already answer every question, feel free to close this article. If you can't, and you share the author's curiosity about the operating system's inner workings, come explore these interesting corners of the file system with me. I believe understanding them will be a great help in our daily work.
I. Disk structure and partitioning
1. Physical disk structure
Let's start with the most basic physical structure of the disk. Note that this article only discusses mechanical disks; SSDs are not covered. We humans are always dividing things into hierarchies and managing them by that structure: an army is divided into corps, divisions, brigades, regiments, and battalions; a company into business groups, departments, centers, and teams. In the same way, a disk is divided into platters, heads, tracks, cylinders, and sectors.
Platter (Surface): a disk is composed of a stack of platters, as shown in the lower-left figure.
Heads: each head corresponds to one platter surface and is responsible for reading and writing its data.
Track: each platter surface is divided into multiple concentric circles around the center; each circle is called a track.
Cylinder: a cylinder is the three-dimensional set formed by the tracks at the same position on all platters.
Sector: a track is still too large a unit to manage, so the computer pioneers divided each track into multiple sectors. One reason I love Linux is that, as long as you are willing to dig, you can peel it open layer by layer until your curiosity is satisfied. On Linux you can run the fdisk command to view the physical information of the disks the current system uses. The output above is from one of my own virtual machines. We can see the disk has 255 heads, i.e. 255 platter surfaces in total; 3263 cylinders, i.e. 3263 tracks per surface; and 63 sectors/track, i.e. 63 sectors on each track. The output also shows that the sector size is 512 bytes. Let's calculate the size of the disk:
255 surfaces * 3263 tracks per surface * 63 sectors per track * 512 bytes per sector = 26,839,088,640 bytes.
The result is about 26.8 GB, consistent with the total size of the disk. (fdisk's detailed figure differs by about 4 MB, which I have not fully explained; interested readers can dig further.)
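The arithmetic can be reproduced in a quick shell sketch (the geometry values are the ones fdisk reported for my virtual machine; yours will differ):

```shell
# Geometry reported by fdisk on the author's VM:
heads=255; cylinders=3263; sectors_per_track=63; sector_size=512
bytes=$((heads * cylinders * sectors_per_track * sector_size))
echo "$bytes bytes"                                    # 26839088640
echo "$((bytes / 1000 / 1000 / 1000)) GB (decimal)"    # 26
```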
In addition, I checked the disks of two other machines and found something interesting: whether the disk capacity is large or small, the number of heads and the number of sectors per track stay the same, while the number of tracks grows.
2. Partitions
Partitioning is the first step the operating system takes to manage a disk, and it is a concept every computer user knows well, for example the C, D, E, and F drives on Windows. Now think about it:
Question: given the detailed physical structure described above, if you had to divide the whole disk into partitions such as C and D, how would you do it?
Solution 1: split by platter: out of the 255 platter surfaces, assign some to drive C, some to drive D, and so on.
Solution 2: split by cylinder: out of the 3263 cylinders, assign some to drive C, some to drive D, and so on.
Which of the two would you choose? First, let's walk through one disk I/O. Step 1: move the head radially to the track where the data lives; the time this takes is called the seek time. Step 2: having found the target track, wait for the platter to rotate until the target sector passes under the head; this is the rotational delay. Step 3: read or write the data in the target sector; this is the transfer time. That completes one disk I/O, so:
Time for a single disk I/O = seek time + rotational delay + transfer time.
For the rotational delay: mainstream servers commonly use 10,000 RPM disks, so one revolution takes 60 * 1000 / 10000 = 6 ms, and the rotational delay is therefore 0-6 ms. The transfer time is generally very short. The seek time on a modern disk is about 3-15 ms, affected mainly by the distance between the head's current position and the target track.
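As a rough sketch of the formula above (the seek value is an illustrative assumption in the 3-15 ms range, not a measurement):

```shell
rpm=10000                              # a mainstream 10,000 RPM server disk
full_rev_ms=$((60 * 1000 / rpm))       # 6 ms for one full revolution
avg_rotation_ms=$((full_rev_ms / 2))   # ~3 ms average rotational delay
avg_seek_ms=9                          # assumed mid-range seek time (3-15 ms)
echo "one disk IO ~ $((avg_seek_ms + avg_rotation_ms)) ms plus a short transfer time"
```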
So which one is actually used? What matters most is speed. Data in the same partition is often read together. With solution 1, the head would have to keep jumping across more than 3000 tracks, multiplying the seek time and dragging down disk performance. With solution 2, when drive C is in use the head only has to move within that partition's small range of tracks, greatly reducing the seek time. (In practice a partition does not start at cylinder 0; the cylinders corresponding to the disk's first tracks hold the boot loader and the partition table.) Because solution 2 cuts the seek component of disk I/O time, it is the scheme all operating systems adopt; solution 1 is not used.
If you have used fdisk to partition a disk on Linux, you may have noticed that partitions are specified as cylinder ranges, which fully confirms that operating systems adopt solution 2.
Back to question 1: what trick does the operating system use to improve random read/write performance? It divides partitions along cylinder boundaries, which reduces the seek time within each disk I/O and so improves disk read/write performance.
II. Directories and files
1. Introduction
All right, with the disk basics out of the way, we can officially enter the topic and start our discussion of the Linux file system. Isn't a file system just directories and files? We are all familiar with both. But are you sure they are not familiar strangers? First, create an empty directory and an empty file. The result looks like this:
We all know the fifth column shows the occupied space, so let me ask a few small questions.
(1) Why does the directory occupy 4096 bytes?
(2) Why does the empty file occupy 0 bytes?
(3) If an empty file occupies 0 bytes, where are the file name, owner, permission bits, and other metadata stored?
2. I don't believe an empty file takes no space
To answer this we need the df command. Run df -i:
The red box in the output shows the inode information. If you are not familiar with the inode concept, for now just treat it as something the operating system quietly maintains behind the scenes, and note that it occupies space. Next I touch an empty file and run df -i again.
Although the operating system told us the new empty file occupies 0 bytes, this experiment proves it was "fooling" us: the file consumed one inode. So how big is an inode? The dumpe2fs command can show us the actual size of this thing.
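The same effect can be seen without df -i: stat on a freshly touched empty file reports zero bytes and zero data blocks, yet the file has already been assigned an inode number (a small sketch using GNU stat on Linux):

```shell
dir=$(mktemp -d)                   # scratch directory
touch "$dir/empty"                 # the empty file under test
size=$(stat -c %s "$dir/empty")    # apparent size in bytes
blocks=$(stat -c %b "$dir/empty")  # number of 512-byte blocks allocated
inode=$(stat -c %i "$dir/empty")   # the inode number it received
echo "size=$size blocks=$blocks inode=$inode"
```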
In the output we can find the following line:
It tells us that each inode is 256 bytes here. The value differs between machines; it is actually decided by the system when the disk is formatted.
So the second question is answered: creating an empty file does occupy disk space, 256 bytes in this case. More precisely, it occupies one inode, whose size is fixed at format time.
Now let's talk about creating an empty directory. As shown earlier, a new empty directory occupies 4 KB of disk space. Is that the whole story? Again we use df -i to watch the system's inode usage before and after creating a directory.
So a directory also consumes an inode, and the third question has its answer: creating an empty directory occupies 4 KB of disk space plus one inode. (It may not be 4 KB on your system; it is really one block, whose size you can also find in the dumpe2fs output.)
My disk simply happens to have been formatted with a 4 KB block size.
3. The mysterious 4 KB of an empty directory
With the earlier mysteries solved, another thing made me curious as an engineer: what is the 4 KB occupied by an empty directory used to store? It feels mysterious.
cd into the directory we created.
Create two empty files and check the directory's space usage again.
It seems there are no new discoveries. Because empty files occupy no blocks, what is shown here is still the block occupied by the directory itself, unchanged from before. So I went further and used a PHP script to create 100 empty files, each with a 32-byte file name.
This time the disk space occupied by the directory grew to three blocks. And that answers our fourth question from the beginning: file names are stored in the blocks occupied by their directory. Next I also verified that the number of file names each directory block holds depends on the file-name length (which sounds obvious, but personally proving my own conjecture was still satisfying). I created another empty directory and 100 empty files with names 3 * 32 bytes long; that directory's disk usage came out as follows:
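You can repeat the experiment with nothing but stat: a directory's own size grows as entries with long names are added (a sketch; the exact sizes depend on your file system and block size):

```shell
d=$(mktemp -d)
before=$(stat -c %s "$d")                  # size of the empty directory
# create 1000 empty files, each with a 32-character name
for i in $(seq 1 1000); do
    touch "$d/$(printf 'file_%027d' "$i")" # "file_" + 27 digits = 32 chars
done
after=$(stat -c %s "$d")                   # the directory grew to hold the names
echo "directory size: $before -> $after bytes"
```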
You may ask why the number of blocks did not simply triple. In fact, in the Linux file system a directory entry contains other fields besides the file name, so a name 3 times longer does not make the whole entry 3 times larger; see the Linux kernel source for details.
Now question 6 has an answer: an overly long file name can indeed hurt system performance, because it may lead to more disk I/O. Many programmers like to give a file a long, meaningful name so users can see its purpose at a glance. That is not necessarily bad, but if you have a large number of files, consider whether your names are bloating the directory's blocks. The space itself is trivial and disks are cheap, but think about how the operating system feels when it looks up a file in that directory: it compares your file name string against the entries, and in the unlucky case it has to scan every block of the directory. (Of course, unless your names are abnormally long or your file count reaches the hundred-thousand scale, this overhead is not large; it is still worth knowing it exists.)
As for question 5, the maximum file-name length: the Linux operating system imposes a limit of 255 bytes, to keep programmers from using arbitrarily long file names.
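That limit is exposed to programs as NAME_MAX, which you can query per file system with getconf (PATH_MAX, the limit on a whole path, is shown alongside for comparison):

```shell
name_max=$(getconf NAME_MAX /)   # longest single file-name component, typically 255
path_max=$(getconf PATH_MAX /)   # longest whole path, typically 4096 on Linux
echo "NAME_MAX=$name_max PATH_MAX=$path_max"
```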
Have you ever noticed that when a directory holds many files, the ls command becomes very slow? Now you know the reason: the operating system is reading every block of the current directory, and if there are many blocks, even this simple ls command may need multiple I/Os to complete.
I once created a huge number of empty files in one directory on my machine; ls produced no output for a full minute before I hit Ctrl+C. Do not do this in your projects. The operating system does cache the directory data, which helps later calls a lot, but I still suggest keeping a single directory to tens of thousands of files at most; otherwise your program may perform badly the first time it runs after a restart.
Now, back to question 7: do you have an answer? The maximum number of files you can create is limited by the number of inodes in the partition; as long as free inodes remain, you can keep creating files. However, as mentioned above, the file count in a single directory should not grow too large, or it will cause system performance problems.
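To see how many more files a file system can still hold, look at the IFree column of df -i (a sketch using the POSIX -P format so the output stays on one line):

```shell
# IUsed / IFree are the inode counters behind questions 2, 3 and 7
df -Pi . | awk 'NR==2 {print "inodes used:", $3, " inodes free:", $4}'
free_inodes=$(df -Pi . | awk 'NR==2 {print $4}')
echo "this file system can still create about $free_inodes files"
```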
4. File blocks
Next, an experiment on files. I created a new empty directory and a new file under it containing only a single space character. After saving, the du command shows the following:
Of the 8 KB, 4 KB is the directory itself, so we can deduce that the operating system allocated 4 KB for a file containing only one space. File blocks are actually simpler than directory blocks: unlike a directory block, which holds structured entries, a file block holds nothing but the file's data. The experiment shows that the operating system allocates space in whole blocks: as long as your file's data is non-empty, it gets at least one block, and once you exceed 4 KB it gets the next block, and so on. So for question 8: creating a file with 1 KB of content actually occupies one block (typically 4 KB) plus one inode (typically 256 bytes).
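A quick sketch of the same experiment (assuming GNU stat/du on a file system with 4 KB blocks):

```shell
d=$(mktemp -d)
printf ' ' > "$d/tiny"                 # file content: a single space character
apparent=$(stat -c %s "$d/tiny")       # logical size: 1 byte
used_kb=$(du -k "$d/tiny" | cut -f1)   # disk usage: rounded up to a whole block
echo "apparent=${apparent} byte, used=${used_kb} KB"
```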
In fact, when the file system issues an I/O request to the disk, it also works in units of the block size. Even if you ask the operating system for only 2 bytes, it reads a whole 4 KB at once. Disk I/O is genuinely slow, and if we access those 2 bytes we are quite likely to access the bytes after them soon; this is the principle of locality, so the operating system simply reads more data in one go. That is the answer to question 9.
It is like going to the supermarket: the trip itself is the real time sink. Nobody makes a round trip just to buy one apple; we buy extra for future needs at home, and carrying a basketful takes barely longer than carrying one apple.
Finally, question 10: how can you organize your files to increase I/O speed? If you know in advance roughly how much space a new file will need, say 1 MB, tell the operating system when you create it and have it reserve that much space up front. The operating system will then allocate contiguous blocks for you as far as possible, so when you later read the file the head saves a great deal of seek time and the I/O runs much faster.
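On Linux the usual way to "have a word with the operating system" is fallocate(1) (or posix_fallocate(3) from C), which reserves the space before you write anything; a minimal sketch:

```shell
f=$(mktemp)
# reserve 1 MiB up front; the file system will try to hand out contiguous
# blocks, which cuts seek time on later sequential reads of the file
fallocate -l 1M "$f"
size=$(stat -c %s "$f")
echo "reserved $size bytes in advance"
rm -f "$f"
```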
III. Closing notes
Everything above is based on my own file system, where a block is 4 KB, an inode is 256 bytes, and my virtual machine has only about 1.4 million inodes in total. None of these values is fixed; you can choose others when formatting your disk, based on its capacity and how you intend to use it.
If your files are mostly larger than 4 KB, even several MB or GB, we recommend increasing the block size, so that the limited number of block addresses an inode can record covers more data.
If most of your files are under 1 KB, 4 KB blocks waste a little space; if your boss is extremely cost-sensitive, you can set the block size smaller.
Also, keep an eye on your file system's inodes. When showing the space occupied by directories and files, the operating system hides inode usage, intending to give users a simplified view in which only data takes space; hiding the inodes makes the system easier to understand. But developers, unlike ordinary users, have a right to know, because the inode count directly limits how many files your file system can create. Otherwise you may one day find an online machine with plenty of disk space left but no free inodes, leaving you to reformat or migrate the server, and neither operation is pleasant.
A question to think about: we have all seen that copying a directory full of many small files to another place is very slow, and that we usually compress the directory first and copy the archive instead. Can you now explain why that is so much faster?