The inode is an important concept and the basis for understanding Unix/Linux file systems and disk storage.
I think understanding inodes not only improves your system administration skills, but also helps you appreciate the Unix design philosophy: how to abstract underlying complexity into a simple concept, and thereby greatly simplify the user interface.
Below are my inode study notes, which I have tried to keep as simple as possible.
==========================================
Understanding inode
Author: Ruan Yifeng
Address: http://www.ruanyifeng.com/blog/2011/12/inode.html
I. What is an inode?
To understand inode, start with file storage.
Files are stored on the hard disk. The smallest storage unit of a hard disk is the "sector". Each sector stores 512 bytes (0.5 KB).
When the operating system reads a hard disk, it does not read one sector at a time, because that would be too inefficient. Instead, it reads several consecutive sectors in one go, that is, one "block" at a time. This block, made up of multiple sectors, is the smallest unit of file access. The most common block size is 4 KB, which means eight consecutive sectors make up one block.
File data is stored in blocks. Obviously, we also need a place to store the file's metadata, such as the file's creator, its creation date, and its size. The area that stores file metadata is called the inode (rendered in Chinese as "index node").
Each file has an inode that contains information related to the file.
II. Inode contents
An inode contains the file's metadata. Specifically, it includes the following:
* File size in bytes
* User ID of the file owner
* Group ID of the file
* Read, write, and execute permissions of the file
* File timestamps. There are three: ctime is the last time the inode changed, mtime is the last time the file content changed, and atime is the last time the file was opened.
* Number of links, that is, how many file names point to this inode
* Location of the file's data blocks
You can run the stat command to view the inode information of a file:
stat example.txt
In short, all file information except the file name is stored in the inode. Why the file name is not there is explained in detail below.
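As a rough illustration of the fields listed above, here is a minimal Python sketch that reads them through the standard os.stat call. The function name inode_summary and the temporary demo file are my own, not part of the original notes. The location of the data blocks is not exposed through this interface, so it is left out.

```python
import os
import stat
import tempfile
import time


def inode_summary(path):
    """Collect the inode fields that os.stat exposes for `path`."""
    st = os.stat(path)
    return {
        "inode": st.st_ino,                        # inode number
        "size_bytes": st.st_size,                  # file size in bytes
        "uid": st.st_uid,                          # user ID of the owner
        "gid": st.st_gid,                          # group ID
        "permissions": stat.filemode(st.st_mode),  # e.g. "-rw-r--r--"
        "links": st.st_nlink,                      # names pointing to this inode
        "ctime": time.ctime(st.st_ctime),          # last inode change
        "mtime": time.ctime(st.st_mtime),          # last content change
        "atime": time.ctime(st.st_atime),          # last access
    }


if __name__ == "__main__":
    # Hypothetical demo file; any existing path would work here.
    with tempfile.NamedTemporaryFile(suffix=".txt") as f:
        f.write(b"hello inode\n")
        f.flush()
        for key, value in inode_summary(f.name).items():
            print(f"{key:12} {value}")
```

Note that the file name appears nowhere in the returned fields; it is only the argument used to look the inode up.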
III. Inode size
Inodes also consume disk space. Therefore, when a hard disk is formatted, the operating system automatically divides it into two areas. One is the data area, which stores file data; the other is the inode area (the inode table), which stores the information contained in inodes.
Each inode is usually 128 or 256 bytes in size. The total number of inodes is fixed at format time; typically one inode is allocated for every 1 KB or 2 KB of disk space. Suppose that on a 1 GB disk each inode is 128 bytes and one inode is allocated per 1 KB: the inode table will then be 128 MB, or 12.5% of the whole disk.
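The arithmetic in this example can be checked with a few lines of Python. This is only a sketch of the calculation, with a helper name (inode_table_size) of my own choosing, and it assumes binary units (1 GB = 1024 MB, 1 KB = 1024 bytes). Under that assumption the table is 128 MB and its share of the disk is 128/1024 = 12.5%.

```python
KIB = 1024
MIB = 1024 ** 2
GIB = 1024 ** 3


def inode_table_size(disk_bytes, bytes_per_inode, inode_size):
    """Return (inode count, table size in bytes, fraction of the disk)."""
    inode_count = disk_bytes // bytes_per_inode   # one inode per N bytes of disk
    table_bytes = inode_count * inode_size        # space taken by all inodes
    return inode_count, table_bytes, table_bytes / disk_bytes


count, table, fraction = inode_table_size(GIB, 1 * KIB, 128)
print(count, table // MIB, f"{fraction:.1%}")  # prints: 1048576 128 12.5%
```

Doubling the spacing to one inode per 2 KB halves both the inode count and the table size.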
Run the df command to view the total number of inodes and the number already used on each disk partition:
df -i
To view the size of each inode, run the following command:
sudo dumpe2fs -h /dev/hda | grep "Inode size"
Because every file must have an inode, it is possible to run out of inodes while the disk still has free space. When that happens, no new files can be created on the disk.
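The same counters that df -i prints can be read from a program. The sketch below uses Python's os.statvfs, which is available on Unix-like systems; the function name inode_usage is my own. Some file systems do not preallocate inodes and may report 0 for the total, so treat the numbers as indicative.

```python
import os


def inode_usage(path="/"):
    """Return (total, used, free) inode counts for the file system holding `path`."""
    vfs = os.statvfs(path)
    total = vfs.f_files   # total number of inodes on the file system
    free = vfs.f_ffree    # number of free inodes
    return total, total - free, free


if __name__ == "__main__":
    total, used, free = inode_usage("/")
    print(f"inodes: total={total} used={used} free={free}")
    if total and free == 0:
        print("No free inodes: new files cannot be created even if space remains.")
```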
IV. Inode number
Each inode has a number. The operating system uses the inode number to identify different files.
This is worth repeating: Unix/Linux systems do not use file names internally; they use inode numbers to identify files. To the system, a file name is just an alias or nickname that makes an inode number easier to recognize.
On the surface, the user opens a file by its name. In fact, the system does this in three steps: first, it finds the inode number that corresponds to the file name; second, it uses the inode number to obtain the inode information; finally, it uses the inode information to find the blocks where the file data is stored and reads the data.
Use the ls -i command to see the inode number that corresponds to a file name:
ls -i example.txt
V. Directory files
In Unix/Linux systems, a directory is also a file. Opening a directory is actually opening a directory file.
The structure of a directory file is very simple: it is a list of directory entries (dirent). Each directory entry consists of two parts: the name of a file contained in the directory and the inode number that corresponds to that name.
The ls command lists only the file names in the directory file:
ls /etc
The ls -i command lists the whole directory file, that is, the file names together with their inode numbers:
ls -i /etc
To see detailed information about a file, the system must use the inode number to access the inode and read the information from it. The ls -l command lists this detailed information:
ls -l /etc
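To make the "name plus inode number" structure of a directory concrete, here is a small Python sketch using os.scandir, whose entries expose exactly those two pieces. The function names and the temporary files are illustrative only. Anything beyond the name and inode number (size, permissions, and so on) requires a further lookup of the inode itself, which is what the second function does.

```python
import os
import stat
import tempfile


def list_directory_entries(directory):
    """Return sorted (file name, inode number) pairs, roughly what `ls -i` shows."""
    with os.scandir(directory) as entries:
        return sorted((entry.name, entry.inode()) for entry in entries)


def detailed_listing(directory):
    """Return (permissions, size, name) per entry; this needs a stat of each inode."""
    rows = []
    for name, _inode in list_directory_entries(directory):
        st = os.lstat(os.path.join(directory, name))
        rows.append((stat.filemode(st.st_mode), st.st_size, name))
    return rows


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        for name, text in (("a.txt", "aaa"), ("b.txt", "bbbbb")):
            with open(os.path.join(d, name), "w") as f:
                f.write(text)
        print(list_directory_entries(d))   # names with inode numbers
        print(detailed_listing(d))         # details read from the inodes
```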
VI. Hard links
In general, the file name and inode number are "one-to-one correspondence", and each inode number corresponds to a file name. However, in Unix/Linux, multiple file names can point to the same inode number.
This means that the same content can be accessed through different file names. Modifying the file content affects all of the file names, but deleting one file name does not affect access through another. This arrangement is called a "hard link".
The ln command creates a hard link:
ln source_file target_file
After this command runs, the source file and the target file have the same inode number and point to the same inode. The inode contains a field called the "number of links", which records the total number of file names pointing to the inode; creating the link increases it by 1.
Conversely, deleting a file name decreases the number of links in the inode by 1. When the value drops to 0, no file name points to the inode any longer, and the system reclaims the inode number and the blocks that belong to it.
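The behavior described above can be observed with a short Python sketch built on os.link, the library counterpart of ln. The file names and the function name hard_link_demo are made up for the demonstration, and everything happens inside a temporary directory.

```python
import os
import tempfile


def hard_link_demo():
    """Create a hard link and report what happens to the inode and link count."""
    with tempfile.TemporaryDirectory() as d:
        source = os.path.join(d, "source.txt")
        target = os.path.join(d, "target.txt")
        with open(source, "w") as f:
            f.write("original\n")

        os.link(source, target)  # same effect as: ln source.txt target.txt
        same_inode = os.stat(source).st_ino == os.stat(target).st_ino
        links_after_ln = os.stat(source).st_nlink

        # A change made through one name is visible through the other.
        with open(target, "a") as f:
            f.write("appended\n")
        with open(source) as f:
            seen_via_source = f.read()

        # Removing one name leaves the data reachable through the other.
        os.remove(source)
        links_after_rm = os.stat(target).st_nlink
        with open(target) as f:
            seen_after_rm = f.read()

        return same_inode, links_after_ln, seen_via_source, links_after_rm, seen_after_rm


if __name__ == "__main__":
    print(hard_link_demo())
```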
While we are here, a word about the "number of links" of a directory file. When a directory is created, two directory entries are generated by default: "." and "..". The inode number of the former is that of the current directory, so it is equivalent to a hard link to the current directory; the inode number of the latter is that of the parent directory, so it is equivalent to a hard link to the parent directory. Therefore, the total number of hard links of any directory always equals 2 plus the number of its subdirectories (including hidden directories).
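The "2 plus the number of subdirectories" rule can be checked with the sketch below, which creates a few subdirectories (one of them hidden) plus an ordinary file in a temporary directory and reads the directory's link count. The function name is hypothetical. One caveat that is not in the original notes: the rule holds on traditional file systems such as ext4, but some file systems, Btrfs for example, report a link count of 1 for every directory.

```python
import os
import tempfile


def directory_link_count(subdir_names):
    """Create the given subdirectories in a fresh directory and return its link count."""
    with tempfile.TemporaryDirectory() as parent:
        for name in subdir_names:
            os.mkdir(os.path.join(parent, name))
        # A regular file adds a directory entry but no link to the directory itself.
        with open(os.path.join(parent, "file.txt"), "w") as f:
            f.write("x")
        return os.stat(parent).st_nlink


if __name__ == "__main__":
    names = ["a", "b", ".hidden"]
    print("subdirectories:", len(names), "link count:", directory_link_count(names))
```

The count is 2 for an empty directory because of its own "." entry and its name in the parent; each subdirectory then contributes one more through its ".." entry.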
VII. Soft links
In addition to hard links, there is also a special case.
File A and file B have different inode numbers, but the content of file A is the path of file B. When file A is read, the system automatically redirects the reader to file B. So whichever file is opened, it is file B that is ultimately read. In this case, file A is called a "soft link" or "symbolic link" to file B.
This means that file A depends on file B. If file B is deleted, opening file A produces the error "No such file or directory". This is the biggest difference between soft links and hard links: file A points to the file name of file B, not to its inode number, so the "number of links" in file B's inode does not change.
The ln -s command creates a soft link:
ln -s source_file_or_directory target_file_or_directory
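The following Python sketch walks through the same scenario with os.symlink, the library counterpart of ln -s. The names A.txt and B.txt mirror file A and file B in the text; the function name is my own. It uses os.lstat to inspect the link itself, since os.stat would follow the link to file B.

```python
import os
import tempfile


def soft_link_demo():
    """Create a symbolic link A -> B, then delete B and try to open A."""
    with tempfile.TemporaryDirectory() as d:
        file_b = os.path.join(d, "B.txt")
        file_a = os.path.join(d, "A.txt")
        with open(file_b, "w") as f:
            f.write("data in B\n")
        links_before = os.stat(file_b).st_nlink

        os.symlink(file_b, file_a)  # same effect as: ln -s B.txt A.txt
        with open(file_a) as f:
            read_via_a = f.read()   # opening A actually reads B
        result = {
            "different_inodes": os.lstat(file_a).st_ino != os.stat(file_b).st_ino,
            "a_stores_path_of_b": os.readlink(file_a) == file_b,
            "read_via_a": read_via_a,
            "b_link_count_unchanged": os.stat(file_b).st_nlink == links_before,
        }

        os.remove(file_b)           # A is now a dangling link
        try:
            open(file_a).close()
            result["error_after_delete"] = None
        except FileNotFoundError as exc:
            result["error_after_delete"] = exc.strerror
        return result


if __name__ == "__main__":
    for key, value in soft_link_demo().items():
        print(f"{key}: {value!r}")
```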
VIII. Special uses of inodes
Because inode numbers and file names are separated, this mechanism leads to some Unix/Linux system-specific phenomena.
1. Sometimes a file name contains special characters and the file cannot be deleted in the normal way. In that case, deleting the file directly by its inode does the job.
2. Moving or renaming a file changes only the file name and does not affect the inode number.
3. Once a file has been opened, the system identifies it by its inode number and no longer considers the file name. Therefore, in general, the system cannot determine the file name from the inode number.
The third point makes software updates easy: software can be updated without being shut down and without a restart, because the system identifies a running file by its inode number rather than by its file name. During an update, the new version is written under the same file name with a new inode, which does not affect the file that is running. The next time the software runs, the file name automatically points to the new version, and the inode of the old version is reclaimed.
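Points 2 and 3 can be demonstrated with the Python sketch below. It is a simplified model rather than a real updater: an open file handle stands in for the running program, and app.bin is a hypothetical file name. os.replace swaps in the new file under the old name in one step.

```python
import os
import tempfile


def rename_keeps_inode():
    """Point 2: renaming changes the name only, not the inode number."""
    with tempfile.TemporaryDirectory() as d:
        old_name = os.path.join(d, "old.txt")
        new_name = os.path.join(d, "new.txt")
        with open(old_name, "w") as f:
            f.write("content")
        inode_before = os.stat(old_name).st_ino
        os.rename(old_name, new_name)
        return inode_before == os.stat(new_name).st_ino


def update_while_open():
    """Point 3: an open file keeps its old inode after the name is replaced."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "app.bin")        # hypothetical program file
        staged = os.path.join(d, "app.bin.new")  # new version, written separately
        with open(path, "w") as f:
            f.write("version 1")

        running = open(path)                     # stands in for the running program
        old_inode = os.fstat(running.fileno()).st_ino

        with open(staged, "w") as f:
            f.write("version 2")
        os.replace(staged, path)                 # the name now points to a new inode
        new_inode = os.stat(path).st_ino

        still_running = running.read()           # old inode is still readable
        running.close()                          # last reference gone: old inode reclaimed
        with open(path) as f:
            next_run = f.read()                  # the next run sees the new version
        return old_inode != new_inode, still_running, next_run


if __name__ == "__main__":
    print(rename_keeps_inode())
    print(update_while_open())
```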