A Multi-Angle Analysis of Why Linux Hard Links Cannot Point to Directories

Source: Internet
Author: User
Tags: parent directory, file permissions

Translator's note: I have recently been reading about file systems, and whenever a book or blog covers inodes it also brings up hard links and soft links. So today I translated a few English articles specifically about hard links in order to understand them properly.

First, hard links

This section is translated from: Http://

In a traditional UNIX file system, a directory is a file that contains an association list. Each entry in the directory file pairs a filename string with a unique file identifier, the inode number. An inode number is essentially an on-disk pointer through which the file object can be located efficiently. No two on-disk objects share an inode number, and no on-disk object has two inode numbers.

"Hard link" is essentially a synonym for "directory entry." When an object is first created, a directory entry is created for it, and that entry is already a hard link. Most people associate "hard link" with "creating an additional directory entry for an existing object," but the original directory entry is not special in any way. All links are equal, so in a sense there is no way to tell which one was the original.

Directories can also contain directories, and this too is done through hard links. When a subdirectory is created, a directory entry is created in its parent directory that associates the subdirectory's name with the newly created inode. In addition, two entries are created automatically in the new directory file: "." and "..", which refer to the directory itself and to its parent directory. So creating a subdirectory creates one new hard link to the parent directory (the child's "..") and two hard links to the newly created object: one from its parent directory and one from itself ("."). This means the hard link count of a directory is at least 2.
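The points above can be checked from user space. The following is only a minimal sketch: the helper name inode_id and the file names are made up for illustration, and the link count it prints depends on the file system (traditional ones report 2 for a fresh directory, but some file systems count directory links differently).

```python
import os
import tempfile

def inode_id(path):
    """Identify the object a path resolves to by (device, inode number)."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)

with tempfile.TemporaryDirectory() as parent:
    child = os.path.join(parent, "abc")
    os.mkdir(child)
    open(os.path.join(parent, "file.txt"), "w").close()

    # A directory is a list of (filename, inode number) pairs.
    with os.scandir(parent) as it:
        entries = {entry.name: entry.inode() for entry in it}

    # "." inside the child is the child itself; ".." is the parent.
    dot_is_child = inode_id(os.path.join(child, ".")) == inode_id(child)
    dotdot_is_parent = inode_id(os.path.join(child, "..")) == inode_id(parent)
    child_link_count = os.stat(child).st_nlink

for name, ino in sorted(entries.items()):
    print(ino, name)
print(dot_is_child, dotdot_is_parent, child_link_count)
```

Note that os.scandir does not return the "." and ".." entries, which is why they are checked separately through os.stat.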

long@zhouyl:~/test$ mkdir abc
long@zhouyl:~/test$ ls -l
total
drwxr-xr-x 2 long long 4096 Apr 09:02 abc
           -- number of hard links

Hard links to directories are a special case. First, the only way to create them is to create a directory: the operating system's hard link function does not allow a hard link operation whose target is a directory inode. The reason is that loops could otherwise arise in the file system's directory structure. Depending on the kernel, it may be left to the file system module itself to decide whether directory hard links are allowed.
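A quick way to see the restriction is to ask the operating system for a hard link to a directory. This is a small sketch using Python's os.link; the directory names are made up for illustration. On Linux, the link(2) manual page documents EPERM when the source path is a directory; other systems may report a different error.

```python
import errno
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "dir")
    os.mkdir(target)
    try:
        # Equivalent in spirit to: ln dir dir_link
        os.link(target, os.path.join(tmp, "dir_link"))
        outcome = "created"
    except OSError as exc:
        # Translate the errno value into its symbolic name, e.g. "EPERM".
        outcome = errno.errorcode.get(exc.errno, str(exc.errno))

print(outcome)
```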

In a traditional UNIX file system, cycles are bad for the following two reasons:
First, storage reclamation is based on reference counting, which cannot handle circular references. The only back references are "." and "..", and they are treated as special cases.
Second, back references in the tree structure can lead to nasty multithreading problems. In traditional kernel designs (such as the BSD kernel), inodes that are in use are represented in memory by vnode structures. These vnodes are accessed concurrently and contain locks. Some operations hold the lock on a directory while accessing its subdirectories, and this can lead to deadlock. These lock operations generally cannot be interrupted by a signal, so a deadlocked process stays deadlocked until the machine is rebooted.
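The first reason can be illustrated with a toy model. This is not how any real file system is implemented: the Inode class, link and unlink are hypothetical names, "." and ".." are left out, and unlink reclaims recursively (which real file systems do not do for directories) purely to keep the sketch short. The point is only the counting.

```python
class Inode:
    """Toy in-memory inode: a link count plus, for directories, named entries."""
    def __init__(self, label):
        self.label = label
        self.nlink = 0
        self.entries = {}          # filename -> Inode

def link(directory, filename, target):
    directory.entries[filename] = target
    target.nlink += 1

def unlink(directory, filename, freed):
    target = directory.entries.pop(filename)
    target.nlink -= 1
    if target.nlink == 0:          # last link gone: reclaim the object
        for child in list(target.entries):
            unlink(target, child, freed)
        freed.append(target.label)

# Acyclic tree: root/c/d. Removing root/c reclaims both c and d.
root, c, d = Inode("root"), Inode("c"), Inode("d")
link(root, "c", c)
link(c, "d", d)
freed_tree = []
unlink(root, "c", freed_tree)

# Cycle: root/a/b plus a back link b/back -> a.
root, a, b = Inode("root"), Inode("a"), Inode("b")
link(root, "a", a)
link(a, "b", b)
link(b, "back", a)
freed_cycle = []
unlink(root, "a", freed_cycle)

print(freed_tree, freed_cycle, a.nlink, b.nlink)
```

In the cyclic case nothing is reclaimed: a and b are no longer reachable from root, yet each still has a link count of 1 because they keep each other alive.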

BSD has a special way of accessing ".." that avoids this kind of deadlock. Basically, the lock on the original directory vnode is released, the lock on ".." is acquired, and then the original directory is locked again. This amounts to a race. (Translator's note: this paragraph and the next are not translated well, but they are not essential to understanding hard links, and the text above already explains enough. I only want to be as complete as possible.)

I once implemented a cycle detection algorithm for vnode locks, in an attempt to support cyclic hard links in a file system for a version of BSD. The problem was that although the algorithm itself worked well, it was hard to get the rest of the kernel to cooperate. Many places in the kernel, such as the file system driver layer, simply assume that a lock will succeed immediately or eventually, so they have no way to handle an EDEADLK error. It is also not clear what to do even if you are told that a deadlock would occur. Do you abort the whole system call? What kind of retry do you use? How should an application respond when arbitrary file system operations can fail because of a possible deadlock?

Second, why hard links cannot point to directories

This section is translated from: Http://

The first section already explains concepts such as hard links and inodes well, but to keep the original text intact, some of the explanation below is repeated.

2.1 From the perspective of the inode

Allowing hard links to directories would break the directed acyclic graph structure of the file system and could create directory loops, which would cause errors in fsck and in any other software that walks the file tree. To understand this, you must first understand inodes. Data in the file system is stored in disk blocks, and those blocks are gathered together by an inode. You can think of the inode as the file itself, but an inode has no filename, which is where links come in. A link is just a pointer to an inode. A directory is an inode that holds such links, and each filename in the directory is a link to an inode. Opening a file in a Unix system also creates a link, but of a different type (it is not a named link).
A hard link is just an extra directory entry that points to an inode. When you list a file with ls -l, the number after the file permissions is the number of named links. Most files have only one link. Creating a new hard link to a file makes two filenames point to the same inode.

long@zhouyl:~/test$ touch test
long@zhouyl:~/test$ ls -l
total
-rw-r--r-- 1 long long 0 Apr 16:56 test
long@zhouyl:~/test$ ln test test1
long@zhouyl:~/test$ ls -l
total
-rw-r--r-- 2 long long 0 Apr 16:56 test
-rw-r--r-- 2 long long 0 Apr 16:56 test1

Now you can clearly see that there is really no such thing as a "hard link" as a separate kind of object: a hard link is the same as an ordinary name. (This matches section one, which explains that a hard link is an entry in a directory file that pairs a filename with its inode.) In the example above, which of test and test1 is the original file and which is the hard link? You cannot really tell (ignoring timestamps), because both are named links to the same inode and therefore to the same content.

long@zhouyl:~/test$ ls -li
total
2114356 -rw-r--r-- 2 long long 0 Apr 16:56 test
2114356 -rw-r--r-- 2 long long 0 Apr 16:56 test1

With ls -li (the -i flag makes ls show each file's inode number in the first column), we can see that test and test1 have the same inode number. Now, if hard links to directories were allowed, different directory entries at different places in the file system could point to the same directory. In fact, a subdirectory could point back to its own parent and create a loop.
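The same observation can be made programmatically. The sketch below repeats the test/test1 experiment in a temporary directory with os.link and os.stat, and then removes the "original" name to show that neither name is privileged.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    first = os.path.join(tmp, "test")
    second = os.path.join(tmp, "test1")
    with open(first, "w") as f:
        f.write("hello")

    os.link(first, second)                 # like: ln test test1
    st1, st2 = os.stat(first), os.stat(second)
    same_inode = (st1.st_dev, st1.st_ino) == (st2.st_dev, st2.st_ino)
    link_count = st1.st_nlink

    os.unlink(first)                       # remove the "original" name
    with open(second) as f:
        survived = f.read()                # the data is still reachable
    count_after = os.stat(second).st_nlink

print(same_inode, link_count, survived, count_after)
```

The content stays available through test1 after test is removed, and the link count drops back by one.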

Why does such a loop matter? Because when you traverse the directory tree, you have no way to detect a loop unless you keep track of the inode numbers you have already visited. Take the du command: du has to walk every subdirectory to measure disk usage. How would du know that it has run into a loop? Without that kind of tracking, it is easy to fall into one.
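Here is a rough sketch of what "tracking inode numbers" means for a tree walker. Since a hard-link loop between directories cannot be created on a normal Linux system, the example simulates one with a symbolic link that the walker deliberately follows. The function name walk_detecting_loops is made up for illustration, and this is not how du itself is implemented.

```python
import os
import tempfile

def walk_detecting_loops(root):
    """Depth-first walk that follows links and reports directories seen twice."""
    seen = set()                    # (device, inode) of directories visited
    visited, loops = [], []
    stack = [root]
    while stack:
        path = stack.pop()
        st = os.stat(path)          # follows symbolic links on purpose
        key = (st.st_dev, st.st_ino)
        if key in seen:
            loops.append(path)      # already visited under another name
            continue
        seen.add(key)
        visited.append(path)
        with os.scandir(path) as it:
            for entry in it:
                if entry.is_dir(follow_symlinks=True):
                    stack.append(entry.path)
    return visited, loops

with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "a", "b"))
    # a/b/back points back up to a, closing a loop.
    os.symlink(os.path.join(tmp, "a"), os.path.join(tmp, "a", "b", "back"))
    visited, loops = walk_detecting_loops(tmp)

print(len(visited), [os.path.basename(p) for p in loops])
```

Without the seen set, the walk would keep descending through a/b/back/b/back/... until the path became too long. The check flags any directory reached a second time, whether through a true loop or simply through a second name.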

Soft links, also called symbolic links, are a completely different thing, because they are a special type of file. (Translator's addition: the file types in a UNIX file system are regular files, directories, block special files, character special files, FIFOs, sockets and symbolic links. For example, after a soft link is created with "ln -s a b", files a and b have different inode numbers, which means a and b are not the same file. File b stores the path of file a, and when b is read, the system recognizes that b is a symbolic link and automatically redirects to the file a that it refers to.) Note that a symbolic link can point to a target that does not exist, because it points only to a name rather than directly to an inode. This differs from hard links, because a hard link always refers to an existing file.
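The following sketch repeats the "ln -s a b" example with Python's os module, under the assumption that the platform allows creating symbolic links. It shows that the link has its own inode, stores only a path, and keeps existing after its target is gone.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    a = os.path.join(tmp, "a")
    b = os.path.join(tmp, "b")
    open(a, "w").close()

    os.symlink(a, b)                        # like: ln -s a b
    # lstat examines the link itself instead of following it.
    different_inode = os.lstat(b).st_ino != os.stat(a).st_ino
    stored_path = os.readlink(b)            # the path recorded inside b

    os.unlink(a)                            # the target disappears
    link_still_there = os.path.lexists(b)   # the link itself remains
    target_resolves = os.path.exists(b)     # following it now fails

print(different_inode, stored_path == a, link_still_there, target_resolves)
```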

So why can du handle symbolic links easily but not hard links? As discussed above, a hard link to a directory would be indistinguishable from a normal directory, whereas a soft link is special: it can be detected and skipped. When du notices that a directory entry is a symbolic link, it does not descend into it at all.

long@zhouyl:~/test$ ln -s /home/long/videos test2
long@zhouyl:~/test$ ls -l
total
-rw-r--r-- 2 long long 0 Apr 16:56 test
-rw-r--r-- 2 long long 0 Apr 16:56 test1
lrwxrwxrwx 1 long long Apr 16 17:31 test2 -> /home/long/videos
long@zhouyl:~/test$ du -ah
0	./test2
4.0K	.

2.2 From the perspective of mount points

From the perspective of mount points, every directory has one and only one parent directory, "..".

One way to implement pwd is to compare the device and inode of "." and "..": if they are the same, you are already at "/". Otherwise, look up the name of the current directory in its parent and push it onto a stack, then compare "../." with "../..", then "../../." with "../../..", and so on. Once you reach "/", pop the stack and print the saved directory names to obtain the full path of the current directory. This algorithm relies on every directory having one and only one parent directory.
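The procedure described above can be sketched roughly as follows. This is an illustration only, not the implementation used by any real pwd: the function name pwd_by_walking_up is made up, and the code assumes that every ancestor directory is readable.

```python
import os
import tempfile

def pwd_by_walking_up(start="."):
    """Rebuild the absolute path of `start` by climbing ".." up to "/"."""
    names = []                      # stack of directory names, deepest first
    cur = start
    while True:
        here = os.stat(cur)
        parent_path = os.path.join(cur, "..")
        parent = os.stat(parent_path)
        if (here.st_dev, here.st_ino) == (parent.st_dev, parent.st_ino):
            break                   # "." and ".." are the same object: this is "/"
        # Find the entry in the parent that refers to the current directory.
        for name in os.listdir(parent_path):
            try:
                st = os.lstat(os.path.join(parent_path, name))
            except OSError:
                continue            # entry vanished or is not accessible
            if (st.st_dev, st.st_ino) == (here.st_dev, here.st_ino):
                names.append(name)
                break
        else:
            raise FileNotFoundError("directory has no entry in its parent")
        cur = parent_path
    return "/" + "/".join(reversed(names))

with tempfile.TemporaryDirectory() as tmp:
    nested = os.path.join(tmp, "x", "y")
    os.makedirs(nested)
    rebuilt = pwd_by_walking_up(nested)
    rebuilt_matches = os.path.samefile(rebuilt, nested)

print(rebuilt, rebuilt_matches)
```

The loop takes the first matching entry it finds in each parent. If a directory could have several parents, or several names in one parent, there would be no single right answer for it to return.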

If hard links to directories were allowed, which of the multiple parent directories would ".." point to? This is a compelling reason why hard links to directories are not allowed. Soft links to directories do not cause this problem: a program that cares can detect a symbolic link by calling lstat() on the pathname, and the pwd algorithm still returns the true path of the target directory.
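A brief sketch of that last point, with made-up directory names: lstat reveals that the alias is a symbolic link, while ".." reached through the alias still leads to the one real parent of the target directory.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    real_parent = os.path.join(tmp, "real_parent")
    target = os.path.join(real_parent, "target")
    os.makedirs(target)
    alias = os.path.join(tmp, "alias")
    os.symlink(target, alias)               # soft link to a directory

    # lstat looks at the link itself; stat follows it to the directory.
    is_link = stat.S_ISLNK(os.lstat(alias).st_mode)
    is_dir_when_followed = stat.S_ISDIR(os.stat(alias).st_mode)

    # ".." is resolved from the real directory, not from where the link lives.
    dotdot_is_real_parent = os.path.samefile(
        os.path.join(alias, ".."), real_parent
    )

print(is_link, is_dir_when_followed, dotdot_is_real_parent)
```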

Third, summary

Historically, some UNIX file systems did allow hard links to directories. But this can create loops in the file system tree, which makes traversing the file system confusing. (In Advanced Programming in the UNIX Environment, the author, Stevens, tried this on his own system and found that the file system became riddled with errors after he created a hard link to a directory.) A directory could even become its own parent. The illustration in the original shows such a case: a hard link named testdir that points to foo itself is created inside directory foo, and a loop results. A lot of bad things would follow!

Modern file systems generally forbid these confusing states, with the root directory as the only exception: the root directory is its own parent, so "ls /.." lists the contents of the root directory. Of course, we can use "mount -o bind /dir1 /dir2" to mount dir1 on dir2, which has much the same effect as a hard link to a directory, except that the command requires both dir1 and dir2 to exist already.

It is also said that the essential difference between hard links and soft links is that soft links can be detected by the system and hard links cannot. So it is safe to create soft links to directories, but not hard links. Comments on this article are welcome.


Note: This translation is fairly long. If anything in it is incorrect, please point it out. Also, please credit the source when reposting.
