Linux linked files
Several Basic Concepts
Linux link files come in two kinds: hard links and soft links. To understand them, you first need a few basic concepts.
Besides its data, a file also carries management information about that data, such as the access permissions, the owner, and the disk blocks that hold the data. This management information is called metadata and is stored in the file's inode. We can use the stat command to view the inode information of a file:
$ stat /etc/passwd
  File: "/etc/passwd"
  Size: 936        Blocks: 8          IO Block: 4096   regular file
Device: fd00h/64768d    Inode: 137143      Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 23:01:39.905999995 +0800
Modify: 16:36:12.802999997 +0800
Change: 16:36:12.809000014 +0800
$ ls -l /etc/passwd
-rw-r--r-- 1 root root 936 Jul 15 16:36 /etc/passwd
Here we looked at the metadata of /etc/passwd. The ls -l command also lists part of a file's metadata (from left to right: permissions, number of hard links, owner, group, file size, last modification time, file name), but the output of stat is more complete. In the stat output the file has three timestamps: Access, Modify, and Change.
Access time is the easiest to understand: it is updated when the file's data (not its metadata) is read, for example when cat or more reads the file content. Commands such as ls or stat only read the inode, so they do not update the access time. Note that mount options such as relatime or noatime can reduce or suppress access time updates. Modify time is the last time the file's data was modified, for example when you edit the file with vim and save it. Change time is the last time the file's metadata (inode) was modified, for example when chown changes the file's owner.
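The difference between Modify and Change time can be observed directly. The following is a minimal sketch, assuming GNU coreutils stat (the -c format option with %Y for mtime and %Z for ctime); the scratch file name is made up for illustration, and chmod stands in for chown because it needs no special privileges.

```shell
#!/bin/sh
# Sketch: a metadata-only change moves ctime but leaves mtime alone.
# Assumes GNU stat: %Y = mtime, %Z = ctime, both in seconds since the epoch.
work=$(mktemp -d)
file="$work/demo.txt"          # hypothetical scratch file
echo "an example." > "$file"

mtime_before=$(stat -c %Y "$file")
ctime_before=$(stat -c %Z "$file")

sleep 1                        # let the clock advance by at least one second
chmod 600 "$file"              # touches only the inode, not the data

mtime_after=$(stat -c %Y "$file")
ctime_after=$(stat -c %Z "$file")

echo "mtime: $mtime_before -> $mtime_after"
echo "ctime: $ctime_before -> $ctime_after"
rm -rf "$work"
```

Access time is deliberately left out of the sketch, since whether it changes on a read depends on how the file system is mounted.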
When we create a partition and build a file system on it with mkfs.ext4, a fixed region of the file system is reserved for inodes. We can use the df -i command to view the number of inodes in a file system and how many are in use:
# df -ih /dev/mapper/pdc_bcfaffjfaj2
Filesystem                   Inodes IUsed IFree IUse% Mounted on
/dev/mapper/pdc_bcfaffjfaj2     18M  127K   18M    1% /home
As you can see, on this Linux Mint 17.3 system the partition /dev/mapper/pdc_bcfaffjfaj2 has about 18M inodes in total, of which 127K are currently in use. Can a partition still have free space while its inodes are used up? Yes. This happens when there are too many small files. In that case we cannot create new files in the file system even though it still has free space. What if your workload really does consist of huge numbers of small files? When creating an ext4 file system you can specify the inode density manually; see man mkfs.ext4 for the relevant options (for example -i bytes-per-inode and -N number-of-inodes).
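To keep an eye on inode usage from a script, the df output can be reduced to the three numbers that matter. This is only a sketch: the default path "/" is an arbitrary choice for illustration, and some file systems report 0 or "-" in these columns because they do not preallocate inodes.

```shell
#!/bin/sh
# Sketch: report inode totals for the file system that holds a given path.
# -P forces one output line per file system; -i switches df from blocks to inodes.
path=${1:-/}                    # path to inspect; "/" is only an illustrative default
inode_line=$(df -Pi "$path" | awk 'NR == 2 { print "total=" $2, "used=" $3, "free=" $4 }')
echo "$path: $inode_line"
```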
When we viewed the inode information with stat just now, the output contained Inode: 137143. This is the inode number of /etc/passwd. Every inode has a number that is unique within its file system. The kernel identifies files by inode number, not by file name; the file name exists for the user's convenience. The kernel looks up the inode from the file name and then reaches the actual file data through the inode. Can several file names correspond to the same inode? Yes, and that is exactly what a hard link is.
Although each file has a unique inode number, the number itself is arbitrary and meaningless to people. We would like every file to have a meaningful name, and access by name is a basic function of modern file systems. So something has to map file names to inode numbers, which leads to the concept of the directory entry (dentry). A Linux file system has a special type of file called a directory. A directory stores the mapping between the names of the files in it and their inode numbers, and each such mapping is called a dentry. All files and directories in Linux form an inverted tree, so once we know the inode number of the root directory we can reach the entire file system by name.
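One way to see that the name lives in the directory and not in the inode is to rename a file and check that its inode number stays the same. A minimal sketch, assuming GNU stat (%i prints the inode number); the file names are invented for the example.

```shell
#!/bin/sh
# Sketch: renaming rewrites a directory entry; the inode is untouched.
work=$(mktemp -d)
cd "$work" || exit 1
echo "an example." > oldname
inode_before=$(stat -c %i oldname)
mv oldname newname              # same directory, so only the entry changes
inode_after=$(stat -c %i newname)
ls -i                           # prints the "inode-number name" pairs of this directory
echo "before=$inode_before after=$inode_after"
cd / && rm -rf "$work"
```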
A hard link is, in essence, another entry for an existing file in the directory tree. The hard link and the original name are separate dentries, in the same directory or in different ones, that point to the same inode. They therefore refer to the same disk data blocks and have the same permissions and attributes. In short, a hard link is an alias for an existing file. If the file system were a book, a hard link would be two entries in the table of contents that point to the same chapter on the same page.
The advantage of a hard link is that it takes almost no disk space (it only adds a directory entry). Compared with soft links, however, this advantage is small, because a soft link also takes very little space. Hard links also have the following limitations:
1. Hard links cannot cross file systems. The reason is simple: inode numbers are unique only within one file system, so the same number may refer to a different file on another file system.
2. You cannot create hard links to directories. The reason is explained later.
Because of these limitations, and because soft links are easier to manage, soft links are used far more often. Almost all the examples in this article are soft links.
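Because hard links are just equal names for one inode, a practical question is how to find all the names of a given file. The sketch below uses the -samefile test of GNU find for that; the directory and file names are hypothetical, and the search only covers the directory tree it is given.

```shell
#!/bin/sh
# Sketch: list every name under a directory tree that refers to the same inode.
work=$(mktemp -d)
mkdir "$work/a" "$work/b"
echo "an example." > "$work/a/report.txt"      # hypothetical file names
ln "$work/a/report.txt" "$work/b/alias.txt"    # second directory entry, same inode

names=$(find "$work" -samefile "$work/a/report.txt" | sort)
name_count=$(printf '%s\n' "$names" | wc -l)
link_count=$(stat -c %h "$work/a/report.txt")  # %h = hard link count (GNU stat)

printf '%s\n' "$names"
echo "names found: $name_count, link count: $link_count"
rm -rf "$work"
```

If the number of names found is smaller than the link count, the remaining names live outside the searched tree (but necessarily on the same file system).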
A soft link is also called a symbolic link, or symlink for short. Unlike a hard link, which is only a directory entry, a soft link is a file in its own right whose content is the path name of another file. When Linux accesses a soft link, it follows that path to find the target file holding the actual data. In the book analogy, a soft link is a chapter with no content of its own, only a single line: "see chapter XX on page XX".
A soft link can cross file systems and point to a file on another partition, to a file on a remote host (for example through a network file system), or to a directory. In the output of ls -l, an l as the first character of the first field indicates a symbolic link.
$ ls -l
total 0
lrwxrwxrwx 1 wjm wjm 11 Aug 10 00:51 hh -> /etc/passwd
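The size of 11 bytes shown for the link is the length of the path it stores. The sketch below, which assumes GNU stat and readlink, recreates the link in a scratch directory and compares the two; on common Linux file systems the link size equals the length of the target path.

```shell
#!/bin/sh
# Sketch: a symlink's content is the target path, and its size is that path's length.
work=$(mktemp -d)
target=/etc/passwd
ln -s "$target" "$work/hh"
stored=$(readlink "$work/hh")        # the text kept inside the symlink
size=$(stat -c %s "$work/hh")        # stat reports on the link itself by default
echo "stored path: $stored"
echo "link size: $size bytes, path length: ${#target}"
rm -rf "$work"
```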
We can see that the permissions of a soft link are 777, that is, everything is allowed, and you cannot change them with chmod. The permissions of the target file are what actually protect the data.
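What happens when chmod is pointed at a soft link can be checked with a short experiment. This sketch assumes Linux with GNU coreutils, where chmod applied to a link named on the command line changes the file the link points to; the file names are placeholders.

```shell
#!/bin/sh
# Sketch: chmod on a symlink changes the target, never the link.
work=$(mktemp -d)
cd "$work" || exit 1
echo "an example." > target.txt
chmod 644 target.txt
ln -s target.txt link
chmod 600 link                           # follows the link and changes target.txt
target_mode=$(stat -c %a target.txt)
link_mode=$(stat -c %a link)             # stat does not follow the link by default
echo "target mode: $target_mode, link mode: $link_mode"
cd / && rm -rf "$work"
```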
A symbolic link can also point to a file that does not exist: the original target may have been deleted, the file system holding it may not be mounted, or the link may have been created pointing at a nonexistent file in the first place. Such a link is called broken. A hard link, by contrast, can never point to a nonexistent file.
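Broken links can be detected with two standard shell tests: -L is true when the name itself is a symbolic link, and -e is true only when following the name reaches something that exists. A minimal sketch with made-up file names:

```shell
#!/bin/sh
# Sketch: find the broken symlinks in a directory.
work=$(mktemp -d)
cd "$work" || exit 1
echo "an example." > real.txt
ln -s real.txt good
ln -s missing.txt dangling               # the target never existed
broken=""
for name in *; do
    # a broken link is a symlink whose target cannot be reached
    if [ -L "$name" ] && [ ! -e "$name" ]; then
        broken="$broken $name"
    fi
done
echo "broken links:$broken"
cd / && rm -rf "$work"
```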
What are the benefits of using links?
Here we will summarize the following benefits of using a linked file:
- Maintain software compatibility
For example, in RHEL6, let's look at the output of the following command:
$ ls -l /bin/sh
lrwxrwxrwx. 1 root root 4 Jul 15 11:41 /bin/sh -> bash
We can see that /bin/sh is actually a symbolic link pointing to /bin/bash. Why is it designed this way? Because a great many shell scripts start with this line:
#!/bin/sh
The "#!" line specifies the interpreter for the script. #!/bin/sh selects the Bourne shell, an early shell. Modern Linux distributions usually ship the Bourne Again shell, bash, which improves on and extends sh, and the original Bourne shell is no longer present on the system. To keep such scripts running without modifying them, we only need a soft link /bin/sh that points to /bin/bash. Scripts written for the Bourne shell are then interpreted by bash.
As another example, suppose we have installed a large software package such as Matlab. It may be installed under /usr/opt/Matlab by default, with its executable in /usr/opt/Matlab/bin. Unless you add that directory to the PATH environment variable, you have to type a long path every time you run the program. Alternatively, you can do this:
$ ln -s /usr/opt/Matlab/bin/matlab ~/bin/matlab
This creates a symbolic link in your ~/bin directory (many distributions include ~/bin in PATH by default). From now on you no longer need to type the full path on the command line; typing matlab is enough.
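The same idea can be tried without installing anything. In the sketch below a tiny script named bigtool (a hypothetical stand-in for the Matlab executable) sits in a deep directory, and a scratch bin directory plays the role of ~/bin. It assumes the temporary directory allows executing files.

```shell
#!/bin/sh
# Sketch: run a program by name through a symlink placed in a PATH directory.
work=$(mktemp -d)
mkdir -p "$work/opt/bigtool/bin" "$work/bin"    # stand-ins for /usr/opt/Matlab/bin and ~/bin
printf '#!/bin/sh\necho "bigtool running"\n' > "$work/opt/bigtool/bin/bigtool"
chmod +x "$work/opt/bigtool/bin/bigtool"
ln -s "$work/opt/bigtool/bin/bigtool" "$work/bin/bigtool"
PATH="$work/bin:$PATH"
tool_output=$(bigtool)                          # found through the symlink on PATH
echo "$tool_output"
rm -rf "$work"
```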
- Maintain old operation habits
For example, SuSE places the startup scripts in /etc/init.d, while RedHat places them in /etc/rc.d/init.d. So that an administrator moving from SuSE to RedHat can still find them, we can create a symbolic link /etc/init.d that points to /etc/rc.d/init.d. In fact, RedHat already does this:
$ ls -ld /etc/init.d
lrwxrwxrwx. 1 root root 11 Jul 15 11:41 /etc/init.d -> rc.d/init.d
- Convenient System Management
The most impressive example is the set of symbolic links in the /etc/rc.d/rcX.d directories (X is a digit from 0 to 6).
$ ls -l /etc/rc.d/
total 60
drwxr-xr-x. 2 root root  4096 Jul 15 16:36 init.d
-rwxr-xr-x. 1 root root  2617 Nov 23  2013 rc
drwxr-xr-x. 2 root root  4096 Jul 15 16:36 rc0.d
drwxr-xr-x. 2 root root  4096 Jul 15 16:36 rc1.d
drwxr-xr-x. 2 root root  4096 Jul 15 16:36 rc2.d
drwxr-xr-x. 2 root root  4096 Jul 15 16:36 rc3.d
drwxr-xr-x. 2 root root  4096 Jul 15 16:36 rc4.d
drwxr-xr-x. 2 root root  4096 Jul 15 16:36 rc5.d
drwxr-xr-x. 2 root root  4096 Jul 15 16:36 rc6.d
-rwxr-xr-x. 1 root root   220 Nov 23  2013 rc.local
-rwxr-xr-x. 1 root root 19688 Nov 23  2013 rc.sysinit
The init.d/ directory contains scripts that start and stop system services such as sshd and crond. Each script accepts an argument that tells it to start or stop the service. To determine which scripts run at a given runlevel and which argument each receives, RedHat adds a second layer of directories: the seven directories rc0.d to rc6.d, one per runlevel. To start or stop a service at a certain runlevel, create a symbolic link in the corresponding rcX.d directory that points to the script in init.d/. For example:
$ ls -l /etc/rc.d/rc3.d
total 0
lrwxrwxrwx. 1 root root 19 Jul 15 11:42 K10saslauthd -> ../init.d/saslauthd
lrwxrwxrwx. 1 root root 20 Jul 15 11:42 K50netconsole -> ../init.d/netconsole
lrwxrwxrwx. 1 root root 21 Jul 15 11:42 K87restorecond -> ../init.d/restorecond
lrwxrwxrwx. 1 root root 15 Jul 15 11:42 K89rdisc -> ../init.d/rdisc
lrwxrwxrwx. 1 root root 22 Jul 15 11:44 S02lvm2-monitor -> ../init.d/lvm2-monitor
lrwxrwxrwx. 1 root root 19 Jul 15 11:42 S08ip6tables -> ../init.d/ip6tables
lrwxrwxrwx. 1 root root 18 Jul 15 11:42 S08iptables -> ../init.d/iptables
lrwxrwxrwx. 1 root root 17 Jul 15 11:42 S10network -> ../init.d/network
lrwxrwxrwx. 1 root root 16 Jul 15 11:44 S11auditd -> ../init.d/auditd
lrwxrwxrwx. 1 root root 17 Jul 15 11:42 S12rsyslog -> ../init.d/rsyslog
...
This lists the service scripts to run at runlevel 3 together with their arguments. The first letter of each link name, S or K, means the script is called with start or stop respectively, and the two digits that follow give the order in which the scripts run. So by adding or removing links in an rcX.d directory you control which service scripts run at each runlevel. And if a service script needs changing, you edit the one file under init.d/ and the change takes effect through every soft link in the rcX.d directories. It is a concise and clever design.
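The S/K naming scheme is easy to act on from a shell loop. The following is a simplified sketch of the idea, not the actual RedHat rc script: it builds a scratch init.d and rc3.d with two stand-in services that merely log the argument they receive, then walks the K links and the S links in sorted order.

```shell
#!/bin/sh
# Sketch: how an rc-style runner can drive services through S*/K* symlinks.
work=$(mktemp -d)
mkdir "$work/init.d" "$work/rc3.d"
log="$work/log"

# Two stand-in service scripts; each records the argument it was called with.
for svc in network rdisc; do
    printf '#!/bin/sh\necho "%s $1" >> "%s"\n' "$svc" "$log" > "$work/init.d/$svc"
done
ln -s ../init.d/rdisc   "$work/rc3.d/K89rdisc"
ln -s ../init.d/network "$work/rc3.d/S10network"

# Stop the K* links first, then start the S* links.
# The shell expands each glob in sorted order, which is what the two digits rely on.
for link in "$work"/rc3.d/K*; do
    [ -e "$link" ] && sh "$link" stop
done
for link in "$work"/rc3.d/S*; do
    [ -e "$link" ] && sh "$link" start
done

rc_log=$(cat "$log")
echo "$rc_log"
rm -rf "$work"
```

Disabling a service at this runlevel then amounts to removing or renaming one link, while the script in init.d stays untouched.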
The ln command
We use the ln command to create hard or soft links. Its two most common forms are:
ln [OPTION]... TARGET LINK_NAME
ln [OPTION]... TARGET... DIRECTORY
The first form creates a new link named LINK_NAME that points to TARGET. The only option you really need to remember is -s, which creates a soft link; without it, a hard link is created. For example:
# ln -s /usr/src/linux-2.6.32 /usr/src/linux
Here we create a symbolic link /usr/src/linux that points to the real Linux source directory /usr/src/linux-2.6.32.
Let's take another example to demonstrate the difference between soft link and hard link. We create a myfile file and then create a soft link myslink and hard link myhlink pointing to the file:
$ echo "an example." > myfile
$ ln -s myfile myslink
$ ln myfile myhlink
Use stat to check these files:
$ stat my*
  File: `myfile'
  Size: 12         Blocks: 8          IO Block: 4096   regular file
Device: fd00h/64768d    Inode: 11552       Links: 2
Access: (0664/-rw-rw-r--)  Uid: (  500/     wjm)   Gid: (  500/     wjm)
Access: 2016-08-10 03:59:54.421017669 +0800
Modify: 2016-08-10 03:59:54.421017669 +0800
Change: 2016-08-10 04:00:08.689000105 +0800
  File: `myhlink'
  Size: 12         Blocks: 8          IO Block: 4096   regular file
Device: fd00h/64768d    Inode: 11552       Links: 2
Access: (0664/-rw-rw-r--)  Uid: (  500/     wjm)   Gid: (  500/     wjm)
Access: 2016-08-10 03:59:54.421017669 +0800
Modify: 2016-08-10 03:59:54.421017669 +0800
Change: 2016-08-10 04:00:08.689000105 +0800
  File: `myslink' -> `myfile'
  Size: 6          Blocks: 0          IO Block: 4096   symbolic link
Device: fd00h/64768d    Inode: 11553       Links: 1
Access: (0777/lrwxrwxrwx)  Uid: (  500/     wjm)   Gid: (  500/     wjm)
Access: 2016-08-10 04:00:03.784997923 +0800
Modify: 2016-08-10 04:00:03.784997923 +0800
Change: 2016-08-10 04:00:03.784997923 +0800
Look carefully at myfile and myhlink: they point to the same inode (both have inode number 11552). Their hard link count (the Links field) is 2 for both, which means two directory entries point to this inode; every additional hard link increases the Links value by 1. The inode number of myslink differs from the other two, and its permissions are 0777. Now let's delete the original myfile and see what happens. This time we use ls -li to look (ll is a common alias for ls -l):
$ rm myfile
$ ll -li
total 4
11552 -rw-rw-r-- 1 wjm wjm 12 Aug 10 03:59 myhlink
11553 lrwxrwxrwx 1 wjm wjm  6 Aug 10 04:00 myslink -> myfile
$ cat myhlink
an example.
$ cat myslink
cat: myslink: No such file or directory
The -i option of ls also prints each file's inode number, so the third column of the output is the hard link count. After myfile is deleted, the hard link count of myhlink drops from 2 to 1, but the data originally written through myfile is still reachable through myhlink, because a hard link reaches the data through the inode. The data can no longer be reached through myslink, however: a soft link stores only a path to its target (here the relative path myfile), and if any component of that path goes missing the link stops working.
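Since a soft link stores a path as text, a link created with a relative path is resolved relative to the directory the link lives in. The sketch below, with made-up names, moves one relative and one absolute link into another directory to show the practical consequence.

```shell
#!/bin/sh
# Sketch: a relative symlink breaks when moved; an absolute one keeps working.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir elsewhere
echo "an example." > myfile
ln -s myfile rel_link                    # stores the relative text "myfile"
ln -s "$work/myfile" abs_link            # stores an absolute path
mv rel_link abs_link elsewhere/          # moves the links themselves, not the target

# The relative link is now resolved against elsewhere/, where no myfile exists.
if [ -e elsewhere/rel_link ]; then rel_ok=yes; else rel_ok=no; fi
if [ -e elsewhere/abs_link ]; then abs_ok=yes; else abs_ok=no; fi
echo "relative link resolves: $rel_ok, absolute link resolves: $abs_ok"
cd / && rm -rf "$work"
```

The trade-off goes the other way when a whole directory tree is relocated: relative links inside the tree survive the move, absolute ones do not.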
Follow Link
Because soft links exist, any operation that backs up, copies, or moves a directory or file raises the question of whether to follow links. If the link is followed, the operation acts on the object the link points to; if not, it acts on the link itself.
Tools such as tar and cp normally offer an option that controls whether links are followed. With cp, -L means follow the link (copy the target the link points to) and -P means do not follow it (copy the link itself). For example:
$ mkdir dir1
$ ln -s /tmp/a.txt dir1/slink
$ cp -rL dir1 dir2
$ ls -l dir2
total 0
-rw-rw-r-- 1 wjm wjm 0 Aug  6 17:02 slink
Here we create a soft link in dir1 and copy the directory to dir2 with the -L option. The slink in dir2 is now a regular file. With the -P option (preserve links), the copied file is still a soft link:
$ cp -rP dir1 dir3
$ ls -l dir3
total 0
lrwxrwxrwx 1 wjm wjm 10 Aug  6 17:07 slink -> /tmp/a.txt
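If neither -L nor -P is given explicitly, the default behavior of cp depends on the implementation and version. With current GNU cp, for instance, links are followed in a plain copy but not in a recursive (-r) copy.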
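When in doubt about the default, a quick probe in a scratch directory settles it. The sketch below copies a link once as a single file and once as part of a recursive copy, without -L or -P, and reports what came out. The helper name kind and the file names are invented for the example, and the result reflects whatever cp is installed on the machine where it runs.

```shell
#!/bin/sh
# Sketch: probe what this system's cp does with symlinks by default.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir dir1
echo "an example." > a.txt
ln -s ../a.txt dir1/slink

cp dir1/slink plain_copy        # single file, neither -L nor -P
cp -r dir1 dir2                 # recursive copy, neither -L nor -P

# Report whether a copied name is still a link or has become a regular file.
kind() { if [ -L "$1" ]; then echo link; else echo file; fi; }
plain_kind=$(kind plain_copy)
recursive_kind=$(kind dir2/slink)
echo "cp link dest   -> $plain_kind"
echo "cp -r dir dest -> $recursive_kind"
cd / && rm -rf "$work"
```

In scripts that must behave the same everywhere, it is safer to pass -L or -P explicitly than to rely on the default.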
Directory hard link
As mentioned above, you cannot create a hard link to a directory. Directories do have hard links, however; they are created automatically by the system and cannot be created by hand. When we create an empty directory with mkdir, its hard link count is 2. For example:
$ ls -dl ~
drwx------. 6 wjm wjm 4096 Aug 10 04:25 /home/wjm
$ cd ~
$ mkdir mydir
$ ls -dli ~
8605 drwx------. 7 wjm wjm 4096 Aug 10 04:25 /home/wjm
$ ls -dli mydir
11556 drwxrwxr-x 2 wjm wjm 4096 Aug 10 04:25 mydir
The hard link count of /home/wjm was originally 6. After the empty directory mydir is created under it, the count becomes 7, and the count of the empty mydir itself is 2. Why? Because every directory contains two hidden hard links:
$ ls -ali mydir
total 8
11556 drwxrwxr-x  2 wjm wjm 4096 Aug 10 04:25 .
 8605 drwx------. 7 wjm wjm 4096 Aug 10 04:25 ..
We can see the two hidden hard links in mydir, listed with the -a option of ls. One is ".", which points to inode 11556, the inode of mydir itself. The other is "..", whose inode number shows that it points to the parent directory /home/wjm. So an empty directory has a hard link count of 2 (its entry in the parent plus its own "."), and creating it raises the parent's count by 1 (the new ".."). In general, the hard link count of a directory = the number of its subdirectories + 2.
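The "subdirectories + 2" rule can be checked by counting before and after creating children. This is a sketch assuming GNU stat (%h prints the link count) and a traditional file system such as ext4; some file systems, btrfs for one, do not maintain this count and always report 1 for directories. The directory names are hypothetical.

```shell
#!/bin/sh
# Sketch: a directory's link count grows by one per subdirectory, not per file.
work=$(mktemp -d)
mkdir "$work/parent"
links_empty=$(stat -c %h "$work/parent")       # its name in $work plus its own "."
mkdir "$work/parent/a" "$work/parent/b" "$work/parent/c"
links_full=$(stat -c %h "$work/parent")        # each child contributes one ".."
touch "$work/parent/plain_file"                # a regular file has no ".." entry
links_with_file=$(stat -c %h "$work/parent")
echo "empty: $links_empty, three subdirs: $links_full, plus a file: $links_with_file"
rm -rf "$work"
```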
This type of hard link is automatically created by the system. When you try to manually create a hard link pointing to a directory, the system will report an error to prevent you from doing this. Why?
In the early history of UNIX, creating hard links to directories was allowed. It turned out to cause many problems; in particular, commands that traverse the directory tree, such as fsck and find, could no longer work correctly. In "Advanced Programming in the UNIX Environment", Stevens mentions that he tried this on his own system, and the file system ended up damaged after he created a directory hard link. Directory hard links break the tree structure of the file system and can create loops between directories. For example:
$ ln ~ ~/mydir/myhdir_link
ln: `/home/wjm': hard link not allowed for directory
$ ln -s ~ ~/mydir/myhdir_link
Here the first command tries to create, inside mydir, a hard link to its parent directory, and fails. Had it succeeded, /home/wjm and /home/wjm/mydir would form a loop, and we could no longer tell which of the two is the parent and which is the child. The second command, however, successfully creates a soft link to the parent directory. Doesn't that form a loop too?
Why can a soft link point to a directory when a hard link cannot? The root cause is that a soft link is a file, while a hard link is a directory entry (dentry). In Linux every file (a directory is a file, and so is a soft link) has an inode, and the inode records the file type (directory, regular file, symbolic link, and so on). The operating system can therefore recognize a symbolic link when it traverses a directory, and once it can recognize one it can take measures against endless loops: the kernel stops after following a fixed number of symbolic links during one path lookup (8 nested links in older kernels, 40 links in total in current ones) and reports an error. This is why symbolic links to directories do not cause endless loops.
A hard link, in contrast, is simply a synonym for "directory entry". When an object is first created, a directory entry is created for it, and that entry is already a hard link. People tend to think of a hard link as an extra directory entry added to an existing object, but the original entry is not special: all hard links are equal, and the kernel cannot tell which one is the "original file" and which is the "hard link". A loop formed by directory hard links therefore cannot be detected and handled properly.
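The loop protection for symbolic links is easy to trigger on purpose. In this sketch two links with invented names point at each other; trying to read through one of them fails with an error instead of hanging, because the kernel gives up on the path lookup.

```shell
#!/bin/sh
# Sketch: a cycle of symlinks is refused by the kernel rather than followed forever.
work=$(mktemp -d)
cd "$work" || exit 1
ln -s loop_b loop_a            # loop_a -> loop_b
ln -s loop_a loop_b            # loop_b -> loop_a, closing the cycle
if cat loop_a 2> err.txt; then
    loop_result="resolved"
else
    loop_result="refused"      # the lookup was abandoned with an error
fi
loop_error=$(cat err.txt)
echo "$loop_result: $loop_error"
cd / && rm -rf "$work"
```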
However, the root directory is a special case. We observe:
$ ls -dli /
2 dr-xr-xr-x. 22 root root 4096 Aug 10 00:50 /
$ ls -ali /
total 102
2 dr-xr-xr-x. 22 root root 4096 Aug 10 00:50 .
2 dr-xr-xr-x. 22 root root 4096 Aug 10 00:50 ..
...
The inode number of the root directory is 2, and its hidden hard link "..", which would normally point to the parent directory, points to the root directory itself.
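The same observation can be made without reading ls output by comparing inode numbers directly. A minimal sketch, assuming GNU stat; the number itself is 2 on ext file systems but may differ on other file systems or inside containers, so only the equality is what matters here.

```shell
#!/bin/sh
# Sketch: at the root, ".." leads back to the root directory itself.
root_inode=$(stat -c %i /)
parent_inode=$(stat -c %i /..)
echo "/ is inode $root_inode, /.. is inode $parent_inode"
if [ "$root_inode" = "$parent_inode" ]; then
    echo "the parent of the root directory is the root directory"
fi
```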