Analyzing the VFS file system mechanism in Linux (I)
1. Summary
This article describes the file system in Linux, based on the source code of the 2.4.20 kernel for IA32. Broadly, the file system in Linux can be divided into three main parts: first, the upper-layer file system calls; second, the Virtual Filesystem Switch (VFS); and third, the actual file systems attached to VFS, such as ext2 and jffs. This article focuses on explaining the internal mechanism of VFS in the Linux kernel through code analysis. Along the way it touches on the upper-layer file system calls and on how the lower-level actual file systems are mounted. The article tries to explain the VFS mechanism from a relatively high level, so the description concentrates on the main thread of the whole module rather than on details, and several illustrations are included to help readers understand.
The VFS code is fairly intricate and complex. I hope that after reading this article you will have a clear picture of how VFS works as a whole in Linux. Before reading it, I recommend that you browse the file system source code so that you have the most basic concepts of the Linux file system in mind. At the very least you should be familiar with the meaning of data structures such as super block, dentry, inode, and vfsmount; this will make the article easier to follow.
2. Overview of VFS
VFS is a software mechanism; it might be described as the manager of file systems in Linux. The data structures associated with it exist only in physical memory. Therefore, at every system initialization Linux must first construct a VFS directory tree in memory (called a namespace in the Linux source code), which really means building the corresponding data structures in memory. The VFS directory tree is an important concept in the Linux file system module, and I hope that you do not confuse it with the directory tree of an actual file system. In my opinion, the directories in VFS serve mainly to provide mount points for actual file systems. File-level operations are of course also involved in VFS, but this article does not cover them. Wherever a directory tree or directory is mentioned below, it refers to the VFS directory tree or a VFS directory unless otherwise specified. Figure 1 shows what such a directory tree might look like in memory:
Figure 1: VFS directory tree structure
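As a rough illustration of what "a directory tree that exists only in memory" means, the sketch below builds a tiny tree in user space. The names struct node, node_new and node_lookup are hypothetical and far simpler than the kernel's struct dentry; the mounted_fs field is an assumption used only to mark a directory as a mount point.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a VFS directory (not struct dentry). */
struct node {
    const char  *name;        /* directory name, e.g. "dev" */
    const char  *mounted_fs;  /* name of a file system mounted here, or NULL */
    struct node *parent;      /* the root points to itself */
    struct node *child;       /* first child */
    struct node *sibling;     /* next child of the same parent */
};

/* Allocate a directory and link it under parent (NULL creates a root). */
struct node *node_new(struct node *parent, const char *name)
{
    struct node *n = calloc(1, sizeof(*n));

    if (!n)
        return NULL;
    n->name = name;
    n->parent = parent ? parent : n;
    if (parent) {
        n->sibling = parent->child;   /* push onto the parent's child list */
        parent->child = n;
    }
    return n;
}

/* Find a direct child by name; returns NULL if there is none. */
struct node *node_lookup(const struct node *dir, const char *name)
{
    struct node *n;

    assert(dir != NULL);
    for (n = dir->child; n; n = n->sibling)
        if (strcmp(n->name, name) == 0)
            return n;
    return NULL;
}
```

A caller would create the root with node_new(NULL, "/"), add directories such as "dev" or "proc" under it, and set mounted_fs on the ones that act as mount points.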
3. File System Registration
The file systems here are the actual file systems that may be mounted on the directory tree. "Actual" means that operations in VFS are ultimately carried out by them; it does not mean that they must reside on a particular storage device. For example, more than a dozen file systems are registered on my Linux machine, including "rootfs", "proc", "ext2", and "sockfs".
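On a running Linux system, the list of registered file systems can be read from /proc/filesystems, where a leading "nodev" marks a file system that does not need a block device. The helper below is a small user-space sketch for counting the entries; the function name count_filesystems is made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Count the entries in a /proc/filesystems-style file.
 * Each line is "<tab><name>" or "nodev<tab><name>".
 * Returns the number of entries, or -1 if the file cannot be opened;
 * *nodev_count receives the number of "nodev" entries.
 */
int count_filesystems(const char *path, int *nodev_count)
{
    char line[256];
    int total = 0;
    FILE *fp;

    assert(path != NULL && nodev_count != NULL);
    *nodev_count = 0;
    fp = fopen(path, "r");
    if (!fp)
        return -1;
    while (fgets(line, sizeof(line), fp)) {
        if (line[0] == '\n' || line[0] == '\0')
            continue;                       /* skip blank lines */
        if (strncmp(line, "nodev", 5) == 0)
            (*nodev_count)++;
        total++;
    }
    fclose(fp);
    return total;
}
```

Calling count_filesystems("/proc/filesystems", &n) gives the total number of registered file systems on the machine and, in n, how many of them are not backed by a device.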
3.1 Data Structure
In Linux source code, each actual file system is represented by the following data structure:
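struct file_system_type {
    const char *name;                   /* file system name, e.g. "ext2" */
    int fs_flags;                       /* FS_* flags, e.g. FS_NOMOUNT */
    struct super_block *(*read_super) (struct super_block *, void *, int);  /* fills in the super block */
    struct module *owner;               /* module that implements the file system */
    struct file_system_type *next;      /* next entry in the file_systems list */
    struct list_head fs_supers;         /* super blocks of this file system type */
};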
Registration instantiates a struct file_system_type for each actual file system and links the instances into a list. In the kernel, the global variable file_systems points to the head of this linked list.
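The registration list described above can be sketched in user space as follows. This is a simplified model, not the kernel code: struct fs_type, find_slot, register_fs and get_fs are hypothetical names, locking is omitted, and the rule that a duplicate name is rejected with -EBUSY is an assumption about reasonable behaviour for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for struct file_system_type (user space only). */
struct fs_type {
    const char     *name;
    int             fs_flags;
    struct fs_type *next;    /* next entry in the registration list */
};

/* Head of the list, playing the role of the kernel's file_systems. */
static struct fs_type *file_systems;

/* Return the slot holding the entry called name, or the empty slot at the tail. */
static struct fs_type **find_slot(const char *name)
{
    struct fs_type **p;

    assert(name != NULL);
    for (p = &file_systems; *p; p = &(*p)->next)
        if (strcmp((*p)->name, name) == 0)
            break;
    return p;
}

/* Append fs to the list; fail if it is already linked or the name is taken. */
int register_fs(struct fs_type *fs)
{
    struct fs_type **p;

    if (!fs)
        return -EINVAL;
    if (fs->next)
        return -EBUSY;
    p = find_slot(fs->name);
    if (*p)
        return -EBUSY;
    *p = fs;
    return 0;
}

/* Look up a registered file system by name; NULL if it is not registered. */
struct fs_type *get_fs(const char *name)
{
    return *find_slot(name);
}
```

Registering "rootfs" and then "ext2" leaves file_systems pointing at the rootfs entry, whose next pointer leads to the ext2 entry.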
3.2 Registering the rootfs file system
Of the many actual file systems, the registration of rootfs is introduced separately because this file system is so closely tied to VFS: if ext2/ext3 is the native file system of Linux, then rootfs is the basis on which VFS exists. Ordinary file systems are registered through the module_init macro and the do_initcalls() function (readers can study the definition of the module_init macro and the arch/i386/vmlinux.lds file to understand this process). rootfs, however, is registered through the init_rootfs() initialization function, which means that registering rootfs is an integral part of the Linux kernel initialization phase.
init_rootfs() registers the rootfs file system by calling register_filesystem(&rootfs_fs_type). rootfs_fs_type is defined as follows:
struct file_system_type rootfs_fs_type = {
    name:       "rootfs",
    read_super: ramfs_read_super,
    fs_flags:   FS_NOMOUNT | FS_LITTER,
    owner:      THIS_MODULE,
};
The structure of the registered file_systems linked list is shown in Figure 2:
Figure 2: file_systems linked list Structure
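The "name:" form used in the rootfs_fs_type initializer is an old GCC extension; standard C99 writes the same thing as ".name =". The sketch below shows the C99 spelling on a cut-down structure. Everything prefixed with demo_ or DEMO_ is hypothetical, and the flag values are placeholders rather than the kernel's real definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder flag values for illustration only; the real ones are in the kernel headers. */
#define DEMO_FS_NOMOUNT 0x10
#define DEMO_FS_LITTER  0x20

struct super_block;  /* opaque in this sketch */

/* Cut-down copy of the structure from section 3.1 (owner and fs_supers omitted). */
struct demo_fs_type {
    const char *name;
    int fs_flags;
    struct super_block *(*read_super)(struct super_block *, void *, int);
    struct demo_fs_type *next;
};

/* Stub standing in for ramfs_read_super(); it just hands the super block back. */
static struct super_block *demo_read_super(struct super_block *sb, void *data, int silent)
{
    (void)data;
    (void)silent;
    return sb;
}

/* C99 spelling of the initializer; members that are not named (next) are zeroed. */
struct demo_fs_type demo_rootfs = {
    .name       = "rootfs",
    .fs_flags   = DEMO_FS_NOMOUNT | DEMO_FS_LITTER,
    .read_super = demo_read_super,
};

/* Report whether a file system type carries the (placeholder) no-mount flag. */
int demo_is_nomount(const struct demo_fs_type *fs)
{
    assert(fs != NULL);
    return (fs->fs_flags & DEMO_FS_NOMOUNT) != 0;
}
```

Because unnamed members are zero-initialized, the next pointer starts out as NULL, which is what a not-yet-registered entry in a linked list needs.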
4. Creating the VFS directory tree
Since it is a tree, the root is the basis of its existence. This section describes how Linux establishes the root node, that is, the "/" directory, during the initialization phase, which includes mounting the rootfs file system on the root directory. The code that constructs the root directory is in the init_mount_tree() function (fs/namespace.c).
First, init_mount_tree() calls do_kern_mount("rootfs", 0, "rootfs", NULL) to mount the previously registered rootfs file system. This may seem a little strange: from what was said earlier, a mount directory should exist first and the file system should then be mounted on it, yet at this point VFS does not appear to have a root directory. This is not a problem, because do_kern_mount() itself creates the root directory, the object we care about most and the most critical one (in Linux, the data structure corresponding to a directory is struct dentry).
In this scenario, do_kern_mount() mainly performs the following steps:
1) Call alloc_vfsmnt() to allocate a struct vfsmount (struct vfsmount *mnt) in memory and initialize some of its member variables.
2) Call get_sb_nodev() to allocate a super block structure (struct super_block) sb in memory, initialize some of its member variables, and insert its s_instances member into the doubly linked list headed by fs_supers in the rootfs file system type structure.
3) Call ramfs_read_super() through the read_super function pointer of the rootfs file system type. Recall that when rootfs was registered, its read_super pointer was set to ramfs_read_super(), as the definition of rootfs_fs_type in section 3.2 shows.
4) ramfs_read_super() calls ramfs_get_inode() to allocate an inode structure (struct inode) inode in memory and initialize some of its member variables, of which i_op, i_fop, and i_sb are the important ones:
inode->i_op = &ramfs_dir_inode_operations;
inode->i_fop = &dcache_dir_ops;
inode->i_sb = sb;
As a result, file operations later issued on VFS through file system calls are taken over by the corresponding function interfaces of the rootfs file system.
5) After the inode has been allocated and initialized, ramfs_read_super() calls d_alloc_root() to create the crucial root directory (struct dentry) dentry of the VFS directory tree, pointing the d_sb pointer in dentry to sb and the d_inode pointer to inode.
6) Point the mnt_sb pointer in mnt to sb, the mnt_root and mnt_mountpoint pointers to dentry, and the mnt_parent pointer to mnt itself.
Thus, when do_kern_mount() returns, the relationships among the allocated data structures and the rootfs file system are as shown in Figure 3. The numbers below the mnt, sb, inode, and dentry blocks in the figure indicate the order in which they are allocated in memory. Because of space limitations, only some of the member variables of each structure are shown. Readers can compare the figure with the source code for a better understanding.
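The pointer wiring in steps 1) to 6) can be modelled in user space with the sketch below. The demo_ structures are hypothetical reductions of vfsmount, super_block, inode and dentry that keep only the members named in the steps. The s_root member, which the steps do not mention, is used here as the way the root dentry is handed back to the caller.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, heavily reduced versions of the kernel structures. */
struct demo_sb;
struct demo_inode  { struct demo_sb *i_sb; };
struct demo_dentry { struct demo_sb *d_sb; struct demo_inode *d_inode; };
struct demo_sb     { struct demo_dentry *s_root; };
struct demo_mnt {
    struct demo_sb     *mnt_sb;
    struct demo_dentry *mnt_root;
    struct demo_dentry *mnt_mountpoint;
    struct demo_mnt    *mnt_parent;
};

/* Plays the part of ramfs_read_super(): steps 4 and 5. */
static struct demo_sb *demo_read_super(struct demo_sb *sb)
{
    struct demo_inode *inode;
    struct demo_dentry *root;

    assert(sb != NULL);
    inode = calloc(1, sizeof(*inode));      /* step 4: allocate the inode */
    root = calloc(1, sizeof(*root));        /* step 5: allocate the root dentry */
    if (!inode || !root) {
        free(inode);
        free(root);
        return NULL;
    }
    inode->i_sb = sb;
    root->d_sb = sb;
    root->d_inode = inode;
    sb->s_root = root;                      /* hand the root dentry back via sb */
    return sb;
}

/* Plays the part of do_kern_mount() for rootfs: steps 1 to 6. */
struct demo_mnt *demo_kern_mount(void)
{
    struct demo_mnt *mnt = calloc(1, sizeof(*mnt));   /* step 1 */
    struct demo_sb *sb = calloc(1, sizeof(*sb));      /* step 2 */

    if (!mnt || !sb || !demo_read_super(sb)) {        /* step 3 */
        free(sb);
        free(mnt);
        return NULL;
    }
    mnt->mnt_sb = sb;                                 /* step 6 */
    mnt->mnt_root = sb->s_root;
    mnt->mnt_mountpoint = sb->s_root;
    mnt->mnt_parent = mnt;                            /* the root mount is its own parent */
    return mnt;
}
```

After demo_kern_mount() returns, the mount, the super block, the inode and the root dentry all reach one another through the same pointers the steps describe.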
Finally, init_mount_tree() prepares the namespace field in the process data block of the system's initial process (the init_task process). The main purpose is to record the mnt and dentry information created in do_kern_mount() in the process data block of init_task, so that every process later forked from init_task inherits this information. We will see why this is done when sys_mkdir is used to create a directory in VFS. The main code for creating a namespace for a process is as follows:
namespace = kmalloc(sizeof(*namespace), GFP_KERNEL);
list_add(&mnt->mnt_list, &namespace->list); // mnt is returned by do_kern_mount()
namespace->root = mnt;
init_task.namespace = namespace;
for_each_task(p) {
    get_namespace(namespace);
    p->namespace = namespace;
}
set_fs_pwd(current->fs, namespace->root, namespace->root->mnt_root);
set_fs_root(current->fs, namespace->root, namespace->root->mnt_root);
The last two lines of this code record the mnt and dentry information created in do_kern_mount() in the fs structure of the current process.
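The sharing of one namespace among tasks can be sketched in user space as follows. All demo_ types and functions are hypothetical simplifications; in particular, the plain int counter stands in for whatever reference counting get_namespace() performs, and the array of tasks stands in for the kernel's task list.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reduced structures; member names follow the article where possible. */
struct demo_dentry { const char *name; };
struct demo_mnt    { struct demo_dentry *mnt_root; };
struct demo_ns     { int count; struct demo_mnt *root; };
struct demo_fs     { struct demo_mnt *rootmnt, *pwdmnt; struct demo_dentry *root, *pwd; };
struct demo_task   { struct demo_ns *ns; struct demo_fs fs; };

/* Create the initial namespace around an already mounted root. */
struct demo_ns *demo_ns_create(struct demo_mnt *mnt)
{
    struct demo_ns *ns = malloc(sizeof(*ns));

    if (!ns)
        return NULL;
    ns->count = 1;      /* reference held by the creator */
    ns->root = mnt;
    return ns;
}

/* Give every task in the array a counted reference to ns. */
void demo_ns_share(struct demo_ns *ns, struct demo_task *tasks, size_t n)
{
    size_t i;

    assert(ns != NULL);
    for (i = 0; i < n; i++) {
        ns->count++;            /* one more user of the namespace */
        tasks[i].ns = ns;
    }
}

/* Record the root mount and its dentry as both root and working directory. */
void demo_set_root_and_pwd(struct demo_task *t)
{
    t->fs.rootmnt = t->fs.pwdmnt = t->ns->root;
    t->fs.root = t->fs.pwd = t->ns->root->mnt_root;
}
```

Every task that shares the namespace sees the same root mount, so a task whose root and working directory are set this way starts out at the "/" of the VFS tree.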
The above has traced the origins of quite a few data structures, but the ultimate goal is simply to build a VFS directory tree in memory. More precisely, init_mount_tree() creates the root directory "/" for VFS. Once the root exists, the tree can grow; for example, the sys_mkdir system call can be used to create a new leaf node on the tree. For this reason, the system designers mount the rootfs file system on the root of the tree. As for the rootfs file system, if you look at its file_system_type structure in Figure 2 above, you will find that its member function pointer read_super points to ramfs_read_super. From the "ramfs" in this function name, the reader can probably guess that all the file operations involved in this file system are