Linux Programming Study Notes: File Management System


File System Management in Linux

1. Overview of the VFS File System

Linux uses VFS to manage its file systems, and one of the core Linux design principles is that "everything is a file". The file management system is therefore a central embodiment of Linux's design.

VFS stands for Virtual File System (also known as the Virtual Filesystem Switch).

In general, the file system layer in Linux can be divided into three main parts: the upper-layer file system related system calls; the Virtual File System, VFS (Virtual Filesystem Switch); and the actual file systems attached to VFS, such as ext2 and jffs.

VFS is a software mechanism; you might call it the Linux file system administrator. The data structures associated with it exist only in physical memory, so during each system initialization Linux must first construct a VFS directory tree in memory (called a namespace in the Linux source code); in other words, it must establish the corresponding data structures in memory. The VFS directory tree is an important concept in the Linux file system module, and it should not be confused with the directory tree of an actual file system. In my view, directories in VFS mainly serve as mount points for actual file systems; file-level operations also pass through VFS, but this article does not cover that case. Unless otherwise specified, "directory tree" or "directory" below refers to the directory tree or directories of VFS. The figure shows a possible layout of this directory tree in memory:

2. File System Registration

The "file system" here refers to the actual file systems that may be mounted onto the directory tree. "Actual" means that operations issued to VFS are ultimately completed by them; it does not mean that they must exist on a specific storage device. For example, a typical Linux machine has more than a dozen file systems registered, such as "rootfs", "proc", "ext2", and "sockfs".
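As a quick check of which file systems are currently registered, one can read /proc/filesystems on a running system. The small user-space program below is only an illustration; the exact list differs from machine to machine:

#include <stdio.h>

/* Print the file systems currently registered with the kernel.
 * The "nodev" prefix marks file systems that need no block device,
 * such as proc or sockfs. Output differs from system to system. */
int main(void)
{
    char line[128];
    FILE *fp = fopen("/proc/filesystems", "r");

    if (fp == NULL) {
        perror("fopen");
        return 1;
    }
    while (fgets(line, sizeof(line), fp))
        fputs(line, stdout);
    fclose(fp);
    return 0;
}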

2.1 Data Structure

In Linux source code, each actual file system is represented by the following data structure:

struct file_system_type {
    const char *name;
    int fs_flags;
    struct super_block *(*read_super) (struct super_block *, void *, int);
    struct module *owner;
    struct file_system_type *next;
    struct list_head fs_supers;
};
The registration process instantiates the struct file_system_type structure of each actual file system and links these structures into a list. In the kernel, a global variable named file_systems points to the head of this list.

2.2 Registering the rootfs File System

Among the many actual file systems, the reason for describing the registration of the rootfs file system separately is that it is very closely tied to VFS itself: if ext2/ext3 is the native local Linux file system, then the rootfs file system is the basis on which VFS exists. Normally, file system registration is completed through the module_init macro and the do_initcalls() function (readers can study the module_init macro and the arch/i386/vmlinux.lds file to understand this process), but the registration of rootfs is done by the init_rootfs() initialization function. This means that the registration of rootfs is an integral part of the Linux kernel initialization phase.

init_rootfs() registers the rootfs file system by calling register_filesystem(&rootfs_fs_type). rootfs_fs_type is defined as follows:

struct file_system_type rootfs_fs_type = {
    name:       "rootfs",
    read_super: ramfs_read_super,
    fs_flags:   FS_NOMOUNT|FS_LITTER,
    owner:      THIS_MODULE,
};

The figure shows the structure of the registered file_systems linked list:
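As a rough illustration of how that list is built, registration essentially appends each file_system_type to the singly linked list headed by file_systems. The sketch below is modelled on the 2.4-era register_filesystem(); it is not the verbatim kernel code, and locking and argument checks are omitted:

/* Simplified sketch of file system registration (assumption: modelled on
 * register_filesystem() in the 2.4 kernel; the real code also validates the
 * argument and takes a lock). */
static struct file_system_type *file_systems;   /* head of the global list */

int register_filesystem_sketch(struct file_system_type *fs)
{
    struct file_system_type **p = &file_systems;

    /* walk to the end of the list, refusing duplicates by name */
    while (*p) {
        if (strcmp((*p)->name, fs->name) == 0)
            return -EBUSY;          /* already registered */
        p = &(*p)->next;
    }
    fs->next = NULL;
    *p = fs;                        /* append the new entry */
    return 0;
}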

3. Create VFS directory tree

Since it is a tree, the root is the basis for its existence. This section describes how Linux establishes the root node, that is, the "/" directory, during the initialization phase. This includes mounting the rootfs file system onto that root directory. The code that constructs the root directory is in the init_mount_tree() function (fs/namespace.c).

First, the init_mount_tree() function calls do_kern_mount("rootfs", 0, "rootfs", NULL) to mount the previously registered rootfs file system. This may seem a bit strange: according to the earlier description, there should first be a mount-point directory, and then the corresponding file system is mounted on it, yet at this point VFS does not even have a root directory. This does not matter, because the call to do_kern_mount() here will itself create the root directory that we care about most (in Linux, the data structure corresponding to a directory is struct dentry).

In this scenario, do_kern_mount() mainly does the following:

1) Calls the alloc_vfsmnt() function to allocate a struct vfsmount (call it mnt) in memory and initialize some of its member variables.

2) Calls the get_sb_nodev() function to allocate a super block structure (struct super_block, call it sb) in memory and initialize some of its member variables, and inserts sb's s_instances into the doubly linked list headed by fs_supers in the rootfs file_system_type structure.

3) Calls the ramfs_read_super() function through the read_super function pointer of the rootfs file system type. Recall that when the rootfs file system was registered, its read_super pointer was set to ramfs_read_super(), as shown earlier.

4) The ramfs_read_super() function calls ramfs_get_inode() to allocate an inode structure (struct inode) in memory and initialize some of its member variables; among them, i_op, i_fop, and i_sb are the important ones:

inode->i_op  = &ramfs_dir_inode_operations;
inode->i_fop = &dcache_dir_ops;
inode->i_sb  = sb;
In this way, file operations initiated through VFS system calls will later be taken over by the corresponding function interfaces of the rootfs file system.

5) After the inode structure has been allocated and initialized, ramfs_read_super() calls the d_alloc_root() function to create the crucial root directory entry (struct dentry, call it dentry) for the VFS directory tree, pointing the d_sb pointer in dentry to sb and the d_inode pointer to inode.

6) Points the mnt_sb pointer in mnt to sb, the mnt_root and mnt_mountpoint pointers to dentry, and the mnt_parent pointer to mnt itself.

In this way, when the do_kern_mount() function returns, the relationships between the allocated data structures and the rootfs file system are as shown in the figure.

The numbers below the mnt, sb, inode, and dentry blocks in the figure indicate the order in which they are allocated in memory. Due to space limitations, only some of the member variables are shown in each structure; readers can compare the figure with the source code for a better understanding. The fragment below summarizes the key links.
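The following is a condensed summary, based on the six steps above and not a verbatim kernel excerpt, of the links that exist between mnt, sb, inode, and dentry once do_kern_mount() has returned:

/* Condensed summary of the result of do_kern_mount("rootfs", ...);
 * the four objects come from steps 1, 2, 4 and 5 above. */
inode->i_op  = &ramfs_dir_inode_operations;  /* step 4 */
inode->i_fop = &dcache_dir_ops;
inode->i_sb  = sb;

dentry->d_sb    = sb;                        /* step 5: the "/" dentry of VFS */
dentry->d_inode = inode;

mnt->mnt_sb         = sb;                    /* step 6 */
mnt->mnt_root       = dentry;
mnt->mnt_mountpoint = dentry;
mnt->mnt_parent     = mnt;                   /* the root mount is its own parent */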

Finally, the init_mount_tree() function prepares the namespace field in the process data block of the system's initial process (the init_task process). The main purpose is to record the mnt and dentry created in do_kern_mount() in the process data block of init_task. In this way, every process later forked from init_task inherits this information, which also lays the groundwork for using sys_mkdir to create a directory in VFS later. The main code for creating the namespace for the process is as follows:

namespace = kmalloc(sizeof(*namespace), GFP_KERNEL);
list_add(&mnt->mnt_list, &namespace->list);   /* mnt is returned by do_kern_mount() */
namespace->root = mnt;
init_task.namespace = namespace;
for_each_task(p) {                            /* every existing task shares this namespace */
    get_namespace(namespace);
    p->namespace = namespace;
}
set_fs_pwd(current->fs, namespace->root, namespace->root->mnt_root);
set_fs_root(current->fs, namespace->root, namespace->root->mnt_root);

The last two lines of this code record the mnt and dentry created in the do_kern_mount() function in the fs structure of the current process.

The above has explained where quite a few data structures come from; the ultimate goal is to create a VFS directory tree in memory. More precisely, init_mount_tree() creates the root directory "/" for VFS. Once the root directory exists, the tree can grow: for example, sys_mkdir can be called to create a new leaf node on the tree. That is why the system designers mount the rootfs file system on the root of the tree. As for the rootfs file system, if you look at its file_system_type structure in Figure 2 above, you will find that its read_super member function pointer points to ramfs_read_super; from the "ramfs" in that function name alone, the reader can probably guess that all file operations in this file system are aimed at data objects in memory. This is indeed the case, and from another perspective it is quite logical: VFS itself is a data object in memory, so operations on it are limited to memory. In the next chapter, we will use a concrete example to discuss how the functions provided by rootfs are used to add a new directory node to VFS.

The main purpose of each directory in VFS is to provide a mount point for the file system to be mounted later. Therefore, the real file operations must be performed through the functional interfaces provided by the mounted file system.

4. Create a directory under VFS

To better understand VFS, let's take a practical example to see how Linux creates a new "/dev" directory under the root directory of VFS.

To create a new directory in VFS, the path must first be looked up to find information about the new directory's parent directory. For example, to create the directory /home/ricard, the path must first be searched level by level: starting from the root directory, the directory home is found under it, and only then is the new directory ricard created under home. So the first step is the path search, which in this example finds the information corresponding to the parent directory of the new directory ricard, namely the home directory.

Of course, if an error is found during the search, for example the parent directory of the directory to be created does not exist, or the current process does not have the appropriate permissions, the system calls the relevant routines to handle it; such cases are not discussed here.

In Linux, the sys_mkdir system call adds a new node to the VFS directory tree. In addition, the following data structure is used to assist the path search:

struct nameidata {
    struct dentry *dentry;
    struct vfsmount *mnt;
    struct qstr last;
    unsigned int flags;
    int last_type;
};

This data structure records relevant information during the path search and plays a role similar to a "road map". The dentry and mnt in the first two fields record the information of the parent directory of the directory to be created (the mnt member is explained later). The last three fields record the information of the last component of the path being searched, that is, the directory or file to be created. Now sys_mkdir("/dev", 0700) is called to create the directory "/dev"; the parameter 0700, which only restricts the access mode of the new directory, is ignored here. The sys_mkdir function first calls path_lookup("/dev", LOOKUP_PARENT, &nd) to look up the path, where nd is a variable declared as struct nameidata nd. In the description that follows, because the function call relationships are cumbersome, the main flow is highlighted rather than strictly following every function call.

path_lookup() finds that "/dev" starts with "/", so it starts looking down from the root directory of the current process. The specific code is as follows:

nd->mnt = mntget(current->fs->rootmnt);
nd->dentry = dget(current->fs->root);

Recall that the second half of the init_mount_tree() function recorded the newly created VFS root directory information in the process data block of init_task. In this scenario, nd->mnt therefore points to the mnt variable in Figure 3, and nd->dentry points to the dentry variable in Figure 3.

Then the path_walk function is called to continue looking downwards. When it returns, the nd variable records: nd.last.name = "dev", nd.last.len = 3, nd.last_type = LAST_NORM; the mnt and dentry members of nd keep the values set above. Up to this point, nd has only been used to record relevant information; the actual directory creation work has not yet started, but the preceding work has collected the information necessary for creating the new node.

Now the creation of the new directory node can begin. This is handled by the lookup_create function, which is called with two parameters: lookup_create(&nd, 1). The nd parameter is the variable described above, and the parameter 1 indicates that a directory is being created.

The general process is as follows. A new struct dentry is allocated to record the information corresponding to the dev directory; this dentry is attached to its parent directory, that is, the dentry corresponding to the "/" directory, and this relationship is implemented by linked lists. Next, a struct inode is allocated; its i_sb, and the d_sb in the new dentry, both point to sb. Note that no new super block is allocated when creating a new directory within the same file system, because a file system corresponds to exactly one super block.
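In terms of pointers, the result described above can be sketched as follows. new_dentry and new_inode follow the naming used later in the text; d_parent and the dcache child lists are the usual directory cache links, and the exact list manipulation is omitted. This is not verbatim kernel code:

/* Sketch of the links created for the new "/dev" node. */
new_dentry->d_parent = dentry;      /* attached under the "/" dentry                  */
new_dentry->d_inode  = new_inode;
new_dentry->d_sb     = sb;          /* same super block as the parent...              */
new_inode->i_sb      = sb;          /* ...because one file system has exactly one sb  */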

In this way, after sys_mkdir successfully creates the new directory "/dev" in the VFS directory tree, the relationships between the new data structures are as shown in the figure. new_inode and new_dentry are the two darker rectangular blocks; they are the memory structures newly allocated in the sys_mkdir() call. The mnt, sb, dentry, and inode structures in the figure are still the same data structures as before, and the links between them remain unchanged (in the figure, to avoid too many link curves, some links are omitted, such as those between mnt, sb, and dentry).

It should be emphasized that, since the rootfs file system is mounted on the VFS tree, it inevitably participates in the sys_mkdir process. In fact, throughout this process, functions of the rootfs file system such as ramfs_mkdir and ramfs_lookup are called. The sketch below condenses the whole flow.
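Putting this chapter together, the flow of sys_mkdir("/dev", 0700) can be condensed into the sketch below. Error handling, permission checks, and locking are omitted, and vfs_mkdir() is shown as the generic helper that eventually reaches ramfs_mkdir() for rootfs; treat the exact call chain as an approximation of the 2.4 code rather than a verbatim copy:

/* Condensed sketch of the directory-creation path described above. */
asmlinkage long sys_mkdir_sketch(const char *pathname, int mode)
{
    struct nameidata nd;
    struct dentry *dentry;
    int err;

    /* 1. resolve the parent directory; for "/dev" this leaves the "/" dentry
     *    in nd.dentry and records "dev" in nd.last */
    err = path_lookup(pathname, LOOKUP_PARENT, &nd);
    if (err)
        return err;

    /* 2. allocate a dentry for the new node under the parent */
    dentry = lookup_create(&nd, 1);            /* 1: we are creating a directory */

    /* 3. let the file system owning the parent inode do the work;
     *    with rootfs mounted on "/", this reaches ramfs_mkdir() */
    err = vfs_mkdir(nd.dentry->d_inode, dentry, mode);

    /* (cleanup of the dentry, locks and nd omitted) */
    return err;
}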

5. Mount the file system in the VFS tree

This section describes how to mount a file system to a directory (mount point) in the directory tree of VFS.

This process can be described simply as: mount a file system of a given type (file_system_type), residing on some device (dev_name), onto a directory (dir_name) of the VFS directory tree. The effect is that operations on that directory of the VFS tree are converted into operations on the actual file system mounted there. For example, suppose the root file system on hda2 (assume its type is ext2) is mounted onto the "/dev" directory created in the previous section; "/dev" thereby becomes a mount point. After the mount succeeds, running the "ls" command on the "/dev" directory of VFS should list all the directories and files under the root directory of the ext2 file system on hda2. Obviously, the key question is how operation commands aimed at the "/dev" directory of the VFS tree are converted into the corresponding commands of the ext2 file system mounted on it, and the rest of this chapter concentrates on that conversion. Before going on, the reader may want to imagine how Linux might solve this problem. Remember: operations on a directory or file are ultimately carried out by the functions in the i_op and i_fop tables of the inode corresponding to that directory or file. So whatever the final solution is, one can expect that the i_op and i_fop of the inode corresponding to "/dev" must somehow be converted into the i_op and i_fop of the inode corresponding to the root directory of the ext2 file system on hda2.

The process is initiated by the sys_mount() system call function, whose prototype is declared as follows:

asmlinkage long sys_mount(char *dev_name, char *dir_name, char *type,
                          unsigned long flags, void *data);

The char *type parameter is a string naming the file system type to be mounted, which for the ext2 file system is "ext2". The flags parameter carries mode flags for the mount and, like the data parameter that follows it, is not the focus of this article.

To help readers understand this process, I will use a concrete example: mounting the ext2 file system on the second partition (hda2) of the primary hard disk onto the "/dev" directory created earlier. The call to the sys_mount() function looks like this:

sys_mount("hda2","/dev ","ext2",…);

After copying these parameters from user space into kernel space, sys_mount() calls the do_mount() function to carry out the mount. Again, to keep the main flow clear, the description below does not strictly follow every detail of the function calls.

The do_mount() function first calls the path_lookup() function to obtain information about the mount point. As described for directory creation, this information is recorded in a variable of type struct nameidata, again called nd for convenience. In this example, when path_lookup() returns, nd records: nd.dentry = new_dentry; nd.mnt = mnt; these are the variables shown in the earlier figures.

The do_mount() function then calls one of the following four functions, depending on the flags argument: do_remount(), do_loopback(), do_move_mount(), or do_add_mount().

In our current example, the system calls the do_add_mount() function to mount an actual file system on "/dev" in the VFS tree. do_add_mount() does two important things: it obtains a new mount block, and it adds the new mount block to the mount list. These are done by calling the do_kern_mount() and graft_tree() functions respectively. The terms "mount block" and "mount list" may sound abstract here (they are the author's own terms), but they will become clear once the corresponding figure is explained.

The do_kern_mount() function creates the new mount block; its details were described in the previous section on creating the VFS directory tree.

The graft_tree() function adds the struct vfsmount variable returned by do_kern_mount() to the mount list, and graft_tree() also adds the newly allocated struct vfsmount to a hash table; the purpose of this will become clear later.

In this way, when the do_kern_mount() function returns, the relationships between the new data structures are as shown in Figure 4. The data structures in the red circled area are what I call the mount block, and e2_mnt denotes the pointer to this mount block. The blue arrowed curve forms the so-called mount list. The sketch below summarizes the key links.
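Based on the description in this chapter, the key links in that mount block can be sketched as follows. e2_mnt, e2_dentry, new_dentry, and mnt follow the naming used in the text, the name e2_sb is an assumption for the ext2 super block, and this is not verbatim kernel code:

/* Sketch of the mount block created when ext2 is mounted on "/dev". */
e2_mnt->mnt_sb         = e2_sb;        /* super block read from hda2 (name assumed)   */
e2_mnt->mnt_root       = e2_dentry;    /* root dentry of the ext2 file system         */
e2_mnt->mnt_mountpoint = new_dentry;   /* the "/dev" dentry in the VFS tree           */
e2_mnt->mnt_parent     = mnt;          /* the mount it is grafted onto (rootfs)       */

new_dentry->d_mounted  = 1;            /* marks "/dev" as a mount point               */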

Having clarified the data structure relationships formed after these calls, let us return to the question raised at the start of this chapter: after the ext2 file system is mounted on "/dev", how are operations on this directory converted into operations on the ext2 file system? As shown in Figure 5, the call to sys_mount() does not directly change the i_op and i_fop pointers in the inode (the new_inode variable in the figure) corresponding to the "/dev" directory, and the dentry corresponding to "/dev" (the new_dentry variable in the figure) is still present in the VFS directory tree and has not been hidden from it. Correspondingly, the e2_dentry corresponding to the root directory of the ext2 file system on hda2 does not, as I originally thought, replace new_dentry in the VFS directory tree. So how is the conversion achieved?

Please note the following code:

while (d_mountpoint(dentry) && __follow_down(&nd->mnt, &dentry));

This code appears in the link_path_walk() function, and link_path_walk() is ultimately called by the path_lookup() function. Anyone who has read the Linux file system code knows that path_lookup() is an important basic function in Linux's rather tedious file system code: simply put, it parses a file path name. The file path name here is the same as what we usually use in applications; for example, when a Linux application opens or reads the file /home/windfly.cs, /home/windfly.cs is the file path name. The job of path_lookup() is to walk along the file path name until it finds the dentry corresponding to the directory containing the target file, or, if the target is itself a directory, the dentry of that directory. I do not want to explain this function in detail in the limited space here; the reader only needs to remember that path_lookup() returns a target directory.

The code above is so inconspicuous that it is easily overlooked when reading the file system code for the first time; however, as stated earlier, the conversion from VFS operations to actual file system operations is done by it, and it is indispensable for mounting file systems in VFS. Let us now look at this code more closely. d_mountpoint(dentry) simply returns the value of the d_mounted member of dentry; the dentry here is still something on the VFS directory tree. If a directory in VFS has been mounted on once, this value is 1 (a directory in VFS can be mounted on multiple times; an example of this is given later). In our example, d_mounted = 1 in the new_dentry corresponding to "/dev", so the first condition of the while loop is satisfied. Next, what does __follow_down(&nd->mnt, &dentry) do? Recall that at this point the dentry member of nd is new_dentry, and the mnt member of nd is mnt, so __follow_down(&nd->mnt, &dentry) can be rewritten as __follow_down(&mnt, &new_dentry). The code of the __follow_down() function is reproduced below (some irrelevant code has been removed, and sequence numbers have been added before some lines to ease the description):

static inline int __follow_down(struct vfsmount **mnt, struct dentry **dentry)
{
    struct vfsmount *mounted;

[1] mounted = lookup_mnt(*mnt, *dentry);
    if (mounted) {
[2]     *mnt = mounted;
[3]     *dentry = mounted->mnt_root;
        return 1;
    }
    return 0;
}

The lookup_mnt() function at code line [1] looks up the pointer to the mount block of the most recent mount performed on a given VFS directory; in this example it returns e2_mnt in Figure 5. The search works roughly as follows. Recall that when the ext2 file system was mounted on "/dev", the graft_tree() function was called, and in that function the mount block pointer e2_mnt in Figure 5 was added to a hash table (called mount_hashtable in the Linux 2.4.20 source code); the key for this entry is generated from the dentry of the mount point (new_dentry in this example) and the mount (mnt in this example). Naturally, then, when we know that some dentry in the VFS tree has been mounted on (that is, the dentry has become a mount point) and want to find the mount block pointer of the most recent mount on it, we generate a key in the same way from the dentry and mount corresponding to the mount point, index mount_hashtable with this value to find the head of the hash chain of mount block pointers for that mount point, and then walk the chain. When a mount block pointer p is found that satisfies the following condition:

(p->mnt_parent == mnt && p->mnt_mountpoint == dentry)

then p is the mount block pointer corresponding to the mount point. After this pointer is found, the mnt member of nd is replaced by the mount block pointer, and the dentry member of nd is replaced by the mnt_root dentry in the mount block; in our example, e2_mnt->mnt_root points to e2_dentry, the root directory of the ext2 file system. Thus, when path_lookup() finishes looking up "/dev", the dentry member of nd is e2_dentry rather than the original new_dentry, and the mnt member has been replaced by e2_mnt; the conversion has happened without our noticing.
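A simplified sketch of lookup_mnt(), reconstructed from the description above, is shown below. The hash function is abbreviated and locking is omitted; this is not the verbatim 2.4.20 code:

/* Sketch: find the mount block of the most recent mount on (mnt, dentry). */
static struct vfsmount *lookup_mnt_sketch(struct vfsmount *mnt, struct dentry *dentry)
{
    struct list_head *head = mount_hashtable + hash(mnt, dentry);  /* key from (mnt, dentry) */
    struct list_head *tmp;

    list_for_each(tmp, head) {
        struct vfsmount *p = list_entry(tmp, struct vfsmount, mnt_hash);
        if (p->mnt_parent == mnt && p->mnt_mountpoint == dentry)
            return p;                 /* e2_mnt in the running example */
    }
    return NULL;                      /* nothing mounted here */
}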

Now consider multiple mounts on the same mount point. For example, after mounting an ext2 file system on "/dev", an NTFS file system is then mounted on it. Before the mount, path_lookup() is again called to look up the mount point, but this time, because an ext2 file system is already mounted on the "/dev" directory, the information returned in nd is: nd.dentry = e2_dentry, nd.mnt = e2_mnt. We can see that during the second mount, the mount point has changed from new_dentry to e2_dentry. Next, similarly, the system allocates a mount block; assume its pointer is ntfs_mnt, and the root dentry in that block is ntfs_dentry. ntfs_mnt->mnt_parent points to e2_mnt, and mnt_root in ntfs_mnt points to ntfs_dentry, which represents the root directory of the NTFS file system. The system then generates a new hash key from e2_dentry and e2_mnt, uses this value as the index to add ntfs_mnt to mount_hashtable, and sets the d_mounted member of e2_dentry to 1. The mount process then ends.

As the reader may have realized, the most recent mount on a mount point hides all the earlier mounts. Let us explain this process with the example above:

After the ext2 and NTFS file systems have been mounted on the "/dev" directory in turn, suppose path_lookup() is called to look up "/dev". The function first finds the dentry and mnt corresponding to the mount point "/dev" in the VFS directory tree; it then notices that d_mounted in that dentry is 1, so it knows that a file system has been mounted on this dentry. It therefore generates a hash value from dentry and mnt and searches mount_hashtable; according to the mount process described above, it finds and returns the e2_mnt pointer, and the original dentry is replaced by e2_dentry. Recall the code: while (d_mountpoint(dentry) && __follow_down(&nd->mnt, &dentry)); after the first iteration, nd->mnt is already e2_mnt, and dentry has become e2_dentry. Since the d_mounted member of e2_dentry is also 1, the first condition of the while loop is again satisfied, and __follow_down() is called once more; as analysed before, when it returns, nd->mnt has become ntfs_mnt and dentry has become ntfs_dentry. Because nothing has been mounted on ntfs_dentry, its d_mounted member is 0 and the loop ends. The path_lookup() call for "/dev" therefore finally returns the dentry corresponding to the root directory of the NTFS file system. This is why "/dev" itself and the ext2 file system mounted on it are hidden: if you run the ls command on the "/dev" directory, it returns all the files and directories under the root directory of the most recently mounted NTFS file system.

6. Mount the root file system

With the foundation of the chapters above, it is not hard to understand how the root file system is mounted in Linux, because mounting a file system onto a mount point in VFS always follows the same process.

The process is roughly as follows: first determine where the ext2 file system to be mounted comes from, then determine its mount point in VFS, and then perform the actual mount.

For the first question, the Linux 2.4.20 kernel devotes a fair amount of code to it; I do not want to describe this process here, only to note that it resolves where to find the file system to be mounted. Here we assume that the root file system to be mounted comes from the first partition hda1 of the primary hard disk.

For the second question, the Linux 2.4.20 kernel mounts the ext2 file system from hda1 onto the "/root" directory in the VFS directory tree. In fact, where in the VFS directory tree the ext2 file system is mounted is not important (except that it cannot be the root directory of VFS), as long as the mount point exists in the VFS tree and the kernel has no other use for it. If you like, you could create a "/Windows" directory in VFS and mount the ext2 file system there as the root directory for future user processes. The key issue is setting the root directory and current working directory of the process, because, after all, it is user processes that deal with the actual file system; the author's article, for one, has to end up saved on the hard disk.

In Linux, the current working directory of a process can be set with the sys_chdir system call. During initialization, after Linux mounts the ext2 file system on hda1 onto "/root", it calls sys_chdir("/root"), so that the current working directory (pwd) of the current process, the init_task process, is set to the root directory of the ext2 file system. Remember that at this point the root directory of the init_task process is still the dentry in Figure 3, that is, the root directory of the VFS tree. All processes in the Linux world are descended from the init_task process and, without exception, inherit its root directory. If that remained so, it would mean that when a user process searches for a path starting from the root directory, it would actually start from the root of VFS, whereas in fact it should start from the root of the ext2 file system. This contradiction is resolved by the following two calls made after the mount_root() function runs:

sys_mount(".", "/", NULL, MS_MOVE, NULL);sys_chroot(".");

Their main effect is to turn the root directory of the init_task process into the root directory of the mounted ext2 file system. Interested readers can study this process on their own.

Therefore, from user space we effectively see only one leaf of the big VFS tree, namely the file system mounted there; VFS itself is largely invisible to user space. I think VFS is used more by the kernel to implement its own functionality and is exposed to user processes in the form of system calls; mounting the various file systems implemented on top of it is only one of its functions.

Application-layer development does not need to care about the specific implementation of VFS; you only need to know the interface functions that VFS provides over the various file systems.
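For example, the following small program reads a file through the ordinary open/read interface; whether the data lives on ext2, NTFS, or an in-memory file system is decided by VFS and whatever is mounted along the path, not by the application. The path is illustrative only (it reuses the example path from the text above):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* The application only talks to the VFS system call interface;
 * which actual file system serves the data depends on what is
 * mounted along the path. The path below is illustrative. */
int main(void)
{
    char buf[256];
    ssize_t n;
    int fd = open("/home/windfly.cs", O_RDONLY);

    if (fd < 0) {
        perror("open");
        return 1;
    }
    while ((n = read(fd, buf, sizeof(buf))) > 0)
        write(STDOUT_FILENO, buf, n);
    close(fd);
    return 0;
}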


Reproduced from the network; original source: http://blog.csdn.net/suool/article/details/38172057
