VFS File System Structure Analysis

This article was originally published by fireaxe and may be freely copied and reproduced under the GPL. When reprinting, please keep the document intact and credit the original author with a link to the original. The content may be used freely, but no guarantee is made regarding any consequences of using it.

Author: fireaxe_hq@hotmail.com blog: fireaxe.blog.chinaunix.net
VFS is a core concept of Linux, and most operations in Linux rely on VFS-related functions. This article gives a brief description of VFS from the user's point of view. Users need to know not only which file operation functions Linux provides, but also how VFS is structured, in order to use those functions well. For example, hard links and symbolic links are difficult to use correctly without understanding the VFS structure.
This article first sets up a simple directory model, then introduces how that directory is represented in VFS, and finally summarizes how to use the various file operation functions.


To keep things simple and practical, this article relies mainly on analysis and inference. Given my limited expertise, it will inevitably contain some errors. Please read it critically and point out mistakes freely; your criticism is what drives my progress.

1. Directory model

The following directory is used as an example.

dir is the top-level directory. It contains two subdirectories, subdir0 and subdir1, and a file, file0. subdir0 contains two files, file1 and file2. subdir1 contains one file, file3.

2. Concept of VFS

VFS is the Virtual File System in Linux, also known as the Virtual Filesystem Switch. It provides an abstraction for application programmers and hides the differences between the various underlying file systems.

Different file systems, such as Ext2/3, XFS, and FAT32, have different on-disk structures. If open and the other file I/O functions had to deal with each of them directly, the implementations would differ greatly. To hide these differences, Linux introduces VFS: in effect, Linux builds a new file system that lives in memory, and every other file system must be converted to the VFS structures before users can access it.
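The idea of one interface in front of several different file systems can be sketched with a toy model. Everything here (ToyVFS, ToyExt2, ToyFat32, read_file) is a hypothetical name made up for illustration and is not a kernel API; the sketch only shows how a caller can read a file the same way regardless of how the backend stores it.

```python
from abc import ABC, abstractmethod


class FileSystem(ABC):
    """Hypothetical stand-in for the operations a concrete file system offers to VFS."""

    @abstractmethod
    def read_file(self, path):
        ...


class ToyExt2(FileSystem):
    def __init__(self, blocks):
        self.blocks = blocks          # path -> bytes, standing in for inode + data blocks

    def read_file(self, path):
        return self.blocks[path]


class ToyFat32(FileSystem):
    def __init__(self, clusters):
        self.clusters = clusters      # path -> list of chunks, standing in for a cluster chain

    def read_file(self, path):
        return b"".join(self.clusters[path])


class ToyVFS:
    """Routes a path to the file system mounted at the longest matching mount point."""

    def __init__(self):
        self.mounts = {}              # mount point -> FileSystem

    def mount(self, mount_point, fs):
        self.mounts[mount_point.rstrip("/") or "/"] = fs

    def read(self, path):
        candidates = [m for m in self.mounts
                      if path == m or path.startswith(m.rstrip("/") + "/")]
        if not candidates:
            raise FileNotFoundError(path)
        best = max(candidates, key=len)
        # Strip the mount point so the backend sees a path relative to its own root.
        relative = path[len(best.rstrip("/")):] or "/"
        return self.mounts[best].read_file(relative)
```

The caller only ever uses ToyVFS.read; which backend answers depends on where the path is mounted.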

3. Build VFS

Building VFS is the process of loading an actual file system, that is, what happens when mount is called. The following uses an ext2 file system as an example.

The ext2 disk structure used here is simplified and serves only to describe the basic process of building VFS from it.

The general form of the mount command is: mount /dev/sdb1 /mnt/mysdb1

/dev/sdb1 is the device name and /mnt/mysdb1 is the mount point.

The basic structures of VFS are the dentry structure and the inode structure.

A dentry represents one point in the directory tree, which can be a directory or a file.

An inode represents a file on the disk; inodes and disk files correspond one to one.

Inodes and dentries do not necessarily correspond one to one: one inode may correspond to multiple dentries (hard links).

During mount, Linux first finds the superblock of the disk partition, then parses the inode table and file data on the disk to construct its own dentry list and inode list.

Note that VFS is in fact modeled on Ext, so the two are very similar (after all, Ext is the native Linux file system).

Take inodes as an example. Both Ext and VFS call their file management structure an inode, but the two are different: the Ext inode lives on disk, while the VFS inode lives in memory. Some members of the Ext inode, such as the reference count, serve no purpose on disk and are kept only to stay consistent with the VFS inode. As a result, when a VFS inode is constructed from an Ext inode, the members do not need to be assigned one by one; a single memory copy is enough.

A non-Ext disk is not so lucky, so mounting a non-Ext disk is slower.

4. VFS Structure

After VFS is built, the next step is to map the directory model from section 1 onto the VFS structures.

As mentioned above, VFS consists mainly of dentries and inodes. Dentries maintain the directory structure of VFS: each dentry represents one item listed by ls (every directory and every file has a dentry). An inode is a file node and corresponds one to one with a file. In Linux a directory is also a file, so the dentry of a directory also corresponds to an inode.

[Figure: structure of the section 1 directory model in VFS]

 

5. Dentry cache

Each file must have one inode and at least one dentry. Suppose we have an 80 GB hard disk filled with empty files. How much memory would it take to rebuild all of it in VFS?

A file occupies at least one block (usually 4 KB). If we assume that a dentry and an inode together need 100 bytes, the dentries and inodes take about 1/40 of the disk's size. An 80 GB hard disk therefore needs about 2 GB of memory. I recently started switching to 1 TB hard disks, which would need about 25 GB of memory just to hold the inodes and dentries. Few computers can afford that.
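The estimate above can be checked with a few lines of arithmetic. The 4 KB block and the 100 bytes per dentry plus inode are the text's own assumptions; the 80 GB disk size is inferred here from the 1/40 ratio and the 2 GB result, and decimal gigabytes are assumed.

```python
BLOCK_SIZE = 4 * 1024        # bytes occupied by each file (one 4 KB block)
METADATA_PER_FILE = 100      # assumed bytes for one dentry plus one inode
GB = 1000 ** 3               # decimal gigabyte (an assumption of this sketch)


def metadata_bytes(disk_bytes):
    """Memory needed to hold a dentry and an inode for every file on a full disk."""
    files = disk_bytes // BLOCK_SIZE
    return files * METADATA_PER_FILE


if __name__ == "__main__":
    for disk_gb in (80, 1000):
        needed = metadata_bytes(disk_gb * GB) / GB
        print(f"{disk_gb} GB disk needs about {needed:.1f} GB for dentries and inodes")
```

The ratio is 100 / 4096, a little under 1/40, which is why the results land slightly below 2 GB and 25 GB.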

To avoid resource waste, VFS adopts the dentry cache design.

When a user runs ls on a directory or opens a file, VFS creates a dentry and an inode for each directory entry and file involved, that is, it creates them on demand. VFS also maintains an LRU (Least Recently Used) list. When Linux decides that VFS is using too many resources, VFS releases the dentries and inodes that have gone unused the longest.

Note that this release only concerns memory usage. From the Linux point of view, the dentries and inodes always belong to VFS; the only difference is whether VFS has currently read them into memory. For Ext2/3 file systems, building a dentry and inode is very simple, but for other file systems it is much slower.

With the dentry cache in mind, we can understand why there are the two ways of locating a file described below.
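A toy version of "create on demand, release the least recently used" might look like the sketch below. ToyDentryCache and its fixed capacity are assumptions for illustration only; the kernel shrinks its cache under memory pressure rather than at a fixed entry count.

```python
from collections import OrderedDict


class ToyDentryCache:
    """Hypothetical LRU cache mapping a path to an inode number."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()            # oldest entry first

    def lookup(self, path, load_from_disk):
        if path in self.entries:
            self.entries.move_to_end(path)      # mark as most recently used
            return self.entries[path]
        ino = load_from_disk(path)              # not cached: build it on demand
        self.entries[path] = ino
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)    # drop the least recently used entry
        return ino
```

A lookup that hits the cache never calls load_from_disk; an evicted path has to be loaded again on its next use.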

6. Locate the file without dentry

Because of the dentry cache described above, VFS cannot guarantee that a dentry and inode are available at any given time. The following describes how to locate a file when its dentry and inode are not in memory.

To simplify the problem, assume the dentry of dir has already been found (how it is found is explained later).

First, inode0 is found through dentry0, the dentry of dir. With the inode, the contents of the directory can be read. A directory stores the list of its subdirectories and files, with the name and inode number of each. This is the information the ls command shows; ls -i also prints the inode number of each entry.

> ls -i

975248 subdir0 975247 subdir1 975251 file0

 

Then inode2 is rebuilt from the inode number of subdir0, and the dentry of subdir0, dentry1, is rebuilt from the file data (a directory is also a file) and inode2.

> ls -i

975311 file1 975312 file2

 

Then inode4 is rebuilt from the inode number of file1, and the dentry of file1 is rebuilt from the file data and inode4.

Finally, the file can be accessed through inode4.

Note: the inode number of a file is fixed, but the inode structure in memory has to be rebuilt.
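The walk described in this section can be imitated from user space: read a directory, find the name, note its inode number, and descend. The helper name walk_by_inode is hypothetical; os.scandir and DirEntry.inode() are standard library calls. This is only a sketch of the idea, not how the kernel implements lookup.

```python
import os


def walk_by_inode(start_dir, relative_path):
    """Resolve relative_path one component at a time, recording each inode number."""
    current = start_dir
    trail = []
    for name in relative_path.split("/"):
        with os.scandir(current) as entries:            # read the directory "file"
            match = next((e for e in entries if e.name == name), None)
        if match is None:
            raise FileNotFoundError(os.path.join(current, name))
        trail.append((name, match.inode()))             # the number ls -i would print
        current = match.path
    return trail
```

For the example directory, walk_by_inode("dir", "subdir0/file1") would return the names subdir0 and file1 paired with their inode numbers.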

7. Locate the file when dentry is available

Once the dentries are in the dentry cache, the next access is much more convenient.

A key member of the dentry is d_subdirs, which holds the list of entries one level down and is used to locate files quickly.

First, the dentry named "subdir0" is looked up in the d_subdirs of dentry0, the dentry of dir, which yields dentry1.

Next, the dentry named "file1" is looked up in dentry1, which yields the dentry of file1.

Finally, inode4, the inode of file1, is obtained from the dentry of file1.

Compared with the case where no dentries are cached, this is much simpler.
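With the dentries already in memory, the lookup is just a walk over in-memory links. The sketch below uses a hypothetical ToyDentry class whose d_subdirs is a dictionary for brevity (the kernel keeps a linked list, and also a global hash table for lookups); the inode labels follow the text.

```python
class ToyDentry:
    """Hypothetical in-memory dentry: a name, an inode, and the children below it."""

    def __init__(self, name, inode):
        self.d_name = name
        self.d_inode = inode
        self.d_subdirs = {}                     # child name -> ToyDentry

    def add(self, child):
        self.d_subdirs[child.d_name] = child
        return child


def cached_lookup(root, path):
    """Follow d_subdirs from root; a KeyError means the dentry is not cached."""
    dentry = root
    for name in path.split("/"):
        dentry = dentry.d_subdirs[name]
    return dentry.d_inode


# Build the cached part of the example: dir -> subdir0 -> file1
dentry0 = ToyDentry("dir", "inode0")
dentry1 = dentry0.add(ToyDentry("subdir0", "inode2"))
dentry1.add(ToyDentry("file1", "inode4"))
```

No directory data is read from disk here, which is the point of keeping dentries cached.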

 

8. Symbolic link

The command for creating a symbolic link is: ln -s source_file target_file

A symbolic link in Linux is similar to a shortcut in Windows. In this example, symlink1 is a symbolic link pointing to file3. symlink1 is itself a file, so it has its own inode. What the symbolic link actually stores is the path of the source file, relative or absolute as given when the link was created.

Most file operations act directly on the target that the symbolic link points to. For example, open("symlink1") actually opens file3.

What happens if file3 no longer exists? open still tries to open the file at the path stored in symlink1, but since file3 does not exist, it fails with an error indicating that the file does not exist.
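This behaviour is easy to observe from user space. The sketch below works in a temporary directory and reuses the names file3 and symlink1 from the text; the function name symlink_demo is made up for the example.

```python
import os
import tempfile


def symlink_demo():
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "file3")
        link = os.path.join(d, "symlink1")
        with open(target, "w") as f:
            f.write("data")
        os.symlink("file3", link)                   # the link stores the path string "file3"
        stored = os.readlink(link)                  # read the stored path itself
        with open(link) as f:                       # open follows the link to file3
            content = f.read()
        own_inode = os.lstat(link).st_ino != os.stat(target).st_ino
        os.remove(target)                           # file3 is gone, symlink1 remains
        try:
            open(link).close()
            error = None
        except FileNotFoundError as exc:
            error = type(exc).__name__
        return stored, content, own_inode, error, os.path.islink(link)
```

The link keeps its own inode and its stored path even after the target is deleted; only following it fails.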


9. Hard link

In addition to symbolic links, Linux also has hard links.

Creating a hard link in effect makes a copy of the dentry; all the copies point to the same inode. When write is used to change the content of file1, the content of hardlink1 changes as well, because they are in fact the same file.

In this example, hardlink1 is a hard link to file1, and both point to the same inode, inode1. inode1 has a counter that records how many dentries point to it. Deleting one dentry does not by itself delete inode1; inode1 is deleted only when all dentries pointing to it have been deleted.

In a sense, all dentry items are hard links.
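The shared inode and its link counter can be seen with os.link and os.stat. The sketch below uses the names file1 and hardlink1 from the text in a temporary directory; hardlink_demo is a hypothetical helper name, and it assumes the temporary directory's file system supports hard links.

```python
import os
import tempfile


def hardlink_demo():
    with tempfile.TemporaryDirectory() as d:
        file1 = os.path.join(d, "file1")
        hardlink1 = os.path.join(d, "hardlink1")
        with open(file1, "w") as f:
            f.write("old")
        os.link(file1, hardlink1)                       # second name for the same inode
        same_inode = os.stat(file1).st_ino == os.stat(hardlink1).st_ino
        links_before = os.stat(file1).st_nlink          # counter of names on the inode
        with open(file1, "w") as f:                     # rewrite through one name
            f.write("new")
        with open(hardlink1) as f:                      # read through the other name
            seen = f.read()
        os.unlink(file1)                                # remove one name only
        links_after = os.stat(hardlink1).st_nlink
        with open(hardlink1) as f:
            survived = f.read()
        return same_inode, links_before, seen, links_after, survived
```

Removing file1 lowers the counter but leaves the data reachable through hardlink1.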

 

10. Process management of files

The process control block task_struct contains two variables related to files: fs and files.

fs stores the root and pwd pointers, which point to dentries. When the user specifies a path, an absolute path is resolved starting from root and a relative path starting from pwd. (The root of a process is not necessarily the root directory of the file system. For example, the root of an ftp process is usually not the file system root, so that it can only access content under the ftp directory.)

files holds the list of file objects, one for each open file. When a process opens a file, it constructs a file object and associates it with the inode through f_inode. When the file is closed, the process releases the corresponding file object. f_mode in the file object is the access mode chosen when the file was opened, and f_pos is the read/write position. When a file is opened several times, a new file object is created each time, and each has its own f_mode and f_pos.
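The role of pwd in resolving relative paths can be observed by changing the working directory. In this sketch the same relative name, file1, is looked up from two different working directories; relative_lookup_demo is a hypothetical helper, and the directory names follow the model from section 1.

```python
import os
import tempfile


def relative_lookup_demo():
    original = os.getcwd()
    with tempfile.TemporaryDirectory() as d:
        os.mkdir(os.path.join(d, "subdir0"))
        with open(os.path.join(d, "subdir0", "file1"), "w") as f:
            f.write("x")
        try:
            os.chdir(d)                               # pwd is now dir
            found_from_dir = os.path.exists("file1")
            os.chdir("subdir0")                       # pwd is now dir/subdir0
            found_from_subdir = os.path.exists("file1")
        finally:
            os.chdir(original)                        # restore the caller's pwd
        return found_from_dir, found_from_subdir
```

The relative name resolves only when pwd is the directory that actually contains it.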



11. Open process

Opening a file involves a series of changes to these structures, described step by step below.

Start from the file management structures of a process that has already opened two files; we now open a new file.

Step 1: Find the file

Using the methods described above, locate the file and obtain its inode.

Step 2: Create a file object

Create a new file object, add it to the file object list, and point it to the inode.

Step 3: Create a file descriptor

The file descriptors are the fd_array maintained in files in the process control block task_struct. Because it is an array, the space for the descriptors is allocated in advance. Here we only need to associate an idle file descriptor with the file object. The index of the descriptor in the array is the fd returned by open.
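That an fd is simply an index into a per-process array shows up in how numbers are reused: open returns the lowest idle slot. The sketch below, with the hypothetical helper fd_slot_demo, opens a temporary file twice, closes the first descriptor, and opens again. It assumes a single-threaded program so no other code opens or closes descriptors in between.

```python
import os
import tempfile


def fd_slot_demo():
    with tempfile.NamedTemporaryFile() as tmp:
        fd_a = os.open(tmp.name, os.O_RDONLY)     # takes the lowest idle slot
        fd_b = os.open(tmp.name, os.O_RDONLY)     # takes the next idle slot
        os.close(fd_a)                            # slot fd_a becomes idle again
        fd_c = os.open(tmp.name, os.O_RDONLY)     # reuses the slot fd_a had
        os.close(fd_b)
        os.close(fd_c)
        return fd_a, fd_b, fd_c
```

The third open gets the same number as the first because that slot was the lowest one free at the time.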

12. Open and dup

The same file can be opened multiple times. Each open creates a new file descriptor and a new file object, both of which end up pointing to the inode of the same file. For example, if fd1 already refers to a file and we open that file again, the newly created file object and the file object behind fd1 point to the same inode.

Linux also provides the dup function for copying a file descriptor. dup does not create a new file object, so the new file descriptor and the original one point to the same file object. For example, if fd2 is obtained from dup(fd1), then fd2 and fd1 point to the same file object.

Because each open produces a new file object, the access mode, read/write position (f_pos), and other information of two opens are independent. After dup copies a file descriptor there is no independent file object, so changing the attributes or read/write position through one fd changes them for the other as well.
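The difference between a second open and a dup can be checked by reading through one descriptor and asking the others for their position. In this sketch, open_vs_dup_demo is a hypothetical helper; os.open, os.dup, os.read, and os.lseek are the standard library wrappers around the corresponding system calls.

```python
import os
import tempfile


def open_vs_dup_demo():
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"0123456789")
        tmp.flush()
        fd1 = os.open(tmp.name, os.O_RDONLY)
        fd_open = os.open(tmp.name, os.O_RDONLY)      # second open: a new file object
        fd_dup = os.dup(fd1)                          # dup: shares fd1's file object
        os.read(fd1, 4)                               # moves the position behind fd1
        pos_open = os.lseek(fd_open, 0, os.SEEK_CUR)  # query position without moving it
        pos_dup = os.lseek(fd_dup, 0, os.SEEK_CUR)
        for fd in (fd1, fd_open, fd_dup):
            os.close(fd)
        return pos_open, pos_dup
```

The separately opened descriptor is still at position 0, while the duplicated one has moved together with fd1.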

13. Effect of fork on file opening

dup behaves much like forking a child process.

[Figure: file structure of an existing parent process]


After fork, no new file object is created either. Therefore, when the position of fd1 moves in the parent process (for example through a read or write), fd1 in the child process is affected as well. In other words, the list of open files is not part of the process and is not copied. It is best thought of as a global resource list, while each process maintains a table of pointers into it, the fd table. fork copies only the fd table, not the list of open files.
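The shared file object after fork can be observed on a Unix system: a read performed by the child moves the position the parent sees. The helper name fork_shares_offset_demo is hypothetical, and the sketch assumes os.fork is available (it is not on Windows).

```python
import os
import tempfile


def fork_shares_offset_demo():
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"0123456789")
        tmp.flush()
        fd1 = os.open(tmp.name, os.O_RDONLY)
        pid = os.fork()
        if pid == 0:                                  # child process
            try:
                os.read(fd1, 6)                       # moves the shared position
            finally:
                os._exit(0)                           # leave without running parent code
        os.waitpid(pid, 0)                            # parent waits for the child
        pos_in_parent = os.lseek(fd1, 0, os.SEEK_CUR)
        os.close(fd1)
        return pos_in_parent
```

The parent never reads, yet its descriptor reports the position the child left behind.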



14. File operation functions

Through the above analysis, you can have a clearer understanding of the scope and usage of each function. Common file operations are listed below:

| Function name | Target object | Description |
|---|---|---|
| creat | dentry, inode | Creating a file creates a new dentry and a new inode. |
| open | file object | Creates a file object. If the file does not exist and the O_CREAT flag is given, the file is created as with creat. |
| close | file object | Releases the file object, but does not delete the file. |
| stat/lstat | inode | Reads the inode content. If the target is a symbolic link, stat reads the file that the symbolic link points to; lstat reads the symbolic link file itself. |
| chmod | inode | Changes the permission bits (i_mode) in the inode. |
| chown/lchown | inode | Changes the owner and group (i_uid and i_gid) in the inode. |
| truncate | inode | Changes the file length. |
| read | file object | Reading a file changes f_pos in the file object. |
| write | file object, inode | Writing a file changes f_pos in the file object, changes the file content, and updates the modification time. |
| dup | file descriptor | Creates a new file descriptor pointing to the same file object; no new file object is created. |
| lseek | file object | Changes f_pos in the file object. |
| link | dentry | Creates a new dentry pointing to the same inode. |
| unlink | dentry | Deletes a dentry. If no other dentry uses the inode that the dentry points to, the inode and the disk file are deleted as well. |
| rename | dentry | Modifies d_name in the dentry. |
| readlink | - | read cannot read the content of a symbolic link file; readlink must be used instead. |
| symlink | dentry, inode | Similar to creat, but the created file is a symbolic link. |

Note: disk files correspond one to one with inodes, so disk files are not listed separately in the table.
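One detail the unlink entry leaves implicit is that an open file object also keeps the inode alive: after the last name is removed, the data stays readable through an already open descriptor until it is closed. The sketch below shows this on a Unix system; unlink_while_open_demo is a hypothetical helper name.

```python
import os
import tempfile


def unlink_while_open_demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "file0")
        with open(path, "w") as f:
            f.write("still here")
        fd = os.open(path, os.O_RDONLY)       # file object now references the inode
        os.unlink(path)                       # remove the only dentry
        name_gone = not os.path.exists(path)
        data = os.read(fd, 100)               # the inode and its data are still reachable
        os.close(fd)                          # last reference dropped; storage can be freed
        return name_gone, data
```

So unlink removes a name, while the inode itself goes away only once neither a dentry nor an open file refers to it.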


References:
Advanced Programming in the UNIX Environment (3rd edition), W. Richard Stevens & Stephen A. Rago
Understanding the Linux Kernel (3rd edition), Daniel P. Bovet & Marco Cesati


What is a virtual file system? What file systems are commonly used in windows? What file systems does Linux use?

Virtual File Systems
The Virtual File System (VFS) was created by Sun Microsystems when it defined the Network File System (NFS), a distributed file system for network environments. VFS is an interface that allows the operating system to use different file systems.
A Virtual File System (VFS) is an interface layer between the physical file systems and the services that use them. It abstracts away the details of each file system in Linux, so that the different file systems look the same to the Linux kernel and to other processes. Strictly speaking, VFS is not an actual file system. It exists only in memory and not on any external storage. VFS is set up when the system starts and disappears when the system shuts down.
VFS makes it possible for Linux to mount and support many different types of file systems at the same time. VFS provides common interfaces to the specific file systems, such as the superblock, the inode, and the file operation function entry points. The details of the actual file systems are handled uniformly behind these common VFS interfaces and are transparent to the kernel and to user processes.
The functions of VFS include recording the available file system types, associating devices with their file systems, handling common file-oriented operations, and, for operations that involve a specific file system, mapping them to the physical file system that controls the relevant files, directories, and inodes.
When a process issues a file-oriented system call, the kernel calls the corresponding function in VFS, which handles the operations that do not depend on the physical structure and redirects the call to the corresponding function in the real file system, which handles the operations that do.
