The Linux virtual file system (VFS) has four main objects:
1) Super block (super_block)
2) Index node (inode)
3) Directory entry (dentry)
4) File object (file)
A process reaches these objects in the following order when it operates on a file:
starting from its task_struct it gets the files_struct, uses the file descriptor (int fd) as an index into the struct file **fd array to obtain the corresponding file object, then gets the directory entry object (dentry), and finally gets the index node object (inode). The inode carries the operations that can be performed on the file; these are supplied by the specific file system that the super block describes.
First, the super block:
Super block object (super_block): stores the control information of a mounted file system (file system status, file system type, block size, number of blocks, number of index nodes, dirty flag, operation methods). It represents one mounted file system: each time an actual file system is mounted, the kernel reads the control information from a fixed location on the disk (the on-disk super block) and uses it to fill in the in-memory super block object. Mount instances and super block objects correspond one to one. The super block records the file system type it belongs to in its s_type field. Even when two file systems of the same type (file_system_type) are mounted, there are two super blocks, each with a copy on disk and a copy in memory.
Main super block methods: the method set mainly covers operations on inodes and operations on the super_block itself.
alloc_inode: allocates and initializes an index node object
read_inode: reads an index node from disk and fills in the in-memory index node object
write_inode: writes the given index node to disk (this is the point at which the metadata of a newly created file actually reaches the disk)
write_super: writes the super block object back to disk
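The method set is a common C idiom: a structure of function pointers that each file system fills in, so that generic code can call the method without knowing which file system it is dealing with. Here is a minimal user-space sketch of that idea. All toy_ names and the two toy file systems are hypothetical stand-ins, not kernel code, and only one method slot is modeled.

```c
#include <assert.h>
#include <stddef.h>

struct toy_super_block;

/* Simplified stand-in for struct super_operations: one slot per method. */
struct toy_super_operations {
    int (*write_super)(struct toy_super_block *sb);
};

struct toy_super_block {
    int s_dirt;                                /* nonzero when modified */
    int writes;                                /* write-back counter, demo only */
    const struct toy_super_operations *s_op;   /* filled in by the file system */
};

/* Hypothetical file system that supports writing its super block back. */
int toyfs_write_super(struct toy_super_block *sb)
{
    sb->writes++;
    sb->s_dirt = 0;
    return 0;
}

const struct toy_super_operations toyfs_sops = {
    .write_super = toyfs_write_super,
};

/* Hypothetical read-only file system: it leaves the slot empty. */
const struct toy_super_operations rofs_sops = {
    .write_super = NULL,
};

/* Generic layer: dispatch through s_op without knowing the file system.
 * Returns 0 if nothing had to be done or the write succeeded, -1 if the
 * file system does not provide the method. */
int toy_sync_super(struct toy_super_block *sb)
{
    assert(sb != NULL);
    if (!sb->s_dirt)
        return 0;
    if (sb->s_op == NULL || sb->s_op->write_super == NULL)
        return -1;
    return sb->s_op->write_super(sb);
}
```

Calling toy_sync_super() on a dirty toyfs super block clears s_dirt through toyfs_write_super(), while the same call on the read-only variant reports that the method is missing.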
A super block corresponds to one file system (an actual mounted file system such as ext2, not the VFS itself). A file system manages the on-disk format of files and the operations on them, and different disk partitions can hold different file systems. Each independent file system therefore has its own super block, which saves the file system type, size, state, and so on.
("File system" and "file system type" are not the same! One file system type can cover many file systems, that is, many super_block objects.)
Since different file systems have different super blocks, the operations on those super blocks must differ as well, so the super_block structure holds pointers to abstract operation tables (for example, struct super_operations). The fields of super_block are:
s_list: links this super block into the list of all super blocks. It is a struct list_head, a familiar structure that holds just the prev and next pointers.
The kernel handles such structures with care (the same is true in the network protocol stack): it chains all super_block objects together through a small embedded structure rather than through super_block itself, because super_block is large and working with whole structures would be inefficient. The link is just
struct list_head
{
struct list_head *next;
struct list_head *prev;
};
Each super_block embeds one of these as s_list. While traversing the list the kernel holds a pointer to an s_list field and recovers the enclosing super_block with list_entry(), which subtracts the offset of s_list within the structure.
When s_list is the first field that offset is zero, so the pointer to s_list is also the pointer to the super_block. This is quick and convenient, but the technique itself works wherever the field is placed.
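The step from an embedded list_head back to the enclosing structure can be sketched in user space as follows. The macro mirrors the idea behind the kernel's list_entry(), written here with offsetof. The toy_super_block type, find_super(), and the magic values used with it are hypothetical. s_list is deliberately not the first field, to show that the offset arithmetic does not depend on its position.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

/* Same idea as the kernel's list_entry(): step back from the embedded
 * list_head to the start of the structure that contains it. */
#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Toy super block; s_list is deliberately NOT the first field. */
struct toy_super_block {
    unsigned long s_magic;
    struct list_head s_list;
};

/* An empty circular list is a head that points to itself. */
void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert entry just before head, i.e. at the tail of the circular list. */
void list_add_tail(struct list_head *entry, struct list_head *head)
{
    assert(entry != NULL && head != NULL);
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

/* Walk the list and return the super block with the given magic, or NULL. */
struct toy_super_block *find_super(struct list_head *head, unsigned long magic)
{
    struct list_head *p;

    for (p = head->next; p != head; p = p->next) {
        struct toy_super_block *sb =
            list_entry(p, struct toy_super_block, s_list);
        if (sb->s_magic == magic)
            return sb;
    }
    return NULL;
}
```

Only the small list_head nodes are touched while walking; the full structure is recovered at the moment it is actually needed.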
s_dev: identifier of the block device that holds this file system. For example, for /dev/hda1 the device identifier is 0x301 (major number 3, minor number 1)
s_blocksize: block size of the file system, in bytes
s_blocksize_bits: the block size expressed as a number of bits (log2 of s_blocksize); for example, 512 bytes is 9 bits
s_dirt: dirty flag, set when the super block has been modified
s_maxbytes: maximum allowed file size, in bytes
struct file_system_type *s_type: file system type (which type this file system is, for example ext2 or FAT32)
Keep "file system" and "file system type" apart! One file system type can cover many file systems, that is, many super_block objects; see s_instances below.
struct super_operations *s_op: points to the file system's table of super block operations
struct dquot_operations *dq_op: points to the file system's table of quota operations
struct quotactl_ops *s_qcop: methods for configuring disk quotas, used to handle requests from user space
s_flags: mount flags
s_magic: magic number that distinguishes this file system from others
s_root: directory entry of the directory on which this file system is mounted
s_umount: read/write semaphore that synchronizes access to the super block
s_lock: lock; while it is held, other processes cannot operate on the super block
s_count: usage count of the super block
s_active: reference count
s_dirty: list of modified inodes. A file system has many inodes; when the content of one is modified, it is first recorded on this list and later written back to disk.
s_locked_inodes: list of inodes that are waiting to be synchronized
s_files: list of all open files of this file system; these file objects belong to the processes that opened them
s_bdev: the block device on which the file system resides
u: a union that holds the super block information private to the specific file system
s_instances: links together all super_block objects of the same file system type
s_dquot: disk quota options
Second, the index node (inode):
Index node object (inode): stores the information about a file or directory (the inode and the file's contents are two different things). It holds the file size, owner, timestamps, disk location, file operation methods, dirty flag, and so on. It represents one real file and is saved on disk. When a file is first accessed, the kernel builds the corresponding index node object in memory, which gives the kernel all the information it needs to operate on the file.
The main inode methods include creating an inode, creating or removing a directory, creating a symbolic link, and so on.
int create(struct inode *dir, struct dentry *dentry, int mode): called by the creat() and open() system calls to create a new index node for the dentry object.
struct dentry *lookup(struct inode *dir, struct dentry *dentry): searches the given directory for the index node that matches the name in the given directory entry.
mkdir(dir, dentry, mode): called by the mkdir() system call to create a new directory
What the inode saves is information about the actual data, called "metadata" (the description of the file's attributes). For example: file size, device identifier, user identifier, group identifier, file mode, extended attributes, access and modification timestamps, link count, pointers to the disk blocks that store the content, file type, and so on.
(Note how the data is divided: metadata + the data itself.)
Also note: there are two kinds of inode, the VFS inode and the inode of the specific file system. The former is in memory and the latter is on disk. Each time a file is used, its on-disk inode is read and used to fill in the in-memory inode; only then is the on-disk inode actually in use.
Note how inodes are allocated: each inode is typically 128 or 256 bytes. The total number of inodes is fixed when the file system is formatted (some modern file systems can change it dynamically), usually one inode per 2 KB of space. Few files are smaller than 2 KB, so with one inode reserved per 2 KB the inodes generally do not run out. A file system therefore starts with a default number of inodes, and some file systems adjust that number later according to actual need.
Note the inode number: the inode number is unique within a file system and identifies one file. Inside Linux, files are accessed through their inode numbers; file names exist only for the convenience of users. When we open a file, the system first finds the inode number that corresponds to the file name, then uses the inode number to get the inode information, and finally uses the inode to locate the file's data blocks. Only then can the file data be processed.
Relationship between inode and file: when a file is created, an inode is allocated for it. An inode corresponds to exactly one actual file, and a file has exactly one inode (although several names can refer to that inode through hard links). The maximum number of inodes is therefore the maximum number of files.
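The inode number is visible from user space through stat(), which makes the relationship between names and inodes easy to check. The sketch below creates a file, adds a second name for it with link(), and compares the st_ino values seen through both names. The function name and the paths passed to it are hypothetical, and it assumes a POSIX system whose file system supports hard links.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create `path`, add a hard link `alias`, and report whether both names
 * resolve to the same inode. Returns 1 if they do, 0 if they do not, and
 * -1 on error. On success *nlink receives the link count seen via `alias`.
 * Both names are removed again before returning. */
int same_inode_after_link(const char *path, const char *alias,
                          unsigned long *nlink)
{
    struct stat a, b;
    FILE *f;
    int ret = -1;

    assert(path != NULL && alias != NULL && nlink != NULL);

    /* Start from a clean state; errors here are harmless. */
    unlink(path);
    unlink(alias);

    f = fopen(path, "w");
    if (f == NULL)
        return -1;
    fclose(f);

    if (link(path, alias) == 0 &&
        stat(path, &a) == 0 && stat(alias, &b) == 0) {
        *nlink = (unsigned long)b.st_nlink;
        /* An inode number is only unique within one device. */
        ret = (a.st_dev == b.st_dev && a.st_ino == b.st_ino);
    }

    unlink(alias);
    unlink(path);
    return ret;
}
```

Both names report the same inode number because a hard link adds a directory entry, not a second inode; the link count stored in the inode goes from 1 to 2.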
Some of the fields:
i_hash: links the inode into the inode hash table, described below
i_list: links the inode into one of the inode lists, described below
i_dentry: head of the list of directory entries that refer to this inode. Note that one inode can correspond to several dentries, because the same file can be reached through more than one hard link, each with its own dentry. This list links all the dentries associated with the inode.
i_dirty_buffers and i_dirty_data_buffers: dirty data buffers
i_ino: index node number, unique for each inode within a file system
i_count: reference count
i_dev: number of the device on which the file is stored
i_mode: file type and access permissions
i_nlink: number of hard links to this inode
i_uid: user ID of the file's owner
i_gid: group ID of the file
i_rdev: number of the device that the inode itself represents
Note the difference between i_dev and i_rdev: for an ordinary file stored on a disk, i_dev is the number of the disk that holds the file. For a special file that stands for a device, such as the disk itself (all devices are treated as files), i_rdev is the device number of that device.
i_size: size, in bytes, of the file that the inode represents
i_atime: time the file was last accessed
i_mtime: time the file's content was last modified
i_ctime: time the inode was last changed
i_blkbits: block size, as a number of bits
i_blksize: block size, in bytes
i_blocks: number of blocks the file occupies
i_version: version number
i_bytes: number of bytes used in the last block of the file
i_sem: semaphore used to synchronize operations on the inode
i_alloc_sem: protects I/O on the inode from interference by another operation
i_zombie: semaphore for a zombie inode
i_op: index node operations
i_fop: file operations
i_sb: pointer to the super block of the file system that the inode belongs to
i_wait: the inode's wait queue
i_flock: list of file locks
Note: address_space does not represent an address space in the usual sense; it describes the pages of a file in the page cache. Each file has one address_space, and an address_space together with an offset identifies one page in the page cache.
i_mapping: the address_space that page requests for this inode go to
i_data: the inode's own address_space, describing the pages read and written through this inode
i_dquot: disk quotas for the inode
About disk quotas: in a multitasking environment, disk usage limits for each user are necessary to keep things fair.
There are two kinds of disk quota: a block limit and an inode limit. All files in one file system use the same quota mechanism, so the quota operation functions
are simply placed in the super_block.
i_devices: device list; a linked list of devices that share the same driver
i_pipe: points to the pipe information (used if the file is a pipe)
i_bdev: points to the block device (used if the file is a block device file)
i_cdev: points to the character device (used if the file is a character device file)
i_dnotify_mask: directory notification event mask
i_dnotify: used for directory notifications
i_state: state flags of the index node: I_NEW, I_LOCK, I_FREEING
i_flags: mount flags of the index node
i_sock: true if the file is a socket
i_write_count: number of processes that have the file open for writing
i_attr_flags: file creation flags
i_generation: reserved
u: file-system-specific inode information
Note the four structures that manage inodes:
inode_unused: links the inodes that are not currently in use (through the i_list field)
inode_in_use: links the inodes that are currently in use (through the i_list field)
s_dirty in super_block: links all modified inodes of a file system; the list head is in super_block (linked through the i_list field)
inode_hashtable: to speed up inode lookup, the inodes in use and the dirty inodes are also placed in a hash table, inode_hashtable.
Different inodes can have the same hash value, so inodes with equal hash values are chained together through the i_hash field.
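The chaining of inodes with equal hash values can be sketched as a small user-space hash table. This is a simplification under stated assumptions: the bucket count and the hash function are hypothetical (the kernel uses its own hash function and table size), and i_hash is modeled as a plain next pointer instead of a list_head.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_HASH_BUCKETS 8   /* hypothetical size, chosen for the demo */

struct toy_inode {
    unsigned long i_ino;
    struct toy_inode *i_hash;   /* next inode in the same bucket */
};

/* One chain head per bucket; all NULL (empty) at start. */
struct toy_inode *toy_inode_hashtable[TOY_HASH_BUCKETS];

/* Hypothetical hash function: inode number modulo the bucket count. */
unsigned long toy_hash(unsigned long ino)
{
    return ino % TOY_HASH_BUCKETS;
}

/* Put the inode at the front of the chain for its bucket. */
void toy_insert_inode_hash(struct toy_inode *inode)
{
    unsigned long h;

    assert(inode != NULL);
    h = toy_hash(inode->i_ino);
    inode->i_hash = toy_inode_hashtable[h];
    toy_inode_hashtable[h] = inode;
}

/* Search only the one chain that the inode number hashes to. */
struct toy_inode *toy_find_inode(unsigned long ino)
{
    struct toy_inode *p;

    for (p = toy_inode_hashtable[toy_hash(ino)]; p != NULL; p = p->i_hash)
        if (p->i_ino == ino)
            return p;
    return NULL;
}
```

With eight buckets, inode numbers 3 and 11 land in the same bucket and are told apart by walking the i_hash chain; a lookup never has to scan the inodes of other buckets.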
Third, the directory entry (dentry):
Directory entry object (dentry): represents one directory entry (it holds the index node that the entry corresponds to, the list of subdirectories, the parent directory entry object, the links to its sibling directory entries, the use count, and cache flags) and is one component of a path (note: every component of a path also has an index node object). The object is stored in memory only. Directory entries were introduced mainly to make file lookup convenient. Every component of a path, whether a directory or an ordinary file, is a directory entry object. For example, in the path /home/source/test.c, the directories /, home, and source and the file test.c each correspond to one directory entry object. Unlike the previous two objects, the directory entry object has no corresponding on-disk data structure: the VFS builds directory entry objects one by one as it walks the path name, and caches them to speed up later lookups. Each directory entry object points to exactly one index node object, so it too represents a file, which can be an ordinary file, a directory file, and so on.
A directory entry object has three states: used, unused, and negative.
Used: a used directory entry corresponds to a valid index node (d_inode points to it) and has one or more users (d_count is positive). It is in use by the VFS and points to a valid index node, so it cannot be freed.
Unused: an unused directory entry also corresponds to a valid index node (d_inode points to it), but the VFS is not currently using it (d_count is 0). It still points to a valid object and is kept in the cache in case it is needed again. Because it is not destroyed prematurely, it does not have to be re-created when it is needed later, which makes path lookup faster. If memory has to be reclaimed, unused directory entries can be destroyed.
Negative: a negative directory entry has no valid index node (d_inode is NULL), either because the index node has been deleted or because the path was incorrect. The entry is kept so that later lookups of the same path can be answered quickly.
All three kinds of directory entry are kept in the directory entry cache, which is organized as a hash table. In addition, while a directory entry is cached and in use, its index node is kept in the cache as well.
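The three states follow from just two fields, d_inode and d_count, which the sketch below makes explicit. The toy_ types, the enum, and both functions are hypothetical and only model the rules stated above; they are not the kernel's code.

```c
#include <assert.h>
#include <stddef.h>

struct toy_inode {
    unsigned long i_ino;
};

struct toy_dentry {
    int d_count;                 /* number of current users */
    struct toy_inode *d_inode;   /* NULL for a negative dentry */
};

enum toy_dentry_state {
    TOY_DENTRY_USED,
    TOY_DENTRY_UNUSED,
    TOY_DENTRY_NEGATIVE
};

/* Derive the state from d_inode and d_count. */
enum toy_dentry_state toy_dentry_classify(const struct toy_dentry *d)
{
    assert(d != NULL);
    if (d->d_inode == NULL)
        return TOY_DENTRY_NEGATIVE;
    return d->d_count > 0 ? TOY_DENTRY_USED : TOY_DENTRY_UNUSED;
}

/* An entry may be dropped to reclaim memory only when nobody uses it. */
int toy_dentry_can_free(const struct toy_dentry *d)
{
    assert(d != NULL);
    return d->d_count == 0;
}
```

A cache built on these rules can keep unused and negative entries around for fast lookups and still give their memory back under pressure, while entries with users stay pinned.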
A directory entry describes a logical attribute of a file. It exists only in memory, with no counterpart on the actual disk; more precisely, it lives in the in-memory directory entry cache, which is designed to improve lookup performance. Note that both directories and the final file are directory entries, and all directory entries together form one large directory tree. For example, when the file /home/xxx/yyy.txt is opened, /, home, xxx, and yyy.txt are each a directory entry. During lookup the VFS goes down one level at a time, finding the inode for each directory entry and following it to the next entry until it reaches the final file.
Note: a directory is also a file (so it has a corresponding inode). Opening a directory is essentially opening the directory file.
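The level-by-level lookup can be sketched with a toy in-memory tree. This is only an illustration of walking a path one component at a time: the toy_ types and functions are hypothetical, children are kept in a small fixed array instead of the d_subdirs list, and there is no cache, no mount handling, and no "." or ".." processing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_CHILDREN 4   /* hypothetical limit, enough for the demo */

struct toy_dentry {
    const char *d_name;
    unsigned long ino;                               /* stand-in for d_inode */
    struct toy_dentry *d_parent;
    struct toy_dentry *children[TOY_MAX_CHILDREN];   /* stand-in for d_subdirs */
    int nr_children;
};

void toy_add_child(struct toy_dentry *parent, struct toy_dentry *child)
{
    assert(parent->nr_children < TOY_MAX_CHILDREN);
    child->d_parent = parent;
    parent->children[parent->nr_children++] = child;
}

/* Look up one component in a directory; len excludes any '/'. */
static struct toy_dentry *toy_lookup_one(struct toy_dentry *dir,
                                         const char *name, size_t len)
{
    int i;

    for (i = 0; i < dir->nr_children; i++) {
        const char *n = dir->children[i]->d_name;
        if (strlen(n) == len && memcmp(n, name, len) == 0)
            return dir->children[i];
    }
    return NULL;
}

/* Resolve an absolute path one component at a time, starting at root.
 * Returns the final dentry, or NULL if any component is missing. */
struct toy_dentry *toy_path_walk(struct toy_dentry *root, const char *path)
{
    struct toy_dentry *cur = root;

    while (cur != NULL && *path != '\0') {
        size_t len;

        while (*path == '/')          /* skip separators */
            path++;
        len = strcspn(path, "/");     /* length of the next component */
        if (len == 0)
            break;                    /* trailing slash or end of path */
        cur = toy_lookup_one(cur, path, len);
        path += len;
    }
    return cur;
}
```

For a tree holding /home/source/test.c, walking that path visits the entries for /, home, source, and test.c in turn, and a missing component ends the walk with NULL.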
Some of the fields:
d_count: reference count
d_flags: directory entry cache flags (dcache_unused, dcache_referenced, etc.)
d_inode: the inode associated with this directory entry
d_parent: directory entry of the parent directory
d_hash: the kernel manages dentries with dentry_hashtable, an array of list_head buckets; once a dentry has been created,
its d_hash field links it into the bucket for its hash value.
d_lru: list of unused directory entries, in least-recently-used order
d_child: links the directory entry into its parent's d_subdirs list
d_subdirs: head of the list of all children of this directory
d_alias: a valid dentry must be associated with an inode, but one inode can correspond to several dentries, because a file can have several hard links. Through this field the dentry is linked into the i_dentry list of its inode (as mentioned in the inode section).
d_mounted: number of file systems mounted on this directory! Note that more than one file system can be mounted on the same directory!
d_name: name of the directory entry
d_time: time at which the entry has to be revalidated! Note that a dentry is valid if the lookup operation succeeded, otherwise it is invalid.
d_op: directory entry operations
d_sb: super block of the file system that this directory entry belongs to
d_vfs_flags: additional flags
d_fsdata: file-system-private data
d_iname: storage for short file names
Some explanation: a valid dentry structure must have an inode structure, because a directory entry represents either a file or a directory, and a directory is really a file too. So as long as a dentry is valid, its d_inode pointer must point to an inode structure. An inode, however, can correspond to several
dentries, as noted twice above.
Note: the whole structure really is a tree.
Fourth, the file object (file):
File object: the in-memory representation of an open file (it holds the corresponding directory entry object, use count, access mode, current offset, operation methods, and so on). Its main purpose is to connect a process with a file on disk. It is created by sys_open() and destroyed by sys_close() once its last reference is gone. The relationship between a file object and the physical file is a bit like the relationship between a process and a program. Seen from user space, the VFS looks as if it consisted of file objects only; there is no need to care about super blocks, index nodes, or directory entries. Because several processes can open and use the same file at the same time, one file can have several file objects. A file object only represents an open file as one process sees it; it points to a directory entry object, which in turn points to an index node. The file objects for a file need not be unique, but the index node and directory entry object they lead to are.
File operation methods:
llseek: updates the offset
read, write, open, mmap, aio_read, fsync
files_struct: the set of file objects that a process has open. The file descriptor (an int) returned by open corresponds one to one with a file object: the descriptor is the index into the struct file **fd array in files_struct. The structure is reached from task_struct.
Note that a file object describes a file that a process has opened. Because a file can be opened by several processes, one file can have several file objects. But because the file itself is unique, its inode is unique, and so is its directory entry!
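That one file can have several file objects is observable from user space, because each file object has its own offset. The sketch below opens the same file twice and also duplicates the first descriptor with dup(); reading through the first descriptor moves the offset shared with its duplicate but not the offset of the second open. The function name and path are hypothetical, and a POSIX system is assumed.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Creates `path` with ten bytes, opens it twice, duplicates the first
 * descriptor, reads four bytes through the first descriptor, and reports
 * the resulting offset of each descriptor. The file is removed afterwards.
 * Returns 0 on success, -1 on error. */
int toy_offsets(const char *path, long *first, long *second, long *dup_of_first)
{
    char buf[4];
    int fd1, fd2, fd3;
    int ok = -1;
    int w = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

    assert(first != NULL && second != NULL && dup_of_first != NULL);
    if (w < 0)
        return -1;
    if (write(w, "0123456789", 10) != 10) {
        close(w);
        unlink(path);
        return -1;
    }
    close(w);

    fd1 = open(path, O_RDONLY);   /* first file object */
    fd2 = open(path, O_RDONLY);   /* second, independent file object */
    fd3 = dup(fd1);               /* second descriptor for the FIRST object */

    if (fd1 >= 0 && fd2 >= 0 && fd3 >= 0 &&
        read(fd1, buf, sizeof(buf)) == (ssize_t)sizeof(buf)) {
        /* lseek(fd, 0, SEEK_CUR) reports the offset without moving it. */
        *first = (long)lseek(fd1, 0, SEEK_CUR);
        *second = (long)lseek(fd2, 0, SEEK_CUR);
        *dup_of_first = (long)lseek(fd3, 0, SEEK_CUR);
        ok = 0;
    }

    if (fd1 >= 0) close(fd1);
    if (fd2 >= 0) close(fd2);
    if (fd3 >= 0) close(fd3);
    unlink(path);
    return ok;
}
```

After the four-byte read, the first descriptor and its duplicate both report offset 4, while the separately opened descriptor is still at 0: two open() calls give two file objects for one inode, and dup() gives two descriptors for one file object.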
The process actually manipulates files through file descriptors. Note that each open file has a number that gives the position of the next byte to be read or written, called the file position. In general, right after a file is opened the position is 0, the start of the file.
Some of the fields:
f_list: links all open files into a list! Note that all open files of one file system are linked into the s_files list of its super_block through this field!
f_dentry: the dentry associated with this file
f_vfsmnt: the mount point of the file system that contains this file
f_op: file operations; when a process opens the file, f_op is initialized from the i_fop field of the file's inode.
f_count: reference count
f_flags: flags specified when the file was opened
f_mode: access mode of the file
f_pos: current offset, relative to the beginning of the file
unsigned long f_reada, f_ramax, f_raend, f_ralen, f_rawin: read-ahead flag, maximum number of pages to read ahead, file position after the last read-ahead, number of bytes read ahead, and number of pages read ahead
f_owner: records a process ID so that a signal can be sent to that process when an event occurs
f_uid: user ID
f_gid: group ID
f_error: error code of a write operation
f_version: version number, incremented whenever f_pos changes
private_data: private data (used by file systems and drivers)
Some important fields in more detail:
First, f_flags, f_mode, and f_pos hold the control information for this process's current use of the file. This matters because a file can be opened by several processes at once, and each process operates on it independently of the others, so these three fields are important.
Second, the reference count f_count: when a process closes a file descriptor, the file is not necessarily closed. f_count is only decremented, and the file is really closed when f_count reaches 0. Operations such as dup and fork increase f_count; the details come later.
Third, f_op is also very important! It is the operation table that covers everything done to the file. For example, a read by the user eventually calls the read operation in file_operations, and the file_operations structure is not necessarily the same for different file systems. One important function in it is release. When the user calls close, the kernel decrements f_count, and release is called only when the count drops to 0, which is the point at which the file is really closed. This explains the statement above that closing a file really just decrements f_count.
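The interplay of f_count and release can be modeled in a few lines. This is a hypothetical user-space sketch, not the kernel's code: toy_fget() stands for what dup or fork do to the count, toy_fput() stands for what close does, and the released counter exists only so that the effect can be observed.

```c
#include <assert.h>
#include <stddef.h>

struct toy_file;

/* Simplified stand-in for struct file_operations: only release. */
struct toy_file_operations {
    void (*release)(struct toy_file *f);
};

struct toy_file {
    int f_count;                              /* reference count */
    int released;                             /* demo only: times release ran */
    const struct toy_file_operations *f_op;
};

/* Hypothetical release method: just records that it was called. */
void toy_release(struct toy_file *f)
{
    f->released++;
}

const struct toy_file_operations toy_fops = { toy_release };

/* Take another reference, as dup() or fork() would. */
void toy_fget(struct toy_file *f)
{
    assert(f != NULL);
    f->f_count++;
}

/* Drop one reference, as close() would. The release method runs only
 * when the last reference goes away. Returns the remaining count. */
int toy_fput(struct toy_file *f)
{
    assert(f != NULL && f->f_count > 0);
    if (--f->f_count == 0 && f->f_op != NULL && f->f_op->release != NULL)
        f->f_op->release(f);
    return f->f_count;
}
```

With two references, the first toy_fput() only lowers the count; release runs on the second one, when the count reaches 0.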
Note: file objects that are "in use" and those that are "unused" are each managed on their own doubly linked list.
Note that the file structure above describes only one file. A process (user) can work with several files at once, so another structure is needed to manage all of them!
That structure is the per-process open file table, files_struct.
Some of the fields:
count: reference count
file_lock: lock that protects the following fields
max_fds: current maximum number of file objects
max_fdset: current maximum number of file descriptors
next_fd: the largest file descriptor allocated so far, plus 1
fd: pointer to the array of file object pointers. It normally points to the last field, fd_array; when the number of open files exceeds NR_OPEN_DEFAULT, a larger array is allocated and fd is pointed at the new array!
close_on_exec: pointer to the set of file descriptors to be closed on exec()
open_fds: pointer to the set of open file descriptors
close_on_exec_init: initial set of file descriptors to be closed on exec()
open_fds_init: initial set of open file descriptors
fd_array: initial array of file object pointers
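How fd starts out pointing at the built-in fd_array and moves to a larger array later can be sketched as follows. This is a hypothetical user-space model: the tiny TOY_NR_OPEN_DEFAULT value, the doubling policy, and the linear search for a free slot are assumptions made for the demo, and the bitmaps and locking of the real structure are left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_NR_OPEN_DEFAULT 4   /* hypothetical, small on purpose */

struct toy_file {
    int dummy;
};

struct toy_files_struct {
    int max_fds;                                     /* capacity of fd[] */
    struct toy_file **fd;                            /* current table */
    struct toy_file *fd_array[TOY_NR_OPEN_DEFAULT];  /* built-in initial table */
};

void toy_files_init(struct toy_files_struct *files)
{
    memset(files->fd_array, 0, sizeof(files->fd_array));
    files->fd = files->fd_array;      /* start with the built-in array */
    files->max_fds = TOY_NR_OPEN_DEFAULT;
}

/* Double the table: old slots are copied, new slots start empty. */
static int toy_expand(struct toy_files_struct *files)
{
    int new_max = files->max_fds * 2;
    struct toy_file **new_fd = calloc((size_t)new_max, sizeof(*new_fd));

    if (new_fd == NULL)
        return -1;
    memcpy(new_fd, files->fd, (size_t)files->max_fds * sizeof(*new_fd));
    if (files->fd != files->fd_array)
        free(files->fd);              /* only heap tables are freed */
    files->fd = new_fd;
    files->max_fds = new_max;
    return 0;
}

/* Install file in the lowest free slot; the slot index is the descriptor. */
int toy_fd_install(struct toy_files_struct *files, struct toy_file *file)
{
    int fd;

    assert(file != NULL);
    for (fd = 0; fd < files->max_fds; fd++)
        if (files->fd[fd] == NULL)
            break;
    if (fd == files->max_fds && toy_expand(files) != 0)
        return -1;
    files->fd[fd] = file;
    return fd;
}

/* Free the slot so that the descriptor can be reused. */
void toy_fd_close(struct toy_files_struct *files, int fd)
{
    if (fd >= 0 && fd < files->max_fds)
        files->fd[fd] = NULL;
}

void toy_files_destroy(struct toy_files_struct *files)
{
    if (files->fd != files->fd_array)
        free(files->fd);
    files->fd = files->fd_array;
    files->max_fds = TOY_NR_OPEN_DEFAULT;
}
```

The first four descriptors fit in fd_array; the fifth triggers the switch to a heap array, and a closed descriptor is handed out again before any higher number.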
Note that file and files_struct above record the files associated with a process. Information about the process's own place in the file system is kept in the fs_struct structure.
Some of the fields:
count: reference count
lock: protecting lock
umask: default permission mask applied when a file is created
root: root directory of the process
pwd: current working directory of the process
altroot: alternative root directory set by the user
Note: at run time these three directories need not be in the same file system. For example, the root directory of a process is usually in the ext file system mounted on "/", the current working directory may be in a file system mounted on /etc, and the alternative root directory can be in yet another file system.
rootmnt, pwdmnt, altrootmnt: the mount points that correspond to the three directories above.
To sum up: a process learns which file objects it currently has open through the files field (a files_struct) of its task_struct, and what we usually call a file descriptor is really an index into that process's array of open file objects. The file object finds its dentry object through the f_dentry field, and the dentry object finds its index node through the d_inode field. Through the index node the super block can be reached, and with it the methods that finally operate on the file. Opening a file follows exactly this path, and this is what ties the file object to the actual physical file. One last important point: the file object's table of file operations is taken from the i_fop field of the index node, and i_fop is in turn set by the file system when it fills in the inode, for example in the read_inode method reached through struct super_operations *s_op.
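The chain of pointers described here can be written down as a short user-space sketch. The toy_ structures are hypothetical and keep only the one field that the walk needs; the fixed-size descriptor array is an assumption made to keep the example small.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_FDS 8   /* hypothetical fixed table size */

struct toy_inode  { unsigned long i_ino; };
struct toy_dentry { struct toy_inode *d_inode; };
struct toy_file   { struct toy_dentry *f_dentry; };

struct toy_files_struct { struct toy_file *fd[TOY_MAX_FDS]; };
struct toy_task_struct  { struct toy_files_struct *files; };

/* Follow task -> files -> fd[n] -> f_dentry -> d_inode.
 * Returns NULL if the descriptor is out of range or any link is missing. */
struct toy_inode *toy_fd_to_inode(const struct toy_task_struct *task, int fd)
{
    struct toy_file *file;

    if (task == NULL || task->files == NULL || fd < 0 || fd >= TOY_MAX_FDS)
        return NULL;
    file = task->files->fd[fd];        /* the descriptor is an array index */
    if (file == NULL || file->f_dentry == NULL)
        return NULL;
    return file->f_dentry->d_inode;
}
```

Each step dereferences exactly one of the fields named above, which is why an int descriptor is enough for a process to reach the inode of an open file.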
VFS data structures of the file system (super block, inode, dentry, file) (collected notes)