File systems within the Linux kernel

Source: Internet
Author: User

    • File descriptor

When we think of files and file systems, we instinctively picture them on disk. But a disk is only passive storage and cannot process files by itself. To process a file, its data must first be copied into memory, where the CPU can work on it; the processed data is written to memory first and then sent back to disk. So how does the operating system manage files while they are in memory? The answer is the runtime representation of the file system inside the kernel.
As we know, the process is the basic unit to which the operating system allocates resources, and files are processed within processes. For example, when you use vim to write code, the vim program becomes a process in the operating system, and the object it works on is a code file. The operating system manages each process through a PCB (process control block), which in the code is the task_struct structure. This structure has a pointer to a files_struct structure, which is called the file descriptor table. Its code is as follows:

struct files_struct {
    atomic_t count;                 /* number of processes sharing the table */
    rwlock_t file_lock;             /* protects all members below; nests inside tsk->alloc_lock */
    int max_fds;                    /* current maximum number of file objects */
    int max_fdset;                  /* current maximum number of file descriptors */
    int next_fd;                    /* allocated file descriptor plus 1 */
    struct file **fd;               /* pointer to the array of file object pointers */
    fd_set *close_on_exec;          /* file descriptors to close when exec() is executed */
    fd_set *open_fds;               /* pointer to the set of open file descriptors */
    fd_set close_on_exec_init;      /* initial set of file descriptors to close on exec() */
    fd_set open_fds_init;           /* initial set of file descriptors */
    struct file *fd_array[32];      /* initial array of file object pointers */
};

This table is private to each process; every process has one. It is also called the user open file table, to distinguish it from the system open file table. Our main concern is the last member, struct file *fd_array[32];. This is an array of pointers in which each element points to a file struct holding information about a file the process has opened. Because a process can open multiple files, it uses an array of pointers to keep track of them. Each open file has an entry in the array, and the index of that entry is the file descriptor. Typically, index 0 is the process's standard input, index 1 is its standard output, and index 2 is its standard error. These three files are open by default for each process, so if the process opens additional files, their file descriptors start at 3.
File descriptors are an important system resource. In principle the system could open as many file descriptors as memory allows, but in practice the kernel imposes limits. The system-wide maximum number of open files (the system-level limit) generally defaults to about 10% of system memory measured in kilobytes; you can view it with the sysctl -a | grep fs.file-max command. In addition, so that a single process cannot consume all file resources, the kernel sets a default maximum number of open files per process (the user-level limit). This default is generally 1024 and can be viewed with the ulimit -n command.
In Linux, a process accesses a file through its file descriptor (fd), which is simply an integer, rather than through the file name.
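The points above, that a descriptor is a small integer and that each process has a limit on how many it may hold, can be observed from user space. The following is only a minimal sketch, assuming a POSIX system where /dev/null exists; the helper names open_twice and fd_soft_limit are invented for this illustration and are not kernel or library functions.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/resource.h>
#include <unistd.h>

/* Open /dev/null twice and store the two descriptors in out[0] and out[1].
 * Returns 0 on success, -1 on failure. The caller closes the descriptors. */
int open_twice(int out[2])
{
    assert(out != NULL);
    out[0] = open("/dev/null", O_RDONLY);
    if (out[0] < 0)
        return -1;
    out[1] = open("/dev/null", O_RDONLY);
    if (out[1] < 0) {
        close(out[0]);
        return -1;
    }
    return 0;
}

/* Return the soft per-process limit on open descriptors (the value that
 * ulimit -n reports), or -1 on failure. */
long fd_soft_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    return (long)rl.rlim_cur;
}
```

In a process that has only its three standard streams open, open_twice would typically hand back 3 and 4, since open returns the lowest unused descriptor; in other environments the exact numbers may differ, but the two values are always distinct.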

    • Virtual file system
      Here's a look at the file struct with the following code:
struct file {
    struct list_head f_list;            /* all open files form a linked list */
    struct dentry *f_dentry;            /* pointer to the associated directory entry */
    struct vfsmount *f_vfsmnt;          /* pointer to the VFS mount point */
    struct file_operations *f_op;       /* pointer to the file operations table */
    mode_t f_mode;                      /* open mode of the file */
    loff_t f_pos;                       /* current position in the file */
    unsigned short f_flags;             /* flags specified when opening the file */
    unsigned short f_count;             /* number of processes using this structure */
    unsigned long f_reada, f_ramax, f_raend, f_ralen, f_rawin;
                                        /* read-ahead flag, maximum number of pages to read ahead,
                                           file pointer after the last read-ahead, number of bytes
                                           read ahead, and number of pages read ahead */
    int f_owner;                        /* asynchronous I/O data delivery via signals */
    unsigned int f_uid, f_gid;          /* UID and GID of the user */
    int f_error;                        /* error code for network write operations */
    unsigned long f_version;            /* version number */
    void *private_data;                 /* needed by the tty driver */
};

A file struct does not correspond one-to-one with an actual file. For example, when a process opens the same file multiple times, each open is assigned a different file struct and a corresponding file descriptor, although all of these file structs ultimately point to the same physical file. So the in-memory file and the on-disk file are not the same thing. The in-memory file is dynamic: because it is read and written, it is only a copy, all operations are applied to that copy, and the results are written back to the disk file once the operations complete. Changes exist in memory first and are not reflected on disk in real time, which you can verify by working on one file with vim and cat at the same time. With this understood, it becomes clear why every open is assigned a different file struct: each open implies a different operation on the same file, and the results of each operation should be able to reach the disk file. If the same file struct were shared, one operation could overwrite the result of an earlier one.
The risk of overwriting comes from the member loff_t f_pos of the file struct, which records the current read/write position in the file. Each open file has a number giving the position of the next byte to be read or written, called the file position (loff_t is a 64-bit integer type in Linux). Each time a file is opened, the file position is set to 0, the beginning of the file, unless explicitly requested otherwise, and subsequent reads or writes proceed from there. You can change the file position with the lseek system call (random access). So if two operations target different positions in the same file, sharing one file struct would force them to share one read/write position, which clearly will not work. And that is for different operations within one process, let alone operations from different processes. As a result, almost every time you open a file, a new file struct is allocated. It is fair to say that the main job of the file struct is to hold this read/write position.
"Almost" is used above because there is an exception: when a new process is created, the child process shares all of the parent's information, including its file structs. The member unsigned short f_count of the file struct records how many processes are currently using the struct, and the struct is destroyed from memory only when this value reaches 0.
The first member of the file struct is struct list_head f_list, which links the file structs into a doubly linked list called the system open file table. Its maximum length is NR_FILE, defined in fs.h as 8192.
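The two cases discussed above, a separate read/write position for every open and a shared position for a child process, can be checked from user space. The sketch below is only an illustration, assuming a POSIX system with a writable /tmp directory; the helper names make_demo_file, independent_offsets and shared_offset_after_fork are hypothetical and exist only in this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Create an unlinked temporary file containing "abcdefgh", positioned at
 * offset 0. If fd2 is non-NULL, also open the file a second time by name.
 * Returns the first descriptor, or -1 on failure. */
static int make_demo_file(int *fd2)
{
    char path[] = "/tmp/fpos_demo_XXXXXX";
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    if (fd2 != NULL)
        *fd2 = open(path, O_RDONLY);    /* a second, independent open */
    unlink(path);                       /* data survives while descriptors stay open */
    if ((fd2 != NULL && *fd2 < 0) ||
        write(fd, "abcdefgh", 8) != 8 ||
        lseek(fd, 0, SEEK_SET) != 0) {
        if (fd2 != NULL && *fd2 >= 0)
            close(*fd2);
        close(fd);
        return -1;
    }
    return fd;
}

/* Two opens of one file: read 3 bytes through the first descriptor, then
 * report the position of each descriptor. Returns 0 on success. */
int independent_offsets(off_t *pos1, off_t *pos2)
{
    char buf[3];
    int fd1, fd2, rc = -1;

    assert(pos1 != NULL && pos2 != NULL);
    fd1 = make_demo_file(&fd2);
    if (fd1 < 0)
        return -1;
    if (read(fd1, buf, sizeof buf) == 3) {
        *pos1 = lseek(fd1, 0, SEEK_CUR);
        *pos2 = lseek(fd2, 0, SEEK_CUR);
        rc = 0;
    }
    close(fd2);
    close(fd1);
    return rc;
}

/* One open shared across fork(): the child reads 4 bytes through the
 * inherited descriptor, then the parent reports its own position. */
int shared_offset_after_fork(off_t *parent_pos)
{
    int fd, status;
    pid_t pid;

    assert(parent_pos != NULL);
    fd = make_demo_file(NULL);
    if (fd < 0)
        return -1;
    pid = fork();
    if (pid == 0) {                     /* child process */
        char buf[4];
        _exit(read(fd, buf, sizeof buf) == 4 ? 0 : 1);
    }
    if (pid < 0 || waitpid(pid, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        close(fd);
        return -1;
    }
    *parent_pos = lseek(fd, 0, SEEK_CUR);
    close(fd);
    return 0;
}
```

With independent_offsets, the first descriptor ends at position 3 while the second is still at 0, because each open has its own position. With shared_offset_after_fork, the parent finds its position at 4 even though only the child performed the read, because parent and child use the same open file.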
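The relationship between the file descriptor table, the system open file table, and the actual files is then as follows: a file descriptor indexes an entry in the process's file descriptor table, that entry points to a file struct in the system open file table, and the file struct in turn leads to the actual file.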
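To make this chain of pointers concrete, here is a deliberately simplified user-space model. It is not kernel code: every toy_ name is invented for this sketch, the table size of 32 simply mirrors fd_array[32] from the listing earlier, and the intermediate steps between a file struct and the actual file are collapsed into a single pointer.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NFDS 32                     /* mirrors fd_array[32] */

struct toy_actual_file {                /* stands in for the file on disk */
    unsigned long id;
};

struct toy_file {                       /* one per open, like struct file */
    struct toy_actual_file *target;     /* the file this open refers to */
    long pos;                           /* read/write position */
    int count;                          /* number of users of this struct */
};

struct toy_files {                      /* per-process table, like files_struct */
    struct toy_file *fd_array[TOY_NFDS];
};

/* Install f in the lowest free slot. The slot index is the file descriptor.
 * Returns the descriptor, or -1 if the table is full. */
int toy_install(struct toy_files *files, struct toy_file *f)
{
    assert(files != NULL && f != NULL);
    for (int fd = 0; fd < TOY_NFDS; fd++) {
        if (files->fd_array[fd] == NULL) {
            files->fd_array[fd] = f;
            f->count++;
            return fd;
        }
    }
    return -1;
}

/* Follow descriptor -> file struct -> actual file. Returns NULL for a
 * descriptor that is out of range or not open. */
struct toy_actual_file *toy_resolve(const struct toy_files *files, int fd)
{
    assert(files != NULL);
    if (fd < 0 || fd >= TOY_NFDS || files->fd_array[fd] == NULL)
        return NULL;
    return files->fd_array[fd]->target;
}
```

Installing two toy_file objects that share one target yields two different descriptors that resolve to the same toy_actual_file, which is the situation described for a process that opens the same file twice.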

Next we look at how the file struct points to the actual file. This involves its second member, struct dentry *f_dentry;. The code of the dentry struct is as follows:

struct dentry {
    atomic_t d_count;                   /* directory entry reference counter */
    unsigned int d_flags;               /* directory entry flags */
    struct inode *d_inode;              /* inode associated with the file name */
    struct dentry *d_parent;            /* directory entry of the parent directory */
    struct list_head d_hash;            /* hash table list of directory entries */
    struct list_head d_lru;             /* LRU list of unused entries */
    struct list_head d_child;           /* list of the parent directory's child entries */
    struct list_head d_subdirs;         /* list of this directory entry's subdirectories */
    struct list_head d_alias;           /* list of inode aliases */
    int d_mounted;                      /* mount point of the directory entry */
    struct qstr d_name;                 /* directory entry name (allows fast lookup) */
    unsigned long d_time;               /* used by the d_revalidate function */
    struct dentry_operations *d_op;     /* set of functions for directory entries */
    struct super_block *d_sb;           /* root of the directory entry tree (the file's superblock) */
    unsigned long d_vfs_flags;
    void *d_fsdata;                     /* data for the specific file system */
    unsigned char d_iname[DNAME_INLINE_LEN];    /* short file name */
};

This structure holds path information, including the file name. When introducing the ln command, I said that the operating system looks up files by inode number while humans use file names, so the correspondence between the two has to be found in this structure. Two members provide it: struct inode *d_inode; and struct qstr d_name;. The former points to the inode structure associated with the real file, which holds the information from the inode on the disk partition. At this point, the file descriptor in the process has been traced to the final disk file, and I would guess that the contents of the file struct are eventually written back to disk through the inode struct.
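The split between names and inodes is visible from user space as well. As a rough sketch, assuming a POSIX system with a writable /tmp directory on a file system that supports hard links, the hypothetical helper below gives one file a second name with the link() system call (which creates a hard link, much as ln does) and then looks up both names with stat().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create a temporary file, add a second name for it with link(), and fill
 * *a and *b with the stat() results for the two names. Both names are
 * removed again before returning. Returns 0 on success, -1 on failure. */
int stat_two_names(struct stat *a, struct stat *b)
{
    char path[] = "/tmp/inode_demo_XXXXXX";
    char alias[sizeof path + 8];
    int fd, rc = -1;

    assert(a != NULL && b != NULL);
    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    close(fd);

    snprintf(alias, sizeof alias, "%s.link", path);
    if (link(path, alias) == 0) {       /* second name, same inode */
        if (stat(path, a) == 0 && stat(alias, b) == 0)
            rc = 0;
        unlink(alias);
    }
    unlink(path);
    return rc;
}
```

On success, a->st_ino and b->st_ino are equal: two different names led to one inode, which is the name-to-inode correspondence that the d_name and d_inode members record inside the kernel.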
Finish.
