Learning the Linux Virtual File System (VFS)

Source: Internet
Author: User

The Virtual Filesystem (VFS), also known as the Virtual Filesystem Switch, is a kernel software layer that handles all system calls related to a standard UNIX filesystem. Its main strength is that it provides a common interface to several kinds of filesystems.

The Common File Model

The key idea behind the VFS is to introduce a common file model capable of representing all supported filesystems. In the common file model, each directory is regarded as a file that contains a list of files and other directories.

The common file model consists of the following object types:

Superblock object

Stores information about a mounted filesystem. For disk-based filesystems, this object usually corresponds to a filesystem control block stored on disk.

Inode object

Stores general information about a specific file. For disk-based filesystems, this object usually corresponds to a file control block stored on disk. Each inode object has an inode number that uniquely identifies the file within the filesystem.

File object

Stores information about the interaction between an open file and a process. This information exists only in kernel memory, and only while a process has the file open.

Dentry object

Stores information about the link between a directory entry (that is, a particular name of a file) and the corresponding file.
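
As a rough user-space illustration of how these four objects relate, the following C sketch uses simplified stand-in structures. All of the toy_* and ramtoy_* names are hypothetical and are not kernel types. It shows a file object reaching its inode through a dentry, and a read being dispatched through a per-filesystem operations table, so the generic layer never needs to know which filesystem is underneath.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the VFS objects; not kernel types. */
struct toy_file;

struct toy_file_ops {
	/* Each filesystem supplies its own read method. */
	size_t (*read)(struct toy_file *f, char *buf, size_t len);
};

struct toy_superblock {
	const char *fs_name;		/* which filesystem is mounted */
};

struct toy_inode {
	unsigned long ino;		/* unique within its superblock */
	struct toy_superblock *sb;
	const struct toy_file_ops *fop;	/* operations for files opened on it */
	const char *data;		/* toy file contents */
	size_t size;
};

struct toy_dentry {
	const char *name;		/* one path component */
	struct toy_inode *inode;	/* links the name to its inode */
};

struct toy_file {
	struct toy_dentry *dentry;
	const struct toy_file_ops *op;
	size_t pos;			/* per-open offset */
};

/* Read method of an imaginary in-memory filesystem. */
static size_t ramtoy_read(struct toy_file *f, char *buf, size_t len)
{
	struct toy_inode *inode = f->dentry->inode;
	size_t left = inode->size - f->pos;

	if (len > left)
		len = left;
	memcpy(buf, inode->data + f->pos, len);
	f->pos += len;
	return len;
}

static const struct toy_file_ops ramtoy_fops = { .read = ramtoy_read };

/* "Opening" creates a file object that borrows the inode's operations. */
struct toy_file toy_open(struct toy_dentry *d)
{
	struct toy_file f = { .dentry = d, .op = d->inode->fop, .pos = 0 };
	return f;
}

/* Generic layer: dispatch through the operations table. */
size_t toy_read(struct toy_file *f, char *buf, size_t len)
{
	return f->op->read(f, buf, len);
}

/* Two opens of the same dentry keep independent offsets. Returns 0 if so. */
int toy_demo(void)
{
	struct toy_superblock sb = { .fs_name = "ramtoy" };
	struct toy_inode inode = { 42, &sb, &ramtoy_fops, "hello", 5 };
	struct toy_dentry dentry = { "greeting", &inode };
	struct toy_file a = toy_open(&dentry);
	struct toy_file b = toy_open(&dentry);
	char buf[8] = { 0 };
	size_t got_a = toy_read(&a, buf, 2);	/* advances only a's offset */
	size_t got_b = toy_read(&b, buf, 8);	/* b still starts at offset 0 */

	assert(a.dentry == b.dentry);	/* both opens share one dentry and inode */
	return (got_a == 2 && got_b == 5 && a.pos == 2 && b.pos == 5) ? 0 : -1;
}
```

Calling toy_demo() from a main() function exercises the sketch. The point of the design is that the offset lives in the file object while the contents belong to the inode, which is why two opens of one file do not disturb each other.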

VFS Data Structures

Only the structures that relate to processes are listed here.

Inode Object

All the information the filesystem needs to handle a file is kept in a data structure called an inode. A file's name can change at any time, but the inode is unique to the file and exists for as long as the file exists. In memory, an inode object is represented by a struct inode data structure.
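
The claim that a name can change while the inode stays the same can be observed from user space. The sketch below is a minimal check under stated assumptions: it uses the POSIX calls mkstemp(), stat(), and rename(), assumes /tmp is writable, and renames within a single filesystem. The function name inode_survives_rename is made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Returns 1 if the inode number is unchanged after a rename, 0 if it
 * changed, and -1 on error. The /tmp location is an assumption.
 */
int inode_survives_rename(void)
{
	char old_path[] = "/tmp/vfs_demo_XXXXXX";
	char new_path[sizeof(old_path) + 8];
	struct stat before, after;
	int fd, n, result = -1;

	fd = mkstemp(old_path);		/* creates and opens a unique file */
	if (fd < 0)
		return -1;
	close(fd);

	n = snprintf(new_path, sizeof(new_path), "%s.moved", old_path);
	assert(n > 0 && (size_t)n < sizeof(new_path));

	if (stat(old_path, &before) == 0 &&
	    rename(old_path, new_path) == 0 &&
	    stat(new_path, &after) == 0)
		result = (before.st_ino == after.st_ino);

	unlink(new_path);	/* clean up; harmless if the rename failed */
	unlink(old_path);
	return result;
}
```

The st_ino field reported by stat() is the user-visible counterpart of the inode number described above.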

struct inode {
	struct hlist_node i_hash;  /* hash list */
	struct list_head i_list;  /* backing dev IO list */
	struct list_head i_sb_list;
	struct list_head i_dentry;  /* dentry objects that reference this inode */
	unsigned long i_ino;  /* inode number */
	atomic_t i_count;  /* reference counter */
	unsigned int i_nlink;  /* number of hard links */
	uid_t i_uid;  /* owner identifier */
	gid_t i_gid;  /* group identifier */
	dev_t i_rdev;  /* real device identifier */
	u64 i_version;  /* version (incremented after each use) */
	loff_t i_size;  /* file length in bytes */
#ifdef __NEED_I_SIZE_ORDERED
	seqcount_t i_size_seqcount;
#endif
	struct timespec i_atime;  /* time of last file access */
	struct timespec i_mtime;  /* time of last file write */
	struct timespec i_ctime;  /* time of last inode change */
	blkcnt_t i_blocks;  /* number of blocks of the file */
	unsigned int i_blkbits;  /* block size in number of bits */
	unsigned short i_bytes;  /* number of bytes in the last block */
	umode_t i_mode;  /* file type and access rights */
	spinlock_t i_lock;  /* i_blocks, i_bytes, maybe i_size */
	struct mutex i_mutex;
	struct rw_semaphore i_alloc_sem;  /* read/write semaphore that avoids race conditions in direct I/O file operations */
	const struct inode_operations *i_op;  /* inode operations */
	const struct file_operations *i_fop;  /* default file operations; former ->i_op->default_file_ops */
	struct super_block *i_sb;  /* pointer to the superblock */
	struct file_lock *i_flock;
	struct address_space *i_mapping;  /* pointer to the address_space object */
	struct address_space i_data;  /* address_space object of the file */
#ifdef CONFIG_QUOTA
	struct dquot *i_dquot[MAXQUOTAS];  /* inode disk quotas */
#endif
	struct list_head i_devices;  /* list of inodes for a specific character or block device */
	union {
		struct pipe_inode_info *i_pipe;  /* used if the file is a pipe */
		struct block_device *i_bdev;  /* pointer to the block device driver */
		struct cdev *i_cdev;  /* pointer to the character device driver */
	};
	__u32 i_generation;  /* inode version number */
#ifdef CONFIG_FSNOTIFY
	__u32 i_fsnotify_mask;  /* all events this inode cares about */
	struct hlist_head i_fsnotify_mark_entries;  /* fsnotify mark entries */
#endif
#ifdef CONFIG_INOTIFY
	struct list_head inotify_watches;  /* watches on this inode */
	struct mutex inotify_mutex;  /* protects the watches list */
#endif
	unsigned long i_state;  /* inode state flags */
	unsigned long dirtied_when;  /* jiffies of first dirtying */
	unsigned int i_flags;  /* filesystem mount flags */
	atomic_t i_writecount;
#ifdef CONFIG_SECURITY
	void *i_security;
#endif
#ifdef CONFIG_FS_POSIX_ACL
	struct posix_acl *i_acl;
	struct posix_acl *i_default_acl;
#endif
	void *i_private;  /* fs or device private pointer */
};

File Object

The file object describes how a process interacts with a file it has opened. A file object is created when the file is opened and is represented by a file structure. File objects have no corresponding image on disk, so the file structure has no "dirty" field to indicate that the object has been modified.

struct file {
	/*
	 * fu_list becomes invalid after file_free is called and queued via
	 * fu_rcuhead for RCU freeing
	 */
	union {
		struct list_head fu_list;
		struct rcu_head fu_rcuhead;
	} f_u;
	struct path f_path;
#define f_dentry f_path.dentry  /* dentry object associated with the file */
#define f_vfsmnt f_path.mnt  /* mounted filesystem containing the file */
	const struct file_operations *f_op;  /* pointer to the file operation table */
	spinlock_t f_lock;  /* f_ep_links, f_flags, no IRQ */
	atomic_long_t f_count;  /* reference counter of the file object */
	unsigned int f_flags;  /* flags specified when the file was opened */
	fmode_t f_mode;  /* process access mode */
	loff_t f_pos;  /* current file offset */
	struct fown_struct f_owner;  /* data for I/O event notification via signals */
	const struct cred *f_cred;
	struct file_ra_state f_ra;  /* file read-ahead state */
	u64 f_version;  /* version, incremented automatically after each use */
#ifdef CONFIG_SECURITY
	void *f_security;
#endif
	/* needed for tty driver, and maybe others */
	void *private_data;  /* pointer to data needed by a specific filesystem or device driver */
#ifdef CONFIG_EPOLL
	/* used by fs/eventpoll.c to link all the hooks to this file */
	struct list_head f_ep_links;  /* head of the list of event poll waiters for this file */
#endif /* #ifdef CONFIG_EPOLL */
	struct address_space *f_mapping;  /* pointer to the file's address space object */
#ifdef CONFIG_DEBUG_WRITECOUNT
	unsigned long f_mnt_write_state;
#endif
};

File objects are allocated from a slab cache named filp, whose descriptor address is stored in the filp_cachep variable. Because there is a limit on the number of file objects that can be allocated, the max_files field of the files_stat variable specifies the maximum number of allocatable file objects, that is, the maximum number of files that can be accessed in the system at the same time.

During kernel initialization, the files_init() function sets the max_files field according to the amount of available RAM: by default, file objects may use no more than about one tenth of memory. The system administrator can, however, change this value by writing to the /proc/sys/fs/file-max file. Moreover, even if max_files file objects have already been allocated, the superuser can always obtain a file object.

void __init files_init(unsigned long mempages)
{
	int n;

	filp_cachep = kmem_cache_create("filp", sizeof(struct file), 0,
			SLAB_HWCACHE_ALIGN | SLAB_PANIC, NULL);

	/*
	 * One file with associated inode and dcache is very roughly 1K.
	 * Per default don't use more than 10% of our memory for files.
	 */
	/* remainder of the function not shown */
}

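
The sizing rule stated in the comment of files_init() can be worked through in user space. The sketch below is only a reading of that comment, not the kernel's code: it assumes 4 KiB pages, and the lower bound TOY_MIN_FILES is an assumption that does not come from the listing above. The function name toy_max_files is hypothetical.

```c
#include <assert.h>
#include <limits.h>

#define TOY_PAGE_SIZE	4096UL	/* assumed page size in bytes */
#define TOY_MIN_FILES	8192UL	/* assumed lower bound; not from the listing above */

/*
 * If one open file costs roughly 1 KB, then 10% of memory expressed in
 * kilobytes is also the number of file objects that fit in that budget.
 */
unsigned long toy_max_files(unsigned long mempages)
{
	unsigned long n;

	/* guard against overflow in the multiplication below */
	assert(mempages <= ULONG_MAX / (TOY_PAGE_SIZE / 1024));
	n = (mempages * (TOY_PAGE_SIZE / 1024)) / 10;

	return n < TOY_MIN_FILES ? TOY_MIN_FILES : n;
}
```

For example, with 262144 pages of 4 KiB (1 GiB of RAM), the sketch yields 104857 file objects. Very small memory sizes fall back to the assumed lower bound.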
Dentry Object

The VFS treats each directory as a normal file that contains a list of files and other directories. Once a directory entry has been read into memory, the VFS transforms it into a dentry object based on the dentry structure. The kernel creates a dentry object for every component of a pathname that a process looks up, and the dentry object associates the component with its corresponding inode. For example, when looking up the pathname /tmp/test, the kernel creates a dentry object for the root directory /, a second-level dentry object for the tmp entry of the root directory, and a third-level dentry object for the test entry of the /tmp directory.
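
The per-component behavior described above can be sketched in user space. The code below is a toy model, not the kernel's lookup code: struct walk_dentry and toy_path_walk are hypothetical names, and the entries live in a caller-supplied array rather than in a cache. It creates one entry per pathname component, each linked to its parent.

```c
#include <assert.h>
#include <string.h>

#define WALK_MAX_DEPTH 16
#define WALK_NAME_LEN  32

/* Hypothetical stand-in for a dentry: one name plus a link to its parent. */
struct walk_dentry {
	char name[WALK_NAME_LEN];
	struct walk_dentry *parent;
};

/*
 * Split an absolute path into components and fill one entry per component,
 * starting with the root "/". Returns the number of entries created,
 * or -1 if the path is not absolute or does not fit.
 */
int toy_path_walk(const char *path, struct walk_dentry *out, int max)
{
	int count;
	const char *p = path;

	assert(path != NULL && out != NULL);
	if (path[0] != '/' || max < 1)
		return -1;

	strcpy(out[0].name, "/");
	out[0].parent = &out[0];	/* the root entry is its own parent */
	count = 1;

	while (*p) {
		size_t len;

		while (*p == '/')	/* skip separators */
			p++;
		if (!*p)
			break;
		len = strcspn(p, "/");	/* length of this component */
		if (count >= max || len >= WALK_NAME_LEN)
			return -1;
		memcpy(out[count].name, p, len);
		out[count].name[len] = '\0';
		out[count].parent = &out[count - 1];
		count++;
		p += len;
	}
	return count;
}
```

Walking /tmp/test with this sketch produces three entries, named /, tmp, and test, matching the three levels in the example above.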

Dentry objects have no corresponding image on disk, so the dentry structure has no field indicating that the object has been modified. Dentry objects are stored in a slab cache named dentry_cache.

struct dentry {
	atomic_t d_count;  /* dentry object reference counter */
	unsigned int d_flags;  /* dentry cache flags, protected by d_lock */
	spinlock_t d_lock;  /* per dentry lock */
	int d_mounted;  /* for a directory, counts the filesystems mounted on this dentry */
	struct inode *d_inode;  /* inode associated with the filename; NULL means a negative dentry */
	/*
	 * The next three fields are touched by __d_lookup. Place them here
	 * so they all fit in a cache line.
	 */
	struct hlist_node d_hash;  /* lookup hash list */
	struct dentry *d_parent;  /* dentry object of the parent directory */
	struct qstr d_name;  /* filename */

	struct list_head d_lru;  /* LRU list */
	/*
	 * d_child and d_rcu can share memory
	 */
	union {
		struct list_head d_child;  /* child of parent list */
		struct rcu_head d_rcu;
	} d_u;
	struct list_head d_subdirs;  /* our children */
	struct list_head d_alias;  /* inode alias list */
	unsigned long d_time;  /* used by d_revalidate */
	const struct dentry_operations *d_op;  /* dentry methods */
	struct super_block *d_sb;  /* superblock object of the file; the root of the dentry tree */
	void *d_fsdata;  /* fs-specific data */

	unsigned char d_iname[DNAME_INLINE_LEN_MIN];  /* small names */
};

Files Associated with a Process

Each process has its own current working directory and its own root directory. These are only two examples of the data the kernel must maintain to represent the interaction between a process and a filesystem. A data structure of type fs_struct is used for this purpose:

struct fs_struct {
	int users;
	rwlock_t lock;
	int umask;  /* bit mask used to set file permissions when opening a file */
	int in_exec;
	struct path root, pwd;  /* dentry and mounted filesystem object of the root directory, and dentry and mounted filesystem object of the current working directory */
};

The files currently opened by the process are described by the files_struct structure:

struct fdtable {
	unsigned int max_fds;  /* current maximum number of file objects */
	struct file **fd;  /* current fd array */
	fd_set *close_on_exec;
	fd_set *open_fds;
	struct rcu_head rcu;
	struct fdtable *next;
};

/*
 * Open file table structure
 */
struct files_struct {
	/*
	 * read mostly part
	 */
	atomic_t count;  /* number of processes sharing this table */
	struct fdtable *fdt;
	struct fdtable fdtab;
	/*
	 * written part on a separate cache line in SMP
	 */
	spinlock_t file_lock ____cacheline_aligned_in_smp;
	int next_fd;
	struct embedded_fd_set close_on_exec_init;
	struct embedded_fd_set open_fds_init;
	struct file *fd_array[NR_OPEN_DEFAULT];  /* initial array of file object pointers */
};

The fd field points to an array of pointers to file objects. The size of the array is stored in the max_fds field. Usually, the fd field points to the fd_array field of the files_struct structure, which holds 32 file object pointers. If the process opens more than 32 files, the kernel allocates a new, larger array of file pointers, stores its address in the fd field, and updates the max_fds field.
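
The switch from the embedded array to a larger allocated one can be sketched in user space. This is a simplified model, not the kernel's implementation: the toy_* names are hypothetical, the doubling growth policy is an assumption of the sketch, and locking and RCU are ignored entirely.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_NR_OPEN_DEFAULT 32	/* size of the embedded array, as in the text */

struct toy_file;	/* opaque stand-in for a file object */

struct toy_files {
	unsigned int max_fds;		/* current capacity of fd[] */
	struct toy_file **fd;		/* points at fd_array until it grows */
	struct toy_file *fd_array[TOY_NR_OPEN_DEFAULT];
};

void toy_files_init(struct toy_files *files)
{
	memset(files->fd_array, 0, sizeof(files->fd_array));
	files->fd = files->fd_array;
	files->max_fds = TOY_NR_OPEN_DEFAULT;
}

/* Grow the array until index `want` fits. Returns 0 on success, -1 on failure. */
static int toy_expand(struct toy_files *files, unsigned int want)
{
	unsigned int new_max = files->max_fds;
	struct toy_file **new_fd;

	while (new_max <= want)
		new_max *= 2;	/* growth policy is an assumption of this sketch */

	new_fd = calloc(new_max, sizeof(*new_fd));
	if (!new_fd)
		return -1;
	memcpy(new_fd, files->fd, files->max_fds * sizeof(*new_fd));
	if (files->fd != files->fd_array)
		free(files->fd);	/* the embedded array is never freed */
	files->fd = new_fd;
	files->max_fds = new_max;
	return 0;
}

/* Store `f` in the lowest free slot and return its index (the descriptor). */
int toy_fd_install(struct toy_files *files, struct toy_file *f)
{
	unsigned int i;

	assert(f != NULL);
	for (i = 0; i < files->max_fds; i++)
		if (!files->fd[i])
			break;
	if (i == files->max_fds && toy_expand(files, i) < 0)
		return -1;
	files->fd[i] = f;
	return (int)i;
}

void toy_files_destroy(struct toy_files *files)
{
	if (files->fd != files->fd_array)
		free(files->fd);
}
```

Installing 32 files keeps fd pointing at the embedded fd_array. The 33rd install triggers the expansion, after which fd points to heap memory and max_fds has grown.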

For every file with an entry in the fd array, the array index is the file descriptor. Usually, the first element of the array (index 0) is associated with the standard input of the process, the second (index 1) with the standard output, and the third (index 2) with the standard error.

The kernel enforces a dynamic limit on the maximum number of file descriptors through the signal->rlim[RLIMIT_NOFILE] structure of the process descriptor. This value is usually 1,024, but it can be raised if the process has superuser privileges.
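
From user space, this limit is visible through the POSIX getrlimit() call with the RLIMIT_NOFILE resource. The following is a minimal sketch; the wrapper name query_nofile_limit is made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>

/*
 * Fetch the soft and hard limits on open file descriptors for the
 * calling process. Returns 0 on success, -1 on failure.
 */
int query_nofile_limit(rlim_t *soft, rlim_t *hard)
{
	struct rlimit rl;

	assert(soft != NULL && hard != NULL);
	if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
		return -1;
	*soft = rl.rlim_cur;	/* limit currently enforced */
	*hard = rl.rlim_max;	/* ceiling the soft limit may be raised to */
	return 0;
}
```

The soft limit is the value actually enforced. An unprivileged process may raise it with setrlimit() only up to the hard limit, while raising the hard limit itself requires privileges.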

Most Commonly Used Special Filesystems

Name         Mount point    Description
bdev         none           Block devices
binfmt_misc  any            Miscellaneous executable formats
devpts       /dev/pts       Pseudoterminal support
eventpollfs  none           Used by the efficient event polling mechanism
futexfs      none           Used by the futex (Fast Userspace Locking) mechanism
pipefs       none           Pipes
proc         /proc          General access point to kernel data structures
rootfs       none           Provides an empty root directory for the bootstrap phase
shm          none           IPC-shared memory regions
mqueue       any            Used to implement POSIX message queues
sockfs       none           Sockets
sysfs        /sys           General access point to system parameters
tmpfs        any            Temporary files (kept in RAM unless swapped out)
usbfs        /proc/bus/usb  USB devices

Filesystem Registration

Each registered filesystem is represented by an object of type file_system_type.

struct file_system_type {
	const char *name;  /* filesystem name */
	int fs_flags;  /* filesystem type flags */
	int (*get_sb) (struct file_system_type *, int,
		       const char *, void *, struct vfsmount *);  /* method for reading the superblock */
	void (*kill_sb) (struct super_block *);  /* method for removing the superblock */
	struct module *owner;  /* pointer to the module implementing the filesystem */
	struct file_system_type *next;  /* pointer to the next element in the list of filesystem types */
	struct list_head fs_supers;  /* head of the list of superblock objects with this filesystem type */

	struct lock_class_key s_lock_key;
	struct lock_class_key s_umount_key;

	struct lock_class_key i_lock_key;
	struct lock_class_key i_mutex_key;
	struct lock_class_key i_mutex_dir_key;
	struct lock_class_key i_alloc_sem_key;
};

Take the registration of the sockfs filesystem as an example:

static struct file_system_type sock_fs_type = {
	.name = "sockfs",
	.get_sb = sockfs_get_sb,
	.kill_sb = kill_anon_super,
};

static int __init sock_init(void)
{
	/*
	 * Initialize sock SLAB cache.
	 */
	sk_init();

	/*
	 * Initialize skbuff SLAB cache
	 */
	skb_init();

	/*
	 * Initialize the protocols module.
	 */
	init_inodecache();

	/* register the sockfs filesystem */
	register_filesystem(&sock_fs_type);
	sock_mnt = kern_mount(&sock_fs_type);

	/* The real protocol initialization is performed in later initcalls. */

#ifdef CONFIG_NETFILTER
	netfilter_init();
#endif

	return 0;
}
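
The next field of file_system_type suggests that registered types form a singly linked list. The sketch below models registration on that assumption in user space. It is not the kernel's register_filesystem(): the toy_* names are hypothetical, there is no locking, and failures are reported as a plain -1 rather than a kernel error code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced version of file_system_type: just a name and a link. */
struct toy_fs_type {
	const char *name;
	struct toy_fs_type *next;
};

static struct toy_fs_type *toy_file_systems;	/* head of the registered list */

/* Return the slot holding `name`, or the terminating NULL slot if absent. */
static struct toy_fs_type **toy_find_fs(const char *name)
{
	struct toy_fs_type **p;

	for (p = &toy_file_systems; *p; p = &(*p)->next)
		if (strcmp((*p)->name, name) == 0)
			break;
	return p;
}

/* Append `fs` to the list. Returns 0 on success, -1 if the name is taken. */
int toy_register_filesystem(struct toy_fs_type *fs)
{
	struct toy_fs_type **p;

	assert(fs != NULL && fs->name != NULL);
	if (fs->next)
		return -1;	/* already linked into some list */
	p = toy_find_fs(fs->name);
	if (*p)
		return -1;	/* duplicate name */
	*p = fs;
	return 0;
}

/* Look up a registered type by name; NULL if it was never registered. */
struct toy_fs_type *toy_get_fs_type(const char *name)
{
	return *toy_find_fs(name);
}
```

Returning a pointer to the slot, rather than to the element, lets the same helper serve both lookup and insertion at the tail without a special case for the empty list.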
