Proc file system analysis (2)

Source: Internet
Author: User

II. Analysis of the proc file system
Based on the previous analysis, we can lay out the steps for analyzing the proc file system. I will follow the order in which the proc file system is registered and mounted, and use the code to analyze its structure, in particular the data structures it uses for internal management. Finally, a feasible XML encapsulation scheme is proposed based on the results.
In analyzing the data structures of the proc file system, I will focus on how data is output, since that is the basis for proposing a standard XML encapsulation method.
(1) Overview of the Linux source code
In the Linux source tree, all file system code is stored under linux/fs/, and the source code of the proc file system is in linux/fs/proc. The source files in that directory are described below.
The directory contains 11 related files:
procfs_syms.c    inode.c    generic.c        base.c
array.c          root.c     proc_tty.c       proc_misc.c
kmsg.c           kcore.c    proc_devtree.c
Of these, procfs_syms.c, generic.c, and inode.c handle the management of the proc file system, including its registration and the routines it provides to other kernel subsystems. This is the most important part of the code, and it is where our analysis starts.
The source file root.c manages the root node of the proc file system.
base.c and array.c handle the per-process information under /proc, including the command line, process status, memory status, and other process-related content. proc_tty.c handles the /proc/tty information, and proc_misc.c manages most of the other files in the /proc directory.
In addition, there are two very important header files, proc_fs.h and proc_fs_i.h, which can be found in linux/include/linux/.
(2) Registration of the proc file system
The proc file system follows the VFS conventions and must be registered before use. Every file system fills in a file_system_type structure in its own initialization routine and then calls register_filesystem(struct file_system_type *fs) to register it.
For the proc file system, the relevant file is procfs_syms.c, which declares the proc file system type:
static DECLARE_FSTYPE(proc_fs_type, "proc", proc_read_super, FS_SINGLE);
The macro DECLARE_FSTYPE is defined in fs.h:
#define DECLARE_FSTYPE(var,type,read,flags) \
struct file_system_type var = { \
    name:       type, \
    read_super: read, \
    fs_flags:   flags, \
    owner:      THIS_MODULE, \
}
So a file system type proc_fs_type is declared: its name is "proc", its super block reading function is proc_read_super, and its fs_flags is set to FS_SINGLE. According to the comments in the source code, a file system declared with FS_SINGLE has only one super block, and kern_mount() must be called after register_filesystem() so that the kernel-wide vfsmnt is stored in ->kern_mnt.
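To make the macro expansion concrete, here is a minimal user-space sketch. The structure is a cut-down stand-in for the kernel's file_system_type, and the names demo_fs_type, DEMO_DECLARE_FSTYPE, DEMO_FS_SINGLE, and demo_read_super are hypothetical. It also uses standard C99 designated initializers (.name = ...) instead of the older GNU "name:" form quoted above.

```c
#include <stddef.h>

/* Cut-down, hypothetical stand-in for the kernel's struct file_system_type. */
struct demo_fs_type {
    const char *name;
    int fs_flags;
    void *(*read_super)(void *sb, void *data, int silent);
    struct demo_fs_type *next;  /* not named in the macro, so it starts as NULL */
};

#define DEMO_FS_SINGLE 8        /* illustrative flag value (an assumption) */

/* Same shape as DECLARE_FSTYPE, written with C99 designated initializers. */
#define DEMO_DECLARE_FSTYPE(var, type, read, flags) \
struct demo_fs_type var = { \
    .name = type, \
    .read_super = read, \
    .fs_flags = flags, \
}

/* Placeholder routine; a real one would fill in the super block. */
static void *demo_read_super(void *sb, void *data, int silent)
{
    (void)data;
    (void)silent;
    return sb;
}

/* Expands to a complete definition of a struct demo_fs_type variable. */
static DEMO_DECLARE_FSTYPE(demo_proc_fs_type, "proc", demo_read_super, DEMO_FS_SINGLE);
```

The point of the macro is that one line declares and fills a whole structure, and every field the macro does not name is left zeroed.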
Next comes the registration of the proc file system itself. The code of init_proc_fs() is as follows:
static int __init init_proc_fs(void)
{
    int err = register_filesystem(&proc_fs_type);
    if (!err) {
        proc_mnt = kern_mount(&proc_fs_type);
        err = PTR_ERR(proc_mnt);
        if (IS_ERR(proc_mnt))
            unregister_filesystem(&proc_fs_type);
        else
            err = 0;
    }
    return err;
}
As you can see, registering the proc file system is simple. The main steps are:
1. Call register_filesystem(&proc_fs_type) to add the proc file system type to the kernel's singly linked list of file system types. If an error occurs, return it.
2. Call kern_mount(). This function does three things. First, it calls read_super(), in which VFS allocates a super block structure for the proc file system and sets s_dev, s_flags, and other fields. read_super() then calls the file system's own read_super routine, which for the proc file system is proc_read_super(); this routine sets the remaining fields of the super block and is analyzed in the next section.
Second, it uses add_vfsmnt() to create the vfsmount structure of the proc file system and add it to the list of mounted file systems (see figure-XX).
Finally, it returns the vfsmount structure, and the proc_mnt pointer is set to point to it.
3. Check whether the return value indicates an error. If it does, unregister the file system.
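The steps above can be sketched in user space. The following is a simplified model, not the kernel code: all toy_* names are hypothetical, the list handling is only loosely modelled on the kernel's singly linked list of file system types, and the error-pointer helpers imitate the PTR_ERR/IS_ERR idiom with an assumed limit of 4095 error codes.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct toy_fs_type {
    const char *name;
    struct toy_fs_type *next;
};

static struct toy_fs_type *toy_file_systems;   /* head of the singly linked list */

/* Step 1: append to the list unless the name is already registered. */
int toy_register(struct toy_fs_type *fs)
{
    struct toy_fs_type **p;

    if (fs->next)
        return -EBUSY;                         /* already linked into a list */
    for (p = &toy_file_systems; *p; p = &(*p)->next)
        if (strcmp((*p)->name, fs->name) == 0)
            return -EBUSY;                     /* duplicate name */
    *p = fs;                                   /* p now points at the tail link */
    return 0;
}

int toy_unregister(struct toy_fs_type *fs)
{
    struct toy_fs_type **p;

    for (p = &toy_file_systems; *p; p = &(*p)->next) {
        if (*p == fs) {
            *p = fs->next;
            fs->next = NULL;
            return 0;
        }
    }
    return -EINVAL;
}

/* Error-pointer idiom: small negative errno values are encoded as pointers. */
#define TOY_MAX_ERRNO 4095
static void *toy_err_ptr(long err)       { return (void *)(intptr_t)err; }
static long  toy_ptr_err(const void *p)  { return (long)(intptr_t)p; }
static int   toy_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-TOY_MAX_ERRNO;
}

/* Step 2 stand-in: "mount" either succeeds or returns an encoded error. */
static void *toy_mount(struct toy_fs_type *fs, int fail)
{
    return fail ? toy_err_ptr(-ENOMEM) : (void *)fs;
}

/* Same control flow as init_proc_fs(): register, mount, roll back on error. */
int toy_init(struct toy_fs_type *fs, int fail_mount)
{
    int err = toy_register(fs);

    if (!err) {
        void *mnt = toy_mount(fs, fail_mount);

        err = (int)toy_ptr_err(mnt);
        if (toy_is_err(mnt))
            toy_unregister(fs);                /* step 3: undo the registration */
        else
            err = 0;
    }
    return err;
}
```

Returning the error inside the pointer lets a function like kern_mount() report either a valid structure or an error code through a single return value, which is why init_proc_fs() reads err before testing IS_ERR().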
With that, the file system is registered with the kernel. Unregistering the proc file system is just as simple. The code is as follows:
static void __exit exit_proc_fs(void)
{
    unregister_filesystem(&proc_fs_type);
    kern_umount(proc_mnt);
}
(3) Creating the super block of the proc file system
As we have just seen, kern_mount() calls read_super() to set up a super block structure, which in turn calls the file system's own super block reading routine to fill it in. Let's now look at how the proc file system's routine, proc_read_super(), works and what it does. The function is implemented in fs/proc/inode.c:
struct super_block *proc_read_super(struct super_block *s, void *data,
                                    int silent)
{
    struct inode *root_inode;
    struct task_struct *p;

    s->s_blocksize = 1024;
    s->s_blocksize_bits = 10;
    s->s_magic = PROC_SUPER_MAGIC;
    s->s_op = &proc_sops;
    s->s_maxbytes = MAX_NON_LFS;
    root_inode = proc_get_inode(s, PROC_ROOT_INO, &proc_root);

    if (!root_inode)
        goto out_no_root;
    /*
     * Fixup the root inode's nlink value
     */
    read_lock(&tasklist_lock);
    for_each_task(p) if (p->pid) root_inode->i_nlink++;
    read_unlock(&tasklist_lock);
    s->s_root = d_alloc_root(root_inode);
    if (!s->s_root)
        goto out_no_root;
    parse_options(data, &root_inode->i_uid, &root_inode->i_gid);
    return s;

out_no_root:
    printk("proc_read_super: get root inode failed\n");
    iput(root_inode);
    return NULL;
}
The function performs the following steps:
1. It first writes the basic file system information into the super block passed in as a parameter. s_blocksize is set to 1024; because 1024 = 2^10, s_blocksize_bits is set to 10. The magic number of the proc file system is PROC_SUPER_MAGIC, and the super block operation set is proc_sops. The proc file system implements only four super block operations, which are analyzed later. The maximum file size in bytes, s_maxbytes, is set to MAX_NON_LFS, which fs.h defines as ((1UL<<31) - 1).
2. It calls proc_get_inode(s, PROC_ROOT_INO, &proc_root) to create the root inode root_inode. This function obtains an inode from the ino parameter and is analyzed further later. If the root inode cannot be obtained, the function jumps to the out_no_root label and exits. The ino parameter identifies the inode; when a new proc inode is created, it is dynamically assigned an ino.
3. It fixes up the link count of root_inode by walking the process list and incrementing i_nlink once for each process. This is needed because every process has a directory under /proc; in other words, each process corresponds to a subdirectory of the proc root. So when the root inode is set up, its i_nlink must be adjusted to match the number of processes.
4. It creates the root dentry s_root of the super block from root_inode:
s->s_root = d_alloc_root(root_inode)
root_inode has type struct inode *, while s_root has type struct dentry *. As we saw when introducing VFS, the directory cache is organized as a tree, so once the root inode of the file system exists, d_alloc_root() is used to create the corresponding root dentry structure.
On success, the super block is returned with all the necessary information filled in. The super block reading routine therefore does two main things: it writes the required data into the super block, and it sets up the root inode of the file system together with the corresponding dentry structure in the directory cache.
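The arithmetic in steps 1 and 3 can be checked with a small user-space sketch. The toy_* names and the toy_task structure are hypothetical stand-ins; only the constants 1024, 10, and (1UL<<31) - 1 come from the code above.

```c
#include <stddef.h>

/* Same expression the text quotes from fs.h for MAX_NON_LFS. */
#define TOY_MAX_NON_LFS ((1UL << 31) - 1)

/* Number of bits needed to express a power-of-two block size. */
static unsigned int toy_blocksize_bits(unsigned long blocksize)
{
    unsigned int bits = 0;

    while (blocksize > 1) {
        blocksize >>= 1;
        bits++;
    }
    return bits;
}

/* Hypothetical stand-in for one entry of the kernel's task list. */
struct toy_task {
    int pid;
    struct toy_task *next;
};

/* Mirror of the nlink fixup: one extra link per task with a non-zero PID. */
static unsigned int toy_root_nlink(unsigned int base, const struct toy_task *tasks)
{
    const struct toy_task *p;

    for (p = tasks; p; p = p->next)
        if (p->pid)
            base++;
    return base;
}
```

With a block size of 1024 the helper yields 10, matching s_blocksize_bits, and a task list holding PIDs 0, 1, and 7 adds two links to the root, because the task with PID 0 is skipped by the p->pid test.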
(4) Super block operations of the proc file system
In the previous section we saw how the proc file system sets up its super block and points the super block operation set at proc_sops. In this section we analyze which operations the proc super block needs and how they are implemented.
The set is defined in fs/proc/inode.c as follows:
static struct super_operations proc_sops = {
    read_inode:   proc_read_inode,
    put_inode:    force_delete,
    delete_inode: proc_delete_inode,
    statfs:       proc_statfs,
};
As we can see, the proc file system implements only four super block operations. The structure is initialized with a special syntax called labeled elements, a GNU C extension. With it, the fields do not have to be initialized in declaration order; you only name the fields you want to set, and any field not mentioned is set to 0.
Let's look at why the other operations do not need to be defined.
First, the proc file system exists only in memory and has no backing physical device, so write_inode is not needed. notify_change is called when the attributes of an inode change; proc inodes do not provide a setattr function, so file attributes are never changed and notify_change is never called. (The proc file system provides only a few inode_operations, and it installs different inode operations for different files and directories when building the file tree; this is described in detail later.) For similar reasons, other functions such as put_super, write_super, and clear_inode are not defined.
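Leaving operations undefined is safe only because callers test the function pointer before using it. The sketch below illustrates that pattern in user space with C99 designated initializers; the toy_* types and helpers are hypothetical and much smaller than the kernel's super_operations.

```c
#include <stddef.h>

struct toy_inode {
    int reads;   /* how many times read_inode ran on this inode */
};

/* Reduced, hypothetical operation table. */
struct toy_super_operations {
    void (*read_inode)(struct toy_inode *);
    void (*write_inode)(struct toy_inode *);
    void (*put_inode)(struct toy_inode *);
};

static void toy_read_inode(struct toy_inode *inode)
{
    inode->reads++;
}

/* Only read_inode is named; write_inode and put_inode are implicitly NULL. */
static const struct toy_super_operations toy_sops = {
    .read_inode = toy_read_inode,
};

/* Caller-side pattern: run an optional method only if it exists.
 * Returns 1 if the method was called, 0 if it was absent. */
static int toy_call_read_inode(const struct toy_super_operations *ops,
                               struct toy_inode *inode)
{
    if (ops && ops->read_inode) {
        ops->read_inode(inode);
        return 1;
    }
    return 0;
}

static int toy_call_write_inode(const struct toy_super_operations *ops,
                                struct toy_inode *inode)
{
    if (ops && ops->write_inode) {
        ops->write_inode(inode);
        return 1;
    }
    return 0;
}
```

An in-memory file system can therefore skip write_inode entirely: the pointer stays NULL and the caller simply does nothing.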
Let's look at the four functions that are defined:
1. read_inode: proc_read_inode
This function reads the information of a given inode from a mounted file system. Whenever a particular inode is needed, the VFS function iget(sb, ino) is called, where sb identifies the super block of the file system and ino the inode number. iget() first looks for the inode in the inode cache of that super block and returns it if found. Otherwise the inode has to be read from the underlying file system: get_new_inode() is called, which allocates an inode structure, fills in some basic information, and then calls the super block operation read_inode, which for the proc file system is proc_read_inode().
As we will see later, to simplify file management the proc file system creates and maintains a proc_dir_entry structure for every registered proc file. This structure is very important: it is the proc file system's own private data, playing the role that on-disk inodes play for other file systems (such as ext2). The proc_dir_entry structure is linked to a VFS inode only when needed.
The main job of proc_read_inode() is therefore just to fill in some basic information for a newly created inode, which is why the function is so simple:
static void proc_read_inode(struct inode *inode)
{
    inode->i_mtime = inode->i_atime = inode->i_ctime = CURRENT_TIME;
}
Before proc_read_inode() is called, the VFS function get_new_inode() has already set the other basic fields of the inode, such as i_sb, i_dev, i_ino, i_flags, and i_count.
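The lookup-then-fill sequence described for iget() can be modelled in a few lines of user-space C. This is only a sketch under simplifying assumptions: the cache is a plain linked list, there is no locking, and every toy_* name is hypothetical.

```c
#include <stdlib.h>
#include <time.h>

struct toy_sb;

struct toy_inode {
    unsigned long i_ino;
    int i_count;               /* reference count */
    time_t i_mtime;
    struct toy_sb *i_sb;
    struct toy_inode *next;    /* chain of cached inodes */
};

struct toy_sb {
    void (*read_inode)(struct toy_inode *);  /* file-system-specific hook */
    struct toy_inode *cache;                 /* inodes already in memory */
};

/* Stand-in for proc_read_inode(): only a timestamp is filled in. */
static void toy_proc_read_inode(struct toy_inode *inode)
{
    inode->i_mtime = time(NULL);
}

/* Look the inode up in the cache; on a miss, allocate it, set the generic
 * fields, and only then let the file system fill in the rest. */
struct toy_inode *toy_iget(struct toy_sb *sb, unsigned long ino)
{
    struct toy_inode *inode;

    for (inode = sb->cache; inode; inode = inode->next) {
        if (inode->i_ino == ino) {
            inode->i_count++;          /* cache hit: take another reference */
            return inode;
        }
    }

    inode = calloc(1, sizeof(*inode));
    if (!inode)
        return NULL;
    inode->i_sb = sb;                  /* generic fields come first */
    inode->i_ino = ino;
    inode->i_count = 1;
    inode->next = sb->cache;
    sb->cache = inode;
    if (sb->read_inode)
        sb->read_inode(inode);         /* then the file system's own hook */
    return inode;
}
```

Asking twice for the same number returns the same structure with its count raised to 2, while a new number triggers one allocation and one call to the hook.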
2. put_inode: force_delete
put_inode is called when the reference count of an inode is decremented. The proc file system does not implement its own put_inode; it simply uses the VFS function force_delete. Here is that function:
void force_delete(struct inode *inode)
{
    /*
     * Kill off unused inodes ... iput() will unhash and
     * delete the inode if we set i_nlink to zero.
     */
    if (atomic_read(&inode->i_count) == 1)
        inode->i_nlink = 0;
}
put_inode is called just before the reference count i_count is decremented. So for the proc file system, each time an inode reference is about to be dropped, force_delete checks whether the count is 1, that is, whether it is about to reach zero. If so, it sets the link count of that inode to zero.
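The ordering matters here, so a small simulation may help. This is a simplified user-space model with hypothetical toy_* names; the real iput() does considerably more work.

```c
struct toy_inode {
    int i_count;   /* in-memory references */
    int i_nlink;   /* directory links */
    int deleted;   /* set once the delete hook would have run */
};

/* Stand-in for force_delete(): the last user is about to let go. */
static void toy_force_delete(struct toy_inode *inode)
{
    if (inode->i_count == 1)
        inode->i_nlink = 0;
}

/* Simplified iput(): the put_inode hook runs before the count drops. */
void toy_iput(struct toy_inode *inode)
{
    toy_force_delete(inode);           /* put_inode hook */
    if (--inode->i_count == 0 && inode->i_nlink == 0)
        inode->deleted = 1;            /* delete_inode hook would run here */
}
```

Starting from a count of 2 and a link count of 1, the first toy_iput() leaves the inode alone, and the second zeroes the link count and marks the inode deleted. This is why a proc inode never lingers in memory once its last user is gone.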
3. delete_inode: proc_delete_inode
When both the reference count and the link count of an inode reach zero, the delete_inode function of the super block is called. Because the proc super block uses force_delete as its put_inode method, we know that for the proc file system, whenever the reference count of an inode reaches zero, its link count is zero as well.
Here is the source code of the function:
/*
 * Decrement the use count of the proc_dir_entry.
 */
static void proc_delete_inode(struct inode *inode)
{
    /* for the procfs, inode->u.generic_ip is a 'proc_dir_entry' */
    struct proc_dir_entry *de = inode->u.generic_ip;

    inode->i_state = I_CLEAR;
    if (PROC_INODE_PROPER(inode)) {
        proc_pid_delete_inode(inode);
        return;
    }
    if (de) {
        if (de->owner)
            __MOD_DEC_USE_COUNT(de->owner);
        de_put(de);
    }
}
This function does three things. First, it sets the state of the inode to I_CLEAR, indicating that the inode structure is no longer in use. Second, it checks the inode number to see whether the inode belongs to a PID directory, because inode numbers in PID directories are built with
#define fake_ino(pid,ino) (((pid)<<16)|(ino))
so the check is if (PROC_INODE_PROPER(inode)). The /proc directory contains many process-related directories and files. These are organized differently from other proc files: because they are tied closely to processes, they are not registered with create_proc_entry() but are generated dynamically from the actual process list. Later Linux versions are to create a separate super block for the PID proc files, but for now the inode number is used to tell them apart. Since inodes in a PID directory are closely tied to process data structures, some extra work is needed to release special resources when such an inode is freed. This is done by proc_pid_delete_inode().
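The inode-number encoding can be tried out in user space. In the sketch below, TOY_FAKE_INO copies the fake_ino expression quoted above, while TOY_INODE_PROPER is an assumed form of the check (any bit set above the low 16 bits); the exact kernel definition of PROC_INODE_PROPER is not shown in this text, so treat it as an illustration only.

```c
/* Same expression as the fake_ino macro quoted in the text. */
#define TOY_FAKE_INO(pid, ino)   (((pid) << 16) | (ino))

/* Assumed check: a non-zero upper part means "per-process inode". */
#define TOY_INODE_PROPER(i_ino)  ((i_ino) & ~0xffffUL)

/* Recover the two halves of an encoded inode number. */
static unsigned long toy_ino_pid(unsigned long i_ino)
{
    return i_ino >> 16;
}

static unsigned long toy_ino_type(unsigned long i_ino)
{
    return i_ino & 0xffffUL;
}
```

For PID 1234 and a per-process entry number 5, the encoded value is (1234 << 16) | 5; the upper part identifies the process and the lower 16 bits identify the entry. An ordinary proc inode whose number fits in 16 bits fails the check and takes the proc_dir_entry path instead.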
Finally, it calls de_put() to do the necessary work on the proc_dir_entry structure associated with the inode: the reference count of the proc_dir_entry is decremented, and if the count reaches zero, the structure is released.
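The reference counting described for de_put() follows a common get/put pattern, sketched here in user space. The toy_* names are hypothetical, and the sketch follows only the behaviour stated above (release at zero); the kernel function may apply further conditions that are not covered in this text.

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct proc_dir_entry. */
struct toy_dir_entry {
    int count;   /* number of inodes currently referring to this entry */
};

/* Take a reference when an inode is attached to the entry. */
struct toy_dir_entry *toy_de_get(struct toy_dir_entry *de)
{
    if (de)
        de->count++;
    return de;
}

/* Drop a reference. Returns 1 if the entry was freed, 0 if it is still
 * in use, and -1 if there was nothing valid to release. */
int toy_de_put(struct toy_dir_entry *de)
{
    if (!de || de->count <= 0)
        return -1;
    if (--de->count == 0) {
        free(de);
        return 1;
    }
    return 0;
}
```

Each inode that points at an entry holds one reference, so the entry survives until the last such inode has been deleted.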
4. statfs: proc_statfs
As we saw when analyzing VFS, the statfs function implements the system call statfs(2). Here is its source code:
static int proc_statfs(struct super_block *sb, struct statfs *buf)
{
    buf->f_type = PROC_SUPER_MAGIC;         /* here use the super_block's s_magic ! */
    buf->f_bsize = PAGE_SIZE/sizeof(long);  /* optimal transfer block size */
    buf->f_bfree = 0;                       /* free blocks in fs */
    buf->f_bavail = 0;                      /* free blocks avail to non-superuser */
    buf->f_ffree = 0;                       /* free file nodes in fs */
    buf->f_namelen = NAME_MAX;              /* maximum length of filenames */
    return 0;
}
As we can see, it fills a buffer with file system statistics. The file system type is PROC_SUPER_MAGIC, and the free block and free inode counts are set to 0, because these statistics are meaningless for a file system that exists only in memory.
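From user space, the effect of this routine can be observed through statfs(2). The sketch below assumes a Linux system with glibc, where <sys/vfs.h> declares statfs() and <linux/magic.h> defines PROC_SUPER_MAGIC; the helper name path_is_procfs is hypothetical.

```c
#include <sys/vfs.h>       /* statfs(), struct statfs */
#include <linux/magic.h>   /* PROC_SUPER_MAGIC */

/* Returns 1 if path lives on a proc file system, 0 if it does not,
 * and -1 if statfs() itself fails (for example, the path is missing). */
int path_is_procfs(const char *path)
{
    struct statfs buf;

    if (statfs(path, &buf) != 0)
        return -1;
    return (unsigned long)buf.f_type == (unsigned long)PROC_SUPER_MAGIC;
}
```

On a typical Linux system, calling path_is_procfs("/proc") would be expected to return 1, since f_type carries the magic number that the kernel routine stores in the buffer.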
