Inotify: a file system notification mechanism in the kernel


Reprinted: http://www.ibm.com/developerworks/cn/linux/l-inotifynew/index.html

I. Introduction

It is widely felt that the Linux desktop is much less satisfying to use than Mac or Windows. To improve the situation, the open-source community proposed that the kernel provide mechanisms that let user space learn promptly what has happened in the kernel or in the underlying hardware, so that it can manage devices better and give users better service. Hotplug, udev, and inotify all grew out of this need. Hotplug is a mechanism by which the kernel notifies user-space applications about hot-plug device events; desktop systems can use it to manage devices effectively. Udev dynamically maintains the device files under /dev. Inotify is a file system change notification mechanism: events such as file creation and deletion are made known to user space immediately. It was introduced by Beagle, the well-known desktop search engine project, and is used in projects such as Gamin.

In fact, a similar mechanism named dnotify existed before inotify, but it has many defects:

1. A file descriptor must be opened for every directory to be monitored, so monitoring many directories means opening many file descriptors. In particular, if a monitored directory is on removable media (such as a CD or USB disk), the file system cannot be unmounted, because the file descriptor opened by the application using dnotify keeps the file system busy.

2. dnotify works on directories and can only deliver directory change events. A change to a file does affect its containing directory and trigger a directory change event, but to find out from directory events which file changed, the application has to cache a lot of stat structure data.

3. The dnotify interface is very unfriendly: it uses signals.

Inotify was designed to replace dnotify. It overcomes dnotify's defects and provides a simpler and more powerful file change notification mechanism:

1. Inotify does not need to open a file descriptor on the monitored target. If the target is on removable media, then once the file system on that media is unmounted, the watch on the target is deleted automatically and an unmount event is generated.

2. Inotify can monitor both files and directories.

3. Inotify uses system calls rather than SIGIO to deliver file system events.

4. Inotify uses a file descriptor as its interface, so the usual file I/O operations select and poll can be used to monitor file system changes.

Inotify can monitor the following file system events:

  • IN_ACCESS: the file was accessed.
  • IN_MODIFY: the file was written.
  • IN_ATTRIB: file attributes were modified, for example by chmod, chown, or touch.
  • IN_CLOSE_WRITE: a file opened for writing was closed.
  • IN_CLOSE_NOWRITE: a file not opened for writing was closed.
  • IN_OPEN: the file was opened.
  • IN_MOVED_FROM: the file was moved out of the watched directory, for example by mv.
  • IN_MOVED_TO: the file was moved into the watched directory, for example by mv.
  • IN_CREATE: a new file was created.
  • IN_DELETE: the file was deleted, for example by rm.
  • IN_DELETE_SELF: the watched file or directory itself was deleted.
  • IN_MOVE_SELF: the watched file or directory itself was moved.
  • IN_UNMOUNT: the file system containing the target was unmounted.
  • IN_CLOSE: the file was closed, equivalent to (IN_CLOSE_WRITE | IN_CLOSE_NOWRITE).
  • IN_MOVE: the file was moved, equivalent to (IN_MOVED_FROM | IN_MOVED_TO).

Note: "file" in the list above also covers directories.
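As a rough illustration of how the event bits above combine into a mask, the sketch below decodes a mask into the names of the bits it contains. The helper describe_mask and its lookup table are made up for this example; only the IN_* constants come from <sys/inotify.h>.

```c
#include <assert.h>
#include <string.h>
#include <sys/inotify.h>

/* Hypothetical lookup table: one entry per basic event bit. */
static const struct {
    unsigned int bit;
    const char *name;
} event_names[] = {
    { IN_ACCESS,        "IN_ACCESS" },
    { IN_MODIFY,        "IN_MODIFY" },
    { IN_ATTRIB,        "IN_ATTRIB" },
    { IN_CLOSE_WRITE,   "IN_CLOSE_WRITE" },
    { IN_CLOSE_NOWRITE, "IN_CLOSE_NOWRITE" },
    { IN_OPEN,          "IN_OPEN" },
    { IN_MOVED_FROM,    "IN_MOVED_FROM" },
    { IN_MOVED_TO,      "IN_MOVED_TO" },
    { IN_CREATE,        "IN_CREATE" },
    { IN_DELETE,        "IN_DELETE" },
    { IN_DELETE_SELF,   "IN_DELETE_SELF" },
    { IN_MOVE_SELF,     "IN_MOVE_SELF" },
    { IN_UNMOUNT,       "IN_UNMOUNT" },
};

/* Writes the names of all bits set in mask into out, separated by '|'.
 * Returns the number of bits that matched. (describe_mask is a
 * made-up helper, not part of the inotify API.) */
static int describe_mask(unsigned int mask, char *out, size_t out_size)
{
    int count = 0;
    size_t i;

    assert(out != NULL && out_size > 0);
    out[0] = '\0';
    for (i = 0; i < sizeof(event_names) / sizeof(event_names[0]); i++) {
        if (!(mask & event_names[i].bit))
            continue;
        if (count > 0)
            strncat(out, "|", out_size - strlen(out) - 1);
        strncat(out, event_names[i].name, out_size - strlen(out) - 1);
        count++;
    }
    return count;
}
```

Passing IN_CLOSE or IN_MOVE to such a helper shows that the two convenience names are simply the OR of two basic bits.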

II. User Interfaces

In user space, inotify is used through three system calls plus ordinary file I/O operations on the returned file descriptor. The first step in using inotify is to create an inotify instance:

int fd = inotify_init ();

Each inotify instance corresponds to an independent, ordered queue.

File system change events are managed through objects called watches. Each watch is a pair (target, event mask): the target can be a file or a directory, and the event mask specifies the inotify events the application wants to follow, with each bit corresponding to one inotify event. A watch is referenced by its watch descriptor and is added by the path name of a file or directory. A watch on a directory reports the events that occur on all files directly inside that directory.

The following function is used to add a watch:

 int wd = inotify_add_watch (fd, path, mask);

 

fd is the file descriptor returned by inotify_init(), path is the path name of the monitored target (a file name or a directory name), and mask is the event mask; the header file linux/inotify.h defines the event that each bit represents. The same function is also used to modify an existing event mask, that is, to change which inotify events the application wants to be notified of. wd is the watch descriptor.

The following function is used to delete a watch:

int ret = inotify_rm_watch (fd, wd);

fd is the file descriptor returned by inotify_init(), and wd is the watch descriptor returned by inotify_add_watch(). ret is the return value of the function.
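The three calls described so far can be put together as in the following minimal sketch. It assumes the glibc wrappers declared in <sys/inotify.h>; the function name watch_lifecycle is invented for the illustration, and the masks are arbitrary choices.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/inotify.h>

/* Adds a watch on path, changes its mask, then removes it.
 * Returns 0 on success and -1 on any failure. (watch_lifecycle is a
 * made-up name for this illustration.) */
static int watch_lifecycle(const char *path)
{
    int fd, wd, wd2, ret;

    assert(path != NULL);

    fd = inotify_init();
    if (fd < 0) {
        perror("inotify_init");
        return -1;
    }

    /* Initially ask only for creation events. */
    wd = inotify_add_watch(fd, path, IN_CREATE);
    if (wd < 0) {
        perror("inotify_add_watch");
        close(fd);
        return -1;
    }

    /* Adding a watch for the same target again replaces its mask and
     * returns the same watch descriptor. */
    wd2 = inotify_add_watch(fd, path, IN_CREATE | IN_DELETE);
    if (wd2 != wd) {
        close(fd);
        return -1;
    }

    ret = inotify_rm_watch(fd, wd);
    close(fd);   /* closing fd would also remove any remaining watches */
    return ret;
}
```

Calling watch_lifecycle(".") should return 0 on a system with inotify support, while a path that does not exist makes inotify_add_watch fail.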

File events are represented by an inotify_event structure. They are obtained by calling the ordinary file read function, read, on the file descriptor returned by inotify_init():

struct inotify_event {
        __s32           wd;             /* watch descriptor */
        __u32           mask;           /* watch mask */
        __u32           cookie;         /* cookie to synchronize two events */
        __u32           len;            /* length (including nulls) of name */
        char            name[0];        /* stub for possible name */
};

In this structure, wd is the watch descriptor of the monitored target, mask is the event mask, len is the length of the name string, and name is the name of the file the event refers to when the target is a directory. The name field is only a stub that gives user code a convenient way to refer to the file name: the name itself is variable-length and physically follows the structure, padded with zero bytes so that the next event structure is 4-byte aligned. Note that len counts the padding bytes as well.
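Because every record is a fixed header followed by len bytes of padded name, a buffer of events is walked by repeatedly adding sizeof(struct inotify_event) + len to the current position. The sketch below counts the records in a buffer this way; count_events is a hypothetical helper, and the header is copied out with memcpy on the assumption that the buffer may not be suitably aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/inotify.h>

/* Counts the event records packed into buf[0..len). Each record is a
 * fixed-size header followed by ev.len bytes of zero-padded name.
 * (count_events is a made-up helper for illustration.) */
static int count_events(const char *buf, size_t len)
{
    size_t pos = 0;
    int n = 0;

    while (pos + sizeof(struct inotify_event) <= len) {
        struct inotify_event ev;

        /* Copy the header out so that alignment of buf does not matter. */
        memcpy(&ev, buf + pos, sizeof(ev));

        /* Skip the header and the padded name that follows it. */
        pos += sizeof(struct inotify_event) + ev.len;
        assert(pos <= len);   /* read never returns a partial record */
        n++;
    }
    return n;
}
```

An event on a watched file itself carries no name, so its len is 0 and the record is just the header.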

You can obtain multiple events at a time through the read call, as long as the provided buf is large enough.

ssize_t len = read (fd, buf, BUF_LEN);

buf is a buffer that receives the inotify_event structures, and BUF_LEN is the total number of bytes to read; buf must be at least BUF_LEN bytes long. How many events the call returns depends on BUF_LEN and on the lengths of the file names in the events. len is the number of bytes actually read, that is, the total length of the events obtained.

You can use select() or poll() on the file descriptor fd returned by inotify_init(), and you can issue the ioctl command FIONREAD on fd to get the current length of the queue in bytes. close(fd) deletes all watches added to fd and performs the necessary cleanup.
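A minimal sketch of combining poll() with FIONREAD is shown below. The helper name wait_for_events and its return convention are assumptions made for this example; poll(), ioctl(), and FIONREAD are the standard interfaces.

```c
#include <assert.h>
#include <poll.h>
#include <sys/ioctl.h>
#include <sys/inotify.h>

/* Waits up to timeout_ms milliseconds for events on the inotify
 * descriptor fd. Returns the number of bytes ready to be read
 * (0 on timeout) or -1 on error. (wait_for_events is a made-up helper.) */
static int wait_for_events(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int ready, nbytes = 0;

    assert(fd >= 0);
    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    ready = poll(&pfd, 1, timeout_ms);
    if (ready < 0)
        return -1;
    if (ready == 0)
        return 0;             /* timed out: nothing is queued */

    /* FIONREAD reports how many bytes of events are queued. */
    if (ioctl(fd, FIONREAD, &nbytes) < 0)
        return -1;
    return nbytes;
}
```

The value returned can be used to size the buffer passed to read, so that one call drains everything that is currently queued.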

int inotify_init (void);
int inotify_add_watch (int fd, const char *path, __u32 mask);
int inotify_rm_watch (int fd, __u32 wd);

III. Kernel Implementation Mechanism

In the kernel, each inotify instance corresponds to an inotify_device structure:

struct inotify_device {
        wait_queue_head_t       wq;             /* wait queue for i/o */
        struct idr              idr;            /* idr mapping wd -> watch */
        struct semaphore        sem;            /* protects this bad boy */
        struct list_head        events;         /* list of queued events */
        struct list_head        watches;        /* list of watches */
        atomic_t                count;          /* reference count */
        struct user_struct      *user;          /* user who opened this dev */
        unsigned int            queue_size;     /* size of the queue (bytes) */
        unsigned int            event_count;    /* number of pending events */
        unsigned int            max_events;     /* maximum number of events */
        u32                     last_wd;        /* the last wd allocated */
};

wq is a wait queue; processes blocked in a read call are put on it. idr maps a watch descriptor to the corresponding inotify_watch. sem synchronizes access to the structure. events is the list of events that have occurred on this inotify instance; every event on a target monitored by the instance is appended to this list. watches is the list of watches belonging to the instance; inotify_add_watch inserts each newly added watch into it. count is the reference count, and user describes the user who created the instance. queue_size is the size in bytes of the instance's event queue, event_count is the number of events in the events list, max_events is the maximum number of events allowed, and last_wd is the most recently allocated watch descriptor.

Each watch corresponds to an inotify_watch structure:

struct inotify_watch {
        struct list_head        d_list; /* entry in inotify_device's list */
        struct list_head        i_list; /* entry in inode's list */
        atomic_t                count;  /* reference count */
        struct inotify_device   *dev;   /* associated device */
        struct inode            *inode; /* associated inode */
        s32                     wd;     /* watch descriptor */
        u32                     mask;   /* event mask for this watch */
};

d_list links the watch into the watch list of its inotify_device, and i_list links it into the watch list of the monitored inode. count is the reference count. dev points to the inotify_device of the inotify instance the watch belongs to, and inode points to the inode the watch monitors. wd is the descriptor assigned to the watch, and mask is the watch's event mask, indicating which file system events it is interested in.

The inotify_device structure is created when user space calls inotify_init() and is released when the file descriptor returned by inotify_init() is closed. The inotify_watch structure is created when user space calls inotify_add_watch() and is released when user space calls inotify_rm_watch() or close(fd).

Both directories and files correspond to an inode structure in the kernel. The inotify system adds two fields to the inode structure:

#ifdef CONFIG_INOTIFY
    struct list_head    inotify_watches; /* watches on this inode */
    struct semaphore    inotify_sem;    /* protects the watches list */
#endif

inotify_watches is the list of watches on the monitored target. Each time user space calls inotify_add_watch(), the kernel creates an inotify_watch structure for the new watch and inserts it into the inotify_watches list of the inode that corresponds to the target. inotify_sem synchronizes access to the inotify_watches list. When one of the events listed in Section I occurs in the file system, the corresponding file system code explicitly calls fsnotify_* to report the event to the inotify system, where * stands for the name of the event. The current implementation includes:

  • fsnotify_move: a file was moved from one directory to another.
  • fsnotify_nameremove: a file was removed from a directory.
  • fsnotify_inoderemove: the object itself was deleted.
  • fsnotify_create: a new file was created.
  • fsnotify_mkdir: a new directory was created.
  • fsnotify_access: a file was read.
  • fsnotify_modify: a file was written.
  • fsnotify_open: a file was opened.
  • fsnotify_close: a file was closed.
  • fsnotify_xattr: a file's extended attributes were modified.
  • fsnotify_change: a file was modified or its metadata was modified.

One exception is inotify_unmount_inodes, which is called when a file system is unmounted to report the unmount event to the inotify system.

All of the notification functions above ultimately call inotify_inode_queue_event (inotify_unmount_inodes calls inotify_dev_queue_event directly). This function first checks whether the inode is being monitored by testing whether its inotify_watches list is empty; if the inode is not monitored, it returns immediately without doing anything. Otherwise it walks the inotify_watches list to see whether any watch is interested in the current file operation event. If one is, it calls inotify_dev_queue_event; otherwise it returns. inotify_dev_queue_event first checks whether the event duplicates the previous event; if so, it discards the event and returns. Otherwise it checks whether the event queue of the inotify instance, that is, of the inotify_device, has overflowed. If it has, an overflow event is generated; otherwise the file operation event is generated. These events are built by kernel_event, which creates an inotify_kernel_event structure. That structure is then inserted into the events list of the corresponding inotify_device, and the wait queue wq in the inotify_device structure is woken up. A user-space process that wants to monitor file system events calls read on the inotify instance (that is, on the file descriptor returned by inotify_init()); if there are no events, the process sleeps on the wq wait queue.

IV. Example

The following is an example of using inotify to monitor file system events:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/inotify.h>    /* inotify_init, inotify_add_watch, inotify_rm_watch */

/* Targets to monitor; each one must exist before the program starts. */
char *monitored_files[] = {
    "./tmp_file",
    "./tmp_dir",
    "/mnt/sda3/windows_file"
};

/* Maps a watch descriptor back to the path it was created for. */
struct wd_name {
    int wd;
    char *name;
};

#define WD_NUM 3
struct wd_name wd_array[WD_NUM];

/* Descriptions indexed by bit position in the event mask. */
char *event_array[] = {
    "File was accessed",
    "File was modified",
    "File attributes were changed",
    "Writable file closed",
    "Unwritable file closed",
    "File was opened",
    "File was moved from X",
    "File was moved to Y",
    "Subfile was created",
    "Subfile was deleted",
    "Self was deleted",
    "Self was moved",
    "",
    "Backing fs was unmounted",
    "Event queue overflowed",
    "File was ignored"
};

#define EVENT_NUM 16
#define MAX_BUF_SIZE 1024

int main(void)
{
    int fd;
    int wd;
    /* Aligned so the buffer can be read as struct inotify_event records. */
    char buffer[MAX_BUF_SIZE]
        __attribute__((aligned(__alignof__(struct inotify_event))));
    char *offset = NULL;
    struct inotify_event *event;
    ssize_t len;
    size_t tmp_len;
    char strbuf[16];
    int i = 0;

    fd = inotify_init();
    if (fd < 0) {
        printf("Fail to initialize inotify.\n");
        exit(-1);
    }

    for (i = 0; i < WD_NUM; i++) {
        wd_array[i].name = monitored_files[i];
        wd = inotify_add_watch(fd, wd_array[i].name, IN_ALL_EVENTS);
        if (wd < 0) {
            printf("Can't add watch for %s.\n", wd_array[i].name);
            exit(-1);
        }
        wd_array[i].wd = wd;
    }

    /* Each read returns one or more complete event records. */
    while ((len = read(fd, buffer, MAX_BUF_SIZE)) > 0) {
        offset = buffer;
        printf("Some event happens, len = %zd.\n", len);
        event = (struct inotify_event *)buffer;
        while (((char *)event - buffer) < len) {
            if (event->mask & IN_ISDIR) {
                memcpy(strbuf, "Directory", 10);
            } else {
                memcpy(strbuf, "File", 5);
            }
            printf("Object type: %s\n", strbuf);

            /* Look up the path that belongs to this watch descriptor. */
            for (i = 0; i < WD_NUM; i++) {
                if (event->wd != wd_array[i].wd)
                    continue;
                printf("Object name: %s\n", wd_array[i].name);
                break;
            }

            printf("Event mask: %08X\n", event->mask);
            for (i = 0; i < EVENT_NUM; i++) {
                if (event_array[i][0] == '\0')
                    continue;
                if (event->mask & (1 << i)) {
                    printf("Event: %s\n", event_array[i]);
                }
            }

            /* Advance past the header and the padded name. */
            tmp_len = sizeof(struct inotify_event) + event->len;
            event = (struct inotify_event *)(offset + tmp_len);
            offset += tmp_len;
        }
    }

    return 0;
}

While this program is running, execute cat ./tmp_file on another virtual terminal. The program prints:

Some event happens, len = 48.
Object type: File
Object name: ./tmp_file
Event mask: 00000020
Event: File was opened
Object type: File
Object name: ./tmp_file
Event mask: 00000001
Event: File was accessed
Object type: File
Object name: ./tmp_file
Event mask: 00000010
Event: Unwritable file closed

 
