Source code analysis of the epoll_create implementation

Source: Internet
Author: User

Last night I analyzed poll and found, reading the code, that its operations have been optimized in many places. epoll is short for eventpoll and is very efficient, so today let's look at how it is implemented. The implementation lives in fs/eventpoll.c and runs to more than 1500 lines of code.

As we all know, epoll is built on three system calls, which the C library wraps as the following three functions:

1. int epoll_create(int size);

2. int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);

3. int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout);
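
To see how the three calls fit together from user space, here is a minimal sketch, assuming a Linux system. The helper name count_ready_events and the use of a pipe as the monitored file are my own choices for illustration and do not come from the kernel source discussed in this post.

```c
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Create an epoll instance, watch the read end of a pipe, make it readable,
 * and return how many events epoll_wait() reports (or -1 on any failure).
 */
int count_ready_events(void)
{
    int pipefd[2];
    int epfd, nready = -1;
    struct epoll_event ev = { 0 };
    struct epoll_event events[4];

    if (pipe(pipefd) < 0)
        return -1;

    /* 1. epoll_create: the size argument is a hint, but must be positive. */
    epfd = epoll_create(1);
    if (epfd < 0)
        goto out_pipe;

    /* 2. epoll_ctl: register interest in "readable" on the pipe's read end. */
    ev.events = EPOLLIN;
    ev.data.fd = pipefd[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, pipefd[0], &ev) < 0)
        goto out_epfd;

    /* Make the read end readable. */
    if (write(pipefd[1], "x", 1) != 1)
        goto out_epfd;

    /* 3. epoll_wait: collect ready events, waiting at most 1000 ms. */
    nready = epoll_wait(epfd, events, 4, 1000);

out_epfd:
    close(epfd);
out_pipe:
    close(pipefd[0]);
    close(pipefd[1]);
    return nready;
}
```

Since only one descriptor is registered and it has data waiting, a call to count_ready_events() should report a single event.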

That is a lot of epoll source code, so we will simply follow it one call at a time. Today we take the first one: epoll_create.

Its kernel entry point is sys_epoll_create:

/*
 * It opens an eventpoll file descriptor by suggesting a storage of "size"
 * file descriptors. The size parameter is just an hint about how to size
 * data structures. It won't prevent the user to store more than "size"
 * file descriptors inside the epoll interface. It is the kernel part of
 * the userspace epoll_create(2).
 */
asmlinkage long sys_epoll_create(int size)
{
	int error, fd;
	struct inode *inode;
	struct file *file;

	DNPRINTK(3, (KERN_INFO "[%p] eventpoll: sys_epoll_create(%d)\n",
		     current, size));

	/* Sanity check on the size parameter */
	error = -EINVAL;
	if (size <= 0)
		goto eexit_1;

	/*
	 * Creates all the items needed to setup an eventpoll file. That is,
	 * a file structure, and inode and a free file descriptor.
	 */
	error = ep_getfd(&fd, &inode, &file);	//(1)
	if (error)
		goto eexit_1;

	/* Setup the file internal data structure ( "struct eventpoll" ) */
	error = ep_file_init(file);	//(2)
	if (error)
		goto eexit_2;

	DNPRINTK(3, (KERN_INFO "[%p] eventpoll: sys_epoll_create(%d) = %d\n",
		     current, size, fd));

	return fd;

eexit_2:
	sys_close(fd);
eexit_1:
	DNPRINTK(3, (KERN_INFO "[%p] eventpoll: sys_epoll_create(%d) = %d\n",
		     current, size, error));
	return error;
}
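
The sanity check at the top of sys_epoll_create can be observed from user space. The following is a minimal sketch, assuming a Linux system; try_epoll_create is a helper name made up for this example. It returns 0 when epoll_create() succeeds and the errno value when it fails.

```c
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Call epoll_create() with the given size and return 0 on success or the
 * errno value on failure. Any descriptor obtained is closed again.
 */
int try_epoll_create(int size)
{
    int epfd = epoll_create(size);

    if (epfd < 0)
        return errno;
    close(epfd);
    return 0;
}
```

With this helper, a non-positive size such as 0 or -1 is expected to give EINVAL, matching the `size <= 0` branch above, while any positive size should succeed.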


(1) This calls the ep_getfd function. From the comment we know that it creates the file used by eventpoll. A file here consists of a file descriptor, an inode, and a file object, which are exactly the three parameters we pass in. Without further ado, here is the source:


/*
 * Creates the file descriptor to be used by the epoll interface.
 */
static int ep_getfd(int *efd, struct inode **einode, struct file **efile)
{
	struct qstr this;
	char name[32];
	struct dentry *dentry;
	struct inode *inode;
	struct file *file;
	int error, fd;

	/* Get an ready to use file */
	error = -ENFILE;
	file = get_empty_filp();
	if (!file)
		goto eexit_1;

	/* Allocates an inode from the eventpoll file system */
	inode = ep_eventpoll_inode();
	error = PTR_ERR(inode);
	if (IS_ERR(inode))
		goto eexit_2;

	/* Allocates a free descriptor to plug the file onto */
	error = get_unused_fd();
	if (error < 0)
		goto eexit_3;
	fd = error;

	/*
	 * Link the inode to a directory entry by creating a unique name
	 * using the inode number.
	 */
	error = -ENOMEM;
	sprintf(name, "[%lu]", inode->i_ino);
	this.name = name;
	this.len = strlen(name);
	this.hash = inode->i_ino;
	dentry = d_alloc(eventpoll_mnt->mnt_sb->s_root, &this);
	if (!dentry)
		goto eexit_4;
	dentry->d_op = &eventpollfs_dentry_operations;
	d_add(dentry, inode);
	file->f_vfsmnt = mntget(eventpoll_mnt);
	file->f_dentry = dentry;
	file->f_mapping = inode->i_mapping;

	file->f_pos = 0;
	file->f_flags = O_RDONLY;
	file->f_op = &eventpoll_fops;
	file->f_mode = FMODE_READ;
	file->f_version = 0;
	file->private_data = NULL;

	/* Install the new setup file into the allocated fd. */
	fd_install(fd, file);

	*efd = fd;
	*einode = inode;
	*efile = file;
	return 0;

eexit_4:
	put_unused_fd(fd);
eexit_3:
	iput(inode);
eexit_2:
	put_filp(file);
eexit_1:
	return error;
}


The comments in this function are quite complete, so we will only go over it briefly: it calls too many other functions, each needing too much background, to list all of their code one by one. Still, the function is a classic, because it shows the whole process of creating a file.

First we need a file struct, which the kernel allocates for us via get_empty_filp(). Next we need an inode, obtained by calling ep_eventpoll_inode(). Then get_unused_fd() gives us a free file descriptor. After that, d_alloc() obtains a dentry, and d_add(dentry, inode) adds the dentry to the dentry hash and binds it to the inode. The fields of the file object are then filled in. Finally, fd_install(fd, file) registers the file with the process, which is how the file descriptor becomes associated with the file object. If any step fails, the eexit_ labels release whatever has already been acquired, in reverse order.
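
The acquire-in-order, release-in-reverse structure of ep_getfd can be tried outside the kernel. Here is a minimal user-space sketch of the same goto-based unwinding pattern. Everything in it (setup_three, teardown_three, live_resource_count, and the malloc-backed toy resources) is hypothetical and only stands in for the file, inode, and descriptor of the real function.

```c
#include <stdlib.h>

/* Number of toy resources currently held; lets a caller check for leaks. */
static int live_resources;

static void *acquire(int should_fail)
{
    void *p = should_fail ? NULL : malloc(1);

    if (p)
        live_resources++;
    return p;
}

static void release(void *p)
{
    free(p);
    live_resources--;
}

int live_resource_count(void)
{
    return live_resources;
}

/*
 * Acquire three resources in order. fail_at (1..3) forces that step to
 * fail; 0 means no forced failure. Returns 0 on success or -step on
 * failure, after releasing everything acquired before the failing step.
 */
int setup_three(int fail_at, void **a, void **b, void **c)
{
    int error;
    void *x, *y, *z;

    error = -1;
    x = acquire(fail_at == 1);
    if (!x)
        goto exit_1;

    error = -2;
    y = acquire(fail_at == 2);
    if (!y)
        goto exit_2;

    error = -3;
    z = acquire(fail_at == 3);
    if (!z)
        goto exit_3;

    *a = x;
    *b = y;
    *c = z;
    return 0;

exit_3:
    release(y);     /* undo step 2 */
exit_2:
    release(x);     /* undo step 1 */
exit_1:
    return error;
}

/* Release the three resources of a successful setup, newest first. */
void teardown_three(void *a, void *b, void *c)
{
    release(c);
    release(b);
    release(a);
}
```

Each label undoes exactly one earlier step and then falls through to the next, so a failure at step N releases steps N-1 down to 1 and nothing else. That is the same shape as eexit_4 through eexit_1 in ep_getfd.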

(2) Before tracing the ep_file_init function, let's take a look at the eventpoll struct:

/*
 * This structure is stored inside the "private_data" member of the file
 * structure and rapresent the main data sructure for the eventpoll
 * interface.
 */
struct eventpoll {
	/* Protect the this structure access */
	rwlock_t lock;

	/*
	 * This semaphore is used to ensure that files are not removed
	 * while epoll is using them. This is read-held during the event
	 * collection loop and it is write-held during the file cleanup
	 * path, the epoll file exit code and the ctl operations.
	 */
	struct rw_semaphore sem;

	/* Wait queue used by sys_epoll_wait() */
	wait_queue_head_t wq;

	/* Wait queue used by file->poll() */
	wait_queue_head_t poll_wait;

	/* List of ready file descriptors */
	struct list_head rdllist;

	/* RB-Tree root used to store monitored fd structs */
	struct rb_root rbr;
};

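The comments are again quite clear. This eventpoll struct is the core of epoll. It stores the file descriptors you want to monitor in a red-black tree (rbr) and keeps the ones that are ready on a separate list (rdllist), so the ready ones can be collected without rescanning every monitored descriptor. That is a large part of why epoll is efficient.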
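
The split between "monitored" and "ready" is also visible from user space: epoll_wait() hands back only the descriptors that have something to report. Below is a minimal sketch, assuming a Linux system; find_ready_pipe and NPIPES are names invented for this example. It watches several pipes, writes to one of them, and checks which one is reported.

```c
#include <sys/epoll.h>
#include <unistd.h>

#define NPIPES 3

/*
 * Watch NPIPES pipes, write to pipe number `which`, and return the index
 * of the single pipe epoll_wait() reports. Returns -1 on failure or if the
 * number of reported events is not exactly one.
 */
int find_ready_pipe(int which)
{
    int fds[NPIPES][2];
    struct epoll_event ev = { 0 };
    struct epoll_event events[NPIPES];
    int epfd, i, n, opened = 0, result = -1;

    if (which < 0 || which >= NPIPES)
        return -1;

    epfd = epoll_create(NPIPES);
    if (epfd < 0)
        return -1;

    for (i = 0; i < NPIPES; i++) {
        if (pipe(fds[i]) < 0)
            goto out;
        opened++;
        ev.events = EPOLLIN;
        ev.data.u32 = (unsigned int)i;  /* remember which pipe this is */
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[i][0], &ev) < 0)
            goto out;
    }

    /* Only one of the monitored descriptors becomes readable. */
    if (write(fds[which][1], "x", 1) != 1)
        goto out;

    n = epoll_wait(epfd, events, NPIPES, 1000);
    if (n == 1)
        result = (int)events[0].data.u32;

out:
    for (i = 0; i < opened; i++) {
        close(fds[i][0]);
        close(fds[i][1]);
    }
    close(epfd);
    return result;
}
```

All three pipes are registered, yet the caller never has to scan the idle ones itself: the value stored in data when the descriptor was added comes straight back with the event.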

Okay. Let's go back to the sys_epoll_create function and trace into ep_file_init:

static int ep_file_init(struct file *file)
{
	struct eventpoll *ep;

	if (!(ep = kmalloc(sizeof(struct eventpoll), GFP_KERNEL)))
		return -ENOMEM;

	memset(ep, 0, sizeof(*ep));
	rwlock_init(&ep->lock);
	init_rwsem(&ep->sem);
	init_waitqueue_head(&ep->wq);
	init_waitqueue_head(&ep->poll_wait);
	INIT_LIST_HEAD(&ep->rdllist);
	ep->rbr = RB_ROOT;

	file->private_data = ep;

	DNPRINTK(3, (KERN_INFO "[%p] eventpoll: ep_file_init() ep=%p\n",
		     current, ep));
	return 0;
}


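It simply allocates the eventpoll struct, zeroes it, initializes its locks, wait queues, ready list, and red-black tree root, and stores it in file->private_data.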
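
The same allocate, zero, initialize, attach sequence can be mimicked in user space. The sketch below is only an analogy: every toy_ type and function is a hypothetical stand-in for its kernel counterpart, malloc replaces kmalloc, and the locks and wait queues are left out.

```c
#include <stdlib.h>
#include <string.h>

/* Minimal stand-ins for the kernel types; hypothetical, for illustration. */
struct toy_list_head {
    struct toy_list_head *next, *prev;
};

struct toy_eventpoll {
    struct toy_list_head rdllist;   /* ready list, empty after init */
    void *rb_root;                  /* stands in for struct rb_root */
};

struct toy_file {
    void *private_data;
};

/* An empty circular list points at itself, as INIT_LIST_HEAD() arranges. */
static void toy_init_list_head(struct toy_list_head *head)
{
    head->next = head;
    head->prev = head;
}

int toy_list_empty(const struct toy_list_head *head)
{
    return head->next == head;
}

/* Allocate, zero and initialize the per-instance state, then attach it. */
int toy_file_init(struct toy_file *file)
{
    struct toy_eventpoll *ep = malloc(sizeof(*ep));

    if (!ep)
        return -1;                  /* the kernel code returns -ENOMEM */
    memset(ep, 0, sizeof(*ep));
    toy_init_list_head(&ep->rdllist);
    ep->rb_root = NULL;             /* empty tree, in the spirit of RB_ROOT */
    file->private_data = ep;
    return 0;
}

/* Free the attached state and clear the pointer. */
void toy_file_release(struct toy_file *file)
{
    free(file->private_data);
    file->private_data = NULL;
}
```

After toy_file_init() the ready list is empty and the tree has no nodes, which mirrors the state of a freshly created epoll instance before anything has been registered on it.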

That is roughly all there is to sys_epoll_create. We will look at sys_epoll_ctl tomorrow.

 
