I/O multiplexing: select and poll

Source: Internet
Author: User

Wait queue

First, some background knowledge: the wait queue.

Understanding

Definition

wait_queue_head_t wait_queue;

Initialization

init_waitqueue_head(&wait_queue);

Wait

wait_event(queue, condition): sleep until the condition becomes true.

wait_event_interruptible(queue, condition): sleep until the condition becomes true, and allow a signal to interrupt the sleep.

wait_event_timeout(queue, condition, timeout): sleep until the condition becomes true, for at most timeout (in jiffies).

wait_event_interruptible_timeout(queue, condition, timeout): the same as wait_event_timeout, except that a signal can also interrupt the sleep.

Wake up

void wake_up(wait_queue_head_t *queue); wakes up the processes blocked on the wait queue.

void wake_up_interruptible(wait_queue_head_t *queue); wakes up only the processes sleeping interruptibly on the wait queue.

Use

Assume that your device driver receives data in an interrupt handler and provides a read operation to user space.

You can do it like this:

1. To keep the description simple, synchronization is not considered.

read()
{
    if (len > 0) {
        read...
        return len;
    } else {
        return 0;    /* no data yet: return at once */
    }
}

irq_handler()
{
    recv...
    add len
}

This is a non-blocking implementation.

2. A blocking version.

read()
{
    if (wait_event_interruptible(wait_queue, len > 0)) {
        return error;    /* the sleep was interrupted by a signal */
    }
    read...
    return len;
}

irq_handler()
{
    recv...
    add len
    wake_up_interruptible(&wait_queue);
}

This is the blocking implementation built on a wait queue. When there is no data, the reader puts itself on the wait queue and sleeps. When data arrives, the interrupt handler wakes up the processes sleeping on the wait queue so that they can process it. Of course, blocking does not strictly require sleeping: the reader could also busy-wait while there is no data, but sleeping is the more elegant way.

Further Analysis

wait_event

Following wait_event(queue, condition) into the source, you will find that it defines a wait_queue_t __wait = { .private = current, .func = autoremove_wake_function, } and adds __wait to the queue, that is, to the task_list linked list of the queue.

Next, it sets the state of the current process to TASK_UNINTERRUPTIBLE and calls schedule(), which switches to another process.

A process in the TASK_UNINTERRUPTIBLE state is not picked by the scheduler, so it stays stuck at this point until it is woken. The process has left the CPU and executes nothing further; you can regard it as asleep.

wake_up

Following wake_up(queue), you will find that it traverses the task_list linked list of the queue and calls the func function of each node (of type wait_queue_t).

At this point the queue should contain the __wait added by wait_event, so wake_up calls __wait->func, that is, autoremove_wake_function.

Following autoremove_wake_function, you will find that it ends up calling try_to_wake_up, with the current value stored in __wait as the parameter. This is how another process or an interrupt handler can wake the sleeping process.

The processing inside try_to_wake_up is complicated and is not followed further here. What matters is that try_to_wake_up sets the state of the sleeping process to TASK_RUNNING, so that the process can be scheduled and run again; that is, it is woken up.

After try_to_wake_up has run, __wait is removed from the queue, and the wake_up operation is complete.
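The queue-of-callbacks idea can be modeled in user space. The following is only a sketch, not kernel code: the types, the field name task (the kernel calls it private), prepare_to_sleep, and wake_up_model are all made up for illustration, and "waking" is reduced to changing a state field.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only; the names imitate the kernel's but are simplified. */
enum task_state { TASK_RUNNING, TASK_UNINTERRUPTIBLE };

struct task { enum task_state state; };

struct wait_queue_head;

struct wait_entry {
    struct task *task;      /* task to wake (the kernel field is "private") */
    int (*func)(struct wait_queue_head *q, struct wait_entry *w);
    struct wait_entry *next;
};

struct wait_queue_head { struct wait_entry *first; };

/* Unlink w from q if it is present. */
static void remove_wait(struct wait_queue_head *q, struct wait_entry *w)
{
    struct wait_entry **pp = &q->first;
    while (*pp != NULL && *pp != w)
        pp = &(*pp)->next;
    if (*pp != NULL)
        *pp = w->next;
}

/* Models autoremove_wake_function: wake the task, then unlink the entry. */
static int autoremove_wake(struct wait_queue_head *q, struct wait_entry *w)
{
    w->task->state = TASK_RUNNING;      /* stands in for try_to_wake_up() */
    remove_wait(q, w);
    return 1;
}

/* Models the first half of wait_event: queue the entry, mark the task asleep. */
static void prepare_to_sleep(struct wait_queue_head *q, struct wait_entry *w,
                             struct task *t)
{
    assert(q != NULL && w != NULL && t != NULL);
    w->task = t;
    w->func = autoremove_wake;
    w->next = q->first;
    q->first = w;
    t->state = TASK_UNINTERRUPTIBLE;
}

/* Models wake_up: call func on every queued entry; returns how many woke. */
static int wake_up_model(struct wait_queue_head *q)
{
    int woken = 0;
    struct wait_entry *w = q->first;
    while (w != NULL) {
        struct wait_entry *next = w->next;  /* saved first: func unlinks w */
        woken += w->func(q, w);
        w = next;
    }
    return woken;
}
```

Calling prepare_to_sleep on a task and then wake_up_model on the queue leaves the task in TASK_RUNNING and the queue empty, which mirrors the sequence described above.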

Back to wait_event

We saw that the process went to sleep after calling schedule and was then woken by a wake_up from another process or an interrupt handler. Once woken, the process continues running from the point just after schedule.

Following on: after schedule returns, wait_event first checks whether the condition is true. If not, it adds __wait to the wait queue again and calls schedule to sleep once more. If so, wait_event is complete, the condition the process was waiting for is met, and the process can go on to handle the event.
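The important point is that a wakeup alone is not enough: the condition is checked again every time schedule returns. A minimal single-threaded sketch of that loop follows. It is not the kernel macro; device_model, schedule_model, and wait_event_model are hypothetical names, and the "scheduler" simply pretends that data arrives on the third wakeup.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device state: len is the amount of data available. */
struct device_model {
    int len;
    int wakeups;    /* how many times the sleeper has been woken */
};

/*
 * Stands in for schedule(): each call returns as if the process had been
 * woken. In this toy model data only shows up on the third wakeup, so the
 * first two wakeups find the condition still false.
 */
static void schedule_model(struct device_model *dev)
{
    dev->wakeups++;
    if (dev->wakeups == 3)
        dev->len = 16;
}

/* Models wait_event(queue, dev->len > 0): loop until the condition holds. */
static int wait_event_model(struct device_model *dev)
{
    assert(dev != NULL);
    while (!(dev->len > 0))     /* checked before sleeping and after waking */
        schedule_model(dev);
    return dev->wakeups;
}
```

If data is already present when wait_event_model is called, the loop body never runs and the function returns without "sleeping" at all.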

wait_event_timeout

wait_event_timeout differs from wait_event in that wait_event calls schedule, while wait_event_timeout calls schedule_timeout.

schedule_timeout also calls schedule, but before doing so it sets up a timer. When the specified timeout expires, the timer calls wake_up_process, which calls try_to_wake_up to wake the process. In other words, besides being woken by another process or an interrupt handler, wait_event_timeout also has a timer that wakes the process.

Select

We know that select can monitor multiple descriptors at the same time. If any of them has an event, select returns at once so that the event can be handled. If none has an event, select sleeps, and an event on any descriptor wakes it up. The implementation is based on wait queues. Put simply, each descriptor corresponds to a wait queue, and the driver behind each descriptor provides a poll method. select calls the poll method of each descriptor to check for events. If there is no event, a wait_queue_t object is defined and put on the descriptor's wait queue. Once the check has found no event and select has gone to sleep, an event on any descriptor wakes select through the wake-up operation on that descriptor's wait queue.

The select system call is sys_select, in fs/select.c (Linux 2.6.27 kernel). The call path is sys_select -> core_sys_select -> do_select. Next, let's look at how the select system call is implemented. There is a lot of code, so we look only at the key parts; the other details can be studied when time allows.

When using select, user space defines fd_set variables for the different kinds of events: readset, writeset, and exceptset. They are in fact arrays of unsigned long in which each bit identifies one fd. The familiar FD_SET(fd, set) sets bit number fd of the array in set to 1. When we care about events on some fds, we pass the address of the corresponding set to the kernel, asking the kernel to monitor them and report back. Looking at the definition of fd_set in the kernel, we can see that fd_set is a 1024-bit array, so it supports at most 1024 fds. Supporting more requires modifying the code and recompiling the kernel.
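The bitmap nature of fd_set is easy to see from user space. The sketch below uses only the standard FD_ZERO, FD_SET, and FD_ISSET macros; count_set_bits and mark_and_count are hypothetical helpers written for this illustration.

```c
#include <assert.h>
#include <sys/select.h>

/* Counts how many descriptors in 0..nfds-1 are marked in the set. */
static int count_set_bits(fd_set *set, int nfds)
{
    int fd, n = 0;

    assert(nfds >= 0 && nfds <= FD_SETSIZE);
    for (fd = 0; fd < nfds; fd++)
        if (FD_ISSET(fd, set))
            n++;
    return n;
}

/* Marks exactly one descriptor and reports how many bits ended up set. */
static int mark_and_count(int fd)
{
    fd_set set;

    /* Descriptors at or above FD_SETSIZE do not fit in an fd_set. */
    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&set);          /* all bits 0 */
    FD_SET(fd, &set);       /* bit number fd becomes 1 */
    return count_set_bits(&set, FD_SETSIZE);
}
```

The assert on FD_SETSIZE is the user-space face of the limit discussed above: FD_SET on a larger descriptor would write outside the array.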

In kernel space, core_sys_select first defines an array of long on the stack. If the number of fds is large and this array is not big enough, it calls kmalloc to allocate one dynamically. The array is divided into six parts: in, out, ex, res_in, res_out, and res_ex. Each part is in effect a small fd_set, but whereas fd_set has a fixed length (1024 bits; note that these are bits, not bytes), the length of each part depends on the actual number of fds.
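The sizing arithmetic can be sketched as follows. This is an illustration under assumptions, not the kernel source: the helper names are made up, and the only facts relied on are that each part holds one bit per fd, rounded up to whole longs, and that there are six parts.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG_MODEL (CHAR_BIT * sizeof(long))

/* Number of longs needed to hold one bit for each of nfds descriptors. */
static size_t bitmap_longs(size_t nfds)
{
    return (nfds + BITS_PER_LONG_MODEL - 1) / BITS_PER_LONG_MODEL;
}

/* Size in bytes of one part (in, out, ex, res_in, res_out, or res_ex). */
static size_t bitmap_bytes(size_t nfds)
{
    return bitmap_longs(nfds) * sizeof(long);
}

/* Total size in bytes of the six parts laid out back to back. */
static size_t six_bitmaps_bytes(size_t nfds)
{
    return 6 * bitmap_bytes(nfds);
}

/* Byte offset of part number index (0 = in, ..., 5 = res_ex). */
static size_t part_offset(size_t nfds, int index)
{
    assert(index >= 0 && index < 6);
    return (size_t)index * bitmap_bytes(nfds);
}
```

For example, monitoring 1024 descriptors needs 128 bytes per part and 768 bytes in total, while monitoring a single descriptor needs only one long per part.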

core_sys_select then calls get_fd_set to copy the readset, writeset, and exceptset passed from user space into in, out, and ex, and calls do_select, passing it the large array. do_select uses the bits in in, out, and ex to identify the fds to monitor and the events (read, write, and exception) to monitor on them, and records the results in res_in, res_out, and res_ex. Back in core_sys_select, set_fd_set copies the results in res_in, res_out, and res_ex to user space. When the select system call returns, the caller has the events to process.

We mentioned do_select in the previous step. Let's study it further.

First, it sets the state of the current process with set_current_state(TASK_INTERRUPTIBLE);. (I do not fully understand this. Is the kernel not preemptible? If the process is switched out after the state is set, will it never be switched back? First, nothing has been registered to wake it at this point, so no other process will wake it. Second, the CPU does not schedule processes in the TASK_INTERRUPTIBLE state. So is there no kernel preemption here, or is a process in the TASK_INTERRUPTIBLE state not preemptible?)

It then loops over the information in in, out, and ex (which fds care about read events, which about write events, and which about exception events), calls the driver's poll function for each such fd to get the event status of that fd, and sets the result in res_in, res_out, and res_ex according to the returned status. It is really quite simple: if bit n of in is 1, the descriptor fd = n cares about read events; after the driver's poll for fd = n has been called, if there is a read event, bit n of res_in is set to 1.
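A reduced user-space model of that scan, for read events only, might look like this. It is a sketch under assumptions: a single unsigned long stands in for the bitmap, poll_fn is a hypothetical stand-in for a driver's poll method, and even_fds_readable is a toy "driver".

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* One unsigned long is enough for this sketch. */
#define MAX_FDS (CHAR_BIT * sizeof(unsigned long))

/* Hypothetical stand-in for a driver's poll method: nonzero means readable. */
typedef int (*poll_fn)(int fd);

/*
 * For every fd whose bit is set in in, call poll and record a read event
 * by setting the same bit in *res_in. Returns the number of ready fds.
 */
static int scan_readers(unsigned long in, int nfds, poll_fn poll,
                        unsigned long *res_in)
{
    int fd, ready = 0;

    assert(nfds >= 0 && (size_t)nfds <= MAX_FDS);
    *res_in = 0;
    for (fd = 0; fd < nfds; fd++) {
        unsigned long bit = 1UL << fd;

        if (!(in & bit))
            continue;           /* nobody asked about this fd */
        if (poll(fd)) {
            *res_in |= bit;     /* bit n of res_in marks fd = n as readable */
            ready++;
        }
    }
    return ready;
}

/* Toy driver: only even-numbered descriptors have data. */
static int even_fds_readable(int fd)
{
    return fd % 2 == 0;
}
```

With in = 0x0E (fds 1, 2, and 3 watched) and the toy driver above, only fd 2 is reported, so res_in ends up as 0x04.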

(What does the cond_resched function do?)

After one round of processing (all the requests in in, out, and ex have been handled), if any requested event has occurred, do_select returns. If none has occurred, it calls schedule_timeout to sleep until an event arrives.

Now let's see how do_select is woken when an event occurs. Before that, consider how we would implement it ourselves with a wait queue. Normally we would define a wait queue, wait_queue_head_t queue. When select finds no event, it defines a wait_queue_t object wait, adds wait to queue, and then calls schedule to sleep. When an event arrives, the driver traverses the entries waiting in queue and wakes them up. The kernel implementation is in fact like this. A driver that supports blocking I/O usually defines three wait queues, for read, write, and exception. When select calls the driver's poll and there is no event, a wait_queue_t wait is defined and added to the wait queue. When the driver detects that an event has occurred, it wakes up the processes sleeping on the wait queue.

Next, let's look at the preparations select makes before going to sleep, and how wait is added to the wait queue.

First, let's take a look at a data structure used in do_select.

struct poll_wqueues {
    poll_table pt;
    struct poll_table_page *table;
    struct task_struct *polling_task;
    int triggered;
    int error;
    int inline_index;
    struct poll_table_entry inline_entries[N_INLINE_POLL_ENTRIES];
};

do_select declares an object table of this type and initializes its members: polling_task = current and pt.qproc = __pollwait.

Next, when calling the driver's poll for each fd, it passes table.pt (of type poll_table) as the parameter.

We know that the poll method implemented by each driver calls poll_wait so that, if the read, write, or exception event has not happened yet, the caller can be woken later. The wait_address parameter is the wait queue declared in the driver, and p is the table.pt passed in when poll was called. The following is the implementation of poll_wait:

static inline void poll_wait(struct file *filp, wait_queue_head_t *wait_address, poll_table *p)
{
    p->qproc(filp, wait_address, p);
}

We can see that the p->qproc called in poll_wait is the __pollwait function specified when poll_wqueues was initialized.

static void __pollwait(struct file *filp, wait_queue_head_t *wait_address, poll_table *p);

struct poll_table_entry {
    struct file *filp;
    unsigned long key;
    wait_queue_t wait;
    wait_queue_head_t *wait_address;
};

__pollwait first obtains a variable entry of type poll_table_entry, which actually comes from inline_entries of poll_wqueues. It then initializes entry: entry->filp = filp; entry->key = p->key; entry->wait.func = pollwake; and finally adds entry->wait to the wait queue wait_address.
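The indirection through qproc is what lets a driver register the caller without knowing who the caller is. Here is a user-space model of that handshake. Everything in it is hypothetical: the _model types are drastically cut down, "adding to the wait queue" is reduced to incrementing a counter, and the toy driver calls poll_wait_model unconditionally before reporting its status.

```c
#include <assert.h>
#include <stddef.h>

#define N_ENTRIES 4     /* arbitrary size chosen for this sketch */

struct wait_queue_head_model { int nwaiters; };

struct poll_table_model;
typedef void (*qproc_fn)(struct wait_queue_head_model *q,
                         struct poll_table_model *p);

struct poll_table_model { qproc_fn qproc; };

struct poll_entry_model { struct wait_queue_head_model *wait_address; };

struct poll_wqueues_model {
    struct poll_table_model pt;     /* first member: see the cast below */
    int inline_index;
    struct poll_entry_model inline_entries[N_ENTRIES];
};

/* Models __pollwait: take the next free entry and hook it to the queue. */
static void pollwait_model(struct wait_queue_head_model *q,
                           struct poll_table_model *p)
{
    /* pt is the first member, so p also points at the enclosing object. */
    struct poll_wqueues_model *pwq = (struct poll_wqueues_model *)p;

    assert(pwq->inline_index < N_ENTRIES);
    pwq->inline_entries[pwq->inline_index++].wait_address = q;
    q->nwaiters++;
}

/* Models poll_wait: just forward to whatever callback the caller installed. */
static void poll_wait_model(struct wait_queue_head_model *q,
                            struct poll_table_model *p)
{
    if (p != NULL && q != NULL)
        p->qproc(q, p);
}

/* Toy driver poll method: register on the read queue, then report status. */
static int driver_poll_model(struct wait_queue_head_model *read_queue,
                             int have_data, struct poll_table_model *p)
{
    poll_wait_model(read_queue, p);
    return have_data;
}
```

After one call to driver_poll_model with no data, the table has used one entry and the driver's queue has one waiter, which is the state do_select is in just before it sleeps.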

All preparations are now complete. If no event has occurred, do_select calls schedule and goes to sleep.

The wake-up is generally done in an interrupt handler or a softirq. When the driver detects an event, it calls wake_up with the wait queue defined in the driver as the parameter.

Tracing wake_up, you find that it calls __wake_up_common. This function traverses the nodes in wait_queue_head_t; each node is of type wait_queue_t, and the function pointed to by each node's func pointer is called. We know that func points to pollwake, and pollwake ends up calling try_to_wake_up to wake the process.

pollwake -> __pollwake -> default_wake_function -> try_to_wake_up

wait_queue_t records the task_struct of the process to wake, so this chain of calls wakes the sleeping process.

Poll

The poll and select flows are basically the same. The call path is sys_poll -> do_sys_poll -> do_poll -> do_pollfd.

do_sys_poll copies the pollfd array from user space to kernel space and initializes a poll_wqueues table object, which is used the same way as in select. It calls do_poll to get the status of the monitored fds, copies the status back to user space, and returns.

Like do_select, do_poll queries the events and goes to sleep if there are none. The difference is that do_poll works on pollfd entries, whereas do_select works on the individual bits of the long arrays.

do_pollfd makes the call to the driver's poll and records the status in pollfd.

 

Let's look at the differences between select and poll.

select uses fd_set to record the descriptors to check. The structure itself is 1024 bits, which limits select to checking at most 1024 descriptors.

poll uses an array of pollfd structures to record the descriptors to check, and the size of the array is passed in as a separate argument.

struct pollfd {
    int fd;
    short events;
    short revents;
};

The fd_set used by select serves as both input and output. On return, the results overwrite the information passed in for the system call, so the fd_set must be set up again before every call to select.
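The overwrite is easy to observe. In the sketch below, bit_survives_idle_select is a hypothetical helper that watches the read end of an empty pipe with a zero timeout and then checks whether the descriptor is still marked in the set.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Returns 1 if the descriptor is still in the set after select(),
 * 0 if select() cleared it, and -1 on error.
 */
static int bit_survives_idle_select(void)
{
    int fds[2];
    fd_set readfds;
    struct timeval tv = { 0, 0 };   /* zero timeout: do not block */
    int rc, still_set;

    if (pipe(fds) != 0)
        return -1;
    assert(fds[0] < FD_SETSIZE);

    FD_ZERO(&readfds);
    FD_SET(fds[0], &readfds);       /* ask about the empty pipe */
    rc = select(fds[0] + 1, &readfds, NULL, NULL, &tv);
    still_set = FD_ISSET(fds[0], &readfds) ? 1 : 0;

    close(fds[0]);
    close(fds[1]);
    return rc == 0 ? still_set : -1;
}
```

Because nothing was readable, select() returns 0 and the bit is cleared. A loop that calls select() again without repeating FD_ZERO and FD_SET would no longer be watching the pipe.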

poll uses the pollfd structure: events records the events to detect and revents records the results, so pollfd needs to be initialized only once. Later poll calls do not need to reinitialize pollfd.
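A small sketch of that reuse, with a pipe as the monitored descriptor. poll_now and poll_pipe_twice are hypothetical helpers; the pollfd is filled in once and passed to poll() twice without being touched in between.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Polls one descriptor without blocking; returns what poll() returns. */
static int poll_now(struct pollfd *pfd)
{
    assert(pfd != NULL);
    return poll(pfd, 1, 0);
}

/*
 * Returns 1 if the same pollfd works for two calls in a row
 * (nothing ready, then readable), 0 if not, and -1 on error.
 */
static int poll_pipe_twice(void)
{
    int fds[2];
    struct pollfd pfd;
    int first, second, ok;

    if (pipe(fds) != 0)
        return -1;

    pfd.fd = fds[0];
    pfd.events = POLLIN;    /* set once, never touched again */
    pfd.revents = 0;

    first = poll_now(&pfd);             /* pipe is empty: nothing ready */
    ok = (first == 0 && pfd.events == POLLIN);

    if (write(fds[1], "x", 1) != 1)
        ok = 0;
    second = poll_now(&pfd);            /* same pollfd, not reinitialized */
    ok = ok && second == 1 && (pfd.revents & POLLIN) != 0;

    close(fds[0]);
    close(fds[1]);
    return ok;
}
```

The kernel writes only revents, so events still holds POLLIN on the second call.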

 

I have written this much without noticing. epoll can be explored another time.

Since looking up material, reading the code, and writing it up is itself a learning process, the line of thought here is a little disjointed. Criticism is welcome. I will tidy this up and think it through again later.


What is I/O multiplexing in Linux network programming? How should I use it?

In Linux network programming, I/O multiplexing is provided by the system through the select() function, which picks out the active descriptors so that they can be operated on.
For example, suppose a process has multiple client connections, that is, multiple TCP socket descriptors. select() blocks until any of the descriptors becomes active, that is, has data to transfer. This keeps the process from waiting for data on one connection while being unable to handle the others. It is therefore a form of time-division multiplexing; from the user's point of view, it achieves concurrent processing within one process or thread.
The biggest advantage of I/O multiplexing is its low system overhead: the system does not need to create extra processes or threads or maintain them, which greatly reduces overhead.
select() implements I/O multiplexing by letting a process tell the kernel to wait for any one of multiple events and to wake the process only when one or more of them occur or after a specified time has passed.
Its prototype is as follows:
#include <sys/time.h>
int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *errorfds, struct timeval *timeout);
nfds: the range of descriptors monitored by select(). It depends on the descriptors open in the process and is generally set to the highest-numbered monitored descriptor plus 1.
readfds: the set of descriptors that select() monitors for readability.
writefds: the set of descriptors that select() monitors for writability.
errorfds: the set of descriptors that select() monitors for exceptional conditions.
timeout: the timeout after which select() gives up waiting.
Return value: on success, the total number of bits set, which correspond to the ready descriptors; otherwise -1 is returned and errno is set.
FD_ZERO(fd_set *fdset): clears fdset so that it contains no descriptors.
FD_SET(int fd, fd_set *fdset): adds the descriptor fd to fdset.
FD_CLR(int fd, fd_set *fdset): removes the descriptor fd from fdset.
FD_ISSET(int fd, fd_set *fdset): checks whether the descriptor fd is a member of fdset. After select() returns, a nonzero result means that fd is ready (readable or writable, depending on the set).
The basic steps for implementing I/O multiplexing with select() are as follows:
(1) Clear the descriptor set.
(2) Add the descriptors to be monitored to the descriptor set.
(3) Call the select() function.
(4) Check all the monitored descriptors, using FD_ISSET to determine which are ready.
(5) Perform I/O on the ready descriptors.
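
The five steps can be sketched as follows, using pipes in place of TCP sockets so that the example is self-contained. first_readable and demo_two_pipes are hypothetical helpers written for this illustration; only the read set is used.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Returns the first readable descriptor in fds, or -1 on timeout or error. */
static int first_readable(const int *fds, size_t count, struct timeval *timeout)
{
    fd_set readfds;
    int maxfd = -1;
    size_t i;

    FD_ZERO(&readfds);                          /* (1) clear the set */
    for (i = 0; i < count; i++) {               /* (2) add each descriptor */
        assert(fds[i] >= 0 && fds[i] < FD_SETSIZE);
        FD_SET(fds[i], &readfds);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    /* (3) nfds is the highest descriptor plus 1 */
    if (select(maxfd + 1, &readfds, NULL, NULL, timeout) <= 0)
        return -1;
    for (i = 0; i < count; i++)                 /* (4) find a ready one */
        if (FD_ISSET(fds[i], &readfds))
            return fds[i];
    return -1;
}

/* Two pipes are watched; data is written only to the second one. */
static int demo_two_pipes(void)
{
    int a[2], b[2], watched[2], fd, ok = 0;
    char c = 0;
    struct timeval tv = { 1, 0 };               /* wait at most one second */

    if (pipe(a) != 0)
        return -1;
    if (pipe(b) != 0) {
        close(a[0]);
        close(a[1]);
        return -1;
    }
    watched[0] = a[0];
    watched[1] = b[0];
    if (write(b[1], "k", 1) == 1) {
        fd = first_readable(watched, 2, &tv);
        /* (5) do the I/O on the descriptor that select() reported */
        ok = (fd == b[0] && read(fd, &c, 1) == 1 && c == 'k');
    }
    close(a[0]);
    close(a[1]);
    close(b[0]);
    close(b[1]);
    return ok;
}
```

A real server would run steps (1) to (5) in a loop and rebuild the set on every pass, since select() overwrites it.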

How many files can the select and poll functions monitor at most at one time?

On a 64-bit system, select can monitor 1024 descriptors at a time. One addition: this threshold can be changed by technical means. The poll function has no such limit.
