Windows Core Programming, 5th Edition, Reading Notes: Chapter 10, Synchronous and Asynchronous Device I/O

Last Update:2018-12-05 Source: Internet

Author: User



Asynchronous device I/O Basics

Suppose a thread issues an asynchronous I/O request to a device. The request is passed to the device driver, which is responsible for carrying out the actual I/O. While the driver waits for the device to respond, the application thread is not suspended waiting for the request to complete; it keeps running and can do other useful work. At some point the device driver finishes processing the queued I/O request. It must then notify the application that the data has been sent, that the data has been received, or that an error has occurred. Receiving this notification is called receiving an I/O completion notification.

I/O request completion notifications

Methods for receiving I/O completion notifications

Technology	Summary
Signaling a device kernel object	Not useful when multiple I/O requests are issued to one device at the same time. Allows one thread to issue an I/O request and another thread to process the result.
Signaling an event kernel object	Allows multiple simultaneous I/O requests to one device. Allows one thread to issue an I/O request and another thread to process the result.
Using alertable I/O	Allows multiple simultaneous I/O requests to one device. The thread that issued an I/O request must also process the result.
Using I/O completion ports	Allows multiple simultaneous I/O requests to one device. Allows one thread to issue an I/O request and another thread to process the result. This technique is highly scalable and the most flexible of the four.

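Signaling a device kernel object

Before queuing an I/O request, the ReadFile and WriteFile functions set the device kernel object to the nonsignaled state. When the device driver completes the request, it sets the device kernel object to the signaled state. A thread can therefore wait on the device kernel object (for example with WaitForSingleObject) to learn that the request has completed. Because the object only reports that some request completed, not which one, this method does not work when several requests are outstanding on the same device.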

Signaling an event kernel object

hEvent, the last member of the OVERLAPPED structure used for asynchronous I/O, identifies an event kernel object. When an asynchronous I/O request completes, the device driver checks whether the hEvent member of the request's OVERLAPPED structure is NULL. If it is not NULL, the driver signals the event by calling SetEvent. Just as with the device kernel object, a thread can wait on the hEvent event object to learn that the request has completed.

Alertable I/O

When the system creates a thread, it also creates a queue associated with that thread. This queue is called the asynchronous procedure call (APC) queue. When we issue an I/O request, we can tell the device driver to append the I/O completion notification to the APC queue of the calling thread. To do so, we call the ReadFileEx and WriteFileEx functions:

BOOL ReadFileEx(
    HANDLE hFile,
    PVOID pvBuffer,
    DWORD nNumBytesToRead,
    OVERLAPPED* pOverlapped,
    LPOVERLAPPED_COMPLETION_ROUTINE pfnCompletionRoutine);

BOOL WriteFileEx(
    HANDLE hFile,
    CONST VOID *pvBuffer,
    DWORD nNumBytesToWrite,
    OVERLAPPED* pOverlapped,
    LPOVERLAPPED_COMPLETION_ROUTINE pfnCompletionRoutine);

The most important parameter of both functions is the last one, pfnCompletionRoutine, which takes the address of a callback function. This callback is called the completion routine. Its prototype must have the following form:

VOID WINAPI CompletionRoutine(DWORD dwError, DWORD dwNumBytes, OVERLAPPED* po);

When a thread is in an alertable state, the system checks its APC queue. For each entry in the queue, the system calls the completion routine and passes it the I/O error code, the number of bytes transferred, and the address of the OVERLAPPED structure. Windows provides six functions that can place a thread in an alertable state: SleepEx, WaitForSingleObjectEx, WaitForMultipleObjectsEx, SignalObjectAndWait, GetQueuedCompletionStatusEx, and MsgWaitForMultipleObjectsEx.

When any of these functions is called and the thread's APC queue contains at least one entry, the thread does not go to sleep. The function suspends the thread only when the APC queue is empty. A suspended thread wakes when the kernel object (or objects) it is waiting on becomes signaled or when an entry appears in its APC queue. Because the thread is in an alertable state, as soon as an entry appears in the APC queue the system wakes the thread and empties the queue by calling the callback functions. The function then returns immediately; the thread does not go back to sleep to wait for the kernel object to be signaled.
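
The wake-and-drain rule described here can be modeled in a few lines of portable C. This is a sketch, not Windows API code: ApcQueue, apc_enqueue, alertable_wait, and add_bytes are hypothetical names, and a fixed-size array stands in for the per-thread queue that the system maintains. It only illustrates that an alertable wait calls every queued completion routine and then returns instead of sleeping again.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one completed request: the three values the
   system passes to a completion routine. */
typedef struct {
    unsigned long error;      /* I/O error code */
    unsigned long num_bytes;  /* number of bytes transferred */
    void *overlapped;         /* address of the request's OVERLAPPED */
} ApcEntry;

typedef void (*CompletionFn)(unsigned long error, unsigned long num_bytes,
                             void *overlapped);

#define APC_CAPACITY 16

/* Hypothetical model of a thread's APC queue (not a Windows type). */
typedef struct {
    ApcEntry entries[APC_CAPACITY];
    CompletionFn routines[APC_CAPACITY];
    size_t count;
} ApcQueue;

/* Model of the driver appending a completion notification to the queue. */
static void apc_enqueue(ApcQueue *q, CompletionFn fn, ApcEntry e) {
    assert(q->count < APC_CAPACITY);
    q->routines[q->count] = fn;
    q->entries[q->count] = e;
    q->count++;
}

/* Model of an alertable wait. Calls every queued routine in order and
   returns how many were called. A result of 0 means the queue was empty,
   which is the only case in which a real thread would be suspended. */
static size_t alertable_wait(ApcQueue *q) {
    size_t called = q->count;
    for (size_t i = 0; i < called; i++) {
        ApcEntry *e = &q->entries[i];
        q->routines[i](e->error, e->num_bytes, e->overlapped);
    }
    q->count = 0;  /* queue drained: the wait returns, no second sleep */
    return called;
}

/* Example routine: on success, adds the byte count to a running total
   that the caller reaches through the overlapped pointer. */
static void add_bytes(unsigned long error, unsigned long num_bytes,
                      void *overlapped) {
    if (error == 0) {
        *(unsigned long *)overlapped += num_bytes;
    }
}
```

For example, after enqueuing two successful entries of 100 and 28 bytes that both point at the same counter and use add_bytes as the routine, one call to alertable_wait returns 2 and leaves 128 in the counter. A second call returns 0, the case in which a real thread would be suspended.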

Advantages and disadvantages of alertable I/O

Callback functions: alertable I/O requires us to write callback functions, which makes the code more complex. Because these callbacks usually lack enough context about the particular request, we end up keeping a lot of information in global variables.
Thread problems: the more serious problem with alertable I/O is that the thread that issued an I/O request must also handle its completion notification. If one thread issues several requests, that thread must respond to every completion notification, even while other threads sit completely idle. Because there is no load balancing, the application does not scale well.

I/O completion port

1. Create an I/O completion port

The theory behind the I/O completion port is that the number of threads running concurrently must have an upper bound. In other words, 500 simultaneous client requests should not result in 500 runnable threads. Once the number of runnable threads exceeds the number of available CPUs, the system has to spend time on context switches between threads, which wastes valuable CPU cycles. This is a potential disadvantage of the concurrent model.

Another disadvantage of the concurrent model is that a new thread has to be created for each client request, which still carries a cost. If the application creates a thread pool during initialization and keeps those threads available for as long as it runs, the performance of the service application improves. The I/O completion port is designed to work with a thread pool.

2. Architecture around the I/O completion port

When the service application initializes, it should create the I/O completion port and then create a thread pool to process client requests. As a rule of thumb, the number of threads in the pool should be twice the number of CPUs on the host.

3. How the I/O completion port manages the thread pool (working mechanism)

First, when the I/O completion port is created, we must specify the number of threads allowed to run concurrently. We usually set this value to the number of CPUs on the host (note that it differs from the number of threads in the thread pool). When completed I/O items are queued, the I/O completion port wants to wake waiting threads, but it never wakes more threads than the number we specified. So with a concurrency value of 2, if four I/O requests have completed and four threads are waiting in GetQueuedCompletionStatus, the I/O completion port wakes only two threads and lets the other two keep sleeping. When a thread finishes processing a completed I/O item, it calls GetQueuedCompletionStatus again. The system then sees that more items are in the queue and wakes the same thread to process them.
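
The sizing rules and the wake limit can be written down as simple arithmetic. The following is a minimal sketch in portable C under stated assumptions: PortSizing, sizing_for, and threads_to_wake are hypothetical names, not part of the Windows API, and the formula is a simplified reading of the behaviour described in these notes rather than the system's actual implementation.

```c
#include <assert.h>

/* Hypothetical pair of values an application chooses at startup. */
typedef struct {
    int concurrency;   /* threads allowed to run at once: number of CPUs */
    int pool_threads;  /* threads in the pool: twice the number of CPUs */
} PortSizing;

/* Rule of thumb from the notes: concurrency = CPUs, pool = 2 * CPUs. */
static PortSizing sizing_for(int cpu_count) {
    assert(cpu_count > 0);
    PortSizing s = { cpu_count, cpu_count * 2 };
    return s;
}

/* Smaller of two integers. */
static int min_int(int a, int b) { return a < b ? a : b; }

/* Model of how many waiting threads the port releases: limited by the
   free concurrency slots, the queued items, and the waiting threads. */
static int threads_to_wake(int queued_items, int waiting_threads,
                           int released_threads, int concurrency) {
    assert(queued_items >= 0 && waiting_threads >= 0);
    assert(released_threads >= 0 && concurrency > 0);
    int free_slots = concurrency - released_threads;
    if (free_slots <= 0) {
        return 0;  /* limit already reached (or exceeded) */
    }
    return min_int(free_slots, min_int(queued_items, waiting_threads));
}
```

With the figures from the text, threads_to_wake(4, 4, 0, 2) evaluates to 2, and sizing_for(2) gives a concurrency value of 2 and a pool of 4 threads.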

When the completion port wakes a thread, it stores the thread's identifier in the fourth data structure associated with the completion port, the released thread list. This lets the completion port remember which threads it has woken and monitor their execution. If a released thread calls any function that puts it into a wait state, the completion port detects this and updates its internal data structures: it removes the thread's identifier from the released thread list and adds it to the paused thread list.

The goal of the completion port is to keep as many threads in the released thread list as the concurrency value specified when the port was created. If a released thread enters a wait state for any reason, the released thread list shrinks and the completion port can release another waiting thread. When a paused thread wakes, it leaves the paused thread list and re-enters the released thread list. This means that the released thread list can then hold more threads than the maximum concurrency value allows.

Suppose we are running on a machine with two CPUs. We create a completion port that allows at most two threads to be woken at the same time, and we create four threads that wait for completed I/O requests. If three completed I/O requests are queued to the port, only two threads are woken to process them, which reduces the number of runnable threads and saves context-switching time. Now, if one of the running threads calls Sleep, WaitForSingleObject, WaitForMultipleObjects, SignalObjectAndWait, a synchronous I/O call, or any other function that makes the thread not runnable, the I/O completion port detects this and immediately wakes a third thread. The goal of the completion port is to keep the CPUs fully loaded.
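
The two-CPU scenario can be traced with a small counter model in portable C. This is only a sketch: PortModel and the port_* functions are hypothetical names, the real lists hold thread identifiers rather than counts, and the release policy is the simplified one described in these notes.

```c
#include <assert.h>

/* Hypothetical counters standing in for the port's internal lists. */
typedef struct {
    int concurrency;  /* maximum threads the port tries to keep running */
    int queued;       /* completed I/O items not yet handed to a thread */
    int waiting;      /* threads blocked in GetQueuedCompletionStatus */
    int released;     /* threads that were handed an item and are running */
    int paused;       /* released threads that later blocked elsewhere */
} PortModel;

/* Release waiting threads while items remain and the limit allows.
   Returns how many threads were released by this call. */
static int port_dispatch(PortModel *p) {
    int woken = 0;
    while (p->queued > 0 && p->waiting > 0 &&
           p->released < p->concurrency) {
        p->queued--;
        p->waiting--;
        p->released++;
        woken++;
    }
    return woken;
}

/* A completed I/O item arrives in the port's queue. */
static int port_post(PortModel *p) {
    p->queued++;
    return port_dispatch(p);
}

/* A released thread blocks (Sleep, a wait function, and so on): it moves
   from the released list to the paused list, which frees one slot. */
static int port_thread_blocks(PortModel *p) {
    assert(p->released > 0);
    p->released--;
    p->paused++;
    return port_dispatch(p);
}

/* A paused thread wakes: it returns to the released list even if that
   pushes the list above the concurrency value. */
static void port_thread_resumes(PortModel *p) {
    assert(p->paused > 0);
    p->paused--;
    p->released++;
}
```

Starting from a concurrency value of 2 with four waiting threads, the first two calls to port_post each release one thread and the third releases none. A call to port_thread_blocks then releases the third thread, and after port_thread_resumes the released count is 3, one more than the concurrency value.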

Summary:

The IOCP mechanism works roughly as follows. When a thread calls GetQueuedCompletionStatus to wait for a completed item, the system adds the thread to the IOCP's waiting thread queue. When an item appears in the I/O completion queue, the system first wakes the thread that began waiting most recently (LIFO) and adds it to the released thread list. When further items appear in the I/O completion queue, if threads are still waiting and the number of released threads is below the concurrency value specified when the port was created, the IOCP wakes another thread to process the new item. Otherwise the item waits until a running thread finishes its previous item, and that same thread then processes the new item from the I/O completion queue (this reduces the cost of context switching between threads). When a running thread calls a function that suspends it or puts it into a wait state, the IOCP detects this and moves the thread to its paused thread list, so the number of released threads drops. When a new item then appears in the I/O completion queue, the IOCP can wake another thread.

That is roughly how the four data structures associated with the IOCP (the waiting thread queue, the I/O completion queue, the released thread list, and the paused thread list) work together. From this we can see that the IOCP not only keeps the CPUs working at full load but also minimizes the cost of context switching between threads.
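
The LIFO order of the waiting thread queue is what lets the same thread pick up consecutive items. A minimal sketch in portable C follows, assuming a fixed-size stack of integer thread ids; WaitingThreads, waiters_push, and waiters_pop are hypothetical names and do not correspond to any Windows API.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WAITERS 64

/* Hypothetical model of the waiting thread queue as a stack of ids. */
typedef struct {
    int ids[MAX_WAITERS];
    size_t count;
} WaitingThreads;

/* A thread starts waiting on the port (it called
   GetQueuedCompletionStatus in the real system). */
static void waiters_push(WaitingThreads *w, int thread_id) {
    assert(w->count < MAX_WAITERS);
    w->ids[w->count++] = thread_id;
}

/* The port releases a thread: the one that began waiting most recently.
   Returns -1 when no thread is waiting. */
static int waiters_pop(WaitingThreads *w) {
    if (w->count == 0) {
        return -1;
    }
    return w->ids[--w->count];
}
```

If threads 1, 2, and 3 start waiting in that order, waiters_pop returns 3. When thread 3 finishes its item and waits again, the next waiters_pop returns 3 once more, so threads 1 and 2 stay asleep as long as one thread can keep up with the load.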

