Understanding I/O Completion Ports (Repost)

Last Update:2018-12-05 Source: Internet

Author: User


This tutorial first defines IOCP, then describes how it is used, and finally outlines an echo server to clear away the fog surrounding IOCP. I cannot guarantee that you will understand everything about IOCP afterwards, but I will do my best. The article touches on the following topics:
I/O ports
Synchronous/asynchronous
Blocking/non-blocking
Server/client
Multithreaded programming
Winsock 2.0 API

Some time ago I worked on a project in which one component needed network support. Portability mattered, so all it took was select, connect, accept, listen, send, and recv, plus a few #ifdef blocks to paper over the incompatibilities between Winsock and BSD sockets. The whole network subsystem took only a few hours to write, and I remember it fondly to this day. After that I did not touch network code for a long time.
A few days ago we planned an online game, and I volunteered to take on the networking, thinking it would be a piece of cake. Online games are great: they give hundreds of thousands of players fun and an immersive experience, letting them fight each other online or team up against a common enemy. I confidently set out to write my network layer, only to discover that the old blocking, synchronous model has no place in a massively multiplayer (MMP) architecture; it was rejected outright. That is where IOCP comes in. If IOCP were easy to pick up, this tutorial would not exist. Follow me and let's get started.

What is IOCP?
Let's first see how IOCP is regarded.
The I/O completion port is probably the most complex kernel object offered by Win32.
[Advanced Windows, 3rd Edition] Jeffrey Richter
This [IOCP] is the best way to implement a high-capacity network server.
[Windows Sockets 2.0: Write Scalable Winsock Apps Using Completion Ports]
Microsoft Corporation
The completion port model offers the best scalability. It is well suited to handling hundreds or even thousands of sockets.
[Network Programming for Microsoft Windows, 2nd Edition] Anthony Jones & Jim Ohlund
I/O completion ports are particularly important because they are the only technology suitable for high-load servers that must maintain many connections at the same time. Completion ports use a pool of threads to balance the load created by I/O requests. This architecture is especially suited to building scalable servers on SMP systems.
[Multithreading Applications in Win32] Jim Beveridge & Robert Wiener

It seems we have good reason to believe that IOCP is the first choice for large-scale network architectures. So what exactly is IOCP?
Microsoft made IOCP available to socket programs with Winsock 2. IOCP stands for I/O Completion Port. It is an asynchronous I/O API that efficiently notifies an application of I/O events. Unlike select() and other asynchronous methods, a socket is first associated with a completion port, after which normal Winsock operations proceed as usual. When an operation completes, the operating system adds a completion packet to the port's queue, and the application can then query the kernel to retrieve it.
Some of these concepts deserve elaboration. Before explaining the term "completion", I would like to briefly introduce synchronous and asynchronous operation. Logically speaking, synchronous means doing one thing only after the previous one has finished, while asynchronous means doing two or more things at the same time; single-threaded versus multithreaded execution is a handy analogy. However, we must distinguish synchronous from blocking, and asynchronous from non-blocking. Take a blocking function such as accept(...): when it is called, the thread is suspended until the operating system tells it, "Hey, someone has connected," at which point the thread resumes work. This also fits the producer-consumer model. Blocking and synchronous look similar, but they are entirely different concepts.
As we all know, I/O devices are slow. Whether it is a printer, a modem, or even a hard disk, each is extremely slow compared with the CPU, and sitting idle while waiting for I/O to complete is unwise. Sometimes data moves surprisingly fast: data leaving your file server at Ethernet speed may reach 1 million bytes per second. If you read some kilobytes from the file server, the operation looks nearly instantaneous from the user's point of view, yet the thread that executed the command has wasted 1 million CPU cycles. This is why I/O is usually handed to another thread. Overlapped I/O is a Win32 technique in which you ask the operating system to transfer data for you and to notify you when the transfer is complete. That is the meaning of "completion". It lets your program carry on with other work while the I/O is in progress. Internally, the operating system uses threads to carry out overlapped I/O, so you get all the benefits of threads without the pain.

The "port" in "completion port" has nothing to do with the ports of TCP/IP; the two are completely unrelated. I have yet to figure out what an I/O device has to do with the "port" in IOCP, and I suspect the word confuses many people. An IOCP serves only read and write operations, much like file I/O. Since it is a mechanism for reading and writing devices, all we can ask of it is that it handle reads and writes efficiently. The third part of this article should make the real intent behind the IOCP design easy to see.

What is the relationship between IOCP and the network?
// Thread-per-connection server (declarations and error handling omitted for brevity)
int main()
{
    WSAStartup(MAKEWORD(2, 2), &wsaData);
    ListeningSocket = socket(AF_INET, SOCK_STREAM, 0);
    bind(ListeningSocket, (SOCKADDR *)&ServerAddr, sizeof(ServerAddr));
    listen(ListeningSocket, 5);
    int nListenAddrLen = sizeof(ClientAddr);
    while (TRUE)
    {
        // Block until a client connects
        NewConnection = accept(ListeningSocket, (SOCKADDR *)&ClientAddr, &nListenAddrLen);
        // Hand the connection to a new thread, then go back to waiting
        HANDLE hThread = CreateThread(NULL, 0, ThreadFunc, (void *)NewConnection, 0, &dwThreadId);
        CloseHandle(hThread);
    }
    return 0;
}
Anyone who has written network code should recognize this structure. The thread is suspended in accept until a client connects. It then creates a new thread to handle the request. While the new thread handles the client request, the initial thread loops back and waits for the next client. Each handler thread ends once its request has been processed.
In this concurrency model, one thread is created per client request. The advantage is that the thread waiting for requests has very little work to do; most of the time it is asleep (blocked in accept).
However, when this model was used in server applications on Windows NT, the Windows NT team noticed that performance fell short of expectations. Handling many concurrent client requests means that many threads run in the system at once. Because all of these threads are runnable (not suspended or waiting for something to happen), Microsoft realized that the NT kernel was spending too much time switching between thread contexts, leaving the threads little CPU time to do their actual work.

You may also sense that the bottleneck of this model is that it creates a new thread for every client request. Creating a thread is cheaper than creating a process, but it is far from free.
Imagine instead that N threads are started in advance and left blocked, while all client requests are posted to a message queue. The N threads then take messages off the queue one at a time and process them. This avoids starting a thread for every client request, which both saves thread resources and improves thread utilization. In theory this is very good, and if I can think of it, how could Microsoft not have?
The solution is a kernel object called the I/O completion port, first introduced in Windows NT 3.5.
The idea above is roughly the design mechanism behind IOCP. Put bluntly, IOCP is little more than a message queue, and it is hard to see what that has to do with the word "port". As I understand it, an IOCP is at most an interface through which the application and the operating system communicate.
As for the actual design of IOCP, I cannot say much, since I have not read the implementation. You could emulate it yourself, though the performance would probably suffer. For a deeper understanding of IOCP, Jeffrey Richter's Advanced Windows, 3rd Edition has a lot of valuable material in Chapter 1 and Chapter 13 on how the system does it.

Implementation Method
Microsoft provides API functions for working with IOCP. There are two main ones. Let's look at them in turn:
HANDLE CreateIoCompletionPort(
    HANDLE    FileHandle,                 // handle to file
    HANDLE    ExistingCompletionPort,     // handle to I/O completion port
    ULONG_PTR CompletionKey,              // completion key
    DWORD     NumberOfConcurrentThreads   // number of threads to execute concurrently
);
Before discussing the parameters, note that this function actually serves two distinct purposes:
1. Creating a completion port object
2. Associating a handle with a completion port
When creating a port, you only need to fill in the NumberOfConcurrentThreads parameter. It tells the system the maximum number of threads allowed to run simultaneously on the completion port. By default (a value of 0), this is the number of CPUs. For the number of worker threads to create, a common rule of thumb is:
Number of threads = number of CPUs * 2 + 2
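As a rough illustration, the rule of thumb above can be written as a small helper. The function name suggested_worker_threads is hypothetical and not part of any Windows API; in a real program the CPU count would typically come from GetSystemInfo.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper (not a Windows API): applies the rule of thumb
   "number of threads = number of CPUs * 2 + 2" quoted above. */
static unsigned suggested_worker_threads(unsigned cpu_count)
{
    /* Guard against unsigned overflow for absurdly large inputs. */
    assert(cpu_count <= (UINT_MAX - 2u) / 2u);
    return cpu_count * 2u + 2u;
}
```

For a machine with 4 CPUs, suggested_worker_threads(4) evaluates to 10. Treat the result as a starting point to tune, not a fixed requirement.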
To make a completion port useful, you must associate it with one or more devices. This is also done by calling CreateIoCompletionPort. You pass the function the handle of an existing completion port; since we want to handle network events, the device handle we pass is the client's socket. You also pass a completion key (a pointer-sized value that means something to you, typically a pointer; the operating system does not care what you pass). Each time you associate a device with the port, the system adds a record to the port's device list.

The other API is:
BOOL GetQueuedCompletionStatus(
    HANDLE       CompletionPort,    // handle to completion port
    LPDWORD      lpNumberOfBytes,   // bytes transferred
    PULONG_PTR   lpCompletionKey,   // file completion key
    LPOVERLAPPED *lpOverlapped,     // receives the address of the OVERLAPPED structure
    DWORD        dwMilliseconds     // optional timeout value
);
The first parameter specifies which completion port the thread monitors. Many server applications use only one I/O completion port, and every I/O completion notification goes to it. Simply put, GetQueuedCompletionStatus suspends the calling thread until an entry appears in the specified port's I/O completion queue or the timeout expires. Each entry gives the thread three pieces of information about the completed I/O: the number of bytes transferred, the completion key, and the address of the OVERLAPPED structure. They are returned to the thread through the lpNumberOfBytes, lpCompletionKey, and lpOverlapped parameters passed to GetQueuedCompletionStatus.

With what has been covered so far, we can sketch a framework. Developing an echo server with a completion port goes roughly as follows:
1. Initialize Winsock
2. Create a completion port
3. Create a number of worker threads for the server
4. Prepare a socket, bind it, and then listen
5. Enter a loop that calls accept and waits for client requests
6. Create a data structure to hold the socket and other related information
7. Associate the connected socket with the completion port
8. Post a receive request
From then on, steps 5 to 8 are repeated.
Now for concrete code to demonstrate the details.
Well, pasting the program code here would be great, but then you could not just press Ctrl+V and F7. If you need the source code, send mail to o_nono@163.net.
That brings this article to an end. I have taken you on a whirlwind tour of the so-called completion port.
Many details had to be left out for reasons of length, but I hope this article gives you something to think about. If you have any questions, write to o_nono@163.net.

Discussion is welcome at http://expert.csdn.net/Expert/topic/2659/2659726.xml?temp=.3871271

[The following is excerpted from the Chinese edition of Windows Network Programming, translated from the English original.]
The "completion port" model is by far the most complex I/O model. However, when an application must manage a large number of sockets at the same time, it is the model to use. It is available only on the following Microsoft operating systems: Windows NT and Windows 2000. Because of its design complexity, consider the completion port model only when your application needs to manage hundreds or even thousands of sockets simultaneously and you want its performance to scale linearly as CPUs are added to the system. The basic rule to remember is this: if you are developing a high-performance server application for Windows NT or Windows 2000 that must service a large number of socket I/O requests (a web server is a typical example), the I/O completion port model is the best choice.
In essence, the completion port model requires you to create a Win32 completion port object that manages overlapped I/O requests through a specified number of threads, which service the overlapped I/O requests as they complete. Note that a "completion port" is really an I/O construct used by Win32, Windows NT, and Windows 2000 that can accept more than just socket handles. This section, however, describes only how to put the completion port model to work with socket handles. To use the model, you must first create an I/O completion port object that will manage multiple I/O requests for any number of socket handles. This is done by calling the CreateIoCompletionPort function, which is defined as follows:
HANDLE CreateIoCompletionPort(
    HANDLE    FileHandle,
    HANDLE    ExistingCompletionPort,
    ULONG_PTR CompletionKey,
    DWORD     NumberOfConcurrentThreads
);
Before looking at each parameter in detail, note that this function actually serves two distinct purposes:
■ Creating a completion port object.
■ Associating a handle with a completion port.
When a completion port is first created, the only parameter of interest is NumberOfConcurrentThreads; the first three parameters are ignored. NumberOfConcurrentThreads is special in that it defines the number of threads allowed to execute concurrently on the completion port. Ideally, we want only one thread per processor servicing the completion port, to avoid frequent thread context switches. Setting the parameter to 0 tells the system to allow as many threads to run concurrently as there are processors in the system. The following call creates an I/O completion port:
CompletionPort = CreateIoCompletionPort(INVALID_HANDLE_VALUE, NULL, 0, 0);
The call returns a handle that is used to refer to the completion port when socket handles are later assigned to it.
1. Worker threads and completion ports
Once a completion port has been created, socket handles can be associated with it. Before associating any sockets, however, you must create one or more worker threads to service the completion port when socket I/O requests are posted to it. At this point you may wonder how many threads should be created. This is actually one of the more complicated aspects of the completion port model, because the number needed depends on the overall design of the application. The important thing to remember is that the number of concurrent threads specified in the call to CreateIoCompletionPort is not the same as the number of worker threads you intend to create. Earlier, we recommended specifying one thread per processor in the CreateIoCompletionPort call to avoid frequent thread context switches, which hurt overall system performance.
The NumberOfConcurrentThreads parameter of CreateIoCompletionPort tells the system explicitly that only N worker threads may run at a time on the completion port. If more than N worker threads are created on the port, still only N of them are allowed to run at once. (In practice the system may exceed this value for a short time, but it quickly drops back to the value set in CreateIoCompletionPort.) So why would you ever create more worker threads than the number set in the CreateIoCompletionPort call? As noted earlier, it depends on the overall design of the application. Suppose one of your worker threads calls a function such as Sleep() or WaitForSingleObject() and becomes suspended; another thread is then allowed to run in its place. In other words, you always want as many threads available to run as the CreateIoCompletionPort call allows. So if you expect your threads to block from time to time, it is better to create more worker threads than the value of NumberOfConcurrentThreads, so that the system can reach its full potential.
Once there are enough worker threads to service I/O requests on the completion port, you can begin associating socket handles with it. This requires calling CreateIoCompletionPort on an existing completion port and supplying the first three parameters, FileHandle, ExistingCompletionPort, and CompletionKey, with socket information. The FileHandle parameter specifies the socket handle to associate with the completion port.
The ExistingCompletionPort parameter identifies the existing completion port. The CompletionKey parameter identifies the per-handle data to associate with a particular socket handle. The application can store any type of information associated with a socket in this key. It is called per-handle data because it represents data tied to one socket handle. It is useful to make the key a pointer to a data structure that contains the socket handle along with other socket-specific information. As described later, the thread routines that service the completion port can use this key to retrieve information about the socket handle.
With what we have learned so far, we can start building a basic application framework. Program listing 8-9 shows how to use the completion port model to develop an echo server application. The program takes the following steps:
1) Create a completion port. The fourth parameter is left as 0, which allows only one worker thread per processor to execute at a time on the completion port.
2) Determine how many processors are installed in the system.
3) Create worker threads to service completed I/O requests on the completion port, using the processor information from step 2. In this simple example, we create one worker thread per processor, because we do not expect the threads to block in a way that would leave a processor idle for lack of runnable threads. The call to CreateThread must supply a worker routine, which the thread executes once it has been created. The responsibilities of this routine are discussed later in this section.
4) Prepare a listening socket to listen for incoming connections on port 5150.
5) Accept incoming connections with the accept function.
6) Create a data structure to represent the per-handle data, and store the accepted socket handle in it.
7) Call CreateIoCompletionPort to associate the new socket handle returned from accept with the completion port. Pass the per-handle data structure to CreateIoCompletionPort through the CompletionKey parameter.
8) Start I/O on the accepted connection. In essence, post one or more asynchronous WSARecv or WSASend requests on the new socket using the overlapped I/O mechanism. When these I/O requests complete, a worker thread services them and continues processing future I/O requests, as you will see in the worker routine specified in step 3.
9) Repeat steps 5) to 8) until the server terminates.

Program listing 8-9. Setting up a completion port
StartWinsock();
// Step 1: Create an I/O completion port
CompletionPort = CreateIoCompletionPort(INVALID_HANDLE_VALUE, NULL, 0, 0);
// Step 2: Determine how many processors are on the system
GetSystemInfo(&SystemInfo);
// Step 3: Create worker threads based on the number of processors.
// In this example there is one worker thread per processor.
for (i = 0; i < SystemInfo.dwNumberOfProcessors; i++)
{
    HANDLE ThreadHandle;
    // Create a worker thread and pass the completion port to it as a parameter
    ThreadHandle = CreateThread(NULL, 0, ServerWorkerThread, CompletionPort, 0, &ThreadID);
    // Close the thread handle (this closes only the handle, not the thread itself)
    CloseHandle(ThreadHandle);
}
// Step 4: Create a listening socket
Listen = WSASocket(AF_INET, SOCK_STREAM, 0, NULL, 0, WSA_FLAG_OVERLAPPED);
InternetAddr.sin_family = AF_INET;
InternetAddr.sin_addr.s_addr = htonl(INADDR_ANY);
InternetAddr.sin_port = htons(5150);
bind(Listen, (PSOCKADDR)&InternetAddr, sizeof(InternetAddr));
// Prepare the socket for listening
listen(Listen, 5);
while (TRUE)
{
    // Step 5: Accept a connection
    Accept = WSAAccept(Listen, NULL, NULL, NULL, 0);
    // Step 6: Create the per-handle data structure for the socket
    PerHandleData = (LPPER_HANDLE_DATA)GlobalAlloc(GPTR, sizeof(PER_HANDLE_DATA));
    printf("Socket number %d connected\n", (int)Accept);
    PerHandleData->Socket = Accept;
    // Step 7: Associate the accepted socket with the completion port
    CreateIoCompletionPort((HANDLE)Accept, CompletionPort, (ULONG_PTR)PerHandleData, 0);
    // Step 8: Start processing I/O on the accepted socket by posting
    // one or more WSASend() or WSARecv() calls using overlapped I/O
    WSARecv(...);
}

2. Completion ports and overlapped I/O
After a socket handle has been associated with a completion port, you can begin processing I/O requests by posting send and receive requests on the socket handle. From then on, you rely on the completion port for notification that I/O operations have completed. In essence, the completion port model takes advantage of the Win32 overlapped I/O mechanism, in which Winsock API calls such as WSASend and WSARecv return immediately. It is then up to the application to retrieve the results of the calls at some later time through an OVERLAPPED structure. In the completion port model, this is done by having one or more worker threads wait on the completion port with the GetQueuedCompletionStatus function, which is defined as follows:
BOOL GetQueuedCompletionStatus(
    HANDLE       CompletionPort,
    LPDWORD      lpNumberOfBytesTransferred,
    PULONG_PTR   lpCompletionKey,
    LPOVERLAPPED *lpOverlapped,
    DWORD        dwMilliseconds
);

The CompletionPort parameter identifies the completion port to wait on. The lpNumberOfBytesTransferred parameter receives the number of bytes actually transferred when an I/O operation such as WSASend or WSARecv completes. The lpCompletionKey parameter returns the per-handle data for the socket that was originally passed to CreateIoCompletionPort. As mentioned earlier, it is a good idea to store the socket handle in this key. The lpOverlapped parameter receives the overlapped result of the completed I/O operation. This is a very important parameter, because it is used to retrieve per-I/O operation data. The last parameter, dwMilliseconds, specifies how long the caller is willing to wait for a completion packet to appear on the completion port. If it is set to INFINITE, the call waits forever.
3. Per-handle data and per-I/O operation data
When a worker thread receives an I/O completion notification from the GetQueuedCompletionStatus call, the lpCompletionKey and lpOverlapped parameters contain the socket information it needs to continue I/O processing on the socket through the completion port. Two important kinds of socket data are available through these parameters: per-handle data and per-I/O operation data.
The lpCompletionKey parameter contains what we call per-handle data, because the data is tied to a socket handle when the socket is first associated with the completion port. It is the data passed as the CompletionKey parameter of the CreateIoCompletionPort call. As noted earlier, the application can pass any type of socket information through this parameter. Typically, applications store the socket handle related to the I/O request here.
The lpOverlapped parameter contains an OVERLAPPED structure followed by what we call per-I/O operation data.
A worker thread needs this information when it processes a completion packet (echoing the data back, accepting a connection, posting another read, and so on). Per-I/O operation data is any number of bytes appended to the end of an OVERLAPPED structure. Whenever a function expects an OVERLAPPED structure, such a structure must be passed to it. A simple way to do this is to define a structure whose first element is an OVERLAPPED structure. For example, you can define the following data structure to manage per-I/O operation data:
typedef struct
{
    OVERLAPPED Overlapped;
    WSABUF     DataBuf;
    CHAR       Buffer[DATA_BUFSIZE];
    BOOL       OperationType;
} PER_IO_OPERATION_DATA;
This structure shows some important data elements you may want to associate with an I/O operation, such as the type of operation that completed (a send or a receive request). The data buffer for the completed I/O operation is also useful to keep in this structure. To call a Winsock API function that expects an OVERLAPPED structure, you can either cast your structure to an OVERLAPPED pointer or simply take the address of the OVERLAPPED member of your structure. For example, given:

PER_IO_OPERATION_DATA PerIoData;

you can call a function as follows:

WSARecv(socket, ..., (OVERLAPPED *)&PerIoData);

or as follows:
WSARecv(socket, ..., &(PerIoData.Overlapped));
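Both forms work because C guarantees that a pointer to a structure, suitably converted, points to its first member, and vice versa. The sketch below shows the round trip in portable C. FAKE_OVERLAPPED, PER_IO_DATA, OP_RECV/OP_SEND, and from_overlapped are stand-ins invented for this illustration so that it compiles without windows.h; in real code the first member would be the Win32 OVERLAPPED structure.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the Win32 OVERLAPPED structure (assumption: only its
   position as the first member matters for this illustration). */
typedef struct {
    unsigned long internal;
    unsigned long internal_high;
} FAKE_OVERLAPPED;

enum { OP_RECV = 0, OP_SEND = 1 };   /* hypothetical operation tags */

/* Per-I/O operation data: the overlapped structure comes first,
   followed by whatever the application wants to remember. */
typedef struct {
    FAKE_OVERLAPPED overlapped;      /* must remain the first member */
    char            buffer[64];
    int             operation_type;
} PER_IO_DATA;

/* What a worker thread does with the pointer handed back by the
   completion port: convert it back to the enclosing structure. */
static PER_IO_DATA *from_overlapped(FAKE_OVERLAPPED *ov)
{
    /* The cast is valid only because the member sits at offset zero. */
    assert(offsetof(PER_IO_DATA, overlapped) == 0);
    return (PER_IO_DATA *)ov;
}
```

If the overlapped member were ever moved away from the first position, the plain cast would silently point at the wrong address, which is why the offset is checked here.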
Later, in the worker thread, when GetQueuedCompletionStatus returns an OVERLAPPED structure (and a completion key), you can determine which operation was posted on the handle by examining the OperationType member; simply cast the returned OVERLAPPED pointer to your own PER_IO_OPERATION_DATA structure. The biggest benefit of per-I/O operation data is that it lets you manage multiple I/O operations (read/write, multiple reads, multiple writes, and so on) on the same handle at the same time. You might ask whether it is really necessary to post more than one I/O operation at a time on the same socket. The answer is scalability. For example, on a machine with several processors, each running a worker thread, several processors may be sending and receiving data on the same socket simultaneously.
To complete the simple echo server example, we need to supply a ServerWorkerThread function. Program listing 8-10 shows how to write a worker thread routine that uses per-handle data and per-I/O operation data to service I/O requests.
Program listing 8-10. Completion port worker thread
DWORD WINAPI ServerWorkerThread(LPVOID CompletionPortID)
{
    HANDLE CompletionPort = (HANDLE)CompletionPortID;
    DWORD BytesTransferred;
    LPOVERLAPPED Overlapped;
    LPPER_HANDLE_DATA PerHandleData;
    LPPER_IO_OPERATION_DATA PerIoData;
    DWORD SendBytes, RecvBytes;
    DWORD Flags;
    while (TRUE)
    {
        // Wait for I/O to complete on any socket
        // associated with the completion port
        GetQueuedCompletionStatus(CompletionPort,
            &BytesTransferred,
            (PULONG_PTR)&PerHandleData,
            (LPOVERLAPPED *)&PerIoData, INFINITE);
        // First check to see whether an error has occurred
        // on the socket; if so, close the socket and clean up
        // the per-handle and per-I/O operation data associated
        // with the socket
        if (BytesTransferred == 0 &&
            (PerIoData->OperationType == RECV_POSTED ||
             PerIoData->OperationType == SEND_POSTED))
        {
            // A zero BytesTransferred indicates that the
            // socket has been closed by the peer, so you should
            // close the socket.
            // Note: per-handle data was used to reference the
            // socket associated with the I/O operation.
            closesocket(PerHandleData->Socket);
            GlobalFree(PerHandleData);
            GlobalFree(PerIoData);
            continue;
        }
        // Service the completed I/O request. You can
        // determine which I/O request has just completed
        // by looking at the OperationType field contained
        // in the per-I/O operation data
        if (PerIoData->OperationType == RECV_POSTED)
        {
            // Do something with the received data
            // in PerIoData->Buffer

        }

        // Post another WSASend or WSARecv operation.
        // As an example, we will post another WSARecv()

        Flags = 0;

        // Set up the per-I/O operation data for the next
        // overlapped call

        ZeroMemory(&(PerIoData->Overlapped), sizeof(OVERLAPPED));
        PerIoData->DataBuf.len = DATA_BUFSIZE;
        PerIoData->DataBuf.buf = PerIoData->Buffer;
        PerIoData->OperationType = RECV_POSTED;

        WSARecv(PerHandleData->Socket,
            &(PerIoData->DataBuf), 1, &RecvBytes,
            &Flags, &(PerIoData->Overlapped), NULL);
    }
}
One final detail of the simple server in program listings 8-9 and 8-10 (also available on the companion CD) deserves attention: how to close the I/O completion port properly, especially when one or more threads are performing I/O on several sockets. The main thing to avoid is freeing an OVERLAPPED structure while an overlapped I/O operation is still in progress. The best way to prevent this is to call closesocket on every socket handle, which causes any pending overlapped I/O operations to complete. Once all socket handles are closed, you need to terminate all worker threads on the completion port. To do this, use the PostQueuedCompletionStatus function to send a special completion packet to each worker thread, telling it to exit immediately. PostQueuedCompletionStatus is defined as follows:
BOOL PostQueuedCompletionStatus(
    HANDLE       CompletionPort,
    DWORD        dwNumberOfBytesTransferred,
    DWORD        dwCompletionKey,
    LPOVERLAPPED lpOverlapped
);
The CompletionPort parameter identifies the completion port object to which the completion packet is sent. The dwNumberOfBytesTransferred, dwCompletionKey, and lpOverlapped parameters each let you specify a value that is passed directly to the corresponding parameter of the GetQueuedCompletionStatus function. Thus, when a worker thread receives the three values from GetQueuedCompletionStatus, it can decide when to exit based on a special value placed in one of them. For example, you could pass 0 in the dwCompletionKey parameter, which a worker thread would interpret as an instruction to terminate. Once all the worker threads have exited, you can close the completion port with the CloseHandle function and finally exit the program safely.
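The exit convention can be sketched without any Windows calls. In the model below, an array of packets stands in for the completion queue, and a packet whose key is 0 plays the role of the one posted by PostQueuedCompletionStatus. PACKET, SHUTDOWN_KEY, and drain_until_shutdown are hypothetical names used only for this illustration; a real worker would obtain each packet from GetQueuedCompletionStatus rather than from an array.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model of one completion packet (illustrative only). */
typedef struct {
    unsigned long bytes_transferred;
    unsigned long completion_key;    /* 0 is reserved as the exit signal */
} PACKET;

#define SHUTDOWN_KEY 0ul

/* Processes packets in order and stops at the first shutdown packet.
   Returns how many ordinary packets were serviced before exiting. */
static size_t drain_until_shutdown(const PACKET *queue, size_t count)
{
    size_t serviced = 0;
    for (size_t i = 0; i < count; i++) {
        if (queue[i].completion_key == SHUTDOWN_KEY)
            break;          /* special packet: leave the worker loop */
        serviced++;         /* normal packet: service the I/O here */
    }
    return serviced;
}
```

Reserving key 0 is workable under the assumption that every real completion key is a non-null pointer to per-handle data, so an ordinary completion is not mistaken for the exit signal. One such packet would be posted per worker thread.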
4. Other problems
Several other techniques can improve the overall I/O performance of a socket application. One worth considering is to experiment with different socket buffer sizes to improve I/O performance and application scalability. For example, if a program uses one relatively large buffer with a single WSARecv request, instead of three smaller buffers with three WSARecv requests, it will not scale very well, especially when moved to a machine with multiple processors. This is because a single buffer can be processed by only one thread at a time. The single-buffer design also hurts performance in another way: if only one receive operation can be outstanding at a time, the network protocol driver is not kept busy and is often idle. In other words, if you have to wait for one WSARecv operation to complete before you can receive more data, the protocol is effectively resting between the completion of that WSARecv and the next receive.
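As a rough sketch of the multiple-buffer idea above, the code below posts several overlapped WSARecv calls on one socket, each with its own buffer and its own WSAOVERLAPPED structure. The names RecvSlot, PostReceives, RECV_SLOTS, and SLOT_SIZE are hypothetical, as are the values 3 and 4096; the socket is assumed to be already associated with a completion port. The _WIN32 guard only keeps the file compilable elsewhere.

```c
#ifdef _WIN32
#include <winsock2.h>   /* link with ws2_32.lib */
#endif

#define RECV_SLOTS 3      /* hypothetical: receives kept outstanding per socket */
#define SLOT_SIZE  4096   /* hypothetical: bytes per receive buffer */

#ifdef _WIN32
/* One buffer plus the bookkeeping for one outstanding receive. */
typedef struct {
    WSAOVERLAPPED overlapped;       /* must stay valid until the operation completes */
    WSABUF        wsabuf;
    char          data[SLOT_SIZE];
} RecvSlot;

/* Posts one overlapped WSARecv per slot and returns how many were posted. */
static int PostReceives(SOCKET s, RecvSlot *slots, int count)
{
    int posted = 0;
    for (int i = 0; i < count; i++) {
        DWORD flags = 0;
        ZeroMemory(&slots[i].overlapped, sizeof(slots[i].overlapped));
        slots[i].wsabuf.buf = slots[i].data;
        slots[i].wsabuf.len = SLOT_SIZE;

        /* The byte count pointer is NULL because the result is read
           later from the completion packet, not from this call. */
        if (WSARecv(s, &slots[i].wsabuf, 1, NULL, &flags,
                    &slots[i].overlapped, NULL) == SOCKET_ERROR &&
            WSAGetLastError() != WSA_IO_PENDING)
            break;                  /* a real failure: stop posting */
        posted++;
    }
    return posted;
}
#endif
```

One caveat with this design: on a stream socket the buffers are filled in the order the receives were posted, but different worker threads may dequeue the completions in a different order, so an application that cares about byte order has to track which slot comes first.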

Another recommended performance measure is to use the SO_SNDBUF and SO_RCVBUF socket options to control the size of the internal socket buffers. With these options, an application can change the size of a socket's internal data buffers. If the size is set to 0, Winsock uses the application's buffer directly in overlapped I/O calls to move data to and from the protocol stack, which avoids an extra buffer copy between the application and Winsock. The following code snippet shows how to call the setsockopt function with the SO_SNDBUF option:
setsockopt(socket, SOL_SOCKET, SO_SNDBUF, (char *)&nZero, sizeof(nZero));
Note that setting the buffer size to 0 has a positive effect only when multiple I/O requests are outstanding at any given time. In chapter 2, we will cover socket options in more depth. The last measure to improve performance is to use the AcceptEx API
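Returning to the SO_SNDBUF call shown above, a slightly fuller sketch checks the return value and reads the setting back with getsockopt. The helper name set_send_buffer and the sock_t typedef are hypothetical. The #ifdef lets the same code build against BSD sockets, but the zero-copy behavior described here is specific to Winsock; other stacks may clamp a request of 0 to some minimum size.

```c
#ifdef _WIN32
#include <winsock2.h>   /* link with ws2_32.lib; call WSAStartup first */
typedef SOCKET sock_t;
#else
#include <sys/types.h>
#include <sys/socket.h>
typedef int sock_t;
#endif

/* Requests a send buffer of `size` bytes, then reads the setting back.
   Returns 0 on success and -1 on failure; *actual receives the size the
   stack reports, which may differ from the requested value. */
static int set_send_buffer(sock_t s, int size, int *actual)
{
#ifdef _WIN32
    int len = (int)sizeof(*actual);
#else
    socklen_t len = (socklen_t)sizeof(*actual);
#endif
    if (setsockopt(s, SOL_SOCKET, SO_SNDBUF,
                   (const char *)&size, sizeof(size)) != 0)
        return -1;
    if (getsockopt(s, SOL_SOCKET, SO_SNDBUF, (char *)actual, &len) != 0)
        return -1;
    return 0;
}
```

A call such as set_send_buffer(sock, 0, &actual) would correspond to the nZero snippet above; the same pattern applies to SO_RCVBUF.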

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
