C++ Server Design (II): Application-Layer I/O Buffering

Tags: epoll

Data Integrity Discussion

We have chosen the I/O multiplexing model as the system's underlying I/O model. However, we have not yet addressed the read/write problem: in our reactor model, how do we perform reads and writes so that the data sent or received on each connection is complete, while keeping the impact of any one connection's reads and writes on the processing of every other connection as small as possible?

Earlier we discussed why blocking I/O was not selected as the underlying I/O model. Blocking I/O also cannot be combined with I/O multiplexing: blocking system calls such as read, write, accept, and connect may suspend the current thread, so if multiple events are registered in the reactor and one event handler blocks in a system call, the remaining pending events in that thread cannot be processed.

Therefore, the core purpose of non-blocking I/O in reactor mode is to avoid blocking the thread on I/O system calls, so that a single thread can be reused to serve multiple socket connections. With non-blocking I/O, application-layer buffering for each connection also becomes necessary, because the amount of data that a single read or write system call will actually transfer is unknown.
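As a concrete note on the mechanics, here is a minimal sketch of how a socket is typically put into non-blocking mode on Linux with fcntl; the helper name is ours, not from the original code.

    #include <fcntl.h>

    // Switch a socket to non-blocking mode: read/write/accept will then
    // return immediately (with errno == EAGAIN/EWOULDBLOCK) instead of
    // suspending the reactor thread.
    bool setNonBlocking(int fd) {
        int flags = fcntl(fd, F_GETFL, 0);
        if (flags < 0) return false;
        return fcntl(fd, F_SETFL, flags | O_NONBLOCK) >= 0;
    }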

Application-layer I/O buffering usage scenarios

Let's consider a scenario for output buffering: an event handler wants to send 80 KB of data, but because of the size of the send window, the write call accepts only 50 KB. The event handler should give up control as soon as possible, since it must not block and wait. What should be done with the remaining 30 KB of data?

As far as the business logic is concerned, it should always send data in one uniform way, without caring whether the data goes out in a single write or in several. Therefore, the server should take over the 80 KB of data through an application-layer buffering mechanism and register the corresponding socket-writable event with the reactor. Whenever a writable event fires, the server tries to send the buffered data from the application layer. Of course, any one attempt may send only part of the buffer; if buffered data remains, we keep watching the socket-writable event and continue sending on the next writable notification. We stop watching the writable event only after the application-layer buffer has been completely drained.
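The flow above can be sketched in a few lines of C++. This is a minimal illustration rather than the actual implementation: std::string stands in for the application-layer output buffer developed later in this chapter, and the commented-out enableWriting/disableWriting calls stand for registering and unregistering the EPOLLOUT event with the reactor.

    #include <unistd.h>
    #include <cerrno>
    #include <cstddef>
    #include <string>

    // Try to write directly; buffer whatever the kernel does not accept.
    void sendData(int fd, std::string& outBuf, const char* data, size_t len) {
        ssize_t n = 0;
        if (outBuf.empty()) {                      // nothing pending: try a direct write
            n = ::write(fd, data, len);
            if (n < 0) {
                if (errno != EAGAIN && errno != EWOULDBLOCK)
                    return;                        // real error: handled elsewhere
                n = 0;                             // send window full: nothing accepted
            }
        }
        if (static_cast<size_t>(n) < len) {
            outBuf.append(data + n, len - n);      // stash the unsent remainder
            // enableWriting(fd);                  // start watching EPOLLOUT
        }
    }

    // Reactor callback for the writable event: drain the output buffer.
    void handleWritable(int fd, std::string& outBuf) {
        ssize_t n = ::write(fd, outBuf.data(), outBuf.size());
        if (n > 0) {
            outBuf.erase(0, static_cast<size_t>(n));   // drop the bytes just sent
            // if (outBuf.empty()) disableWriting(fd); // drained: stop watching EPOLLOUT
        }
    }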

Suppose that while unsent data from those 80 KB still sits in the application-layer buffer, the business logic needs to send another 20 KB to the same connection. We simply append the 20 KB to the application-layer buffer and let the process described above send it. Since TCP delivers bytes in order, and the data in the application-layer buffer is also sent sequentially, appending to the buffer guarantees that the receiver sees the first batch of data before the second.

Now let's consider a scenario for input buffering. Because TCP is a byte-stream protocol without message boundaries, we generally design an application-layer network protocol to determine the boundaries of each message. Since the TCP receive window changes dynamically, when the reactor reports a socket-readable event, the data we read from the socket may not yet form a complete message. Moreover, because we use epoll in level-triggered mode, we must read all of the socket's currently readable data at once; otherwise epoll will keep reporting the readable event for the same data, and the whole system degrades to polling.

This is where application-layer input buffering comes in handy when we receive an "incomplete" message. Whenever a socket-readable event arrives, the data read from the socket is appended to the end of that socket's application-layer input buffer. The buffered data is then examined to see whether it forms a complete message. If not, the event handler simply returns control. If a message boundary is found, the message data is extracted from the application-layer buffer and the application's business-logic code is invoked.
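Here is a sketch of this input path, assuming a hypothetical length-prefixed protocol (a 4-byte big-endian length followed by the payload); as before, std::string stands in for the input buffer class built later, and onMessage is an assumed business-logic callback.

    #include <unistd.h>
    #include <arpa/inet.h>
    #include <cstdint>
    #include <cstring>
    #include <string>

    void handleReadable(int fd, std::string& inBuf) {
        char tmp[4096];
        ssize_t n;
        // Level-triggered epoll: drain the socket completely in one pass.
        while ((n = ::read(fd, tmp, sizeof(tmp))) > 0)
            inBuf.append(tmp, static_cast<size_t>(n));
        // n < 0 with errno == EAGAIN: socket drained; n == 0: peer closed
        // (connection teardown is handled elsewhere in a real server).

        // Extract every complete message currently sitting in the buffer.
        while (inBuf.size() >= 4) {
            uint32_t len;
            std::memcpy(&len, inBuf.data(), 4);
            len = ntohl(len);
            if (inBuf.size() < 4 + static_cast<size_t>(len))
                break;                                // incomplete: keep buffering
            std::string message = inBuf.substr(4, len);
            // onMessage(fd, message);                // hand one full message over
            inBuf.erase(0, 4 + static_cast<size_t>(len));
        }
    }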

Application-layer I/O buffering requirements analysis

Based on our analysis of the system, the application-layer I/O buffer should meet the following requirements:

    • It behaves like a queue container: data is written at the tail and read from the head.
    • It is backed by contiguous memory, and its length can grow automatically to accommodate messages of different sizes.
    • It can serve both as an input buffer and as an output buffer.

Common STL containers such as vector, deque, and list all meet the first requirement.

A vector is a one-directional, contiguously stored container, so we must maintain our own subscript indexes recording the current read position (the head) and write position (the tail). Since a vector's memory is contiguous, it can be passed directly as an argument to the read and write system calls. The vector supports dynamic growth, but growing beyond the currently allocated memory triggers a reallocation and a copy of the old data, which carries a certain overhead. In addition, as reads and writes move the two subscripts forward, an empty region appears at the head of the vector while the data crowds toward the back; we must adjust the data's position dynamically to keep the head space from being wasted.

A deque is a dynamically growing, piecewise-contiguous, double-ended container, so we could directly use one end as the head for reading data and the other end as the tail for writing. Because the deque grows dynamically and manages its own layout, it spares us the subscript bookkeeping a vector requires, and it wastes no space through data movement. However, a deque's internal storage is not necessarily contiguous: to read a block of data out of a deque and pass it to read, write, or similar system calls, we would have to allocate a new contiguous block, copy the data into it, and pass a char* pointer together with an int length to the system call. For this reason we do not consider deque as the underlying structure for I/O buffering.

A list, as a doubly linked list, is not contiguous in memory and is likewise unsuitable as an I/O buffer, for the same reason as the deque; we will not repeat the analysis.

We should also examine how input buffering and output buffering are used. Although we divide buffering into two parts, where the input buffer is filled with data read from the socket and then read by the business logic, while the output buffer is filled by the business logic and then drained to the socket, both buffers present the same interface to client code; they are essentially the same design with the read and write roles swapped.

Based on these requirements, we make a trade-off between ease of use and performance and finally choose STL's std::vector<char> as the underlying container that holds the application-layer buffered data.

Application-layer I/O buffering design

Inside the application-layer I/O buffer is a std::vector<char>, whose contiguous memory can be passed directly as a parameter to the basic read and write system calls. The buffer also maintains two subscript indexes into the vector, marking the current readable position and the current writable position. Note that these two indexes are not pointers but plain integers holding subscript values: the vector's memory may be reallocated as it grows, and a reallocation would invalidate any raw pointers or iterators into the old memory.

Figure 3-7 The initialized buffer

Figure 3-7 shows the buffer data structure initialized to a size of 1024 bytes. The structure is a vector<char> of 1024 bytes plus two indexes, readIndex and writeIndex. These two subscripts divide the contiguous memory into three parts: the buffer head, the readable region, and the writable region.
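A minimal skeleton of such a buffer is sketched below; the class and method names are illustrative, and the member functions declared here are filled in as the chapter progresses.

    #include <cstddef>
    #include <vector>

    class Buffer {
    public:
        explicit Buffer(size_t initialSize = 1024)
            : data_(initialSize), readIndex_(0), writeIndex_(0) {}

        // [0, readIndex_)             -> buffer head (already-consumed space)
        // [readIndex_, writeIndex_)   -> readable region (buffered data)
        // [writeIndex_, data_.size()) -> writable region (free space)
        size_t readableBytes() const { return writeIndex_ - readIndex_; }
        size_t writableBytes() const { return data_.size() - writeIndex_; }

        const char* peek() const { return data_.data() + readIndex_; }

        void retrieve(size_t n);                    // consume n readable bytes (below)
        void append(const char* data, size_t len);  // buffer len new bytes (below)

    private:
        void makeSpace(size_t len);                 // compact or grow (below)

        std::vector<char> data_;    // contiguous memory, usable by read()/write()
        size_t readIndex_;          // subscripts, not pointers: they survive
        size_t writeIndex_;         // the reallocation that growth may cause
    };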

The space from offset 0 up to readIndex is the buffer head. The head generally appears because the actual data has shifted backward: since new data is written only at the tail, the head space cannot be used by the buffer directly, and memory is wasted. We therefore need some strategy for moving the data to eliminate the buffer head.

The region from readIndex to writeIndex is the readable region, i.e. the data actually stored in the buffer at present. Each read of buffered data starts at readIndex, and readIndex is shifted right by the number of bytes read. The offset from readIndex to writeIndex is the amount of data currently buffered; when readIndex equals writeIndex, the buffer holds no readable data.
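In the Buffer sketch above, the read side is then a one-line subscript shift (we refine it later when discussing how to reclaim the head):

    // Fragment of the Buffer sketch: consuming data only shifts readIndex_
    // to the right; nothing is copied or freed.
    void Buffer::retrieve(size_t n) {
        readIndex_ += n;   // caller guarantees n <= readableBytes()
    }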

From writeIndex to the end of the memory is the writable region, the space currently available for new data. This region is limited in size, but because a vector can grow dynamically, when the writable region is too small to hold the data the application wants to buffer, the vector can in some cases be expanded, reallocating a larger contiguous memory block and copying the old data into it, so that more buffered data can be written. Note that the current implementation only grows: once the buffer has grown, the memory is never shrunk, even after the data is read out.

Figure 3-8 The buffer after writing data

As shown in Figure 3-8, if 800 bytes of data are written into the freshly initialized buffer, readIndex is unchanged and writeIndex moves to 800; the region marked readable is the memory storing those 800 bytes. At this point 224 bytes of the memory remain as the writable region. To append more data, we just copy the new bytes to the memory at writeIndex and move the writeIndex subscript further back.
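The write side of the Buffer sketch does exactly this; makeSpace, defined later, handles the case where the writable region is too small.

    #include <algorithm>

    // Fragment of the Buffer sketch: copy the new bytes into the writable
    // region at writeIndex_ and shift writeIndex_ right.
    void Buffer::append(const char* data, size_t len) {
        if (writableBytes() < len)
            makeSpace(len);                 // compact or grow (defined later)
        std::copy(data, data + len, data_.begin() + writeIndex_);
        writeIndex_ += len;
    }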

Figure 3-9 The buffer after reading data

In Figure 3-9, 400 bytes of data are read starting from the original readIndex, and readIndex shifts right by 400 bytes. writeIndex is unchanged and the writable region is still 224 bytes, but since 400 bytes of the readable region were consumed, the readable region shrinks from 800 bytes to 400 bytes.

At this point a 400-byte buffer head sits at the start of the memory, before readIndex. Since new data is written at writeIndex and reads start at readIndex, the space occupied by the buffer head is never actually used. As buffered data is further read and written, the writeIndex and readIndex subscripts shift further right, the head region grows, and the memory waste becomes more serious. So we need some mechanism to adjust the position of the buffered data dynamically, eliminating the buffer head and letting that memory be reused. Moving the buffered data has its own cost, however: in Figure 3-9, with 400 bytes of data in the readable region, moving the data to the start of the memory incurs a 400-byte copy. We therefore cannot make this kind of adjustment frequently.

If we continue from Figure 3-9 and read another 400 bytes of data, readIndex moves back another 400 bytes and coincides with writeIndex. The readable region is now 0, there is no readable data, and the buffer head has expanded to 800 bytes. Because the buffer holds no data, we can move readIndex and writeIndex back to the start of the memory without any data-movement cost. The buffer head shrinks back to 0, the previously wasted 800 bytes of memory become usable again, and the whole buffer returns to its initialized state.
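In the Buffer sketch, this zero-cost reset folds naturally into retrieve, refining the earlier version:

    // Refined retrieve for the Buffer sketch: whenever the readable region
    // becomes empty, snap both subscripts back to 0. There is no data left
    // to move, so the buffer head is reclaimed with no copying at all.
    void Buffer::retrieve(size_t n) {
        readIndex_ += n;
        if (readIndex_ == writeIndex_)
            readIndex_ = writeIndex_ = 0;   // empty: back to the initial layout
    }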

By adjusting the buffer's subscript indexes whenever the readable size reaches 0, we reclaim the buffer-head memory at minimal cost. However, we cannot know when, if ever, the system will reach this readable-size-of-0 condition: a buffer under frequent interleaved writes and reads may never reach it, and in that case the wasted buffer-head region persists for a long time. We need to refine the head-adjustment strategy further.

Continuing from Figure 3-9, suppose we write another 300 bytes of data. The remaining writable region is only 224 bytes, so the 300 bytes cannot be written directly. We could enlarge the writable region by letting the vector grow, but vector expansion is expensive: a larger contiguous memory area must be allocated, all the data in the old memory transferred to the new memory, and the old memory released.

Observe, however, that although the writable region is only 224 bytes, the buffer head holds 400 bytes of free memory. The head plus the writable region totals 624 bytes; with a small adjustment, the existing memory can absorb the 300 bytes of new data without growing at all.

Figure 3-10 The buffer after eliminating the buffer head

So when the writable region cannot hold the new data but the writable region plus the buffer head can, we adjust the buffer head again: move readIndex to the start of the memory, copying the data of the readable region to the front and eliminating the head. The appended new data then goes after the existing data, and the new writeIndex position follows from that.

The final result is shown in Figure 3-10. After appending the 300 bytes of new data, writeIndex sits at 700 bytes, leaving 324 bytes of writable memory in the buffer. Although eliminating the buffer head forced us to move the whole of the old data, this cost is small and acceptable compared with expanding the vector.

Figure 3-11 The buffer after expansion

Continuing from Figure 3-10, we write another 500 bytes of new data. Now the writable region is only 324 bytes, the buffer head is empty, and the whole contiguous memory area cannot provide 500 bytes of space, so the only option is to enlarge the vector. Here we extend the contiguous memory to 2048 bytes by resizing the vector; since the resize takes care of moving the old data into the new, larger memory, we only need to append the 500 bytes of new data at writeIndex and adjust writeIndex once more.
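Both adjustment strategies combine into a single makeSpace helper in the Buffer sketch: compact first when the head plus the writable region suffices, and grow only as a last resort. The doubling growth policy below is an assumption chosen to match the 1024-to-2048 expansion in this example.

    #include <algorithm>

    // Fragment of the Buffer sketch: make room for len more bytes.
    void Buffer::makeSpace(size_t len) {
        if (readIndex_ + writableBytes() >= len) {
            // Cheap path: shift the readable data to offset 0, eliminating
            // the buffer head (e.g. a 400-byte copy in Figure 3-10).
            size_t readable = readableBytes();
            std::copy(data_.begin() + readIndex_, data_.begin() + writeIndex_,
                      data_.begin());
            readIndex_ = 0;
            writeIndex_ = readable;
        } else {
            // Expensive path: grow the vector (1024 -> 2048 bytes here);
            // resize reallocates and copies the old data for us.
            data_.resize(std::max(data_.size() * 2, writeIndex_ + len));
        }
    }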

Figure 3-11 shows the data structure after expansion: writeIndex now sits at 1200 bytes, the readable region holds 1200 bytes of data, and the whole buffer can accept up to 848 more bytes of new data.
