In its buffer design, chaos differs slightly from other network libraries. Take libevent-stable-1.4.13 as an example (whenever libevent is mentioned below, this version is meant): it uses a buffer that expands automatically. The basic strategy, sketched in code after the list, is as follows:
1. When the buffer does not have enough room for new data, it first performs an internal marshal (moving the unread data to the front of the buffer) to see whether that frees up enough space.
2. If not, the buffer is expanded to twice its original size. Once expanded, the buffer is never shrunk.
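As a rough illustration of that strategy (this is not libevent's actual code; the struct and field names here are just assumptions for the sketch):

```cpp
#include <cstdlib>
#include <cstring>

// Illustrative auto-expanding buffer; names are assumptions, not libevent's API.
struct expand_buffer {
    char*  data;       // start of the allocated block
    size_t capacity;   // total allocated size
    size_t read_pos;   // offset of the first unread byte
    size_t write_pos;  // offset where new data is appended
};

// Make room for `need` more bytes, following the two steps above.
bool reserve(expand_buffer& buf, size_t need) {
    if (buf.capacity - buf.write_pos >= need)
        return true;                               // tail already has room

    // Step 1: "marshal" -- move unread data to the front to reclaim
    // the space occupied by data that has already been read.
    size_t unread = buf.write_pos - buf.read_pos;
    if (buf.read_pos > 0) {
        std::memmove(buf.data, buf.data + buf.read_pos, unread);
        buf.read_pos  = 0;
        buf.write_pos = unread;
        if (buf.capacity - buf.write_pos >= need)
            return true;
    }

    // Step 2: still not enough -- double the capacity until it fits.
    // Note that the buffer is never shrunk afterwards.
    size_t new_cap = buf.capacity ? buf.capacity : 64;
    while (new_cap - buf.write_pos < need)
        new_cap *= 2;
    char* p = static_cast<char*>(std::realloc(buf.data, new_cap));
    if (!p)
        return false;
    buf.data     = p;
    buf.capacity = new_cap;
    return true;
}
```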
In addition, libevent keeps two buffers, input and output, which serve as the read buffer and the write buffer respectively, and it applies slightly different expansion policies to them.
For input, libevent limits the maximum size to 4096 bytes, which is not enough for today's network environments, especially on an intranet.
For output, libevent imposes no size limit.
With this design there is no problem during normal, steady transmission. But if the upper layer puts a large block of data (more than 1 MB) into output, or rapidly pushes many small pieces of data into it, the underlying I/O multiplexing thread may in some cases fail to respond in time, and output can grow very large. Since the buffer can only grow and never shrink, memory utilization during subsequent transmission becomes very low.
Consider this scenario: after accepting a new connection, an application sends a large volume of data (2 MB) to the peer for initialization. The upper layer does not send the 2 MB in blocks but puts it into output all at once, so output grows straight to at least 2 MB. That by itself is fine, the transfer completes smoothly, but once initialization is done the application only sends small packets, so the 2 MB (or more) held by output is wasted.
The input side fares slightly better, but it has problems too. If the upper layer lets input stay full without reading data out of it, libevent stops reading from the socket into input. With the epoll model on Linux, in LT mode that socket will then show up on every epoll_wait return; in ET mode we must keep reading from the socket until it reports EAGAIN or EWOULDBLOCK, and if input fills up partway through, the same awkward situation arises.
So how do we solve this problem, so that callers can put data of any size into the buffer while the buffer's memory utilization is also improved?
To solve this problem, the chaos network library uses a buffer list design.
Solution
First, the outside should never perceive the buffer as "full": whenever incoming data exceeds the remaining space, the buffer must acquire more memory. But if the buffer is one contiguous block, it is hard to reclaim the unneeded memory later to improve utilization. Shrinking a contiguous block essentially means copying the live data into a smaller block and releasing the original, which is not worth doing. If instead we chain buffers together into a buffer linked list, can the problem be solved effectively?
Let's look at a scenario:
In its initial state the buffer list (BL) holds just one expandable buffer (basically the same as libevent's buffer design). The upper layer keeps pushing data into BL while no reader is taking data out. At some point the buffer in BL fills up, but the upper layer is still writing. The subsequent data does not expand the original buffer; instead, a new buffer is allocated and appended to the tail of the linked list as BL's second node, and the new data goes into that new buffer. If the new data is very large, say n times the size of a single buffer, BL simply allocates n buffers to hold it.
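A minimal sketch of this append behavior (the types and names are illustrative, not the actual interface of buffer_list in chaos):

```cpp
#include <algorithm>
#include <cstring>
#include <list>
#include <vector>

// Illustrative node: a fixed-size chunk with read/write cursors.
struct buffer_node {
    std::vector<char> data;
    size_t read_pos  = 0;
    size_t write_pos = 0;
    explicit buffer_node(size_t cap) : data(cap) {}
    size_t free_space() const { return data.size() - write_pos; }
};

// Illustrative buffer list: grows by adding nodes, never by resizing one.
struct buffer_list {
    std::list<buffer_node> nodes;
    size_t node_size;
    explicit buffer_list(size_t n) : node_size(n) { nodes.emplace_back(node_size); }

    void append(const char* src, size_t len) {
        while (len > 0) {
            buffer_node& tail = nodes.back();
            size_t n = std::min(len, tail.free_space());
            if (n == 0) {
                // Tail is full: append a fresh node instead of expanding it,
                // so a huge write simply produces several nodes.
                nodes.emplace_back(node_size);
                continue;
            }
            std::memcpy(tail.data.data() + tail.write_pos, src, n);
            tail.write_pos += n;
            src += n;
            len -= n;
        }
    }
};
```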
Remember, our goal is twofold: data of any size can be stored, and memory utilization stays high. The above shows how BL supports "unlimited" space; now let's look at how it keeps memory utilization high.
A linked list is efficient for insertion and deletion, and each node is independent and can be reclaimed separately. This solves the low-utilization problem of a single contiguous buffer, where part of the memory can never be reclaimed.
When the reader starts taking data out of BL, it reads from the head of the list; once all the data in the first buffer has been consumed, that buffer's memory is released, and the reader keeps going node by node until it reaches the last buffer in BL (note: the last buffer is never released).
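Continuing the illustrative sketch above, consuming from the head could look roughly like this; note how every drained node except the last one is released:

```cpp
// Uses buffer_node from the sketch above. Copies up to `len` bytes out of
// the list, freeing each node once its data has been fully consumed, but
// always keeping the final node so steady small-packet traffic can reuse it.
size_t consume(std::list<buffer_node>& nodes, char* dst, size_t len) {
    size_t copied = 0;
    while (copied < len && !nodes.empty()) {
        buffer_node& head = nodes.front();
        size_t avail = head.write_pos - head.read_pos;
        size_t n = std::min(avail, len - copied);
        std::memcpy(dst + copied, head.data.data() + head.read_pos, n);
        head.read_pos += n;
        copied += n;
        if (head.read_pos == head.write_pos) {
            if (nodes.size() == 1) {
                head.read_pos = head.write_pos = 0;  // last node: reset, keep it
                break;
            }
            nodes.pop_front();                       // drained node: release it
        }
    }
    return copied;
}
```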
With this design in place, let's revisit the scenario described above.
1. The application accepts a connection and sends 2 MB of data to the peer for initialization. BL now contains multiple buffer nodes (with 8 KB buffers, that is at least 256 nodes), and the total memory occupied by BL is at least 2 MB.
2. The network layer starts pulling data out of BL and sending it to the peer. Each time a buffer has been fully sent, that node is released from BL, until only one buffer remains at the end.
3. The connection enters the stable transmission phase, in which all packets are small (smaller than a single buffer). From then until the connection is closed, BL occupies only 8 KB.
Okay, the problem is solved. But you might object: the advantage of a contiguous buffer is that all of its data can be transmitted to the peer with a single send call (more precisely, copied into the kernel buffer in one call), whereas with the BL structure several send calls may be needed, hurting overall performance.
My answer is:
The BL design targets the buffer growth caused by the few large transfers within a connection's lifetime, while the vast majority of that lifetime is spent transmitting small packets.
If a connection really does transmit large volumes of data continuously, you can raise the maximum size of a single buffer dynamically. Even if BL then holds several buffer nodes during transmission and issues several send calls, this does not become a bottleneck or cause noticeable performance degradation.
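For illustration only, a drain loop over BL's nodes could look like the following sketch (this is not chaos' actual send path; the node type simply mirrors the earlier sketch):

```cpp
#include <sys/types.h>
#include <sys/socket.h>
#include <cerrno>
#include <list>
#include <vector>

struct out_node {                 // same shape as the buffer_node sketched earlier
    std::vector<char> data;
    size_t read_pos = 0, write_pos = 0;
};

// Push as much buffered output to the socket as it will accept right now.
// One send() per node; in the steady state the list holds a single node,
// so this usually degenerates to a single call anyway.
bool flush_output(int fd, std::list<out_node>& nodes) {
    while (!nodes.empty()) {
        out_node& head = nodes.front();
        size_t avail = head.write_pos - head.read_pos;
        if (avail == 0) {
            if (nodes.size() == 1) {              // keep the last node around
                head.read_pos = head.write_pos = 0;
                return true;
            }
            nodes.pop_front();
            continue;
        }
        ssize_t n = ::send(fd, head.data.data() + head.read_pos, avail, 0);
        if (n < 0)                                // kernel buffer full (or error):
            return errno == EAGAIN || errno == EWOULDBLOCK;
        head.read_pos += static_cast<size_t>(n);
    }
    return true;
}
```

If the extra system calls ever did show up in profiling, the nodes could also be handed to the kernel in one go with writev-style scatter-gather, but as argued above they normally do not matter.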
Data movement optimization
When data is to be appended to a buffer and the space remaining at its tail is insufficient, the buffer first checks whether a marshal can make enough room (overwriting the already-read data at the front). There is a problem here: suppose our buffer is large, say 1 MB, and it is completely full; the reader then takes 10 bytes out of it, and we want to append another 10 bytes. The tail has no space left, and although the head now has 10 free bytes, we would have to marshal first, which means moving nearly 1 MB of data in memory, a costly operation. With the BL design we can instead do this: if the marshal would cause a large amount of memory movement (the threshold is configurable), a new buffer node is allocated from the memory pool and appended to BL to hold the data. Allocating such a node is almost nothing more than a pointer operation.
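A small sketch of that decision; the threshold value and the names here are hypothetical, not values taken from chaos:

```cpp
#include <cstddef>

// Hypothetical threshold: if a marshal would have to memmove more than this
// many bytes, append a fresh node from the memory pool instead of moving data.
constexpr size_t kMaxMarshalBytes = 64 * 1024;

enum class room_action { use_tail, marshal, new_node };

// Decide how to make room for `need` bytes at the tail of the current node.
// `tail_free`/`head_free` are the free bytes after and before the live data,
// `unread` is how much live data a marshal would have to move.
room_action plan_room(size_t tail_free, size_t head_free,
                      size_t unread, size_t need) {
    if (tail_free >= need)
        return room_action::use_tail;          // fits without any work
    if (tail_free + head_free >= need && unread <= kMaxMarshalBytes)
        return room_action::marshal;           // cheap enough to memmove
    // Moving ~1 MB of live data (the example above) would be expensive:
    // grab another node from the pool instead -- essentially pointer work.
    return room_action::new_node;
}
```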
A single packet stored across buffers
This situation can arise: when we read data from the socket, a single packet may be split into two (or n) parts stored in consecutive buffers in BL. Those n consecutive buffer segments then need to be spliced back into one complete packet before being passed to the logic layer. Chaos does not do this splicing inside BL, because BL should only do its own job; in other words, it knows nothing about the content of the data. That work is handed to conn_strategy in the upper layer. One thing worth mentioning: when a packet sits entirely inside a single buffer, chaos passes it to the logic layer with zero copy.
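As a rough sketch of what that splicing in the upper layer might look like (conn_strategy's real interface is not shown; the names here are illustrative):

```cpp
#include <algorithm>
#include <cstring>
#include <list>
#include <vector>

struct buffer_node {              // same shape as the node sketched earlier
    std::vector<char> data;
    size_t read_pos = 0, write_pos = 0;
};

// Deliver one packet of `pkt_len` bytes to the logic layer. If the whole
// packet lies inside the head node it is handed over by pointer (zero copy);
// otherwise the pieces from consecutive nodes are spliced into `scratch`.
template <typename Handler>
void deliver_packet(std::list<buffer_node>& nodes, size_t pkt_len,
                    std::vector<char>& scratch, Handler on_packet) {
    buffer_node& head = nodes.front();
    size_t in_head = head.write_pos - head.read_pos;
    if (in_head >= pkt_len) {                       // zero-copy path
        on_packet(head.data.data() + head.read_pos, pkt_len);
        head.read_pos += pkt_len;
        return;
    }
    scratch.resize(pkt_len);                        // splice path
    size_t copied = 0;
    for (auto it = nodes.begin(); it != nodes.end() && copied < pkt_len; ++it) {
        size_t avail = it->write_pos - it->read_pos;
        size_t n = std::min(avail, pkt_len - copied);
        std::memcpy(scratch.data() + copied, it->data.data() + it->read_pos, n);
        it->read_pos += n;
        copied += n;
    }
    on_packet(scratch.data(), pkt_len);
    // Fully drained head nodes would then be released, as described above.
}
```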
That wraps up the buffer design of the chaos network library: the buffer list. The complete code can be viewed at:
https://github.com/lyjdamzwf/chaos/tree/master/chaos/network/buffer_list.h
https://github.com/lyjdamzwf/chaos/tree/master/chaos/network/buffer_list.cpp
Address: http://www.cppthinker.com/chaos/94/buffer/