Today most CPUs have multiple cores. To make full use of a multi-core processor and improve the server's concurrency, multi-threading support is essential. Our previous design was single-threaded; in this article we extend the system to support multiple threads and further improve its performance.
We have already chosen a reactor model based on I/O multiplexing, so how should we handle I/O in a multithreaded environment? Can multiple threads safely operate on the same socket descriptor at the same time? Does the reactor pattern support multithreading at all?
Consulting the documentation shows that the common system calls on file descriptors, such as read and write, are thread-safe, so we need not worry that multiple threads operating on a file descriptor at the same time will crash the process. As UNPv1 describes, reading and writing the same socket from two threads is also safe, because a TCP socket is bidirectional I/O.
However, we still have to consider the following situations:
- If two threads read the same socket at the same time and each receives part of the same message, how do we combine the two fragments into one complete message?
- If two threads write the same socket at the same time and each has sent only half of its message, how is the receiver supposed to interpret the interleaved data?
Giving each socket a lock, so that only one thread at a time can read or write it, would solve the problem. However, in the reactor pattern we should avoid any operation that blocks the event loop thread: if an event handler blocks while competing for a lock, every event queued behind it in that thread is blocked as well.
Therefore, even though the common system calls on descriptors are thread-safe, sharing a descriptor among multiple threads complicates the entire business logic. We could mitigate this to some extent with application-layer I/O buffers and a locking mechanism, but that still leads to blocked threads and degraded server performance, and is not worth the trouble. We conclude that in a multithreaded environment each file descriptor should be operated on by exactly one thread. This both preserves the order in which messages are sent and received and avoids all forms of lock contention.
Following this principle, we continue to register the read and write events of each connection socket with a single reactor. As described in the previous article, each reactor contains one event loop, so each reactor should run in a single thread while supporting multiple registered connection sockets. But if all connection sockets were registered in one thread, our system would degenerate into a single-threaded server. We should therefore distribute new connections evenly across different reactor event loops, letting multiple threads each own one reactor and handle all the events registered within it.
Introduction to the one-loop-per-thread model
Based on the above analysis, the multithreaded model of our server is now broadly clear: non-blocking I/O plus the one-loop-per-thread model. In this model we create multiple threads, each thread creates one reactor, and each reactor runs one event loop that waits for registered events and handles their reads and writes. When we want a particular thread to manage a connection, we register the new connection socket with that thread's reactor (a minimal code sketch follows the list of advantages below).
In this model, each socket can only be registered with the reactor of one thread and cannot be moved across reactors. Even so, partitioning connections by thread gives the system great flexibility in load distribution. For example, a connection with strict latency requirements can monopolize one thread; a connection that handles a large amount of data can likewise monopolize a thread and hand some of its processing off to a few dedicated compute threads; and relatively minor auxiliary connections can share a single thread. As long as every connection's handler is non-blocking, event-handling latency stays low.
We can summarize the advantages of this model as follows:
- The number of threads is fixed at program startup and managed as a thread pool, avoiding the cost of frequently creating and destroying threads.
- It is convenient to adjust the load between threads.
- Each TCP connection is bound to one thread for its entire lifetime, so we never have to consider concurrent handling of the same connection's events.
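As a minimal sketch of the one-loop-per-thread model (the Reactor class here is a placeholder for the reactor developed in the previous articles, not its real interface), each thread constructs one reactor and runs its event loop:

```cpp
#include <thread>
#include <vector>

// Placeholder for the reactor from the previous articles: one event
// loop that blocks in the I/O multiplexing call (e.g. epoll_wait)
// and dispatches handlers for the ready descriptors.
class Reactor {
public:
    void loop() {
        // Real implementation: while (!quit) { epoll_wait(...); dispatch(); }
        // Stubbed out in this sketch.
    }
};

int main() {
    const int kNumThreads = 4;  // fixed at startup, e.g. from the CPU core count
    std::vector<Reactor> reactors(kNumThreads);
    std::vector<std::thread> threads;

    // One loop per thread: each thread owns exactly one reactor, and a
    // connection socket is only ever registered with one of them.
    for (int i = 0; i < kNumThreads; ++i) {
        threads.emplace_back([&reactors, i] { reactors[i].loop(); });
    }
    for (std::thread& t : threads) t.join();
    return 0;
}
```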
Design of the inter-thread task queue model
The inter-thread task queue is one form of multithreaded processing. A task that needs to run in a particular thread is posted to that thread's task queue; when the thread detects a task in its queue, it removes and executes it. When all tasks in the queue have been completed, the thread blocks until a newly posted task wakes it up.
First, setting the reactor pattern aside, we design a model that meets these requirements. Its key data structure is the task queue.
The task queue resembles an effectively unbounded multi-producer, multi-consumer buffer, protected for multithreaded use by a condition variable. Producers and consumers run in different threads: a producer appends a task to the tail of the buffer via post, and a consumer removes a task from the head via take. Since there may be multiple producers, a producer must acquire the synchronization lock before it can add a task. A consumer must not only acquire the lock but also check whether the queue currently holds a task: if so, it takes it; if not, it blocks on the condition variable until a producer adds a new task and performs a wake-up.
The buffer underlying the task queue must support reading from the head and writing to the tail. Ideally it also grows dynamically, so that from the producer's point of view it is effectively unbounded, and a producer can never block because too many tasks were written. In the STL, std::deque is a dynamically growing, segmented-contiguous, double-ended container that meets these requirements, so we use std::deque as the buffer implementation.
The Task elements stored in the buffer are similar to the callback functions analyzed earlier in this series: objects that can be written into the buffer as data, invoked like a function when read back out, and that ideally carry their own arguments. The boost::function<void()> function object from the Boost library implements exactly this: it can be assigned a plain function, or a function with arguments or a member function bound via boost::bind in the Boost library; it can be stored in a container as an ordinary object and executed later as a callback.
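For illustration, here is a small example of that technique (the Session class and function names are made up): boost::function<void()> can hold either a plain function or a member function with its arguments pre-bound by boost::bind, and both are invoked the same way:

```cpp
#include <boost/bind.hpp>
#include <boost/function.hpp>
#include <iostream>

void plainTask() { std::cout << "plain function task\n"; }

class Session {
public:
    void send(int nbytes) { std::cout << "sending " << nbytes << " bytes\n"; }
};

int main() {
    Session session;

    // Both tasks have the uniform type boost::function<void()>,
    // so both can be stored in the same container.
    boost::function<void()> t1 = plainTask;
    boost::function<void()> t2 = boost::bind(&Session::send, &session, 128);

    t1();  // prints "plain function task"
    t2();  // prints "sending 128 bytes"
    return 0;
}
```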
Putting this together, our task queue is a buffer protected by a lock and condition variable, whose underlying data structure is std::deque<boost::function<void()> >.
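Below is a minimal sketch of such a task queue. It uses C++11 std::mutex and std::condition_variable for the lock and condition variable, and std::function<void()> as a drop-in stand-in for the boost::function<void()> named above:

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>

// A minimal multi-producer, multi-consumer task queue as described
// above. std::function<void()> stands in for boost::function<void()>;
// the two are interchangeable here.
class TaskQueue {
public:
    using Task = std::function<void()>;

    // Producer side: append a task to the tail and wake one consumer.
    void post(Task task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push_back(std::move(task));
        }
        notEmpty_.notify_one();
    }

    // Consumer side: block until a task is available, then pop the head.
    Task take() {
        std::unique_lock<std::mutex> lock(mutex_);
        notEmpty_.wait(lock, [this] { return !queue_.empty(); });
        Task task = std::move(queue_.front());
        queue_.pop_front();
        return task;
    }

private:
    std::mutex mutex_;
    std::condition_variable notEmpty_;
    std::deque<Task> queue_;  // grows dynamically, so post() never blocks
};
```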
Finally, the inter-thread task queue works as follows. Each thread's body is a loop that repeatedly takes available tasks from the task queue. If the queue is currently empty, the take operation blocks the thread until another thread adds a new task, at which point the thread wakes up and retrieves it. Having obtained a task, the thread executes its callback, then re-enters the loop and tries to take the next available task.
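Continuing the TaskQueue sketch above, the thread body is then a simple take-and-run loop (a shutdown mechanism is omitted for brevity):

```cpp
// Thread body of the inter-thread task queue model: repeatedly take a
// task and execute its callback in this thread. take() blocks while
// the queue is empty, so the thread sleeps until a producer posts
// new work, e.g. queue.post(boost::bind(&Session::send, &session, 128)).
void workerLoop(TaskQueue& queue) {
    for (;;) {
        TaskQueue::Task task = queue.take();
        task();
    }
}
```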
With this inter-thread task queue, we can move a piece of work from the thread that produces it into the thread that is supposed to execute it.
Combining the threading model with the reactor pattern
The inter-thread task queue above was designed without considering the characteristics of the reactor pattern or the specific requirements of our server system, so we still need to adapt it to the system as a whole.
The prototype of the system in reactor mode is similar to Figure xxx: its core is the event demultiplexer listening for events within the event loop and calling back the handler of each specific event. We add a task queue to each reactor, buffering the tasks that other threads post to that reactor's thread. We also need the reactor to notice when another thread has added a task. The reactor as previously designed does not watch the task queue, and it may be blocked in the epoll event wait; if no event arrives for a long time, the entire reactor thread stays blocked, and even if another thread adds a task to its queue, the task cannot be executed in a timely manner.
To solve this, we create an additional pipe for each reactor and register a readable event for the pipe's read descriptor with that reactor. The write end is exposed to other threads: after another thread adds a new task to the reactor's task queue, it obtains the reactor's pipe descriptor and performs a write. Writing a single arbitrary byte is enough to wake a reactor that may be blocked in the event wait, notifying it that the task queue has work that needs processing.
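A sketch of this wakeup pipe follows (registerReadable is a hypothetical stand-in for the reactor's event-registration call; on Linux, eventfd(2) is a common lighter-weight alternative to a pipe):

```cpp
#include <unistd.h>

int wakeupFds[2];  // wakeupFds[0]: read end, wakeupFds[1]: write end

// Run once during the reactor's initialization phase.
void createWakeupPipe() {
    ::pipe(wakeupFds);
    // registerReadable(wakeupFds[0]);  // hypothetical: add the read end
    //                                  // to this reactor's epoll set
}

// Called from any thread after posting a task to the reactor's queue.
// The byte's value is irrelevant; only the readable event matters.
void wakeup() {
    char one = 1;
    ::write(wakeupFds[1], &one, sizeof(one));
}

// Handler for the readable event on the pipe: drain the pending bytes
// so the descriptor does not stay permanently readable.
void handleWakeup() {
    char buf[64];
    ::read(wakeupFds[0], buf, sizeof(buf));
}
```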
We therefore redesign the reactor to support the task queue. During initialization the reactor creates the pipe and registers its read descriptor so that other threads can wake it. After the reactor has run the handlers for all activated events, it checks whether its task queue is empty. Here it differs from the standalone threading model: if the queue is empty there is simply nothing to execute, and rather than block, the reactor skips directly to the next cycle; if the queue is not empty, all queued tasks are read out and their callbacks executed before the next round begins. Because the reactor must remain as non-blocking as possible and its core job is event handling, the task queue is treated like a special kind of event handling attached to the pipe descriptor: tasks are handled when present and skipped when absent, never waited for.
From any other thread, adding a task to a reactor then takes only three steps: get the reactor's task queue, append the task to it, and wake the reactor by writing arbitrary data to its pipe descriptor.
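Putting the pieces together, here is a sketch of the task-queue-aware reactor (the method and member names are assumptions; the epoll wait and the pipe write are stubbed, as they were shown earlier):

```cpp
#include <deque>
#include <functional>
#include <mutex>

class Reactor {
public:
    using Task = std::function<void()>;

    // Called from any thread: append the task under the lock, then
    // wake the loop in case it is blocked in the event wait.
    void queueInLoop(Task task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push_back(std::move(task));
        }
        wakeup();  // write one byte to the wakeup pipe (see earlier sketch)
    }

    void loop() {
        while (!quit_) {
            waitAndDispatchEvents();  // epoll_wait + handlers for ready fds
            runPendingTasks();        // then drain the queue; never block here
        }
    }

private:
    // Unlike the standalone task-queue model, an empty queue is simply
    // skipped and the loop proceeds to the next event wait.
    void runPendingTasks() {
        std::deque<Task> pending;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending.swap(tasks_);  // move tasks out, then run them unlocked
        }
        for (Task& task : pending) task();
    }

    void waitAndDispatchEvents() { /* epoll_wait + dispatch, stubbed */ }
    void wakeup() { /* ::write to the pipe's write end, stubbed */ }

    std::mutex mutex_;
    std::deque<Task> tasks_;
    bool quit_ = false;
};
```

Swapping the whole queue out under the lock and running the callbacks unlocked keeps the critical section short, and also lets a task itself call queueInLoop without deadlocking.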
Applying multithreading in the server system
In the server system, we use this multithreading-capable reactor model, combined with the business scenarios of new-connection creation and thread allocation, to determine the final underlying model of the server.
There is one main reactor in the system, which listens for new connections via accept. Whenever a new connection arrives, the main reactor calls back the listening socket's handler, which creates a task: register the new connection with a designated worker reactor and send that reactor a wake-up.
At the same time, the system manages multiple worker reactors through a thread pool. The number of worker reactors is configurable and is typically chosen according to the number of CPUs. Whenever the listening reactor produces a new connection, a worker reactor is selected from the pool by round-robin scheduling as the target of the new task. The selected worker reactor becomes the actual manager of that connection: all of the connection's subsequent work is done in the worker reactor's thread.
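Finally, a sketch of the dispatch path, continuing the Reactor sketch above (ReactorThreadPool, nextReactor, and registerConnection are assumed names): the main reactor's accept handler picks a worker round-robin and posts the registration into the worker's own thread:

```cpp
#include <cstddef>
#include <vector>

// Round-robin selection over the fixed set of worker reactors.
class ReactorThreadPool {
public:
    explicit ReactorThreadPool(std::vector<Reactor*> workers)
        : workers_(std::move(workers)) {}

    Reactor* nextReactor() {
        Reactor* r = workers_[next_];
        next_ = (next_ + 1) % workers_.size();
        return r;
    }

private:
    std::vector<Reactor*> workers_;
    std::size_t next_ = 0;
};

// Hypothetical: add connfd's read/write events to the worker's epoll
// set. Must run in the worker reactor's own thread.
void registerConnection(Reactor* worker, int connfd) { /* stubbed */ }

// Accept handler, called back in the main reactor's thread for each
// new connection.
void onNewConnection(ReactorThreadPool& pool, int connfd) {
    Reactor* worker = pool.nextReactor();
    // Hand the registration over to the worker's thread and wake it,
    // so the connection is managed entirely by that one thread.
    worker->queueInLoop([worker, connfd] { registerConnection(worker, connfd); });
}
```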
With this design, the system not only exploits the performance of multi-core CPUs through multithreading, but, because the number of threads is fixed, also avoids the total processing capacity degrading as the number of connections grows. At the same time, because each connection is managed entirely by one thread, its reads, writes, and event handling are guaranteed to execute sequentially, simplifying the business logic under multithreading.