Chapter 8: High-Performance Server Programming in Linux (High-Performance Server Program Framework)


8. High-performance server program framework


The server can be decomposed into three main modules:
IO processing unit: four IO models and two efficient event processing modes.
Logical unit: two efficient concurrency modes.
Storage unit: not discussed for the moment.


IO models:
Blocking IO
Non-blocking IO // calls return immediately instead of blocking (see the fcntl sketch after this list).
IO multiplexing // the program blocks on the IO multiplexing system call, but can wait for multiple IO events at the same time.
SIGIO signal // the signal reports a read/write readiness event; the user program then performs the read/write itself, so the program is not blocked.
Asynchronous IO // the kernel performs the read/write and reports a read/write completion event, so the program is not blocked.
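Blocking versus non-blocking behavior is a property of the file descriptor itself. As a minimal sketch of the mechanics the non-blocking and multiplexed models rely on, here is how a socket is typically switched to non-blocking mode with fcntl; set_nonblocking is an illustrative name.

    /* Minimal sketch: put a socket into non-blocking mode with fcntl.
       Afterward, read/write return -1 with errno == EAGAIN instead of
       blocking when no data (or no buffer space) is available. */
    #include <fcntl.h>

    int set_nonblocking(int fd)
    {
        int flags = fcntl(fd, F_GETFL, 0);   /* read current status flags */
        if (flags == -1)
            return -1;
        return fcntl(fd, F_SETFL, flags | O_NONBLOCK);  /* add O_NONBLOCK */
    }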


Two efficient event processing modes:
The server usually processes three types of events: IO events, signal events, and timer events.
The Reactor mode is usually implemented with a synchronous IO model, while the Proactor mode is implemented with an asynchronous IO model.


Reactor mode:
The workflow of the Reactor mode implemented with a synchronous I/O model (taking epoll_wait as an example):
1. The main thread registers a read-ready event on the socket in the epoll kernel event table.
2. The main thread calls epoll_wait to wait for data to become readable on the socket.
3. When data becomes readable on the socket, epoll_wait notifies the main thread, which puts the readable event into the request queue.
4. A worker thread sleeping on the request queue is awakened. It reads the data from the socket, processes the client request, and then registers a write-ready event on the socket in the epoll kernel event table.
5. The main thread calls epoll_wait to wait until the socket becomes writable.
6. When the socket becomes writable, epoll_wait notifies the main thread, which puts the writable event into the request queue.
7. A worker thread sleeping on the request queue is awakened. It writes the server's result for the client request to the socket.


After a worker thread takes an event off the queue, it decides what to do based on whether the event is readable or writable: read data and process the request, or write data. In the Reactor mode there is therefore no need to distinguish so-called "read worker threads" from "write worker threads". A minimal sketch of the main thread's loop follows.
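A minimal sketch of the Reactor main thread described above, using epoll. enqueue_event is a hypothetical helper standing in for steps 3 and 6 (inserting the event into the request queue and waking a worker); error handling is omitted.

    /* Reactor main thread (sketch): detect ready events with epoll_wait
       and hand them to the worker threads via the request queue. */
    #include <sys/epoll.h>

    #define MAX_EVENTS 64

    /* hypothetical helper: push (fd, event type) into the request queue
       and wake one sleeping worker thread */
    extern void enqueue_event(int fd, unsigned int ev);

    void reactor_loop(int epfd)
    {
        struct epoll_event events[MAX_EVENTS];

        for (;;) {
            int n = epoll_wait(epfd, events, MAX_EVENTS, -1); /* steps 2 and 5 */
            for (int i = 0; i < n; i++) {
                int fd = events[i].data.fd;
                if (events[i].events & EPOLLIN)
                    enqueue_event(fd, EPOLLIN);   /* step 3: readable event */
                if (events[i].events & EPOLLOUT)
                    enqueue_event(fd, EPOLLOUT);  /* step 6: writable event */
            }
        }
    }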




Proactor mode:
Unlike the Reactor mode, the Proactor mode delegates all IO operations to the main thread and the kernel; the worker threads are responsible only for the business logic.
Workflow (taking aio_read and aio_write as examples):
1. The main thread calls aio_read to register a read completion event on the socket with the kernel, telling the kernel the location of the user read buffer and how to notify the application when the read completes (here a signal is used as the example; see the sigevent man page for the alternatives). A sketch of this step follows the list.
2. The main thread continues to process other logic.
3. When the data on the socket has been read into the user buffer, the kernel sends the application a signal to notify it that the data is available.
4. The signal handler predefined by the application picks a worker thread to process the client request. When the worker thread finishes, it calls aio_write to register a write completion event on the socket with the kernel, telling the kernel the location of the user write buffer and how to notify the application when the write completes (again using a signal as the example).
5. The main thread continues to process other logic.
6. When the data in the user buffer has been written to the socket, the kernel sends the application a signal to notify it that the data has been sent.
7. The signal handler predefined by the application picks a worker thread to do the follow-up processing, for example deciding whether to close the socket.
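A minimal sketch of step 1, using the POSIX AIO interface the text names. Note that glibc implements POSIX AIO with user-space threads and its behavior on sockets varies, so this illustrates the registration pattern rather than production code.

    /* Proactor step 1 (sketch): ask the kernel to read from the socket
       into a user buffer and raise SIGIO when the read has completed. */
    #include <aio.h>
    #include <signal.h>
    #include <string.h>

    #define BUF_SIZE 4096

    static char buf[BUF_SIZE];
    static struct aiocb cb;

    int post_read(int sockfd)
    {
        memset(&cb, 0, sizeof(cb));
        cb.aio_fildes = sockfd;      /* descriptor to read from        */
        cb.aio_buf    = buf;         /* user buffer the kernel fills   */
        cb.aio_nbytes = BUF_SIZE;    /* maximum amount to read         */
        cb.aio_sigevent.sigev_notify          = SIGEV_SIGNAL;  /* notify via signal */
        cb.aio_sigevent.sigev_signo           = SIGIO;
        cb.aio_sigevent.sigev_value.sival_ptr = &cb;   /* handed to the handler */
        return aio_read(&cb);        /* returns at once; the kernel does the IO */
    }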




The workflow of simulating the Proactor mode with a synchronous I/O model (again taking epoll_wait as an example) is as follows:
1. The main thread registers a read-ready event on the socket in the epoll kernel event table.
2. The main thread calls epoll_wait to wait for data to become readable on the socket.
3. When data becomes readable on the socket, epoll_wait notifies the main thread. The main thread reads from the socket in a loop until no more data is available (sketched in code after this list), then encapsulates the data it read into a request object and inserts the object into the request queue.
4. A worker thread sleeping on the request queue is awakened. It takes the request object, processes the client request, and then registers a write-ready event on the socket in the epoll kernel event table.
5. The main thread calls epoll_wait to wait until the socket becomes writable.
6. When the socket becomes writable, epoll_wait notifies the main thread, and the main thread writes the server's result for the client request to the socket.
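A sketch of step 3 on a non-blocking socket: the main thread performs the read itself, so by the time a worker sees the request the IO is already done, which is what makes this a simulated Proactor. enqueue_request is a hypothetical helper (assumed to copy the data), and the fixed 4 KB buffer only keeps the sketch short.

    /* Simulated Proactor, step 3 (sketch): drain a non-blocking socket,
       then hand the completed "read" to a worker as a request object. */
    #include <errno.h>
    #include <stddef.h>
    #include <unistd.h>

    /* hypothetical helper: copy the data into a request object and enqueue it */
    extern void enqueue_request(int fd, const char *data, size_t len);

    void drain_socket(int fd)
    {
        char buf[4096];
        size_t used = 0;

        while (used < sizeof(buf)) {
            ssize_t n = read(fd, buf + used, sizeof(buf) - used);
            if (n > 0)
                used += n;               /* keep reading until drained */
            else if (n == 0)
                break;                   /* peer closed; process what we have */
            else if (errno == EAGAIN || errno == EWOULDBLOCK)
                break;                   /* no more data for now */
            else
                return;                  /* real error: caller should close fd */
        }
        enqueue_request(fd, buf, used);  /* step 3: insert into request queue */
    }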




Two efficient concurrency modes:
Semi-synchronous/semi-asynchronous mode:

Here "synchronous" and "asynchronous" mean something entirely different from the "synchronous" and "asynchronous" of the IO models. In the IO models they distinguish which kind of IO event the kernel reports to the application (a ready event or a completion event) and who performs the IO read/write (the application or the kernel). In the concurrency modes, "synchronous" means the program executes strictly in the order of its code, while "asynchronous" means execution is driven by system events, such as interrupts and signals.
Asynchronous threads execute efficiently and with good real-time behavior, which is why many embedded systems use this model. However, asynchronously executed programs are relatively complex, hard to debug and extend, and unsuited to large amounts of concurrency. Synchronous threads, conversely, are less efficient and have poorer real-time behavior, but their logic is simple.
In the semi-synchronous/semi-asynchronous mode, synchronous threads process the client logic while asynchronous threads handle the IO events. When an asynchronous thread detects a client request, it encapsulates the request into a request object and inserts it into the request queue; the request queue then notifies a worker thread running in synchronous mode to take the request object and process it.
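A sketch of the shared request queue that connects the two halves, using a pthread mutex and condition variable. The names are illustrative, and a LIFO list is used only to keep the sketch short; a real server would preserve FIFO order.

    /* Shared request queue (sketch): the asynchronous IO thread pushes
       request objects; synchronous worker threads sleep until one arrives. */
    #include <pthread.h>
    #include <stddef.h>

    struct request {
        struct request *next;
        int fd;                          /* connection the request came from */
    };

    static struct request *head;
    static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  cv  = PTHREAD_COND_INITIALIZER;

    void queue_push(struct request *r)   /* called by the async IO thread */
    {
        pthread_mutex_lock(&mtx);
        r->next = head;
        head = r;
        pthread_mutex_unlock(&mtx);
        pthread_cond_signal(&cv);        /* wake one sleeping worker */
    }

    struct request *queue_pop(void)      /* called by sync worker threads */
    {
        pthread_mutex_lock(&mtx);
        while (head == NULL)
            pthread_cond_wait(&cv, &mtx);   /* sleep until work arrives */
        struct request *r = head;
        head = r->next;
        pthread_mutex_unlock(&mtx);
        return r;
    }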




The semi-synchronous/semi-asynchronous mode has the following disadvantages:
1. The main thread and the worker threads share the request queue, so access to it must be synchronized.
2. Each worker thread can handle only one client request at a time.
A more efficient variant: each worker thread can handle multiple client connections at the same time.
In this variant the main thread manages only the listening socket; connection sockets are managed by the worker threads. When a new connection arrives, the main thread accepts it and dispatches the newly returned connection socket to one of the worker threads; from then on, all IO on that socket is handled by the chosen worker thread, until the client closes the connection. The simplest way for the main thread to dispatch a socket to a worker thread is to write data to the pipe between them. When the worker thread detects readable data on its pipe, it checks whether a new client connection has arrived; if so, it registers the read/write events on the new socket in its own epoll kernel event table. A sketch of this dispatch path follows below.
Every thread (the main thread and each worker thread) maintains its own event loop and listens for its own events independently. Since each thread works in asynchronous mode, this variant is not a semi-synchronous/semi-asynchronous mode in the strict sense.
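The dispatch path sketched in C. Writing the descriptor's integer value down a pipe works here because threads share one file-descriptor table; between processes one would have to pass the descriptor over a UNIX domain socket instead. dispatch and on_notify are illustrative names.

    /* Per-thread event loop variant (sketch): the main thread hands a new
       connection to a worker through a pipe; the worker's epoll set watches
       the pipe's read end and adopts the connection when it arrives. */
    #include <sys/epoll.h>
    #include <unistd.h>

    struct worker {
        int notify_fd;                   /* write end of this worker's pipe */
    };

    /* main thread: dispatch a freshly accepted connfd to a chosen worker */
    void dispatch(struct worker *w, int connfd)
    {
        write(w->notify_fd, &connfd, sizeof(connfd));
    }

    /* worker thread: its event loop saw the pipe's read end become readable */
    void on_notify(int pipe_rd, int my_epfd)
    {
        int connfd;
        if (read(pipe_rd, &connfd, sizeof(connfd)) == sizeof(connfd)) {
            struct epoll_event ev = { .events = EPOLLIN, .data.fd = connfd };
            epoll_ctl(my_epfd, EPOLL_CTL_ADD, connfd, &ev);  /* now owned here */
        }
    }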



Leader/Followers mode:
This is a mode in which multiple worker threads take turns acquiring the event source and take turns listening for, distributing, and processing events. At any point in time the program has exactly one leader thread, which listens for IO events; the other threads are followers that sleep in the thread pool, waiting to become the next leader. When the current leader detects an IO event, it first elects a new leader thread from the pool and only then processes the event. The new leader then waits for further IO events while the old leader processes the one it caught, so the two run concurrently.
The pattern contains the following components: the handle set (HandleSet), the thread set (ThreadSet), the event handler (EventHandler) and its concrete subclasses (ConcreteEventHandler).
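A deliberately minimal sketch of the thread loop, in which holding a mutex plays the role of being the leader: whichever thread owns the mutex listens, and releasing it promotes the next waiting follower. handle_event is a hypothetical handler; a full implementation would also manage the handle set and thread set the pattern describes.

    /* Leader/Followers (sketch): threads take turns being the leader. */
    #include <pthread.h>
    #include <sys/epoll.h>

    static pthread_mutex_t leader_mtx = PTHREAD_MUTEX_INITIALIZER;

    extern void handle_event(struct epoll_event *ev);  /* hypothetical */

    void *lf_thread(void *arg)
    {
        int epfd = *(int *)arg;
        struct epoll_event ev;

        for (;;) {
            pthread_mutex_lock(&leader_mtx);       /* become the leader       */
            int n = epoll_wait(epfd, &ev, 1, -1);  /* only the leader listens */
            pthread_mutex_unlock(&leader_mtx);     /* promote a new leader    */
            if (n == 1)
                handle_event(&ev);                 /* process as a follower   */
        }
        return NULL;
    }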




Other suggestions for improving server performance:


Pool:


If the server's hardware resources are relatively "abundant", a very direct way to improve performance is to trade space for time, that is, to "waste" hardware resources in exchange for running efficiency. This is the idea behind a pool. A pool is a collection of resources that are created and initialized in full when the server starts, which is called static resource allocation. When the server needs a resource at run time, it takes one directly from the pool instead of allocating it dynamically, which is much faster, because the system calls that allocate resources are time-consuming. When the server finishes handling a client connection, it puts the related resources back into the pool instead of executing a system call to release them. In effect, the pool is an application-level facility for managing system resources, and it spares the server frequent trips into the kernel.


Pools by resource type:
Memory pool: usually used for a socket's receive and send buffers (a minimal sketch follows this list).
Process pool and thread pool: common "tricks" of concurrent programming.
Connection pool: used for permanent connections within a server or a server cluster.
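A minimal sketch of the memory-pool idea: a fixed set of buffers allocated once at startup and recycled through a free list, so the fast path never enters the kernel. It is deliberately not thread-safe; a real pool would guard the free list with a lock or keep one pool per thread.

    /* Memory pool (sketch): allocate all buffers once, then reuse them. */
    #include <stddef.h>
    #include <stdlib.h>

    #define POOL_SIZE 1024               /* number of buffers in the pool */
    #define BUF_SIZE  4096               /* size of each buffer           */

    static void *free_list[POOL_SIZE];
    static size_t free_top;

    void pool_init(void)                 /* static allocation at startup */
    {
        for (free_top = 0; free_top < POOL_SIZE; free_top++)
            free_list[free_top] = malloc(BUF_SIZE);
    }

    void *pool_get(void)                 /* no system call on this path */
    {
        return free_top ? free_list[--free_top] : NULL;
    }

    void pool_put(void *buf)             /* return to the pool, not to the OS */
    {
        free_list[free_top++] = buf;
    }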




Data Replication:


Unnecessary data copies should be avoided, especially copies between user code and the kernel. If the kernel can process the data in a socket or file directly, the application should not copy it from the kernel buffer into an application buffer. For example, an FTP server asked for a file only needs to check that the target file exists and that the client has permission to read it; it never has to look at the file's content, so it can use the "zero copy" function sendfile to send the file to the client directly.
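A sketch of the FTP example with sendfile, which moves the file's pages to the socket entirely inside the kernel. send_file is an illustrative wrapper; permission checks and error reporting are left out.

    /* Zero copy (sketch): transmit a whole file to a connected socket
       without ever copying its contents into user space. */
    #include <fcntl.h>
    #include <sys/sendfile.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int send_file(int sockfd, const char *path)
    {
        int fd = open(path, O_RDONLY);
        if (fd == -1)
            return -1;                   /* e.g. the file does not exist */

        struct stat st;
        fstat(fd, &st);

        off_t off = 0;
        while (off < st.st_size) {
            /* the kernel copies from the file to the socket directly;
               `off` is advanced by the amount actually sent */
            ssize_t n = sendfile(sockfd, fd, &off, st.st_size - off);
            if (n <= 0)
                break;
        }
        close(fd);
        return off == st.st_size ? 0 : -1;
    }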
Data copies within user code (copies that never touch the kernel) should also be avoided. For example, when two worker processes need to pass a large amount of data between them, consider placing the data in shared memory that both can access directly, rather than transmitting it through a pipe or a message queue.
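A sketch of setting up such a shared region with POSIX shared memory; both worker processes call the same function and end up mapping the same pages. The region name and size are illustrative; on older glibc, link with -lrt.

    /* Shared memory (sketch): two processes map the same region instead of
       copying data through a pipe or message queue. */
    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    #define SHM_NAME "/srv_shared"       /* hypothetical region name     */
    #define SHM_SIZE (1 << 20)           /* 1 MiB, chosen for the example */

    void *attach_shared(void)
    {
        int fd = shm_open(SHM_NAME, O_CREAT | O_RDWR, 0600);
        if (fd == -1)
            return NULL;
        if (ftruncate(fd, SHM_SIZE) == -1) {  /* size the region */
            close(fd);
            return NULL;
        }
        void *p = mmap(NULL, SHM_SIZE, PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);    /* visible to every process
                                                 that maps the same name */
        close(fd);                            /* the mapping survives close */
        return p == MAP_FAILED ? NULL : p;
    }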




Context switches and locks:
Concurrent programs must take the cost of context switches into account, that is, the system overhead of switching between processes or threads. Even an IO-intensive server should not use too many worker threads (or processes, the same below); otherwise the switching between them consumes a great deal of CPU time, and the server has too little CPU time left for the business logic. A server model that creates one thread per client connection is therefore not advisable. The semi-synchronous/semi-asynchronous variant described earlier, in which one thread handles several client connections, is a more reasonable solution. Multi-threaded servers do have the advantage that different threads can run on different CPUs at the same time, and when the number of threads does not exceed the number of CPUs, context switching is not a problem.
Another issue concurrent programs must consider is the locking of shared resources. Locks are usually regarded as a cause of low server efficiency, because the code they introduce not only processes no business logic but also needs to access kernel resources. A server should therefore avoid locks whenever it has a better alternative, and when it must lock, it should keep the lock's granularity small, for example by using a read/write lock. As long as all worker threads only read a block of shared memory, a read/write lock adds no extra overhead; the system serializes access only when one of the worker threads needs to write to that memory.
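A sketch of the read/write lock example: any number of workers may read concurrently, and the lock excludes them only while a writer holds it. The variable names are illustrative.

    /* Read/write lock (sketch): concurrent readers, exclusive writer. */
    #include <pthread.h>

    static pthread_rwlock_t rw = PTHREAD_RWLOCK_INITIALIZER;
    static int shared_value;             /* stands in for the shared memory */

    int read_shared(void)                /* many workers may run this at once */
    {
        pthread_rwlock_rdlock(&rw);
        int v = shared_value;
        pthread_rwlock_unlock(&rw);
        return v;
    }

    void write_shared(int v)             /* exclusive: briefly blocks readers */
    {
        pthread_rwlock_wrlock(&rw);
        shared_value = v;
        pthread_rwlock_unlock(&rw);
    }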
