Netty Series: The Netty Threading Model


1. Background

1.1. Evolution of the Java Threading Model

1.1.1. Single Thread

Looking back more than 10 years, when mainstream CPUs were still single-core (commercial high-performance minicomputers aside), CPU clock frequency was one of the most important performance indicators of a machine.

In the Java world, single-threaded programming was popular; for CPU-intensive applications, frequent multi-threaded coordination and preemptive time-slicing would actually degrade performance.

1.1.2. Multithreading

With the increase in hardware performance, CPUs have gained more and more cores; many servers now ship with 32 or 64 cores as standard. Through multi-threaded concurrent programming, the processing power of multi-core CPUs can be fully utilized to improve a system's processing efficiency and concurrency.

Since around 2005, with the gradual popularization of multi-core processors, multi-threaded concurrent programming in Java has become common. At the time the mainstream commercial JDK version was 1.4, and users created threads directly via new Thread().

Because JDK 1.4 did not provide thread-management containers such as thread pools, the synchronization, coordination, creation, and destruction of threads all had to be implemented by the user. Since creating and destroying a thread is a relatively heavyweight operation, this raw style of multi-threaded programming was neither efficient nor performant.

1.1.3. Thread pool

To improve the efficiency and performance of Java multi-threaded programming and reduce development difficulty, JDK 1.5 introduced the java.util.concurrent package. This concurrency library provides thread pools, thread-safe containers, atomic classes, and other utilities, which greatly improved the efficiency of Java multi-threaded programming and lowered its difficulty.

Starting with JDK 1.5, concurrent programming based on thread pools has been the mainstream of Java multi-core programming.

1.2. Reactor model

Whether written in C++ or Java, most network frameworks are designed and developed around the Reactor pattern. The Reactor pattern is event-driven and is especially well suited to handling massive numbers of I/O events.

1.2.1. Single-Threaded Model

In the Reactor single-threaded model, all I/O operations are performed on the same NIO thread. That thread's responsibilities are as follows:

1) As an NIO server, receive TCP connections from clients;

2) As an NIO client, initiate TCP connections to the server;

3) Read request or response messages from the communication peer;

4) Send request or response messages to the communication peer.

The reactor single-threaded model is as follows:

Figure 1-1 Reactor single-threaded model

Since the Reactor pattern uses asynchronous non-blocking I/O, no I/O operation blocks, so in theory one thread can independently handle all I/O-related operations. At the architectural level, a single NIO thread can do the whole job. For example, an Acceptor receives TCP connection requests from clients; once a link is established, the corresponding ByteBuffer is dispatched to the designated Handler for message decoding. User threads can likewise encode messages and send them to clients through the same NIO thread.
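To make the single-threaded Reactor concrete, here is a JDK-only sketch (plain java.nio, not Netty code; the class and method names are mine). One thread with one Selector plays acceptor, connector, reader, and writer, echoing a single message:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

public class SingleThreadReactor {
    // One thread + one Selector acts as acceptor, connector, reader and writer.
    public static String echoOnce(String msg) throws IOException {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open();
             SocketChannel client = SocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0)); // ephemeral port
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);  // acceptor role

            client.configureBlocking(false);
            if (client.connect(server.getLocalAddress())) {     // connector role
                client.write(ByteBuffer.wrap(msg.getBytes(StandardCharsets.UTF_8)));
                client.register(selector, SelectionKey.OP_READ);
            } else {
                client.register(selector, SelectionKey.OP_CONNECT);
            }

            String reply = null;
            while (reply == null) {
                selector.select(1000);                          // the single event loop
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (key.isAcceptable()) {                   // server side: accept new link
                        SocketChannel peer = server.accept();
                        peer.configureBlocking(false);
                        peer.register(selector, SelectionKey.OP_READ);
                    } else if (key.isConnectable()) {           // client side: finish connect, send
                        client.finishConnect();
                        client.write(ByteBuffer.wrap(msg.getBytes(StandardCharsets.UTF_8)));
                        key.interestOps(SelectionKey.OP_READ);
                    } else if (key.isReadable()) {              // read, then echo or return
                        SocketChannel ch = (SocketChannel) key.channel();
                        ByteBuffer buf = ByteBuffer.allocate(256);
                        if (ch.read(buf) <= 0) continue;
                        buf.flip();
                        if (ch == client) {                     // echo came back to the client
                            reply = StandardCharsets.UTF_8.decode(buf).toString();
                        } else {
                            ch.write(buf);                      // server echoes the bytes back
                        }
                    }
                }
            }
            return reply;
        }
    }
}
```

Because everything runs on one thread, no synchronization is needed; the same property is what becomes the bottleneck under load.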

For some small-capacity scenarios the single-threaded model is fine, but it is unsuitable for high-load, high-concurrency scenarios, for the following main reasons:

1) One NIO thread handling hundreds of links at the same time cannot keep up: even if the NIO thread's CPU load reaches 100%, it cannot encode, decode, read, and send a massive volume of messages;

2) When the NIO thread is overloaded, its processing slows down, causing large numbers of client connections to time out. Timeouts tend to trigger retransmissions, which further increase the NIO thread's load, eventually leading to massive message backlogs and processing timeouts; the thread becomes the system's performance bottleneck;

3) Reliability: once the NIO thread unexpectedly dies or enters an infinite loop, the entire communication module becomes unavailable and cannot receive or process external messages, resulting in node failure.

To solve these problems, the Reactor multithreaded model evolved. Let's look at it next.

1.2.2. Multithreaded Model

The biggest difference between the Reactor multithreaded model and the single-threaded model is that a pool of NIO threads handles the I/O operations. Its schematic diagram is as follows:

Figure 1-2 Reactor multithreaded model

Features of the reactor multithreaded model:

1) A dedicated NIO thread, the acceptor thread, listens on the server port and receives TCP connection requests from clients;

2) Network I/O operations (read, write, and so on) are the responsibility of an NIO thread pool, which can be implemented with a standard JDK thread pool consisting of a task queue and N available threads; these threads read, decode, encode, and send messages;

3) One NIO thread can handle N links simultaneously, but each link is bound to exactly one NIO thread, which prevents concurrent-modification problems.

In most scenarios, the Reactor multithreaded model meets performance requirements. But in a few special scenarios, a single NIO thread responsible for listening for and handling all client connections can itself become a problem. For example, there may be millions of client connections, or the server may need to authenticate client handshakes securely, and the authentication itself is computationally expensive. In such scenarios a single acceptor thread can run into performance limits. To solve this, a third model was produced: the master-slave Reactor multithreaded model.

1.2.3. Master-Slave Multithreaded Model

The main feature of the master-slave Reactor threading model is that the server no longer uses a single NIO thread to receive client connections, but a dedicated NIO thread pool. After the acceptor finishes handling a client's TCP connection request (possibly including access authentication), the newly created SocketChannel is registered with an I/O thread in the sub-Reactor thread pool, which takes over reading, writing, and decoding for that SocketChannel. The acceptor thread pool is used only for client login, handshake, and security authentication; once the link is established, it is registered with an I/O thread of the backend sub-Reactor pool, which performs all subsequent I/O operations.

Its threading model is as follows:

Figure 1-3 Multithreading model of master-slave reactor

The master-slave NIO threading model solves the problem that a single server-side listener thread cannot effectively handle all client connections.

Its work flow is summarized as follows:

    1. A Reactor thread is selected from the main thread pool as the acceptor thread; it binds the listening port and receives client connections;
    2. The acceptor thread receives a client connection request, creates a new SocketChannel, and registers it with another Reactor thread in the main thread pool, which is responsible for access authentication, IP blacklist/whitelist filtering, handshake, and similar operations;
    3. Once step 2 completes, the business-layer link is formally established. The SocketChannel is deregistered from the multiplexer of the main-pool Reactor thread and re-registered with a thread in the sub thread pool, which handles I/O read and write operations.
2. Netty Threading Model

2.1. Netty Threading Model Classification

In fact, Netty's threading model is similar to the three Reactor threading models described in section 1.2. Below, we introduce Netty's threading model through the server-side and client-side thread-handling flows.

2.1.1. Server Threading Model

A popular practice is to separate the server-side listener thread from the I/O threads, similar to the Reactor multithreaded model. Its workflow is as follows:

Figure 2-1 Netty Service Thread workflow

Below, we walk through the server-side thread creation workflow in combination with Netty's source code:

The first step is to create the server from a user thread; sample code is as follows:

Figure 2-2 User thread creation service-side code example

Typically the server is created when the user process starts, so it is generally the main function or a startup class, running on a business thread, that creates it. During server creation, two EventLoopGroup instances are instantiated. An EventLoopGroup is actually a group of EventLoop threads, responsible for managing the allocation and release of EventLoops.

The number of threads managed by an EventLoopGroup can be set via its constructor. If unset, it defaults to the value of the -Dio.netty.eventLoopThreads system property; if that is not specified either, it is the number of available CPU cores × 2.

The bossGroup thread group is actually a pool of acceptor threads that handles clients' TCP connection requests. If the system has only one server port to listen on, it is recommended to set the bossGroup thread count to 1.

workerGroup is the thread group actually responsible for I/O reads and writes; it is bound to subsequently created Channels via ServerBootstrap's group() method.
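Putting the pieces above together, a minimal server-bootstrap sketch written against the Netty 4.x API (the TimeServer class name, port 8080, and MyBusinessHandler are placeholders of mine, not taken from the article's original figures):

```java
import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.ChannelFuture;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;

public class TimeServer {
    public static void main(String[] args) throws Exception {
        EventLoopGroup bossGroup = new NioEventLoopGroup(1);  // acceptor pool: 1 thread per listen port
        EventLoopGroup workerGroup = new NioEventLoopGroup(); // I/O pool: defaults to available cores * 2
        try {
            ServerBootstrap b = new ServerBootstrap();
            b.group(bossGroup, workerGroup)                   // bind both groups to the bootstrap
             .channel(NioServerSocketChannel.class)
             .childHandler(new ChannelInitializer<SocketChannel>() {
                 @Override
                 protected void initChannel(SocketChannel ch) {
                     ch.pipeline().addLast(new MyBusinessHandler()); // hypothetical handler
                 }
             });
            ChannelFuture f = b.bind(8080).sync();            // acceptor thread binds the port
            f.channel().closeFuture().sync();
        } finally {
            bossGroup.shutdownGracefully();
            workerGroup.shutdownGracefully();
        }
    }
}
```

This is a sketch, not a quote from the source's screenshots; the Netty version used by the original article may differ slightly in API details.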

In the second step, the acceptor thread binds the listening port and starts the NIO server; the relevant code is as follows:

Figure 2-3 Selecting an acceptor thread from bossGroup to listen on the server port

Here group() returns the bossGroup, and its next() method obtains an available thread from the thread group; the code is as follows:

Figure 2-4 Selecting the acceptor thread

After the server-side Channel is created, it is registered with the multiplexer (Selector) to receive TCP connections from clients; the core code is as follows:

Figure 2-5 Registering the ServerSocketChannel with the Selector

In the third step, when a client connection arrives, the client SocketChannel is created and registered with an I/O thread in workerGroup. First, look at how the acceptor thread handles client access:

Figure 2-6 Handling Read or connection events

The unsafe read() method is invoked; for a NioServerSocketChannel this is NioMessageUnsafe's read() method, implemented as follows:

Figure 2-7 NioServerSocketChannel's read() method

Eventually NioServerSocketChannel's doReadMessages() method is called; the code is as follows:

Figure 2-8 Creating the client-connection SocketChannel

Here childEventLoopGroup is the workerGroup described earlier; an I/O thread is selected from it to be responsible for reading and writing network messages.

In the fourth step, after the I/O thread is selected, the SocketChannel is registered with that thread's multiplexer, listening for read operations.

Figure 2-9 Listening for network read events

The fifth step is to handle network I/O read and write events; the core code is as follows:

Figure 2-10 Handling read-write events

2.1.2. Client Threading Model

The client's threading model is simpler than the server's; its workflow is as follows:

Figure 2-11 Netty Client Threading Model

In the first step, the client connection is initiated by the user thread; sample code is as follows:

Figure 2-12 Netty Client Creation code example

Notice that, compared to the server side, the client creates only one EventLoopGroup: it needs no dedicated thread to listen for connections, nor a dedicated thread to connect to the server. Netty is an asynchronous event-driven NIO framework; connect and all other I/O operations are asynchronous, so a separate connection thread is unnecessary. The relevant code is as follows:

Figure 2-13 Binding the client connection thread

Here group() returns the EventLoopGroup passed in earlier; an available I/O thread (EventLoop) is obtained from it and set as a parameter on the newly created NioSocketChannel.
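For comparison with the server side, a minimal client-bootstrap sketch against the Netty 4.x API (TimeClient, the address, and MyClientHandler are placeholders of mine):

```java
import io.netty.bootstrap.Bootstrap;
import io.netty.channel.ChannelFuture;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.ChannelOption;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioSocketChannel;

public class TimeClient {
    public static void main(String[] args) throws Exception {
        EventLoopGroup group = new NioEventLoopGroup();       // one group is enough on the client
        try {
            Bootstrap b = new Bootstrap();
            b.group(group)
             .channel(NioSocketChannel.class)
             .option(ChannelOption.TCP_NODELAY, true)
             .handler(new ChannelInitializer<SocketChannel>() {
                 @Override
                 protected void initChannel(SocketChannel ch) {
                     ch.pipeline().addLast(new MyClientHandler()); // hypothetical handler
                 }
             });
            ChannelFuture f = b.connect("127.0.0.1", 8080).sync(); // connect is async; sync() only waits
            f.channel().closeFuture().sync();
        } finally {
            group.shutdownGracefully();
        }
    }
}
```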

In the second step, the connect operation is initiated and the connection result is checked; the code is as follows:

Figure 2-14 Connection operation

The connection result is checked: if the connection has not completed yet, interest in SelectionKey.OP_CONNECT is registered; if it succeeded immediately, pipeline().fireChannelActive() is called and the interest ops are changed to read.

In the third step, the NioEventLoop's multiplexer polls the result of the connect operation; the code is as follows:

Figure 2-15 Selector initiating a polling operation

The connection result is checked; if the connection has succeeded, the interest ops are reset to read:

Figure 2-16 Determining the result of the connection operation

Figure 2-17 Setting the operation bit to read

In the fourth step, the NioEventLoop thread handles I/O reads and writes, exactly as on the server side.

To summarize, the client-side threading model works as follows:

    1. The user thread initializes client resources and initiates the connect operation;
    2. If the connect succeeds immediately, the SocketChannel is registered with a NioEventLoop thread in the I/O thread group, listening on the read operation bit;
    3. If the connect does not succeed immediately, the SocketChannel is registered with a NioEventLoop thread in the I/O thread group, listening on the connect operation bit;
    4. Once the connection succeeds, the interest ops are changed to read; no thread switch is needed.
2.2. The Reactor Thread NioEventLoop

2.2.1. NioEventLoop Introduction

NioEventLoop is Netty's Reactor thread. Its responsibilities are as follows:

    1. As the server-side acceptor thread, it handles client access requests;
    2. As the client-side connector thread, it registers interest in the connect operation bit, used to judge the result of asynchronous connects;
    3. As an I/O thread, it listens on the read operation bit and reads messages from the SocketChannel;
    4. As an I/O thread, it writes messages to the SocketChannel; if only half a packet is written, it automatically registers a write-listen event so the remaining half-packet data can be sent later, until the send completes;
    5. As a scheduled-task thread, it can run timed tasks such as link idle detection and heartbeat sending;
    6. As a thread executor, it can run ordinary tasks (Runnable).

The server and client threading model sections already described in detail how NioEventLoop handles network I/O events; here is a quick look at how it handles scheduled tasks and ordinary Runnables.

First, NioEventLoop inherits from SingleThreadEventExecutor, which means it is effectively a thread pool containing exactly one thread. The class inheritance relationship is as follows:

Figure 2-18 Nioeventloop Inheritance Relationship

Figure 2-19 thread pool and task queue definitions

For the user, a custom task can be executed by directly invoking NioEventLoop's execute(Runnable task) method, implemented as follows:

Figure 2-20 Executing a user-defined task

Figure 2-21 NioEventLoop implements ScheduledExecutorService

By calling the schedule() family of methods on SingleThreadEventExecutor, Netty-internal or user-defined scheduled tasks can be executed in the NioEventLoop; the interface is defined as follows:

Figure 2-22 Timing Task execution Interface definition for Nioeventloop
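In application code, these entry points are usually reached as channel.eventLoop().execute(task) and channel.eventLoop().schedule(task, delay, unit). As a JDK-only analogy of "a one-thread pool that also schedules" (the names are mine, and this is an analogy, not Netty's implementation):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class EventLoopAnalogy {
    // NioEventLoop behaves like a single-thread ScheduledExecutorService:
    // execute() for ordinary Runnables, schedule() for timed tasks, all on one thread.
    public static String runBoth() throws Exception {
        ScheduledExecutorService loop = Executors.newSingleThreadScheduledExecutor();
        StringBuilder log = new StringBuilder();          // only touched by the loop thread
        CountDownLatch done = new CountDownLatch(2);
        loop.execute(() -> { log.append("task;"); done.countDown(); });      // ordinary task
        loop.schedule(() -> { log.append("timed;"); done.countDown(); },     // scheduled task
                      50, TimeUnit.MILLISECONDS);
        done.await(5, TimeUnit.SECONDS);
        loop.shutdown();
        return log.toString();                            // "task;timed;"
    }
}
```

The ordinary task always runs first because the scheduled task is not eligible until its delay elapses; both run serially on the single loop thread.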

2.3. NioEventLoop Design Principles

2.3.1. Serialized Design Avoids Thread Contention

We know that frequent thread context switches cost extra performance at run time. Moreover, when a business flow executes concurrently on multiple threads, business developers must stay constantly vigilant about thread safety: which data can be modified concurrently, and how should it be protected? This lowers development efficiency and adds performance overhead.

Serial Execution of the Handler Chain

To solve these problems, Netty adopts a serialized design: from message reading and decoding through subsequent Handler execution, everything is always done by the I/O thread NioEventLoop. The whole pipeline never switches thread context, so the data never faces the risk of concurrent modification. For the user this is genuinely convenient; you do not even need to know Netty's threading details. The workflow is as follows:

Figure 2-23 Nioeventloop serial Execution Channelhandler
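The guarantee can be demonstrated with a JDK-only sketch (names mine): all pipeline work for one channel is confined to a single thread, so per-channel state needs no locking even when events originate from many threads:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SerializedChannel {
    // All pipeline work for one "channel" is confined to a single thread,
    // so the mutable list below needs no locks - the idea behind Netty's
    // serialized ChannelHandler execution.
    private final ExecutorService ioThread = Executors.newSingleThreadExecutor();
    private final List<String> pipelineLog = new ArrayList<>(); // deliberately unsynchronized

    public void fireEvent(String event) {
        ioThread.execute(() -> {               // decode then handle, all on the I/O thread
            pipelineLog.add("decoded:" + event);
            pipelineLog.add("handled:" + event);
        });
    }

    public int drainAndCount() throws InterruptedException {
        ioThread.shutdown();
        ioThread.awaitTermination(5, TimeUnit.SECONDS);
        return pipelineLog.size();             // safe: the loop thread has terminated
    }

    public static int demo() throws InterruptedException {
        SerializedChannel ch = new SerializedChannel();
        ExecutorService business = Executors.newFixedThreadPool(4); // many business threads
        for (int i = 0; i < 4; i++) {
            final int n = i;
            business.execute(() -> ch.fireEvent("msg" + n));
        }
        business.shutdown();
        business.awaitTermination(5, TimeUnit.SECONDS);
        return ch.drainAndCount();             // 4 events * 2 log entries, none lost
    }
}
```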

Each NioEventLoop aggregates one multiplexer (Selector), so it can handle hundreds or thousands of client connections. Netty's strategy is: whenever a new client connects, an available NioEventLoop is obtained from the NioEventLoop thread group in round-robin fashion, wrapping back to index 0 when the upper bound of the array is reached. This essentially guarantees load balance across the NioEventLoops. A client connection is registered with only one NioEventLoop, which avoids concurrent operation by multiple I/O threads.
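The wrap-around selection described above amounts to a round-robin index over the EventLoop array. A sketch (Netty's real EventExecutorChooser also has a power-of-two fast path, so treat this as an approximation; names are mine):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RoundRobinChooser {
    // Returns elements in round-robin order, wrapping back to index 0 at the
    // end of the array - roughly how Netty spreads new channels across its
    // NioEventLoop array.
    private final String[] loops;              // stand-ins for NioEventLoop instances
    private final AtomicInteger idx = new AtomicInteger();

    public RoundRobinChooser(String... loops) { this.loops = loops; }

    public String next() {
        // Math.abs guards against the counter overflowing into negatives.
        return loops[Math.abs(idx.getAndIncrement() % loops.length)];
    }
}
```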

Through this serialized design, Netty lowers the user's development difficulty and improves processing performance. Using a thread group allows multiple serialized threads to run in parallel without any intersection between them, which exploits multi-core parallelism while avoiding the extra cost of thread context switching and concurrency protection.

2.3.2. Scheduled Tasks and the Timing-Wheel Algorithm

Many Netty features rely on scheduled tasks; two typical ones are:

    1. Client connection timeout control;
    2. Link idle detection.

A more conventional design would be to aggregate the JDK's scheduled-task thread pool ScheduledExecutorService inside NioEventLoop to run timed tasks. Purely from a performance point of view, this is not optimal, for the following three reasons:

    1. Aggregating a separate scheduled-task thread pool inside the I/O thread introduces thread context switching during processing, which breaks Netty's serialized design;
    2. It introduces multi-threaded concurrency problems, because a scheduled task and the I/O thread NioEventLoop may access and modify the same data simultaneously;
    3. The JDK's ScheduledExecutorService still has room for performance optimization.

These problems were first encountered in operating systems and protocol stacks. For example, TCP's reliable transmission relies on a timeout-retransmission mechanism, so every packet sent requires a timer to schedule a timeout event. Such timeouts can be massive in number, and creating one timer per timeout is unreasonable in terms of both performance and resource consumption.

In their paper "Hashed and Hierarchical Timing Wheels: Data Structures to Efficiently Implement a Timer Facility", George Varghese and Tony Lauck proposed the timing wheel for managing and maintaining large numbers of timers. Netty's scheduled-task handling is based on timing-wheel-style scheduling; below we look at Netty's implementation.

A timing wheel is a data structure whose main body is a circular list; each entry in the list is a structure called a slot. Its schematic diagram is as follows:

Figure 2-24 How the time wheel works

The working principle of a timing wheel is analogous to a clock: a pointer moves in one direction at a fixed frequency, and each movement is called a tick. A timing wheel is thus defined by three important attributes: ticksPerWheel (ticks per revolution), tickDuration (the duration of one tick), and timeUnit (the time unit). For example, with ticksPerWheel = 60, tickDuration = 1, and timeUnit = seconds, the wheel behaves exactly like a clock's second hand.
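A deliberately small, manually ticked timing wheel helps make the slot arithmetic concrete (this is an illustrative sketch, far simpler than Netty's HashedWheelTimer; all names are mine):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class MiniTimingWheel {
    // A task due in d ticks (d >= 1) goes into slot (cursor + d) % ticksPerWheel
    // and must survive (d - 1) / ticksPerWheel extra revolutions before it fires.
    private static final class Timeout {
        final Runnable task;
        long remainingRounds;
        Timeout(Runnable task, long rounds) { this.task = task; this.remainingRounds = rounds; }
    }

    private final List<List<Timeout>> slots = new ArrayList<>();
    private int cursor = 0;

    public MiniTimingWheel(int ticksPerWheel) {
        for (int i = 0; i < ticksPerWheel; i++) slots.add(new ArrayList<>());
    }

    public void schedule(Runnable task, int ticksFromNow) {     // ticksFromNow >= 1
        int slot = (cursor + ticksFromNow) % slots.size();
        slots.get(slot).add(new Timeout(task, (ticksFromNow - 1) / slots.size()));
    }

    // Called once per tickDuration by the (single) timer thread.
    public void tick() {
        cursor = (cursor + 1) % slots.size();
        Iterator<Timeout> it = slots.get(cursor).iterator();
        while (it.hasNext()) {
            Timeout t = it.next();
            if (t.remainingRounds-- == 0) {                     // due this revolution: fire
                it.remove();
                t.task.run();
            }
        }
    }
}
```

A real implementation drives tick() from a worker thread at tickDuration intervals and must handle concurrency; this sketch isolates only the wheel arithmetic.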

Now for a detailed analysis of Netty's implementation. Driving the timing mechanism is the responsibility of the NioEventLoop: it first checks the task queue for scheduled and ordinary tasks, and if any exist, executes them proportionally in a loop; the code is as follows:

Figure 2-25 Performing a task queue

If there are no tasks to execute, the Selector's select() method is called, waiting at most the delay of the earliest task in the scheduled-task queue; the code is as follows:

Figure 2-26 Calculation delay

The task with the smallest delay is taken from the scheduled-task queue and its timeout computed; the code is as follows:

Figure 2-27 Getting the time-out from the scheduled task queue

Scheduled-task execution: after each tick cycle, the scheduled-task list is scanned, and expired scheduled tasks are moved to the ordinary task queue to await execution; the relevant code is as follows:

Figure 2-28 Timed task for detecting timeouts

Once detection and copying are complete, the expired scheduled tasks are executed; the code is as follows:

Figure 2-29 Performing a timed task

To ensure that executing scheduled tasks does not excessively crowd out I/O events, Netty lets the user set an I/O execution ratio: the proportion of each loop iteration allocated to I/O. This prevents I/O processing timeouts or backlogs caused by running massive numbers of scheduled tasks.
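NioEventLoop exposes this knob as setIoRatio(int). The arithmetic can be sketched as follows; the exact formula is my reading of Netty 4's implementation, so treat it as an assumption:

```java
public class IoRatioBudget {
    // Sketch of the arithmetic behind NioEventLoop.setIoRatio(int): after
    // spending ioTimeNanos on I/O, the loop lets non-I/O tasks run for about
    // ioTimeNanos * (100 - ioRatio) / ioRatio nanoseconds (assumed formula,
    // based on Netty 4; the default ioRatio of 50 means a 1:1 split).
    public static long taskBudgetNanos(long ioTimeNanos, int ioRatio) {
        if (ioRatio <= 0 || ioRatio > 100) {
            throw new IllegalArgumentException("ioRatio must be in (0, 100]: " + ioRatio);
        }
        return ioTimeNanos * (100 - ioRatio) / ioRatio;
    }
}
```

For example, with ioRatio 90, spending 900 µs on I/O grants tasks only about 100 µs before control returns to the selector.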

Because obtaining the system's nanosecond time is itself a relatively expensive operation, Netty checks whether the maximum execution time has been reached only once every 64 scheduled tasks, and exits the loop if so. Unfinished tasks wait for the next Selector poll, which gives I/O events a chance to be handled; the code is as follows:

Figure 2-30 Execution time limit detection

2.3.3. Focus Rather Than Bloat

Netty is an asynchronous high-performance NIO framework, not a business runtime container, so it does not need to, and should not, provide business containers or business threads. The reasonable design is for Netty to be responsible only for providing and managing NIO threads, while business-layer threading models are integrated by users themselves; as long as the layering is clear, integration and extension by users becomes easier.

Unfortunately, in the Netty 3.x series, Netty provided an ExecutionHandler similar to Mina's asynchronous filter. It aggregates a JDK thread pool (java.util.concurrent.Executor) so that subsequent handlers execute asynchronously on behalf of the user.

ExecutionHandler was designed to address the risk that some user handlers have unpredictable execution times and could accidentally block or hang the I/O thread. Viewed purely as a requirement, this is reasonable; but it was not appropriate for Netty itself to provide the feature, for the following reasons:

1. It broke the serialized design that Netty adheres to: a thread switch and a new thread pool are introduced on the message receive-and-process path, violating Netty's own architectural principles; it is effectively an architectural compromise;

2. It creates potential thread-safety problems: if the asynchronous handler also operates on the user handlers in front of it, and those handlers are not thread-safe, the result is hidden and fatal thread-safety bugs;

3. It increases user development complexity: introducing ExecutionHandler breaks the ChannelPipeline's original serial execution model, forcing users to understand Netty's low-level implementation details and worry about thread safety, so the costs outweigh the gains.

For these reasons, later versions of Netty removed ExecutionHandler entirely and provided no similar functional classes, concentrating instead on the Netty I/O thread NioEventLoop. This was undoubtedly great progress: rather than providing user-facing business threading models, Netty began focusing on the I/O thread itself.

2.4. Netty Thread Development Best Practices

2.4.1. Handle Simple, Time-Bounded Operations Directly on the I/O Thread

If the business logic is very simple, executes quickly, does not interact with external network elements, does not access the database or disk, and does not wait on other resources, it is recommended to execute it directly in the business ChannelHandler without starting a business thread or thread pool. This avoids thread context switching and introduces no thread-concurrency problems.

2.4.2. Post Complex or Time-Uncertain Business to a Backend Business Thread Pool for Unified Processing

For such business, it is not recommended to start threads or thread pools directly in the business ChannelHandler. Instead, package the different operations uniformly as tasks and deliver them to a backend business thread pool for processing.

Too many business ChannelHandlers cause development-efficiency and maintainability problems. Do not treat Netty as a business container; for most complex business products you still need to integrate or develop your own business container and layer it cleanly on top of Netty.

2.4.3. Business Threads Should Avoid Operating on a ChannelHandler Directly

Because business logic usually runs multithreaded, both I/O threads and business threads may operate on a ChannelHandler, producing multi-threaded access to it. To avoid concurrency problems, it is recommended to follow Netty's own approach: encapsulate the operation as a task and let the NioEventLoop execute it, rather than having the business thread operate on the handler directly, as shown in the following code:

Figure 2-31 Encapsulating a task to prevent multithreading concurrent operations
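A JDK-only sketch of that pattern (names mine): run the operation directly when already on the loop thread, otherwise hand it over as a task, mirroring Netty's eventLoop().inEventLoop() check:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class InEventLoopGuard {
    // Mirrors the channel.eventLoop() pattern: execute directly when already
    // on the loop thread, otherwise hand the work over as a task.
    private final ExecutorService loop = Executors.newSingleThreadExecutor();
    private final AtomicReference<Thread> loopThread = new AtomicReference<>();

    public InEventLoopGuard() {
        loop.execute(() -> loopThread.set(Thread.currentThread())); // capture the loop thread
    }

    public boolean inEventLoop() { return Thread.currentThread() == loopThread.get(); }

    public void safeExecute(Runnable op) {
        if (inEventLoop()) op.run();        // already on the I/O thread: run directly
        else loop.execute(op);              // business thread: encapsulate as a task
    }

    public void shutdown() throws InterruptedException {
        loop.shutdown();
        loop.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

In Netty code the equivalent check is `if (channel.eventLoop().inEventLoop()) { ... } else { channel.eventLoop().execute(...); }`.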

If you can confirm that the data being accessed or the operation itself is concurrency-safe, the extra encapsulation is unnecessary; handle this flexibly according to the specific business scenario.

3. Summary

Although Netty's threading model is not complex, using it to develop high-performance, high-concurrency business products is still a challenging task. Only by fully understanding Netty's threading model and design principles can we develop high-quality products.

4. Netty Study Recommended Books

There are many articles about Netty available. For readers who want to study Netty systematically, two books are recommended:

1) "Netty in Action"; reading the English original is recommended.

2) "Netty Authoritative Guide"; studying through a mix of theory and practice is suggested.

5. Introduction of the author

Li Linfeng graduated from Northeastern University (China) in 2007 and joined Huawei in 2008, where he works on the design and development of high-performance communication software. He has 6 years of NIO design and development experience, is proficient in NIO frameworks such as Netty and Mina, and is the founder of the Netty Chinese community and a promoter of the Netty framework.

Sina Weibo: nettying; Netty learning QQ group: 195820454

