Client NIO Practice Analysis


Introduction: NIO is widely used on the server side, but there is far less guidance on using it in clients. In my view, using NIO on the client essentially adds an event-driven framework on top of the traditional persistent-connection model. Compared with a short-connection pool, is its performance really superior in every environment? Actually, no.

Recently the TB cache client needed optimization. The original code was written with NIO, but it was inefficient, and its mediocre performance dragged down the server. During the optimization I also read up on NIO.2, notable for the AIO it brings to JDK 7, and optimized and tested repeatedly. Here I will share what we learned from applying NIO on the client.

Differences between traditional I/O and NIO

To put it simply: 1. For data handling, stream-based processing is replaced by block (buffer)-based processing. 2. An event-driven mechanism replaces the traditional one-thread-per-connection model.
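As a minimal illustration of the first difference (the class and method names are mine, not from the original code): traditional I/O pulls bytes from a blocking stream, while NIO fills a whole block, a ByteBuffer, through a channel.

import java.io.InputStream;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;

public class StreamVsBlock {
    // Traditional I/O: bytes are pulled one read at a time from a blocking stream.
    static int readWithStream(Socket socket, byte[] dst) throws Exception {
        InputStream in = socket.getInputStream();
        return in.read(dst);            // blocks until at least one byte arrives
    }

    // NIO: the channel fills a whole block (ByteBuffer) in one operation,
    // which maps more directly onto how the OS moves data.
    static int readWithChannel(SocketChannel channel, ByteBuffer dst) throws Exception {
        return channel.read(dst);       // non-blocking if the channel is configured so
    }
}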

The first change does not suit scenarios that must process a byte stream incrementally. (Bytes sometimes need to be examined as they arrive: in another optimization I used lazy parsing of the byte stream to validate requests early and filter out invalid ones, which avoided the cost of fully parsing large, invalid packets.) However, block-based transfer and processing matches how the operating system really works, letting Java exploit each OS's native I/O optimizations; likewise, the pipeline idea mirrors the OS's actual implementation (Java splits the original bidirectional channel into in and out sides). Event-driven processing splits the complete flow into pipelined jobs, maximizing resource usage and preventing back-end processing from becoming a bottleneck that throttles front-end requests and lowers server throughput; it also lets developers optimize each stage and shorten the critical path.

The following table roughly lists the requirements each approach imposes on a client and its respective advantages (requirements common to both, such as fault tolerance and recovery, are omitted):

Traditional I/O (connection pool)
Requirements: 1. A connection pool must be managed. 2. Under high concurrency the number of sockets becomes huge, consuming many file handles.
Advantages: 1. Data sending and receiving is simple and single-threaded. 2. Byte streams can be parsed incrementally, avoiding unnecessary memory consumption.

NIO
Requirements: 1. The interaction protocol must support sessions (it can work without them, but that degrades the processing model and reduces efficiency, as discussed later). 2. Sending and receiving must be multithreaded to gain efficiency. 3. Familiarity with Channel and block (buffer) handling is required.
Advantages: 1. Resource utilization and performance are maximized (full use of the message channel and of OS-level I/O optimizations). 2. Processing can be split fully and flexibly into multiple work items and pipelined, reducing the impact of business processing on the server's request-receiving throughput.

NIO on the Client

Consider the data interaction between a client and a server implemented with NIO. To make maximal use of the message channel, the NIO client must provide the following:
1. Message session support.
2. Multithreaded access control.
3. Message filtering and error tolerance.
4. Timeout control.
5. ...
Message session support refers to the communication protocol used for sending and receiving messages. NIO is easy to adopt on the server because the server maintains the processing session for each request. On the client, however, different threads share the same NIO channel, and responses do not necessarily come back in the order the requests were sent. A session code must therefore be embedded in the protocol so that when a result is returned and parsed, the right recipient can be notified. A sketch of such a frame layout follows.
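As a hedged sketch of what that session code might look like on the wire, assuming a hypothetical [length][sessionId][payload] frame layout (the class and helper names are illustrative):

import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical frame layout: [length][sessionId][payload]. The sessionId lets
// the receiver match an out-of-order response to the thread that sent the request.
public class Frames {
    private static final AtomicLong SESSION = new AtomicLong();

    public static ByteBuffer encode(byte[] payload) {
        long sessionId = SESSION.incrementAndGet();
        ByteBuffer frame = ByteBuffer.allocate(4 + 8 + payload.length);
        frame.putInt(8 + payload.length);   // body length
        frame.putLong(sessionId);           // session code embedded in the protocol
        frame.put(payload);
        frame.flip();
        return frame;
    }

    public static long decodeSessionId(ByteBuffer frame) {
        frame.getInt();                     // skip the length field
        return frame.getLong();             // route the response by this ID
    }
}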

Multithreaded access control. Messages can be sent and received by a single thread, but under high concurrency that becomes a bottleneck (described in detail in the optimization process below). If multiple threads are allowed to send and receive, both the send queue and the receive cache need concurrent access control, and when data must be split, small transactions are needed to keep it consistent. One common arrangement is sketched below.
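One plausible arrangement, sketched under the assumption that business threads only enqueue while a single I/O thread writes (all names are illustrative):

import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch: many threads enqueue frames, but only the I/O thread
// drains the queue and writes, so the shared channel needs no per-write lock.
public class SendQueue {
    private final Queue<ByteBuffer> queue = new ConcurrentLinkedQueue<ByteBuffer>();

    public void enqueue(ByteBuffer frame) {
        queue.offer(frame);              // safe from any business thread
    }

    // Invoked by the single I/O thread when the channel is writable.
    public void drainTo(SocketChannel channel) throws Exception {
        ByteBuffer frame;
        while ((frame = queue.peek()) != null) {
            channel.write(frame);
            if (frame.hasRemaining()) {
                return;                  // kernel buffer full; wait for the next write event
            }
            queue.poll();                // frame fully written, drop it
        }
    }
}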

Message filtering and error tolerance. Because one data channel is shared, it is important to filter out bad data: if a returned packet is malformed, a parsing exception or a deadlock could otherwise affect every thread waiting on the channel. Filtering guarantees that multithreaded processing stays isolated. A simple pre-parse check is sketched below.
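A minimal pre-parse sanity check under the frame layout assumed earlier; MAX_BODY is an invented protocol limit:

import java.nio.ByteBuffer;

// Hypothetical sanity check run before a frame is parsed. A corrupt frame is
// skipped (and its waiting request failed) instead of throwing inside the
// reader thread and stalling every other caller sharing the channel.
public class FrameFilter {
    private static final int MAX_BODY = 4 * 1024 * 1024;  // assumed protocol limit

    public static boolean looksValid(ByteBuffer frame) {
        if (frame.remaining() < 12) {
            return false;                              // too short for length + sessionId
        }
        int length = frame.getInt(frame.position());   // absolute read, no side effect
        return length > 0 && length <= MAX_BODY;
    }
}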

Timeout control. Because the shared data channel receives responses asynchronously, the queue of pending requests consumes resources under high concurrency. Timed-out requests must be removed from the queue so that a network or server failure cannot let pending work pile up without bound. One way to do this is sketched below.
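One way to implement the eviction, as a hypothetical sketch (the holder class and the one-second scan interval are assumptions):

import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical reaper: pending requests carry a deadline, and a scheduled task
// removes expired ones so a dead server cannot pile up waiters forever.
public class TimeoutReaper {
    static class Pending {
        final long deadlineMillis;
        Pending(long timeoutMillis) {
            this.deadlineMillis = System.currentTimeMillis() + timeoutMillis;
        }
    }

    private final ConcurrentMap<Long, Pending> pending = new ConcurrentHashMap<Long, Pending>();
    private final ScheduledExecutorService reaper = Executors.newSingleThreadScheduledExecutor();

    public void start() {
        reaper.scheduleAtFixedRate(new Runnable() {
            public void run() {
                long now = System.currentTimeMillis();
                Iterator<Map.Entry<Long, Pending>> it = pending.entrySet().iterator();
                while (it.hasNext()) {
                    if (it.next().getValue().deadlineMillis < now) {
                        it.remove();   // the waiting caller is failed with a timeout elsewhere
                    }
                }
            }
        }, 1, 1, TimeUnit.SECONDS);
    }
}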

There are many other design details and highlights, which will not be listed here.

To sum up, efficiently coordinating multithreaded sending, receiving, and parsing of data is the central implementation concern for a NIO client that shares a data channel.

NIO Client Optimization Analysis
Before optimization, the client used the earliest, traditional NIO structure, shown roughly in the two figures below.

The first figure shows the conceptual model of NIO; the second shows, from a practical standpoint, where each role sits in the actual model. A detailed introduction to how each NIO role works is omitted here. The problem was this: under high concurrency, the threads in the thread pool waiting for responses kept accumulating while processing performance kept dropping, a vicious circle.

In the traditional mode, events are processed serially in a single thread. For example, if a read event precedes a write event and the read event times out or is slow, the other events cannot be processed, or are processed slowly.

Preliminary diagnosis: serialized event processing lets events affect one another.
Solution: process events in separate threads, decoupling event triggering from event handling so that different event types can be processed independently.

We can see that the dispatcher's calls to handlers are now separated from event listening, which looks like a good solution to the problem.

A new executor was added to the dispatcher (a fixed pool rather than a cached one, to keep a flood of threads from exhausting resources and crashing the application), and the channel proxy class held in the attachment was made thread-safe. A sketch of this first attempt follows.
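A hedged sketch of that first attempt, with an invented pool size: the dispatcher hands each ready event to a fixed thread pool instead of running the handler inline.

import java.nio.channels.SelectionKey;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch of the first attempt: the dispatcher no longer runs handlers inline;
// each ready event is handed to a fixed pool (fixed, not cached, so a burst of
// read/write events cannot spawn an unbounded number of threads).
public class ThreadedDispatcher {
    private final ExecutorService workers = Executors.newFixedThreadPool(16); // assumed size

    public void dispatch(final SelectionKey key, final Runnable handler) {
        key.interestOps(0);       // avoid re-selecting the key while it is being handled;
                                  // the handler re-registers interest when it finishes
        workers.execute(handler); // triggering and processing now run on different threads
    }
}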

The test results were a surprise: not only was there no improvement, but efficiency dropped under high concurrency. Closer observation showed why. Connect, read, and write events are all registered, and under high concurrency read/write events fire constantly, so a large number of threads are generated; with a fixed pool (a cached pool simply hits OOM), the cost of creating and scheduling threads far exceeded the sequential execution that had been assumed to be the bottleneck. On top of that, the thread-safety changes to the channel proxy (resource locks and other protections) left the final performance below that of the original architecture.

It seems a thread pool in the dispatcher is not the right way to separate the framework from the business logic, so consider another angle: if the internal processing of read and write events can be made light enough, even sequential processing will perform well. That led to the following design:

LightweightHandler is a lightweight message-processing handler: it splits off the original business-data processing so that even serial event handling no longer lets events interfere with one another.

Work splitting turned out to be the most important part of the optimization. The following points describe how the work was split (this is also the final strategy, arrived at iteratively through test results).

Sending uses the most common approach for persistent connections: the system keeps a send/receive buffer, but sending is not multithreaded, only batched (a threshold is set, and under high concurrency, when the buffer suddenly grows, a batch flush improves efficiency). Testing showed that multithreaded writing actually loses performance to thread creation; the write itself costs little and does not need that optimization. The read side, however, involves data parsing, copying, and object deserialization, so its work must be cut up to allow parallel processing and faster feedback. The read-and-parse flow is divided into three parts: the first two run on a single thread, and the last runs on a multithreaded pool.
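A hedged sketch of the threshold-based batch flush; the threshold value is invented, and a real implementation must also retain any frames the gathering write leaves partially sent:

import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;
import java.util.ArrayList;
import java.util.List;

// Hypothetical batch flush: writes are buffered and flushed once a size
// threshold is crossed, so a burst of small requests becomes one gathering write.
// The threshold itself has to be found by testing (see the experience list below).
public class BatchWriter {
    private static final int FLUSH_THRESHOLD = 16 * 1024;   // assumed value
    private final List<ByteBuffer> buffered = new ArrayList<ByteBuffer>();
    private int bufferedBytes = 0;

    public synchronized void write(SocketChannel channel, ByteBuffer frame) throws Exception {
        buffered.add(frame);
        bufferedBytes += frame.remaining();
        if (bufferedBytes >= FLUSH_THRESHOLD) {
            flush(channel);
        }
    }

    public synchronized void flush(SocketChannel channel) throws Exception {
        ByteBuffer[] frames = buffered.toArray(new ByteBuffer[0]);
        channel.write(frames);       // gathering write sends the whole batch at once;
                                     // partially written frames must be kept in practice
        buffered.clear();
        bufferedBytes = 0;
    }
}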

Step 1: the LightweightHandler reads the data from the read event and puts the packets into a queue. Multithreading is avoided here for two reasons: first, the cost of thread creation; second, when a packet is too large it must be received in several pieces, and multithreaded receiving would scramble the message order (packets are read according to the receive-window size, so there is no reordering information). Step 2: the received packets are cut into logical units according to the packet protocol rules (the various features of ByteBuffer are used here to avoid allocating new blocks and copying between them). This stays single-threaded because it cannot be pipelined: it is inherently a serial task, and the critical path cannot be shortened here. Step 3: the logically split packets are distributed to the thread pool for the most time-consuming work, parsing and callbacks. A sketch of the whole pipeline follows.
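A minimal sketch of the three-step pipeline, assuming an [int length][body] frame format (all names are illustrative). One detail worth noting: a frame handed across the thread boundary must own its bytes, because the reader keeps reusing its buffer:

import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;
import java.util.concurrent.ExecutorService;

// Sketch of the three-step split described above. Steps 1 and 2 stay on the
// single reader thread (packet order must be preserved and the work is cheap);
// step 3, the expensive parse and callback, runs in the thread pool.
public class ReadPipeline {
    private final ByteBuffer readBuffer = ByteBuffer.allocate(64 * 1024);
    private final ExecutorService parsers;   // multithreaded pool for step 3

    public ReadPipeline(ExecutorService parsers) {
        this.parsers = parsers;
    }

    // Step 1: drain the channel (single-threaded, keeps byte order intact).
    public void onReadable(SocketChannel channel) throws Exception {
        channel.read(readBuffer);
        readBuffer.flip();
        // Step 2: cut the stream into logical frames using buffer positions
        // rather than copies; format assumed to be [int length][body].
        while (readBuffer.remaining() >= 4) {
            readBuffer.mark();
            int length = readBuffer.getInt();
            if (readBuffer.remaining() < length) {
                readBuffer.reset();              // incomplete frame; wait for more bytes
                break;
            }
            ByteBuffer view = readBuffer.slice(); // zero-copy view of the body
            view.limit(length);
            // The reader reuses readBuffer, so a frame crossing the thread
            // boundary must own its bytes; this is the one unavoidable copy.
            final ByteBuffer frame = ByteBuffer.allocate(length).put(view);
            frame.flip();
            readBuffer.position(readBuffer.position() + length);
            // Step 3: expensive parse/deserialize/callback goes to the pool.
            parsers.execute(new Runnable() {
                public void run() {
                    parse(frame);
                }
            });
        }
        readBuffer.compact();                    // keep any partial frame for next read
    }

    void parse(ByteBuffer frame) {
        // deserialize and hand the result back to the waiting caller
    }
}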

Finally, the test results were good. The experience is summarized below:

1. Multithreading can itself become the bottleneck, and you must guard against runaway resource allocation under high concurrency.
2. Split tasks carefully, combining serial and parallel processing to find the shortest path.
3. In NIO processing, make full use of ByteBuffer and the other buffers to segment and isolate data logically, copying and allocating as little as possible, to improve processing speed. (Watch the buffers' relative operations to avoid exceptions caused by buffer reuse; see the sketch after this list.)
4. Batch processing needs testing to find reasonable thresholds, to avoid both batching delay and peak effects.
5. For multithreaded debugging, print runtime state wherever possible to locate the performance bottleneck. (A step-through debugger no longer helps; without runtime output you basically cannot see what the code is doing at full speed.)
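A small, self-contained illustration of point 3: slice() segments a buffer without copying, while relative reads move cursors that must be managed whenever a buffer is reused.

import java.nio.ByteBuffer;

// slice()/duplicate() segment a buffer logically without copying, but relative
// get/put operations move cursors, so a reused buffer must be repositioned carefully.
public class BufferSegmentation {
    public static void main(String[] args) {
        ByteBuffer base = ByteBuffer.allocate(8);
        base.putInt(42).putInt(7);
        base.flip();

        ByteBuffer first = base.slice();    // view over the same bytes, no copy
        first.limit(4);                     // restrict the view to the first int
        System.out.println(first.getInt()); // relative read on the view: prints 42

        base.position(4);                   // advance the owner past the consumed segment
        System.out.println(base.getInt());  // prints 7

        // Pitfall: reusing 'base' for the next network read without clear()
        // or compact() leaves stale position/limit values and causes
        // BufferUnderflowException or corrupted frames.
        base.clear();
    }
}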

NIO.2
Recently I have also been following the many new features in JDK 7. NIO.2 deserves mention, and AIO, the part most relevant to this article, especially so. NIO.2 formally introduces asynchronous I/O interfaces and implementations. For this asynchronous mode, the framework provides two ways to obtain a result:
1. Through a Future.
2. Through a callback (CompletionHandler).
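Both retrieval styles, shown against JDK 7's AsynchronousSocketChannel; the host and port here are placeholders for a hypothetical local server:

import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousSocketChannel;
import java.nio.channels.CompletionHandler;
import java.util.concurrent.Future;

// The two result-retrieval styles offered by JDK 7 AIO.
public class AioStyles {
    public static void main(String[] args) throws Exception {
        AsynchronousSocketChannel channel = AsynchronousSocketChannel.open();
        channel.connect(new InetSocketAddress("localhost", 9000)).get();

        // Style 1: Future -- the caller polls or blocks for the result.
        ByteBuffer buf = ByteBuffer.allocate(1024);
        Future<Integer> pending = channel.read(buf);
        int bytesRead = pending.get();          // blocks until the read completes
        System.out.println("future read: " + bytesRead);

        // Style 2: CompletionHandler -- the framework calls back on completion.
        buf.clear();
        channel.read(buf, null, new CompletionHandler<Integer, Void>() {
            public void completed(Integer result, Void attachment) {
                System.out.println("callback read: " + result);
            }
            public void failed(Throwable exc, Void attachment) {
                exc.printStackTrace();
            }
        });
        Thread.sleep(1000);                     // crude wait so the callback can fire
        channel.close();
    }
}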

The optimization framework described above also implements both mechanisms: the second is simply a handler registered in the key's attachment, and the first is implemented as follows.

It is implemented with the session code plus the object's wait/notify mechanism.
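A minimal sketch of that mechanism (the class is illustrative): the sender blocks in get() on the holder registered under its session code, and the reader thread calls set() when the matching response arrives.

// Future-style result holder keyed by session code: the sender waits on the
// holder, and the reader thread fills it in and wakes the waiter.
public class ResultFuture {
    private byte[] result;
    private boolean done;

    public synchronized byte[] get(long timeoutMillis) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (!done) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return null;             // timed out; the caller treats this as an error
            }
            wait(remaining);
        }
        return result;
    }

    public synchronized void set(byte[] result) {
        this.result = result;
        this.done = true;
        notifyAll();                     // wake the thread blocked in get()
    }
}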
 
The overall structure of AIO:

In fact, AIO adds a better asynchronous-callback encapsulation on top of a structure NIO arrived at over several versions. There is not much to say about the callback encapsulation itself; instead, consider the roles added during those NIO iterations.

ChannelFacade plays the same part as the input/output cache design used with persistent connections during the optimization above: it wraps the channel and provides an entry point for optimization and interception. InputHandler performs a first logical analysis of the input and decides whether a flush is needed, much like the batch flush in the optimization.
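A rough sketch of those two roles as interfaces; this is my own illustration of the description above, not the actual framework API:

import java.nio.ByteBuffer;

// Illustrative only: ChannelFacade wraps the real channel and gives the
// framework a single interception point; InputHandler inspects accumulated
// input and decides when a flush is worthwhile (compare the batch flush earlier).
interface ChannelFacade {
    void queueOutput(ByteBuffer frame);   // buffered write through the facade
    ByteBuffer pollInput();               // buffered read through the facade
}

interface InputHandler {
    // Examine accumulated input and report whether it should be flushed
    // downstream now (e.g. a complete logical packet has arrived).
    boolean shouldFlush(ByteBuffer accumulated);
}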

Remarks
Overall, NIO on the client matches what I thought at the outset: it is the persistent-connection model plus an event-driven framework. Handling multithreaded concurrency control and task splitting well is the key to multiplexing the NIO channel; handled poorly, it is easily outperformed by an ordinary client connection pool.
