Performance analysis of the Java I/O API

Abstract:

The scalability of the I/O API is of great importance to Web applications. In Java versions before 1.4, the blocking I/O API disappointed many developers. Starting with J2SE 1.4, Java finally has a scalable I/O API. This article analyzes and quantifies the difference in scalability between the old and new I/O APIs.

Outline:

I. Overview
II. HTTP server written with the old API
III. Non-blocking HTTP server
IV. Registration and processing in detail
V. Quantitative analysis and comparison of scalability

Body:

I. Overview

When a Java program writes data to a socket, it must call the write() method of the associated OutputStream. The write() call returns only after all of the data has been written. If the send buffer is full and the connection is slow, this call can take some time to complete. If the program uses only a single thread, all other connections must wait, even if they are ready for a write() call themselves. To solve this problem, each socket has to be associated with its own thread; then, while one thread is blocked on an I/O operation, another thread can still run.

Although threads are less expensive than processes, both are resource-hungry constructs from the point of view of the underlying operating platform. Each thread consumes a certain amount of memory, and multiple threads also imply context switches between them, which are expensive as well. As a result, Java needed a new API that loosens the overly tight coupling between sockets and threads. The new Java I/O API (java.nio.*) finally achieves this goal.

This article analyzes and compares a simple Web server written with the old I/O API and one written with the new I/O API. Because HTTP, as a Web protocol, is no longer as simple as it once was, the examples presented here contain only the key features; they do not take security into account and do not strictly comply with the protocol specification.

II. HTTP server written with the old API

First, let's look at an HTTP server written with the old API. This implementation uses only one class. The main() method first creates a ServerSocket bound to port 8080:

    public static void main(String[] args) throws IOException {
        ServerSocket serverSocket = new ServerSocket(8080);
        // args[0] is the number of worker threads to start
        for (int i = 0; i < Integer.parseInt(args[0]); i++) {
            new Httpd(serverSocket);
        }
    }

Next, the main() method creates a number of Httpd objects and initializes them with the shared ServerSocket. The Httpd constructor gives each instance a meaningful name, sets the default protocol, and then starts the server by calling the start() method of its superclass Thread. This leads to an asynchronous call of the run() method, which contains an infinite loop.

In the infinite loop of the run() method, the blocking accept() method of ServerSocket is called. When a client connects to port 8080 of the server, accept() returns a Socket object. Each socket is associated with an InputStream and an OutputStream, both of which are used in the subsequent handleRequest() call. This method reads the client's request, checks and processes it, and then sends the appropriate response to the client. If the request is valid, the requested file is returned through the sendFile() method; otherwise, the client receives an appropriate error message (through the sendError() method).

    while (true) {
        ...
        socket = serverSocket.accept();   // blocks until a client connects
        ...
        handleRequest();
        ...
        socket.close();
    }

Now let's analyze this implementation. Does it do the job well? Basically, yes. Of course, request parsing could be optimized further, because StringTokenizer has a poor reputation for performance. However, the program at least turns off TCP delay (which is unsuitable for short-lived connections) and buffers outgoing files. More importantly, all threads operate independently of each other. Which thread handles a new connection request is decided by the native (and therefore fast) accept() method. Apart from the ServerSocket object, the threads share no resources that might need synchronization. This solution is fast, but unfortunately it does not scale well, because threads are obviously a limited resource.
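The blocking design described in this section can be sketched as follows. This is a minimal illustration, not the article's Httpd class: the name BlockingWorker, the fixed reply, the loopback-only binding on an ephemeral port, and the helper methods startServer() and fetch() are all assumptions made so that the example is self-contained.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for Httpd: every worker thread blocks in accept() on a shared ServerSocket.
final class BlockingWorker extends Thread {
    private final ServerSocket serverSocket;

    BlockingWorker(ServerSocket serverSocket, int id) {
        super("BlockingWorker-" + id);   // give each thread a meaningful name
        this.serverSocket = serverSocket;
        setDaemon(true);                 // do not keep the JVM alive in this sketch
        start();                         // run() now executes asynchronously
    }

    @Override
    public void run() {
        while (!serverSocket.isClosed()) {
            try (Socket socket = serverSocket.accept()) {   // blocks until a client connects
                OutputStream out = socket.getOutputStream();
                // A real server would parse the request here; this sketch sends a fixed reply.
                out.write("HTTP/1.0 200 OK\r\n\r\nhello".getBytes(StandardCharsets.US_ASCII));
                out.flush();
            } catch (IOException e) {
                // accept() fails once the server socket is closed; the loop condition then ends the thread
            }
        }
    }

    // Starts the given number of workers on an ephemeral loopback port.
    static ServerSocket startServer(int threads) {
        try {
            ServerSocket ss = new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
            for (int i = 0; i < threads; i++) {
                new BlockingWorker(ss, i);
            }
            return ss;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Connects as a client, reads the reply until the server closes the socket, and returns it as text.
    static String fetch(int port) {
        try (Socket s = new Socket(InetAddress.getLoopbackAddress(), port)) {
            return new String(s.getInputStream().readAllBytes(), StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With t workers, at most t clients are served at the same time; every further client waits in the accept backlog until a worker returns to accept().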

III. Non-blocking HTTP server

Now let's look at an alternative that uses the new non-blocking I/O API. It is a little more complex than the original and requires the threads to cooperate. It consists of the following four classes:

· NioHttpd
· Acceptor
· Connection
· ConnectionSelector

The primary task of NioHttpd is to start the server. Just like the previous Httpd, a server socket is bound to port 8080. The main difference is that the new server uses java.nio.channels.ServerSocketChannel rather than ServerSocket. Before the socket can be explicitly bound to a port with the bind() method, a channel must be opened first. The main() method then instantiates a ConnectionSelector and an Acceptor. The Acceptor receives both the ServerSocketChannel and the ConnectionSelector on instantiation, so that it can register connections with that ConnectionSelector.

    public static void main(String[] args) throws IOException {
        ServerSocketChannel ssc = ServerSocketChannel.open();
        ssc.socket().bind(new InetSocketAddress(8080));
        ConnectionSelector cs = new ConnectionSelector();
        new Acceptor(ssc, cs);
    }
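As a small, hedged illustration of the open-then-bind sequence, the sketch below opens a channel and binds it to a port. The class name ChannelSetup and the use of port 0 (which asks the operating system for any free port) are assumptions for the example. A freshly opened ServerSocketChannel is in blocking mode, which is what allows an accepting thread to block in accept().

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;

// Hypothetical helper; not part of the article's code.
final class ChannelSetup {
    // Opens a server channel and binds it explicitly. Port 0 means "any free port".
    static ServerSocketChannel openAndBind(int port) {
        try {
            ServerSocketChannel ssc = ServerSocketChannel.open();   // open, but not yet bound
            ssc.socket().bind(new InetSocketAddress(port));         // explicit bind
            return ssc;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```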

To understand the interaction between these two threads, let's first look at the Acceptor. Its primary task is to accept incoming connection requests and register them with the ConnectionSelector. The Acceptor constructor calls the superclass's start() method, and the run() method contains the required infinite loop. In this loop, a blocking accept() method is called, which eventually returns a socket object. The process is almost identical to that in Httpd, except that the accept() method of ServerSocketChannel is used rather than that of ServerSocket, so the returned object is a SocketChannel. Finally, a Connection object is created with the SocketChannel as its argument and handed to the queue() method of the ConnectionSelector.

    while (true) {
        ...
        socketChannel = serverSocketChannel.accept();   // blocks until a client connects
        connectionSelector.queue(new Connection(socketChannel));
        ...
    }

In summary, the Acceptor does nothing but accept connection requests in an infinite loop and register the connections through the ConnectionSelector. Like the Acceptor, the ConnectionSelector is also a thread. In its constructor, it creates a queue and opens a java.nio.channels.Selector with Selector.open(). The Selector is one of the most important parts of the whole server: it lets the program register connections and obtain a list of the connections that are ready for read or write operations.

After the constructor calls start(), the infinite loop inside the run() method begins to execute. In this loop, the program calls the selector's select() method. This method blocks until one of the registered connections is ready for an I/O operation or the selector's wakeup() method is called.

    while (true) {
        ...
        int i = selector.select();
        registerQueuedConnections();
        ...
        // process connections ...
    }

It is important to understand that while the ConnectionSelector thread is executing select(), the Acceptor thread cannot register a connection with that selector, because the corresponding methods are synchronized. This is why a queue is used: the Acceptor thread adds new connections to the queue instead.

    public void queue(Connection connection) {
        synchronized (queue) {
            queue.add(connection);
        }
        selector.wakeup();   // make the blocked select() return
    }

Immediately after putting the connection into the queue, the Acceptor calls the selector's wakeup() method. This causes the ConnectionSelector thread to return from its blocked select() call and continue. Because the selector is no longer blocked, the ConnectionSelector can now register the connections from the queue. This happens in the registerQueuedConnections() method as follows:

    if (!queue.isEmpty()) {
        synchronized (queue) {
            while (!queue.isEmpty()) {
                Connection connection =
                    (Connection) queue.remove(queue.size() - 1);
                connection.register(selector);
            }
        }
    }
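The queue-and-wakeup handoff described above can be tried out in isolation. The following is a minimal sketch under stated assumptions: the class QueueingSelector is hypothetical, java.nio.channels.Pipe source channels stand in for the article's Connection objects so that no network is needed, and a CountDownLatch is added only so that a caller can observe that registration happened on the selector thread.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for ConnectionSelector; Pipe sources replace Connection objects.
final class QueueingSelector extends Thread {
    private final Selector selector;
    private final List<Pipe.SourceChannel> queue = new ArrayList<>();
    private final CountDownLatch registered;

    QueueingSelector(int expectedRegistrations) {
        try {
            selector = Selector.open();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        registered = new CountDownLatch(expectedRegistrations);
        setDaemon(true);
        start();
    }

    // Called by other threads: enqueue the channel, then wake the selector thread.
    void queue(Pipe.SourceChannel channel) {
        synchronized (queue) {
            queue.add(channel);
        }
        selector.wakeup();   // a wakeup issued before select() makes the next select() return at once
    }

    // Runs on the selector thread only, after select() has returned.
    private void registerQueued() throws IOException {
        synchronized (queue) {
            while (!queue.isEmpty()) {
                Pipe.SourceChannel channel = queue.remove(queue.size() - 1);
                channel.configureBlocking(false);
                channel.register(selector, SelectionKey.OP_READ);
                registered.countDown();
            }
        }
    }

    @Override
    public void run() {
        try {
            while (true) {
                selector.select();               // blocks until a channel is ready or wakeup() is called
                registerQueued();
                selector.selectedKeys().clear(); // this sketch does not process ready keys
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Waits until the expected number of channels has been registered.
    boolean awaitRegistered(long millis) {
        try {
            return registered.await(millis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Convenience factory for a selectable channel that needs no network.
    static Pipe.SourceChannel newSource() {
        try {
            return Pipe.open().source();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because all register() calls happen on the thread that also calls select(), the two never contend for the selector; the only shared state is the short-lived lock on the queue.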

IV. Registration and processing in detail

Next we analyze the register() method of Connection. So far we have said that a connection is registered with the selector, but this is a simplification. What is actually registered with the selector is a java.nio.channels.SocketChannel object, and only for specific I/O operations. Registration returns a java.nio.channels.SelectionKey. Any object can be associated with this key through the attach() method. So that the connection can be retrieved via the key, the Connection object is attached to it. In this way, a Connection can be obtained indirectly from the selector.

    public void register(Selector selector)
            throws IOException {
        key = socketChannel.register(selector, SelectionKey.OP_READ);
        key.attach(this);   // lets the selector loop find this Connection again
    }

Now back to ConnectionSelector. The return value of select() indicates how many connections are ready for I/O operations. If the return value is 0, the loop simply starts over. Otherwise, a call to selectedKeys() returns the set of keys, from which the previously attached Connection objects are obtained, and their readRequest() or writeResponse() method is called. Which of the two methods is called depends on whether the connection is registered for a read or a write operation.
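The round trip from register() and attach() to selectedKeys() and attachment() can be shown in a few lines. This is only a sketch: the classes AttachSketch and Conn are hypothetical, and a java.nio.channels.Pipe replaces the socket so that the example runs without a network peer.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

final class AttachSketch {
    // Hypothetical stand-in for the article's Connection class.
    static final class Conn {
        final String name;
        Conn(String name) { this.name = name; }
    }

    // Registers a channel for reading, makes it readable, and recovers the attached object.
    static String nameOfReadyConnection() {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try {
                pipe.source().configureBlocking(false);   // only non-blocking channels can be registered
                SelectionKey key = pipe.source().register(selector, SelectionKey.OP_READ);
                key.attach(new Conn("conn-1"));

                pipe.sink().write(ByteBuffer.wrap(new byte[] {42}));   // make the source readable

                String ready = null;
                if (selector.select(1000) > 0) {
                    for (SelectionKey k : selector.selectedKeys()) {
                        if (k.isReadable()) {
                            ready = ((Conn) k.attachment()).name;   // key -> attached connection
                        }
                    }
                    selector.selectedKeys().clear();   // the selector never clears this set itself
                }
                return ready;
            } finally {
                pipe.sink().close();
                pipe.source().close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```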

Now let's look at the Connection class. It represents a connection and handles all the details of the protocol. In the constructor, the SocketChannel passed as a parameter is set to non-blocking mode, which is essential for the server. The constructor also sets some default values and allocates the buffer requestLineBuffer. Because allocating direct buffers is somewhat more expensive and a new buffer is created for every connection, java.nio.ByteBuffer.allocate() is used here instead of ByteBuffer.allocateDirect(). If buffers were reused, direct buffers might be more efficient.

    public Connection(SocketChannel socketChannel)
            throws IOException {
        this.socketChannel = socketChannel;
        ...
        socketChannel.configureBlocking(false);
        requestLineBuffer = ByteBuffer.allocate(512);
        ...
    }
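The difference between the two allocation methods is visible through the ByteBuffer API itself. The helper class and method names below are assumptions for illustration; the sketch makes no claim about relative allocation cost, which depends on the VM.

```java
import java.nio.ByteBuffer;

// Hypothetical helper contrasting the two allocation styles mentioned above.
final class BufferKinds {
    // Heap buffer: backed by an ordinary byte[] and reclaimed by the garbage collector.
    static ByteBuffer perConnectionBuffer() {
        return ByteBuffer.allocate(512);
    }

    // Direct buffer: memory outside the Java heap, a candidate when buffers are reused.
    static ByteBuffer reusableBuffer() {
        return ByteBuffer.allocateDirect(512);
    }
}
```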

After initialization is complete and the SocketChannel is ready for reading, the ConnectionSelector calls readRequest(), which uses socketChannel.read(requestLineBuffer) to read all available data into the buffer. If a complete line cannot be read yet, control returns to the calling ConnectionSelector so that another connection can be processed. If, on the other hand, the whole line has been read, the next step is to parse the request just as in Httpd. If the request is valid, the program creates a java.nio.channels.FileChannel for the requested file and calls prepareForResponse().
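The "return if the line is not complete yet" step might look like the sketch below. The class RequestLineReader and the method tryReadLine() are hypothetical; the sketch assumes that a request line ends with CRLF and that the buffer is still in the state left by channel.read(), that is, bytes 0 to position() - 1 are filled.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; the article does not show how readRequest() detects a complete line.
final class RequestLineReader {
    // Returns the request line without its CRLF once it is complete, or null if more data is needed.
    static String tryReadLine(ByteBuffer buffer) {
        // Absolute get(int) does not move the position, so a later read() can keep appending.
        for (int i = 1; i < buffer.position(); i++) {
            if (buffer.get(i - 1) == '\r' && buffer.get(i) == '\n') {
                byte[] line = new byte[i - 1];
                for (int j = 0; j < line.length; j++) {
                    line[j] = buffer.get(j);
                }
                return new String(line, StandardCharsets.US_ASCII);
            }
        }
        return null;   // incomplete: give other connections a turn
    }
}
```

A caller would invoke tryReadLine() after every read() and simply return to the selector loop while the result is null.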

    private void prepareForResponse() throws IOException {
        StringBuffer responseLine = new StringBuffer(128);
        ...
        responseLineBuffer = ByteBuffer.wrap(
            responseLine.toString().getBytes("ASCII")
        );
        key.interestOps(SelectionKey.OP_WRITE);   // from now on, wait for write readiness
        key.selector().wakeup();
    }
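The switch from read interest to write interest shown in the method above can be observed on a real socket pair. The sketch below is an illustration under assumptions: the class ResponseSwitchSketch and its methods are hypothetical, and both ends of the connection live in one process on the loopback interface.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;

final class ResponseSwitchSketch {
    // Wraps an ASCII status line, terminated by CRLF, in a ByteBuffer.
    static ByteBuffer responseLine(String statusLine) {
        return ByteBuffer.wrap((statusLine + "\r\n").getBytes(StandardCharsets.US_ASCII));
    }

    // Registers a connection for OP_READ, then switches it to OP_WRITE and selects again.
    static boolean becomesWritable() {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.socket().bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress());
                 SocketChannel accepted = server.accept()) {
                accepted.configureBlocking(false);
                SelectionKey key = accepted.register(selector, SelectionKey.OP_READ);

                // The client has sent nothing, so no key is ready for reading.
                int readyForRead = selector.selectNow();

                // Switch interest: an idle socket with an empty send buffer is writable.
                key.interestOps(SelectionKey.OP_WRITE);
                int readyForWrite = selector.select(1000);

                return readyForRead == 0 && readyForWrite == 1 && key.isWritable();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this single-threaded sketch no wakeup() is needed, because the interest set is changed by the same thread that calls select().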

The prepareForResponse() method builds the response line in a StringBuffer and, if necessary, the response headers or an error message, and writes the data to responseLineBuffer. This ByteBuffer is a simple wrapper around a byte array. Once the data to be sent has been generated, the ConnectionSelector must be told that from now on data is to be written rather than read. This is done by calling the selection key's interestOps(SelectionKey.OP_WRITE) method. To make sure the selector notices the change in the connection's state quickly, wakeup() is called as well.

Next, the ConnectionSelector calls the connection's writeResponse() method. First, responseLineBuffer is written to the socket channel. If the buffer has been written completely and the requested file still has to be sent, the transferTo() method of the previously opened FileChannel is called. transferTo() is usually an efficient way to move data from a file to a channel, but the actual efficiency depends on the underlying operating system. In any case, at most as much data is transferred as can be written to the target channel without blocking. For safety and to ensure fairness between connections, an upper limit in KB is also imposed on each transfer.
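A capped transferTo() call could be sketched as follows. The class ChunkedTransfer, its methods, and in particular the 64 KB value of MAX_CHUNK are assumptions chosen for illustration; the text above does not state the actual limit. A ByteArrayOutputStream wrapped in a channel replaces the socket so that the example is self-contained.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class ChunkedTransfer {
    // Hypothetical per-call cap; the real value used by the server is not given in the text.
    static final long MAX_CHUNK = 64 * 1024;

    // Transfers at most MAX_CHUNK bytes starting at position and returns the new position.
    static long transferSlice(FileChannel file, long position, WritableByteChannel target)
            throws IOException {
        long remaining = file.size() - position;
        long sent = file.transferTo(position, Math.min(remaining, MAX_CHUNK), target);
        return position + sent;   // transferTo may send fewer bytes than requested
    }

    // Copies content through a temporary file in capped slices and returns the number of calls made.
    static int copyInSlices(byte[] content) {
        try {
            Path tmp = Files.createTempFile("transfer", ".bin");
            try {
                Files.write(tmp, content);
                ByteArrayOutputStream sink = new ByteArrayOutputStream();
                int calls = 0;
                try (FileChannel file = FileChannel.open(tmp, StandardOpenOption.READ);
                     WritableByteChannel target = Channels.newChannel(sink)) {
                    long position = 0;
                    while (position < file.size()) {
                        position = transferSlice(file, position, target);
                        calls++;   // a selector-driven server would return to select() between slices
                    }
                }
                if (sink.size() != content.length) {
                    throw new IllegalStateException("copied " + sink.size() + " bytes");
                }
                return calls;
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Capping each call keeps one large file from monopolizing the selector thread: after every slice, other ready connections get a turn.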

Once all data has been transmitted, close() performs the cleanup. Its main task is to cancel the connection's registration, which is done by calling the key's cancel() method.

    public void close() {
        ...
        if (key != null) key.cancel();
        ...
    }

Does this new program perform well? The answer is yes. In principle, one Acceptor and one ConnectionSelector are enough to support any number of open connections, so the new implementation has the advantage in scalability. However, because the two threads must communicate through the synchronized queue() method, they may block each other. There are two ways to address this:

• An improved queue implementation
• Using multiple Acceptor/ConnectionSelector pairs

One disadvantage of NioHttpd compared to Httpd is that a new Connection object, together with its buffers, is created for every request. This leads to additional CPU load caused by the garbage collector; how much depends on the VM type. However, Sun has repeatedly stressed that, with HotSpot, short-lived objects are no longer a problem.

V. Quantitative analysis and comparison of scalability

How much better does NioHttpd scale than Httpd? Let's look at some concrete numbers. It must be said up front that these numbers are highly speculative and that important environmental factors, such as thread synchronization, context switching, paging, hard disk speed, and caching, are not considered. First, we estimate how long it takes to process r concurrent requests, assuming that the requested file is s bytes in size and the client's bandwidth is b bytes/sec. For Httpd, this time obviously depends directly on the number of threads t, because only t requests can be processed at the same time. The Httpd processing time is given by Formula One, where c is the constant overhead of parsing a request, which is the same for every request. It is further assumed that data can always be read from disk faster than it can be written to the socket, that the server's bandwidth is always greater than the sum of the clients' bandwidths, and that the CPU is not fully loaded. Server-side bandwidth, caching, and hard disk speed therefore do not appear in the formula.

Figure I: Formula One

However, the processing time of NioHttpd no longer depends on t. For NioHttpd, the transfer time l depends mainly on the client bandwidth b, the file size s, and the constant c mentioned above. This yields Formula Two, which gives the minimum transfer time of NioHttpd.

Figure II: Formula Two

Formula Three defines the ratio d, which compares the performance of NioHttpd with that of Httpd.

Figure III: Formula Three

Further analysis shows that if s, b, t, and c are constant, d tends to a limit as r tends to infinity; this limit is easily computed from Formula Four.

Figure IV: Formula Four

Therefore, apart from the number of threads and the constant overhead, the duration of a connection, s/b, has a very strong influence on d. The longer the connection lasts, the smaller d becomes, and the greater the advantage of NioHttpd over Httpd. Table One shows that with c = 10 ms, t = 100, s = 1 MB, and b = 8 KB/s, NioHttpd is 126 times faster than Httpd. If connections last a long time, NioHttpd shows a great advantage. When connections are short, for example in a LAN with bandwidth in the MB range, NioHttpd shows an advantage of about 10% if the file is large; if the file is small, the advantage is not significant.

The calculations above assume that the constant overhead is roughly the same for NioHttpd and Httpd and that the different server implementations incur no additional overhead. As mentioned earlier, this is a comparison under idealized conditions. It is nevertheless sufficient to give an idea of which implementation has the advantage. It is worth noting that most Web files are small, but HTTP 1.1 clients try to keep connections open as long as possible (keep-alive). Often, many connections that no longer transfer any data remain open. If each connection occupies a thread on the server, this leads to an enormous waste of resources. So, for HTTP servers in particular, the new Java I/O API can dramatically increase scalability.

Conclusion: Java's new I/O API can effectively improve the scalability of a server. Compared to the old API, the new API is more complex and requires a deeper understanding of multithreading and synchronization. However, once you have overcome these hurdles, you will find that the new I/O API is a necessary and useful improvement to the Java 2 platform.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
