HTTP persistent connection

Source: Internet
Author: User
Tags: response code, keep alive

An attempt at translating a technical document.
What is an HTTP persistent connection? HTTP persistent connections, also called HTTP keep-alive or HTTP connection reuse, use the same TCP connection to send and receive multiple HTTP requests and responses, as opposed to opening a new connection for every single request/response pair. Using persistent connections is very important for improving HTTP performance.

There are several advantages of using persistent connections, including:

  • Network friendly: less network traffic, because fewer TCP connections are set up and torn down.
  • Reduced latency on subsequent requests, because the initial TCP handshake is avoided.
  • Long-lasting connections give TCP sufficient time to determine the congestion state of the network and to react appropriately.
The advantages are even more obvious with HTTPS, that is, HTTP over SSL/TLS. There, persistent connections may reduce the number of costly SSL/TLS handshakes needed to establish security associations, in addition to the initial TCP connection setup. In HTTP/1.1, persistent connections are the default behavior of any connection. That is, unless otherwise indicated, the client should assume that the server will maintain a persistent connection, even after error responses from the server. However, the protocol provides means for a client and a server to signal the closing of a TCP connection.
What makes a connection reusable? Since TCP is by nature a stream-based protocol, in order to reuse an existing connection, HTTP has to have a way to indicate the end of the previous response and the beginning of the next one. Thus, all messages on the connection MUST have a self-defined message length (i.e., one not defined by closure of the connection). Self-demarcation is achieved either by setting the Content-Length header or, in the case of a chunked transfer-encoded entity body, by starting each chunk with its size and ending the response body with a special last chunk.
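To make the chunked case concrete, here is a minimal sketch of how a reader can find the end of a chunked body without waiting for the connection to close. The class and method names (ChunkedBodyDemo, decodeChunked, readLine) are made up for this illustration and are not part of the JDK; the JDK's HTTP handler does this internally. The sample bytes in main are invented test data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ChunkedBodyDemo {

    // Reads one CRLF-terminated line as ASCII, without the line terminator.
    static String readLine(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = in.read()) != -1 && c != '\n') {
            if (c != '\r') {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }

    // Decodes a chunked body and stops exactly at the end of the message,
    // leaving any following bytes (the next response) unread on the stream.
    static byte[] decodeChunked(InputStream in) throws IOException {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        while (true) {
            String sizeLine = readLine(in);
            int semi = sizeLine.indexOf(';'); // ignore any chunk extensions
            String hex = (semi >= 0 ? sizeLine.substring(0, semi) : sizeLine).trim();
            int size = Integer.parseInt(hex, 16);
            if (size == 0) {
                // Last chunk: skip optional trailer fields up to the empty line.
                while (!readLine(in).isEmpty()) {
                    // discard trailer line
                }
                return body.toByteArray();
            }
            byte[] chunk = new byte[size];
            int off = 0;
            while (off < size) {
                int n = in.read(chunk, off, size - off);
                if (n < 0) {
                    throw new EOFException("truncated chunk");
                }
                off += n;
            }
            body.write(chunk, 0, size);
            readLine(in); // consume the CRLF that follows the chunk data
        }
    }

    public static void main(String[] args) throws IOException {
        // Two chunks, the last chunk, then the start of a second response.
        String wire = "4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\nHTTP/1.1 200 OK\r\n";
        InputStream in = new ByteArrayInputStream(wire.getBytes(StandardCharsets.US_ASCII));
        String body = new String(decodeChunked(in), StandardCharsets.US_ASCII);
        System.out.println(body);         // prints "Wikipedia"
        System.out.println(readLine(in)); // prints "HTTP/1.1 200 OK"
    }
}
```

Because the decoder stops at the last chunk, the status line of the next response is still unread on the same stream, which is what allows the connection to be reused.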
What happens if there are proxy servers in between? Since a persistent connection applies to only one transport link, it is important that proxy servers correctly signal persistent or non-persistent connections separately with their clients and with the origin servers (or other proxy servers). From an HTTP client's or server's perspective, as far as persistent connections are concerned, the presence or absence of proxy servers is transparent.

What does the current JDK do for Keep-Alive? The JDK supports both HTTP/1.1 and HTTP/1.0 persistent connections.

When the application finishes reading the response body, or when the application calls close() on the InputStream returned by URLConnection.getInputStream(), the JDK's HTTP protocol handler will try to clean up the connection and, if successful, put the connection into a connection cache for reuse by future HTTP requests.

The support for HTTP Keep-Alive is done transparently. However, it can be controlled by the system properties http.keepAlive and http.maxConnections, as well as by HTTP/1.1 specified request and response headers.

The system properties that control the behavior of Keep-Alive are:

http.keepAlive=<boolean>
default: true
Indicates whether keep-alive (persistent) connections should be supported.

http.maxConnections=<int>
default: 5
Indicates the maximum number of connections per destination to be kept alive at any given time.

The HTTP header that influences connection persistence is:

Connection: close
If the "Connection" header is specified with the value "close" in either the request or the response header fields, it indicates that the connection should not be considered persistent after the current request/response is complete.
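As an illustration of these controls, the following sketch sets the two system properties from code and opens one request that opts out of persistence with "Connection: close". The class name, the helper method names, and the URL http://example.com/ are assumptions made for this example. Setting the properties before the first HTTP request is also an assumption about when the handler reads them, so treat this as a sketch rather than a guaranteed recipe.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class KeepAliveConfigDemo {

    // Sets the keep-alive system properties. This is done before any HTTP
    // request is made, on the assumption that the handler may read them once.
    static void configureKeepAlive(boolean enabled, int maxConnections) {
        System.setProperty("http.keepAlive", Boolean.toString(enabled));
        System.setProperty("http.maxConnections", Integer.toString(maxConnections));
    }

    // Prepares a request that tells the server this connection should not be
    // considered persistent once the current request/response is complete.
    // openConnection() does not perform any network I/O yet.
    static HttpURLConnection openNonPersistent(URL url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestProperty("Connection", "close");
        return conn;
    }

    public static void main(String[] args) throws IOException {
        // Keep at most 10 idle connections per destination (the default is 5).
        configureKeepAlive(true, 10);

        // Hypothetical URL, used only to show the call sequence.
        HttpURLConnection conn = openNonPersistent(new URL("http://example.com/"));
        System.out.println("Prepared " + conn.getURL() + " with Connection: close");
    }
}
```

The same properties can also be passed on the command line with -D, for example -Dhttp.maxConnections=10.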
The current implementation doesn't buffer the response body, which means that the application has to finish reading the response body, or call close() to abandon the rest of it, in order for that connection to be reused. Furthermore, the current implementation will not try blocking reads when cleaning up the connection, meaning that if the whole response body is not available, the connection will not be reused.
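One way to make sure a response body is always consumed is a small helper that reads and discards whatever is left before closing the stream. The sketch below is only an illustration: DrainDemo and drainAndClose are hypothetical names, and the in-memory stream in main stands in for the stream that URLConnection.getInputStream() would return.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class DrainDemo {

    // Reads and discards the rest of the stream, then closes it.
    // Returns the number of bytes that were discarded.
    static long drainAndClose(InputStream in) throws IOException {
        byte[] buf = new byte[4096];
        long total = 0;
        try (InputStream s = in) {
            int n;
            while ((n = s.read(buf)) != -1) {
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a 10000-byte response body.
        InputStream body = new ByteArrayInputStream(new byte[10000]);

        // The application looks at the first 16 bytes only...
        byte[] head = new byte[16];
        int seen = body.read(head);

        // ...and then drains the remainder instead of abandoning it.
        long discarded = drainAndClose(body);
        System.out.println("read " + seen + ", discarded " + discarded + " bytes");
    }
}
```

Draining is worthwhile for short bodies. For a very long body it may be cheaper to close the stream and give up the connection.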

What's new in Tiger (J2SE 5.0)? When the application encounters an HTTP 400 or 500 response, it may ignore the IOException and then issue another HTTP request. In this case, the underlying TCP connection won't be kept alive, because the response body is still there to be consumed, so the socket connection is not cleaned up and is therefore not available for reuse. What the application needs to do is call HttpURLConnection.getErrorStream() after catching the IOException, read the response body, and then close the stream. However, some existing applications do not do this and, as a result, do not benefit from persistent connections. To address this problem, we have introduced a workaround.

The workaround involves buffering the response body if the response code is >= 400, up to a certain amount and within a time limit, thus freeing up the underlying socket connection for reuse. The rationale is that when the server responds with a >= 400 error (a client error or server error, for example "404: File Not Found"), the server usually sends a small response body to explain whom to contact and what to do to recover.

Several new Sun implementation-specific properties are introduced to help clean up connections after an error response from the server.

The major one is:

sun.net.http.errorstream.enableBuffering=<boolean>
default: false

With the above system property set to true (the default is false), when the response code is >= 400, the HTTP handler will try to buffer the response body, thus freeing up the underlying socket connection for reuse. Thus, even if the application doesn't call getErrorStream(), read the response body, and then call close(), the underlying socket connection may still be kept alive and reused.

The following two system properties provide further control over the error stream buffering behavior:

sun.net.http.errorstream.timeout=<int> in milliseconds
default: 300 milliseconds

sun.net.http.errorstream.bufferSize=<int> in bytes
default: 4096 bytes
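For completeness, here is a short sketch that turns on error stream buffering programmatically. The class and method names are invented for this example, and the values 500 and 8192 are arbitrary. Because these properties are Sun implementation-specific, other JDK implementations may ignore them, and setting them before the first HTTP request is an assumption about when they are read.

```java
public class ErrorStreamBufferingConfig {

    // Enables buffering of error response bodies (response code >= 400) and
    // sets the time limit and buffer size that bound the buffering.
    static void enableErrorStreamBuffering(int timeoutMillis, int bufferSizeBytes) {
        System.setProperty("sun.net.http.errorstream.enableBuffering", "true");
        System.setProperty("sun.net.http.errorstream.timeout",
                Integer.toString(timeoutMillis));
        System.setProperty("sun.net.http.errorstream.bufferSize",
                Integer.toString(bufferSizeBytes));
    }

    public static void main(String[] args) {
        // Arbitrary example values: wait up to 500 ms, buffer up to 8192 bytes.
        enableErrorStreamBuffering(500, 8192);
        System.out.println(System.getProperty("sun.net.http.errorstream.enableBuffering"));
    }
}
```

Passing -Dsun.net.http.errorstream.enableBuffering=true on the java command line sets the same property without any code changes, which suits existing applications that cannot easily be modified.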
What can you do to help with Keep-Alive? Do not abandon a connection by ignoring the response body. Doing so may result in idle TCP connections that need to be garbage collected when they are no longer referenced.

If getInputStream() successfully returns, read the entire response body.

When calling getInputStream() from HttpURLConnection, if an IOException occurs, catch the exception and call getErrorStream() to get the response body (if there is any).

Reading the response body cleans up the connection even if you are not interested in the response content itself. But if the response body is long and you are not interested in the rest of it after seeing the beginning, you can close the InputStream. Be aware, though, that more data could be on its way, so the connection may not be cleaned up for reuse.

Here's a code example that complies with the above recommendations:

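    import java.io.IOException;
    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.net.URLConnection;

    // processBuf(byte[]) stands for the application's own handling of the data.
    byte[] buf = new byte[4096];
    URLConnection conn = null;
    try {
        URL a = new URL(args[0]);
        conn = a.openConnection();
        InputStream is = conn.getInputStream();
        int ret = 0;
        // read the entire response body
        while ((ret = is.read(buf)) > 0) {
            processBuf(buf);
        }
        // close the input stream
        is.close();
    } catch (IOException e) {
        try {
            if (conn instanceof HttpURLConnection) {
                HttpURLConnection httpConn = (HttpURLConnection) conn;
                int respCode = httpConn.getResponseCode();
                // getErrorStream() returns null if there is no error body
                InputStream es = httpConn.getErrorStream();
                if (es != null) {
                    int ret = 0;
                    // read the response body
                    while ((ret = es.read(buf)) > 0) {
                        processBuf(buf);
                    }
                    // close the error stream
                    es.close();
                }
            }
        } catch (IOException ex) {
            // deal with the exception
        }
    }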

If you know ahead of time that you won't be interested in the response body, you should issue a HEAD request instead of a GET request, for example when you are only interested in the meta information of the web resource, or when testing for its validity, accessibility, and recent modification. Here's a code snippet:

    URL a = new URL(args[0]);
    URLConnection urlc = a.openConnection();
    HttpURLConnection httpc = (HttpURLConnection) urlc;
    // only interested in the length of the resource
    httpc.setRequestMethod("HEAD");
    int len = httpc.getContentLength();

 
