2.1. Persistent connection
Establishing a connection between two hosts is a complex, time-consuming process that involves exchanging multiple packets. The TCP three-way handshake needed to open a connection is expensive, and the overhead is proportionally larger for small HTTP messages. Reusing an already established connection to execute multiple requests lowers the cost and raises throughput.
HTTP/1.1 supports connection reuse by default. HTTP/1.0-compatible endpoints can also explicitly declare that they want to keep the connection alive and reuse it for multiple requests. An HTTP agent can also keep an idle connection open for a certain amount of time in case a connection to the same target host is needed for subsequent requests. The ability to keep a connection open instead of releasing it is usually called connection persistence. HttpClient fully supports persistent connections.
2.2. HTTP Connection Routing
HttpClient can establish a connection to the target host either directly or through a route that involves multiple intermediate connections, also called hops. HttpClient divides the connections of a route into three kinds: plain, tunneled, and layered. Using multiple intermediate proxies to tunnel a connection to the target host is called proxy chaining.
A plain route is established by connecting directly to the target host or to the only proxy. A tunneled route is established by connecting to the first proxy and tunnelling through a chain of proxies to the target. A route without a proxy cannot be tunneled. A layered route is established by layering a protocol over an existing connection. A protocol can only be layered over a tunnel to the target or over a direct connection without proxies.
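The constraints above can be captured in a few lines. The following is a minimal, JDK-only sketch and not HttpClient's own route classes: the class name RouteShapeSketch and its isValid method are hypothetical, and a route is reduced to a proxy count plus two flags.

```java
class RouteShapeSketch {

    /**
     * Checks the two constraints described in the text:
     * a tunnel needs at least one proxy, and a protocol can be layered only
     * over a tunnel or over a direct (proxy-free) connection.
     */
    static boolean isValid(int proxyCount, boolean tunnelled, boolean layered) {
        if (proxyCount < 0) {
            return false;
        }
        if (tunnelled && proxyCount == 0) {
            return false; // nothing to tunnel through
        }
        if (layered && proxyCount > 0 && !tunnelled) {
            return false; // layering over a proxy requires a tunnel
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValid(0, false, false)); // plain, direct
        System.out.println(isValid(1, false, false)); // plain, via one proxy
        System.out.println(isValid(1, true, true));   // tunnelled and layered
        System.out.println(isValid(0, true, false));  // tunnel without a proxy
    }
}
```

The third case corresponds to the common situation of a secure connection made through a proxy; the last case is rejected because there is no proxy to tunnel through.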
2.2.1. Routing calculations
The RouteInfo interface represents information about a definitive route to a target host that involves one or more intermediate steps, or hops. HttpRoute is a concrete implementation of RouteInfo; it is immutable. RouteTracker is a mutable RouteInfo implementation that HttpClient uses internally to track the remaining hops to the final route target. HttpRouteDirector is a helper class that computes the next step in a route. It is also used internally by HttpClient.
HttpRoutePlanner is an interface that represents a strategy for computing a complete route to a given target based on the execution context. HttpClient ships with two HttpRoutePlanner implementations. SystemDefaultRoutePlanner is based on java.net.ProxySelector. By default it picks up the JVM's proxy settings, which typically come from system properties or from the browser running the application. DefaultProxyRoutePlanner does not use any Java system properties or any system or browser proxy settings. It always computes routes through the same default proxy.
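SystemDefaultRoutePlanner builds on java.net.ProxySelector, so it can help to see what the selector itself reports. The sketch below uses only the JDK; the class name ProxyLookupSketch and the URL http://www.example.com/ are placeholders chosen for illustration, and the output depends on how the JVM is configured.

```java
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.URI;
import java.util.Collections;
import java.util.List;

class ProxyLookupSketch {

    /** Asks the JVM-wide ProxySelector which proxies apply to the given URI. */
    static List<Proxy> proxiesFor(String uri) {
        ProxySelector selector = ProxySelector.getDefault();
        if (selector == null) {
            // No selector installed: treat the target as directly reachable.
            return Collections.singletonList(Proxy.NO_PROXY);
        }
        return selector.select(URI.create(uri));
    }

    public static void main(String[] args) {
        for (Proxy proxy : proxiesFor("http://www.example.com/")) {
            // Type DIRECT means no proxy; HTTP or SOCKS means one proxy hop.
            System.out.println(proxy.type() + " " + proxy.address());
        }
    }
}
```

A route planner that honours these settings would turn a DIRECT entry into a direct route and an HTTP entry into a route through that proxy.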
2.2.2. Secure HTTP connection
An HTTP connection can be considered secure if the information transmitted between the two endpoints cannot be read or tampered with by an unauthorized third party. SSL/TLS is the most widely used technique for securing HTTP transport, although other encryption techniques could be used as well. Typically, HTTP transport is layered over an SSL/TLS encrypted connection.
2.3. HTTP Connection Manager
2.3.1. Managing connections and connection managers
HTTP connections are complex, stateful objects that are not thread-safe, so they must be managed properly. An HTTP connection can only be used by one thread at a time. HttpClient uses a special entity, the HTTP connection manager, represented by the HttpClientConnectionManager interface, to manage HTTP connections. The connection manager acts as a factory for new HTTP connections, manages the life cycle of persistent connections, and synchronizes access to them so that only one thread can use a connection at a time. Internally, the connection manager works with instances of ManagedHttpClientConnection, which act as a proxy for the real connection and manage its state and I/O operations. If a managed connection is released or explicitly closed by its consumer, the underlying connection is detached from its proxy and returned to the manager. Even if the consumer still holds a reference to the proxy instance, it can no longer perform I/O operations or change the state of the real connection.
The following code shows how to get an HTTP connection from the Connection manager:
    HttpClientContext context = HttpClientContext.create();
    HttpClientConnectionManager connMrg = new BasicHttpClientConnectionManager();
    HttpRoute route = new HttpRoute(new HttpHost("www.yeetrack.com", 80));
    // Request a new connection. This can take a long time.
    ConnectionRequest connRequest = connMrg.requestConnection(route, null);
    // Wait for the connection for up to 10 seconds
    HttpClientConnection conn = connRequest.get(10, TimeUnit.SECONDS);
    try {
        // If the connection is not open yet
        if (!conn.isOpen()) {
            // establish connection based on its route info
            connMrg.connect(conn, route, 1000, context);
            // and mark it as route complete
            connMrg.routeComplete(conn, route, context);
        }
        // Do useful things with the connection.
    } finally {
        connMrg.releaseConnection(conn, null, 1, TimeUnit.MINUTES);
    }
If necessary, the connection request can be terminated prematurely by calling ConnectionRequest.cancel(). This unblocks the thread that is blocked in ConnectionRequest.get().
2.3.2. Simple Connection Manager
BasicHttpClientConnectionManager is a simple connection manager that maintains only one connection at a time. Although the class is thread-safe, it should be used by only one execution thread. BasicHttpClientConnectionManager tries to reuse the connection for subsequent requests with the same route. If the route of the persistent connection does not match that of the new request, it closes the existing connection and reopens it for the requested route. If the connection is already allocated, java.lang.IllegalStateException is thrown.
2.3.3. Connection Pool Manager
Compared with BasicHttpClientConnectionManager, PoolingHttpClientConnectionManager is a more complex implementation that manages a pool of client connections and can serve connection requests from multiple execution threads. Connections are pooled on a per-route basis. When a connection is requested for a route for which the pool already holds an available persistent connection, the manager leases that connection instead of creating a new one.
PoolingHttpClientConnectionManager limits the number of connections both per route and in total. By default it creates no more than 2 concurrent connections per route and no more than 20 connections in total. For many real-world applications these limits may be too restrictive, especially if they use HTTP as a transport protocol for their services.
The following example shows how to adjust the connection pool parameters:
    PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
    // Increase the maximum total number of connections to 200
    cm.setMaxTotal(200);
    // Increase the default maximum number of connections per route to 20
    cm.setDefaultMaxPerRoute(20);
    // Increase the maximum number of connections for the target host to 50
    HttpHost localhost = new HttpHost("www.yeetrack.com", 80);
    cm.setMaxPerRoute(new HttpRoute(localhost), 50);

    CloseableHttpClient httpClient = HttpClients.custom()
            .setConnectionManager(cm)
            .build();
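To make the interplay of the two limits more concrete, here is a rough JDK-only model of per-route and total limits built on semaphores. It is an illustrative assumption, not how PoolingHttpClientConnectionManager is implemented; RouteLimitSketch, tryLease, and the route keys are hypothetical names, and a route is reduced to a plain string.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

class RouteLimitSketch {
    private final Semaphore total;
    private final int defaultMaxPerRoute;
    private final Map<String, Semaphore> perRoute =
            new ConcurrentHashMap<String, Semaphore>();

    RouteLimitSketch(int maxTotal, int defaultMaxPerRoute) {
        this.total = new Semaphore(maxTotal);
        this.defaultMaxPerRoute = defaultMaxPerRoute;
    }

    /** Overrides the limit for one route; call before leasing on that route. */
    void setMaxPerRoute(String route, int max) {
        perRoute.put(route, new Semaphore(max));
    }

    private Semaphore permitsFor(String route) {
        return perRoute.computeIfAbsent(route, r -> new Semaphore(defaultMaxPerRoute));
    }

    /** Leases a slot without blocking; succeeds only if both limits allow it. */
    boolean tryLease(String route) {
        Semaphore routePermits = permitsFor(route);
        if (!routePermits.tryAcquire()) {
            return false; // per-route limit reached
        }
        if (!total.tryAcquire()) {
            routePermits.release(); // undo the route permit
            return false; // total limit reached
        }
        return true;
    }

    /** Returns a previously leased slot to both counters. */
    void release(String route) {
        permitsFor(route).release();
        total.release();
    }

    public static void main(String[] args) {
        // Same defaults as quoted in the text: 20 in total, 2 per route.
        RouteLimitSketch pool = new RouteLimitSketch(20, 2);
        System.out.println(pool.tryLease("host-a")); // first lease on the route
        System.out.println(pool.tryLease("host-a")); // second lease on the route
        System.out.println(pool.tryLease("host-a")); // third lease exceeds the route limit
    }
}
```

With the default of 2 connections per route, a third concurrent request to the same host has to wait no matter how much of the total budget is free, which is why the per-route limit is usually the first one worth raising.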
2.3.4. Closing the Connection Manager
When an HttpClient instance is no longer needed or is about to go out of scope, shut down its connection manager so that all connections it keeps alive are closed and the system resources they hold are released.
    CloseableHttpClient httpClient = <...>
    httpClient.close();
2.4. Multi-threaded request execution
When equipped with a pooling connection manager such as PoolingHttpClientConnectionManager, HttpClient can execute requests from multiple threads at the same time.
PoolingHttpClientConnectionManager allocates connections according to its configuration. If all connections for a given route are already leased, a request for a connection blocks until a connection is released back to the pool. To keep the connection manager from blocking indefinitely, set http.conn-manager.timeout to a positive value. If no connection becomes available within that period, a ConnectionPoolTimeoutException is thrown.
    PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
    CloseableHttpClient httpClient = HttpClients.custom()
            .setConnectionManager(cm)
            .build();

    // URIs to perform GETs on
    String[] urisToGet = {
        "http://www.domain1.com/",
        "http://www.domain2.com/",
        "http://www.domain3.com/",
        "http://www.domain4.com/"
    };

    // Create one thread per URI; GetThread is a custom class
    GetThread[] threads = new GetThread[urisToGet.length];
    for (int i = 0; i < threads.length; i++) {
        HttpGet httpget = new HttpGet(urisToGet[i]);
        threads[i] = new GetThread(httpClient, httpget);
    }

    // Start the threads
    for (int j = 0; j < threads.length; j++) {
        threads[j].start();
    }

    // Join the threads
    for (int j = 0; j < threads.length; j++) {
        threads[j].join();
    }
Although HttpClient instances are thread-safe and can be shared between multiple threads, it is still recommended that each thread maintain its own dedicated HttpContext instance.
The following is the definition of the GetThread class:
    static class GetThread extends Thread {

        private final CloseableHttpClient httpClient;
        private final HttpContext context;
        private final HttpGet httpget;

        public GetThread(CloseableHttpClient httpClient, HttpGet httpget) {
            this.httpClient = httpClient;
            // Each thread gets its own context
            this.context = HttpClientContext.create();
            this.httpget = httpget;
        }

        @Override
        public void run() {
            try {
                CloseableHttpResponse response = httpClient.execute(httpget, context);
                try {
                    HttpEntity entity = response.getEntity();
                } finally {
                    response.close();
                }
            } catch (ClientProtocolException ex) {
                // Handle protocol errors
            } catch (IOException ex) {
                // Handle I/O errors
            }
        }
    }
2.5. Connection Eviction Policy
One of the main drawbacks of the classic blocking I/O model is that a socket can react to I/O events only while it is blocked in an I/O operation. When a connection is released back to the manager, it can be kept alive, but it can no longer monitor the status of the socket or react to I/O events. If the connection is closed on the server side, the client-side connection cannot detect the change in state and so cannot react by closing the socket on its end.
HttpClient tries to mitigate this problem by checking, before a connection is used to execute a request, whether the connection is stale, that is, no longer valid because it was closed on the server side. The stale connection check is not 100% reliable and adds 10 to 30 milliseconds of overhead to each request. The only feasible solution that does not involve a one-thread-per-socket model for idle connections is a dedicated monitor thread that evicts connections considered expired after a long period of inactivity. The monitor thread can periodically call HttpClientConnectionManager.closeExpiredConnections() to close all expired connections and evict closed connections from the pool. It can also optionally call HttpClientConnectionManager.closeIdleConnections() to close all connections that have been idle for longer than a given period of time.
    public static class IdleConnectionMonitorThread extends Thread {

        private final HttpClientConnectionManager connMgr;
        private volatile boolean shutdown;

        public IdleConnectionMonitorThread(HttpClientConnectionManager connMgr) {
            super();
            this.connMgr = connMgr;
        }

        @Override
        public void run() {
            try {
                while (!shutdown) {
                    synchronized (this) {
                        wait(5000);
                        // Close expired connections
                        connMgr.closeExpiredConnections();
                        // Optionally, close connections that have been idle
                        // for longer than 30 seconds
                        connMgr.closeIdleConnections(30, TimeUnit.SECONDS);
                    }
                }
            } catch (InterruptedException ex) {
                // Terminate
            }
        }

        public void shutdown() {
            shutdown = true;
            synchronized (this) {
                notifyAll();
            }
        }
    }
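The bookkeeping behind "expired" and "idle" connections can be illustrated without HttpClient. The sketch below is a simplified model under stated assumptions: IdleEvictionSketch and its methods are hypothetical, connections are reduced to string ids, keep-alive values are assumed small enough not to overflow a long, and the caller passes the current time explicitly so the behavior is easy to follow.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

class IdleEvictionSketch {
    /** Per idle connection: [0] = release time, [1] = keep-alive deadline (ms). */
    private final Map<String, long[]> idle = new LinkedHashMap<String, long[]>();

    /** Records a connection that was handed back at nowMillis. */
    synchronized void release(String id, long nowMillis, long keepAliveMillis) {
        idle.put(id, new long[] { nowMillis, nowMillis + keepAliveMillis });
    }

    /** Drops connections whose keep-alive deadline has passed. */
    synchronized int closeExpired(long nowMillis) {
        int closed = 0;
        Iterator<long[]> it = idle.values().iterator();
        while (it.hasNext()) {
            if (it.next()[1] <= nowMillis) {
                it.remove();
                closed++;
            }
        }
        return closed;
    }

    /** Drops connections that have been idle for at least maxIdleMillis. */
    synchronized int closeIdle(long maxIdleMillis, long nowMillis) {
        int closed = 0;
        Iterator<long[]> it = idle.values().iterator();
        while (it.hasNext()) {
            if (it.next()[0] + maxIdleMillis <= nowMillis) {
                it.remove();
                closed++;
            }
        }
        return closed;
    }

    synchronized int size() {
        return idle.size();
    }

    public static void main(String[] args) {
        IdleEvictionSketch pool = new IdleEvictionSketch();
        pool.release("c1", 0L, 5000L);   // keep-alive of 5 s
        pool.release("c2", 0L, 60000L);  // keep-alive of 60 s
        System.out.println(pool.closeExpired(10000L));     // prints 1 (c1)
        System.out.println(pool.closeIdle(30000L, 40000L)); // prints 1 (c2)
    }
}
```

The two checks are independent: the first honours the deadline attached to each connection, while the second applies one idle threshold to every pooled connection, which mirrors the two calls made by the monitor thread.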
2.6. Connection Keep-Alive Strategy
The HTTP specification does not specify how long a persistent connection may, or should, be kept alive. Some HTTP servers use the non-standard Keep-Alive header to tell clients how many seconds they intend to keep the connection open on the server side. HttpClient uses this information when it is available. If the response contains no Keep-Alive header, HttpClient assumes the connection can be kept alive indefinitely. However, many HTTP servers are configured to drop persistent connections after a certain period of inactivity in order to conserve system resources, often without informing the client. If the default strategy turns out to be too optimistic, a custom keep-alive strategy can be provided.
    ConnectionKeepAliveStrategy myStrategy = new ConnectionKeepAliveStrategy() {

        public long getKeepAliveDuration(HttpResponse response, HttpContext context) {
            // Honor 'keep-alive' header
            HeaderElementIterator it = new BasicHeaderElementIterator(
                    response.headerIterator(HTTP.CONN_KEEP_ALIVE));
            while (it.hasNext()) {
                HeaderElement he = it.nextElement();
                String param = he.getName();
                String value = he.getValue();
                if (value != null && param.equalsIgnoreCase("timeout")) {
                    try {
                        return Long.parseLong(value) * 1000;
                    } catch (NumberFormatException ignore) {
                    }
                }
            }
            HttpHost target = (HttpHost) context.getAttribute(
                    HttpClientContext.HTTP_TARGET_HOST);
            if ("www.naughty-server.com".equalsIgnoreCase(target.getHostName())) {
                // Keep alive for 5 seconds only
                return 5 * 1000;
            } else {
                // Otherwise keep alive for 30 seconds
                return 30 * 1000;
            }
        }

    };
    CloseableHttpClient client = HttpClients.custom()
            .setKeepAliveStrategy(myStrategy)
            .build();
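To see in isolation what the strategy above extracts from the header, here is a JDK-only sketch that parses a Keep-Alive header value such as "timeout=5, max=100". The class KeepAliveSketch and the method keepAliveMillis are hypothetical names, and the parsing is deliberately simpler than HttpClient's header element parser (for example, it does not handle quoted values).

```java
class KeepAliveSketch {

    /**
     * Returns the keep-alive duration in milliseconds taken from the "timeout"
     * parameter of a Keep-Alive header value, or fallbackMillis when the
     * parameter is missing or not a number.
     */
    static long keepAliveMillis(String headerValue, long fallbackMillis) {
        if (headerValue == null) {
            return fallbackMillis;
        }
        for (String element : headerValue.split(",")) {
            int eq = element.indexOf('=');
            if (eq < 0) {
                continue; // not a name=value pair
            }
            String name = element.substring(0, eq).trim();
            String value = element.substring(eq + 1).trim();
            if (name.equalsIgnoreCase("timeout")) {
                try {
                    // The header carries seconds; callers expect milliseconds.
                    return Long.parseLong(value) * 1000L;
                } catch (NumberFormatException ignore) {
                    // Malformed value: keep scanning, then fall back.
                }
            }
        }
        return fallbackMillis;
    }

    public static void main(String[] args) {
        System.out.println(keepAliveMillis("timeout=5, max=100", 30000L)); // prints 5000
        System.out.println(keepAliveMillis("max=100", 30000L));            // prints 30000
    }
}
```

The fallback argument plays the role of the per-host defaults in the strategy: it is used only when the server gives no usable hint.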
2.7. Connection Socket Factories
HTTP connections use a java.net.Socket object internally to transfer data. They rely on the ConnectionSocketFactory interface to create, initialize, and connect sockets. This lets users of HttpClient supply application-specific socket initialization code at runtime. PlainConnectionSocketFactory is the default factory for creating and initializing plain (unencrypted) sockets.
Creating a socket and connecting it to the target host are two separate steps, so the socket can be closed while it is blocked in the connect operation.
    HttpClientContext clientContext = HttpClientContext.create();
    PlainConnectionSocketFactory sf = PlainConnectionSocketFactory.getSocketFactory();
    Socket socket = sf.createSocket(clientContext);
    int timeout = 1000; // ms
    HttpHost target = new HttpHost("www.yeetrack.com");
    InetSocketAddress remoteAddress = new InetSocketAddress(
            InetAddress.getByName("www.yeetrack.com"), 80);
    // In the connectSocket source, the target parameter is not actually used
    sf.connectSocket(timeout, socket, target, remoteAddress, null, clientContext);
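The same two-step pattern exists in the plain JDK socket API, which may make the separation easier to see. The sketch below is an assumption-laden illustration rather than HttpClient code: SocketConnectSketch and its methods are hypothetical, and it connects to a throwaway listener on the loopback interface so that it does not depend on any external host.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

class SocketConnectSketch {

    /** Step 1: create an unconnected socket; options could be set here. */
    static Socket createSocket() {
        return new Socket();
    }

    /** Step 2: connect the existing socket, bounded by a timeout in ms. */
    static void connect(Socket socket, InetSocketAddress address, int timeoutMillis) {
        try {
            socket.connect(address, timeoutMillis);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    /** Runs both steps against a local listener and reports what happened. */
    static boolean demo() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket socket = createSocket()) {
            boolean unconnectedAtFirst = !socket.isConnected();
            connect(socket,
                    new InetSocketAddress(server.getInetAddress(), server.getLocalPort()),
                    1000);
            return unconnectedAtFirst && socket.isConnected();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because the Socket object exists before connect is called, another thread that holds a reference to it can call close() to abort a connect attempt that is taking too long.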
2.7.1. Secure Socket Layering
LayeredConnectionSocketFactory is an extension of the ConnectionSocketFactory interface. A layered socket factory can create a socket layered over an existing plain socket. Socket layering is used primarily to create secure sockets through proxies. HttpClient ships with SSLConnectionSocketFactory, which implements SSL/TLS layering. Note that HttpClient does not use any custom encryption functionality. It relies entirely on the standard Java Cryptography Extension (JCE) and Java Secure Socket Extension (JSSE).
2.7.2. Integration with the Connection Manager
A custom connection socket factory can be associated with a particular protocol scheme, such as HTTP or HTTPS, and used to create a custom connection manager.
    ConnectionSocketFactory plainsf = <...>
    LayeredConnectionSocketFactory sslsf = <...>
    Registry<ConnectionSocketFactory> r = RegistryBuilder.<ConnectionSocketFactory>create()
            .register("http", plainsf)
            .register("https", sslsf)
            .build();

    HttpClientConnectionManager cm = new PoolingHttpClientConnectionManager(r);
    HttpClients.custom()
            .setConnectionManager(cm)
            .build();
2.7.3. SSL/TLS Customization
HttpClient uses SSLConnectionSocketFactory to create SSL connections. SSLConnectionSocketFactory allows a high degree of customization. It can take an instance of javax.net.ssl.SSLContext as a parameter and use it to create custom-configured SSL connections.
    HttpClientContext clientContext = HttpClientContext.create();
    KeyStore myTrustStore = <...>
    SSLContext sslContext = SSLContexts.custom()
            .useTLS()
            .loadTrustMaterial(myTrustStore)
            .build();
    SSLConnectionSocketFactory sslsf = new SSLConnectionSocketFactory(sslContext);
2.7.4. Domain Validation
In addition to the trust verification and client authentication performed at the SSL/TLS protocol level, HttpClient can optionally verify, once the connection has been established, that the target hostname matches the names stored in the server's X.509 certificate. This verification can provide additional guarantees about the authenticity of the server's trust material. The X509HostnameVerifier interface represents a strategy for hostname verification. HttpClient ships with three X509HostnameVerifier implementations. Important: hostname verification should not be confused with SSL trust verification.
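As a rough picture of what comparing a hostname with a certificate name involves, the sketch below matches a host against a single name that may start with a wildcard label. It is deliberately simplified and is not one of HttpClient's verifiers: HostnameMatchSketch and matches are hypothetical names, the wildcard is assumed to cover exactly one label, and real verification has more rules and should be left to the library.

```java
import java.util.Locale;

class HostnameMatchSketch {

    /**
     * Returns true if host equals certName (ignoring case), or if certName has
     * the form "*.suffix" and host is exactly one label followed by ".suffix".
     */
    static boolean matches(String host, String certName) {
        String h = host.toLowerCase(Locale.ROOT);
        String c = certName.toLowerCase(Locale.ROOT);
        if (!c.startsWith("*.")) {
            return h.equals(c);
        }
        String suffix = c.substring(1); // for example ".example.com"
        if (!h.endsWith(suffix)) {
            return false;
        }
        String label = h.substring(0, h.length() - suffix.length());
        // The wildcard must stand for one non-empty label without dots.
        return !label.isEmpty() && label.indexOf('.') < 0;
    }

    public static void main(String[] args) {
        System.out.println(matches("www.example.com", "*.example.com")); // prints true
        System.out.println(matches("a.b.example.com", "*.example.com")); // prints false
    }
}
```

This check says nothing about whether the certificate itself is trusted, which is the separate trust verification step mentioned above.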
2.8. HttpClient Proxy Server Configuration
Even though HttpClient is aware of complex routing schemes and proxy chaining, out of the box it supports only simple direct or one-hop proxy connections.
The simplest way to use a proxy server is to specify a default proxy parameter.
    HttpHost proxy = new HttpHost("someproxy", 8080);
    DefaultProxyRoutePlanner routePlanner = new DefaultProxyRoutePlanner(proxy);
    CloseableHttpClient httpclient = HttpClients.custom()
            .setRoutePlanner(routePlanner)
            .build();
HttpClient can also be instructed to use the standard JRE proxy selector to obtain proxy information.
    SystemDefaultRoutePlanner routePlanner = new SystemDefaultRoutePlanner(
            ProxySelector.getDefault());
    CloseableHttpClient httpclient = HttpClients.custom()
            .setRoutePlanner(routePlanner)
            .build();
Alternatively, a custom HttpRoutePlanner implementation gives complete control over the process of HTTP route computation.
    HttpRoutePlanner routePlanner = new HttpRoutePlanner() {

        public HttpRoute determineRoute(
                HttpHost target,
                HttpRequest request,
                HttpContext context) throws HttpException {
            // Always route through the same proxy; mark the route as secure for https targets
            return new HttpRoute(target, null, new HttpHost("someproxy", 8080),
                    "https".equalsIgnoreCase(target.getSchemeName()));
        }

    };
    CloseableHttpClient httpclient = HttpClients.custom()
            .setRoutePlanner(routePlanner)
            .build();
http://www.yeetrack.com/?p=782
HttpClient4.3 Tutorial Chapter II Connection Management