1. Background
We have a service that calls an HTTP-based service owned by other departments, with a large daily call volume. The calls were implemented with HttpClient. Because QPS would not go up, I looked at the business code, made some optimizations, and recorded them here.
Comparing before and after: the average execution time dropped from 250ms to 80ms, cutting the cost by two-thirds, and the container no longer raises thread-exhaustion alarms. Refreshing!
2. Analysis
The original implementation was rough: every request initialized a new HttpClient, built an HttpPost object, executed it, extracted the entity from the returned result, saved it as a string, and finally explicitly closed the response and the client. Let's analyze and optimize step by step:
2.1 Overhead of repeatedly creating HttpClient
HttpClient is a thread-safe class; there is no need for each thread to create a new instance on every use. A single instance can be kept globally.
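A minimal sketch of what "one client, kept globally" can look like (the class and method names here are ours, not the project's):

```java
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

public final class HttpClientHolder {
    // One thread-safe client for the whole JVM, created exactly once.
    private static final CloseableHttpClient CLIENT = HttpClients.createDefault();

    private HttpClientHolder() {
        // no instances; this is a static holder
    }

    public static CloseableHttpClient get() {
        return CLIENT;
    }
}
```

Every caller then uses HttpClientHolder.get() instead of building a new client per request.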
2.2 Overhead of repeatedly creating TCP connections
TCP's three-way handshake and four-way teardown are an expensive ritual, and for high-frequency requests the cost is too large. Imagine each request spending 5ms on connection negotiation: at a QPS of 100, a single system would spend 500ms of every second just shaking hands and waving goodbye. We programmers cannot afford that kind of waste, so switch to keep-alive and reuse connections!
2.3 Overhead of repeating cache entity
In the original logic, the following code was used:
HttpEntity entity = response.getEntity();
String content = EntityUtils.toString(entity);
Here we effectively make an extra copy of the content into a string, while the original HttpResponse still holds the content and has to be consumed. Under high concurrency with large payloads, this wastes a lot of memory. On top of that, we have to explicitly close the connection, which is ugly.
3. Implementation
Based on the above analysis, we mainly do three things: first, use a single shared client; second, cache keep-alive connections; third, handle the returned result better. Without further ado, let's get to it.
Speaking of connection caching, it is natural to think of a database connection pool. HttpClient 4 provides PoolingHttpClientConnectionManager as a connection pool. We optimize with the following steps:
3.1 Define a keep-alive strategy
Regarding keep-alive, this article does not explain it in depth; just one point: whether to use keep-alive depends on the business situation, it is not a panacea. Also, there are many stories between keep-alive and TIME_WAIT/CLOSE_WAIT.
In this business scenario, a few fixed clients access the server at high frequency over a long period, so enabling keep-alive is a very good fit.
One more note: HTTP keep-alive and TCP keepalive are not the same thing. Back to the topic, define a strategy as follows:
ConnectionKeepAliveStrategy myStrategy = new ConnectionKeepAliveStrategy() {
    @Override
    public long getKeepAliveDuration(HttpResponse response, HttpContext context) {
        HeaderElementIterator it = new BasicHeaderElementIterator(
                response.headerIterator(HTTP.CONN_KEEP_ALIVE));
        while (it.hasNext()) {
            HeaderElement he = it.nextElement();
            String param = he.getName();
            String value = he.getValue();
            if (value != null && param.equalsIgnoreCase("timeout")) {
                return Long.parseLong(value) * 1000;
            }
        }
        return 60 * 1000; // if the server does not specify, default to 60s
    }
};
3.2 Configure a PoolingHttpClientConnectionManager
PoolingHttpClientConnectionManager connectionManager = new PoolingHttpClientConnectionManager();
connectionManager.setMaxTotal(200);          // total pool size (the original value was garbled here; tune for your workload)
connectionManager.setDefaultMaxPerRoute(50); // e.g. up to 50 concurrent connections per route by default, depending on the business
You can also set the maximum number of concurrent connections for individual routes.
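For instance, a per-route override can be sketched as follows (the host and numbers are placeholders, not values from the original project):

```java
import org.apache.http.HttpHost;
import org.apache.http.conn.routing.HttpRoute;
import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

public class PerRouteConfig {
    public static void main(String[] args) {
        PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
        cm.setMaxTotal(200);            // upper bound across all routes
        cm.setDefaultMaxPerRoute(50);   // default cap per route

        // Raise the limit for one particularly busy route only:
        HttpRoute busyRoute = new HttpRoute(new HttpHost("api.example.com", 80));
        cm.setMaxPerRoute(busyRoute, 100);

        System.out.println(cm.getMaxPerRoute(busyRoute));
    }
}
```

This way an important downstream service can get more concurrency without raising the limit for every route.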
3.3 Generate the HttpClient
CloseableHttpClient httpClient = HttpClients.custom()
        .setConnectionManager(connectionManager)
        .setKeepAliveStrategy(myStrategy)
        .setDefaultRequestConfig(RequestConfig.custom()
                .setStaleConnectionCheckEnabled(true)
                .build())
        .build();
Note: using the setStaleConnectionCheckEnabled method to evict connections that have already been closed is not recommended. A better approach is to start a dedicated thread that periodically runs the closeExpiredConnections and closeIdleConnections methods, as shown below.
public static class IdleConnectionMonitorThread extends Thread {

    private final HttpClientConnectionManager connMgr;
    private volatile boolean shutdown;

    public IdleConnectionMonitorThread(HttpClientConnectionManager connMgr) {
        super();
        this.connMgr = connMgr;
    }

    @Override
    public void run() {
        try {
            while (!shutdown) {
                synchronized (this) {
                    wait(5000);
                    // Close expired connections
                    connMgr.closeExpiredConnections();
                    // Optionally, close connections that have been
                    // idle longer than 30 seconds
                    connMgr.closeIdleConnections(30, TimeUnit.SECONDS);
                }
            }
        } catch (InterruptedException ex) {
            // terminate
        }
    }

    public void shutdown() {
        shutdown = true;
        synchronized (this) {
            notifyAll();
        }
    }
}
3.4 Reduce overhead when executing requests with HttpClient
It is important to note: do not close the connection after each request, so that it can be reused.
A viable way to get the content is to copy it out of the entity, similar to this:
String res = EntityUtils.toString(response.getEntity(), "UTF-8");
EntityUtils.consume(response.getEntity());
However, the more recommended way is to define a ResponseHandler, which saves us from catching exceptions and closing the stream ourselves. Let's look at the relevant source code:
public <T> T execute(final HttpHost target, final HttpRequest request,
        final ResponseHandler<? extends T> responseHandler,
        final HttpContext context) throws IOException, ClientProtocolException {
    Args.notNull(responseHandler, "Response handler");

    final HttpResponse response = execute(target, request, context);

    final T result;
    try {
        result = responseHandler.handleResponse(response);
    } catch (final Exception t) {
        final HttpEntity entity = response.getEntity();
        try {
            EntityUtils.consume(entity);
        } catch (final Exception t2) {
            // Log this exception. The original exception is more
            // important and will be thrown to the caller.
            this.log.warn("Error consuming content after an exception.", t2);
        }
        if (t instanceof RuntimeException) {
            throw (RuntimeException) t;
        }
        if (t instanceof IOException) {
            throw (IOException) t;
        }
        throw new UndeclaredThrowableException(t);
    }

    // Handling the response was successful. Ensure the content has
    // been fully consumed.
    final HttpEntity entity = response.getEntity();
    EntityUtils.consume(entity); // <-- look here
    return result;
}
As you can see, when we call the execute method with a ResponseHandler, the consume method is eventually called automatically. The consume method looks like this:
public static void consume(final HttpEntity entity) throws IOException {
    if (entity == null) {
        return;
    }
    if (entity.isStreaming()) {
        final InputStream instream = entity.getContent();
        if (instream != null) {
            instream.close();
        }
    }
}
You can see that it ultimately closes the input stream for us.
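To make the ResponseHandler idea concrete, here is a hedged, network-free sketch using the library's built-in BasicResponseHandler; the fake response we build locally is our own illustration, not part of the original article:

```java
import org.apache.http.HttpVersion;
import org.apache.http.client.ResponseHandler;
import org.apache.http.entity.StringEntity;
import org.apache.http.impl.client.BasicResponseHandler;
import org.apache.http.message.BasicHttpResponse;

public class HandlerDemo {
    public static void main(String[] args) throws Exception {
        // Build a fake 200 OK response locally so the demo needs no network.
        BasicHttpResponse response =
                new BasicHttpResponse(HttpVersion.HTTP_1_1, 200, "OK");
        response.setEntity(new StringEntity("hello"));

        // BasicResponseHandler returns the body as a String for 2xx responses
        // and throws for error statuses; the entity is consumed for us.
        ResponseHandler<String> handler = new BasicResponseHandler();
        String body = handler.handleResponse(response);
        System.out.println(body); // prints "hello"
    }
}
```

In real code you would pass the handler to httpClient.execute(request, handler) and let execute() do the consuming, as the source above shows.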
4. Other
Through the steps above, we basically have an HttpClient that supports high concurrency. Below are some additional configuration notes and reminders:
4.1 Some timeout configurations for HttpClient
CONNECTION_TIMEOUT is the connection timeout and SO_TIMEOUT is the socket timeout; they are different things. The connection timeout is the wait time for establishing the connection, while the socket timeout is the wait time for data.
HttpParams params = new BasicHttpParams();
// Connection timeout: set to 2 seconds here; tune per business
Integer CONNECTION_TIMEOUT = 2 * 1000;
// Socket (wait-for-data) timeout: set to 2 seconds here; tune per business
Integer SO_TIMEOUT = 2 * 1000;
// Timeout in milliseconds used when retrieving a ManagedClientConnection
// from the ClientConnectionManager. This parameter expects a java.lang.Long.
// If not set, it defaults to CONNECTION_TIMEOUT, so be sure to set it.
// (In HttpClient 4.2.3 I remember it was changed to an Object, which caused
// an error when passing a Long directly; it was later changed back.)
Long CONN_MANAGER_TIMEOUT = 500L;

params.setIntParameter(CoreConnectionPNames.CONNECTION_TIMEOUT, CONNECTION_TIMEOUT);
params.setIntParameter(CoreConnectionPNames.SO_TIMEOUT, SO_TIMEOUT);
params.setLongParameter(ClientPNames.CONN_MANAGER_TIMEOUT, CONN_MANAGER_TIMEOUT);
// Test whether a connection is usable before submitting a request
params.setBooleanParameter(CoreConnectionPNames.STALE_CONNECTION_CHECK, true);

// Also set the retry count of the HTTP client (3 by default); disabled here
// (disabling retries is reasonable when call volume allows it)
httpClient.setHttpRequestRetryHandler(new DefaultHttpRequestRetryHandler(0, false));
4.2 If Nginx sits in between, Nginx must also enable keep-alive on both ends
Nowadays, running without Nginx in front is rather rare. By default, Nginx keeps long connections open on the client side but uses short connections on the server (upstream) side. Pay attention to the keepalive_timeout and keepalive_requests parameters on the client side, and the keepalive parameter in the upstream block; the meaning of these three parameters is not covered here.
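For orientation, a minimal sketch of where those three parameters live in an nginx config (the upstream address and all values are placeholders; tune them for your own traffic):

```nginx
http {
    upstream backend {
        server 10.0.0.1:8080;
        keepalive 50;                   # idle keep-alive connections held to the upstream
    }

    server {
        keepalive_timeout  65s;         # client-side keep-alive timeout
        keepalive_requests 1000;        # requests allowed per client connection

        location / {
            proxy_pass http://backend;
            proxy_http_version 1.1;         # HTTP/1.1 is required for upstream keep-alive
            proxy_set_header Connection ""; # clear the default "Connection: close"
        }
    }
}
```

Without proxy_http_version 1.1 and the cleared Connection header, the upstream keepalive directive has no effect.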
Those are all my settings. With them, the original 250ms per request dropped to about 80ms, a significant improvement.
Reprinted from: Norman Bai, "Optimal use of HttpClient in high-concurrency scenarios"