HttpClient 4.3 tutorial, chapter 6: HTTP caching
Posted on October 28, 2013
6.1. Basic concepts
HttpClient's caching module provides an HTTP/1.1-compliant caching layer for HttpClient, the Java equivalent of a browser cache. The implementation follows the Chain of Responsibility design pattern. The default HttpClient does not cache, and the caching HttpClient can be used as a drop-in replacement for it. When caching is enabled, requests that can be satisfied entirely from the cache are answered from the cache and never reach the target server. Stale cache entries are automatically revalidated with the origin server where possible, using conditional GET requests that carry the If-Modified-Since and/or If-None-Match request headers.
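To make the revalidation step more concrete, here is a minimal sketch of how a cache could derive conditional headers from the validators it stored with an entry. ConditionalHeaders and forStaleEntry are names made up for this illustration and are not HttpClient classes; only the two header names come from HTTP/1.1.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of HttpClient.
final class ConditionalHeaders {
    // Builds the validator headers for a stale entry. Either argument may be
    // null when the origin server did not send that validator.
    static Map<String, String> forStaleEntry(String etag, String lastModified) {
        Map<String, String> headers = new LinkedHashMap<>();
        if (etag != null) {
            headers.put("If-None-Match", etag);
        }
        if (lastModified != null) {
            headers.put("If-Modified-Since", lastModified);
        }
        return headers;
    }
}
```

If the server answers such a request with 304 Not Modified, the stored entry can be reused without transferring the body again.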
HTTP/1.1 caching is designed to be semantically transparent: a cache should not change the meaning of the request/response exchange between client and server. It should therefore be safe to drop a caching HttpClient into an existing compliant client-server relationship. Although the caching module is part of the client from the HTTP protocol's point of view, the implementation aims to be compatible with the requirements placed on a transparent caching proxy.
Finally, the caching HttpClient also supports the Cache-Control extensions defined in RFC 5861 (stale-if-error and stale-while-revalidate).
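The two RFC 5861 directives each define a time window after an entry has become stale. The sketch below shows one way to evaluate those windows from a Cache-Control value. StaleDirectives and its methods are hypothetical names, and the parsing is deliberately simplified (it only understands name=seconds pairs); it is not how HttpClient implements this internally.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper illustrating RFC 5861 windows; not HttpClient code.
final class StaleDirectives {
    // Collects "name=seconds" directives from a Cache-Control header value.
    static Map<String, Long> parse(String cacheControl) {
        Map<String, Long> out = new HashMap<>();
        for (String part : cacheControl.split(",")) {
            String[] kv = part.trim().split("=", 2);
            if (kv.length != 2) {
                continue; // directive without a value, e.g. "public"
            }
            try {
                out.put(kv[0].trim().toLowerCase(Locale.ROOT), Long.parseLong(kv[1].trim()));
            } catch (NumberFormatException ignored) {
                // Skip directives whose value is not a plain number.
            }
        }
        return out;
    }

    // True when the entry is stale but may still be served while a
    // revalidation runs in the background.
    static boolean mayServeWhileRevalidating(String cacheControl, long ageSeconds) {
        return staleWithinWindow(cacheControl, ageSeconds, "stale-while-revalidate");
    }

    // True when the entry is stale but may still be served because the
    // origin server returned an error.
    static boolean mayServeOnError(String cacheControl, long ageSeconds) {
        return staleWithinWindow(cacheControl, ageSeconds, "stale-if-error");
    }

    private static boolean staleWithinWindow(String cacheControl, long ageSeconds, String directive) {
        Map<String, Long> directives = parse(cacheControl);
        Long maxAge = directives.get("max-age");
        Long window = directives.get(directive);
        if (maxAge == null || window == null) {
            return false;
        }
        long staleness = ageSeconds - maxAge; // seconds past the freshness lifetime
        return staleness > 0 && staleness <= window;
    }
}
```

Both methods return false for entries that are still fresh, since the question of serving stale content does not arise for them.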
When a caching HttpClient executes an HTTP request, it goes through the following steps:
- Check the request for basic compliance with the HTTP 1.1 protocol and attempt to correct it if it is not compliant.
- Flush any cache entries that would be invalidated by this request.
- Determine whether the request can be served from the cache. If not, pass the request directly to the origin server, return the response, and cache it if appropriate.
- If the request can be served from the cache, attempt to read the entry from the cache. If it is not in the cache, send the request to the origin server and cache the response if appropriate.
- If the cached response is suitable to be returned, construct a BasicHttpResponse containing a ByteArrayEntity and return it. Otherwise, attempt to revalidate the cache entry against the origin server.
- If a cached response cannot be revalidated, request the data from the origin server again and cache the response if appropriate.
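The request-side steps above can be pictured with a small, self-contained model. Everything in it is an assumption made for illustration: the RequestFlowSketch class, its Origin interface and the Source enum do not exist in HttpClient, bodies are plain strings, the protocol compliance check is left out, and only GET is treated as servable from the cache.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified, hypothetical model of the request-side flow; not HttpClient code.
final class RequestFlowSketch {
    enum Source { CACHE, REVALIDATED, ORIGIN }

    // Stand-in for the origin server.
    interface Origin {
        String fetch(String uri);
        boolean notModified(String uri, String cachedBody);
    }

    static final class Entry {
        final String body;
        boolean fresh;
        Entry(String body, boolean fresh) {
            this.body = body;
            this.fresh = fresh;
        }
    }

    final Map<String, Entry> cache = new HashMap<>();
    private final Origin origin;
    String lastBody; // body returned by the most recent execute() call

    RequestFlowSketch(Origin origin) {
        this.origin = origin;
    }

    Source execute(String method, String uri) {
        if (!"GET".equals(method)) {
            cache.remove(uri);            // flush entries invalidated by this request
            lastBody = origin.fetch(uri); // not servable from cache: pass through
            return Source.ORIGIN;
        }
        Entry entry = cache.get(uri);
        if (entry == null) {
            return fetchAndStore(uri);    // nothing cached yet
        }
        if (entry.fresh) {
            lastBody = entry.body;        // suitable cached response
            return Source.CACHE;
        }
        if (origin.notModified(uri, entry.body)) {
            entry.fresh = true;           // revalidation succeeded
            lastBody = entry.body;
            return Source.REVALIDATED;
        }
        return fetchAndStore(uri);        // revalidation failed: fetch again
    }

    private Source fetchAndStore(String uri) {
        lastBody = origin.fetch(uri);
        cache.put(uri, new Entry(lastBody, true));
        return Source.ORIGIN;
    }
}
```

The three Source values roughly correspond to a cache hit, a validated entry and a cache miss.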
When the caching HttpClient receives a response from the server, it goes through the following steps:
- Check the response for protocol compliance.
- Determine whether the response is cacheable.
- If the response is cacheable, attempt to read it up to the maximum size allowed by the configuration and store it in the cache.
- If the response is too large for the cache, reconstruct the partially consumed response and return it directly without caching it.
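The last two steps, reading up to a size limit and rebuilding the response when the limit is exceeded, can be sketched with plain java.io streams. BoundedBodyReader and Outcome are hypothetical names for this example, and the approach shown (buffer, then prepend the buffer to the unread remainder) is one possible technique rather than a description of HttpClient's internals.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;

// Hypothetical sketch; not part of HttpClient.
final class BoundedBodyReader {
    static final class Outcome {
        final byte[] cacheableBody; // non-null only when the whole body fit
        final InputStream body;     // complete body to hand back to the caller
        Outcome(byte[] cacheableBody, InputStream body) {
            this.cacheableBody = cacheableBody;
            this.body = body;
        }
    }

    // Reads at most maxBytes + 1 bytes; the extra byte reveals whether the
    // body is larger than the limit.
    static Outcome read(InputStream in, int maxBytes) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[1024];
        try {
            while (buffer.size() <= maxBytes) {
                int want = Math.min(chunk.length, maxBytes + 1 - buffer.size());
                int n = in.read(chunk, 0, want);
                if (n == -1) {
                    byte[] all = buffer.toByteArray(); // body fits: cacheable
                    return new Outcome(all, new ByteArrayInputStream(all));
                }
                buffer.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Too large: stitch the consumed prefix back in front of the rest.
        InputStream rebuilt = new SequenceInputStream(
                new ByteArrayInputStream(buffer.toByteArray()), in);
        return new Outcome(null, rebuilt);
    }

    // Reads a stream to the end; used here only to inspect the result.
    static byte[] drain(InputStream in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[1024];
        try {
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

In both cases the caller receives the complete body; the only difference is whether a copy is available for storing in the cache.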
It is important to note that the caching HttpClient is not a separate implementation of HttpClient; it works by inserting itself as an additional processing component into the request execution pipeline.
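The idea of inserting a component into the execution pipeline can be shown with a tiny Chain of Responsibility sketch. RequestHandler, OriginHandler and CachingHandler are invented for this example and only mirror the general shape of the pattern; HttpClient's real pipeline types are different and considerably richer.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical pipeline element: handles a request or delegates it.
interface RequestHandler {
    String handle(String uri);
}

// Last link in the chain: pretends to contact the origin server.
final class OriginHandler implements RequestHandler {
    int calls; // counts how often the "server" was contacted

    @Override
    public String handle(String uri) {
        calls++;
        return "response for " + uri;
    }
}

// Extra link placed in front of the origin handler. Callers still see a
// RequestHandler, so the chain can be extended without changing them.
final class CachingHandler implements RequestHandler {
    private final RequestHandler next;
    private final Map<String, String> cache = new HashMap<>();

    CachingHandler(RequestHandler next) {
        this.next = next;
    }

    @Override
    public String handle(String uri) {
        String cached = cache.get(uri);
        if (cached != null) {
            return cached;                 // answered without going downstream
        }
        String response = next.handle(uri); // delegate to the next link
        cache.put(uri, response);
        return response;
    }
}
```

Wrapping an OriginHandler in a CachingHandler leaves the calling code unchanged, which is the property that makes the caching client a drop-in replacement.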
6.2. RFC-2616 Compliance
HttpClient's caching module is unconditionally compliant with RFC-2616. That is, wherever the specification states MUST, MUST NOT, SHOULD, or SHOULD NOT for HTTP caches, the caching layer attempts to behave in a way that satisfies those requirements. This means the caching module will not produce incorrect behavior when you drop it in.
6.3. Usage examples
The following example shows how to create a basic caching HttpClient, configured to store at most 1000 cached objects, each with a maximum body size of 8192 bytes. The numbers in the code are for demonstration only and are not a recommended configuration.
import org.apache.http.client.cache.CacheResponseStatus;
import org.apache.http.client.cache.HttpCacheContext;
import org.apache.http.client.config.RequestConfig;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.cache.CacheConfig;
import org.apache.http.impl.client.cache.CachingHttpClients;

// Cache limits: at most 1000 entries, each body at most 8192 bytes.
CacheConfig cacheConfig = CacheConfig.custom()
        .setMaxCacheEntries(1000)
        .setMaxObjectSize(8192)
        .build();
RequestConfig requestConfig = RequestConfig.custom()
        .setConnectTimeout(30000)
        .setSocketTimeout(30000)
        .build();
CloseableHttpClient cachingClient = CachingHttpClients.custom()
        .setCacheConfig(cacheConfig)
        .setDefaultRequestConfig(requestConfig)
        .build();

// The cache context records how the response was produced.
HttpCacheContext context = HttpCacheContext.create();
HttpGet httpget = new HttpGet("http://www.mydomain.com/content/");
CloseableHttpResponse response = cachingClient.execute(httpget, context);
try {
    CacheResponseStatus responseStatus = context.getCacheResponseStatus();
    switch (responseStatus) {
        case CACHE_HIT:
            System.out.println("A response was generated from the cache with " +
                    "no requests sent upstream");
            break;
        case CACHE_MODULE_RESPONSE:
            System.out.println("The response was generated directly by the " +
                    "caching module");
            break;
        case CACHE_MISS:
            System.out.println("The response came from an upstream server");
            break;
        case VALIDATED:
            System.out.println("The response was generated from the cache " +
                    "after validating the entry with the origin server");
            break;
    }
} finally {
    response.close();
}
6.4. Configuration
The caching HttpClient inherits all configuration options and parameters of the default non-caching implementation (including options such as timeouts and connection pool size). To configure the cache itself, you can supply a CacheConfig object that customizes the following parameters:
Cache size
If the backing storage supports these limits, you can specify the maximum number of cache entries and the maximum size of a response body that can be stored in each entry.
Public/private caching
By default, the caching module considers itself a shared (public) cache, so it will not, for example, cache responses to requests that carry an Authorization header or responses marked Cache-Control: private. However, if the cache is only used by one logical user (behaving like a browser cache), you may want to turn off the shared cache setting.
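The difference between the two modes can be sketched as a single decision function. SharedCachePolicy and mayStore are hypothetical names, and the check covers only the two rules mentioned above. It deliberately ignores every other cacheability rule of HTTP/1.1, including the exceptions that let a shared cache store authorized responses and the private="field" form of the directive.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical, heavily simplified policy check; not HttpClient code.
final class SharedCachePolicy {
    // Header maps are expected to use lower-case header names as keys.
    static boolean mayStore(boolean sharedCache,
                            Map<String, String> requestHeaders,
                            Map<String, String> responseHeaders) {
        if (!sharedCache) {
            return true; // a private cache is not bound by these two rules
        }
        if (requestHeaders.containsKey("authorization")) {
            return false; // response belongs to an authenticated user
        }
        String cacheControl = responseHeaders.getOrDefault("cache-control", "");
        for (String directive : cacheControl.split(",")) {
            if (directive.trim().toLowerCase(Locale.ROOT).equals("private")) {
                return false; // origin marked the response as single-user
            }
        }
        return true;
    }
}
```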
Heuristic caching
Per RFC 2616, a cache may store certain responses even if the server sets no explicit cache control headers. This feature is turned off by default in HttpClient; if you are working with an origin server that does not set proper headers but you still want to cache its responses, you need to turn it on. Once heuristic caching is enabled, you can use the default freshness lifetime or set a custom one. For more information on heuristic caching, see sections 13.2.2 and 13.2.4 of the HTTP/1.1 RFC.
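Section 13.2.4 of RFC 2616 suggests deriving a heuristic lifetime as a fraction of the time since the resource was last modified, with 10% given as a typical value. The following sketch computes such a lifetime. HeuristicFreshness and its parameters are assumptions for illustration (the fraction is expressed as an integer percentage to keep the arithmetic exact), not HttpClient's own implementation.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper showing the RFC 2616 section 13.2.4 heuristic.
final class HeuristicFreshness {
    // date:            value of the response's Date header
    // lastModified:    value of the Last-Modified header, or null if absent
    // percent:         fraction of the elapsed time to use, e.g. 10 for 10%
    // defaultLifetime: fallback when no usable Last-Modified is available
    static Duration lifetime(Instant date, Instant lastModified,
                             int percent, Duration defaultLifetime) {
        if (date == null || lastModified == null || !lastModified.isBefore(date)) {
            return defaultLifetime;
        }
        Duration sinceModified = Duration.between(lastModified, date);
        return sinceModified.multipliedBy(percent).dividedBy(100);
    }
}
```

For a resource last modified ten days before the response date, a 10% setting yields a freshness lifetime of one day.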
Background validation
The caching module supports the stale-while-revalidate directive of RFC 5861, which allows certain cache entry revalidations to happen in the background. You may want to adjust the minimum and maximum number of background worker threads, as well as the maximum time a thread can be idle before being reclaimed. You can also control the size of the queue used for revalidations when there are not enough workers to keep up with demand.
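The four knobs described here (minimum threads, maximum threads, idle time and queue size) map naturally onto a standard java.util.concurrent thread pool. The sketch below shows that mapping with a hypothetical RevalidationPool helper; it illustrates what the settings mean and is not a statement about how HttpClient builds its own worker pool.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; shows the meaning of the four settings only.
final class RevalidationPool {
    static ThreadPoolExecutor create(int minThreads, int maxThreads,
                                     long idleSeconds, int queueSize) {
        return new ThreadPoolExecutor(
                minThreads,                 // workers kept alive even when idle
                maxThreads,                 // upper bound on concurrent workers
                idleSeconds,                // idle time before extra workers are reclaimed
                TimeUnit.SECONDS,
                new ArrayBlockingQueue<Runnable>(queueSize)); // pending revalidations
    }
}
```

With a bounded queue, a ThreadPoolExecutor only starts workers beyond the minimum once the queue is full, and by default it rejects new tasks with a RejectedExecutionException when both the queue and the pool are exhausted, so the queue size determines how much backlog is tolerated.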
6.5. Storage backends
By default, the caching module stores cache entries and cached response bodies in memory, in the JVM of your application. While this offers high performance, it may not be appropriate if your application has memory constraints. And because such a cache is short-lived, its contents are lost when the application restarts. The current version of HttpClient also supports storing cache entries using Ehcache and memcached, which allows them to be placed on local disk or in other storage. If neither memory, local disk, nor external storage suits your application, you can provide your own storage backend by implementing the HttpCacheStorage interface and supplying it when you create the HttpClient. In this case, cache entries are stored using your scheme, but you still get to reuse all of the logic surrounding HTTP/1.1 compliance and cache handling. Generally speaking, it should be possible to build an HttpCacheStorage implementation on top of anything that supports a key/value store (similar to the Java Map interface) with the ability to apply atomic updates.
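The requirement of a key/value store with atomic updates can be illustrated with a ConcurrentHashMap. The EntryStore interface below is a hypothetical stand-in that stores plain strings; the real HttpCacheStorage interface works with HttpClient's own cache entry and callback types, which are replaced here to keep the sketch self-contained.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.UnaryOperator;

// Hypothetical storage contract, loosely shaped like a cache backend.
interface EntryStore {
    void put(String key, String entry);
    String get(String key);
    void remove(String key);
    // Applies the callback to the current value (null if absent) atomically.
    void update(String key, UnaryOperator<String> callback);
}

final class MapEntryStore implements EntryStore {
    private final ConcurrentMap<String, String> map = new ConcurrentHashMap<>();

    @Override
    public void put(String key, String entry) {
        map.put(key, entry);
    }

    @Override
    public String get(String key) {
        return map.get(key);
    }

    @Override
    public void remove(String key) {
        map.remove(key);
    }

    @Override
    public void update(String key, UnaryOperator<String> callback) {
        // compute() runs the callback atomically for this key, so two
        // concurrent updates cannot overwrite each other's result. A null
        // result from the callback removes the mapping.
        map.compute(key, (k, existing) -> callback.apply(existing));
    }
}
```

A backend built on a remote store would need an equivalent primitive, for example a compare-and-swap operation, to offer the same guarantee.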
Finally, with some extra effort, it is possible to set up a multi-tier caching hierarchy, for example an in-memory cache in front of a cache on disk or in a remote memcached, following a pattern similar to virtual memory, L1/L2 processor caches, and so on.
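One way to picture such a hierarchy is a lookup that walks the tiers from fastest to slowest and copies a hit into the faster tiers. The TieredCache class is a hypothetical sketch that uses ordinary Map instances as stand-ins for the memory, disk and memcached tiers; it shows the lookup order only and is not an HttpClient facility.

```java
import java.util.List;
import java.util.Map;

// Hypothetical multi-tier lookup; each Map stands in for one storage tier.
final class TieredCache {
    private final List<Map<String, String>> tiers; // ordered fastest first

    TieredCache(List<Map<String, String>> tiers) {
        this.tiers = tiers;
    }

    // Returns the first value found, promoting it into all faster tiers.
    String get(String key) {
        for (int i = 0; i < tiers.size(); i++) {
            String value = tiers.get(i).get(key);
            if (value != null) {
                for (int j = 0; j < i; j++) {
                    tiers.get(j).put(key, value);
                }
                return value;
            }
        }
        return null; // miss in every tier
    }

    // Writes through to every tier.
    void put(String key, String value) {
        for (Map<String, String> tier : tiers) {
            tier.put(key, value);
        }
    }
}
```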
When reprinting, please keep the link to the original: http://www.yeetrack.com/?p=844