The use of cache in program design

Source: Internet
Author: User
Caching is one of the most common ways to optimize system performance: placing a cache in front of a time-consuming component, such as a database, reduces the number of real calls and shortens response times. But think twice before you introduce a cache.

Fetching resources over the network is both slow and expensive. For this reason, HTTP includes cache-control mechanisms that let clients store and reuse previously fetched resources, which improves performance and the user experience. The caching parts of HTTP have changed as the protocol evolved, but I think a back-end programmer developing Web services only needs to focus on three headers: the request header If-None-Match, the response header ETag, and the response header Cache-Control. These three headers cover most needs, and most of today's browsers support them. All we have to do is make sure every server response carries the correct headers, telling the browser whether it may cache the response and for how long.

Where is the cache?

Three roles are involved: the browser, the Web proxy, and the server. HTTP caches exist in the browser and in the Web proxy. The server has various internal caches too, but those are not the HTTP cache this article discusses. HTTP cache control is a convention: the response header Cache-Control sets the caching policy that browsers and Web proxies follow, and the request header If-None-Match together with the response header ETag is used to check whether a cached copy is still valid.

Response header ETag

ETag is short for entity tag and identifies a specific version of a resource. In a concrete implementation, the ETag can be a hash of the resource's content or an internally maintained version number. Either way, the ETag must reflect changes in the resource's content; this is the basis for HTTP caching to work properly.

When the server returns a response, the HTTP headers usually carry some metadata about it, and the ETag is one such item. In the example used here, a request for /file returns an ETag with the value X1323DDX. When the content of /file changes, the server should return a different ETag.
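One way to obtain an ETag that changes whenever the content changes is to hash the content. The sketch below assumes a hash-based ETag; the class name `ETagSketch`, the method `etagFor`, and the choice to keep only the first 8 bytes of a SHA-256 digest are illustrative assumptions, not something the article prescribes.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper: derives an ETag from the bytes of a resource. */
final class ETagSketch {
    private ETagSketch() {}

    static String etagFor(byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(content);
            StringBuilder hex = new StringBuilder();
            // Assumption: the first 8 bytes (16 hex characters) are enough to
            // tell versions apart while keeping the header short.
            for (int i = 0; i < 8; i++) {
                hex.append(String.format("%02x", digest[i] & 0xff));
            }
            // ETag values are sent as quoted strings.
            return "\"" + hex + "\"";
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be provided by every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }
}
```

Identical content always yields the same tag, and a change to the content yields a different one, which is the property the caching mechanism depends on.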

Request header If-None-Match

Take the same resource, /file from the example. After the first request, the browser holds a copy of the content of /file together with its ETag. The next time the user needs this resource, the browser sends another request to the server and uses the request header If-None-Match to say that it already has the version of /file whose ETag is X1323DDX. If /file has not changed on the server, that is, its ETag there is still X1323DDX, the server does not return the content of /file. Instead it returns a 304 response, telling the browser that the resource has not changed and that the cached copy is still valid.

With If-None-Match, the server achieves the same result with a much smaller response, which improves performance.
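The server side of this exchange can be sketched as a pure function: compare the If-None-Match value with the resource's current ETag and answer 304 with an empty body on a match, or 200 with the content otherwise. The names `ConditionalGetSketch`, `Response`, and `handleGet` are hypothetical, and splitting the header on commas is a simplification that assumes ETag values contain no commas.

```java
/** Hypothetical sketch of server-side If-None-Match handling. */
final class ConditionalGetSketch {
    private ConditionalGetSketch() {}

    /** Minimal response model, for illustration only. */
    static final class Response {
        final int status;
        final String etag;
        final byte[] body;

        Response(int status, String etag, byte[] body) {
            this.status = status;
            this.etag = etag;
            this.body = body;
        }
    }

    /** True if the If-None-Match header value matches the current ETag. */
    static boolean matches(String ifNoneMatch, String currentEtag) {
        if (ifNoneMatch == null || ifNoneMatch.trim().isEmpty()) {
            return false;
        }
        if (ifNoneMatch.trim().equals("*")) {
            return true; // "*" matches any existing representation
        }
        // The header may list several tags; the W/ (weak) prefix is ignored
        // when comparing for If-None-Match.
        for (String candidate : ifNoneMatch.split(",")) {
            if (stripWeakPrefix(candidate.trim()).equals(stripWeakPrefix(currentEtag))) {
                return true;
            }
        }
        return false;
    }

    private static String stripWeakPrefix(String tag) {
        return tag.startsWith("W/") ? tag.substring(2) : tag;
    }

    static Response handleGet(String ifNoneMatch, String currentEtag, byte[] content) {
        if (matches(ifNoneMatch, currentEtag)) {
            return new Response(304, currentEtag, new byte[0]); // no body on 304
        }
        return new Response(200, currentEtag, content);
    }
}
```

For example, `handleGet("\"X1323DDX\"", "\"X1323DDX\"", content)` yields status 304 and an empty body, while any other tag yields status 200 with the full content.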

Response header Cache-Control

Each resource can define its own caching policy through the Cache-Control header, which controls who may cache the response, under what conditions, and for how long. The fastest request is one that never has to reach the server: with a local copy of the response, we avoid all network latency and the data cost of the transfer. To this end, the HTTP specification lets the server return a set of Cache-Control directives that control how browsers and intermediate caches store a response and for how long.

The Cache-Control header is defined in the HTTP/1.1 specification and supersedes the headers previously used to define response caching policy (for example, Expires). All current browsers support Cache-Control, so it is enough on its own.

The common directives that can be set in Cache-Control are introduced below.

Max-age

This directive specifies the maximum time, in seconds, for which a response may be reused, counted from the time of the request that fetched it. For example, Cache-Control: max-age=60 means the response can be cached and reused for 60 seconds. Note that within the max-age period the browser normally sends no request to the server at all, not even one to validate the cached copy. If the resource changes on the server during that time, the browser is not notified and keeps using the old version. So be cautious when choosing the cache lifetime.
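The freshness rule can be sketched from the cache's point of view: a stored response may be reused without contacting the server while its age is below max-age. This is a simplified model that measures age from the moment the response was stored and ignores other inputs a real cache would use (such as the Age header); `FreshnessSketch`, `maxAgeSeconds`, and `isFresh` are hypothetical names.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Locale;
import java.util.OptionalLong;

/** Hypothetical sketch of a max-age freshness check. */
final class FreshnessSketch {
    private FreshnessSketch() {}

    /** Extracts max-age (in seconds) from a Cache-Control value, if present. */
    static OptionalLong maxAgeSeconds(String cacheControl) {
        if (cacheControl == null) {
            return OptionalLong.empty();
        }
        for (String directive : cacheControl.split(",")) {
            String d = directive.trim().toLowerCase(Locale.ROOT);
            if (d.startsWith("max-age=")) {
                try {
                    return OptionalLong.of(Long.parseLong(d.substring("max-age=".length())));
                } catch (NumberFormatException e) {
                    return OptionalLong.empty(); // malformed value: treat as absent
                }
            }
        }
        return OptionalLong.empty();
    }

    /** True while the stored response is younger than its max-age. */
    static boolean isFresh(String cacheControl, Instant storedAt, Instant now) {
        OptionalLong maxAge = maxAgeSeconds(cacheControl);
        if (!maxAge.isPresent()) {
            return false; // simplification: no max-age means "not fresh"
        }
        long ageSeconds = Duration.between(storedAt, now).getSeconds();
        return ageSeconds < maxAge.getAsLong();
    }
}
```

With `Cache-Control: max-age=60`, a response stored at time t is reused without any request up to t + 59 s; from t + 60 s on, the cache has to go back to the server.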

Public and private

public indicates that the response may be cached by the browser and by any intermediate Web proxy. In most cases public is the default, so Cache-Control: max-age=60 is treated the same as Cache-Control: public, max-age=60.

If the server sets private, as in Cache-Control: private, max-age=60, only the user's browser may cache the response, and no intermediate Web proxy is allowed to cache it. For example, the user's browser can cache an HTML page that contains the user's private information, but a CDN cannot.

No-cache

If the server sets no-cache in the response (Cache-Control: no-cache), the browser must check with the server whether the resource has changed before using the cached copy, and it avoids the download if nothing has changed. This validation is done with the request header If-None-Match and the response header ETag described above.

Note that the name no-cache is a bit misleading. Setting no-cache does not mean the browser stops caching the data; it means the browser must confirm that the cached data still matches the server before using it. If no-cache is set but the ETag implementation does not reflect changes to the resource, the browser's cached data will never be updated.

No-store

If the server sets no-store in the response (Cache-Control: no-store), neither the browser nor any intermediate Web proxy stores the data. The next time the resource is requested, the browser has to send the request to the server again and read the whole resource from it.
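The four directives can be summarized as two questions a cache asks about a response: may I store it, and must I revalidate it before reuse? The sketch below is a simplified model of those two decisions; `CacheDirectivesSketch` and its method names are hypothetical, and real caches take more inputs into account (request method, status code, other headers).

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical sketch of how a cache might interpret Cache-Control directives. */
final class CacheDirectivesSketch {
    private CacheDirectivesSketch() {}

    /** Returns the lower-cased directive names, without any "=value" part. */
    static Set<String> directives(String cacheControl) {
        Set<String> names = new HashSet<>();
        if (cacheControl == null) {
            return names;
        }
        for (String part : cacheControl.split(",")) {
            String name = part.trim().toLowerCase(Locale.ROOT);
            int eq = name.indexOf('=');
            if (eq >= 0) {
                name = name.substring(0, eq);
            }
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    /** no-store forbids storing anywhere; private forbids shared caches (proxy, CDN). */
    static boolean mayStore(String cacheControl, boolean sharedCache) {
        Set<String> d = directives(cacheControl);
        if (d.contains("no-store")) {
            return false;
        }
        return !(sharedCache && d.contains("private"));
    }

    /** no-cache allows storing but requires validation with the server before reuse. */
    static boolean mustRevalidateBeforeReuse(String cacheControl) {
        return directives(cacheControl).contains("no-cache");
    }
}
```

Under this model, `private, max-age=60` may be stored by the browser but not by a CDN, `no-cache` may be stored but must be validated on every use, and `no-store` may not be stored at all.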

How to determine the Cache-Control strategy of a resource?

[Figure: flowchart for choosing a Cache-Control policy; not reproduced here.]
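One possible decision order, built only from the directives described above, is sketched here. It is an assumption about how such a decision could be structured, not a transcription of the original figure; `CachePolicySketch`, `cacheControlFor`, and the four parameters are hypothetical.

```java
/** Hypothetical sketch: choosing a Cache-Control value for a resource. */
final class CachePolicySketch {
    private CachePolicySketch() {}

    static String cacheControlFor(boolean reusable,
                                  boolean revalidateEveryTime,
                                  boolean sharedCachesAllowed,
                                  long maxAgeSeconds) {
        // 1. If the response must never be reused, nothing should store it.
        if (!reusable) {
            return "no-store";
        }
        // 2. Decide who may store it: any cache, or only the user's browser.
        String visibility = sharedCachesAllowed ? "public" : "private";
        // 3. If every use must be confirmed with the server, use no-cache
        //    (this relies on the server sending a meaningful ETag).
        if (revalidateEveryTime) {
            return visibility + ", no-cache";
        }
        // 4. Otherwise allow reuse without contacting the server for max-age seconds.
        return visibility + ", max-age=" + maxAgeSeconds;
    }
}
```

For instance, a page containing a user's private data that may be reused for one minute would get `private, max-age=60` under this scheme.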

Common errors

Cache at startup

Sometimes we find that an application starts very slowly, and eventually discover that one of the services it depends on responds very slowly.

In general, this kind of problem means the dependency cannot meet our requirements. If it is a third-party service that we do not control, we may introduce a cache.

The problem with introducing a cache here is that a cache invalidation policy is hard to make effective, because the whole point of this cache is to send as few requests as possible to the dependent service.

Cache prematurely

Here "prematurely" refers not to the application's life cycle but to the development cycle. Sometimes developers estimate the system's bottlenecks in advance and introduce caches in the early stages of development.

In fact, this practice hides the places where performance could actually be optimized. After all, if the return value of this service is going to be cached anyway, why should I spend time optimizing this part of the code?

Integrated cache

This violates the "S" in the SOLID principles, the single responsibility principle. Once the application integrates a cache module, the cache module and the service layer become tightly coupled, and the service can no longer run without the cache module.
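One common way to avoid this coupling is to keep caching in a separate wrapper (a decorator) around the service interface, so the service itself knows nothing about the cache and still runs on its own. This is a swapped-in illustration rather than something the article spells out; `PriceService`, `SlowPriceService`, and `CachingPriceService` are hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical service interface; callers depend only on this. */
interface PriceService {
    long priceOf(String productId);
}

/** The real service: contains no caching logic and can run on its own. */
final class SlowPriceService implements PriceService {
    int calls = 0; // counts real invocations, for demonstration only

    @Override
    public long priceOf(String productId) {
        calls++;
        return productId.length() * 100L; // stand-in for an expensive lookup
    }
}

/** Caching lives in its own class and can be added or removed at wiring time. */
final class CachingPriceService implements PriceService {
    private final PriceService delegate;
    private final Map<String, Long> cache = new ConcurrentHashMap<>();

    CachingPriceService(PriceService delegate) {
        this.delegate = delegate;
    }

    @Override
    public long priceOf(String productId) {
        // Calls the delegate only when the key is not cached yet.
        return cache.computeIfAbsent(productId, delegate::priceOf);
    }
}
```

Because both classes implement the same interface, tests and debugging sessions can use `SlowPriceService` directly, and production wiring can wrap it in `CachingPriceService` only where measurements justify it.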

Cache All Content

Sometimes, to reduce response latency, caches are blindly added to every external call. This makes it easy for developers and maintainers to be unaware that a cache exists, and ultimately to misjudge the reliability of the underlying dependency.

Cascade Cache

Caching everything, or even just most content, may mean that cached data itself contains other cached data.

If the application contains this cascading cache structure, the effective expiration time becomes uncontrollable. The topmost cache can only be refreshed after every level below it has expired, and only then is the data it returns fully up to date.
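The effect can be made concrete with a little arithmetic. Assuming each layer starts its own expiration clock when it fetches from the layer below, and no layer passes along how old the data already is, the worst-case staleness seen by the caller is roughly the sum of all layers' lifetimes. The class and method names below are hypothetical.

```java
/** Hypothetical sketch: worst-case staleness of stacked caches. */
final class CascadeStalenessSketch {
    private CascadeStalenessSketch() {}

    /**
     * Each layer may fetch from the layer below just before that layer's
     * entry expires, and then keep the already-old data for its own full TTL.
     * Under that assumption the worst case is the sum of the TTLs.
     */
    static long worstCaseStalenessSeconds(long... ttlSeconds) {
        long total = 0;
        for (long ttl : ttlSeconds) {
            total += ttl;
        }
        return total;
    }
}
```

For example, an in-process cache of 60 s on top of a shared cache of 300 s on top of a CDN cache of 3600 s gives `worstCaseStalenessSeconds(60, 300, 3600)`, which is 3960 seconds, noticeably longer than any single layer's setting suggests.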

Non-flushable cache

Typically, cache middleware provides a tool to flush the cache. With Redis, for example, maintainers can use the tools it provides to delete part of the data or even flush the entire cache.

However, some ad hoc caches come with no such tool. For example, a simple cache that stores data in memory normally does not allow external tools to modify or delete its content. If the cached data then turns out to be wrong, maintainers have no option but to restart the service, which greatly increases operational cost and response time. Worse, some caches also write their content to the file system as a backup. In that case, besides restarting the service, you must make sure the cache backup on the file system is deleted before the application starts.
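Even a simple in-memory cache can be built so that it is flushable without a restart, by giving it explicit invalidation operations that an operations endpoint or admin tool could call. The sketch below shows only the cache itself; `FlushableCache` and its methods are hypothetical, and how they are exposed to maintainers is left open.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical in-memory cache that can be flushed without restarting the service. */
final class FlushableCache<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();

    /** Returns the cached value, loading and storing it on first access. */
    V get(K key, Function<K, V> loader) {
        return entries.computeIfAbsent(key, loader);
    }

    /** Removes a single entry; returns true if something was removed. */
    boolean invalidate(K key) {
        return entries.remove(key) != null;
    }

    /** Removes everything; returns the number of entries present before the flush. */
    int clear() {
        int count = entries.size();
        entries.clear();
        return count;
    }

    int size() {
        return entries.size();
    }
}
```

After `invalidate(key)` or `clear()`, the next `get` for an affected key calls the loader again, so bad data can be corrected while the service keeps running.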

The impact of caching

The issues above are common errors that introducing a cache can cause; a cache-free system does not have to consider them.

Deploying a system that relies heavily on caching may involve a long wait for caches to expire. For example, when content is cached through a CDN, refreshing the CDN configuration and the CDN-cached content after a release may take several hours.

In addition, reaching for a cache before fixing a performance bottleneck masks the problem without really solving it. In fact, tuning the code often takes about as much time as introducing a cache component.

Finally, for systems that contain cache components, the cost of debugging rises sharply. It is common to trace code for half a day only to find that the data came from the cache and has nothing to do with the component that should logically have produced it. The same can happen with tests: every relevant test case passes, yet the modified code was never actually exercised.

How to use caching well?

Discard the Cache!

Well, in many cases a cache is unavoidable. In Internet-based systems it is hard to avoid caching entirely; even the HTTP headers include cache configuration: Cache-Control: max-age=xxx.

Understanding Data

If you want to cache data, you first need to understand how the data is updated. Only when you know clearly when the data changes can you use the If-Modified-Since header to decide whether the data requested by the client is out of date: either simply return 304 Not Modified so that the client reuses its locally cached data, or return the latest data. In addition, to make better use of HTTP caching, it is recommended to version the data, or to use an ETag to mark the version of the cached data.
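The If-Modified-Since decision can be sketched as follows, assuming the server knows the last modification time of the data. HTTP dates have one-second resolution, so the stored timestamp is truncated before comparing. `IfModifiedSinceSketch` and `statusFor` are hypothetical names; an unparseable header is simply ignored here.

```java
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.temporal.ChronoUnit;

/** Hypothetical sketch of server-side If-Modified-Since handling. */
final class IfModifiedSinceSketch {
    private IfModifiedSinceSketch() {}

    /** Returns 304 if the data is unchanged since the given date, else 200. */
    static int statusFor(String ifModifiedSince, Instant lastModified) {
        if (ifModifiedSince == null) {
            return 200; // unconditional request: send the data
        }
        try {
            Instant since = ZonedDateTime
                    .parse(ifModifiedSince, DateTimeFormatter.RFC_1123_DATE_TIME)
                    .toInstant();
            // HTTP dates carry whole seconds only.
            Instant modified = lastModified.truncatedTo(ChronoUnit.SECONDS);
            return modified.isAfter(since) ? 200 : 304;
        } catch (DateTimeParseException e) {
            return 200; // invalid date: ignore the condition
        }
    }
}
```

A client that sends `If-Modified-Since: Wed, 21 Oct 2015 07:28:00 GMT` gets 304 as long as the data was last modified at or before that instant, and 200 with fresh data otherwise.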

Optimize performance rather than using caching

As mentioned earlier, caches tend to obscure potential performance issues. Whenever possible, use performance analysis tools to find the real cause of a slow response and fix it. For example, remove unnecessary code calls, optimize SQL based on its execution plan, and so on.

Here is code that clears all cached data for an Android application:

/*
 * File name: DataCleanManager.java
 * Description: clears the internal/external cache, the databases, the
 * SharedPreferences, the files directory, and custom directories.
 */
package com.test.dataclean;

import java.io.File;

import android.content.Context;
import android.os.Environment;

/** Data cleanup manager for this app. */
public class DataCleanManager {

    /** Clears the app's internal cache (/data/data/com.xxx.xxx/cache). */
    public static void cleanInternalCache(Context context) {
        deleteFilesByDirectory(context.getCacheDir());
    }

    /** Clears all databases of this app (/data/data/com.xxx.xxx/databases). */
    public static void cleanDatabases(Context context) {
        deleteFilesByDirectory(new File("/data/data/"
                + context.getPackageName() + "/databases"));
    }

    /** Clears the app's SharedPreferences (/data/data/com.xxx.xxx/shared_prefs). */
    public static void cleanSharedPreference(Context context) {
        deleteFilesByDirectory(new File("/data/data/"
                + context.getPackageName() + "/shared_prefs"));
    }

    /** Deletes one of the app's databases by name. */
    public static void cleanDatabaseByName(Context context, String dbName) {
        context.deleteDatabase(dbName);
    }

    /** Clears the content under /data/data/com.xxx.xxx/files. */
    public static void cleanFiles(Context context) {
        deleteFilesByDirectory(context.getFilesDir());
    }

    /** Clears the external cache (/mnt/sdcard/android/data/com.xxx.xxx/cache). */
    public static void cleanExternalCache(Context context) {
        // Only touch external storage when it is actually mounted.
        if (Environment.getExternalStorageState().equals(
                Environment.MEDIA_MOUNTED)) {
            deleteFilesByDirectory(context.getExternalCacheDir());
        }
    }

    /**
     * Clears the files under a custom path. Use with caution to avoid
     * deleting the wrong data. Only files directly inside the directory
     * are deleted.
     */
    public static void cleanCustomCache(String filePath) {
        deleteFilesByDirectory(new File(filePath));
    }

    /** Clears all data of this app, plus any custom directories passed in. */
    public static void cleanApplicationData(Context context, String... filePaths) {
        cleanInternalCache(context);
        cleanExternalCache(context);
        cleanDatabases(context);
        cleanSharedPreference(context);
        cleanFiles(context);
        for (String filePath : filePaths) {
            cleanCustomCache(filePath);
        }
    }

    /**
     * Deletes the files directly under a directory. If the argument is a
     * file rather than a directory, nothing is done.
     */
    private static void deleteFilesByDirectory(File directory) {
        if (directory != null && directory.exists() && directory.isDirectory()) {
            File[] items = directory.listFiles();
            if (items == null) {
                return; // listFiles() returns null on an I/O error
            }
            for (File item : items) {
                item.delete();
            }
        }
    }
}

Summary

Caching is a very useful tool, but it is extremely easy to misuse. Use caching only as a last resort, and give priority to other ways of optimizing application performance.
