Detailed descriptions of the various caches commonly used in web applications


This article uses Nginx, Rails, MySQL, and Redis as examples; the same ideas apply to other web servers, languages, databases, and cache services.
For later reference, the caches are described layer by layer, from the client down to the server:

1. Client Cache

A client often requests the same resource repeatedly: a browser visits a site's homepage or re-reads the same article, or an app calls the same API again. If the resource has not changed since the client last fetched it, the 304 Not Modified response defined in the HTTP specification (http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.5) lets the client reuse its cached copy instead of having the server regenerate the content.
Rails has a built-in fresh_when method, so this takes only a line of code:

class ArticlesController
  def show
    @article = Article.find(params[:id])
    fresh_when :last_modified => @article.updated_at.utc, :etag => @article
  end
end

On the next request, the server compares the If-Modified-Since and If-None-Match request headers; if they match, it returns 304 directly instead of generating the response body.

However, this can cause a problem. Suppose our site shows user information in the navigation bar: a user who visited a page before logging in will still see the logged-out state after logging in. Likewise, in an app you can open an article, add it to your favorites, and the next time you open it the article still appears unfavorited. The fix is simple: add user-related variables to the etag calculation:

  fresh_when :etag => [@article.cache_key, current_user.id]
  fresh_when :etag => [@article.cache_key, current_user_favorited]
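
In context, a minimal sketch combining the two (assuming a Devise-style current_user helper that may be nil for visitors):

class ArticlesController
  def show
    @article = Article.find(params[:id])
    # Include the user in the etag so a cached page never shows
    # another user's logged-in or favorited state
    fresh_when :last_modified => @article.updated_at.utc,
               :etag => [@article.cache_key, current_user.try(:id)]
  end
end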

In addition, if gzip is enabled in Nginx and it compresses the Rails response, the ETag header emitted by Rails is dropped. The Nginx developers say the RFC requires a proxy to do this (because the content has changed), but I personally do not think it is necessary, so I used a crude workaround: comment out this line in src/http/modules/ngx_http_gzip_filter_module.c and recompile Nginx:

  //ngx_http_clear_etag(r); 

Alternatively, leave the Nginx source code alone, disable gzip in Nginx, and compress with Rack middleware instead:

  config.middleware.use Rack::Deflater

Besides calling fresh_when explicitly in a controller, Rails enables the Rack::ETag middleware by default, which adds an ETag to any response that does not already have one. Compared with fresh_when, however, the automatic ETag only saves transfer time on the client side; the server still executes all of the code. The difference shows up with curl.
First with the ETag added automatically by Rack::ETag, then with fresh_when:

  curl -v http://localhost:3000/articles/1
  < Etag: "bf328447bcb2b8706193a50962035619"
  < X-Runtime: 0.286958

  curl -v http://localhost:3000/articles/1 --header 'If-None-Match: "bf328447bcb2b8706193a50962035619"'
  < X-Runtime: 0.293798

With fresh_when:

  curl -v http://localhost:3000/articles/1 --header 'If-None-Match: "bf328447bcb2b8706193a50962035619"'
  < X-Runtime: 0.033884

2. Nginx Cache

Some resources are requested many times, are unrelated to user state, and rarely change, for example the list API of a news app or the AJAX category menu of a shopping site. These can be cached by Nginx.
There are two main implementation methods:
A. Save dynamic requests as static files
After the Rails request completes, save the result as a static file; Nginx serves that file directly on subsequent requests. An after_filter does the job:

class CategoriesController < ActionController::Base
  after_filter :generate_static_file, :only => [:index]

  def index
    @categories = Category.all
  end

  def generate_static_file
    File.open(Rails.root.join('public', 'categories'), 'w') do |f|
      f.write response.body
    end
  end
end

In addition, we need to delete this file whenever a category changes, so the cache does not serve stale content:

class Category < ActiveRecord::Base
  after_save :delete_static_file
  after_destroy :delete_static_file

  def delete_static_file
    file = Rails.root.join('public', 'categories')
    File.delete(file) if File.exist?(file)
  end
end

Rails ships with support for this kind of generated static file: caches_page (in Rails 4 it was extracted into the separate gem actionpack-page_caching). Compared with the manual code above:

class CategoriesController < ActionController::Base
  caches_page :index

  def update
    # ...
    expire_page action: 'index'
  end
end
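
Note that the actionpack-page_caching gem expects the cache directory to be configured explicitly; a minimal sketch, pointing it at the same public directory the manual version wrote to:

  # config/application.rb (assumption: cache generated pages under public/)
  config.action_controller.page_cache_directory = "#{Rails.root}/public"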

If there is only one server, this approach is simple and practical, but with multiple servers there is a problem: updating a category only refreshes the cache on the server that handled the update. You can share the static resource directory over NFS, or use the second approach below:

B. Store the generated output in a centralized cache service
First, give Nginx the ability to read from the cache directly:

upstream redis {
  server redis_server_ip:6379;
}

upstream ruby_backend {
  server unicorn_server_ip1 fail_timeout=0;
  server unicorn_server_ip2 fail_timeout=0;
}

location /categories {
  set $redis_key $uri;
  default_type text/html;
  redis_pass redis;
  error_page 404 = @httpapp;
}

location @httpapp {
  proxy_pass http://ruby_backend;
}

Nginx first uses the request URI as the key and looks it up in Redis; on a miss (404) it forwards the request to Unicorn. Then rewrite the generate_static_file and delete_static_file methods accordingly:

  redis_cache.set('categories', response.body)
  redis_cache.del('categories')

Besides centralizing management, this also lets you set a cache expiration time. For data whose updates are not time-critical, you can simply let the cache expire on a schedule instead of wiring up a refresh mechanism:

 redis_cache.setex('categories', 3.hours.to_i, response.body)
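
Putting the pieces together, a minimal sketch of the Redis-backed version (the $redis_cache connection and its initializer are assumptions, not part of the original code; note that the key must match the $uri Nginx uses as $redis_key):

# config/initializers/redis_cache.rb (assumed shared connection)
$redis_cache = Redis.new(:host => 'redis_server_ip', :port => 6379)

class CategoriesController < ActionController::Base
  after_filter :generate_static_file, :only => [:index]

  def index
    @categories = Category.all
  end

  private

  # Store the rendered body under the same key Nginx looks up ($uri)
  def generate_static_file
    $redis_cache.setex('/categories', 3.hours.to_i, response.body)
  end
end

class Category < ActiveRecord::Base
  after_save :delete_static_file
  after_destroy :delete_static_file

  def delete_static_file
    $redis_cache.del('/categories')
  end
end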

3. Full Page Cache

The Nginx cache has a hard time with resources that take parameters or depend on user state. In those cases you can use the full page cache.
For example, for a paginated list, we can add the page parameter to cache_path:

class CategoriesController
  caches_action :index, :expires_in => 1.day,
    :cache_path => proc { "categories/index/#{params[:page].to_i}" }
end

Or, to cache only the RSS output, for 8 hours:

class ArticlesController
  caches_action :index, :expires_in => 8.hours,
    :if => proc { request.format.rss? }
end

Or, for visitors who are not logged in, cache the homepage:

class HomeController
  caches_action :index, :expires_in => 3.hours,
    :if => proc { !user_signed_in? }
end

4. Fragment Cache

While the previous kinds of cache apply only in limited scenarios, the fragment cache fits almost everywhere.

Scenario 1: every page needs an advertisement snippet, and different pages show different ads. Without a fragment cache, every page view queries for the ad and spends time generating its HTML:

- if advert = Advert.where(:name => request.controller_name + request.action_name, :enable => true).first
  div.ad
    = advert.content

After the fragment cache is added, you can skip this query:

- cache "adverts/#{request.controller_name}/#{request.action_name}", :expires_in => 1.day do - if advert = Advert.where(:name => request.controller_name + request.action_name, :enable => true).first  div.ad   = advert.content

Scenario 2: when reading an article, the article body may not change for a long time; what changes frequently is the comments. You can wrap the main body of the article in a fragment cache:

- cache "articles/#{@article.id}/#{@article.updated_at.to_i}" do div.article  = @article.content.markdown2html

This saves the time spent converting the Markdown to HTML. The article's last update time is part of the cache key, so when the content changes the cache is invalidated automatically (by default, ActiveRecord's cache_key method also uses updated_at). You can add more parameters as well, such as a counter cache with the article's number of comments: updating the comment count does not touch the article's updated_at, so add the counter to the key.
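
For instance, a hedged sketch, assuming the articles table has a comments_count counter cache column (hypothetical here):

/ comments_count is an assumed counter-cache column; bumping it busts the cache
- cache "articles/#{@article.id}/#{@article.updated_at.to_i}/#{@article.comments_count}" do
  div.article
    = @article.content.markdown2html
  span.comments
    = "#{@article.comments_count} comments"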

Scenario 3: complex page structures
Pages with complex data structures need many queries and a lot of HTML rendering; fragment caching can save most of that time. Take a trip page on our site, http://chanyouji.com/trips/109123 (forgive the small plug): it needs weather data, photo data, text data, and SEO data such as meta descriptions and keywords. These parts differ from the rest of the dynamic content, so they can be split into several fragment caches:

- cache "trips/show/seo/#{@trip.fragment_cache_key}", :expires_in => 1.day do title #{trip_name @trip} meta name="description" content="..." meta name="keywords" content="..."body div  ...- cache "trips/show/viewer/#{@trip.fragment_cache_key}", :expires_in => 1.day do - @trip.eager_load_all

A note here: I added an eager_load_all method to the Trip model so that, on a cache miss, the queries avoid the N + 1 problem:

def eager_load_all
  ActiveRecord::Associations::Preloader.new(
    [self],
    { :trip_days => [:weather_station_data, :nodes => [:entry, :notes => [:photo, :video, :audio]]] }
  ).run
end

Tip 1: Conditional fragment caching
Unlike caches_action, the fragment cache built into Rails does not support conditions. For example, we may want to cache a fragment only for visitors who are not logged in, while logged-in users bypass it; writing that inline is clumsy, so we can add a helper:

def cache_if(condition, name = {}, cache_options = {}, &block)
  if condition
    cache(name, cache_options, &block)
  else
    yield
  end
end

And in the view:

  - cache_if !user_signed_in?, "xxx", :expires_in => 1.day do

Tip 2: Automatically updating associated objects
The object's updated_at timestamp is often used as part of the cache key. Adding the :touch option to an association automatically updates the associated object's timestamp; for example, creating, updating, or deleting a comment touches its article automatically:

class Article < ActiveRecord::Base
  has_many :comments
end

class Comment < ActiveRecord::Base
  belongs_to :article, :touch => true
end

5. Data Query Cache

The performance bottleneck of a web application is usually database I/O. Caching data queries reduces the number of database queries and can greatly improve overall response time.
There are two types of data query cache:
A. Caching within the same request cycle
For example, to display an article list with each article's title and category, the code looks like this:

# controller
def index
  @articles = Article.first(10)
end

# view
- @articles.each do |article|
  h1 = article.name
  span = article.category.name

This produces 10 similar SQL queries:

SELECT `categories`.* FROM `categories` WHERE `categories`.`id` = ?

Rails has a built-in query cache (https://github.com/rails/rails/blob/master/activerecord/lib/active_record/connection_adapters/abstract/query_cache.rb) that caches identical SQL queries within the same request cycle, as long as no update/delete/insert runs in between. So if all the articles belong to the same category, the database is queried only once.
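
The same behavior can be reproduced outside a request by wrapping code in ActiveRecord's query-cache block; a minimal sketch (the repeated query is answered from the cache and logged as CACHE):

ActiveRecord::Base.cache do
  Category.find(1)  # hits the database
  Category.find(1)  # served from the query cache
end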

If the articles belong to different categories, there will be N + 1 queries (a common performance bottleneck). The solution Rails recommends is Eager Loading Associations (http://guides.rubyonrails.org/active_record_querying.html#eager-loading-associations):

def index
  @articles = Article.includes(:category).first(10)
end

The category query then becomes:

SELECT `categories`.* FROM `categories` WHERE `categories`.`id` in (?,?,?...)

B. Cross-request Cache

Caching within a single request cycle brings only limited gains. In many cases we want to cache commonly used data (such as the User model) across requests. With ActiveRecord it is easy to fetch through a unified query interface and expire the cache in callbacks, and several existing gems already do this.

One example is identity_cache (https://github.com/Shopify/identity_cache):

class User < ActiveRecord::Base
  include IdentityCache
end

class Article < ActiveRecord::Base
  include IdentityCache
  cache_belongs_to :user
end

# These calls hit the cache:
User.fetch(1)
Article.fetch(2).fetch_user

The advantage of this gem is that its implementation is simple, the cache settings are flexible, and it is easy to extend. The disadvantage is that you must use different query method names (fetch) and define the cached associations separately.
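
As an illustration of the flexible settings, IdentityCache also supports secondary indexes; a hedged sketch based on the gem's README (the email column and its uniqueness are assumptions):

class User < ActiveRecord::Base
  include IdentityCache
  cache_index :email, :unique => true  # adds a cached lookup by email
end

# Served from the cache once warm
User.fetch_by_email('someone@example.com')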

To add caching to an existing application seamlessly, without changing its query code, I recommend @hooopo's second_level_cache (https://github.com/hooopo/second_level_cache):

class User < ActiveRecord::Base
  acts_as_cached(:version => 1, :expires_in => 1.week)
end

# Using the find method will hit the cache
User.find(1)

# No extra belongs_to definition is required
Article.find(2).user
It works by extending ActiveRecord's underlying Arel SQL AST processing (https://github.com/hooopo/second_level_cache/blob/master/lib/second_level_cache/arel/wheres.rb).
Its advantage is seamless integration; the drawbacks are that it is harder to extend and that it cannot cache queries that select only a few fields.

6. Database Cache


These six caches sit at different points between the client and the server; the earlier in that chain a request can be answered from cache, the more work and time is saved.
