Django Caching mechanism

Source: Internet
Author: User
Static websites simply serve pages stored directly on the server, so they can handle a remarkable amount of traffic with little effort. A dynamic website is different: every time a user requests a page, the server runs database queries, renders templates, and executes business logic to generate the page the visitor sees. In terms of processing overhead, this is far more expensive.


For most Web applications, this overhead is not a big problem. Most Web applications are not washingtonpost.com or Slashdot; they are small to medium-sized sites with modest traffic. For medium- to high-traffic sites, however, it is essential to cut as much overhead as possible. That is where caching comes in.


The purpose of caching is to avoid repeating a calculation, especially one that is time-consuming or resource-intensive. The following pseudocode shows how this works for a dynamically generated page:

    given a URL, try finding that page in the cache
    if the page is in the cache:
        return the cached page
    else:
        generate the page
        save the generated page in the cache (for next time)
        return the generated page

To support this, Django provides a robust cache system that lets you save the results of dynamic pages so that identical requests can be served straight from the cache without being recalculated. Django also offers different levels of cache granularity: you can cache an individual page, only a part of a page, or your entire site.


Django also works well with "upstream" caches, such as Squid (http://www.squid-cache.org) and browser-based caches. These are caches that you don't directly control, but to which you can provide hints (via HTTP headers) about which parts of your site should be cached, and how.


Read on to learn how to use Django's caching system. When your site becomes as busy as Slashdot, you'll be glad you understand this material.


Setting up the cache


The cache system requires a small amount of setup: you have to tell it where your cached data should live, whether in a database, on the filesystem, or directly in memory. This is an important decision that affects your cache's performance. Some cache types are faster than others, and in-memory caches are generally faster than filesystem or database caches because they avoid the overhead of hitting the filesystem or a database connection.


Your cache preference goes in the CACHE_BACKEND setting in your settings file. If you use caching and do not specify CACHE_BACKEND, Django uses simple:/// by default. The following sections explain all the available values for CACHE_BACKEND.


Memcached


By far the fastest, most efficient type of cache available to Django is Memcached, an entirely memory-based cache framework. It was originally developed to handle high loads at LiveJournal.com and was subsequently open-sourced by Danga Interactive (http://www.danga.com). Sites such as Slashdot and Wikipedia use it to reduce database access and dramatically increase site performance.


Memcached is available free of charge at http://danga.com/memcached/. It runs as a daemon and is allotted a specified amount of RAM. It provides a very fast interface for adding, retrieving, and deleting arbitrary data in the cache. All data is stored directly in memory, so there is no database or filesystem overhead.


After installing Memcached itself, you need to install the Memcached Python bindings. They are not bundled with Django; they live in a single Python module, memcache.py, which is available at http://www.djangoproject.com/thirdparty/python-memcached.


To use Memcached with Django, set CACHE_BACKEND to memcached://ip:port/, where ip is the IP address of the Memcached daemon and port is the port on which Memcached is running.


In this example, Memcached is running on localhost (127.0.0.1), port 11211:

    CACHE_BACKEND = 'memcached://127.0.0.1:11211/'

An excellent feature of Memcached is its ability to share a cache over multiple servers. You can run Memcached daemons on multiple machines, and the program will treat the group of machines as a single cache, without the need to duplicate cache values on each machine. To take advantage of this feature with Django, include all the server addresses in CACHE_BACKEND, separated by semicolons.


In this example, the cache is shared over Memcached instances running on the IP addresses 172.19.26.240 and 172.19.26.242, both on port 11211:

    CACHE_BACKEND = 'memcached://172.19.26.240:11211;172.19.26.242:11211/'

In this example, the cache is shared over Memcached instances running on 172.19.26.240 (port 11211), 172.19.26.242 (port 11212), and 172.19.26.244 (port 11213):

    CACHE_BACKEND = 'memcached://172.19.26.240:11211;172.19.26.242:11212;172.19.26.244:11213/'

A final point about Memcached is that memory-based caching has one important disadvantage: because the cached data is stored only in memory, it is lost if your server crashes. Memory clearly isn't intended for permanent data storage. In fact, none of the Django cache backends should be used for permanent storage; they are all caching solutions, not storage. We point this out here because memory-based caching is particularly temporary.


Database Cache


To use a database table as your cache backend, create a cache table in your database and point Django's cache system at that table.


First, create the cache table by running this command:

    python manage.py createcachetable [cache_table_name]

Here [cache_table_name] is the name of the database table to create. The name can be anything you want, as long as it is a valid table name that is not already in use in your database. This command creates a single table in your database, in the format that Django's database cache system expects.


Once you have created the database table, set CACHE_BACKEND to "db://tablename", where tablename is the name of the database table. In this example, the cache table is named my_cache_table:

    CACHE_BACKEND = 'db://my_cache_table'

The database cache backend uses the same database specified in your settings file. You cannot use a different database backend for your cache table.


Filesystem Caching


To store cached items on the filesystem, use the "file://" cache type for CACHE_BACKEND and specify the directory in which the cached data should be stored.


For example, to store cached data in /var/tmp/django_cache, use this setting:

    CACHE_BACKEND = 'file:///var/tmp/django_cache'

Note that there are three forward slashes at the start of the example. The first two belong to file://, and the third is the first character of the directory path, /var/tmp/django_cache. If you are on Windows, put the drive letter after file://, like this: 'file://c:/foo/bar'.


The directory path should be absolute; that is, it should start at the root of your filesystem. It doesn't matter whether you put a slash at the end of the setting.


Make sure the directory this setting points to exists and is readable and writable by the system user under which your Web server runs. Continuing the example above, if your server runs as the user apache, make sure the directory /var/tmp/django_cache exists and is readable and writable by the user apache.


Each cache value is stored as a separate file whose contents are the cached data saved in a serialized ("pickled") format, using Python's pickle module. Each file's name is the cache key, escaped for safe filesystem use.
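To make the one-file-per-key idea concrete, here is a small sketch written for current Python 3. It is not Django's actual file layout: the helper names key_to_path, file_set, and file_get are hypothetical, and percent-encoding the key with urllib.parse.quote is an assumed escaping scheme chosen only for illustration.

```python
import os
import pickle
import tempfile
from urllib.parse import quote


def key_to_path(cache_dir, key):
    """Map a cache key to a file path, escaping characters unsafe in file names."""
    return os.path.join(cache_dir, quote(key, safe=""))


def file_set(cache_dir, key, value):
    """Pickle value into its own file under cache_dir."""
    with open(key_to_path(cache_dir, key), "wb") as handle:
        pickle.dump(value, handle)


def file_get(cache_dir, key, default=None):
    """Load the pickled value for key, or return default when no file exists."""
    try:
        with open(key_to_path(cache_dir, key), "rb") as handle:
            return pickle.load(handle)
    except FileNotFoundError:
        return default


cache_dir = tempfile.mkdtemp()      # throwaway directory for the demo
file_set(cache_dir, "/foo/23/", {"title": "hello"})
```

Because every value lives in its own file, removing a single entry is just a matter of deleting that file.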


Local Memory Cache


If you want the speed advantages of in-memory caching but can't run Memcached, consider the local-memory cache backend. It is multithreaded and thread-safe, but it is not as efficient as Memcached because of its simple locking and memory allocation strategies.

To use it, set CACHE_BACKEND to locmem:///, for example:

    CACHE_BACKEND = 'locmem:///'

Simple cache (for development phase)


You can use a simple, single-process memory cache by setting CACHE_BACKEND to 'simple:///', for example:

    CACHE_BACKEND = 'simple:///'

This cache merely saves cached data in the process itself, so it should be used only in development or testing environments.


Dummy cache (for development use)


Finally, Django comes with a "dummy" cache that doesn't actually cache anything; it just implements the cache interface without doing any real work.

This is useful if your production site uses heavy-duty caching in various places but you don't want to cache anything in your development environment. In that case, just change your settings file so that CACHE_BACKEND is 'dummy:///', for example:

    CACHE_BACKEND = 'dummy:///'

As a result, your development environment won't use caching, while your production environment still will.


CACHE_BACKEND arguments


Each cache backend may take arguments. They are given in query-string style in the CACHE_BACKEND setting. Valid arguments are:

timeout: The default timeout, in seconds, to use for the cache. This argument defaults to 300 seconds (five minutes).

max_entries: For the simple, local-memory, and database backends, the maximum number of entries allowed in the cache before old values are deleted. This argument defaults to 300.

cull_frequency: The proportion of entries that are culled when max_entries is reached. The actual ratio is 1/cull_frequency, so setting cull_frequency=2 culls half of the entries when max_entries is reached.

A value of 0 for cull_frequency means that the entire cache is emptied when max_entries is reached. This makes culling much faster at the expense of more cache misses. This argument defaults to 3.
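The culling arithmetic can be sketched as follows. This is an illustration rather than the backends' real code: cull and bounded_set are hypothetical helpers, and dropping the oldest entries first is an assumption made so the example is deterministic.

```python
from collections import OrderedDict


def cull(entries, cull_frequency):
    """Remove 1/cull_frequency of the entries, oldest first; 0 empties the cache."""
    if cull_frequency == 0:
        entries.clear()
        return
    to_remove = len(entries) // cull_frequency
    for _ in range(to_remove):
        entries.popitem(last=False)     # drop the oldest entry


def bounded_set(entries, key, value, max_entries=300, cull_frequency=3):
    """Store a value, culling first when the cache already holds max_entries."""
    if key not in entries and len(entries) >= max_entries:
        cull(entries, cull_frequency)
    entries[key] = value
```

With max_entries=4 and cull_frequency=2, adding a fifth key first removes two of the four existing entries, so three remain after the insert.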


In this example, timeout is set to 60:

    CACHE_BACKEND = "locmem:///?timeout=60"

In this example, timeout is set to 30 and max_entries is 400:

    CACHE_BACKEND = "locmem:///?timeout=30&max_entries=400"

Invalid arguments are silently ignored, as are invalid values of known arguments.
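As a rough picture of how such a setting breaks down into a backend type, a location, and arguments, here is a standard-library sketch. It is not Django's parser; parse_backend_uri and int_param are hypothetical helpers, and falling back to a default on a bad value is one assumed way of "ignoring" it.

```python
from urllib.parse import parse_qsl


def parse_backend_uri(uri):
    """Split 'scheme://host/?a=1&b=2' into (scheme, host, params)."""
    scheme, rest = uri.split("://", 1)
    location, _, query = rest.partition("?")
    host = location.rstrip("/")
    params = dict(parse_qsl(query))
    return scheme, host, params


def int_param(params, name, default):
    """Read an integer argument, falling back to default on bad values."""
    try:
        return int(params.get(name, default))
    except (TypeError, ValueError):
        return default
```

For "locmem:///?timeout=30&max_entries=400" this yields the scheme "locmem", an empty host, and the two arguments as strings, which int_param() then converts.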


Site-level Cache


Once you have specified CACHE_BACKEND, the simplest way to use caching is to cache your entire site. This means that every page that has no GET or POST parameters will be cached for a specified amount of time after it is first requested.

To activate the per-site cache, just add 'django.middleware.cache.CacheMiddleware' to your MIDDLEWARE_CLASSES setting, as in this example:

    MIDDLEWARE_CLASSES = (
        'django.middleware.cache.CacheMiddleware',
        'django.middleware.common.CommonMiddleware',
    )

Note

The order of MIDDLEWARE_CLASSES matters. See the section "Order of MIDDLEWARE_CLASSES" later in this chapter.


Then, add the following required settings to your Django settings file:


§ CACHE_MIDDLEWARE_SECONDS: The number of seconds each page should be cached.

§ CACHE_MIDDLEWARE_KEY_PREFIX: If the cache is shared across multiple sites using the same Django installation, set this to the name of the site, or some other string that is unique to this Django instance, to prevent key collisions. Use an empty string if you don't care.

The cache middleware caches every page that has no GET or POST parameters. That is, if a user requests a page and passes GET parameters in a query string, or passes POST parameters, the middleware will not attempt to retrieve a cached version of the page. If you intend to use the per-site cache, keep this in mind as you design your application; for example, don't use URLs with query strings unless it is acceptable for those pages not to be cached.

The cache middleware supports another setting, CACHE_MIDDLEWARE_ANONYMOUS_ONLY. If you set it to True, the cache middleware will cache only anonymous requests, that is, requests made by users who are not logged in. This is a simple and effective way to disable caching for user-specific pages, such as Django's admin interface. Note that to use CACHE_MIDDLEWARE_ANONYMOUS_ONLY, you must first activate AuthenticationMiddleware, and AuthenticationMiddleware must appear before CacheMiddleware in your MIDDLEWARE_CLASSES setting.


Finally, note that CacheMiddleware automatically sets a few headers in each HttpResponse:

§ It sets the Last-Modified header to the current date/time when a fresh (uncached) version of the page is requested.

§ It sets the Expires header to the current date/time plus the defined CACHE_MIDDLEWARE_SECONDS.

§ It sets the Cache-Control header to give a maximum age for the page, again from the CACHE_MIDDLEWARE_SECONDS setting.
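The relationship between these three headers can be sketched with the standard library. This is not the middleware's code: cache_headers is a hypothetical helper, and its cache_seconds argument plays the role of the CACHE_MIDDLEWARE_SECONDS setting.

```python
from email.utils import formatdate


def cache_headers(now, cache_seconds):
    """Build the three headers for a freshly generated page.

    now is a Unix timestamp; cache_seconds stands in for
    CACHE_MIDDLEWARE_SECONDS.
    """
    return {
        "Last-Modified": formatdate(now, usegmt=True),
        "Expires": formatdate(now + cache_seconds, usegmt=True),
        "Cache-Control": "max-age=%d" % cache_seconds,
    }


headers = cache_headers(now=0, cache_seconds=600)
```

Expires and max-age describe the same lifetime in two forms: one as an absolute date, the other as a number of seconds.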


View-Level caching


A more granular way to use the caching framework is to cache the output of individual views. This has the same effect as the per-site cache (including not caching requests that carry GET and POST parameters), but it applies only to the views you specify rather than to the whole site.

You do this with a decorator, which is a wrapper around your view function that alters its behavior to use caching. The view cache decorator is called cache_page and lives in the django.views.decorators.cache module, for example:

    from django.views.decorators.cache import cache_page

    def my_view(request, param):
        # ...
    my_view = cache_page(my_view, 60 * 15)

If you use Python 2.4 or later, you can also use decorator syntax. This example is equivalent to the previous one:

    from django.views.decorators.cache import cache_page

    @cache_page(60 * 15)
    def my_view(request, param):
        # ...

cache_page takes a single argument: the cache timeout, in seconds. In the preceding example, the result of the my_view() view will be cached for 15 minutes. (Note that it is written as 60 * 15 for readability; 60 * 15 evaluates to 900, that is, 15 minutes multiplied by 60 seconds per minute.)


Like the per-site cache, the per-view cache is keyed off the URL. If multiple URLs point at the same view, each URL is cached separately. Continuing the my_view example, suppose your URLconf looks like this:

    from django.conf.urls.defaults import patterns

    urlpatterns = patterns('',
        (r'^foo/(\d{1,2})/$', my_view),
    )

Then, as you would expect, requests to /foo/1/ and /foo/23/ are cached separately. But once a particular URL (e.g., /foo/23/) has been requested, subsequent requests to that URL use the cache.
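The per-URL behavior can be imitated with a few lines of plain Python. This is a simplified stand-in, not cache_page itself: cache_by_path is a hypothetical decorator, and the "view" here receives the request path directly instead of a Django request object.

```python
import time


def cache_by_path(timeout, clock=time.time):
    """Hypothetical decorator: cache a view's result per request path."""
    def decorator(view):
        store = {}      # path -> (expires_at, response)

        def wrapper(path, *args):
            hit = store.get(path)
            if hit is not None and hit[0] > clock():
                return hit[1]
            response = view(path, *args)
            store[path] = (clock() + timeout, response)
            return response
        return wrapper
    return decorator


calls = []


@cache_by_path(60 * 15)
def my_view(path, param):
    calls.append(path)          # record each real execution of the view
    return "page %s" % param
```

Requesting /foo/1/ once and /foo/23/ twice executes the view body twice in total: once per distinct path.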


Specifying the view cache in the URLconf


The examples in the previous section hard-code the fact that the view is cached, because cache_page alters the my_view function in place. This approach couples your view to the cache system, which is not ideal for several reasons. For instance, you might want to reuse the view functions on another site that does not use caching, or you might want to distribute the views to people who want to use them without caching. The solution to these problems is to specify the per-view cache in the URLconf rather than next to the view functions themselves.

Doing so is easy: simply wrap the view function with cache_page when you refer to it in the URLconf. Here is the URLconf from earlier:

    from django.conf.urls.defaults import patterns

    urlpatterns = patterns('',
        (r'^foo/(\d{1,2})/$', my_view),
    )

Here is the same URLconf, with my_view wrapped in cache_page:

    from django.conf.urls.defaults import patterns
    from django.views.decorators.cache import cache_page

    urlpatterns = patterns('',
        (r'^foo/(\d{1,2})/$', cache_page(my_view, 60 * 15)),
    )

If you take this approach, don't forget to import cache_page within your URLconf.


Low Level Cache API


Sometimes, caching an entire rendered page doesn't gain you very much and is, in fact, inconvenient overkill.

For example, your site may include a view whose results depend on several expensive queries, the results of which change at different intervals. In this case, the full-page caching that the per-site and per-view strategies offer is not ideal, because you would not want to cache the entire result (some of the data changes often), but you would still want to cache the parts that rarely change.

For cases like this, Django exposes a simple, low-level cache API, which lives in the django.core.cache module. You can use the low-level cache API to store objects in the cache at any level of granularity you like. You can cache any Python object that can be pickled safely: strings, dictionaries, lists of model objects, and so forth. (Read the Python documentation to learn more about pickling.)


Here's how to import this API:

    >>> from django.core.cache import cache

The basic interface is set(key, value, timeout_seconds) and get(key):

    >>> cache.set('my_key', 'hello, world!', 30)
    >>> cache.get('my_key')
    'hello, world!'

The timeout_seconds argument is optional and defaults to the timeout argument in the CACHE_BACKEND setting described earlier.

If the object doesn't exist in the cache, or the cache backend is unreachable, cache.get() returns None:

    # Wait 30 seconds for 'my_key' to expire...
    >>> cache.get('my_key')
    None
    >>> cache.get('some_unset_key')
    None

We advise against storing the literal value None in the cache, because you won't be able to distinguish between your stored None value and a cache miss signified by a return value of None.


cache.get() can take a default argument. This specifies which value to return if the object doesn't exist in the cache:

    >>> cache.get('my_key', 'has expired')
    'has expired'

To retrieve multiple cache values at once, use cache.get_many(). Where the cache backend allows it, get_many() hits the cache only once instead of once per cache key. The dictionary returned by get_many() contains all the keys you asked for that exist in the cache and have not expired.

    >>> cache.set('a', 1)
    >>> cache.set('b', 2)
    >>> cache.set('c', 3)
    >>> cache.get_many(['a', 'b', 'c'])
    {'a': 1, 'b': 2, 'c': 3}

If a cache key doesn't exist or has expired, it is not included in the dictionary. Continuing the example:

    >>> cache.get_many(['a', 'b', 'c', 'd'])
    {'a': 1, 'b': 2, 'c': 3}

Finally, you can delete keys explicitly with cache.delete(). This is an easy way of clearing the cache for a particular object:

    >>> cache.delete('a')

cache.delete() has no return value, and it works the same way whether or not a value with the given cache key exists.
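To tie these calls together, here is a self-contained sketch that mimics the same four methods with an in-memory dict, so it can be run without Django. TinyCache and cached_query are hypothetical names and this is not how any Django backend is implemented. It also shows one way around the None ambiguity mentioned earlier: passing a private sentinel as the default.

```python
import time

_MISSING = object()     # sentinel that can never equal a stored value


class TinyCache:
    """Hypothetical in-memory stand-in mirroring set/get/get_many/delete."""

    def __init__(self, default_timeout=300, clock=time.time):
        self._data = {}             # key -> (expires_at, value)
        self._timeout = default_timeout
        self._clock = clock

    def set(self, key, value, timeout_seconds=None):
        if timeout_seconds is None:
            timeout_seconds = self._timeout
        self._data[key] = (self._clock() + timeout_seconds, value)

    def get(self, key, default=None):
        expires_at, value = self._data.get(key, (0, default))
        if expires_at <= self._clock():
            self._data.pop(key, None)   # drop expired or absent entries
            return default
        return value

    def get_many(self, keys):
        found = {}
        for key in keys:
            value = self.get(key, _MISSING)
            if value is not _MISSING:
                found[key] = value
        return found

    def delete(self, key):
        self._data.pop(key, None)       # same behaviour whether or not key exists


def cached_query(cache, key, compute, timeout_seconds=60):
    """Return the cached value for key, computing and storing it on a miss."""
    value = cache.get(key, _MISSING)
    if value is _MISSING:
        value = compute()
        cache.set(key, value, timeout_seconds)
    return value
```

cached_query() is the pattern a view would use for the slow parts of a page: only the expensive pieces go through the cache, while the rest of the view runs on every request.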


Upstream cache


So far, this chapter has focused on caching your own data. But another type of caching is relevant to Web development, too: caching performed by upstream caches. These are systems that cache pages for users even before the request reaches your Web site.


Here are a few examples of upstream caching:


§ Your ISP (Internet service provider) may cache certain pages, so if you request a page from http://www.infocool.net/, your ISP may send you the page without having to access www.infocool.net directly. The maintainers of www.infocool.net have no knowledge of this caching; the ISP sits between www.infocool.net and your Web browser, handling all of the caching transparently.

§ Your Django Web site may sit behind a proxy cache, such as the Squid Web Proxy Cache (http://www.squid-cache.org/), that caches pages for performance. In this case, each request is first handled by the proxy and is passed to your application only if needed.

§ Your Web browser caches pages, too. If a Web page sends out the appropriate headers, your browser will use the local cached copy for subsequent requests to that page, without even contacting the Web page again to see whether it has changed.

Upstream caching is a nice efficiency boost, but it carries a danger. The content of many Web pages differs based on authentication and a host of other variables, and cache systems that blindly save pages based purely on URLs could expose incorrect or sensitive data to subsequent visitors to those pages.

For example, say you operate a Web e-mail system, and the contents of the inbox page obviously depend on which user is logged in. If an ISP blindly cached your site, the first user who logged in through that ISP would have his or her user-specific inbox page cached for subsequent visitors to the site. That would be a serious problem.

Fortunately, HTTP provides a solution to this problem. A number of HTTP headers exist to instruct upstream caches to vary their cache contents depending on designated variables, and to tell caching mechanisms not to cache particular pages. We look at some of these headers in the sections that follow.


Using Vary headers


The Vary header defines which request headers a cache mechanism should take into account when building its cache key. For example, if the contents of a Web page depend on a user's language preference, the page is said to "vary on language."

By default, Django's cache system creates its cache keys using the requested path (e.g., "/stories/2005/jun/23/bank_robbed/"). This means every request to that URL will use the same cached version, regardless of user-agent differences such as cookies or language preferences. However, if the page produces different content based on some difference in request headers (such as a cookie, a language, or a user-agent), you need to use the Vary header to tell caching mechanisms that the page output depends on those things.


To do this in Django, use the convenient vary_on_headers view decorator, like so:

    from django.views.decorators.vary import vary_on_headers

    # Python 2.3 syntax.
    def my_view(request):
        # ...
    my_view = vary_on_headers(my_view, 'User-Agent')

    # Python 2.4+ decorator syntax.
    @vary_on_headers('User-Agent')
    def my_view(request):
        # ...

In this case, a caching mechanism (such as Django's own cache middleware) will cache a separate version of the page for each unique user-agent.

The advantage of using the vary_on_headers decorator rather than manually setting the Vary header (using something like response['Vary'] = 'user-agent') is that the decorator adds to the Vary header (which may already exist) rather than setting it from scratch and potentially overriding anything that was already there.

You can pass multiple headers to vary_on_headers():

    @vary_on_headers('User-Agent', 'Cookie')
    def my_view(request):
        # ...

This tells upstream caches to vary on both, which means each combination of user-agent and cookie gets its own cache value. For example, a request with the user-agent Mozilla and the cookie value foo=bar is considered different from a request with the user-agent Mozilla and the cookie value foo=ham.
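One way to picture "each combination gets its own cache value" is a cache key built from the path plus the values of the varied headers. The sketch below is an illustration only; cache_key is a hypothetical helper and the key format is an assumption, not the format Django uses.

```python
import hashlib


def cache_key(path, request_headers, vary_on):
    """Build a key from the path plus the values of the headers listed in vary_on."""
    # Header names are case-insensitive, so normalise them before the lookup.
    normalised = {name.lower(): value for name, value in request_headers.items()}
    digest = hashlib.sha256()
    for name in sorted(h.lower() for h in vary_on):
        digest.update(name.encode("utf-8"))
        digest.update(b"=")
        digest.update(normalised.get(name, "").encode("utf-8"))
        digest.update(b";")
    return "%s.%s" % (path, digest.hexdigest())
```

Two requests for the same path with different Cookie values produce different keys when Cookie is listed in vary_on, and the same key when it is not.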


Because varying on cookie is so common, there is a vary_on_cookie decorator. These two views are equivalent:

    @vary_on_cookie
    def my_view(request):
        # ...

    @vary_on_headers('Cookie')
    def my_view(request):
        # ...

The headers you pass to vary_on_headers are not case sensitive; "User-Agent" is the same thing as "user-agent".

You can also use a helper function, django.utils.cache.patch_vary_headers, directly. This function sets, or adds to, the Vary header, for example:

    from django.shortcuts import render_to_response
    from django.utils.cache import patch_vary_headers

    def my_view(request):
        # ...
        response = render_to_response('template_name', context)
        patch_vary_headers(response, ['Cookie'])
        return response

patch_vary_headers takes an HttpResponse instance as its first argument and a list or tuple of case-insensitive header names as its second argument.
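The "adds to rather than overwrites" behavior can be sketched on a plain string. merge_vary is a hypothetical helper written for illustration; it operates on the header value only and is not the body of patch_vary_headers.

```python
def merge_vary(existing, new_headers):
    """Return a Vary value that keeps existing entries and appends new ones once."""
    current = [h.strip() for h in existing.split(",") if h.strip()]
    seen = {h.lower() for h in current}
    for header in new_headers:
        if header.lower() not in seen:
            current.append(header)
            seen.add(header.lower())
    return ", ".join(current)
```

Headers already present, compared case-insensitively, are left untouched, so applying the merge repeatedly does not produce duplicates.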


Other Cache Headers


Other problems with caching are the privacy of data and the question of where data should be stored in a cascade of caches.

A user usually faces two kinds of caches: his or her own browser cache (a private cache) and his or her provider's cache (a public cache). A public cache is used by multiple users and controlled by someone else. This poses problems with sensitive data: you don't want, say, your bank account number stored in a public cache. So Web applications need a way to tell caches which data is private and which is public.

The solution is to indicate that a page's cache should be "private." To do this in Django, use the cache_control view decorator:

    from django.views.decorators.cache import cache_control

    @cache_control(private=True)
    def my_view(request):
        # ...

This decorator takes care of sending out the appropriate HTTP header behind the scenes.


There are other ways to control cache parameters. For example, HTTP allows an application to perform the following actions:


§ Define the maximum time a page should be cached.

§ Specify whether a cache should always check for newer versions, delivering the cached content only when there are no changes. (Some caches might deliver cached content even if the server page has changed, simply because the cached copy has not yet expired.)

In Django, use the cache_control view decorator to specify these cache parameters. In this example, cache_control tells caches to revalidate the cache on every access and to store cached versions for, at most, 3,600 seconds:

    from django.views.decorators.cache import cache_control

    @cache_control(must_revalidate=True, max_age=3600)
    def my_view(request):
        ...

Any valid Cache-Control HTTP directive is valid in cache_control(). Here is the full list:

§ public=True
§ private=True
§ no_cache=True
§ no_transform=True
§ must_revalidate=True
§ proxy_revalidate=True
§ max_age=num_seconds
§ s_maxage=num_seconds
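The mapping from these keyword arguments to a header value can be sketched like this. build_cache_control is a hypothetical helper, and sorting the directives alphabetically is just a choice made to keep the output deterministic; the decorator's real output order may differ.

```python
def build_cache_control(**directives):
    """Turn keyword arguments into a Cache-Control header value."""
    parts = []
    for name, value in sorted(directives.items()):
        token = name.replace("_", "-")
        if value is True:
            parts.append(token)                     # flag directive, e.g. private
        elif value is not False and value is not None:
            parts.append("%s=%s" % (token, value))  # valued directive, e.g. max-age=3600
    return ", ".join(parts)
```

Underscores in the Python keyword names become hyphens in the header, which is why max_age corresponds to max-age.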


Tip

To learn about Cache-Control HTTP directives, see the specification at http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.

Note

The caching middleware already sets the cache header's max-age to the value of the CACHE_MIDDLEWARE_SECONDS setting. If you use a custom max_age in a cache_control decorator, the decorator takes precedence, and the header values are merged correctly.


Other optimizations


Django comes with some other middleware to help you optimize your application's performance:


§ django.middleware.http.ConditionalGetMiddleware adds support for modern browsers to conditionally GET responses based on the ETag and Last-Modified headers.

§ django.middleware.gzip.GZipMiddleware compresses responses for all modern browsers, saving bandwidth and transfer time.


Order of MIDDLEWARE_CLASSES


If you use the cache middleware, it is important to put it in the right place within the MIDDLEWARE_CLASSES setting, because the cache middleware needs to know which headers to vary the cache storage by.

Put CacheMiddleware after any middleware that might add something to the Vary header, including the following:

§ SessionMiddleware, which adds Cookie

§ GZipMiddleware, which adds Accept-Encoding
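A settings fragment that follows this rule might look like the sketch below. The particular selection of middleware is an assumption made for illustration, and cache_is_after is a hypothetical helper for checking the order, not part of Django.

```python
# Hypothetical settings.py fragment following the ordering rule above.
MIDDLEWARE_CLASSES = (
    'django.contrib.sessions.middleware.SessionMiddleware',   # adds Cookie to Vary
    'django.middleware.gzip.GZipMiddleware',                   # adds Accept-Encoding to Vary
    'django.middleware.cache.CacheMiddleware',                 # placed after both
)


def cache_is_after(middleware, *others):
    """Return True when CacheMiddleware is listed after every name in others."""
    cache_index = middleware.index('django.middleware.cache.CacheMiddleware')
    return all(middleware.index(name) < cache_index for name in others)
```

A check such as cache_is_after() could be run in a test suite to catch an accidental reordering of the setting.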
