Go deep into Django cache framework

Source: Internet
Author: User
Tags delete cache

The basic function of a dynamic website is: yes, it is dynamic. each time a user requests a page, the web server performs comprehensive computing-from database query to rendering business logic-until the final displayed page is generated. from the perspective of server load, this is far more than reading a file from a file system.

This load is not a big problem for most website applications. most website applications are not as busy as mongoingtonpost.com or slashdot.org; they are usually small-and medium-sized, with no particularly high click rates. however, for large-scale sites with high click rates, the load on Web servers should be minimized.

This is why the cache mechanism is needed.

The purpose of setting up a cache mechanism is to omit the calculation steps (when the cache is accessed again) when appropriate. The following uses a pseudo code to explain how the mechanism works:

Based on the URL request, check whether the page is cached. If the page exists in the cache: Return to the page; otherwise, generate the page and save the page in the cache (for future access) return to the generated page

Django comes with an efficient and practical cache system that allows you to save dynamic pages, so that you do not have to generate new pages through computation every time you request them. for ease of use, Django provides a cache mode at different levels: You can only cache the output of specified views, or you can only cache fragments with high computing workload, as long as you want, you can also cache your entire site.

Django works well with "upstream" caching, just like squid (http://www.squid-cache.org/) and browser-based caching. you cannot directly control these caching methods, but you can use HTTP headers to tell the website what needs to be cached and how to cache them.

Set Cache

The cache system can work only when it is not complex. that is to say, you must tell it where your cache data is stored-or database, or file system, or simply put it in the memory. this option has a significant impact on the cache performance. yes, some cache types are far faster than others!

By settingCache_backendSet to determine the cache type. The following describes all the optional values of this option:

Memcached

This is the most efficient way in the Django cache system. memcached is a fully memory-based Cache framework. It was initially developed to handle the high traffic of livejournal.com and was subsequently open-source by danga interactive. it is usually used for sites such as Slashdot and Wikipedia to reduce database retrieval, which greatly improves the performance of the site.

Memcached can be obtained free of charge through the http://danga.com/memcached/. It operates in the form of a background monitor, using the memory of the specified capacity allocated to it. What it does is provide an interface-Super-lightning-fastInterface-add, obtain, and delete cache data. All data is stored in the memory without occupying any file system or database resources.

After installing memcached, you need to install the python binding of memcached. it is an independent Python module, memcache. PY, can be obtained through ftp://ftp.tummy.com/pub/python-memcached. if the URL is inaccessible, go directly to the memcached web site (http://www.danga.com/memcached/) and get its Python binding through the "client APIs" section.

To use memcached in Django, SetCache_backendIsMemcached: // ip: Port/, HereIPIs the IP address of the memcached daemon, andPortIsMemcachedThe port used by the process.

In the following example, memcached runs on port 11211 of localhost (127.0.0.1:

CACHE_BACKEND = 'memcached://127.0.0.1:11211/'

One of the outstanding features of memcached is its ability to share the cache among multiple servers. To use this advanced featureCache_backendThe IP addresses of multiple servers are separated by semicolons. The following example shares the memcached instances running on 172.19.26.240 and 172.19.26.242. Both instances use port 11211:

CACHE_BACKEND = 'memcached://172.19.26.240:11211;172.19.26.242:11211/'

The memory-based Cache Mechanism has one drawback: Because the cached data is stored in the memory, data will be lost when the server crashes. everyone knows that the memory is not suitable for storing permanent data, so you should not rely on the memory-based Cache as your only data storage method. in fact, there is no Django cache backend suitable for storing permanent data-they exist for the purpose of caching, instead of storing data forever -- we point this out here because the memory-based Cache is more temporary than the other method.

Database cache

To use a database for your cache backend, first create a cache table in your database. Run the following command:

python manage.py createcachetable [cache_table_name]

...[Cache_table_name]Is the name of the cache table to be created in the database. (You can use any name, as long as it does not duplicate the name of an existing table ). this command creates a cache table with an appropriate structure in the database, and Django then uses it to cache your data.

After creating a cache table, SetCache_backendIs"DB: // tablename /", HereTablenameThe name of the cache table. In the following example, the name of the cache table isMy_cache_table:

Cache_backend = 'db: // my_cache_table'

As long as you have a fast, well-indexed database server, the database cache will work very well.

File System Cache

To save the cached data in the file systemCache_backendUsed in"File ://"Cache type. The following example stores the cached data in/Var/tmp/django_cache, Set as follows:

CACHE_BACKEND = 'file:///var/tmp/django_cache'

Note:File:It is followed by three diagonal lines instead of two. The first two areFile ://Protocol. The third is the path./Var/tmp/django_cacheThe first character.

This directory path must be an absolute path-that is, it should start from the root of your file system. As for whether to add a diagonal line at the end of the path, Django does not care.

Make sure that the specified path exists and the Web server can read and write data. In the above example, if your server usesApacheRun as user identity to Ensure Directory/Var/tmp/django_cacheExisting and accessible to usersApacheRead/write.

Local Memory Cache

If you want the high performance of the memory cache but no conditions for running memcached, you can use the local memory cache backend. The cache backend is multi-process and thread-safe. You need to use it and setCache_backendIs"Locmem :///". For example:

CACHE_BACKEND = 'locmem:///'
Simple cache (for development)

"Simple :///"It is a simple memory cache type for a single process. It only saves cached data in the process, which means it is only applicable to development or test environments. Example:

CACHE_BACKEND = 'simple:///'
Virtual cache (for development)

At last, Django also supports a "Dummy" cache (in fact it is not cached)-it only implements the cache interface but does not actually do anything.

This is useful if you have a production site that uses heavy-duty caching in varous places but a development/test environment on which you don't want to cache. In that case, setCache_backendTo"Dummy :///"In the settings file for your development environment. As a result, your development environment won't use caching and your production environment still will.

Cache_backend Parameter

Parameters are accepted for all cache types. parameters are provided in a similar way to query string styles. All valid parameters are listed below:

Timeout
The default cache validity period, in seconds. The default value is 300 seconds (five minutes ).
Max_entries
Used Simple CacheAnd Database cacheThe maximum number of cached entries in the backend. The old cache is cleared. The default value is 300 ).
Cull_percentage

The ratio of items to be retained when the maximum number of items in the cache is reached. the actual value saved is 1/cull_percentage. Therefore, setting cull_percentage = 3 will save the selected 1/3 entries, and the remaining entries will be deleted.

If the value of cull_percentage is set to 0, the entire cache is cleared when the maximum number of cached entries is reached. If the cache hit rate is lowExtremely LargeImproves the efficiency of cache entries (not selected at all ).

In this example,TimeoutSet60:

CACHE_BACKEND = "memcached://127.0.0.1:11211/?timeout=60"

In this example,TimeoutSet30WhileMax_entriesSet400:

CACHE_BACKEND = "memcached://127.0.0.1:11211/?timeout=30&max_entries=400"

Invalid parameters are ignored.

Cache the entire site

After the cache type is set, the easiest way to use the cache is to cache the entire site. Add it in the ''middleware _ classes ''setting.Django. Middleware. cache. cachemiddlewareAs in the following example:

MIDDLEWARE_CLASSES = (    "django.middleware.cache.CacheMiddleware",    "django.middleware.common.CommonMiddleware",)

(Middleware_classesOrder. See "order of middleware_classes" below ")

Then, add the following settings to the Django settings file:

  • Cache_middleware_seconds-- The cache time of each page.
  • Cache_middleware_key_prefix-- If the cache is shared by multiple sites installed in the same Django, set the site name or some other unique strings here to avoid key conflicts. if you don't mind, you can also use an empty string.

Cache middleware caches every page without the get/post parameters. In addition,CachemiddlewareAutomaticallyHttpresponseSet some headers:

  • When requesting an Uncached page, SetLast-modifiedHeader to the current date and time.
  • SetExpiresHeader to the current date and time plus the definedCache_middleware_seconds.
  • SetCache-controlHeader indicates the lifetime of the page-the lifetime also comes fromCache_middleware_secondsSet.

Refer to middleware documentation to learn more about middleware.

Cache a single view

Django can cache only specific pages.Django. Views. decorators. CacheDefinesCache_pageModifier, which can automatically cache the response of the view. This modifier is extremely simple to use:

from django.views.decorators.cache import cache_pagedef slashdot_this(request):    ...slashdot_this = cache_page(slashdot_this, 60 * 15)

Alternatively, use the modifier Syntax of Python 2.4:

@cache_page(60 * 15)def slashdot_this(request):    ...

Cache_pageOnly one parameter is accepted: the cache validity period, in seconds. In the preceding example,Slashdot_this ()View will be cached for 15 minutes.

Underlying cache API

In some cases, the cache of a complete page does not meet your requirements. for example, you think it is necessary to cache the results only for some high-intensity queries. to achieve this goal, you can use the underlying cache API to save objects to the cache system at any level.

The cache API is simple. The slave cache ModuleDjango. Core. CacheExportCache_backendSet automatically generatedCacheObject:

>>> from django.core.cache import cache

The basic interface isSet (key, value, timeout_seconds)AndGet (key):

>>> cache.set('my_key', 'hello, world!', 30)>>> cache.get('my_key')'hello, world!'

Timeout_secondsThe parameter is optional and its default value is equalCache_backendOfTimeoutParameters.

If this object does not exist in the cache,Cache. Get ()ReturnNone:

>>> cache.get('some_other_key')None# Wait 30 seconds for 'my_key' to expire...>>> cache.get('my_key')None

Get () can acceptDefaultParameters:

>>> cache.get('my_key', 'has expired')'has expired'

Of course, there is also a get_keys () interface, which only hits the cache once. get_keys () returns a dictionary, including all the keys actually present in your request that have not expired .:

>>> cache.set('a', 1)>>> cache.set('b', 2)>>> cache.set('c', 3)>>> cache.get_many(['a', 'b', 'c']){'a': 1, 'b': 2, 'c': 3}

Finally, you can useDelete ()Explicit deletion key. This is an easy way to clear a specific object in the cache:

>>> cache.delete('a')

This is the case. The cache mechanism has very few restrictions: You can safely cache arbitrary objects that can be pickled (The key must be a string ).

Upstream caches

Till now, our eyes are only focused onYour ownData. There is another type of cache in Web development: "upstream" cache. This is the cache implemented by the browser when the user request has not arrived at your site.

Below are several examples of upstream cache:

  • Your ISP will cache a specific page. When you request a page in somedomain.com, your ISP will directly send you a cache page (not accessing somedomain.com ).
  • Your Django web site may be built on a squid (http://www.squid-cache.org/) Web Proxy Server that caches pages to improve performance. in this case, each request is first processed by squid, and it will pass the request to your application only when necessary.
  • Your browser also caches some pages. If a Web page sends the correct headers, the browser uses a local (cached) copy to respond to the requests sent to the same page.

Upstream cache is a very effective promotion, but it also has considerable shortcomings: many web pages are based on authorization and a bunch of variables, and this cache system blindly relies solely on URL cache pages, this may expose sensitive information to inappropriate users.

For example, if you use a web e-mail system, the content on the "inbox" Page obviously depends on the current login user. if an ISP blindly caches your site, then users will see the inbox of the previous user. This is not an interesting thing.

Fortunately, HTTP provides a solution to this problem: a series of HTTP headers are used to build a cache mechanism to differentiate the cached content, so that the cache system will not cache certain pages.

Use vary Headers

One of the headers isVary. It defines the request headers for the cache mechanism when the cache key is created. for example, if the content of a webpage depends on the language settings of a user, the webpage is notified of "vary on language."

By default, Django's cache system uses the Request Path to create a cache key-for example,& Quot;/stories/2005/Jun/23/bank_robbed/& quot /". This means that each request of the URL uses the same Cache version, regardless of the user proxy (cookies and language features ).

Therefore, we needVary.

If your Django-based page outputs different content based on different request headers-such as a cookie, language, or user agent-you will need to useVaryTo tell the cache system that the page output depends on these items.

To do this in Django, useVary_on_headersThe view modifier is like the following:

from django.views.decorators.vary import vary_on_headers# Python 2.3 syntax.def my_view(request):    ...my_view = vary_on_headers(my_view, 'User-Agent')# Python 2.4 decorator syntax.@vary_on_headers('User-Agent')def my_view(request):    ...

In this way, the cache system (such as Django's own cache middleware) caches pages of different versions for different user proxies.

UseVary_on_headersThe modifier has the following advantages:VaryHeader: similarResponse ['vary '] = 'user-agent') Modifier will be addedVaryHeader (may already exist) instead of overwriting it.

You can also pass multiple headersVary_on_headers ():

@vary_on_headers('User-Agent', 'Cookie')def my_view(request):    ...

Because multiple cookies are quite common, there isVary_on_cookieModifier. The following two views are equivalent:

@vary_on_cookiedef my_view(request):    ...@vary_on_headers('Cookie')def my_view(request):    ...

Note thatVary_on_headersIs case insensitive."User-Agent"And"User-Agent"Identical.

You can also directly use a help function,Django. utils. cache. patch_vary_headers:

from django.utils.cache import patch_vary_headersdef my_view(request):    ...    response = render_to_response('template_name', context)    patch_vary_headers(response, ['Cookie'])    return response

Patch_vary_headersAccept oneHttpresponseThe instance is the first parameter and the list of header names or tuple is the second parameter.

For more information about vary headers, see Official vary spec.

Control cache: use other Headers

Another problem with caching is data privacy and where data is stored in the waterfall cache mode.

Users often have two types of cache: their own browser cache (Private cache) and the cache provided by the site (Public cache ). A public cache is used for multiple users, and its content is controlled by another user. this causes Data Privacy: You certainly do not want your bank account to be stored in the public cache. therefore, applications need to tell the cache system what is private and what is public.

The solution is to declare that the cache of a page is "private". In Django, useCache_controlView modifier. Example:

from django.views.decorators.cache import cache_control@cache_control(private=True)def my_view(request):    ...

This modifier will carefully send appropriate HTTP headers behind the scenes to avoid the above problems.

There are other ways to control cache parameters. For example, HTTP allows applications to do the following:

  • Defines the maximum time for a page to be cached.
  • Specify whether the cache always needs to check the new version. If there is no change, only the Cache version is transferred. (Some caches only pass the Cache version even if the server page changes-just because the cache copy has not expired ).

In Django, useCache_controlThe view modifier specifies the cache parameters. In this exampleCache_controlNotify the cache to check the Cache version every time until the expiration of the 3600 s:

from django.views.decorators.cache import cache_control@cache_control(must_revalidate=True, max_age=3600)def my_view(request):    ...

All validCache-controlThe HTTP command isCache_control ()Is valid. The following is a complete command list:

  • Public = true
  • Private = true
  • No_cache = true
  • No_transform = true
  • Must_revalidate = true
  • Proxy_revalidate = true
  • Max_age = num_seconds
  • S_maxage = num_seconds

For details about the cache-control HTTP command, see cache-control spec.

(Note that the cache middleware has been setCache_middleware_settingsSet the max-age of the cache header.Cache_controlThe modifier uses the customMax_age, The settings in the modifier will be used first, and the header values will be correctly merged)

Other Optimizations

Django comes with some middleware to help you improve site performance:

  • Django. Middleware. http. conditionalgetmiddlewareAdded conditional get support. It usesEtagAndLast-modifiedHeaders.
  • Django.middleware.gzip. gzipmiddlewareTo compress the sent content in a gzip-enabled browser (all popular browsers support this ).
Middleware_classes Sequence

If you useCachemiddleware, InMiddleware_classesIt is very important to use the correct sequence in the settings. Because the cache middleware needs to know which headers are stored in the cache, the middleware always adds somethingVaryResponse header.

SetCachemiddlewarePut some other thingsVaryAfter the middleware of the header, the following middleware will add thingsVaryHeader:

  • SessionmiddlewareAddedCookie
  • GzipmiddlewareAddedAccept-Encoding
Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.