This article describes how to use the sitemap framework in Django, a popular Python web development framework. A sitemap is an XML file on your server that tells search engines how frequently your pages change and how important certain pages are relative to the other pages on your site. This information helps search engines index your website.
For example, here is part of the sitemap for the Django website (http://www.djangoproject.com/sitemap.xml):
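<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <url>
        <loc>http://www.djangoproject.com/documentation/</loc>
        <changefreq>weekly</changefreq>
        <priority>0.5</priority>
    </url>
    <url>
        <loc>http://www.djangoproject.com/documentation/0_90/</loc>
        <changefreq>never</changefreq>
        <priority>0.1</priority>
    </url>
    ...
</urlset>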
For more information about sitemaps, see http://www.sitemaps.org/.
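To make the file format concrete, here is a minimal sketch that builds the same kind of document by hand with Python's standard library. It does not use Django at all; the helper name build_sitemap and the dictionary keys are assumptions made for illustration, and the element names come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML (as a string) for a list of entry dicts.

    Each entry needs a "loc" key; "changefreq" and "priority" are optional.
    (Hypothetical helper, not part of Django.)
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        for tag in ("loc", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    return ET.tostring(urlset, encoding="unicode")


xml_text = build_sitemap([
    {"loc": "http://www.djangoproject.com/documentation/",
     "changefreq": "weekly", "priority": 0.5},
    {"loc": "http://www.djangoproject.com/documentation/0_90/",
     "changefreq": "never", "priority": 0.1},
])
print(xml_text)
```

The sitemap framework exists so that you do not have to write this kind of code yourself.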
The Django sitemap framework automates the creation of this XML file by letting you express this information in Python code. To create a sitemap, you only need to write a Sitemap class and point to it in your URLconf.
Install
To install the sitemap application, follow these steps:
- Add 'django.contrib.sitemaps' to your INSTALLED_APPS setting.
- Make sure that 'django.template.loaders.app_directories.load_template_source' is in your TEMPLATE_LOADERS setting. It is there by default, so you only need to restore it if you have changed that setting.
- Make sure that you have installed the sites framework.
Note
The sitemap application does not install any database tables. The only reason it needs to go into INSTALLED_APPS is so that the load_template_source template loader can find the default templates.
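The installation steps could look roughly like the following settings fragment. This is a sketch for an older Django release that still has the TEMPLATE_LOADERS setting named above; the SITE_ID value and the placeholder comment for your other apps are assumptions.

```python
# settings.py (fragment); assumes an older Django release that still uses
# the TEMPLATE_LOADERS setting described above.

INSTALLED_APPS = (
    'django.contrib.sites',       # the sites framework
    'django.contrib.sitemaps',    # the sitemap application
    # ... your other apps ...
)

SITE_ID = 1  # assumed ID of the current Site record

TEMPLATE_LOADERS = (
    'django.template.loaders.filesystem.load_template_source',
    # Needed so the default sitemap templates can be found:
    'django.template.loaders.app_directories.load_template_source',
)
```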
Initialization
To activate sitemap generation on your Django site, add this line in your URLconf:
(r'^sitemap\.xml$', 'django.contrib.sitemaps.views.sitemap', {'sitemaps': sitemaps})
This line tells Django to build a sitemap when a client accesses /sitemap.xml. Note that the dot character in sitemap.xml is escaped with a backslash, because dots have a special meaning in regular expressions.
The name of the sitemap file does not matter, but its location on the server does. Search engines only index the links in your sitemap that are at the sitemap's own URL level or below. For instance, if sitemap.xml lives in your root directory, it may reference any URL on your site. However, if your sitemap lives at /content/sitemap.xml, it may only reference URLs that begin with /content/.
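The location rule can be illustrated with a small, framework-independent check. The function name in_sitemap_scope and the example.com URLs are hypothetical; this is only a sketch of the rule as stated above, not code that Django or any search engine runs.

```python
import posixpath
from urllib.parse import urlsplit


def in_sitemap_scope(sitemap_url, page_url):
    """Return True if page_url is at or below the sitemap's directory."""
    sitemap = urlsplit(sitemap_url)
    page = urlsplit(page_url)
    # A sitemap only covers URLs on its own scheme and host.
    if (sitemap.scheme, sitemap.netloc) != (page.scheme, page.netloc):
        return False
    # Directory that holds the sitemap file, always ending with "/".
    base = posixpath.dirname(sitemap.path).rstrip("/") + "/"
    return page.path.startswith(base)


# A root-level sitemap may reference any URL on the site.
print(in_sitemap_scope("http://example.com/sitemap.xml",
                       "http://example.com/about/"))
# A sitemap under /content/ may only reference URLs below /content/.
print(in_sitemap_scope("http://example.com/content/sitemap.xml",
                       "http://example.com/about/"))
```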
The sitemap view takes an extra, required argument: {'sitemaps': sitemaps}. sitemaps should be a dictionary that maps a short section label (e.g., blog or news) to its Sitemap class (e.g., BlogSitemap or NewsSitemap). It may also map to an instance of a Sitemap class (e.g., BlogSitemap(some_var)).
Sitemap class
A Sitemap class is a simple Python class that represents a section of entries in your sitemap. For example, one Sitemap class could represent all the entries of your weblog, while another could represent all of the events in your events calendar.
In the simplest case, all of these sections are combined into a single sitemap.xml, but you can also use the framework to generate a sitemap index that references a separate sitemap file for each section.
Sitemap classes must subclass django.contrib.sitemaps.Sitemap. They can live anywhere in your code tree.
For example, suppose you have a blog system with an Entry model, and you want your sitemap to include all the links to your individual blog entries. Your Sitemap class might look like this:
from django.contrib.sitemaps import Sitemap
from mysite.blog.models import Entry

class BlogSitemap(Sitemap):
    changefreq = "never"
    priority = 0.5

    def items(self):
        return Entry.objects.filter(is_draft=False)

    def lastmod(self, obj):
        return obj.pub_date
Declaring a Sitemap looks a lot like declaring a Feed; that is by design.
As with Feed classes, Sitemap members can be either methods or attributes.
A Sitemap class can define the following methods/attributes:
items (required): Provides a list of objects. The framework does not care what type of objects they are; all that matters is that these objects get passed to the location(), lastmod(), changefreq(), and priority() methods.
location (optional): Gives the absolute URL for a given object. Here, "absolute URL" means a URL that does not include the protocol or domain. Here are some examples:
- Good: '/foo/bar/'
- Bad: 'example.com/foo/bar/'
- Bad: 'http://example.com/foo/bar/'
If location is not provided, the framework calls the get_absolute_url() method on each object returned by items().
lastmod (optional): The object's last modification date, as a Python datetime object.
changefreq (optional): How often the object changes. Possible values are:
- 'always'
- 'hourly'
- 'daily'
- 'weekly'
- 'monthly'
- 'yearly'
- 'never'
priority (optional): A suggested indexing priority between 0.0 and 1.0.
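The "methods or attributes" idea can be sketched without Django. The classes below are stand-ins written for illustration (SketchSitemap is not django.contrib.sitemaps.Sitemap, and Page is a hypothetical object type): a member is called with the object if it is a method and used as-is if it is a plain attribute, and location falls back to get_absolute_url() as described above.

```python
import datetime


class SketchSitemap:
    """Stand-in for a Sitemap class; NOT django.contrib.sitemaps.Sitemap."""

    def resolve(self, name, obj, default=None):
        # Call the member with the object if it is a method; otherwise
        # use the plain attribute value for every object.
        member = getattr(self, name, default)
        return member(obj) if callable(member) else member

    def location(self, obj):
        # Default described above: fall back to get_absolute_url().
        return obj.get_absolute_url()

    def describe(self, obj):
        names = ("location", "lastmod", "changefreq", "priority")
        return {name: self.resolve(name, obj) for name in names}


class Page:
    """Hypothetical object type that items() might return."""

    def __init__(self, slug, updated):
        self.slug = slug
        self.updated = updated

    def get_absolute_url(self):
        return "/pages/%s/" % self.slug


class PageSitemap(SketchSitemap):
    changefreq = "monthly"   # attribute: same value for all objects
    priority = 0.5

    def lastmod(self, obj):  # method: computed per object
        return obj.updated


entry = PageSitemap().describe(Page("about", datetime.date(2009, 1, 15)))
print(entry)
```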
Shortcuts
The sitemap framework provides a couple of convenience classes for common cases. These are described in the sections that follow.
FlatPageSitemap
The django.contrib.sitemaps.FlatPageSitemap class looks at all the flat pages defined for the current site and creates an entry in the sitemap for each one. These entries include only the location attribute; lastmod, changefreq, and priority are not supported.
GenericSitemap
GenericSitemap works with any generic views (see Chapter 9) you already have.
To use it, create an instance, passing in the same info_dict that you pass to the generic views. The only requirement is that the dictionary has a queryset entry. It may also have a date_field entry that names a date field on the objects retrieved from the queryset; this is used for the lastmod attribute in the generated sitemap.
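As a rough illustration of how queryset and date_field relate to items() and lastmod, here is a stand-in class in plain Python. It is not Django's GenericSitemap implementation; GenericSitemapSketch and FakeEntry are hypothetical names, and a plain list plays the role of the queryset.

```python
import datetime


class GenericSitemapSketch:
    """Stand-in showing the idea; NOT Django's GenericSitemap."""

    def __init__(self, info_dict, priority=None, changefreq=None):
        self.queryset = info_dict['queryset']           # required entry
        self.date_field = info_dict.get('date_field')   # optional entry
        self.priority = priority
        self.changefreq = changefreq

    def items(self):
        return list(self.queryset)

    def lastmod(self, item):
        # Without a date_field there is no modification date to report.
        if self.date_field is None:
            return None
        return getattr(item, self.date_field)


class FakeEntry:
    """Hypothetical stand-in for the Entry model."""

    def __init__(self, pub_date):
        self.pub_date = pub_date


entries = [FakeEntry(datetime.date(2009, 3, 1))]  # plays the role of a queryset
with_dates = GenericSitemapSketch(
    {'queryset': entries, 'date_field': 'pub_date'}, priority=0.6)
without_dates = GenericSitemapSketch({'queryset': entries})

print(with_dates.lastmod(entries[0]))
print(without_dates.lastmod(entries[0]))
```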
Below is a URLconf that uses both FlatPageSitemap and GenericSitemap (with the hypothetical Entry object from earlier):
from django.conf.urls.defaults import *
from django.contrib.sitemaps import FlatPageSitemap, GenericSitemap
from mysite.blog.models import Entry

info_dict = {
    'queryset': Entry.objects.all(),
    'date_field': 'pub_date',
}

sitemaps = {
    'flatpages': FlatPageSitemap,
    'blog': GenericSitemap(info_dict, priority=0.6),
}

urlpatterns = patterns('',
    # some generic view using info_dict
    # ...

    # the sitemap
    (r'^sitemap\.xml$',
     'django.contrib.sitemaps.views.sitemap',
     {'sitemaps': sitemaps})
)
Create a Sitemap index
The sitemap framework can also create a sitemap index that references individual sitemap files, one for each section defined in your sitemaps dictionary. The only differences in usage are as follows:
- You use two views in your URLconf: django.contrib.sitemaps.views.index and django.contrib.sitemaps.views.sitemap.
- The django.contrib.sitemaps.views.sitemap view takes a section keyword argument.
For the previous example, the relevant URLconf lines would look like this:
(r'^sitemap\.xml$', 'django.contrib.sitemaps.views.index', {'sitemaps': sitemaps}),
(r'^sitemap-(?P<section>.+)\.xml$', 'django.contrib.sitemaps.views.sitemap', {'sitemaps': sitemaps})
This automatically generates a sitemap.xml file that references both sitemap-flatpages.xml and sitemap-blog.xml. The Sitemap classes and the sitemaps dictionary do not change at all.
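For reference, a sitemap index is itself a small XML document. The sketch below builds one by hand with the standard library, using the sitemapindex format from the sitemaps.org protocol; the helper name build_sitemap_index and the example.com host are assumptions, and the framework's actual output may differ in details.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(base_url, sections):
    """Return index XML that points at one sitemap file per section.

    (Hypothetical helper, not part of Django.)
    """
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for section in sections:
        sitemap = ET.SubElement(index, "sitemap")
        loc = "%s/sitemap-%s.xml" % (base_url.rstrip("/"), section)
        ET.SubElement(sitemap, "loc").text = loc
    return ET.tostring(index, encoding="unicode")


# One <sitemap> entry per key of the sitemaps dictionary.
index_xml = build_sitemap_index("http://example.com", ["flatpages", "blog"])
print(index_xml)
```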
Notify Google
When your sitemap changes, you may want to notify Google so that it knows to reindex your site. The framework provides a function to do this: django.contrib.sitemaps.ping_google().
ping_google() takes an optional argument, sitemap_url, which should be the absolute URL of your site's sitemap.
If your sitemap URL cannot be determined, ping_google() raises the django.contrib.sitemaps.SitemapNotFound exception.
One way to call ping_google() is from a model's save() method:
from django.contrib.sitemaps import ping_google
from django.db import models

class Entry(models.Model):
    # ...
    def save(self, *args, **kwargs):
        super(Entry, self).save(*args, **kwargs)
        try:
            ping_google()
        except Exception:
            # Broad 'except' because we could get a variety
            # of HTTP-related exceptions.
            pass
A more efficient solution, however, is to call ping_google() from a cron script or some other scheduled task. The function makes an HTTP request to Google's servers, so you may not want to introduce that network overhead each time you call save().
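One possible shape for the scheduled approach is sketched below: save() only records that something changed, and a scheduled job sends at most one ping per interval. PingScheduler, its parameters, and the one-hour interval are all hypothetical. The ping argument would be ping_google in a Django project, and in a real deployment the changed flag would have to be stored somewhere persistent (for example, a database row), because each cron run is a separate process.

```python
import time


class PingScheduler:
    """Collect 'sitemap changed' notices and ping at most once per interval."""

    def __init__(self, ping, min_interval=3600, clock=time.time):
        self.ping = ping                  # e.g. ping_google in a Django project
        self.min_interval = min_interval  # seconds between pings (assumed value)
        self.clock = clock
        self.dirty = False
        self.last_ping = None

    def mark_changed(self):
        # Cheap: meant to be called from save(); no network traffic.
        self.dirty = True

    def run(self):
        # Meant to be called by a scheduled task.
        # Returns True if a ping was sent.
        now = self.clock()
        too_soon = (self.last_ping is not None
                    and now - self.last_ping < self.min_interval)
        if not self.dirty or too_soon:
            return False
        self.ping()
        self.dirty = False
        self.last_ping = now
        return True


# Usage with a fake ping function, so nothing is sent over the network.
calls = []
scheduler = PingScheduler(ping=lambda: calls.append("ping"), clock=lambda: 0)
scheduler.mark_changed()
scheduler.mark_changed()       # several saves, still only one ping
sent = scheduler.run()
sent_again = scheduler.run()   # nothing changed since the last ping
```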
Finally, if 'django.contrib.sitemaps' is in your INSTALLED_APPS, then your manage.py will include a new command, ping_google. This is useful for pinging from the command line. For example:
python manage.py ping_google /sitemap.xml