Configure static files such as robots.txt and sitemap.xml for a Django + Apache website.

Source: Internet
Author: User
Tags: django, website

Robots metadata and sitemap.xml

 

Let's take a look at robots.txt and sitemap.xml. The following excerpts are from Wikipedia:

http://zh.wikipedia.org/zh-cn/Sitemap
http://zh.wikipedia.org/zh/Robots.txt

 

Robots.txt

 

From Wikipedia, the free encyclopedia

robots.txt (uniformly lowercase) is an ASCII text file stored in the root directory of a website. It tells web search engine crawlers (also known as web spiders) which content on the site may be retrieved and which may not. Because URLs are case-sensitive on some systems, the filename should be kept lowercase. robots.txt must be placed in the site's root directory; to define crawler behavior for a subdirectory, either merge those settings into the root robots.txt or use robots metadata.

The robots.txt protocol is not a standard but a convention, so it cannot guarantee a site's privacy. robots.txt decides whether a URL may be fetched by string comparison, so a directory URL with and without the trailing slash "/" are treated as different URLs. Some crawlers also recognize wildcard patterns such as "Disallow: *.gif".
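To make the string-comparison behavior concrete, here is a small sketch using Python's standard-library robots.txt parser. The rules and URLs below are hypothetical examples, not taken from any real site:

```python
# Sketch: how a crawler interprets robots.txt rules by prefix matching,
# using Python's standard-library parser. Rules here are hypothetical.
from urllib.robotparser import RobotFileParser

rules = """
User-agent: *
Disallow: /private/
Disallow: /tmp
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Anything under the /private/ prefix is blocked...
print(parser.can_fetch("*", "http://example.com/private/page.html"))  # False
# ...but "/private" without the trailing slash is a different URL prefix.
print(parser.can_fetch("*", "http://example.com/private"))            # True
# Prefix match: "/tmp" also blocks "/tmp/file".
print(parser.can_fetch("*", "http://example.com/tmp/file"))           # False
```

This illustrates why the convention treats URLs with and without a trailing slash as distinct: the comparison is a plain prefix match, not a path-aware one.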

Other methods that affect search engine behavior include using robots metadata:

<meta name="robots" content="noindex,nofollow" />

This protocol is also a convention rather than a standard. Most search engines recognize this metadata and will neither index the page nor follow the links on it.

 

 

XML sitemaps

Sitemaps is a protocol by which a site administrator tells search engine crawlers which pages on the site are available for crawling. A sitemap file must follow the XML format defined by the protocol. Each URL entry can include the page's update frequency, last modification time, and its priority within the site, which helps search engines crawl the site's content more effectively.
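As a sketch of what such a file contains, the following Python snippet builds a minimal sitemap.xml with the standard library; the URLs, dates, change frequencies, and priorities are made-up examples:

```python
# Sketch: build a minimal sitemap.xml with the standard library.
# The URLs, dates, frequencies, and priorities are hypothetical.
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(entries):
    """entries: iterable of (loc, lastmod, changefreq, priority) tuples."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod, changefreq, priority in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
        ET.SubElement(url, "changefreq").text = changefreq
        ET.SubElement(url, "priority").text = priority
    return ET.tostring(urlset, encoding="unicode")

sitemap = build_sitemap([
    ("http://www.souapp.com/", "2010-01-01", "daily", "1.0"),
    ("http://www.souapp.com/about/", "2010-01-01", "monthly", "0.5"),
])
```

Writing the returned string to a file (with an XML declaration prepended) gives a sitemap.xml ready to place in the site's static directory.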

Google introduced Google Sitemaps so that web developers can publish a list of links for an entire site. The basic premise is that some sites have a large number of dynamic pages that are reachable only through forms or user login; a sitemap file can guide web spiders to such pages. Google, MSN, Yahoo!, and Ask all support the sitemap protocol.

Because MSN, Yahoo!, Ask, and Google use the same protocol, a single sitemap keeps your page information up to date with the four largest search engines. A sitemap does not guarantee that every link will be crawled, nor that crawled links will be indexed, but it is still the safest way to give a search engine information about your entire site.

 

 

Our website (www.souapp.com, a search application) runs on Django + Apache. Using it as an example, let's take on the task of submitting robots.txt and sitemap.xml to Google.

 

1. First, log on to the Google Webmaster Tools page:

https://www.google.com/webmasters/tools/home?hl=zh-CN

 

 

Add www.souapp.com and verify that you are the website owner.

 

2. "Crawler access" under the "Site configuration" column shows the detailed configuration of robots.txt.

By default, the robots.txt file is expected at http://www.souapp.com/robots.txt. You can also define the URL manually.

 

 

3. "Sitemaps" in the "Site configuration" section shows the detailed configuration of sitemap.xml.

By default, the sitemap.xml file is expected at /sitemap.xml. You can also customize the URL; my custom setting is /media_alias/sitemap.xml.

 

 

================================================================

Next, we will explain how to configure the paths of robots.txt and sitemap.xml in the root directory of the website:

1. Static file (JS, image, and CSS) path configuration for the Django website:

I put all the CSS, JS, JPG, PNG, and TXT files used by the website under the site's media directory. The following configuration is required before they can be referenced on a page.

Add in settings.py:

# Set the static file path
STATIC_PATH = '/var/www/media/'

Configure in urls.py:

from django.conf import settings

In urlpatterns, add:

(r'^media_alias/(?P<path>.*)$', 'django.views.static.serve', {'document_root': settings.STATIC_PATH}),

Finally, we can reference the files in an HTML page:

<link rel="stylesheet" type="text/css" href="/media_alias/common.css" />
<script type="text/javascript" src="/media_alias/jquery.js"></script>
<img src="/media_alias/souapp.png" />
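To see what the urls.py pattern above actually captures, here is a small sketch with the standard re module; the file names are hypothetical. Django strips the leading slash before matching, and django.views.static.serve then joins the captured path onto document_root (roughly as shown here; the real view also sanitizes the path):

```python
# Sketch: what the urls.py pattern captures. Django strips the leading "/"
# before matching, so the pattern sees "media_alias/common.css".
import posixpath
import re

pattern = re.compile(r"^media_alias/(?P<path>.*)$")

match = pattern.match("media_alias/common.css")
relative = match.group("path")  # the part after the prefix: "common.css"

# static.serve joins this (after sanitizing) onto document_root (STATIC_PATH):
full_path = posixpath.join("/var/www/media", relative)
print(full_path)  # /var/www/media/common.css
```

So a request for /media_alias/common.css ends up reading /var/www/media/common.css from disk, which is why STATIC_PATH must point at the directory that actually holds the files.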

 

2. Apache static file path configuration:

 

Note: You need to install mod_python in Apache. For details, refer to "Ubuntu 8.04 mod_python config".

 

LoadModule python_module /usr/lib/apache2/modules/mod_python.so

 

Open the /etc/apache2/httpd.conf file and configure it as follows:

 

<VirtualHost *:80>
    <Location "/">
        SetHandler python-program
        PythonPath "['/var/www'] + sys.path"
        PythonHandler django.core.handlers.modpython
        SetEnv DJANGO_SETTINGS_MODULE souapp.settings
        # PythonOption django.root /
        PythonDebug On
        # PythonInterpreter souapp
    </Location>

    Alias /media_alias/ /var/www/media/

    Alias /robots.txt /var/www/media/robots.txt
    Alias /sitemap.xml /var/www/media/sitemap.xml

    <LocationMatch "\.(jpg|gif|png|txt|ico|pdf|css|jpeg)$">
        SetHandler None
    </LocationMatch>
</VirtualHost>

 

With this configuration, http://www.souapp.com/robots.txt and http://www.souapp.com/sitemap.xml are served directly by Apache as static files.

 

3. Generating robots.txt and sitemap.xml

 

Following the guidance on the Google Webmaster Tools page, you can quickly generate robots.txt, download it, and place it in your website's media directory. For generating sitemap.xml, see:

"Django generates sitemap.xml for the website", on Search Application Network (www.souapp.com).
