Website optimization: a robots.txt usage tutorial

First, let me introduce what robots.txt is: robots.txt is the first file a search engine looks at when it visits a website. The robots.txt file tells the spider which files on the server may be viewed. When a search spider visits a site, it first checks whether a robots.txt file exists in the site's root directory; if it does, the spider determines the scope of its crawl according to the contents of that file. If the file does not exist, search spiders will be able to access every page on the site that is not password protected. Finally, note that robots.txt must be placed in the root directory of the site.
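If you want to see how a spider reads one of these files, Python's standard library includes a parser for this format. The following is a minimal sketch using the urllib.robotparser module; it downloads Google's robots.txt (one of the files listed below) and asks whether a given URL may be crawled. The URL being tested is only an illustration:

from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.set_url("http://www.google.com/robots.txt")  # robots.txt always sits at the site root
rp.read()  # download and parse the file

# can_fetch() answers: may this user agent crawl this URL?
# The result depends on the rules in the file at the time you run this.
print(rp.can_fetch("*", "http://www.google.com/search"))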

You can refer to the robots.txt files of Google, Baidu, and Tencent:

http://www.google.com/robots.txt

http://www.baidu.com/robots.txt

http://www.qq.com/robots.txt

Now that you understand robots.txt, what can we do with it?

1. Use robots.txt to block pages that are highly similar or have no content.

We know that search engines "review" the pages they index, and when two pages are highly similar, the engine will drop one of them and also lower your site's score somewhat.

Suppose the following two links point to essentially similar content; then the first link should be blocked.

/xxx?123

/123.html

There are usually a great many links like the first one, so how do we block them? In fact, blocking /xxx? is enough.

The code is as follows:

Disallow: /xxx?

By the same token, we can use this method to block pages that have no content.
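One detail worth noting: in a real robots.txt file, every Disallow line belongs to a group headed by a User-agent line, which names the spiders the rules apply to. The complete rule group for the example above would therefore look like this, where * means the rules apply to all spiders:

User-agent: *

Disallow: /xxx?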

2. Use robots.txt to block redundant links, generally keeping the static link (.html, .htm, .shtml, and so on).

A site often contains multiple links pointing to the same page, which lowers the search engine's friendliness toward the site. To avoid this, we can use robots.txt to remove every link except the main one.

For example, the following two links point to the same page:

/ooo?123

/123.html

Then we should get rid of the first, junk link. The code is as follows:

Disallow: /ooo?123
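If you want to verify a rule like this before deploying it, Python's urllib.robotparser can also parse rules passed in directly, with no website involved. A minimal sketch using the two example links above:

from urllib import robotparser

rp = robotparser.RobotFileParser()
# parse() accepts the lines of a robots.txt file directly
rp.parse([
    "User-agent: *",
    "Disallow: /ooo?123",
])

print(rp.can_fetch("*", "/ooo?123"))   # False: the junk link is blocked
print(rp.can_fetch("*", "/123.html"))  # True: the main link stays crawlable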

3. Use robots.txt to block dead links.

A dead link is a link to a page that used to exist but, because of a site redesign or other reasons, no longer works: it looks like a normal link, but clicking it fails to open the corresponding page.

For example, suppose all the links that used to sit under the /seo directory became dead links because the directory's address changed. We can then use robots.txt to block them; the code is as follows:

Disallow: /seo/

4. Tell search engines the address of your sitemap.xml.

Using robots.txt, you can tell search engines the address of your sitemap.xml file without adding a sitemap.xml link to the site itself. The specific code is as follows:

Sitemap: Your sitemap address
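As an aside, urllib.robotparser (Python 3.8 and later) will report any Sitemap lines it finds, which gives you a quick way to confirm the directive is picked up. A minimal sketch, using a placeholder sitemap address:

from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.parse([
    "Sitemap: http://www.example.com/sitemap.xml",  # placeholder address
])

# site_maps() returns the listed sitemap URLs, or None if there are none
print(rp.site_maps())  # ['http://www.example.com/sitemap.xml']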

That covers the basic uses of robots.txt. A good site will have a good robots.txt, because robots.txt is one of the ways search engines come to understand your site. In addition, here is a robots.txt that I recommend for WordPress users:

User-agent: *

Disallow: /wp-

Disallow: /feed/

Disallow: /comments/feed

Disallow: /trackback/

Sitemap: http://rainjer.com/sitemap.xml

Finally, if the above is not enough to meet your needs, you can learn more from Baidu's and Google's official robots.txt guides:

Baidu: http://www.baidu.com/search/robots.html

Google: http://www.google.com/support/forum/p/webmasters/thread?tid=4dbbe5f3cd2f6a13&hl=zh-CN
