What webmasters must know: five robots.txt syntax rules and three points of caution

Source: Internet
Author: User


Robots.txt is the first file a search engine looks at when it visits a website. When a search spider accesses a site, it first checks whether a robots.txt file exists in the site's root directory. If it does, the spider determines its crawling scope according to the contents of that file; if the file does not exist, all search spiders can access every page on the site that is not password protected. As webmasters, we can use it to screen out error pages and any pages we do not want spiders to crawl. So how do we write a robots.txt? Its syntax comes down to the following five points:

1. User-agent defines the search engine. In general a site addresses all of them: User-agent: *, where * means "all", so every search engine is covered at once. For example, to define Baidu specifically, write User-agent: Baiduspider; to define Google, write User-agent: Googlebot.
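As a quick sketch (Disallow is explained in the next point, and the /temp/ path is a hypothetical placeholder), a file can carry one section for all engines and another just for Baidu:

    User-agent: *
    Disallow:

    User-agent: Baiduspider
    Disallow: /temp/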

2. Disallow prohibits crawling. For example, to prohibit crawling of my Help folder, write Disallow: /help/. To prohibit crawling of a single file under the Help folder, say shenmein.html, write Disallow: /help/shenmein.html.
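In context, a sketch that blocks just that one file while leaving the rest of the Help folder crawlable (shenmein.html is the placeholder file name from above):

    User-agent: *
    Disallow: /help/shenmein.html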

3. Allow permits crawling. As we all know, everything is allowed by default, so why do we need this directive at all? For example: suppose I want to block every file under the Help folder except the .html pages. How do you write that? We could use Disallow to list the files one by one, but that costs far too much time and energy. This is where Allow solves the complicated problem in one stroke. Write: Allow: /help/*.html$ followed by Disallow: /help/.
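Spelled out as a sketch, the narrower Allow line rescues the .html pages from the broader Disallow (the ordering follows the first caution discussed below):

    User-agent: *
    Allow: /help/*.html$
    Disallow: /help/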

4. $ is the terminator (end-of-URL anchor). Example: Disallow: /*.aspx$ means block every file ending in .aspx, no matter what comes before it; /a/b/ad/ba/ddddd/eee/index.aspx is also covered.
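As a sketch, the $ anchor means only URLs that end exactly in .aspx are blocked; a hypothetical /eee/index.aspx?id=1 would escape the rule, because that URL ends in the query string rather than in .aspx:

    User-agent: *
    Disallow: /*.aspx$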

5. * matches zero or more of any character. Example: Disallow: /*?* means block every URL containing a "?", which also masks all dynamic paths.
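A sketch of that rule in place: a hypothetical /news.php?id=3 would be blocked, while a static /news/3.html would not:

    User-agent: *
    Disallow: /*?*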

Once you understand the above five points (a combined sketch follows below), I believe writing a robots.txt will give you no trouble. But you must keep the following three points of caution in mind, or all that effort may be wasted.
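Putting the five rules together, here is a sketch of a complete robots.txt combining the examples above (all paths are the hypothetical ones used earlier):

    User-agent: *
    Allow: /help/*.html$
    Disallow: /help/
    Disallow: /*.aspx$
    Disallow: /*?*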

First, Allow before Disallow. Baidu's convention is to put Allow first and Disallow after; Google's convention is to put Disallow first and Allow after. To a Chinese reader, Baidu's ordering reads more naturally. Either way, the result turns out to be the same.
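The same Help-folder rules written in both styles, as a sketch; in either ordering the .html pages remain crawlable:

    # Baidu's convention: Allow first
    User-agent: Baiduspider
    Allow: /help/*.html$
    Disallow: /help/

    # Google's convention: Disallow first
    User-agent: Googlebot
    Disallow: /help/
    Allow: /help/*.html$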

Second, there is a single space after the ":" in User-agent, Allow, and Disallow. Take a look at the screenshots of Baidu's and Google's robots.txt files.

[Screenshot: Baidu's and Google's robots.txt files]

In the screenshots above you can clearly see that single character of space. Remember it; you must get this right.
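A sketch of the point, wrong form first (both lines are hypothetical):

    # missing the space after the colon
    Disallow:/help/

    # with the space, as Baidu and Google write it
    Disallow: /help/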

Third, Disallow: /help/ blocks the entire Help folder: not only the files directly under /help/ in the root directory, but every folder beneath it as well, so /help/a/b/, for example, is blocked too. You must therefore be careful when writing blocking rules. Below is a sketch of the scope, followed by a screenshot from Google.
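Under that reading, one rule covers the whole tree, while a sibling path that merely shares the prefix stays open (the paths are hypothetical):

    User-agent: *
    Disallow: /help/
    # blocked:     /help/index.html, /help/a/b/page.html
    # not blocked: /helpdesk.html (the rule requires the trailing slash)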

[Screenshot: Google's robots.txt]

From the image above we can see that Google has blocked the places folder under its root directory and nothing else. The explanations above should settle your robots.txt questions. Used sensibly, robots.txt can bring great benefit to a website, especially where its 404 error pages are concerned. Make good use of robots.txt, and the site's optimization will be all the better for it.

This article comes from www.szkaiyi.com. Please keep this link when reprinting. Thank you!
