How to Write a Standard robots.txt File

Last Update: 2014-12-12 Source: Internet

Author: User


A website may or may not have a robots.txt file. If it does have one, the file must follow the standard. What follows, based on personal experience, is how to write one.

The robots.txt directives include:

Disallow: tells the spider not to crawl certain files or directories. The following rules stop every spider from crawling any file on the site:

User-agent: *

Disallow: /
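A quick way to check a rule set like this is Python's standard `urllib.robotparser` module. The sketch below is only an illustration: it uses the standard field name `User-agent`, and `www.example.com` is a placeholder host that does not come from the article.

```python
from urllib.robotparser import RobotFileParser

# The "block everything" rules, written with the standard User-agent field.
rules = """\
User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# www.example.com is a placeholder host used only for this illustration.
blocked = not parser.can_fetch("*", "http://www.example.com/any/page.html")
print(blocked)  # True: every path is disallowed for every spider
```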

Allow: tells the spider that it may crawl certain files. Used together with Disallow, it lets you block most of a directory while leaving part of it open. The following rules keep spiders out of the /ab/ directory except for the files under /ab/cd (paths are case-sensitive, so the case must match):

User-agent: *

Disallow: /ab/

Allow: /ab/cd
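The article does not say how a spider decides between an Allow and a Disallow that both match. The sketch below assumes the longest-match rule that Google documents (the most specific prefix wins, and Allow wins a tie); other spiders may behave differently. The function name `is_allowed` and the sample paths are hypothetical. Python's own `urllib.robotparser` applies rules in file order instead, which is why it is not used here.

```python
def is_allowed(path, rules):
    """Return True if `path` may be crawled under `rules`.

    `rules` is a list of (directive, prefix) pairs. The longest matching
    prefix wins; on a tie, Allow wins. A path that matches nothing is allowed.
    """
    best_len = -1
    allowed = True
    for directive, prefix in rules:
        if not path.startswith(prefix):
            continue
        is_allow = directive.lower() == "allow"
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


rules = [("Disallow", "/ab/"), ("Allow", "/ab/cd")]
print(is_allowed("/ab/cd/page.html", rules))  # True
print(is_allowed("/ab/other.html", rules))    # False
print(is_allowed("/index.html", rules))       # True
```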

$ wildcard: matches the end of the URL. The following rules allow spiders to access URLs that end in .htm:

User-agent: *

Allow: /*.htm$

* wildcard: matches any sequence of characters. The following rules stop spiders from crawling all .htm files:

User-agent: *

Disallow: /*.htm
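One way to see how the two wildcards behave is to translate a pattern into a regular expression. This is a minimal sketch, not any search engine's actual matcher; `pattern_to_regex` and `matches` are hypothetical helper names, and the sample paths are made up.

```python
import re


def pattern_to_regex(pattern):
    """Compile a robots.txt path pattern into a regular expression.

    '*' matches any run of characters; a trailing '$' anchors the end of
    the URL. Without '$' the pattern is a prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def matches(pattern, path):
    return pattern_to_regex(pattern).match(path) is not None


print(matches("/*.htm$", "/docs/page.htm"))   # True
print(matches("/*.htm$", "/docs/page.html"))  # False
print(matches("/*.htm", "/docs/page.html"))   # True
print(matches("/*.htm", "/docs/page.php"))    # False
```

Because a pattern without `$` is a prefix match, `/*.htm` also catches `.html` URLs; append `$` when only the exact suffix should match.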

Sitemap location: tells the spider where your sitemap is. The format is the field name followed by the full URL of the sitemap:

Sitemap:
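As an illustration, Python's standard `urllib.robotparser` (version 3.8 or later) can read Sitemap lines back out of a robots.txt file. The URL and the `/private/` path below are placeholders and do not come from the article.

```python
from urllib.robotparser import RobotFileParser

# Placeholder robots.txt content; the sitemap URL is hypothetical.
robots_txt = """\
User-agent: *
Disallow: /private/
Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# site_maps() returns a list of URLs, or None when no Sitemap line exists.
sitemaps = parser.site_maps()
print(sitemaps)  # ['https://www.example.com/sitemap.xml']
```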

The meta tags supported by all three search engines (Google, Yahoo, and MSN) include:

NOINDEX: tells the spider not to index the page.

NOFOLLOW: tells the spider not to follow the links on the page.

NOSNIPPET: tells the spider not to show descriptive text in the search results.

NOARCHIVE: tells the spider not to show the cached snapshot.

NOODP: tells the spider not to use the title and description from the Open Directory.
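These values are written in the `content` attribute of a `<meta name="robots">` tag, separated by commas. The sketch below shows how a crawler might read them, using Python's standard `html.parser`; the class name `RobotsMetaParser` and the sample page are hypothetical.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        # Directives are comma-separated and case-insensitive.
        for item in (attr.get("content") or "").split(","):
            item = item.strip().lower()
            if item:
                self.directives.add(item)


# Made-up page used only for this illustration.
page = '<html><head><meta name="robots" content="NOINDEX, nofollow"></head><body></body></html>'
meta_parser = RobotsMetaParser()
meta_parser.feed(page)
print(sorted(meta_parser.directives))  # ['nofollow', 'noindex']
```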

All three engines now support the records and tags above. Of these, the wildcards seem to have been unsupported by Yahoo and Microsoft in the past. Baidu now also supports Disallow, Allow, and both wildcards. I have not found an official statement on whether Baidu supports the meta tags.

Meta tags supported only by Google:

UNAVAILABLE_AFTER: tells the spider when the page expires. After that date, the page should no longer appear in the search results.

NOIMAGEINDEX: tells the spider not to index the images on the page.

NOTRANSLATE: tells the spider not to translate the page content.

Yahoo additionally supports:

Crawl-delay: sets how long the spider should wait between successive requests. This is a robots.txt directive rather than a meta tag.

NOYDIR: similar to NOODP, but refers to the Yahoo Directory instead of the Open Directory.

Robots-nocontent: a class attribute value that marks a part of the HTML as not belonging to the main content of the page. In other words, it tells the spider which part is the main content (the content you want retrieved).

MSN additionally supports:

Crawl-delay
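Crawl-delay goes inside a User-agent group in robots.txt, and its value is commonly given in seconds. A minimal sketch of reading it with Python's standard `urllib.robotparser`, where the value 10 and the `/tmp/` path are arbitrary examples and not from the article:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules; the delay value and path are chosen only for illustration.
robots_txt = """\
User-agent: *
Crawl-delay: 10
Disallow: /tmp/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# crawl_delay() returns the value for the given user agent, or None if unset.
delay = parser.crawl_delay("*")
print(delay)  # 10
```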

One more reminder: if a request for robots.txt returns a 404 error, the spider takes that to mean it may crawl everything. But if fetching robots.txt fails with some other kind of error (a timeout or a server error, for example), the search engine may not index the site at all, because the spider cannot tell whether a robots.txt file exists or what it contains. That is not the same as confirming that the file does not exist.
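The distinction can be sketched as a small decision function. This is only an illustration of the reasoning above, not the documented behavior of any particular search engine; the function name `crawl_policy` and the three policy labels are hypothetical.

```python
def crawl_policy(status):
    """Map the result of fetching robots.txt to a crawl policy.

    `status` is the HTTP status code, or None when no response arrived
    (for example, on a timeout).
    """
    if status is not None and 200 <= status < 300:
        return "parse rules"  # the file exists: read it and obey it
    if status == 404:
        return "allow all"    # confirmed absent: nothing is restricted
    # Any other outcome leaves the rules unknown, so a cautious spider
    # may decline to crawl the site until it can read the file.
    return "hold off"


print(crawl_policy(200))   # parse rules
print(crawl_policy(404))   # allow all
print(crawl_policy(503))   # hold off
print(crawl_policy(None))  # hold off
```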

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.
