Using HTML meta tags to control search engine spiders

Source: Internet
Author: User

Prevent Baidu from caching a snapshot of the page:

<meta name="Baiduspider" content="noarchive">

Let all search engines index this page and follow its links, but prohibit snapshots:

<meta name="robots" content="index,follow,noarchive">

----------------------------------------------------------

<meta name="robots" content="noarchive">

The tag above prevents every search engine from creating a snapshot (cached copy) of your page. To restrict only one search engine, name its crawler instead:

<meta name="Baiduspider" content="noarchive">

Note that this tag only stops search engines from creating a snapshot of your page. To keep search engines from indexing the page at all, use the method described below.
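As a rough illustration of how these tags scope to crawlers, the Python sketch below reads the meta tags from a page and reports whether a given crawler may keep a snapshot. The class and function names are hypothetical, and the rule that the generic "robots" tag and the crawler-specific tag are simply combined is an assumption made for this example; real search engines may resolve the two differently.

```python
from html.parser import HTMLParser


class MetaDirectiveCollector(HTMLParser):
    """Collect the comma-separated content of every named <meta> tag.

    Hypothetical helper for illustration; keys are lowercase tag names.
    """

    def __init__(self):
        super().__init__()
        self.directives = {}  # e.g. {"robots": {"index", "follow"}}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {key: (value or "") for key, value in attrs}
        name = attr_map.get("name", "").strip().lower()
        if not name or "content" not in attr_map:
            return
        tokens = {t.strip().lower() for t in attr_map["content"].split(",") if t.strip()}
        self.directives.setdefault(name, set()).update(tokens)


def snapshot_allowed(html, crawler):
    """Return False if a tag that applies to `crawler` contains noarchive.

    Assumption: the generic "robots" tag and the crawler's own tag both apply.
    """
    collector = MetaDirectiveCollector()
    collector.feed(html)
    applicable = collector.directives.get("robots", set()) | collector.directives.get(
        crawler.lower(), set()
    )
    return "noarchive" not in applicable
```

With a page that contains only the Baiduspider tag, snapshot_allowed(page, "Baiduspider") is False, while snapshot_allowed(page, "Googlebot") is True.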

Second case: prohibit search engines from indexing this page.

In SEO, you often need either to keep search engines from indexing a page or to allow it explicitly, so this part deserves a closer look.

To keep search engines from indexing a page, the common practice is to add the following meta tag to the head of the page:

<meta name="robots" content="noindex,follow">

Here, name="robots" addresses all search engines. You can also address one crawler specifically, for example name="Googlebot" or name="Baiduspider". The content attribute accepts four commands: index, noindex, follow, and nofollow. The commands are not case-sensitive and are separated by an ASCII (half-width) comma ",".

INDEX command: tells the search engine that it may index this page, that is, include it in search results.

FOLLOW command: tells the search engine that it may follow the links on this page and continue crawling from them.

NOINDEX command: tells the search engine not to index this page.

NOFOLLOW command: tells the search engine not to follow the links on this page.

A crawler still has to fetch the page in order to read these commands, so they control indexing and link following rather than whether the page is fetched.

These commands give four possible combinations:

<meta name="robots" content="index,follow">: the page may be indexed, and the links on it may be followed.

<meta name="robots" content="noindex,follow">: the page must not be indexed, but the links on it may be followed.

<meta name="robots" content="index,nofollow">: the page may be indexed, but the links on it must not be followed.

<meta name="robots" content="noindex,nofollow">: the page must not be indexed, and the links on it must not be followed.
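The four combinations above can be sketched in Python as a small parser that turns a content value into two permissions. The function name parse_robots_content is hypothetical, and the sketch assumes that a missing command falls back to the permissive default (index, follow).

```python
def parse_robots_content(content):
    """Map a robots content string to index/follow permissions.

    Assumption for this sketch: anything not explicitly forbidden is allowed.
    """
    tokens = {t.strip().lower() for t in content.split(",") if t.strip()}
    return {
        "index": "noindex" not in tokens,    # may the page be indexed?
        "follow": "nofollow" not in tokens,  # may the links be followed?
    }


if __name__ == "__main__":
    for value in ("index,follow", "noindex,follow", "index,nofollow", "noindex,nofollow"):
        print(value, "->", parse_robots_content(value))
```

For example, parse_robots_content("Noindex,follow") returns {"index": False, "follow": True}, regardless of the capitalization used in the tag.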

Note that two opposing commands must not be written together in one tag, for example:

<meta name="robots" content="index,noindex">

Nor should two tags with conflicting commands appear on the same page:

<meta name="robots" content="index,follow">
<meta name="robots" content="noindex,follow">
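A check for such contradictions could look like the Python sketch below. It takes the content values of all tags that share one name and reports any opposing pairs it finds. The function name and the list of opposing pairs are assumptions for this example, not part of any search engine's API.

```python
# Pairs of commands that contradict each other (assumed list for this sketch).
OPPOSITES = [("index", "noindex"), ("follow", "nofollow"), ("archive", "noarchive")]


def find_conflicts(contents):
    """Return the opposing pairs present across the given content values.

    `contents` is a list of content attribute strings taken from tags
    that share the same name, e.g. all name="robots" tags on one page.
    """
    tokens = set()
    for content in contents:
        tokens.update(t.strip().lower() for t in content.split(",") if t.strip())
    return [pair for pair in OPPOSITES if pair[0] in tokens and pair[1] in tokens]
```

Both find_conflicts(["Index,noindex"]) and find_conflicts(["Index,follow", "Noindex,follow"]) return [("index", "noindex")], which covers the single-tag case and the two-tag case.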

There is also a shorthand. If the tag is

<meta name="robots" content="index,follow">

it can be written with a single word:

<meta name="robots" content="all">

If it is

<meta name="robots" content="noindex,nofollow">

it can be written with a single word:

<meta name="robots" content="none">
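The two shorthand words can be expanded back into their full form with a lookup table, as in this Python sketch. The function name expand_shorthand is hypothetical; the mapping itself follows the equivalences given above.

```python
# Shorthand words and the command pairs they stand for.
SHORTHAND = {
    "all": ("index", "follow"),
    "none": ("noindex", "nofollow"),
}


def expand_shorthand(content):
    """Expand all/none in a content value, keeping order and dropping duplicates."""
    expanded = []
    for token in content.split(","):
        token = token.strip().lower()
        if not token:
            continue
        # A token that is not shorthand expands to itself.
        for command in SHORTHAND.get(token, (token,)):
            if command not in expanded:
                expanded.append(command)
    return expanded
```

For example, expand_shorthand("NONE") returns ["noindex", "nofollow"], and other commands such as noarchive pass through unchanged.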

Of course, a single meta tag can also combine the snapshot command with the indexing commands. As described above, the command that prohibits snapshots is noarchive, so we can write:

<meta name="robots" content="index,follow,noarchive">

To stop a single search engine, such as Baidu, from creating a snapshot, write:

<meta name="Baiduspider" content="index,follow,noarchive">
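If these tags are generated by a template or script, a small helper can assemble them from flags. The Python sketch below is one possible way to do that; the function name and its parameters are made up for this example.

```python
from html import escape


def build_robots_meta(name="robots", index=True, follow=True, archive=True):
    """Build a robots meta tag string from three boolean flags.

    Hypothetical helper: `name` is "robots" for all search engines or a
    crawler name such as "Baiduspider" for a single one.
    """
    commands = [
        "index" if index else "noindex",
        "follow" if follow else "nofollow",
    ]
    if not archive:
        commands.append("noarchive")  # only emitted when snapshots are prohibited
    return '<meta name="{}" content="{}">'.format(
        escape(name, quote=True), ",".join(commands)
    )
```

Calling build_robots_meta("Baiduspider", archive=False) produces the Baidu-specific tag shown above.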

If the page has no meta tag with commands for spiders, the default behavior is equivalent to:

<meta name="robots" content="index,follow,archive">

So if you are unsure about this part, you can write the line above as it is, or simply omit the tag.

Controlling spiders is an important part of SEO, so I hope you gain an accurate grasp of this material.
