The robots file looks very simple: just a few lines of text. But as the first thing a search engine requests when it visits our site, its role is critical. Those few lines hide many small details, and if we ignore them, the file will not become a cornerstone of the site's development; it may well become a stumbling block. It is no exaggeration to say that one careless character can cost you the whole game. In this article I will analyze three cases where inattention to detail when writing robots rules leads to real pain.
Question one: the order of the rules is reversed
Let's start with a simple, but widely used statement:
User-agent: *
Allow: /
Disallow: /1234/
From these three lines we can see the original intent: stop search engines from crawling the pages under the /1234/ directory while leaving every other page unrestricted. In practice, however, the rules execute contrary to that intent. Why? Spiders that read the file from top to bottom and apply the first matching rule will hit Allow: / first, so the Disallow line never takes effect. The fix is simply to swap the two lines, putting Disallow: /1234/ before Allow: /, so the file achieves the effect we want.
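Putting the fix together, the corrected file reads as follows (a minimal sketch; /1234/ stands in for whatever directory you want to block):
# The Disallow rule comes first, so top-down parsers see it before the catch-all Allow
User-agent: *
Disallow: /1234/
Allow: /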
Question two: blocking a single page but omitting the leading slash "/"
We also often use robots to block a sensitive page that we do not want search engines to crawl, and this kind of rule has details of its own. For example, suppose we want to block the login page login.asp in the site root. Some webmasters may write: Disallow: login.asp. At first glance this looks fine, but ask yourself: which directory does the blocked page live in? The root directory, or some second-level directory? Without the leading slash, a search engine spider cannot tell where the page is. The corrected rule is Disallow: /login.asp, which genuinely blocks the login.asp page under the root directory.
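For clarity, here are the wrong and corrected forms side by side (login.asp is the example page from above):
# Wrong: no leading slash, so the path is ambiguous to the spider
Disallow: login.asp
# Correct: blocks login.asp under the site root
Disallow: /login.asp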
Question three: blocking an entire directory but omitting the trailing slash "/"
Besides blocking single pages, I think most webmasters more often block entire directories. Again by example: suppose we want to block a certain directory of the site, say every page under /seo/. Some people may write Disallow: /seo. Is that correct? It looks harmless, but it is very wrong and the damage is considerable. It does block all the pages under the /seo/ directory, but it also drags in unrelated pages: it blocks every URL path that starts with /seo, behaving as if we had written Disallow: /seo*. The fix is simple: do not omit the slash after the directory name, i.e. write Disallow: /seo/.
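Side by side, with the /seo/ directory from above (/seo.html and /seo-news/ are hypothetical paths used only to illustrate the over-matching):
# Too broad: matches every path starting with /seo,
# e.g. /seo.html and /seo-news/, not just the directory
Disallow: /seo
# Correct: blocks only the pages under the /seo/ directory
Disallow: /seo/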
The robots file can keep certain files on our site away from search engine crawlers, and it can also improve crawl efficiency. But if we neglect the details, it will not only fail to deliver that effect, it will often backfire. I hope this article is helpful to everyone writing the file. Article by Nanjing Network Company http://www.cootem.com/. Original content; when reprinting, please keep our address.