In SEO work, a site often needs to communicate with search engine spiders through a file called robots.txt. Generally, when a spider arrives at a site it reads this file first and follows its rules in whatever it does next. When a site has pages that do not need to be indexed, robots.txt is commonly used to stop spiders from crawling them: for example, duplicate pages created by the replytocom parameter, or pages that need not be indexed and would only dilute the site's weight. In this respect, robots.txt constrains the behavior of search engines.
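As a minimal sketch of how a well-behaved spider consults these rules before fetching a page, the following uses Python's standard urllib.robotparser module. The rules, the example.com URLs, and the helper name is_allowed are assumptions made for illustration and do not come from any real site. Note that this parser matches rules by path prefix, so wildcard patterns such as those often written for replytocom URLs are outside what it demonstrates.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for an example site; the paths are illustrative only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /share/
"""


def is_allowed(url, user_agent="*"):
    """Return True if ROBOTS_TXT permits user_agent to crawl url."""
    parser = RobotFileParser()
    # parse() takes the file content as an iterable of lines.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://example.com/index.html",
                "http://example.com/private/draft.html"):
        print(url, is_allowed(url))
```

A spider that honors the file would skip any URL for which this check returns False. Whether a given search engine actually behaves this way is exactly the question this article goes on to discuss.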
In practice, however, a site may forbid spiders from collecting a certain type of URL in robots.txt and yet, when checking what has been indexed with the search engine's advanced site: query, discover that the search engine did not obey the rule. Most of the time, search engines do leave these pages out, as the rules in robots.txt ask. But occasionally the search engine seems to treat the rules as if they were not there. Of course, the robots.txt may simply have been written incorrectly, but this article assumes that it is correct. There is a fairly official explanation: when robots.txt blocks a page, the search engine may still crawl that page, but it will not show the page's content in the relevant search results. This statement is somewhat puzzling, but in my view it is quite plausible.
First of all, search engines exist to show searchers healthy, high-quality content that meets their needs. Before a site is indexed and returned in search results, the search engine must understand it well enough to decide whether to include it and how to rank it. For example, suppose someone wants to run an illegal website. Assume for now that the regulators have not detected its content, and set aside whether the following method is advisable. If the site were optimized directly for the relevant keywords, the higher exposure would also greatly increase the chance of it being shut down. What is more, whether search engines would block the illegal content is itself uncertain.
So instead, the site builds its SEO rankings on a large amount of healthy content. It uses healthy keywords to attract a lot of traffic, then places links to the illegal information within that healthy content. Naturally, those links are hidden from search engine spiders through robots.txt, and every page with illegal content is also barred from indexing. Could the site in this way both profit from the search engine and escape its scrutiny? As mentioned above, for the sake of a good user experience, a search engine will try to understand thoroughly any site that it has indexed or is about to index (leaving aside whether search engines today do this perfectly).
Since the search engine wants to understand the site, how could it ignore the pages blocked in robots.txt? A site could present a clean face to the search engine while quietly carrying on illegal activity behind it. I do not believe search engines have failed to consider this situation. So even if your robots.txt explicitly forbids spiders from crawling certain pages, the search engine will still "check" them. Otherwise, how could it fully judge the merits of the site?
And to look at a page, the search engine must of course first crawl it onto its own servers and then evaluate it. But if search engines still crawl and inspect the pages that robots.txt blocks, what role does robots.txt play?
The only answer is that these pages are hidden, or at least kept out of ordinary search results. Otherwise, would robots.txt not be a mere decoration?
Therefore, there is no need to be too anxious when you find that a search engine has still indexed pages that robots.txt blocks. The search engine just wants to understand the site fully. You must, however, make sure that the robots.txt rules are written correctly. In general, the search engine will delete, or "hide", the pages it is forbidden to crawl. Because there is considerable uncertainty in cases where pages blocked by robots.txt are still indexed, I regret that I cannot test the situation in practice. I am also a beginner who has only just started learning SEO, so my view is not necessarily correct. I sincerely hope that more experienced SEO practitioners will offer guidance and discussion. Thank you.
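One way to check that the rules are written as intended is to run the URLs that turned up in a site: query through the rules and see which of them the file really disallows. The sketch below does this with Python's standard urllib.robotparser. The function name find_blocked, the spider name ExampleBot, and the example.com URLs are hypothetical and chosen only for illustration.

```python
from urllib.robotparser import RobotFileParser


def find_blocked(robots_txt, urls, user_agent="*"):
    """Return the URLs in `urls` that robots_txt disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    # Hypothetical rules: one group for a named spider, one for everyone else.
    rules = (
        "User-agent: ExampleBot\n"
        "Disallow: /tmp/\n"
        "\n"
        "User-agent: *\n"
        "Disallow: /private/\n"
    )
    # URLs assumed to have been collected by hand from a site: query.
    indexed = [
        "http://example.com/tmp/a.html",
        "http://example.com/private/b.html",
        "http://example.com/c.html",
    ]
    print(find_blocked(rules, indexed))
    print(find_blocked(rules, indexed, "ExampleBot"))
```

If an indexed URL appears in the result, the rule is written correctly and the indexing has some other cause. If it does not appear, the rule itself is the first thing to review. A spider that has its own group follows only that group, which is why the two calls can give different answers.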
Unless otherwise noted, articles on this blog are original works by Uschen, and copyright belongs to Shen Blog©. When reprinting, please be sure to cite the source. Thank you. Link to this article: http://www.yushenblog.com/talk/509.html