Search engine spiders seem very mysterious to most of us, which is why this article sets out to demystify them. Of course, we are neither Baidu nor Google, so this can only be called demystifying, not leaking secrets. The content is fairly basic and is meant to share one idea with readers who have not come across it yet; experts can skip it.
Traditionally, we imagine that a search engine spider crawls the way a real spider moves across its web. For example, Baidu's spider finds a link, follows it to a page, and then follows the links on that page to keep crawling ... The picture resembles a spider's web or a large tree. This model is correct in theory, but it is not precise.
A search engine maintains a URL index library. Spiders set out from the search engine's servers, crawl pages at URLs the search engine already holds, and bring the page content back. Once a page has been collected, the search engine analyzes it and separates the content from the links; the content is not discussed here. After analyzing the links, the search engine does not immediately send spiders to crawl them. Instead, it records each link and its anchor text for analysis, comparison, and calculation, and only afterwards places the link in the URL index library. Once a link is in the URL index library, a spider will come to crawl it.
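The flow described above can be sketched in a few lines of Python. This is only a toy illustration under my own assumptions, not how Baidu or Google actually work: the names `LinkExtractor`, `crawl_round`, `analyze`, and the `pending` store are hypothetical, and the analysis step here simply admits every recorded link, whereas a real search engine would compare and score links first.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects (href, anchor text) pairs from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def crawl_round(url_index, pending, fetch):
    """Crawl only URLs already in the index; record new links as pending."""
    for url in sorted(url_index):
        parser = LinkExtractor()
        parser.feed(fetch(url))
        for href, anchor in parser.links:
            target = urljoin(url, href)
            if target not in url_index:
                # Discovered links are recorded, not crawled right away.
                pending.setdefault(target, []).append((url, anchor))


def analyze(url_index, pending):
    """Hypothetical analysis step: here it admits every pending URL."""
    admitted = sorted(pending)
    url_index.update(admitted)
    pending.clear()
    return admitted
```

Calling `crawl_round` once discovers new links but does not fetch them; they become crawlable only after `analyze` has moved them into `url_index` and a later round runs.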
In other words, when a page gains an external link, a spider does not necessarily come to crawl the page right away; an analysis and calculation step comes first. Even if the external link is removed after a spider has crawled it, the search engine may already have recorded the link, so the page may still be crawled later. If the spider later revisits the page that carried the external link and finds that the link is gone, or that the page now returns a 404, the search engine probably just lowers the weight of that external link; it should not go to the URL index library and delete the link.
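The "lower the weight, do not delete" idea can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the `LinkRecord` fields, the `recheck` function, and the penalty factor of 0.5 are hypothetical and are not taken from any search engine.

```python
from dataclasses import dataclass


@dataclass
class LinkRecord:
    source: str            # page that carried the external link
    target: str            # page the link points to
    weight: float = 1.0    # hypothetical weight of this external link
    failed_checks: int = 0


def recheck(record, link_still_present, source_status, penalty=0.5):
    """Lower the weight if the link is gone or its page returns 404.

    The record itself is kept, so the target stays in the index.
    """
    if not link_still_present or source_status == 404:
        record.weight *= penalty
        record.failed_checks += 1
    return record
```

Under these assumptions, a link that disappears is halved in weight on each failed recheck but never removed, which matches the behavior guessed at above.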
So a link that no longer exists on a page can still have an effect. That is all for today. I will keep sharing my own analysis; if anything here is inaccurate, corrections are welcome.
When reposting, please credit the Carefree blog @liboseo. Article address: http://liboseo.com/1060.html. Unless otherwise noted, articles on the Carefree blog are original; please cite the source and link when reposting.