As SEO practitioners, we need our pages not only to be crawled by search engines but also to be indexed, and above all to rank well once indexed. This article gives a brief analysis of the four stages a page goes through on its way into a search engine's index. Every website and every page ranks differently, so which stage is your site at?
Stage one: crawl everything, big or small
Search engine crawlers start with a "take everything, big or small" strategy: every link found on a page is added to the queue of URLs to crawl, and the URLs in each newly crawled page are extracted mechanically in the same way. The approach is old, but it works well. It is also why many webmasters report that the spider has visited their site yet nothing has been indexed: being crawled is only the first stage.
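The queue-driven crawl described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `LINKS` dictionary is a hypothetical in-memory link graph standing in for real page downloads, and `crawl_everything` is a name invented for this example.

```python
from collections import deque

# Hypothetical in-memory link graph standing in for real HTTP fetches:
# each key is a URL, each value is the list of links found on that page.
LINKS = {
    "/": ["/a", "/b"],
    "/a": ["/b", "/c"],
    "/b": ["/"],
    "/c": [],
}

def crawl_everything(seed, links):
    """Visit every URL reachable from seed, in the order links are discovered."""
    frontier = deque([seed])   # URLs waiting to be crawled
    seen = {seed}              # URLs already queued, so nothing is crawled twice
    order = []
    while frontier:
        url = frontier.popleft()
        order.append(url)
        # Mechanically extract every link on the page and queue the new ones.
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                frontier.append(target)
    return order

crawl_order = crawl_everything("/", LINKS)
```

Nothing in this loop judges quality: a page is visited simply because some link points to it, which matches the observation that a spider visit alone says nothing about indexing.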
Stage two: page rating
The second stage grades pages by importance. PageRank is a well-known link analysis algorithm for measuring the importance of a web page, so it is natural to use the idea behind PageRank to order URLs. This is where your enthusiasm for "building external links" comes from. According to a friend, the external link market in China is worth billions of dollars a year.
The purpose of a crawler is to download pages, but PageRank is a global algorithm: its results are reliable only once all pages have been downloaded. During the crawl, the crawler has seen only part of the content, so it cannot obtain a reliable PageRank score at that stage. This matters most for small and medium-sized websites on poor-quality servers, where the crawler may fetch only some of the pages.
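To make the "global algorithm" point concrete, here is a minimal sketch of textbook power-iteration PageRank run on a complete link graph and on a partial one. The graphs `FULL` and `PARTIAL`, the damping factor of 0.85, and the iteration count are assumptions chosen for illustration. This is not a description of how any particular search engine computes its scores.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over a dict of url -> outbound links."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Page with no known outbound links: spread its rank evenly.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank

# Hypothetical complete link graph of a small site.
FULL = {"/": ["/a", "/b"], "/a": ["/b"], "/b": ["/"], "/c": ["/a"]}
# Hypothetical partial crawl: "/b" and "/c" were never downloaded,
# so their outbound links are unknown to the crawler.
PARTIAL = {"/": ["/a", "/b"], "/a": ["/b"]}

full_rank = pagerank(FULL)
partial_rank = pagerank(PARTIAL)
```

In this toy graph the home page scores noticeably lower on the partial crawl, because the link from "/b" back to "/" has not been seen yet. Scores computed mid-crawl are therefore only rough estimates.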
Stage three: the OPIC strategy
The OPIC (Online Page Importance Computation) strategy is best seen as an improvement on the PageRank algorithm. Before the algorithm starts, every page is given the same amount of "cash". Whenever a page A is downloaded, A divides its cash equally among the pages it links to and empties its own balance. This is one of the reasons why the fewer outbound links a page has, the more weight each link passes on.
The pages waiting to be crawled are sorted by the cash they hold, and the pages with the most cash are downloaded first. OPIC follows roughly the same idea as PageRank. The difference is that PageRank has to be recomputed iteratively each time, while OPIC does not, so it is far faster to compute and is suitable for real-time use. This may be why many pages get indexed "within seconds".
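The cash-passing scheme described above can be sketched as follows. This is a simplified illustration, not a full implementation: it assumes every page is known up front, crawls each page only once, and uses a hypothetical link graph `SITE` and a function name, `cash_crawl_order`, invented for this example.

```python
def cash_crawl_order(links):
    """Order pages by cash: the richest page still waiting is downloaded first."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    cash = {p: 1.0 / len(pages) for p in pages}   # every page starts with equal cash
    pending = set(pages)
    crawled = []
    while pending:
        # Richest pending page first; ties are broken by URL for determinism.
        page = min(pending, key=lambda p: (-cash[p], p))
        pending.remove(page)
        crawled.append(page)
        amount, cash[page] = cash[page], 0.0      # the downloaded page empties its cash
        targets = links.get(page, [])
        if targets:
            share = amount / len(targets)         # split equally among linked pages
            for t in targets:
                cash[t] += share
    return crawled

# Hypothetical link graph: "/" links to "/popular", which links to "/about".
SITE = {"/": ["/popular"], "/about": [], "/popular": ["/about"]}
order = cash_crawl_order(SITE)
```

After "/" is downloaded, "/popular" holds more cash than "/about" and is crawled first, even though "/about" sorts earlier by URL. Each step is a single pass over one page's links, with no global iteration.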
Stage four: the large-site-first strategy
The large-site-first idea is very direct: importance is measured per website rather than per page. The pages in the queue of URLs to crawl are grouped by the site they belong to, and the site with the most pages waiting to be downloaded has its links downloaded first. In essence, the strategy prefers to download URLs from large websites, because large websites tend to contain more pages. Since large websites are often well-known sites whose pages are generally of higher quality, the idea is simple but has some basis.
Experiments show that although the algorithm is simple and crude, it is very effective at collecting high-quality pages. This is one of the most important reasons why, when your content is reposted, the large sites that repost it can rank ahead of you.
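The grouping step behind this strategy can be sketched as below. It is a minimal illustration on a static snapshot of the queue, whereas a real crawler would presumably recount as URLs are added and removed. The example hosts and the function name `large_site_first` are assumptions made up for this sketch.

```python
from collections import defaultdict
from urllib.parse import urlsplit

def large_site_first(pending_urls):
    """Order pending URLs so that hosts with the most waiting pages come first."""
    by_host = defaultdict(list)
    for url in pending_urls:
        by_host[urlsplit(url).netloc].append(url)
    # Largest queue first; ties are broken by host name for determinism.
    hosts = sorted(by_host, key=lambda h: (-len(by_host[h]), h))
    return [url for host in hosts for url in by_host[host]]

# Hypothetical queue of URLs waiting to be crawled.
pending = [
    "http://small.example/1",
    "http://big.example/a",
    "http://big.example/b",
    "http://big.example/c",
    "http://small.example/2",
]
ordered = large_site_first(pending)
```

Here the three big.example URLs move to the front of the queue purely because that host has more pages waiting. No per-page signal is involved.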
Article source: Lu Songsong's blog. Original address: http://lusongsong.com/reed/663.html