Search Engine Ranking Algorithms

Source: Internet
Author: User

PageRank is based on the assumption that if other pages link to a page A on the Web, then A can be considered an important page. PageRank holds that the number of links pointing to a page reflects its importance. In practice, however, people are often not rigorous when they create hyperlinks: many links exist purely for site navigation, commercial advertising, and similar purposes, and such links clearly contribute little to the importance of the pages they point to. Because of the algorithmic complexity involved, PageRank does not examine the content of each hyperlink in detail. Instead it applies two relatively simple rules. First, if a page has many outbound links, its ability to endorse each page it links to is reduced. Second, if a page is itself of low importance because few pages link to it, its influence on the importance of the pages it links to is reduced as well. In the actual calculation, therefore, the importance weight of page A is proportional to the importance weights of the pages that link to A and inversely proportional to the number of outbound links on each of those pages. Since the weights of the linking pages are not known in advance, the weights of all pages must be computed repeatedly and iteratively. In other words, the importance of a Web page depends on the importance of other pages.
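The iterative calculation described above can be sketched in Python as follows. This is a minimal illustration, not the production algorithm of any search engine: the function name `pagerank`, the `links` dictionary format, the damping factor of 0.85 (taken from the commonly cited formulation, not from this article), and the even redistribution of weight from pages without outbound links are all assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate page importance by repeated redistribution of weight.

    links maps each page to the list of pages it links to (assumed format).
    damping and iterations are illustrative defaults, not values from the text.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with equal weight on every page.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its weight evenly over its outbound links,
                # so more outbound links means less endorsement per link.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outbound links spreads its
                # weight evenly over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


example_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(example_links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this small graph, page C receives links from three pages and ends up with the largest weight, while page D, which no page links to, keeps only the baseline share.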

The first approach is based on traditional information retrieval. It measures how important a keyword is to a document, and how relevant the document is to the user's query, mainly from the keyword itself, for example the frequency and position of keywords in the page. In general, the more query keywords a retrieved page contains, and the more discriminating those keywords are, the greater its relevance; a query keyword that appears in an important position such as the title field counts for more than one that appears in the body text. The second approach is hyperlink analysis, used by search engines such as Google and Baidu. Unlike the first approach, it ranks results by how widely a page is recognized as important. By design, it gives more weight to third-party recognition of a page: a page that many other pages link to is treated as widely recognized and therefore important, whereas the traditional method based on keyword position and frequency is only a form of self-assessment by the page and lacks objectivity. Finally, there are other ranking methods, such as user-defined ranking. The Skynet FTP search engine at Peking University works this way: the user picks a specific sort criterion, such as time, size, stability, or distance, to order the result pages. Another example is paid ranking. As a major source of revenue it is widely used by large, portal-style search engines, but out of concern for the objectivity of search results it is not their main ranking method and appears only as a supplement in a paid-results column.
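The keyword-based approach can be illustrated with a small Python sketch. It scores a page by how often the query terms occur and gives title matches extra weight. The function name `keyword_score`, the `title_weight` value of 3.0, and the simple word tokenizer are assumptions for the example; the sketch also leaves out how discriminating each keyword is, which a real system would normally take into account.

```python
import re


def keyword_score(query, title, body, title_weight=3.0):
    """Score a page by query-term frequency, weighting title hits higher.

    title_weight is an illustrative value, not one given in the text.
    """
    def tokenize(text):
        # Lowercase the text and split it into word tokens.
        return re.findall(r"\w+", text.lower())

    title_tokens = tokenize(title)
    body_tokens = tokenize(body)

    score = 0.0
    for term in set(tokenize(query)):
        # A match in the title counts title_weight times as much
        # as a match in the body text.
        score += title_weight * title_tokens.count(term)
        score += body_tokens.count(term)
    return score


print(keyword_score("web search", "Web Search Basics", "search engines crawl the web"))
```

A page that mentions a keyword only in its body scores lower than a page that carries the same keyword in its title, which matches the position rule described above.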

Among hyperlink analysis algorithms, PageRank works better in practice than the HITS algorithm, mainly for two reasons. First, PageRank can estimate the importance of every page once, offline, and independently of any query; at query time that estimate is combined with other query-dependent scores to sort the results, which saves computation during the query. Second, PageRank is computed over the whole collection of Web pages, so unlike HITS it is not prone to the "topic drift" caused by local link traps. The technique is therefore widely used in search engine systems, and the success of the Google search engine also shows that relevance ranking algorithms built on hyperlink analysis are maturing.
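The point about offline computation can be sketched as follows: importance scores are computed once ahead of time, and at query time they are only looked up and combined with a query-dependent relevance score. The text does not say how the scores are combined, so the weighted sum, the `alpha` parameter, and the function name `rank_results` are purely hypothetical choices for illustration.

```python
def rank_results(relevance, importance, alpha=0.5):
    """Order candidate pages by a mix of query relevance and stored importance.

    relevance:  page -> query-dependent score, computed at query time.
    importance: page -> query-independent score, precomputed offline.
    alpha:      hypothetical mixing weight between the two scores.
    """
    def combined(page):
        # Pages missing from the offline table get an importance of 0.
        return alpha * relevance[page] + (1.0 - alpha) * importance.get(page, 0.0)

    return sorted(relevance, key=combined, reverse=True)


precomputed_importance = {"p1": 0.1, "p2": 0.6}   # computed offline, once
query_relevance = {"p1": 0.9, "p2": 0.8}          # computed per query
print(rank_results(query_relevance, precomputed_importance))
```

Because the importance table is only read at query time, the cost of the link analysis is paid once, regardless of how many queries are served.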

 

Relevance ranking relies mainly on hyperlink analysis. Hyperlink analysis can serve several purposes, the most important of which is ranking result pages by relevance. It uses the hyperlinks that exist between pages to analyze how pages reference one another, and computes an importance weight for each page from the number of links it receives. It is generally held that if page A has a hyperlink to page B, then A has in effect cast a vote for B, that is, A acknowledges the importance of B. Looking more closely at hyperlink analysis algorithms, the whole collection of Web documents can be viewed as a directed graph defined by its link structure: each page is a node, and each link between pages is an edge between the corresponding nodes. Under this view, the importance of a page can be evaluated from the in-degree and out-degree of its node.
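The graph view can be made concrete with a short Python sketch that counts the in-degree (votes received) and out-degree (votes cast) of every page. The function name `degree_counts` and the `links` dictionary format are assumptions for the example; repeated links between the same two pages are counted once here, which is a simplification.

```python
from collections import Counter


def degree_counts(links):
    """Return (in_degree, out_degree) dictionaries for a link graph.

    links maps each page to the list of pages it links to (assumed format).
    """
    in_degree = Counter()
    out_degree = Counter()
    pages = set(links)

    for page, targets in links.items():
        unique_targets = set(targets)   # count each linked page once
        pages.update(unique_targets)
        out_degree[page] = len(unique_targets)
        for target in unique_targets:
            # Each link is one "vote" for the page it points to.
            in_degree[target] += 1

    # Make sure every page appears in both results, even with degree 0.
    return ({page: in_degree[page] for page in pages},
            {page: out_degree[page] for page in pages})


in_deg, out_deg = degree_counts({"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]})
print(in_deg)
print(out_deg)
```

Counting inbound links alone treats every vote as equal; weighting each vote by the importance of the voting page is what leads to iterative schemes such as PageRank.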
