3.4 Web Page Analysis Algorithms

Source: Internet
Author: User

In a search engine, the crawler fetches pages and stores them in the server's raw page database. The search engine then
analyzes these pages and determines the importance of each one, which affects how results are ranked for user searches.
Both the importance scores and the final ranking are produced by algorithms, so we first need to understand those algorithms.

Search engine page analysis algorithms fall into 3 categories: algorithms based on user behavior, algorithms based on
network topology, and algorithms based on page content. Each category is explained below.

1 Web page analysis algorithm based on user behavior

This kind of algorithm evaluates pages according to how users access them. For example, it can combine how often users
visit a page, how long they stay on it, and the click-through rate into one overall evaluation of the page.
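
The combination described above can be sketched as a weighted sum. This is only an illustration: the function name, the weights, and the caps `max_visits` and `max_seconds` are assumptions made for the example, not values from any real search engine.

```python
def behavior_score(visits, avg_seconds, clicks, impressions,
                   weights=(0.4, 0.3, 0.3), max_visits=1000, max_seconds=300):
    """Combine three user-behavior signals into one score between 0 and 1.

    All weights and caps are hypothetical; a real system would tune them.
    """
    # Scale visit count and visit duration into [0, 1], capping at the maximum.
    visit_part = min(visits / max_visits, 1.0)
    time_part = min(avg_seconds / max_seconds, 1.0)
    # Click-through rate: clicks divided by how often the page was shown.
    ctr_part = clicks / impressions if impressions else 0.0

    w_visit, w_time, w_ctr = weights
    return w_visit * visit_part + w_time * time_part + w_ctr * ctr_part


popular = behavior_score(visits=800, avg_seconds=120, clicks=50, impressions=200)
ignored = behavior_score(visits=10, avg_seconds=5, clicks=1, impressions=200)
print(popular > ignored)
```

A page that is visited often, read for longer, and clicked more frequently ends up with the higher score.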

2 Web page analysis algorithm based on network topology

Algorithms based on network topology analyze pages using the link relationships between pages, their structural
relationships, and already-known pages or data. "Topology" here simply means structural relationships. These
algorithms can in turn be subdivided into 3 types: page-granularity analysis algorithms, page-block-granularity
analysis algorithms, and site-granularity analysis algorithms.

PageRank is a typical page-granularity analysis algorithm and the algorithm at the core of the Google search engine.
Simply put, it computes a weight for each page from the link relationships between pages, and pages can then be
ranked by that weight. The algorithm has many details, which are not covered here. Besides PageRank,
the HITS algorithm is another common page-granularity analysis algorithm.
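
To make the idea of "weight from link relationships" concrete, here is a minimal sketch of the textbook power-iteration form of PageRank. It is not Google's production implementation; the damping factor of 0.85, the iteration count, and the tiny example graph are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Return a dict of page -> weight.

    links maps each page to the list of pages it links to.
    """
    # Collect every page that appears as a source or as a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small base weight.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its weight evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its weight over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(graph)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this example graph, page C is linked to by three other pages and ends up with the largest weight, while D, which nothing links to, ends up with the smallest.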

Page-block-granularity analysis algorithms also rely on the links between pages, but the calculation rules are different.
A page typically contains many hyperlinks, but not all of the outbound links point to pages related to the site's
topic, and the links are not all equally important to the page. An algorithm of this kind therefore divides a
page's outbound links into levels, and links at different levels carry different importance for the page. Its
efficiency and accuracy are somewhat better than those of the traditional algorithms.
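
One way to picture the "levels" of outbound links is to tag each link with the page block it sits in and weight it accordingly. The block names and weights below are hypothetical, as is the function itself; the source does not specify how the levels are chosen.

```python
# Hypothetical importance of each page block; real systems derive these differently.
BLOCK_WEIGHTS = {"main": 1.0, "sidebar": 0.5, "nav": 0.2, "footer": 0.1}


def weighted_out_links(links, block_weights=BLOCK_WEIGHTS, default=0.1):
    """Return target -> share of the page's vote, based on block importance.

    links is a list of (target_url, block_name) pairs found on one page.
    """
    totals = {}
    for target, block in links:
        # A link in an important block contributes more than one in the footer.
        weight = block_weights.get(block, default)
        totals[target] = totals.get(target, 0.0) + weight

    total = sum(totals.values())
    if total == 0:
        return {}
    # Normalize so that the shares of one page add up to 1.
    return {target: weight / total for target, weight in totals.items()}


page_links = [
    ("http://example.com/article-2", "main"),
    ("http://example.com/about", "footer"),
    ("http://example.com/category", "nav"),
]
print(weighted_out_links(page_links))
```

With an unweighted approach each of the three links would receive one third of the page's vote; here the link inside the main content receives most of it.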

Site-granularity analysis algorithms are similar to PageRank, except that analysis at site granularity uses the
corresponding SiteRank algorithm. Here sites are divided into levels and ranks, and the rank of each individual page
under a site is no longer computed. This makes the approach simpler and more efficient than page-granularity
algorithms, but it has drawbacks, such as lower accuracy than page-granularity analysis.
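
The text does not give the details of SiteRank, so the sketch below only shows the step that distinguishes site granularity: collapsing page-level links into site-level links. Treating the URL's host name as the "site", and the function name `site_graph`, are assumptions made for this example.

```python
from urllib.parse import urlsplit


def site_graph(page_links):
    """Collapse page-to-page links into a site-to-site link graph.

    page_links is an iterable of (source_url, target_url) pairs.
    Returns a dict mapping each site to the set of other sites it links to.
    """
    graph = {}
    for source, target in page_links:
        # Assumption: the host name identifies the site.
        source_site = urlsplit(source).netloc
        target_site = urlsplit(target).netloc
        graph.setdefault(source_site, set())
        graph.setdefault(target_site, set())
        # Links between pages of the same site say nothing about other sites.
        if source_site != target_site:
            graph[source_site].add(target_site)
    return graph


links = [
    ("http://a.example/p1", "http://b.example/x"),
    ("http://a.example/p2", "http://b.example/y"),
    ("http://a.example/p1", "http://a.example/p2"),
    ("http://b.example/x", "http://c.example/"),
]
print(site_graph(links))
```

The four page-level links shrink to two site-level links, which is why ranking sites is cheaper than ranking pages, and also why differences between pages of the same site are lost.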


3 Web page analysis algorithm based on Web content
Algorithms based on Web content evaluate a page according to characteristics of its content, such as its data
and text.
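
The text leaves the content features unspecified. As one simple illustration, a page's text can be scored by the fraction of its words that match the terms of a query. The tokenizer and the scoring rule below are assumptions for this sketch, not a method named in the source.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase runs of ASCII letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def content_score(page_text, query):
    """Return the fraction of the page's words that match a query term."""
    words = tokenize(page_text)
    if not words:
        return 0.0
    counts = Counter(words)
    # Count each distinct query term once, however often the query repeats it.
    matches = sum(counts[term] for term in set(tokenize(query)))
    return matches / len(words)


page = "Python crawler tutorial for Python beginners"
print(content_score(page, "python crawler"))  # 3 of 6 words match -> 0.5
```

Real content-based analysis uses much richer features, but the principle is the same: the score comes from the page's own content rather than from links or user behavior.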

If any experts have detailed articles on these algorithms, please share them. Thank you!
