Continuing from "How Search Engines Judge Whether a Website Is Cheating: An Analysis of the Principles (I)"
Guangzhou SEO Chen Yong continues by analyzing three representative algorithms, one each for the trust propagation model, the distrust propagation model and the anomaly detection model: the TrustRank algorithm, the BadRank algorithm and the SpamRank algorithm, respectively.
Let's start with a detailed introduction to the TrustRank algorithm.
The TrustRank algorithm belongs to the trust propagation model and follows that model's general process, so the algorithm consists of two steps.
Step one: Identify a collection of trustworthy pages
The TrustRank algorithm relies on manual review to decide whether a page belongs in the set of trusted pages. Because reviewing pages by hand is labor-intensive, two strategies are proposed for automatically selecting a primary set of candidate pages, and only this primary set is then reviewed manually.
* Primary Strategy 1: High PR score. Pages with a high PR score are assumed to be trustworthy, so you can compute the PR value of every page and extract a small number of the highest-scoring pages as the primary page set.
* Primary Strategy 2: Inverse PR. Ordinary PR computes a page's weight from its inlinks; inverse PR does the opposite and computes the weight from the page's outlinks, that is, the links between pages are reversed before the calculation. The subset of pages with the highest scores is selected as the primary page set.
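The two primary strategies can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`pagerank`, `reverse_graph`, `candidate_seeds`), the damping value of 0.85, the iteration count and the toy link graph are all made up for this example, and the PR routine is a plain power iteration rather than any search engine's actual implementation.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Plain power-iteration PR on a dict {page: [pages it links to]}.

    Assumes every link target also appears as a key of `graph`.
    """
    pages = list(graph)
    n = len(pages)
    score = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        nxt = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = graph[p]
            if targets:
                share = damping * score[p] / len(targets)
                for t in targets:
                    nxt[t] += share
            else:
                # Page without outlinks: spread its score over all pages.
                share = damping * score[p] / n
                for t in pages:
                    nxt[t] += share
        score = nxt
    return score


def reverse_graph(graph):
    """Flip every link so that outlinks become inlinks."""
    rev = {p: [] for p in graph}
    for p, targets in graph.items():
        for t in targets:
            rev[t].append(p)
    return rev


def candidate_seeds(graph, top_n, inverse=False):
    """Return the top_n pages by PR (Strategy 1) or inverse PR (Strategy 2)."""
    g = reverse_graph(graph) if inverse else graph
    score = pagerank(g)
    return sorted(score, key=score.get, reverse=True)[:top_n]


# Hypothetical toy web: "hub" links out a lot, "d" is linked to a lot.
toy_web = {
    "hub": ["a", "b", "c"],
    "a": ["d"],
    "b": ["d"],
    "c": ["d"],
    "d": [],
}

print(candidate_seeds(toy_web, 1))                # Strategy 1 candidate
print(candidate_seeds(toy_web, 1, inverse=True))  # Strategy 2 candidate
```

On this toy graph, Strategy 1 favors the page with the most inlinks, while Strategy 2 favors the page whose outlinks reach the most other pages. Either way the result is only a candidate list that still has to pass manual review.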
Step two: Propagate trust scores from the whitelisted pages to other pages according to certain rules
In this step, the trust propagation method of the TrustRank algorithm is based on the following two assumptions.
Assumption 1: The closer a page is to a trusted page, the more trustworthy it is. Distance here means the number of links that must be followed to get from the trusted page to the page in question.
Assumption 2: The more outlinks a high-quality page contains, the less likely it is that any one of the pages it points to is a high-quality page.
From Assumption 1 comes the trust attenuation strategy: the farther a page is from the trusted pages, the smaller the trust score that reaches it through propagation.
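Trust attenuation on its own can be illustrated with a breadth-first walk from the trusted pages. The sketch below assumes a hypothetical decay factor of 0.85 per link and a made-up function name `attenuated_trust`; the source does not state a specific decay value.

```python
from collections import deque


def attenuated_trust(graph, seeds, decay=0.85):
    """Give each reachable page decay**distance, where distance is the
    number of links on the shortest path from any trusted seed page."""
    trust = {s: 1.0 for s in seeds}
    queue = deque(seeds)
    while queue:
        page = queue.popleft()
        for target in graph.get(page, []):
            if target not in trust:  # first visit is the shortest path
                trust[target] = trust[page] * decay
                queue.append(target)
    return trust


# Hypothetical chain: seed -> a -> b
chain = {"seed": ["a"], "a": ["b"], "b": []}
chain_trust = attenuated_trust(chain, ["seed"])
```

Here `a` is one link away from the seed and `b` is two links away, so `b` ends up with a smaller score than `a`. Pages that cannot be reached from any seed receive no score at all.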
From Assumption 2 comes the trust splitting strategy: the trust value a page has obtained is divided evenly among its outlinks. If a page has k outlinks, each outlink is assigned 1/k of the page's trust score, and that share is passed on to the page the outlink points to.
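A single round of even splitting might look like the following. The function name `split_trust` and the toy graph are assumptions made for illustration only.

```python
def split_trust(graph, trust):
    """One propagation round: a page with k outlinks passes 1/k of its
    trust score along each outlink."""
    received = {page: 0.0 for page in graph}
    for page, targets in graph.items():
        if not targets:
            continue  # no outlinks, nothing to pass on
        share = trust.get(page, 0.0) / len(targets)
        for target in targets:
            received[target] = received.get(target, 0.0) + share
    return received


# Hypothetical example: one trusted page with k = 4 outlinks.
fan_out = {"seed": ["a", "b", "c", "d"], "a": [], "b": [], "c": [], "d": []}
after_one_round = split_trust(fan_out, {"seed": 1.0})
```

With k = 4, each linked page receives 1/4 of the seed's score. Had the seed linked to only one page, that page would have received the full score, which is how Assumption 2 is reflected in the calculation.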
Trust scores are propagated across the page link graph by combining these two propagation strategies. In the final result, pages whose trust score falls below a certain threshold are considered cheating pages.
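Putting the two strategies together, one common way to write the propagation is as an iteration in which every page splits its current trust evenly over its outlinks, the passed-on amount is multiplied by a decay factor, and the seed pages are topped up in each round. The sketch below follows that formulation. The decay of 0.85, the 20 iterations, the threshold of 0.01, the function names and the toy graph are all assumptions chosen for illustration, not values given in the text.

```python
def trustrank(graph, seeds, decay=0.85, iterations=20):
    """Propagate trust from seed pages using attenuation plus even splitting.

    Assumes every link target also appears as a key of `graph`.
    """
    pages = list(graph)
    seed_score = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in pages}
    trust = dict(seed_score)
    for _ in range(iterations):
        # Seeds are re-injected each round with weight (1 - decay).
        nxt = {p: (1.0 - decay) * seed_score[p] for p in pages}
        for p in pages:
            targets = graph[p]
            if targets:
                # Splitting (divide by k) combined with attenuation (decay).
                share = decay * trust[p] / len(targets)
                for t in targets:
                    nxt[t] += share
        trust = nxt
    return trust


def flag_low_trust(trust, threshold):
    """Pages whose trust score is below the threshold."""
    return sorted(p for p, s in trust.items() if s < threshold)


# Hypothetical web: no trusted page links to the spam pages,
# although one spam page links to a good page.
small_web = {
    "seed": ["good1", "good2"],
    "good1": ["good2"],
    "good2": ["seed"],
    "spam1": ["spam2"],
    "spam2": ["spam1", "good1"],
}

scores = trustrank(small_web, {"seed"})
suspects = flag_low_trust(scores, threshold=0.01)
```

In this toy graph the two spam pages link to each other and to a good page, but nothing trusted links to them, so no trust ever flows in and they fall below the threshold. A link from a spam page to a good page does not help the spam page, because trust only travels in the direction of the link.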
That is all for this part of the analysis. "How Search Engines Judge Whether a Website Is Cheating: An Analysis of the Principles (III)" will explain the BadRank algorithm; for details, see my blog (http://www.30ly.com).
This article was originally published on Guangzhou SEO Chen Yong's blog: http://www.30ly.com/?p=205
When reprinting, please include the original address.