1, the basic concept: PageRank determines the importance of every web page from a recursive relationship: "a page that is linked to from many high-quality pages must itself be a high-quality page."
2, the specific algorithm: the PageRank of a page is divided evenly among the outbound links on that page. Each linked page adds up the shares it receives from all the pages that link to it, and that sum is the linked page's PageRank.
3, PageRank concept map:
4, the main points of PageRank:
Number of backlinks (a pure popularity indicator)
Whether the backlinks come from highly recommended pages (a well-founded popularity indicator)
The number of outbound links on the page a backlink comes from (an indicator of the probability of being chosen)
5, an example to illustrate the specific process of PageRank:
Suppose a small group consisting of 4 pages: A, B, C, D. If all pages link to A, then A's PR (PageRank) value is the sum of the PR values of B, C and D.
PR(A) = PR(B) + PR(C) + PR(D)
Continue to assume that B also has a link to C, and that D links to 3 pages including A. A page cannot vote twice, so B gives each of the pages it links to half a vote. By the same logic, only one third of D's vote counts toward the PageRank of A.
PR(A) = PR(B)/2 + PR(C) + PR(D)/3
In other words, the PR value of a page is divided by the total number of its outbound links, L:
PR(A) = PR(B)/L(B) + PR(C)/L(C) + PR(D)/L(D)
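The vote-splitting rule above can be checked with a few lines of Java. This is only a minimal sketch: the class name `SimpleVoteExample`, the helper methods, and the starting value of 0.25 for every page are assumptions made for illustration and do not come from the original text.

```java
class SimpleVoteExample {
    // Share of a page's PageRank passed along each of its outbound links.
    static double share(double pr, int outLinks) {
        return pr / outLinks;
    }

    // PR(A) for the example: B has 2 outbound links, C has 1, D has 3.
    static double prOfA(double prB, double prC, double prD) {
        return share(prB, 2) + share(prC, 1) + share(prD, 3);
    }

    public static void main(String[] args) {
        // Assumed starting value: every page begins with PR = 0.25.
        double prA = prOfA(0.25, 0.25, 0.25);
        System.out.println("PR(A) = " + prA);
    }
}
```

With these assumed starting values, B contributes 0.125, C contributes 0.25, and D contributes one third of 0.25.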
To keep pages with no outbound links from passing on a PR of 0, Google's mathematical model assigns every page a very small minimum value, (1-d)/N (N being the total number of pages), on the assumption that a user who lands on a page with no outbound links, or who stops browsing, jumps directly to another page.
Note: in the 1998 paper by Sergey Brin and Lawrence Page, the minimum value set for each page is 1-d, not the (1-d)/N used here (see also the English Wikipedia entry on this point). The PageRank of a page is therefore computed from the PageRank of other pages. Google recalculates the PageRank of every page repeatedly. If each page is given a random nonzero PageRank value, the PR values settle down after repeated calculation, that is, they converge.
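The convergence claim can be tried out on a tiny graph. The sketch below is not taken from the source: the 3-page link structure, the class name `ConvergenceExample`, the two starting vectors, and the damping value 0.85 are all assumptions chosen for illustration. It uses the (1-d)/N variant of the minimum value.

```java
import java.util.Arrays;

class ConvergenceExample {
    // Repeats PR = d * (M x PR) + (1-d)/N until the summed absolute change is <= eps.
    static double[] iterate(double[][] m, double[] start, double d, double eps) {
        int n = start.length;
        double[] cur = start.clone();
        while (true) {
            double[] next = new double[n];
            double diff = 0;
            for (int i = 0; i < n; i++) {
                double sum = 0;
                for (int j = 0; j < n; j++) {
                    sum += m[i][j] * cur[j];
                }
                next[i] = d * sum + (1 - d) / n;
                diff += Math.abs(next[i] - cur[i]);
            }
            cur = next;
            if (diff <= eps) {
                return cur;
            }
        }
    }

    public static void main(String[] args) {
        // Hypothetical graph: A->B, A->C, B->C, C->A.
        // Column j holds 1/L(j) in the rows of the pages that page j links to.
        double[][] m = {
            {0.0, 0.0, 1.0},
            {0.5, 0.0, 0.0},
            {0.5, 1.0, 0.0}
        };
        double[] fromOnes = iterate(m, new double[] {1.0, 1.0, 1.0}, 0.85, 1e-9);
        double[] fromOther = iterate(m, new double[] {0.2, 0.5, 3.0}, 0.85, 1e-9);
        System.out.println(Arrays.toString(fromOnes));
        System.out.println(Arrays.toString(fromOther));
    }
}
```

Both runs should print practically the same vector, which is the stable state described above.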
From the description above, the PageRank formula can be summarized as follows:
PR(p_i) = (1-d)/N + d * Σ_{p_j ∈ M(p_i)} PR(p_j)/L(p_j)
Note: the (1-d)/N term deals with "pages that have no outward links" (these pages are like "black holes" that swallow the probability of the user continuing to browse). d is called the damping factor: the probability that, at any given moment, a user who has reached a page continues browsing by following a link. 1-d (the probability that the user stops clicking and jumps to a random new URL) is applied to all pages and estimates the probability that a page may be bookmarked by the surfer.
p_1, p_2, ..., p_N are the pages under study, M(p_i) is the set of pages that link to p_i, L(p_j) is the number of outbound links on page p_j, and N is the total number of pages.
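One application of the damped formula, in which every page receives (1-d)/N plus d times the shares PR(p_j)/L(p_j) from the pages linking to it, might look like the sketch below. The link structure, the class name `DampedUpdateExample`, and the value d = 0.85 are assumptions for illustration. The sketch also assumes every page has at least one outbound link.

```java
import java.util.Arrays;

class DampedUpdateExample {
    // outLinks[j] lists the pages that page j links to.
    // Assumes every page has at least one outbound link.
    static double[] step(int[][] outLinks, double[] pr, double d) {
        int n = pr.length;
        double[] next = new double[n];
        Arrays.fill(next, (1 - d) / n);                    // minimum value (1-d)/N
        for (int j = 0; j < n; j++) {
            for (int i : outLinks[j]) {
                next[i] += d * pr[j] / outLinks[j].length; // d * PR(p_j)/L(p_j)
            }
        }
        return next;
    }

    public static void main(String[] args) {
        // Pages 0..3 stand for A, B, C, D.
        // Hypothetical links: A->B, B->A and C, C->A, D->A, B and C.
        int[][] outLinks = { {1}, {0, 2}, {0}, {0, 1, 2} };
        double[] pr = {0.25, 0.25, 0.25, 0.25};
        System.out.println(Arrays.toString(step(outLinks, pr, 0.85)));
    }
}
```

In this assumed graph nothing links to D, so after one step D keeps only the minimum value (1-d)/N, while the four values still add up to 1.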
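The PageRank values are the components of an eigenvector of a special matrix. This eigenvector is
R = [PR(p_1), PR(p_2), ..., PR(p_N)]^T
R is the solution of the equation
R = (1-d)/N * [1, 1, ..., 1]^T + d * [l(p_i, p_j)] * R
Here [l(p_i, p_j)] is the N x N matrix whose entry in row i, column j is l(p_i, p_j). l(p_i, p_j) equals 0 if p_j does not link to p_i, and for each j the entries satisfy Σ_i l(p_i, p_j) = 1.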
6, simulating the link relationships between HTML pages, a Java implementation of the PageRank algorithm:
package com.pachira.d;

public class PageRank {
    public static void main(String[] args) {
        // g[i][j] is the probability of moving from page j to page i; every column sums to 1.
        double[][] g = {
            {0,     1, 1/2.0, 0,     1/4.0, 1/2.0, 0},
            {1/5.0, 0, 1/2.0, 1/3.0, 0,     0,     0},
            {1/5.0, 0, 0,     1/3.0, 1/4.0, 0,     0},
            {1/5.0, 0, 0,     0,     1/4.0, 0,     0},
            {1/5.0, 0, 0,     1/3.0, 0,     1/2.0, 1},
            {0,     0, 0,     0,     1/4.0, 0,     0},
            {1/5.0, 0, 0,     0,     0,     0,     0}
        };
        double[] pr = {1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0};
        double alpha = 0.85;
        double eps = 0.0000001;
        pageRank(pr, g, alpha, eps);
    }

    /** Prints a vector on one line, tab-separated. */
    public static void showVector(double[] v) {
        for (int i = 0; i < v.length; i++) {
            System.out.print(v[i] + "\t");
        }
        System.out.println();
    }

    /** Prints a matrix row by row, tab-separated. */
    public static void showMatrix(double[][] m) {
        for (int i = 0; i < m.length; i++) {
            for (int j = 0; j < m[i].length; j++) {
                System.out.print(m[i][j] + "\t");
            }
            System.out.println();
        }
    }

    /**
     * Main routine for computing PageRank.
     * @param vector initial PageRank vector
     * @param matrix initial HTML reverse-link probability matrix
     * @param alpha  damping factor
     * @param eps    convergence threshold
     * @return the last PageRank vector before the change dropped to eps or below
     */
    public static double[] pageRank(double[] vector, double[][] matrix, double alpha, double eps) {
        double[] vectorMove = vector;
        while (true) {
            showVector(vector);
            vectorMove = vectorXMatrix(vector, matrix, alpha);
            double dis = norm(vector, vectorMove);
            if (dis <= eps) {
                break;
            }
            vector = vectorMove;
        }
        return vector;
    }

    /**
     * Computes the error between two vectors (sum of absolute differences).
     * @param vector     previous vector
     * @param vectorMove updated vector
     * @return the error between the two vectors
     */
    public static double norm(double[] vector, double[] vectorMove) {
        // Fail loudly: the original returned -1 here, which the caller treated as convergence.
        if (vector.length != vectorMove.length) {
            throw new IllegalArgumentException("vectors differ in length");
        }
        double sum = 0;
        for (int i = 0; i < vector.length; i++) {
            sum += Math.abs(vector[i] - vectorMove[i]);
        }
        return sum;
    }

    /**
     * Computes the new PageRank values.
     * @param vector PageRank vector
     * @param matrix HTML reverse-link probability matrix
     * @param alpha  damping factor
     * @return new PageRank vector
     * @url:http://zh.wikipedia.org/zh/%E7%9F%A9%E9%98%B5
     * The product of two matrices is defined only if the number of columns of the first
     * matrix A equals the number of rows of the second matrix B.
     * If A is an m x n matrix and B is an n x p matrix, their product AB is an m x p matrix:
     *   | 1 0 2|   |3 1|   |5 1|
     *   |-1 3 1| x |2 1| = |4 2|
     *              |1 0|
     *   | 1 0 2|   |1|   |3|
     *   |-1 3 1| x |1| = |3|
     *              |1|
     */
    public static double[] vectorXMatrix(double[] vector, double[][] matrix, double alpha) {
        // Fail loudly: the original returned null here, which crashed later inside norm().
        if (vector == null || matrix == null || vector.length == 0 || matrix.length == 0
                || vector.length != matrix[0].length) {
            throw new IllegalArgumentException("vector and matrix dimensions do not match");
        }
        double[] result = new double[vector.length];
        for (int i = 0; i < matrix.length; i++) {
            double sum = 0;
            for (int j = 0; j < matrix[i].length; j++) {
                sum += vector[j] * matrix[i][j];
            }
            // Damped value plus the minimum value (1 - alpha) / N.
            sum = alpha * sum + (1 - alpha) / vector.length;
            result[i] = sum;
        }
        return result;
    }
}
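The probability matrix in the listing above is typed in by hand. As a hedged sketch, a matrix of the same shape could instead be derived from each page's list of outbound links. The class name `LinkMatrixBuilder` and the out-link lists below are assumptions: the lists were read off the columns of that matrix and do not appear in the original text. The sketch assumes no page links to the same target twice.

```java
class LinkMatrixBuilder {
    // outLinks[j] = indices of the pages that page j links to (no duplicates assumed).
    // Returns m with m[i][j] = 1/L(j) if page j links to page i, and 0 otherwise.
    static double[][] build(int[][] outLinks) {
        int n = outLinks.length;
        double[][] m = new double[n][n];
        for (int j = 0; j < n; j++) {
            for (int i : outLinks[j]) {
                m[i][j] = 1.0 / outLinks[j].length;
            }
        }
        return m;
    }

    public static void main(String[] args) {
        // Out-links inferred from the columns of the 7 x 7 matrix in the listing.
        int[][] outLinks = {
            {1, 2, 3, 4, 6}, {0}, {0, 1}, {1, 2, 4}, {0, 2, 3, 5}, {0, 4}, {4}
        };
        double[][] m = build(outLinks);
        for (double[] row : m) {
            StringBuilder sb = new StringBuilder();
            for (double v : row) {
                sb.append(v).append('\t');
            }
            System.out.println(sb);
        }
    }
}
```

Building the matrix this way keeps every column summing to 1 as long as each page has at least one outbound link, which is the property the iteration relies on.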
The above content is excerpted from the Wikipedia entry on PageRank.