PageRank: To rank pages with PageRank, the search engine has to compute a PageRank value for every node in the link graph.
A formula defines this value, and the PageRank value of each node has two parts: the node's own initial PageRank value, and a contribution from the PageRank values of all the neighboring nodes linked to it.
The second part means that the more PageRank a node's neighbors have, the higher its own value becomes, so the quality of the neighbors also affects the PageRank value of the node itself.
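The formula itself is not reproduced in the text. As a point of reference, one common textbook form of PageRank matches the two-part description above. It assumes a damping factor d (often set around 0.85), N nodes in total, In(v) as the set of nodes linking to v, and L(u) as the number of outgoing links of u. These symbols are assumptions for illustration and do not come from the original:

```latex
PR(v) = \underbrace{\frac{1 - d}{N}}_{\text{base value of the node}}
      + \underbrace{d \sum_{u \in \mathrm{In}(v)} \frac{PR(u)}{L(u)}}_{\text{contribution of the neighbors}}
```

The first term is the same for every node, while the second grows when the nodes linking to v have high PageRank values themselves.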
Take a social network as an example: two people, A and B, each have 5 friends. A's friends are Lily, Lucy, Andy, Kitty, and Rocky, while B's friends are Obama, Bill Gates, and other elites. B is then clearly more important than A.
This calculation has to be iterated over and over again. It can be implemented with MapReduce: the computational task is distributed to multiple machines that compute in parallel (Map), and the results from those machines are then aggregated (Reduce).
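The iterate-then-aggregate process described above can be sketched in a single process. This is a minimal illustration, not a distributed implementation: the function names, the damping factor of 0.85, and the three-node graph are all assumptions, and every node is assumed to have at least one outgoing link.

```python
from collections import defaultdict


def map_phase(graph, ranks):
    """Map: every node emits an equal share of its rank to each out-neighbor."""
    for node, neighbors in graph.items():
        yield node, 0.0  # ensures the node appears in the reduce step
        for neighbor in neighbors:
            yield neighbor, ranks[node] / len(neighbors)


def reduce_phase(pairs, num_nodes, damping=0.85):
    """Reduce: sum the shares received by each node and add the base term."""
    totals = defaultdict(float)
    for node, share in pairs:
        totals[node] += share
    return {
        node: (1 - damping) / num_nodes + damping * total
        for node, total in totals.items()
    }


def pagerank_mapreduce(graph, iterations=20, damping=0.85):
    """Run a fixed number of map/reduce rounds (assumed, not from the source)."""
    num_nodes = len(graph)
    ranks = {node: 1.0 / num_nodes for node in graph}
    for _ in range(iterations):
        ranks = reduce_phase(map_phase(graph, ranks), num_nodes, damping)
    return ranks


# Hypothetical graph: node -> list of nodes it links to.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank_mapreduce(graph)
```

In this toy graph, node c is linked from both a and b, so it ends up with a higher value than b, which is linked only from a. Each round is one full pass over the whole graph, which is the property the next paragraphs contrast with Pregel.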
After 2010, PageRank was no longer computed with MapReduce but with Pregel.
Unlike MapReduce, Pregel is node-centric rather than centered on the whole graph. After a node calculates its value, it proactively sends messages to other nodes to inform them of the updated result.
When a node receives the messages sent by its neighbors, it extracts the neighbors' values and performs a new round of update calculations, as shown in the figure:
MapReduce and Pregel
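The node-centric message passing described above can be sketched as follows. This is a simplified single-process illustration and does not use the actual Pregel API: the Vertex class, the run_pregel function, the fixed number of supersteps, and the example graph are assumptions, and the halting mechanism of a real Pregel system is left out.

```python
from collections import defaultdict


class Vertex:
    """A node that keeps its own value and talks to neighbors via messages."""

    def __init__(self, name, out_neighbors, num_vertices, damping=0.85):
        self.name = name
        self.out_neighbors = out_neighbors
        self.num_vertices = num_vertices
        self.damping = damping
        self.value = 1.0 / num_vertices

    def compute(self, superstep, messages):
        """Update from incoming messages, then return (target, share) messages."""
        if superstep > 0:
            self.value = (
                (1 - self.damping) / self.num_vertices
                + self.damping * sum(messages)
            )
        share = self.value / len(self.out_neighbors)
        return [(target, share) for target in self.out_neighbors]


def run_pregel(graph, supersteps=20):
    """Run supersteps; messages sent in one step are delivered in the next."""
    vertices = {
        name: Vertex(name, neighbors, len(graph))
        for name, neighbors in graph.items()
    }
    inbox = defaultdict(list)
    for step in range(supersteps + 1):  # step 0 only sends the initial values
        outbox = defaultdict(list)
        for name, vertex in vertices.items():
            for target, share in vertex.compute(step, inbox[name]):
                outbox[target].append(share)
        inbox = outbox
    return {name: vertex.value for name, vertex in vertices.items()}


# Hypothetical graph: node -> list of nodes it links to.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = run_pregel(graph)
```

Here the update logic lives inside each vertex, and the surrounding loop only delivers messages. That is the sense in which the model is centered on the node rather than on the whole graph.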