Hama is based on the BSP model implementation on HDFS. Apache Hame is an open source implementation of Google Pregel
Pregel is a general-purpose programming model proposed by Google for large-scale graph computing. Many practical applications involve large-scale graph algorithms, such as Web links , social relationships, location maps, citation relationships in research papers, and so on, with a figure of up to billions of vertices and tens of billions of edges. The Pregel programming model is designed to efficiently compute this large-scale graph .
Pregel's design idea comes from the BSP (Bluk synchronous Parallel) model. The BSP model consists of three parts: the BSP machine model, the BSP calculation model, and the BSP cost model. The BSP calculation model adopts the method of single program multi-data (SPMD) execution . The BSP calculation consists of a set of processing units and a series of successive super-steps (Superstep). Within each super step, each processing unit performs local computations concurrently and sends messages to other processing units. There is a global synchronization operation at the end of a super step. Therefore, it can be considered as local computing-global communication-fence synchronization mode .
Specifically, thePregel calculation consists of a series of iterations (that is, super-step). In each super step, the compute framework invokes the user-defined compute function on the vertex, which is executed in parallel. The COMPUTE function defines what needs to be done in a vertex V and a super step s. The function can read the message sent in the previous super step S-1, and then send the message to the other vertices that are processed in the next Uber s+1, and modify the state of V and the state of its out edge in the process, or modify the topology of the diagram. Messages are sent through the edges of the vertices, but a message can be sent to a specific vertex of any known ID. This computational pattern is well suited for distributed implementations: Vertex computations are parallel, and it does not limit the order of execution for each super step, and all communication is limited to S to s+1 .
Pregel is a vertex-centric computational model in which the edge is not the first type of object, and there is no corresponding calculation on the edge.
A little bit less:
One-way relaxation simulation: A child node or grandchild node must appear in the condition set for an out-of-edge pattern diagram
Less: If the point does not exist, then his son node must exist, there will be useless results, such as the existence of a-->b, the output B
Bidirectional relaxation Simulation: Reverse the edges of the pattern graph, then match, bidirectional match.
Less: If the point does not exist, then both his son and Father node must exist.
Application scenario: Societies have found that the work of some people in a group can be replaced at the top level.
Data Source: Twitter,amazon,google
Compare with whom, what is the advantage?
Hama vertex programming