A more concise and correct understanding of the GAS model

PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs (OSDI)

2013-02-28 18:30:21 | Category: Iterative graph computation

This paper first presents the challenges facing existing parallel graph processing systems, then introduces the PowerGraph solution, and proposes an effective partitioning scheme for power-law graphs.

Parallel graph processing systems such as Pregel and GraphLab are limited by vertices with a large number of neighbors, and the quality of the graph partitioning directly affects the system's communication overhead. However, real-world graphs such as social networks and web graphs typically follow a power-law degree distribution, in which a small number of vertices are connected to a large fraction of the vertices in the graph, and partitioning power-law graphs is a hard problem in itself.

PowerGraph abstracts vertex-centric graph computation into a general computation model, the GAS model, which is divided into three phases: gather, apply, and scatter.

1. Gather phase: for each vertex, information is collected from its adjacent vertices and the corresponding edges and combined with a user-defined sum operation;

2. Apply phase: each vertex uses the sum from the previous phase to update its own value;

3. Scatter phase: the result of the apply phase is used to update the values on the edges connected to the vertex.
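The three phases can be sketched for PageRank roughly as follows. This is a minimal single-machine illustration, not PowerGraph's actual API: the function names, the damping factor of 0.85, the tolerance, and the choice to have scatter activate out-neighbors (rather than write edge values) are all assumptions made for the example.

```python
from collections import defaultdict

DAMPING = 0.85  # conventional PageRank damping factor (assumed, not from the post)


def gather(neighbor_rank, neighbor_out_degree):
    """Gather: contribution collected from one in-neighbor over one in-edge."""
    return neighbor_rank / neighbor_out_degree


def gather_sum(left, right):
    """Sum: commutative, associative combiner for gathered contributions."""
    return left + right


def apply(accumulated):
    """Apply: compute the new vertex value from the accumulated sum."""
    return (1.0 - DAMPING) + DAMPING * accumulated


def scatter(old_rank, new_rank, tolerance):
    """Scatter: decide whether out-neighbors need to be recomputed."""
    return abs(new_rank - old_rank) > tolerance


def pagerank_gas(edges, tolerance=1e-10, max_supersteps=1000):
    """Run the GAS loop in synchronous supersteps over (src, dst) edges."""
    in_nbrs, out_nbrs = defaultdict(list), defaultdict(list)
    for src, dst in edges:
        in_nbrs[dst].append(src)
        out_nbrs[src].append(dst)
    vertices = set(in_nbrs) | set(out_nbrs)

    rank = {v: 1.0 for v in vertices}
    active = set(vertices)
    for _ in range(max_supersteps):
        if not active:
            break
        updated = {}
        for v in active:
            acc = 0.0
            for u in in_nbrs[v]:
                acc = gather_sum(acc, gather(rank[u], len(out_nbrs[u])))
            updated[v] = apply(acc)
        # Only vertices whose value moved noticeably wake up their out-neighbors.
        next_active = set()
        for v, new_rank in updated.items():
            if scatter(rank[v], new_rank, tolerance):
                next_active.update(out_nbrs[v])
        rank.update(updated)
        active = next_active
    return rank
```

For example, `pagerank_gas([("a", "b"), ("a", "c"), ("b", "c"), ("c", "a")])` ranks `c` highest, since it receives links from both other vertices.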

Figure 1 PageRank

Because vertex programs invoke the gather phase frequently while most adjacent vertex values do not change, PowerGraph provides a caching mechanism to reduce the amount of computation. Figure 1 shows pseudocode for the PageRank computation under the PowerGraph model.
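One way to picture the caching idea is an accumulator cache that is patched with deltas instead of being rebuilt. The sketch below is only an illustration under that assumption; the class name `CachedAccumulator` and its methods are hypothetical and are not PowerGraph's interface.

```python
class CachedAccumulator:
    """Cache each vertex's gather result and patch it with deltas."""

    def __init__(self, gather_all):
        self._gather_all = gather_all  # function: vertex -> full gather result
        self._cache = {}
        self.full_gathers = 0          # how often the expensive path was taken

    def get(self, vertex):
        """Return the cached sum, running a full gather only on a cache miss."""
        if vertex not in self._cache:
            self._cache[vertex] = self._gather_all(vertex)
            self.full_gathers += 1
        return self._cache[vertex]

    def post_delta(self, vertex, delta):
        """A neighbor changed by `delta`: patch the cached sum in place."""
        if vertex in self._cache:
            self._cache[vertex] += delta

    def invalidate(self, vertex):
        """Drop the cached sum when it cannot be patched incrementally."""
        self._cache.pop(vertex, None)


# Usage: vertex "v" sums the values of its in-neighbors a, b and c.
values = {"a": 1, "b": 2, "c": 3}
in_nbrs = {"v": ["a", "b", "c"]}
cache = CachedAccumulator(lambda v: sum(values[u] for u in in_nbrs[v]))

first = cache.get("v")      # full gather over all three neighbors
values["b"] = 5             # one neighbor changes its value by +3
cache.post_delta("v", 3)    # patch the cache instead of gathering again
second = cache.get("v")     # served from the cache
```

The patched value matches what a fresh gather would return, but only one full gather was performed.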

PowerGraph also proposes a balanced graph partitioning scheme that reduces communication during computation while preserving load balance. Unlike the random hash placement used by both Pregel and GraphLab, it uses a balanced p-way vertex-cut: edges, rather than vertices, are assigned to machines, and a vertex is replicated on every machine that holds one of its edges. The expected number of replicas per vertex under a vertex-cut is derived from the degree distribution of the graph.
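The equation itself is missing from this copy of the post. As a hedged reconstruction, for the simplest case where each edge is placed on one of p machines uniformly at random, the expected replication factor can be written as below. Here A(v) denotes the set of machines holding a replica of vertex v and D[v] its degree; both symbols are assumptions for this sketch and may not match the notation of the missing equation.

```latex
\mathbb{E}\!\left[\frac{1}{|V|}\sum_{v \in V} |A(v)|\right]
  = \frac{p}{|V|}\sum_{v \in V}\left(1 - \left(1 - \frac{1}{p}\right)^{D[v]}\right)
```

The reasoning is that a given machine receives none of the D[v] edges of v with probability (1 - 1/p)^{D[v]}, so it holds a replica of v with the complementary probability, and summing over the p machines gives the expected number of replicas of v.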

Vertices are cut with this expectation in mind, and the traditional communication process is modified accordingly, as shown in Figure 2 below.

Figure 2 Communication process based on Vertex-cut
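The communication pattern of a vertex-cut can be sketched as follows: each machine computes a partial gather over the edges it stores, and the partial sums are then combined in one place. This is a simplified single-process simulation with random edge placement; all function names are hypothetical, and the placement is not the balanced heuristic from the paper.

```python
import random
from collections import defaultdict


def random_vertex_cut(edges, p, seed=0):
    """Assign each edge to one of p machines uniformly at random."""
    rng = random.Random(seed)
    placement = defaultdict(list)  # machine id -> edges stored on it
    for edge in edges:
        placement[rng.randrange(p)].append(edge)
    return placement


def replicas(placement):
    """Map each vertex to the machines that hold at least one of its edges."""
    spans = defaultdict(set)
    for machine, local_edges in placement.items():
        for src, dst in local_edges:
            spans[src].add(machine)
            spans[dst].add(machine)
    return spans


def replication_factor(spans):
    """Average number of replicas per vertex."""
    return sum(len(machines) for machines in spans.values()) / len(spans)


def distributed_gather(placement, value, target):
    """Sum the in-neighbor values of `target` via per-machine partial sums.

    Returns the combined sum and the number of partial sums that had to be
    sent, which is at most the number of machines, however high the degree.
    """
    partials = {}
    for machine, local_edges in placement.items():
        local = [value[src] for src, dst in local_edges if dst == target]
        if local:
            partials[machine] = sum(local)  # computed next to the local edges
    return sum(partials.values()), len(partials)


# Usage: a star graph whose hub has 100 in-neighbors, cut across 4 machines.
edges = [(i, "hub") for i in range(100)]
value = {i: i for i in range(100)}
placement = random_vertex_cut(edges, p=4)
spans = replicas(placement)
total, messages = distributed_gather(placement, value, "hub")
```

The point of the example is that the hub's 100 incoming values are reduced to at most 4 partial sums before they cross machine boundaries.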

For the experiments, PowerGraph was implemented in three versions that differ in execution mode: global synchronous, global asynchronous, and serializable asynchronous.

1. Global synchronous: similar to Pregel, a global synchronization point is set at each superstep so that all edge and vertex changes are synchronized;

2. Global asynchronous: similar to GraphLab, all edge or vertex changes made in the apply and scatter phases are immediately written to the graph;

3. Serializable asynchronous: pure global asynchrony makes algorithm design and debugging very complex, and some algorithms may run less efficiently than under global synchronization, so this mode combines asynchronous execution with serializability.
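The difference between the synchronous and asynchronous modes can be illustrated with a toy minimum-label propagation, where every vertex repeatedly adopts the smallest label among itself and its neighbors. This is a sequential sketch, not PowerGraph's engine: the function name and the fixed visiting order are assumptions, and a real asynchronous engine would not have a deterministic order.

```python
def min_label_sweeps(neighbors, synchronous):
    """Propagate the minimum label; return the final labels and sweep count.

    In synchronous mode every vertex reads a snapshot taken at the start of
    the sweep (a superstep barrier). In asynchronous mode updates are visible
    to later vertices within the same sweep.
    """
    label = {v: v for v in neighbors}
    sweeps = 0
    changed = True
    while changed:
        changed = False
        sweeps += 1
        source = dict(label) if synchronous else label
        for v in sorted(neighbors):
            best = min([source[v]] + [source[u] for u in neighbors[v]])
            if best < label[v]:
                label[v] = best
                changed = True
    return label, sweeps


# Usage: a chain 0 - 1 - 2 - 3 - 4 - 5.
n = 6
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}
sync_labels, sync_sweeps = min_label_sweeps(chain, synchronous=True)
async_labels, async_sweeps = min_label_sweeps(chain, synchronous=False)
```

Both runs end with every vertex labeled 0, but on this chain the synchronous version needs one sweep per hop while the asynchronous one finishes much sooner. The asynchronous result depends on the visiting order, which hints at why such executions are harder to design and debug.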

For fault tolerance, PowerGraph relies on checkpointing and adopts the Chandy-Lamport snapshot algorithm used in GraphLab.

In summary, the paper abstracts the graph computation model, designs a balanced graph partitioning scheme, implements and compares the system in three execution modes, and provides fault tolerance.
