The graph processing dimension of big data analytics beyond Hadoop
Source: Internet
Author: User
Keywords: implementation, graph processing, beyond, big data, functions
Another important tool that looks beyond Hadoop MapReduce (MR) comes from Google: the Pregel framework for graph computation (Malewicz et al. 2010). A computation in Pregel consists of a series of iterations called supersteps. Each vertex in the graph is associated with a user-defined compute function, and in every superstep Pregel invokes that function on each vertex in parallel. Vertices can send messages along edges and so exchange values with other vertices. There is also a global barrier: the next superstep begins only after every compute function in the current one has finished. Readers familiar with bulk synchronous parallel (BSP) computing can see why Pregel is a good example of it: a set of entities that compute user-defined functions in parallel, synchronize globally, and can exchange messages.
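The superstep model described above can be sketched in a few lines of single-process Python. This is only an illustration under stated assumptions, not Pregel's actual API: the function name pregel_max, the dictionary-based graph, and the "propagate the largest value" task are all hypothetical choices made for the example. Each vertex reads the messages sent to it in the previous superstep, updates its value, and sends messages only if its value changed; the swap of outbox into inbox plays the role of the global barrier.

```python
from collections import defaultdict

def pregel_max(graph, values):
    """Propagate the maximum value through a graph, superstep by superstep.

    graph  -- dict mapping every vertex to a list of its out-neighbours
    values -- dict mapping every vertex to its initial value
    Returns (final values, number of supersteps executed).
    """
    values = dict(values)
    inbox = defaultdict(list)      # messages visible in the current superstep
    active = set(graph)            # every vertex is active in superstep 0
    superstep = 0
    while active:
        outbox = defaultdict(list)
        # Conceptually parallel: one compute call per active vertex.
        for v in active:
            new_value = max([values[v]] + inbox[v])
            changed = new_value != values[v]
            values[v] = new_value
            if superstep == 0 or changed:
                for neighbour in graph[v]:
                    outbox[neighbour].append(new_value)
            # Otherwise the vertex stays silent, i.e. it votes to halt.
        # Global barrier: messages become visible only in the next superstep.
        inbox = outbox
        active = set(outbox)       # incoming messages reactivate a vertex
        superstep += 1
    return values, superstep

if __name__ == "__main__":
    chain = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
    start = {"A": 3, "B": 6, "C": 2, "D": 1}
    print(pregel_max(chain, start))
```

The loop ends when no messages are in flight, which corresponds to every vertex having halted. A real BSP engine would run the inner loop on many workers at once and exchange the outbox over the network before the barrier.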
Apache Hama (Seo et al. 2010) is the open source equivalent of Pregel, being an implementation of BSP. Hama implements BSP on top of HDFS as well as Microsoft's Dryad engine, presumably because its developers do not want to be seen as diverging from the Hadoop community. The important point, however, is that BSP is inherently well suited to iterative computation, and Hama has a parallel implementation of CGD, which is not easy to implement on Hadoop. It must be noted that Hama's BSP engine is implemented over MPI, the originator of the parallel programming literature (www.mcs.anl.gov/research/projects/mpi/). Apache Giraph, Golden Orb, and the Stanford GPS project are also inspired by Pregel.
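To see why an algorithm such as CGD is iterative in a way that suits BSP, consider the sequential sketch below. It assumes that CGD here refers to conjugate gradient descent for solving A x = b with a symmetric positive-definite matrix A; the function names are hypothetical and nothing here is Hama's API. Every pass through the loop needs the residual and search direction produced by the previous pass, so on a MapReduce engine each iteration would typically have to be a separate job, whereas a BSP engine can treat each iteration as a superstep.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def mat_vec(A, x):
    """Product of a matrix (list of rows) with a vector."""
    return [dot(row, x) for row in A]

def conjugate_gradient(A, b, tol=1e-10, max_iter=100):
    """Solve A x = b, assuming A is symmetric positive-definite.

    tol is compared against the squared norm of the residual.
    """
    x = [0.0] * len(b)
    r = list(b)                # residual b - A x for the initial guess x = 0
    p = list(r)                # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old < tol:
            break
        Ap = mat_vec(A, p)     # the step that would be distributed in parallel
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        # The next direction depends on this iteration's residual.
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x

if __name__ == "__main__":
    print(conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0]))
```

In a parallel version the matrix-vector product and the dot products would be split across workers, with a synchronization point after each, which maps naturally onto supersteps and barriers.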
GraphLab (Gonzalez et al. 2012) has emerged as a leading example of modern graph processing. GraphLab originated as an academic project at the University of Washington and Carnegie Mellon University (CMU). It provides useful abstractions for processing graphs across the nodes of a cluster. PowerGraph, the subsequent version of GraphLab, makes it efficient to process natural, or power-law, graphs: graphs with a large number of poorly connected vertices and a small number of highly connected ones. Performance evaluations of PageRank and triangle counting on the Twitter graph have shown GraphLab to be more efficient than other approaches. The main focus of this book is Giraph, GraphLab, and related efforts.
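As a rough illustration of one of the benchmark problems mentioned above, here is a minimal single-machine sketch of triangle counting. It is not GraphLab or PowerGraph code; the function name count_triangles and the edge-list input are assumptions made for the example. Note that the work done at a vertex grows with the square of its degree, which hints at why the few highly connected vertices of a power-law graph dominate the cost and need special treatment.

```python
from itertools import combinations

def count_triangles(edges):
    """Count triangles in an undirected graph given as (u, v) pairs."""
    adjacency = {}
    for u, v in edges:
        if u == v:
            continue                       # ignore self-loops
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    closed_pairs = 0
    for u, neighbours in adjacency.items():
        # Check every pair of neighbours of u for a connecting edge.
        for v, w in combinations(neighbours, 2):
            if w in adjacency[v]:
                closed_pairs += 1
    # Each triangle is found once from each of its three corners.
    return closed_pairs // 3

if __name__ == "__main__":
    sample = [(1, 2), (2, 3), (1, 3), (3, 4)]
    print(count_triangles(sample))
```

A vertex-centric framework would run the inner pair check as the per-vertex computation and then sum the partial counts across the cluster.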
Table 1.1 compares the various paradigms on nonfunctional features such as scalability, fault tolerance, and the algorithms that have been implemented. It can be inferred that although the traditional tools worked only on a single node, could not scale horizontally, and might have a single point of failure, recent reengineering efforts have moved them across generations. Notably, most graph-processing paradigms are not fault tolerant, whereas Spark and HaLoop are among the third-generation tools that do provide fault tolerance.
If you reproduce this article, please credit the original source: http://outofmemory.cn/wr/?u=http%3A%2F%2Fifeve.com%2Fbigdataanalyticsbeyondhadoop_graphprocessingdimension%2F