Analyzing node importance with Hadoop: counting the triangles at each node of a graph

This article shows how Hadoop can measure the importance of the nodes in an undirected graph by counting the number of triangles that each node belongs to.

Measuring the importance of the nodes in a graph and ranking them matters in big data: when the data of a large graph is organized and distributed, the important nodes can be identified and given special treatment.

The solution is explained in three steps:

1. Python: generate the adjacency matrix of an undirected graph

2. Python: draw this undirected graph

3. Hadoop MapReduce: count the number of triangles at each node of the graph

For matrix multiplication with Hadoop, see the previous article: http://blog.csdn.net/thao6626/article/details/46472535

1. Python: generate the adjacency matrix of an undirected graph

# coding:utf-8
__author__ = 'Taohao'
import random


class AdjMatrix(object):
    def build_adjmatrix(self, dimension):
        # temp is the first column of the current row, so only the upper
        # triangle (including the diagonal) is generated at random.
        temp = 1
        with open("./adjmatrix.txt", 'w+') as fd:
            for i in range(1, dimension + 1):
                for j in range(temp, dimension + 1):
                    if i == j:
                        # No self-loops: the diagonal is always 0.
                        if i == dimension:
                            # Last record of the file: no trailing newline.
                            fd.write('A,' + str(i) + ',' + str(j) + ',' + '0' + '\n')
                            fd.write('B,' + str(i) + ',' + str(j) + ',' + '0')
                        else:
                            fd.write('A,' + str(i) + ',' + str(j) + ',' + '0' + '\n')
                            fd.write('B,' + str(i) + ',' + str(j) + ',' + '0' + '\n')
                    else:
                        # Write the same value at (i, j) and (j, i) so that
                        # the matrix is symmetric, for both copies A and B.
                        value = random.randint(0, 1)
                        fd.write('A,' + str(i) + ',' + str(j) + ',' + str(value) + '\n')
                        fd.write('A,' + str(j) + ',' + str(i) + ',' + str(value) + '\n')
                        fd.write('B,' + str(i) + ',' + str(j) + ',' + str(value) + '\n')
                        fd.write('B,' + str(j) + ',' + str(i) + ',' + str(value) + '\n')
                temp += 1


if __name__ == '__main__':
    adjmatrix = AdjMatrix()
    adjmatrix.build_adjmatrix(10)
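The generator writes one record per line in the form `matrix,row,column,value`, and emits every entry once under the name `A` and once under the name `B`, presumably so that the same file can feed the multiplication of the adjacency matrix with itself. As a minimal sketch, the records can be read back and checked for symmetry. The helper names `parse_records` and `is_symmetric` are hypothetical and not part of the original script.

```python
def parse_records(lines, matrix_name='A'):
    """Collect the entries of one matrix from 'name,row,col,value' records."""
    entries = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        name, i, j, value = line.split(',')
        if name == matrix_name:
            entries[(int(i), int(j))] = int(value)
    return entries


def is_symmetric(entries):
    """An undirected graph has the same value at (i, j) and (j, i)."""
    return all(entries.get((j, i)) == v for (i, j), v in entries.items())


# A hand-written 2 x 2 sample in the same record format as adjmatrix.txt.
sample = [
    "A,1,1,0", "B,1,1,0",
    "A,1,2,1", "A,2,1,1", "B,1,2,1", "B,2,1,1",
    "A,2,2,0", "B,2,2,0",
]
a = parse_records(sample)
print(a[(1, 2)], is_symmetric(a))
```

To check a generated file instead of the sample, pass the open file object, which iterates over lines, as the `lines` argument.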

2. Python: draw this undirected graph

# coding:utf-8
__author__ = 'Taohao'
import matplotlib.pyplot as plt
import networkx as nx


class DrawGraph(object):
    def __init__(self):
        self.graph = nx.Graph(name='graph')

    def build_graph(self):
        with open('./adjmatrix.txt', 'r') as fd:
            for line in fd:
                item = line.split(',')
                print(item)
                # length = len(item)
                # Only the 'A' records are needed; 'B' holds the same matrix.
                if item[0] == 'A':
                    self.graph.add_node(item[1])
                    self.graph.add_node(item[2])
                    # self.graph.add_nodes_from([int(item[1]), int(item[2])])
                    # item[3] may end with a newline, so test its first character.
                    if item[3][0] == '1':
                        self.graph.add_edge(item[1], item[2])

    def draw_graph(self):
        nx.draw_networkx(self.graph, with_labels=True)  # draw_networkx() can display the labels of nodes
        plt.show()


if __name__ == '__main__':
    draw_graph = DrawGraph()
    draw_graph.build_graph()
    draw_graph.draw_graph()

The drawing is:

3. Hadoop MapReduce: count the number of triangles at each node of the graph

There are two rounds of MapReduce:

The first round performs the matrix multiplication that squares the adjacency matrix, and writes the result to an output directory.

The second round takes the squared adjacency matrix as input and obtains the triangle counts by combining the squared matrix with the original adjacency matrix.
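The reasoning behind the two rounds can be checked on a small example. For an undirected graph without self-loops and with adjacency matrix A, entry (i, j) of A squared is the number of common neighbours of i and j. When i and j are themselves connected, each common neighbour closes a triangle, so the number of triangles at node i is half the sum over j of (A squared)[i][j] * A[i][j]. The following is a minimal single-machine sketch of that formula, not the Hadoop job itself; the function names are chosen for illustration.

```python
def square(a):
    """Return a * a for a square matrix given as a list of lists."""
    n = len(a)
    return [[sum(a[i][k] * a[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def triangles_per_node(a):
    """Count the triangles that pass through each node of an undirected graph."""
    a2 = square(a)
    n = len(a)
    # a2[i][j] * a[i][j] is the number of common neighbours of i and j when
    # i-j is an edge. Each triangle at node i is seen twice (once for each of
    # its two other corners), hence the division by 2.
    return [sum(a2[i][j] * a[i][j] for j in range(n)) // 2 for i in range(n)]


# Nodes 0, 1 and 2 form a triangle; node 3 is attached only to node 2.
adjacency = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
print(triangles_per_node(adjacency))  # prints [1, 1, 1, 0]
```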

Each round of MapReduce is controlled by its own job, so two job instances must be started to run the two rounds.
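The chaining of the two rounds can be simulated in plain Python, with the output of the first round fed into the second. This is only a sketch under stated assumptions and not the Hadoop code: `run_round`, the mapper and reducer functions, and the `S` tag for the squared matrix are all hypothetical names, and the real jobs may choose their keys and partition the data differently.

```python
from collections import defaultdict


def run_round(records, mapper, reducer):
    """Simulate one MapReduce round: map, group by key (shuffle), reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    output = []
    for key in sorted(groups):
        output.extend(reducer(key, groups[key]))
    return output


def make_multiply_mapper(n):
    """Round 1 mapper for the product of two n x n matrices A and B."""
    def mapper(record):
        name, row, col, value = record
        if name == 'A':
            # A[row][col] contributes to every cell (row, j) of the product.
            for j in range(1, n + 1):
                yield (row, j), ('A', col, value)
        else:
            # B[row][col] contributes to every cell (i, col) of the product.
            for i in range(1, n + 1):
                yield (i, col), ('B', row, value)
    return mapper


def multiply_reducer(cell, values):
    """Round 1 reducer: sum A[i][k] * B[k][j] over k for one cell (i, j)."""
    a = {k: v for name, k, v in values if name == 'A'}
    b = {k: v for name, k, v in values if name == 'B'}
    yield ('S', cell[0], cell[1], sum(a[k] * b.get(k, 0) for k in a))


def triangle_mapper(record):
    """Round 2 mapper: group squared ('S') and original ('A') entries by row."""
    name, row, col, value = record
    if name in ('A', 'S'):
        yield row, (name, col, value)


def triangle_reducer(node, values):
    """Round 2 reducer: triangles at a node = sum_j S[i][j] * A[i][j] / 2."""
    a = {c: v for name, c, v in values if name == 'A'}
    s = {c: v for name, c, v in values if name == 'S'}
    yield node, sum(s.get(c, 0) * v for c, v in a.items()) // 2


# Small undirected graph: nodes 1, 2, 3 form a triangle, node 4 hangs off 3.
edges = {(1, 2), (1, 3), (2, 3), (3, 4)}
n = 4
records = []
for i in range(1, n + 1):
    for j in range(1, n + 1):
        v = 1 if (i, j) in edges or (j, i) in edges else 0
        records.append(('A', i, j, v))
        records.append(('B', i, j, v))

# Round 1 squares the matrix; round 2 reads that output plus the original A.
squared = run_round(records, make_multiply_mapper(n), multiply_reducer)
original = [r for r in records if r[0] == 'A']
counts = dict(run_round(squared + original, triangle_mapper, triangle_reducer))
print(counts)
```

The second call to `run_round` cannot start until the first has finished, which mirrors why the Hadoop version needs two jobs run one after the other.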

