Analyzing the importance of nodes in a graph with Hadoop: counting the number of triangles per node

Source: Internet
Author: User
Tags: rounds, hadoop, mapreduce

This article shows how to measure the importance of nodes in an undirected graph with Hadoop by counting the number of triangles each node participates in.

Finding the important nodes in a graph, and ranking them, matters in big data: when a large graph's data is distributed across machines, identifying the important nodes lets us give them special treatment.

The following explains how to do this.


This article is divided into three parts:

1. Generate the adjacency matrix of an undirected graph with Python

2. Draw the undirected graph with Python

3. Count the number of triangles per node in the graph with Hadoop MapReduce
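The idea in one formula: if A is the adjacency matrix, then (A^2)[i][j] counts the length-2 paths from i to j, so the number of triangles through node i is half the sum of (A^2)[i][j] over the neighbors j of i. A minimal single-machine Python sketch of this computation (for illustration only, not the Hadoop code):

```python
def triangles_per_node(adj):
    """adj is a symmetric 0/1 matrix with a zero diagonal."""
    n = len(adj)
    # square the adjacency matrix: sq[i][j] = number of length-2 paths i -> j
    sq = [[sum(adj[i][k] * adj[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    # each triangle through i is counted twice (once per incident edge)
    return [sum(sq[i][j] for j in range(n) if adj[i][j] == 1) // 2
            for i in range(n)]

# edges: 0-1, 0-2, 1-2, 1-3; the only triangle is {0, 1, 2}
adj = [
    [0, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
]
print(triangles_per_node(adj))  # [1, 1, 1, 0]
```

The two MapReduce rounds described below split exactly this computation: round one produces A^2, round two combines it with the original A.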


For how to do matrix multiplication with Hadoop, see the previous article: http://blog.csdn.net/thao6626/article/details/46472535


1. Generate the adjacency matrix of an undirected graph with Python

```python
# coding:utf-8
__author__ = 'Taohao'

import random


class AdjMatrix(object):
    def build_adjmatrix(self, dimension):
        temp = 1
        fd = open("./adjmatrix.txt", 'w+')
        for i in range(1, dimension + 1):
            for j in range(temp, dimension + 1):
                if i == j:
                    if i == dimension:
                        fd.write('A,' + str(i) + ',' + str(j) + ',' + '0' + '\n')
                        fd.write('B,' + str(i) + ',' + str(j) + ',' + '0')
                    else:
                        fd.write('A,' + str(i) + ',' + str(j) + ',' + '0' + '\n')
                        fd.write('B,' + str(i) + ',' + str(j) + ',' + '0' + '\n')
                else:
                    value = random.randint(0, 1)
                    fd.write('A,' + str(i) + ',' + str(j) + ',' + str(value) + '\n')
                    fd.write('A,' + str(j) + ',' + str(i) + ',' + str(value) + '\n')
                    fd.write('B,' + str(i) + ',' + str(j) + ',' + str(value) + '\n')
                    fd.write('B,' + str(j) + ',' + str(i) + ',' + str(value) + '\n')
            temp += 1
        fd.close()


if __name__ == '__main__':
    adjMatrix = AdjMatrix()
    adjMatrix.build_adjmatrix(10)
```
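The script writes each element twice, once under the label A and once under B, as `matrix,row,col,value` records, so that a single input file carries both operands of the later multiplication. A small hypothetical helper (not part of the original scripts) that reads such records back and checks the undirected-graph invariants:

```python
def load_matrix(lines, name='A'):
    """Parse 'name,row,col,value' records into a dict keyed by (row, col)."""
    mat = {}
    for line in lines:
        tag, i, j, v = line.strip().split(',')
        if tag == name:
            mat[(int(i), int(j))] = int(v)
    return mat

# a hand-written 2x2 sample in the same format as adjmatrix.txt
sample = ["A,1,1,0", "A,1,2,1", "A,2,1,1", "A,2,2,0",
          "B,1,1,0", "B,1,2,1", "B,2,1,1", "B,2,2,0"]
mat = load_matrix(sample)
assert all(mat[(i, j)] == mat[(j, i)] for (i, j) in mat)  # symmetric
assert all(mat[(i, i)] == 0 for i in (1, 2))              # zero diagonal
```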


2. Draw the undirected graph with Python

```python
# coding:utf-8
__author__ = 'Taohao'

import matplotlib.pyplot as plt
import networkx as nx


class DrawGraph(object):
    def __init__(self):
        self.graph = nx.Graph(name='graph')

    def build_graph(self):
        fd = open('./adjmatrix.txt', 'r')
        for line in fd:
            item = line.split(',')
            print item
            if item[0] == 'A':
                self.graph.add_node(item[1])
                self.graph.add_node(item[2])
                if item[3][0] == '1':
                    self.graph.add_edge(item[1], item[2])

    def draw_graph(self):
        # draw_networkx() can display the labels of nodes
        nx.draw_networkx(self.graph, with_labels=True)
        plt.show()


if __name__ == '__main__':
    draw_graph = DrawGraph()
    draw_graph.build_graph()
    draw_graph.draw_graph()
```
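One detail worth noting in build_graph: split(',') leaves the trailing newline on the last field, which is why the script compares item[3][0] rather than item[3]. A quick demonstration:

```python
# the last field of each record still carries the newline after split(',')
line = "A,1,2,1\n"
item = line.split(',')
print(repr(item[3]))     # '1\n'
print(repr(item[3][0]))  # '1'
assert item[3] != '1' and item[3][0] == '1'
```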

The resulting drawing is:

[Figure: the generated undirected graph, drawn with networkx]
3. Count the number of triangles per node in the graph with Hadoop MapReduce

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Iterator;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class MatrixMutiply {
    /*
     * Both matrices are stored in one file. If the two matrices were kept in
     * two files, Hadoop would run a separate map over each file, so each map
     * would be missing one matrix and the reduce phase would fail.
     * Storage format:
     *   A,1,1,2  means the element at row 1, column 1 of matrix A is 2
     *   A,1,2,1
     *   A,2,1,3
     *   A,2,2,4
     * One record per line also prevents a map split from cutting a record
     * in half and reading incomplete data.
     * The matrix is generated by a Python script; see buildMatrix.py.
     */
    private static int colNumB = 10;
    private static int rowNumA = 10;

    public static class MatrixMapper extends Mapper<Object, Text, Text, Text> {
        /*
         * rowNumA and colNumB need to be confirmed manually.
         * Map stage: organize the data as key/value pairs.
         *   key:   position of the element in the result matrix
         *   value: the original matrix data needed to compute that element
         * Note that matrix A and matrix B are organized differently in the
         * map output.
         */
        private Text mapOutputKey;
        private Text mapOutputValue;

        @Override
        protected void map(Object key, Text value,
                Mapper<Object, Text, Text, Text>.Context context)
                throws IOException, InterruptedException {
            String[] matrixStrings = value.toString().split("\n");
            for (String item : matrixStrings) {
                String[] elemString = item.split(",");
                if (elemString[0].equals("A")) {
                    // must use equals() here, not == ;
                    // take care with how A's output key/value are organized
                    for (int i = 1; i <= colNumB; i++) {
                        mapOutputKey = new Text(elemString[1] + "," + String.valueOf(i));
                        mapOutputValue = new Text("A:" + elemString[2] + "," + elemString[3]);
                        context.write(mapOutputKey, mapOutputValue);
                    }
                } else if (elemString[0].equals("B")) {
                    // B is organized differently from A; handle the details carefully
                    for (int j = 1; j <= rowNumA; j++) {
                        mapOutputKey = new Text(String.valueOf(j) + "," + elemString[2]);
                        mapOutputValue = new Text("B:" + elemString[1] + "," + elemString[3]);
                        context.write(mapOutputKey, mapOutputValue);
                    }
                } else {
                    // just for debug
                    System.out.println("mapout else:--------------->" + item);
                }
            }
        }
    }

    public static class MatixReducer extends Reducer<Text, Text, Text, Text> {
        private HashMap<String, String> matrixAHashMap = new HashMap<String, String>();
        private HashMap<String, String> matrixBHashMap = new HashMap<String, String>();
        private String val;

        @Override
        protected void reduce(Text key, Iterable<Text> value,
                Reducer<Text, Text, Text, Text>.Context context)
                throws IOException, InterruptedException {
            for (Text item : value) {
                val = item.toString();
                if (!val.equals("0")) {
                    String[] kv = val.substring(2).split(",");
                    if (val.startsWith("A:")) {
                        matrixAHashMap.put(kv[0], kv[1]);
                    }
                    if (val.startsWith("B:")) {
                        matrixBHashMap.put(kv[0], kv[1]);
                    }
                }
            }
            // dot product of the collected row of A and column of B
            Iterator<String> iterator = matrixAHashMap.keySet().iterator();
            int sum = 0;
            while (iterator.hasNext()) {
                String keyString = iterator.next();
                sum += Integer.parseInt(matrixAHashMap.get(keyString))
                        * Integer.parseInt(matrixBHashMap.get(keyString));
            }
            Text reduceOutputValue = new Text(String.valueOf(sum));
            context.write(key, reduceOutputValue);
        }
    }

    public static class TriangleMapper extends Mapper<Object, Text, Text, Text> {
        /*
         * Takes the matrix-multiplication result as input.
         *   map input key:    assigned by Hadoop itself
         *   map input value:  one row of the result file
         *   map output key:   row number
         *   map output value: column number + "," + element value
         * For example: key: 1, value: 1,2
         */
        @Override
        protected void map(Object key, Text value,
                Mapper<Object, Text, Text, Text>.Context context)
                throws IOException, InterruptedException {
            String[] valueString = value.toString().split("\t");
            String[] keyItems = valueString[0].split(",");
            Text outputKey = new Text(keyItems[0]);
            Text outputValue = new Text(keyItems[1] + "," + valueString[1]);
            context.write(outputKey, outputValue);
        }
    }

    public static class TriangleReducer extends Reducer<Text, Text, Text, Text> {
        private String[] matrix = new String[colNumB * colNumB];
        private boolean readGlobalMatrixFlag = false;
        private int[] rowValue = new int[colNumB];

        /* Read the original adjacency matrix. */
        private void getGlobalMatrix() {
            String ADJ_MATRIX_PATH = "/home/taohao/pycharmprojects/webs/pythonscript/matrix/adjmatrix.txt";
            File file = new File(ADJ_MATRIX_PATH);
            BufferedReader bufferedReader = null;
            try {
                bufferedReader = new BufferedReader(new FileReader(file));
                String line = null;
                while ((line = bufferedReader.readLine()) != null) {
                    String[] items = line.split("[,\n]");
                    if (items[0].equals("A")) {
                        matrix[(Integer.parseInt(items[1]) - 1) * colNumB
                                + Integer.parseInt(items[2]) - 1] = items[3];
                    }
                }
                bufferedReader.close();
            } catch (Exception e) {
                System.out.println(e.toString());
            }
        }

        @Override
        protected void reduce(Text key, Iterable<Text> value,
                Reducer<Text, Text, Text, Text>.Context context)
                throws IOException, InterruptedException {
            /* Solve for triangles row by row. */
            if (!readGlobalMatrixFlag) {
                getGlobalMatrix();
                readGlobalMatrixFlag = true;
            }
            Iterator<Text> iterator = value.iterator();
            int rowSum = 0;
            while (iterator.hasNext()) {
                /*
                 * Record each element under its column number, because the
                 * reduce input values do not necessarily arrive in order.
                 */
                String[] valueItems = iterator.next().toString().split(",");
                rowValue[Integer.parseInt(valueItems[0]) - 1] = Integer.parseInt(valueItems[1]);
            }
            int rowKey = Integer.parseInt(key.toString());
            for (int i = 0; i < colNumB; i++) {
                if (matrix[i + (rowKey - 1) * colNumB].equals("1")) {
                    rowSum += rowValue[i];
                }
            }
            rowSum = rowSum / 2;
            Text outputValue = new Text(String.valueOf(rowSum));
            context.write(key, outputValue);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Configuration confTriangle = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: matrix <in> <out>");
            System.exit(2);
        }
        Job job = Job.getInstance(conf, "matrix");
        job.setJarByClass(MatrixMutiply.class);
        job.setMapperClass(MatrixMapper.class);
        /* By design no combiner operation is needed here, so none is specified. */
        // job.setCombinerClass(MatixReducer.class);
        job.setReducerClass(MatixReducer.class);
        /*
         * setOutputKeyClass/setOutputValueClass apply to both the map output
         * and the reduce output at the same time, so the two must be kept
         * consistent.
         */
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        job.waitForCompletion(true);

        Job jobTriangle = Job.getInstance(confTriangle, "triangle");
        jobTriangle.setJarByClass(MatrixMutiply.class);
        jobTriangle.setMapperClass(TriangleMapper.class);
        jobTriangle.setReducerClass(TriangleReducer.class);
        jobTriangle.setOutputKeyClass(Text.class);
        jobTriangle.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(jobTriangle, new Path("/trianglematrixoutput/part-r-00000"));
        FileOutputFormat.setOutputPath(jobTriangle, new Path("/triangleoutput"));
        System.exit(jobTriangle.waitForCompletion(true) ? 0 : 1);
    }
}
```
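The key scheme of the first round (matrix multiplication) can be sketched as a single-process Python simulation of the shuffle. This is a rough illustration under the same record layout, not the Hadoop implementation:

```python
from collections import defaultdict

def matmul_mapreduce(a_records, b_records, n):
    """a_records and b_records are (row, col, value) triples of two
    n-by-n matrices; returns {(i, j): (A*B)[i][j]}."""
    shuffle = defaultdict(list)
    # map: an A element (i, k, v) is needed by every result cell (i, j)
    for (i, k, v) in a_records:
        for j in range(1, n + 1):
            shuffle[(i, j)].append(('A', k, v))
    # map: a B element (k, j, v) is needed by every result cell (i, j)
    for (k, j, v) in b_records:
        for i in range(1, n + 1):
            shuffle[(i, j)].append(('B', k, v))
    # reduce: pair up A and B values that share the inner index k
    result = {}
    for cell, values in shuffle.items():
        a = {k: v for (tag, k, v) in values if tag == 'A'}
        b = {k: v for (tag, k, v) in values if tag == 'B'}
        result[cell] = sum(a[k] * b.get(k, 0) for k in a)
    return result

# 2x2 example: A = B = [[0, 1], [1, 0]], so A*B is the identity matrix
recs = [(1, 1, 0), (1, 2, 1), (2, 1, 1), (2, 2, 0)]
print(matmul_mapreduce(recs, recs, 2))
```

Each result cell (i, j) receives row i of A tagged "A:" and column j of B tagged "B:", exactly as MatrixMapper and MatixReducer do above.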

There are two rounds of MapReduce:

The first round does the matrix multiplication: it squares the adjacency matrix and writes the result to an output directory.

The second round takes the squared adjacency matrix as input and, by combining it with the original adjacency matrix, obtains the number of triangles per node.


Each round of MapReduce is controlled by its own job, so two Job instances are started to run the two rounds.



Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
