Analyzing the importance of nodes in a graph with Hadoop: counting the number of triangles per node

Source: Internet
Author: User
Tags: rounds, hadoop, mapreduce

This article shows how to measure the importance of nodes in an undirected graph with Hadoop by counting the number of triangles each node participates in.

Finding the important nodes in a graph, and ranking them, matters in big data: when a large graph's data is distributed across machines, identifying the important nodes lets us give them special treatment.

The following explains how to do this.


This article is divided into three parts:

1. Generate the adjacency matrix of an undirected graph with Python

2. Draw the undirected graph with Python

3. Count the number of triangles per node in the graph with Hadoop MapReduce
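The idea in one formula: if A is the adjacency matrix, then (A^2)[i][j] counts the length-2 paths from i to j, so the number of triangles through node i is half the sum of (A^2)[i][j] over the neighbors j of i. A minimal single-machine Python sketch of this computation (for illustration only, not the Hadoop code):

```python
def triangles_per_node(adj):
    """adj is a symmetric 0/1 matrix with a zero diagonal."""
    n = len(adj)
    # square the adjacency matrix: sq[i][j] = number of length-2 paths i -> j
    sq = [[sum(adj[i][k] * adj[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    # each triangle through i is counted twice (once per incident edge)
    return [sum(sq[i][j] for j in range(n) if adj[i][j] == 1) // 2
            for i in range(n)]

# edges: 0-1, 0-2, 1-2, 1-3; the only triangle is {0, 1, 2}
adj = [
    [0, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
]
print(triangles_per_node(adj))  # [1, 1, 1, 0]
```

The two MapReduce rounds described below split exactly this computation: round one produces A^2, round two combines it with the original A.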


For how to do matrix multiplication with Hadoop, see the previous article: http://blog.csdn.net/thao6626/article/details/46472535


1. Generate the adjacency matrix of an undirected graph with Python

```python
# coding:utf-8
__author__ = 'Taohao'

import random


class AdjMatrix(object):
    def build_adjmatrix(self, dimension):
        temp = 1
        fd = open("./adjmatrix.txt", 'w+')
        for i in range(1, dimension + 1):
            for j in range(temp, dimension + 1):
                if i == j:
                    if i == dimension:
                        fd.write('A,' + str(i) + ',' + str(j) + ',' + '0' + '\n')
                        fd.write('B,' + str(i) + ',' + str(j) + ',' + '0')
                    else:
                        fd.write('A,' + str(i) + ',' + str(j) + ',' + '0' + '\n')
                        fd.write('B,' + str(i) + ',' + str(j) + ',' + '0' + '\n')
                else:
                    value = random.randint(0, 1)
                    fd.write('A,' + str(i) + ',' + str(j) + ',' + str(value) + '\n')
                    fd.write('A,' + str(j) + ',' + str(i) + ',' + str(value) + '\n')
                    fd.write('B,' + str(i) + ',' + str(j) + ',' + str(value) + '\n')
                    fd.write('B,' + str(j) + ',' + str(i) + ',' + str(value) + '\n')
            temp += 1
        fd.close()


if __name__ == '__main__':
    adjMatrix = AdjMatrix()
    adjMatrix.build_adjmatrix(10)
```
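The script writes each element twice, once under the label A and once under B, as `matrix,row,col,value` records, so that a single input file carries both operands of the later multiplication. A small hypothetical helper (not part of the original scripts) that reads such records back and checks the undirected-graph invariants:

```python
def load_matrix(lines, name='A'):
    """Parse 'name,row,col,value' records into a dict keyed by (row, col)."""
    mat = {}
    for line in lines:
        tag, i, j, v = line.strip().split(',')
        if tag == name:
            mat[(int(i), int(j))] = int(v)
    return mat

# a hand-written 2x2 sample in the same format as adjmatrix.txt
sample = ["A,1,1,0", "A,1,2,1", "A,2,1,1", "A,2,2,0",
          "B,1,1,0", "B,1,2,1", "B,2,1,1", "B,2,2,0"]
mat = load_matrix(sample)
assert all(mat[(i, j)] == mat[(j, i)] for (i, j) in mat)  # symmetric
assert all(mat[(i, i)] == 0 for i in (1, 2))              # zero diagonal
```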


2. Draw the undirected graph with Python

```python
# coding:utf-8
__author__ = 'Taohao'

import matplotlib.pyplot as plt
import networkx as nx


class DrawGraph(object):
    def __init__(self):
        self.graph = nx.Graph(name='graph')

    def build_graph(self):
        fd = open('./adjmatrix.txt', 'r')
        for line in fd:
            item = line.split(',')
            print item
            if item[0] == 'A':
                self.graph.add_node(item[1])
                self.graph.add_node(item[2])
                if item[3][0] == '1':
                    self.graph.add_edge(item[1], item[2])

    def draw_graph(self):
        # draw_networkx() can display the labels of nodes
        nx.draw_networkx(self.graph, with_labels=True)
        plt.show()


if __name__ == '__main__':
    draw_graph = DrawGraph()
    draw_graph.build_graph()
    draw_graph.draw_graph()
```
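One detail worth noting in build_graph: split(',') leaves the trailing newline on the last field, which is why the script compares item[3][0] rather than item[3]. A quick demonstration:

```python
# the last field of each record still carries the newline after split(',')
line = "A,1,2,1\n"
item = line.split(',')
print(repr(item[3]))     # '1\n'
print(repr(item[3][0]))  # '1'
assert item[3] != '1' and item[3][0] == '1'
```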

The resulting drawing is:

[Figure: the generated undirected graph, drawn with networkx]
3. Count the number of triangles per node in the graph with Hadoop MapReduce

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Iterator;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class MatrixMutiply {
    /*
     * Both matrices are stored in one file. If the two matrices were kept in
     * two files, Hadoop would run a separate map over each file, so each map
     * would be missing one matrix and the reduce phase would fail.
     * Storage format:
     *   A,1,1,2  means the element at row 1, column 1 of matrix A is 2
     *   A,1,2,1
     *   A,2,1,3
     *   A,2,2,4
     * One record per line also prevents a map split from cutting a record
     * in half and reading incomplete data.
     * The matrix is generated by a Python script; see buildMatrix.py.
     */
    private static int colNumB = 10;
    private static int rowNumA = 10;

    public static class MatrixMapper extends Mapper<Object, Text, Text, Text> {
        /*
         * rowNumA and colNumB need to be confirmed manually.
         * Map stage: organize the data as key/value pairs.
         *   key:   position of the element in the result matrix
         *   value: the original matrix data needed to compute that element
         * Note that matrix A and matrix B are organized differently in the
         * map output.
         */
        private Text mapOutputKey;
        private Text mapOutputValue;

        @Override
        protected void map(Object key, Text value,
                Mapper<Object, Text, Text, Text>.Context context)
                throws IOException, InterruptedException {
            String[] matrixStrings = value.toString().split("\n");
            for (String item : matrixStrings) {
                String[] elemString = item.split(",");
                if (elemString[0].equals("A")) {
                    // must use equals() here, not == ;
                    // take care with how A's output key/value are organized
                    for (int i = 1; i <= colNumB; i++) {
                        mapOutputKey = new Text(elemString[1] + "," + String.valueOf(i));
                        mapOutputValue = new Text("A:" + elemString[2] + "," + elemString[3]);
                        context.write(mapOutputKey, mapOutputValue);
                    }
                } else if (elemString[0].equals("B")) {
                    // B is organized differently from A; handle the details carefully
                    for (int j = 1; j <= rowNumA; j++) {
                        mapOutputKey = new Text(String.valueOf(j) + "," + elemString[2]);
                        mapOutputValue = new Text("B:" + elemString[1] + "," + elemString[3]);
                        context.write(mapOutputKey, mapOutputValue);
                    }
                } else {
                    // just for debug
                    System.out.println("mapout else:--------------->" + item);
                }
            }
        }
    }

    public static class MatixReducer extends Reducer<Text, Text, Text, Text> {
        private HashMap<String, String> matrixAHashMap = new HashMap<String, String>();
        private HashMap<String, String> matrixBHashMap = new HashMap<String, String>();
        private String val;

        @Override
        protected void reduce(Text key, Iterable<Text> value,
                Reducer<Text, Text, Text, Text>.Context context)
                throws IOException, InterruptedException {
            for (Text item : value) {
                val = item.toString();
                if (!val.equals("0")) {
                    String[] kv = val.substring(2).split(",");
                    if (val.startsWith("A:")) {
                        matrixAHashMap.put(kv[0], kv[1]);
                    }
                    if (val.startsWith("B:")) {
                        matrixBHashMap.put(kv[0], kv[1]);
                    }
                }
            }
            // dot product of the collected row of A and column of B
            Iterator<String> iterator = matrixAHashMap.keySet().iterator();
            int sum = 0;
            while (iterator.hasNext()) {
                String keyString = iterator.next();
                sum += Integer.parseInt(matrixAHashMap.get(keyString))
                        * Integer.parseInt(matrixBHashMap.get(keyString));
            }
            Text reduceOutputValue = new Text(String.valueOf(sum));
            context.write(key, reduceOutputValue);
        }
    }

    public static class TriangleMapper extends Mapper<Object, Text, Text, Text> {
        /*
         * Takes the matrix-multiplication result as input.
         *   map input key:    assigned by Hadoop itself
         *   map input value:  one row of the result file
         *   map output key:   row number
         *   map output value: column number + "," + element value
         * For example: key: 1, value: 1,2
         */
        @Override
        protected void map(Object key, Text value,
                Mapper<Object, Text, Text, Text>.Context context)
                throws IOException, InterruptedException {
            String[] valueString = value.toString().split("\t");
            String[] keyItems = valueString[0].split(",");
            Text outputKey = new Text(keyItems[0]);
            Text outputValue = new Text(keyItems[1] + "," + valueString[1]);
            context.write(outputKey, outputValue);
        }
    }

    public static class TriangleReducer extends Reducer<Text, Text, Text, Text> {
        private String[] matrix = new String[colNumB * colNumB];
        private boolean readGlobalMatrixFlag = false;
        private int[] rowValue = new int[colNumB];

        /* Read the original adjacency matrix. */
        private void getGlobalMatrix() {
            String ADJ_MATRIX_PATH = "/home/taohao/pycharmprojects/webs/pythonscript/matrix/adjmatrix.txt";
            File file = new File(ADJ_MATRIX_PATH);
            BufferedReader bufferedReader = null;
            try {
                bufferedReader = new BufferedReader(new FileReader(file));
                String line = null;
                while ((line = bufferedReader.readLine()) != null) {
                    String[] items = line.split("[,\n]");
                    if (items[0].equals("A")) {
                        matrix[(Integer.parseInt(items[1]) - 1) * colNumB
                                + Integer.parseInt(items[2]) - 1] = items[3];
                    }
                }
                bufferedReader.close();
            } catch (Exception e) {
                System.out.println(e.toString());
            }
        }

        @Override
        protected void reduce(Text key, Iterable<Text> value,
                Reducer<Text, Text, Text, Text>.Context context)
                throws IOException, InterruptedException {
            /* Solve for triangles row by row. */
            if (!readGlobalMatrixFlag) {
                getGlobalMatrix();
                readGlobalMatrixFlag = true;
            }
            Iterator<Text> iterator = value.iterator();
            int rowSum = 0;
            while (iterator.hasNext()) {
                /*
                 * Record each element under its column number, because the
                 * reduce input values do not necessarily arrive in order.
                 */
                String[] valueItems = iterator.next().toString().split(",");
                rowValue[Integer.parseInt(valueItems[0]) - 1] = Integer.parseInt(valueItems[1]);
            }
            int rowKey = Integer.parseInt(key.toString());
            for (int i = 0; i < colNumB; i++) {
                if (matrix[i + (rowKey - 1) * colNumB].equals("1")) {
                    rowSum += rowValue[i];
                }
            }
            rowSum = rowSum / 2;
            Text outputValue = new Text(String.valueOf(rowSum));
            context.write(key, outputValue);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Configuration confTriangle = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: matrix <in> <out>");
            System.exit(2);
        }
        Job job = Job.getInstance(conf, "matrix");
        job.setJarByClass(MatrixMutiply.class);
        job.setMapperClass(MatrixMapper.class);
        /* By design no combiner operation is needed here, so none is specified. */
        // job.setCombinerClass(MatixReducer.class);
        job.setReducerClass(MatixReducer.class);
        /*
         * setOutputKeyClass/setOutputValueClass apply to both the map output
         * and the reduce output at the same time, so the two must be kept
         * consistent.
         */
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        job.waitForCompletion(true);

        Job jobTriangle = Job.getInstance(confTriangle, "triangle");
        jobTriangle.setJarByClass(MatrixMutiply.class);
        jobTriangle.setMapperClass(TriangleMapper.class);
        jobTriangle.setReducerClass(TriangleReducer.class);
        jobTriangle.setOutputKeyClass(Text.class);
        jobTriangle.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(jobTriangle, new Path("/trianglematrixoutput/part-r-00000"));
        FileOutputFormat.setOutputPath(jobTriangle, new Path("/triangleoutput"));
        System.exit(jobTriangle.waitForCompletion(true) ? 0 : 1);
    }
}
```
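The key scheme of the first round (matrix multiplication) can be sketched as a single-process Python simulation of the shuffle. This is a rough illustration under the same record layout, not the Hadoop implementation:

```python
from collections import defaultdict

def matmul_mapreduce(a_records, b_records, n):
    """a_records and b_records are (row, col, value) triples of two
    n-by-n matrices; returns {(i, j): (A*B)[i][j]}."""
    shuffle = defaultdict(list)
    # map: an A element (i, k, v) is needed by every result cell (i, j)
    for (i, k, v) in a_records:
        for j in range(1, n + 1):
            shuffle[(i, j)].append(('A', k, v))
    # map: a B element (k, j, v) is needed by every result cell (i, j)
    for (k, j, v) in b_records:
        for i in range(1, n + 1):
            shuffle[(i, j)].append(('B', k, v))
    # reduce: pair up A and B values that share the inner index k
    result = {}
    for cell, values in shuffle.items():
        a = {k: v for (tag, k, v) in values if tag == 'A'}
        b = {k: v for (tag, k, v) in values if tag == 'B'}
        result[cell] = sum(a[k] * b.get(k, 0) for k in a)
    return result

# 2x2 example: A = B = [[0, 1], [1, 0]], so A*B is the identity matrix
recs = [(1, 1, 0), (1, 2, 1), (2, 1, 1), (2, 2, 0)]
print(matmul_mapreduce(recs, recs, 2))
```

Each result cell (i, j) receives row i of A tagged "A:" and column j of B tagged "B:", exactly as MatrixMapper and MatixReducer do above.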

There are two rounds of MapReduce:

The first round does the matrix multiplication: it squares the adjacency matrix and writes the result to an output directory.

The second round takes the squared adjacency matrix as input and, by combining it with the original adjacency matrix, obtains the number of triangles per node.


Each round of MapReduce is controlled by its own job, so two Job instances are started to run the two rounds.



Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
