Requirement: a single-table join (self-join) problem. Given a file of child-parent relationships, find every grandchild-grandparent pair.
Sample input: child-parent.txt
Xiaoming Daxiong
Daxiong Alice
Daxiong Jack
Output:
Xiaoming Alice
Xiaoming Jack
Analysis and design:
Mapper design:
1. Input <K1,V1>: K1 is the position of the line in the input file, V1 is one line of data.
2. Left table <K2,V2>: K2 is the parent name, V2 is (1, child name), where 1 is the left-table flag.
3. Right table <K3,V3>: K3 is the child name, V3 is (2, parent name), where 2 is the right-table flag.
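The mapper design above can be tried out without a cluster. The following is only a minimal sketch in plain Java, not the Hadoop mapper itself: the class MapTaggingSketch and the helper mapLine are hypothetical names introduced for illustration, and the flag and the name are assumed to be joined by a single space.

```java
import java.util.ArrayList;
import java.util.List;

public class MapTaggingSketch {

    // Hypothetical helper: returns the (key, value) pairs that the map step
    // would emit for one input line of the form "<child> <parent>".
    public static List<String[]> mapLine(String line) {
        List<String[]> out = new ArrayList<>();
        String[] fields = line.trim().split("\\s+");
        // Skip blank or malformed lines and a header row starting with "child".
        if (fields.length != 2 || fields[0].equals("child")) {
            return out;
        }
        String child = fields[0];
        String parent = fields[1];
        // Left table: key = parent name, value = flag 1 plus the child name.
        out.add(new String[] {parent, "1 " + child});
        // Right table: key = child name, value = flag 2 plus the parent name.
        out.add(new String[] {child, "2 " + parent});
        return out;
    }

    public static void main(String[] args) {
        for (String[] kv : mapLine("Xiaoming Daxiong")) {
            System.out.println(kv[0] + " -> " + kv[1]);
        }
        // prints:
        // Daxiong -> 1 Xiaoming
        // Xiaoming -> 2 Daxiong
    }
}
```

Each input line is emitted twice, once under each name, so that after the shuffle a person's children and parents arrive at the same reduce call.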
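Reducer design:
4. Input <K4,V4>: K4 is the shared key (a person's name), V4 is the list<String> of flagged values for that key.
5. Take the Cartesian product to get <K5,V5>: K5 is the grandchild name, V5 is the grandparent name. For one key, every flag-1 name is a child of the key person and every flag-2 name is a parent of the key person, so each (flag-1, flag-2) pair is a (grandchild, grandparent) pair.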
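The reduce step can also be checked by hand on the sample data. This is a minimal sketch in plain Java, with no Hadoop classes; ReduceJoinSketch and its reduce helper are hypothetical names, and the values are assumed to have the form "flag name" separated by one space.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReduceJoinSketch {

    // Hypothetical helper: the reduce step for a single key. It splits the
    // flagged values into two lists and returns their Cartesian product.
    public static List<String> reduce(List<String> flaggedValues) {
        List<String> grandchildren = new ArrayList<>();
        List<String> grandparents = new ArrayList<>();
        for (String value : flaggedValues) {
            String[] record = value.split(" ");
            if (record.length != 2) {
                continue; // ignore malformed values
            }
            if ("1".equals(record[0])) {
                grandchildren.add(record[1]); // a child of the key person
            } else if ("2".equals(record[0])) {
                grandparents.add(record[1]);  // a parent of the key person
            }
        }
        List<String> out = new ArrayList<>();
        for (String grandchild : grandchildren) {
            for (String grandparent : grandparents) {
                out.add(grandchild + " " + grandparent);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Values that would be grouped under the key "Daxiong" for the sample input.
        List<String> forDaxiong = Arrays.asList("1 Xiaoming", "2 Alice", "2 Jack");
        for (String pair : reduce(forDaxiong)) {
            System.out.println(pair);
        }
        // prints:
        // Xiaoming Alice
        // Xiaoming Jack
    }
}
```

A key with values on only one side, such as "Xiaoming" with the single value "2 Daxiong", produces an empty product, which is why only the "Daxiong" key contributes output rows here.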
Program section:
SingletonTableJoinMapper class:

package com.cn.singletonTableJoin;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class SingletonTableJoinMapper extends Mapper<Object, Text, Text, Text> {

    @Override
    protected void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        // Each input line holds two whitespace-separated names: child, then parent.
        StringTokenizer itr = new StringTokenizer(value.toString());
        if (itr.countTokens() != 2) {
            return; // skip blank or malformed lines
        }
        String childName = itr.nextToken();
        String parentName = itr.nextToken();

        // Skip the header row, whose first field is the literal "child".
        if ("child".equals(childName)) {
            return;
        }

        // Left table: key = parent name, value = "1 <child name>".
        context.write(new Text(parentName), new Text("1 " + childName));
        // Right table: key = child name, value = "2 <parent name>".
        context.write(new Text(childName), new Text("2 " + parentName));
    }
}

SingletonTableJoinReduce class:

package com.cn.singletonTableJoin;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class SingletonTableJoinReduce extends Reducer<Text, Text, Text, Text> {

    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        List<String> grandchildren = new ArrayList<String>();
        List<String> grandparents = new ArrayList<String>();

        for (Text value : values) {
            // Each value is "<flag> <name>", as written by the mapper.
            String[] record = value.toString().split(" ");
            if (record.length != 2) {
                continue;
            }
            if ("1".equals(record[0])) {
                grandchildren.add(record[1]); // a child of the key person
            } else if ("2".equals(record[0])) {
                grandparents.add(record[1]);  // a parent of the key person
            }
        }

        // Cartesian product: nothing is written if either list is empty.
        for (String grandchild : grandchildren) {
            for (String grandparent : grandparents) {
                context.write(new Text(grandchild), new Text(grandparent));
            }
        }
    }
}
SingletonTableJoin class:

package com.cn.singletonTableJoin;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

/**
 * Single-table join
 * @author Root
 */
public class SingletonTableJoin {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: SingletonTableJoin <in> <out>");
            System.exit(2);
        }

        // Create the job (Job.getInstance replaces the Job constructor, deprecated in Hadoop 2.x)
        Job job = Job.getInstance(conf, "SingletonTableJoin");
        job.setJarByClass(SingletonTableJoin.class);

        // Set the input and output paths
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));

        // Set the mapper and reducer classes
        job.setMapperClass(SingletonTableJoinMapper.class);
        job.setReducerClass(SingletonTableJoinReduce.class);

        // Set the output key and value types
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        // Submit the job and wait for it to complete
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
Make summarizing a habit.
SingletonTableJoin: a Hadoop MapReduce program