Video address: http://pan.baidu.com/s/1dDEgKwD
This section walks through deploying a simple development environment and running two hands-on examples.
Deployment of Development Environment: http://www.cnblogs.com/admln/p/test-deployDevelopment.html
The first example is WordCount.
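The WordCount logic itself is simple. As a rough plain-Java sketch of what the mapper and reducer accomplish together (hypothetical class name, no Hadoop involved): tokenize each line, then sum the occurrences per word.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java sketch of the word-count logic (no Hadoop): the "map" phase
// tokenizes each line into words, the "reduce" phase sums counts per word.
public class WordCountSketch {
    public static Map<String, Integer> countWords(String[] lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            for (String word : line.split("\\s+")) {
                if (word.isEmpty()) continue;
                counts.merge(word, 1, Integer::sum); // reduce step: sum per key
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts =
            countWords(new String[] {"hello world", "hello hadoop"});
        System.out.println(counts.get("hello")); // 2
    }
}
```

In a real Hadoop job the grouping happens in the shuffle phase between the mapper and reducer; here it is simulated by accumulating into a single map.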
Second Example: ReverseIndex
package testHadoop;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.TextOutputFormat;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

@SuppressWarnings("deprecation")
public class ReverseIndex extends Configured implements Tool {
    enum Counter {
        LINESKIP;   // counts malformed input lines that are skipped
    }

    public static class Map extends Mapper<LongWritable, Text, Text, Text> {
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            try {
                String[] lineSplit = line.split(" ");
                String anum = lineSplit[0];
                String bnum = lineSplit[1];

                // emit with the two fields swapped: bnum becomes the key
                context.write(new Text(bnum), new Text(anum));
            } catch (java.lang.ArrayIndexOutOfBoundsException e) {
                context.getCounter(Counter.LINESKIP).increment(1);
                return;
            }
        }
    }

    public static class Reduce extends Reducer<Text, Text, Text, Text> {
        public void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            String valueString;
            String out = "";

            // concatenate all values for this key, separated by "|"
            for (Text value : values) {
                valueString = value.toString();
                out += valueString + "|";
            }
            context.write(key, new Text(out));
        }
    }

    public int run(String[] args) throws Exception {
        Configuration conf = getConf();

        Job job = new Job(conf, "ReverseIndex");
        job.setJarByClass(ReverseIndex.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        //job.setOutputFormatClass(TextOutputFormat.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        job.waitForCompletion(true);

        return job.isSuccessful() ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        int res = ToolRunner.run(new Configuration(), new ReverseIndex(), args);
        System.exit(res);
    }
}
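To see what the job computes without a cluster, here is a plain-Java simulation of the same map/reduce logic (hypothetical class name and made-up sample lines, no Hadoop dependencies): each input line "anum bnum" is inverted so bnum becomes the key, and all anum values for a key are joined with "|".

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java simulation of the ReverseIndex job (no Hadoop): the map step
// swaps the two space-separated fields, the reduce step joins values with "|".
public class ReverseIndexSketch {
    public static Map<String, String> reverseIndex(String[] lines) {
        Map<String, String> grouped = new LinkedHashMap<>();
        for (String line : lines) {
            String[] split = line.split(" ");
            if (split.length < 2) continue;      // mirrors the LINESKIP counter
            String anum = split[0];
            String bnum = split[1];
            // map: emit (bnum, anum); reduce: append with a "|" separator
            grouped.merge(bnum, anum + "|", (old, add) -> old + add);
        }
        return grouped;
    }

    public static void main(String[] args) {
        // made-up sample lines in the "anum bnum" format the job expects
        Map<String, String> out = reverseIndex(new String[] {
            "13599999999 10086", "13899999999 120", "13944444444 10086"
        });
        System.out.println(out.get("10086")); // 13599999999|13944444444|
    }
}
```

In real MapReduce the grouping happens during the shuffle; this sequential merge produces the same per-key result for already-ordered input.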
A small problem came up when packaging the jar in Eclipse and running it on the cluster: a version mismatch. The code had been compiled on Windows with JDK 7, while the Hadoop cluster on Linux was running JDK 1.6. Recompiling the source on Linux under JDK 1.6 fixed it.
In practice I also learned something: if relative paths such as input and output are given on the command line, they are resolved by default under /user/&lt;username&gt;/ on HDFS. If absolute paths such as /input and /output are given, they are resolved from the HDFS root directory.
From the "Turning Data into Gold" Hadoop video series, lesson 05.