I had set up Hadoop 1.2.1 in pseudo-distributed mode and run the wordcount example from hadoop-examples.jar, and it all looked easy.
But when I ran a MapReduce program of my own, I unexpectedly hit the "No job jar file set" warning and a ClassNotFoundException.
After a few twists and turns, the MapReduce job I wrote finally ran successfully.
Note that I did not copy any third-party jars (hadoop-core, commons-cli, and the other commons-* jars, six in all) or my own code's jar to the remote cluster. Nor did I bundle the third-party jars into a third-party.jar locally, or use the -libjars option, and I did not even need GenericOptionsParser for anything special (many solutions online say it is required to handle Hadoop's command-line parameters).
The key code:

Job job = new Job(getConf());
job.setJarByClass(WordCountJob.class);

and

int res = ToolRunner.run(new WordCountJob(), args);
Source code:
package wordcount2;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class WordCountJob extends Configured implements Tool {

    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    @Override
    public int run(String[] args) throws Exception {
        // getConf() returns the Configuration that ToolRunner has already
        // populated with any generic Hadoop options.
        Configuration conf = getConf();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: wordcount <in> <out>");
            System.exit(2);
        }
        Job job = new Job(conf, "WORDCOUNTMR");
        // setJarByClass() is what lets Hadoop find the job jar, avoiding the
        // "No job jar file set" warning and the ClassNotFoundException.
        job.setJarByClass(WordCountJob.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        int res = ToolRunner.run(new WordCountJob(), args);
        System.exit(res);
    }
}
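The mapper and reducer above boil down to tokenize-then-sum. As a sanity check of that logic, the same counting can be sketched in plain Java with no Hadoop dependencies (a simulation for illustration only; the input lines here are hypothetical, since the post does not show the contents of the input files):

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class WordCountSketch {
    // Mimics the map + combine/reduce phases: tokenize each line, sum per word.
    // TreeMap keeps keys in sorted order, like a single reducer's output file.
    static Map<String, Integer> count(String... lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            StringTokenizer itr = new StringTokenizer(line);
            while (itr.hasMoreTokens()) {
                counts.merge(itr.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prints: Hadoop 1, Hello 2, Word 1 (one pair per line)
        for (Map.Entry<String, Integer> e : count("Hello Word", "Hello Hadoop").entrySet()) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```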
To compile and package it into a jar, you can use commands like javac -classpath /home/lzc/hadoop-1.2.1/hadoop-core-1.2.1.jar:/home/lzc/hadoop-1.2.1/lib/commons-cli-1.2.jar -d ./classes/ ./src/WordCountJob.java followed by jar -cvf wordcountjob.jar -C ./classes/ . ; the simplest way, however, is to use Eclipse's "Export JAR file" feature to generate a jar from that class.
Copy the generated jar to HADOOP_HOME and execute the following command.
[email protected]:~/dolphin/hadoop-1.2.1$ bin/hadoop jar wc2.jar wordcount2.WordCountJob input/file*.txt output
14/12/10 15:48:59 INFO input.FileInputFormat: Total input paths to process : 2
14/12/10 15:48:59 INFO util.NativeCodeLoader: Loaded the native-hadoop library
14/12/10 15:48:59 WARN snappy.LoadSnappy: Snappy native library not loaded
14/12/10 15:49:00 INFO mapred.JobClient: Running job: job_201412080836_0026
14/12/10 15:49:01 INFO mapred.JobClient:  map 0% reduce 0%
14/12/10 15:49:06 INFO mapred.JobClient:  map 100% reduce 0%
14/12/10 15:49:13 INFO mapred.JobClient:  map 100% reduce 33%
14/12/10 15:49:15 INFO mapred.JobClient:  map 100% reduce 100%
14/12/10 15:49:15 INFO mapred.JobClient: Job complete: job_201412080836_0026
14/12/10 15:49:15 INFO mapred.JobClient: Counters: 29
14/12/10 15:49:15 INFO mapred.JobClient:   Job Counters
14/12/10 15:49:15 INFO mapred.JobClient:     Launched reduce tasks=1
14/12/10 15:49:15 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=7921
14/12/10 15:49:15 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
14/12/10 15:49:15 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
14/12/10 15:49:15 INFO mapred.JobClient:     Launched map tasks=2
14/12/10 15:49:15 INFO mapred.JobClient:     Data-local map tasks=2
14/12/10 15:49:15 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=9018
14/12/10 15:49:15 INFO mapred.JobClient:   File Output Format Counters
14/12/10 15:49:15 INFO mapred.JobClient:     Bytes Written=48
14/12/10 15:49:15 INFO mapred.JobClient:   FileSystemCounters
14/12/10 15:49:15 INFO mapred.JobClient:     FILE_BYTES_READ=102
14/12/10 15:49:15 INFO mapred.JobClient:     HDFS_BYTES_READ=284
14/12/10 15:49:15 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=190665
14/12/10 15:49:15 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=48
14/12/10 15:49:15 INFO mapred.JobClient:   File Input Format Counters
14/12/10 15:49:15 INFO mapred.JobClient:     Bytes Read=48
14/12/10 15:49:15 INFO mapred.JobClient:   Map-Reduce Framework
14/12/10 15:49:15 INFO mapred.JobClient:     Map output materialized bytes=108
14/12/10 15:49:15 INFO mapred.JobClient:     Map input records=2
14/12/10 15:49:15 INFO mapred.JobClient:     Reduce shuffle bytes=108
14/12/10 15:49:15 INFO mapred.JobClient:     Spilled Records=16
14/12/10 15:49:15 INFO mapred.JobClient:     Map output bytes=80
14/12/10 15:49:15 INFO mapred.JobClient:     CPU time spent (ms)=2420
14/12/10 15:49:15 INFO mapred.JobClient:     Total committed heap usage (bytes)=390004736
14/12/10 15:49:15 INFO mapred.JobClient:     Combine input records=8
14/12/10 15:49:15 INFO mapred.JobClient:     SPLIT_RAW_BYTES=236
14/12/10 15:49:15 INFO mapred.JobClient:     Reduce input records=8
14/12/10 15:49:15 INFO mapred.JobClient:     Reduce input groups=6
14/12/10 15:49:15 INFO mapred.JobClient:     Combine output records=8
14/12/10 15:49:15 INFO mapred.JobClient:     Physical memory (bytes) snapshot=436707328
14/12/10 15:49:15 INFO mapred.JobClient:     Reduce output records=6
14/12/10 15:49:15 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=1908416512
14/12/10 15:49:15 INFO mapred.JobClient:     Map output records=8
[email protected]:~/dolphin/hadoop-1.2.1$ bin/hadoop fs -ls output
Found 3 items
-rw-r--r--   2 hadoop121 supergroup          0 2014-12-10 15:49 /user/hadoop121/output/_SUCCESS
drwxr-xr-x   - hadoop121 supergroup          0 2014-12-10 15:49 /user/hadoop121/output/_logs
-rw-r--r--   2 hadoop121 supergroup            2014-12-10 15:49 /user/hadoop121/output/part-r-00000
[email protected]:~/dolphin/hadoop-1.2.1$ bin/hadoop fs -cat output/part-r-00000
Hadoop 1
Hello 2
Word 1
hadoop 1
hello 2
word 1
Some people say this error is caused by HDFS being unable to access local files, or by permission problems, but I deliberately tested this: running the jar from a local path outside HADOOP_HOME also succeeds.
[email protected]:~/dolphin/hadoop-1.2.1$ bin/hadoop jar /home/lzc/workspace/wordcount1/wc2.jar wordcount2.WordCountJob input/file*.txt output
14/12/10 16:08:26 INFO input.FileInputFormat: Total input paths to process : 2
14/12/10 16:08:26 INFO util.NativeCodeLoader: Loaded the native-hadoop library
14/12/10 16:08:26 WARN snappy.LoadSnappy: Snappy native library not loaded
14/12/10 16:08:27 INFO mapred.JobClient: Running job: job_201412080836_0027
14/12/10 16:08:28 INFO mapred.JobClient:  map 0% reduce 0%
14/12/10 16:08:33 INFO mapred.JobClient:  map 100% reduce 0%
14/12/10 16:08:40 INFO mapred.JobClient:  map 100% reduce 33%
14/12/10 16:08:41 INFO mapred.JobClient:  map 100% reduce 100%
14/12/10 16:08:42 INFO mapred.JobClient: Job complete: job_201412080836_0027
14/12/10 16:08:42 INFO mapred.JobClient: Counters: 29
14/12/10 16:08:42 INFO mapred.JobClient:   Job Counters
14/12/10 16:08:42 INFO mapred.JobClient:     Launched reduce tasks=1
14/12/10 16:08:42 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=7221
14/12/10 16:08:42 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
14/12/10 16:08:42 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
14/12/10 16:08:42 INFO mapred.JobClient:     Launched map tasks=2
14/12/10 16:08:42 INFO mapred.JobClient:     Data-local map tasks=2
14/12/10 16:08:42 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=8677
14/12/10 16:08:42 INFO mapred.JobClient:   File Output Format Counters
14/12/10 16:08:42 INFO mapred.JobClient:     Bytes Written=48
14/12/10 16:08:42 INFO mapred.JobClient:   FileSystemCounters
14/12/10 16:08:42 INFO mapred.JobClient:     FILE_BYTES_READ=102
14/12/10 16:08:42 INFO mapred.JobClient:     HDFS_BYTES_READ=284
14/12/10 16:08:42 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=190665
14/12/10 16:08:42 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=48
14/12/10 16:08:42 INFO mapred.JobClient:   File Input Format Counters
14/12/10 16:08:42 INFO mapred.JobClient:     Bytes Read=48
14/12/10 16:08:42 INFO mapred.JobClient:   Map-Reduce Framework
14/12/10 16:08:42 INFO mapred.JobClient:     Map output materialized bytes=108
14/12/10 16:08:42 INFO mapred.JobClient:     Map input records=2
14/12/10 16:08:42 INFO mapred.JobClient:     Reduce shuffle bytes=108
14/12/10 16:08:42 INFO mapred.JobClient:     Spilled Records=16
14/12/10 16:08:42 INFO mapred.JobClient:     Map output bytes=80
14/12/10 16:08:42 INFO mapred.JobClient:     CPU time spent (ms)=2280
14/12/10 16:08:42 INFO mapred.JobClient:     Total committed heap usage (bytes)=373489664
14/12/10 16:08:42 INFO mapred.JobClient:     Combine input records=8
14/12/10 16:08:42 INFO mapred.JobClient:     SPLIT_RAW_BYTES=236
14/12/10 16:08:42 INFO mapred.JobClient:     Reduce input records=8
14/12/10 16:08:42 INFO mapred.JobClient:     Reduce input groups=6
14/12/10 16:08:42 INFO mapred.JobClient:     Combine output records=8
14/12/10 16:08:42 INFO mapred.JobClient:     Physical memory (bytes) snapshot=433147904
14/12/10 16:08:42 INFO mapred.JobClient:     Reduce output records=6
14/12/10 16:08:42 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=1911033856
14/12/10 16:08:42 INFO mapred.JobClient:     Map output records=8
[email protected]:~/dolphin/hadoop-1.2.1$
References
1. http://dongxicheng.org/mapreduce/run-hadoop-job-problems/
2. http://lucene.472066.n3.nabble.com/trouble-with-word-count-example-td4023269.html
3. http://stackoverflow.com/questions/22850532/warn-mapred-jobclient-no-job-jar-file-set-user-classes-may-not-be-found
Solution: "No job jar file set" and ClassNotFoundException (Hadoop, MapReduce)