Running the Hadoop WordCount example program from the command line

Source: Internet
Author: User

Reference 1: http://www.cnblogs.com/flying5/archive/2011/05/04/2078408.html

A few points need explaining.

1. If the WordCount program has no package declaration

Then use the following command:

hadoop jar WordCount.jar WordCount2 /home/hadoop/input/20418.txt /home/hadoop/output/wordcount2-6

This command tells Hadoop to run the WordCount2 program, which is contained in WordCount.jar. WordCount.jar holds the following class files. Three class files are generated by compiling WordCount.java:

WordCount.class
WordCount$Map.class
WordCount$Reduce.class

Four class files are generated by compiling WordCount2.java:

WordCount2.class
WordCount2$IntSumReducer.class
WordCount2$IntWritableDecreasingComparator.class
WordCount2$TokenizerMapper.class

These .class files sit in the root directory of the jar. To package the seven class files into a jar, assume all seven are in the same directory, the WordCount folder. Change to that directory on the command line and run the following command:

jar cvf WordCount.jar *.class
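
To confirm that the packaging really placed every class at the jar root, the archive listing can be checked. The following is only a minimal sketch: all_classes_at_root is a hypothetical helper name (not part of the JDK or Hadoop), and the demo feeds it a hand-written listing instead of a real jar.

```shell
# Succeeds only if every .class entry read from stdin sits at the archive root.
# Hypothetical helper for illustration; not a JDK or Hadoop tool.
all_classes_at_root() {
    while IFS= read -r entry; do
        case "$entry" in
            */*.class) return 1 ;;   # a class inside a directory, i.e. inside a package
        esac
    done
    return 0
}

# Demo with a hand-written listing. With a real jar the listing would come from:
#   jar tf WordCount.jar | all_classes_at_root
printf '%s\n' 'META-INF/MANIFEST.MF' 'WordCount.class' 'WordCount$Map.class' \
    | all_classes_at_root && echo "all classes are at the jar root"
```

A jar that passes this check can be run with the bare class name, as in the command above.
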

2. If the WordCount program has a package declaration

Then the following command is wrong:

$ hadoop jar WordCount.jar WordCount2 /home/hadoop/input/20418.txt /home/hadoop/output/wordcount2-7

It fails with the following error (for the specific cause, see this blog post: http://blog.csdn.net/xw13106209/article/details/6861855):

Exception in thread "main" java.lang.ClassNotFoundException: WordCount2
	at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:247)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
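
The exception comes down to how a class name is turned into a path inside the jar: each dot in the name becomes a directory separator. A minimal sketch of that mapping follows; class_to_jar_entry is a hypothetical helper written for illustration, not something Hadoop or the JDK provides.

```shell
# Map a Java class name to the entry path a class loader looks for inside a jar.
# Hypothetical helper for illustration only.
class_to_jar_entry() {
    # Replace every '.' in the (possibly package-qualified) name with '/',
    # then append the ".class" suffix.
    printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

class_to_jar_entry WordCount2
# prints: WordCount2.class

class_to_jar_entry org.apache.hadoop.examples.WordCount2
# prints: org/apache/hadoop/examples/WordCount2.class
```

When the class was compiled with a package, the jar contains only the second entry, so asking for the bare name WordCount2 finds nothing.
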
The correct command is as follows:

$ hadoop jar WordCount.jar org.apache.hadoop.examples.WordCount2 /home/hadoop/input/20418.txt /home/hadoop/output/wordcount2-7

The only difference in the command itself is the class name, which is now fully qualified with its package. The jar is also different in content: the WordCount.jar used by the previous command held the class files in its root directory, whereas the WordCount.jar used here packages the entire org/apache/hadoop/examples directory.

3. Compiling the WordCount.java program

Reference 1 compiles WordCount.java with a command similar to the following:

javac -classpath /home/hadoop/program/hadoop-0.20.1/hadoop-0.20.1-core.jar WordCount.java -d /home/hadoop/WordCount/

This command sets the classpath to /home/hadoop/program/hadoop-0.20.1/hadoop-0.20.1-core.jar. In fact, we can set environment variables so that this option can be omitted.

The steps are as follows. Open /etc/profile for editing:

sudo gedit /etc/profile

Change the line

export CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar:$CLASSPATH

to

export HADOOP_HOME=/home/hadoop/program/hadoop-0.20.1
export CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar:$CLASSPATH:$HADOOP_HOME/hadoop-0.20.1-core.jar

Once the change has taken effect (for example, after logging in again or running source /etc/profile), WordCount.java can be compiled directly with the following command:

javac WordCount.java -d /home/hadoop/wordcount/
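
Before relying on the shorter command, it may help to check that the Hadoop core jar really is on the classpath. The following is a minimal sketch: classpath_has is a hypothetical helper, and the /opt/jdk paths in the demo value are made up for illustration. In practice the first argument would be "$CLASSPATH".

```shell
# Succeeds if the colon-separated list in $1 contains the exact entry $2.
# Hypothetical helper for illustration only.
classpath_has() {
    case ":$1:" in
        *":$2:"*) return 0 ;;   # entry found between two separators
        *)        return 1 ;;
    esac
}

# Demo with a literal value; the /opt/jdk entries are placeholders.
demo_cp=".:/opt/jdk/lib/tools.jar:/opt/jdk/lib/dt.jar:/home/hadoop/program/hadoop-0.20.1/hadoop-0.20.1-core.jar"
if classpath_has "$demo_cp" "/home/hadoop/program/hadoop-0.20.1/hadoop-0.20.1-core.jar"; then
    echo "core jar is on the classpath"
else
    echo "core jar is missing from the classpath"
fi
```

Wrapping the list in extra colons before matching means an entry is only found as a whole element, not as part of a longer path.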

Note: WordCount.java here may or may not declare a package. Without a package, the compiled .class files are placed directly in the /home/hadoop/wordcount/ directory. With a package, the package's directory structure is also created under /home/hadoop/wordcount/.

4. Failure to compile the WordCount2.java program

Using the same form of compile command for WordCount2.java

javac WordCount2.java -d /home/hadoop/wordcount/

produces the following error:

WordCount2.java:93: cannot access org.apache.commons.cli.Options
class file for org.apache.commons.cli.Options not found
String[] otherArgs = new GenericOptionsParser(conf, args)
^
1 error

The main cause of this error is that WordCount2.java uses classes that are not on the classpath. In Eclipse, the libraries added under Build Path include not just one jar but many others, and listing every one of those jars on the command line is tedious. It is recommended to compile with Ant instead; how to use Ant may be covered in a later post. Alternatively, Eclipse can do the compiling for us. Reference blog: Eclipse compiles java files.
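
If Ant is not set up yet, one alternative is to let the shell collect the jars. The sketch below is an illustration only: jars_classpath is a hypothetical helper, and it assumes the missing jars (such as the one providing org.apache.commons.cli.Options) live in a lib directory under $HADOOP_HOME, which should be checked against the actual installation.

```shell
# Print every *.jar directly under directory $1 as one colon-separated string.
# Hypothetical helper for illustration only.
jars_classpath() {
    result=""
    for jar in "$1"/*.jar; do
        [ -e "$jar" ] || continue          # skip the unexpanded pattern when no jar exists
        result="${result:+$result:}$jar"   # add ':' only between entries
    done
    printf '%s\n' "$result"
}

# Assumed usage, only attempted when HADOOP_HOME is set and has a lib directory:
#   javac -classpath "$HADOOP_HOME/hadoop-0.20.1-core.jar:$(jars_classpath "$HADOOP_HOME/lib")" \
#       WordCount2.java -d /home/hadoop/wordcount/
if [ -n "${HADOOP_HOME:-}" ] && [ -d "$HADOOP_HOME/lib" ]; then
    echo "$HADOOP_HOME/hadoop-0.20.1-core.jar:$(jars_classpath "$HADOOP_HOME/lib")"
fi
```

This only gathers jars from a single directory; a build tool such as Ant remains the more maintainable choice once a project depends on several locations.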





