This article uses the classic MapReduce example, WordCount, to test the Eclipse development environment.
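For readers who have not seen WordCount before, the computation can be sketched in plain Java without any Hadoop classes. This is only an illustration of the idea (a map step that emits a (word, 1) pair per token and a reduce step that sums the pairs per word), not the Hadoop program itself; the class name WordCountSketch and its method names are made up for this example.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Plain-Java illustration of the WordCount logic; no Hadoop classes are used.
// All names here are hypothetical and chosen only for this sketch.
public class WordCountSketch {

    // "Map" step: split one input line on whitespace and emit a (word, 1) pair per token.
    public static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(line);
        while (tokens.hasMoreTokens()) {
            pairs.add(new SimpleEntry<>(tokens.nextToken(), 1));
        }
        return pairs;
    }

    // "Reduce" step: sum the counts of all pairs that share the same word.
    // A TreeMap keeps the words in sorted order.
    public static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return counts;
    }

    // Run the map step over every line, then reduce the combined pairs.
    public static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> all = new ArrayList<>();
        for (String line : lines) {
            all.addAll(map(line));
        }
        return reduce(all);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("hello hadoop", "hello eclipse");
        // Print each word and its count separated by a tab.
        wordCount(lines).forEach((word, count) -> System.out.println(word + "\t" + count));
    }
}
```

In a real Hadoop job the two steps run as separate Mapper and Reducer classes and the framework groups the pairs by key between them; the sketch folds that grouping into the reduce method.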
Unlike most tutorials, this article uses Hadoop 2.5.2, and Hadoop 2.x differs significantly from the earlier 0.x releases.
Take the jar packages as an example: in the Hadoop 2.x release the classes are no longer concentrated in a single hadoop-core*.jar but are split across multiple jars. Running the WordCount example with Hadoop 2.5.2 requires at least the following three jars:
- $HADOOP_HOME/share/hadoop/common/hadoop-common-2.5.2.jar
- $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.5.2.jar
- $HADOOP_HOME/share/hadoop/common/lib/commons-cli-1.2.jar
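As a rough sketch of how these three jars combine into a single classpath string (for example, one that could be passed to javac -cp when compiling by hand), the small helper below joins them under a given installation directory. The class name WordCountClasspath, the build method, and the fallback path /usr/local/hadoop are assumptions made for this example and are not part of Hadoop.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: builds a classpath string from the three jars listed above.
public class WordCountClasspath {

    // Jar paths relative to the Hadoop installation directory.
    static final List<String> REQUIRED_JARS = Arrays.asList(
        "share/hadoop/common/hadoop-common-2.5.2.jar",
        "share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.5.2.jar",
        "share/hadoop/common/lib/commons-cli-1.2.jar");

    // Prefix each jar with the installation directory and join the results
    // with the given separator (":" on Linux, ";" on Windows).
    public static String build(String hadoopHome, String separator) {
        return REQUIRED_JARS.stream()
            .map(jar -> hadoopHome + "/" + jar)
            .collect(Collectors.joining(separator));
    }

    public static void main(String[] args) {
        // HADOOP_HOME is read from the environment; the fallback is only a placeholder.
        String home = System.getenv().getOrDefault("HADOOP_HOME", "/usr/local/hadoop");
        System.out.println(build(home, File.pathSeparator));
    }
}
```

If a program uses classes from other Hadoop modules, the list would need more entries, so treat the three jars as a minimum for WordCount only.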
In fact, the following command prints all the classpath information needed to run a Hadoop program:
hadoop classpath
Once you have figured out which jars to add, you can start compiling the Hadoop program.
There are two common ways to compile a MapReduce program:
1. Compile and package the MapReduce program using the command line
2. Install the Eclipse plugin to compile the MapReduce program
Here is a quicker way to compile:
After you create a new Java project in Eclipse, add the appropriate jars to its build path so that you can import their classes directly when writing the MapReduce program. This approach is faster than the two methods above. Which jars to import depends on the Java classes the program uses. Because the jar locations differ from those in 0.x, first look up the jar paths as described above. To import the jars:
Right-click the Java project you created ---> Properties, select Java Build Path, open the Libraries tab, and click Add External JARs to add the required jars.
Packaging jar Files
After editing the Java program, package the MapReduce project into a jar file and then send it to the master node of Hadoop to run the MapReduce program. The steps are as follows:
Right-click the Java project ---> Export ---> JAR file.
After selecting JAR file, click the Next button to open the dialog box where the files to export are selected.
Note: select only the src folder; do not add the .classpath and .project files to the jar file.
Then, under Select the export destination, choose the directory where the jar file will be stored and the file name of the jar.
Deploying and Running
1. Send the generated jar package to the $HADOOP_HOME directory on the master node of the Hadoop cluster
2. Run the MapReduce program with the following command:
hadoop jar jar_name.jar package_name.classname /inputfile_dir /outputfile_dir
Note: before running the MapReduce program, make sure that inputfile_dir exists and that outputfile_dir does not exist.
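The argument order of the run command is easy to get wrong, so here is a minimal Java sketch that assembles it as a list of arguments: the jar, the fully qualified main class, the input directory, and the output directory. The class HadoopJarCommand and its build method are hypothetical names for this example, and the sketch does not check whether the input and output directories exist.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: assembles the arguments of the "hadoop jar" command shown above.
public class HadoopJarCommand {

    // Order: hadoop jar <jar file> <main class> <input dir> <output dir>
    public static List<String> build(String jarName, String mainClass,
                                     String inputDir, String outputDir) {
        return Arrays.asList("hadoop", "jar", jarName, mainClass, inputDir, outputDir);
    }

    public static void main(String[] args) {
        // The placeholder names are the ones used in the command above.
        List<String> command = build("jar_name.jar", "package_name.classname",
                                     "/inputfile_dir", "/outputfile_dir");
        System.out.println(String.join(" ", command));
        // To launch the job from Java on a node where hadoop is on the PATH,
        // the list could be handed to a ProcessBuilder:
        // new ProcessBuilder(command).inheritIO().start().waitFor();
    }
}
```

Keeping the arguments in a list rather than one string avoids quoting problems if a directory name ever contains spaces.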
To send the jar file to the master node of the Hadoop cluster, you can use the SSH Secure File Transfer Client to copy the jar file from Windows to the master node running Linux.
Use the following command to view the result file:
hadoop fs -text /outputfile_dir/part-r-00000