In a previous post I described building a hadoop 2.7.2 cluster on CentOS 6.4 virtual machines under Ubuntu. To develop MapReduce programs you need Eclipse, plus the matching Hadoop plugin, hadoop-eclipse-plugin-2.7.2.jar. In the hadoop 1.x days the official Hadoop installation package shipped with an Eclipse plugin, but as developers' Eclipse versions multiplied and diverged, the plugin had to match the development tool, and a single bundled plugin package could not be compatible with them all. To simplify matters, today's Hadoop installation package no longer includes an Eclipse plugin; you have to compile it yourself against your own Eclipse.
Using ant to build your own Eclipse plugin. First, my environment and tools:
Ubuntu 14.04 (the system doesn't matter much; Windows works too, the method is the same); IDE: eclipse-jee-mars-2-linux-gtk-x86_64.tar.gz
Ant (how you install it is up to you; a binary installation or apt-get installation both work; then configure the environment variables):
export ANT_HOME=/usr/local/ant/apache-ant-1.9.7
export PATH=$PATH:$ANT_HOME/bin
If you are prompted that the ant-launcher.jar package cannot be found, add this environment variable:
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/jre/lib:$JAVA_HOME/lib/tools.jar:$ANT_HOME/lib/ant-launcher.jar
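These exports only apply to the current shell; a common way to make them permanent is to append them to ~/.bashrc (or /etc/profile) and reload it (a sketch):

hadoop@hadoop:~$ source ~/.bashrc

Then verify the installation: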
hadoop@hadoop:~$ ant -version
Apache Ant (TM) version 1.9.7 compiled on April 9 2016
To build the Eclipse plugin with ant you need the hadoop2x-eclipse-plugin project; the resource URL on GitHub is:
https://github.com/winghc/hadoop2x-eclipse-plugin
Download it in zip format and extract it to a suitable path. Note that the path's permissions and the directory owner must belong to the current user.
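For example, from the shell (a sketch; master.zip is the archive GitHub serves for the master branch):

hadoop@hadoop:~$ wget https://github.com/winghc/hadoop2x-eclipse-plugin/archive/master.zip
hadoop@hadoop:~$ unzip master.zip -d /home/hadoop/    # extracts hadoop2x-eclipse-plugin-master/, owned by the current user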
The paths of the three tools/resources involved in the build are as follows:
hadoop@hadoop:~$ cd hadoop2x-eclipse-plugin-master
hadoop@hadoop:hadoop2x-eclipse-plugin-master$ pwd
/home/hadoop/hadoop2x-eclipse-plugin-master
hadoop@hadoop:hadoop2x-eclipse-plugin-master$ cd /opt/software/hadoop-2.7.2
hadoop@hadoop:hadoop-2.7.2$ pwd
/opt/software/hadoop-2.7.2
hadoop@hadoop:hadoop-2.7.2$ cd /home/hadoop/eclipse/
hadoop@hadoop:eclipse$ pwd
/home/hadoop/eclipse
Following the "How to build" section of the GitHub README, run ant as described below.
Unzip the downloaded hadoop2x-eclipse-plugin, enter the directory hadoop2x-eclipse-plugin-master/src/contrib/eclipse-plugin/, and perform the build there.
How to build
[hdpusr@demo hadoop2x-eclipse-plugin]$ cd src/contrib/eclipse-plugin
# assume hadoop installation directory is /usr/share/hadoop
[hdpusr@apclt eclipse-plugin]$ ant jar -Dversion=2.4.1 -Dhadoop.version=2.4.1 -Declipse.home=/opt/eclipse -Dhadoop.home=/usr/share/hadoop
The final jar is generated at ${hadoop2x-eclipse-plugin}/build/contrib/eclipse-plugin/hadoop-eclipse-plugin-2.4.1.jar
But I need the 2.7.2 Eclipse plugin, and the hadoop2x-eclipse-plugin downloaded from GitHub is configured for a hadoop 2.6 build environment, so ant's build.xml configuration file and the related files must be modified before running ant.
First file: hadoop2x-eclipse-plugin-master/src/contrib/eclipse-plugin/build.xml
At line 83, find the <target name="jar" depends="compile" unless="skip.contrib"> tag; inside it, add and modify the copy child elements (that is, below line 127) to the following content:
<copy file="${hadoop.home}/share/hadoop/common/lib/htrace-core-${htrace.version}-incubating.jar" todir="${build.dir}/lib" verbose="true"/>
<copy file="${hadoop.home}/share/hadoop/common/lib/servlet-api-${servlet-api.version}.jar" todir="${build.dir}/lib" verbose="true"/>
<copy file="${hadoop.home}/share/hadoop/common/lib/commons-io-${commons-io.version}.jar" todir="${build.dir}/lib" verbose="true"/>
Then find the <attribute name="Bundle-ClassPath"> tag and, in its value list, add and modify the lib entries as follows:
lib/servlet-api-${servlet-api.version}.jar,
lib/commons-io-${commons-io.version}.jar,
lib/htrace-core-${htrace.version}-incubating.jar"/>
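For orientation, after the edit the attribute ends up looking roughly like this (a sketch, not the verbatim file; the pre-existing entries, elided here, stay unchanged):

<attribute name="Bundle-ClassPath"
           value="classes/,
                  lib/hadoop-mapreduce-client-core-${hadoop.version}.jar,
                  ... (existing lib entries unchanged) ...,
                  lib/servlet-api-${servlet-api.version}.jar,
                  lib/commons-io-${commons-io.version}.jar,
                  lib/htrace-core-${htrace.version}-incubating.jar"/>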
Save and exit. Note: if you do not make these changes, then even if the jar compiles and you put it into Eclipse, the configuration/connection step will throw errors.
But just adding and modifying these lib entries is not enough. The jar versions under share/hadoop/common/lib/ changed considerably between hadoop 2.6 and hadoop 2.7, so the version numbers must be updated to match as well. Checking and changing them took me half a day.
These versions are configured in the ivy directory of hadoop2x-eclipse-plugin-master, that is, in hadoop2x-eclipse-plugin-master/ivy/libraries.properties.
The final modifications are shown below
For everyone's convenience I have copied my file below; the commented-out lines (#) are the original settings that were replaced:
hadoop.version=2.7.2
hadoop-gpl-compression.version=0.1.0

#These are the versions of our dependencies (in alphabetical order)
apacheant.version=1.7.0
ant-task.version=2.0.10
asm.version=3.2
aspectj.version=1.6.5
aspectj.version=1.6.11
checkstyle.version=4.2
commons-cli.version=1.2
commons-codec.version=1.4
# commons-collections.version=3.2.1
commons-collections.version=3.2.2
commons-configuration.version=1.6
commons-daemon.version=1.0.13
# commons-httpclient.version=3.0.1
commons-httpclient.version=3.1
commons-lang.version=2.6
# commons-logging.version=1.0.4
commons-logging.version=1.1.3
# commons-logging-api.version=1.0.4
commons-logging-api.version=1.1.3
# commons-math.version=2.1
commons-math.version=3.1.1
commons-el.version=1.0
commons-fileupload.version=1.2
# commons-io.version=2.1
commons-io.version=2.4
commons-net.version=3.1
core.version=3.1.1
coreplugin.version=1.3.2
# hsqldb.version=1.8.0.10
# htrace.version=3.0.4
hsqldb.version=2.0.0
htrace.version=3.1.0
ivy.version=2.1.0
jasper.version=5.5.12
jackson.version=1.9.13
#not able to figure out the version of jsp & jsp-api to get it resolved through ivy
# but still declared as we are going to have a local copy from the lib folder
jsp.version=2.1
jsp-api.version=5.5.12
jsp-api-2.1.version=6.1.14
jsp-2.1.version=6.1.14
# jets3t.version=0.6.1
jets3t.version=0.9.0
jetty.version=6.1.26
jetty-util.version=6.1.26
# jersey-core.version=1.8
# jersey-json.version=1.8
# jersey-server.version=1.8
jersey-core.version=1.9
jersey-json.version=1.9
jersey-server.version=1.9
# junit.version=4.5
junit.version=4.11
jdeb.version=0.8
jdiff.version=1.0.9
json.version=1.0
kfs.version=0.1
lucene-core.version=2.3.1
mockito-all.version=1.8.5
jsch.version=0.1.42
oro.version=2.0.8
rats-lib.version=0.5.1
servlet.version=4.0.6
servlet-api.version=2.5
# slf4j-api.version=1.7.5
# slf4j-log4j12.version=1.7.5
slf4j-api.version=1.7.10
slf4j-log4j12.version=1.7.10
wagon-http.version=1.0-beta-2
xmlenc.version=0.52
# xerces.version=1.4.4
xerces.version=2.9.1
protobuf.version=2.5.0
guava.version=11.0.2
netty.version=3.6.2.Final
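A quick way to double-check these against the jars your Hadoop distribution actually ships (a sketch; adjust the path to your own hadoop.home):

hadoop@hadoop:~$ ls /opt/software/hadoop-2.7.2/share/hadoop/common/lib/ | grep -E 'commons-io|htrace|servlet-api'

The versions printed there are the ones that belong in libraries.properties.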
With the modifications complete, it is time to start ant.
Enter src/contrib/eclipse-plugin/ and execute the ant command as follows:
hadoop@hadoop:hadoop2x-eclipse-plugin-master$ cd src/contrib/eclipse-plugin/
hadoop@hadoop:eclipse-plugin$ ls
build.properties  build.xml.bak  ivy.xml      META-INF    resources
build.xml         ivy            makeplus.sh  plugin.xml  src
hadoop@hadoop:eclipse-plugin$ ant jar -Dhadoop.version=2.7.2 -Declipse.home=/home/hadoop/eclipse -Dhadoop.home=/opt/software/hadoop-2.7.2
The process is slow the first time; after that it is quick.
When the final output looks like the following, the ant build has succeeded:
compile:
     [echo] contrib: eclipse-plugin
    [javac] /home/hadoop/hadoop2x-eclipse-plugin-master/src/contrib/eclipse-plugin/build.xml:76: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds

jar:
      [jar] Building jar: /home/hadoop/hadoop2x-eclipse-plugin-master/build/contrib/eclipse-plugin/hadoop-eclipse-plugin-2.7.2.jar

BUILD SUCCESSFUL
Total time: 4 seconds
Then put the freshly built plugin into Eclipse's plugins directory.
Then restart Eclipse, or relaunch it from the shell as shown below; running it from the shell also displays Eclipse's output as it runs, so if something goes wrong you can spot the cause right away:
hadoop@hadoop:eclipse-plugin$ cp /home/hadoop/hadoop2x-eclipse-plugin-master/build/contrib/eclipse-plugin/hadoop-eclipse-plugin-2.7.2.jar /home/hadoop/eclipse/plugins/
hadoop@hadoop:eclipse-plugin$ /home/hadoop/eclipse/eclipse -clean
Choose your workspace and enter Eclipse. Click Window --> Preferences, and you will find a new entry in the list, Hadoop Map/Reduce; select it and set the Hadoop installation directory.
A distributed file system (DFS Locations) entry appears in Eclipse's Project Explorer. Click Window --> Show View and select MapReduce Tools.
This opens the Map/Reduce Locations window, where the little elephant icon appears; choose to add a new M/R configuration and configure it as follows.
The Location name can be anything you like, but the Map/Reduce Master and DFS Master settings here must correspond, one by one, to the Hadoop cluster you configured, that is, to your distributed setup's core-site.xml and mapred-site.xml files. If the configuration is wrong, a connection failure will appear.
My configuration is as follows: Host is hadoop (the master node's name; you can also write the master's IP address here), and the port numbers are 9000 (the file system host port) and 9001 (the port of the JobTracker, the MapReduce management node).
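For reference, those host/port fields mirror settings like the following from the cluster's configuration (a sketch; use whatever hostname and ports your own core-site.xml and mapred-site.xml define):

<!-- core-site.xml: DFS Master -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://hadoop:9000</value>
</property>

<!-- mapred-site.xml: Map/Reduce Master (the classic JobTracker address the plugin's field maps to) -->
<property>
  <name>mapred.job.tracker</name>
  <value>hadoop:9001</value>
</property>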
Then start the Hadoop cluster and test it in the shell; after that, test file transfer through Eclipse's DFS Locations, and test programming against the FileSystem API and the MapReduce API. Here I only want to verify that the plugin is usable, so test HDFS yourself (it is very simple); I will test one MR program: telephone statistics. The format is as follows: on the left is the calling number, on the right the called number; the job collects, for each called number, the calls it received and shows the callers. (Note the malformed line in the sample data; the code's LINESKIP counter records such lines and skips them.)
11500001211 10086
11500001212 10010
15500001213
15500001214 11500001211
11500001212 10010
15500001213 10086
15500001214 110
The code section is as follows
package hdfs;

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MR extends Configured implements Tool {

    // counts input lines that could not be parsed
    enum Counter {
        LINESKIP,
    }

    // map: "caller callee" -> (callee, caller)
    public static class WCMapper extends Mapper<LongWritable, Text, Text, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            try {
                String[] lineSplit = line.split(" ");
                String aNum = lineSplit[0]; // calling number
                String bNum = lineSplit[1]; // called number
                context.write(new Text(bNum), new Text(aNum));
            } catch (Exception e) {
                context.getCounter(Counter.LINESKIP).increment(1); // error counter +1, skip the line
            }
        }
    }

    // reduce: join all callers of one called number with "|"
    public static class IntSumReduce extends Reducer<Text, Text, Text, Text> {
        @Override
        protected void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            String out = "";
            for (Text value : values) {
                out += value.toString() + "|";
            }
            context.write(key, new Text(out));
        }
    }

    public int run(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String[] strs = new GenericOptionsParser(conf, args).getRemainingArgs();
        Job job = parseInputAndOutput(this, conf, strs);
        if (job == null) {
            return -1; // invalid usage
        }
        job.setJarByClass(MR.class);
        FileInputFormat.addInputPath(job, new Path(strs[0]));
        FileOutputFormat.setOutputPath(job, new Path(strs[1]));
        job.setMapperClass(WCMapper.class);
        job.setInputFormatClass(TextInputFormat.class);
        // no combiner: the reduce concatenates strings, so running it as a
        // combiner as well would nest the "|" separators
        job.setReducerClass(IntSumReduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public Job parseInputAndOutput(Tool tool, Configuration conf, String[] args) throws Exception {
        // validate: exactly one input and one output path
        if (args.length != 2) {
            System.err.printf("Usage: %s [generic options] <input> <output>%n", tool.getClass().getSimpleName());
            return null;
        }
        // create the job, named after the tool class
        return Job.getInstance(conf, tool.getClass().getSimpleName());
    }

    public static void main(String[] args) throws Exception {
        // run the map reduce job and exit with its status
        int status = ToolRunner.run(new MR(), args);
        System.exit(status);
    }
}
Upload the data file to HDFS as follows:
hadoop@hadoop:~$ hdfs dfs -mkdir -p /user/hadoop/mr/wc/input
hadoop@hadoop:~$ hdfs dfs -put top.data /user/hadoop/mr/wc/input
Running the MR Program in Eclipse
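In the Eclipse run configuration, the program arguments are just the input and output paths (a sketch matching the upload above; the output directory must not exist yet, or the job will fail):

/user/hadoop/mr/wc/input /user/hadoop/mr/wc/output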
On successful execution, the job's progress is printed in the Eclipse console; then view the results.
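With the sample data above (the malformed line skipped and counted by LINESKIP), the output file part-r-00000 should look roughly like this, each called number followed by its callers:

10010	11500001212|11500001212|
10086	11500001211|15500001213|
110	15500001214|
11500001211	15500001214|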
There are no problems with the plugin