Building the Eclipse plugin hadoop-eclipse-plugin-2.7.2.jar for Hadoop 2.7.2 (hadoop2.x) with Ant


In a previous post I described building a hadoop 2.7.2 cluster on CentOS 6.4 virtual machines under Ubuntu. To do MapReduce development you need Eclipse, and Eclipse needs the matching Hadoop plugin, hadoop-eclipse-plugin-2.7.2.jar. In the hadoop1.x days the official Hadoop installation packages shipped with an Eclipse plugin, but as developers' Eclipse versions multiplied and diverged, the plugin had to match each particular Eclipse installation, and a single bundled jar could not be compatible with all of them. To simplify things, today's Hadoop installation packages no longer contain an Eclipse plugin; you have to compile one yourself against your own Eclipse.

Before using ant to make your own Eclipse plugin, let me introduce my environment and tools:

Ubuntu 14.04 (the OS does not really matter; Windows works too, the method is the same), with Eclipse as the IDE: eclipse-jee-mars-2-linux-gtk-x86_64.tar.gz

Ant (either a binary installation or an apt-get installation works; then configure the environment variables)
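If you take the apt-get route, a single command is enough (assuming Ubuntu's default repositories; the binary route just means unpacking the Ant tarball under a path like the /usr/local/ant used below):

sudo apt-get install ant

For a binary install, the environment variables look like this: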

export ANT_HOME=/usr/local/ant/apache-ant-1.9.7
export PATH=$PATH:$ANT_HOME/bin

If you are prompted that the ant-launcher.jar package cannot be found, add another environment variable:

export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/jre/lib:$JAVA_HOME/lib/tools.jar:$ANT_HOME/lib/ant-launcher.jar

hadoop@hadoop:~$ ant -version
Apache Ant(TM) version 1.9.7 compiled on April 9 2016
Making the Eclipse plugin with Ant requires the hadoop2x-eclipse-plugin project; the resource is on GitHub:

https://github.com/winghc/hadoop2x-eclipse-plugin

Download it in zip format and extract it to a suitable path. Make sure the path's permissions and directory owner belong to the current user.
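For example, the download and extraction might look like this (a sketch; the URL is GitHub's standard zip archive of the master branch):

cd /home/hadoop
wget https://github.com/winghc/hadoop2x-eclipse-plugin/archive/master.zip
unzip master.zip
sudo chown -R hadoop:hadoop hadoop2x-eclipse-plugin-master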

The paths of the three things involved in the build (the plugin source, Hadoop itself, and Eclipse) are as follows:

hadoop@hadoop:~$ cd hadoop2x-eclipse-plugin-master
hadoop@hadoop:hadoop2x-eclipse-plugin-master$ pwd
/home/hadoop/hadoop2x-eclipse-plugin-master
hadoop@hadoop:hadoop2x-eclipse-plugin-master$ cd /opt/software/hadoop-2.7.2
hadoop@hadoop:hadoop-2.7.2$ pwd
/opt/software/hadoop-2.7.2
hadoop@hadoop:hadoop-2.7.2$ cd /home/hadoop/eclipse/
hadoop@hadoop:eclipse$ pwd
/home/hadoop/eclipse

The project's GitHub description ("How to build") explains how to run Ant:

Unzip the downloaded hadoop2x-eclipse-plugin, enter the directory hadoop2x-eclipse-plugin-master/src/contrib/eclipse-plugin/, and perform the build there.

How to build

[hdpusr@demo hadoop2x-eclipse-plugin]$ cd src/contrib/eclipse-plugin

# Assume hadoop installation directory is /usr/share/hadoop

[hdpusr@apclt eclipse-plugin]$ ant jar -Dversion=2.4.1 -Dhadoop.version=2.4.1 -Declipse.home=/opt/eclipse -Dhadoop.home=/usr/share/hadoop

final jar will be generated at directory

${hadoop2x-eclipse-plugin}/build/contrib/eclipse-plugin/hadoop-eclipse-plugin-2.4.1.jar

But I need the 2.7.2 Eclipse plugin, and the hadoop2x-eclipse-plugin downloaded from GitHub is configured for a hadoop 2.6 build environment, so Ant's build.xml configuration file and the related files must be modified before running ant.

The first file: hadoop2x-eclipse-plugin-master/src/contrib/eclipse-plugin/build.xml

Around line 83, find the <target name="jar" depends="compile" unless="skip.contrib"> element, and add and adjust its <copy> child elements to include the content below (the spot is around line 127):

    <copy file="${hadoop.home}/share/hadoop/common/lib/htrace-core-${htrace.version}-incubating.jar" todir="${build.dir}/lib" verbose="true"/>
    <copy file="${hadoop.home}/share/hadoop/common/lib/servlet-api-${servlet-api.version}.jar" todir="${build.dir}/lib" verbose="true"/>
    <copy file="${hadoop.home}/share/hadoop/common/lib/commons-io-${commons-io.version}.jar" todir="${build.dir}/lib" verbose="true"/>

Then find the <attribute name="Bundle-ClassPath"> tag and add and adjust the lib entries in its value list so that it ends as follows:

 lib/servlet-api-${servlet-api.version}.jar,
 lib/commons-io-${commons-io.version}.jar,
 lib/htrace-core-${htrace.version}-incubating.jar"/>

Save and exit. Note that if you skip this modification, then even if the jar compiles and you drop it into Eclipse, the configuration step will throw errors.
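For reference, after the edit the end of that attribute should look roughly like this (a sketch; the entries already present in your build.xml are abbreviated as "..."):

 <attribute name="Bundle-ClassPath"
   value="classes/,
 ...the lib/*.jar entries already listed in build.xml...,
 lib/servlet-api-${servlet-api.version}.jar,
 lib/commons-io-${commons-io.version}.jar,
 lib/htrace-core-${htrace.version}-incubating.jar"/>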

But adding these lib entries alone is not enough: the jar versions under share/hadoop/common/lib/ changed considerably between hadoop 2.6 and hadoop 2.7, so you also need to update the jar versions accordingly. Checking and changing them took me half a day.

The file that pins these versions lives in the ivy directory of hadoop2x-eclipse-plugin-master, namely hadoop2x-eclipse-plugin-master/ivy/libraries.properties.

The final modifications are shown below

For everyone's convenience I have pasted the whole file here; each commented-out (#) line is the original setting that the line after it overrides.

hadoop.version=2.7.2
hadoop-gpl-compression.version=0.1.0

#These are the versions of our dependencies (in alphabetical order)
apacheant.version=1.7.0
ant-task.version=2.0.10

asm.version=3.2
aspectj.version=1.6.5
aspectj.version=1.6.11

checkstyle.version=4.2

commons-cli.version=1.2
commons-codec.version=1.4
# commons-collections.version=3.2.1
commons-collections.version=3.2.2
commons-configuration.version=1.6
commons-daemon.version=1.0.13
# commons-httpclient.version=3.0.1
commons-httpclient.version=3.1
commons-lang.version=2.6
# commons-logging.version=1.0.4
commons-logging.version=1.1.3
# commons-logging-api.version=1.0.4
commons-logging-api.version=1.1.3
# commons-math.version=2.1
commons-math.version=3.1.1
commons-el.version=1.0
commons-fileupload.version=1.2
# commons-io.version=2.1
commons-io.version=2.4
commons-net.version=3.1
core.version=3.1.1
coreplugin.version=1.3.2

# hsqldb.version=1.8.0.10
# htrace.version=3.0.4
hsqldb.version=2.0.0
htrace.version=3.1.0

ivy.version=2.1.0

jasper.version=5.5.12
jackson.version=1.9.13
#not able to figure out the version of jsp & jsp-api version to get it resolved through ivy
# but still declared as we are going to have a local copy from the lib folder
jsp.version=2.1
jsp-api.version=5.5.12
jsp-api-2.1.version=6.1.14
jsp-2.1.version=6.1.14
# jets3t.version=0.6.1
jets3t.version=0.9.0
jetty.version=6.1.26
jetty-util.version=6.1.26
# jersey-core.version=1.8
# jersey-json.version=1.8
# jersey-server.version=1.8
jersey-core.version=1.9
jersey-json.version=1.9
jersey-server.version=1.9
# junit.version=4.5
junit.version=4.11
jdeb.version=0.8
jdiff.version=1.0.9
json.version=1.0

kfs.version=0.1

lucene-core.version=2.3.1
mockito-all.version=1.8.5
jsch.version=0.1.42
oro.version=2.0.8
rats-lib.version=0.5.1

servlet.version=4.0.6
servlet-api.version=2.5
# slf4j-api.version=1.7.5
# slf4j-log4j12.version=1.7.5
slf4j-api.version=1.7.10
slf4j-log4j12.version=1.7.10

wagon-http.version=1.0-beta-2
xmlenc.version=0.52
# xerces.version=1.4.4
xerces.version=2.9.1

protobuf.version=2.5.0
guava.version=11.0.2
netty.version=3.6.2.Final
Once the modifications are complete, the preparation is done; time to start ant.

Enter src/contrib/eclipse-plugin/ and execute the ant command, as follows:

hadoop@hadoop:hadoop2x-eclipse-plugin-master$ cd src/contrib/eclipse-plugin/
hadoop@hadoop:eclipse-plugin$ ls
build.properties  build.xml.bak  ivy.xml      META-INF    resources
build.xml         ivy            makeplus.sh  plugin.xml  src
hadoop@hadoop:eclipse-plugin$ ant jar -Dhadoop.version=2.7.2 -Declipse.home=/home/hadoop/eclipse -Dhadoop.home=/opt/software/hadoop-2.7.2
The process is slow the first time (mostly dependency downloads); after that it finishes quickly.

When the final output looks like the following, the ant build has succeeded:

compile:
     [echo] contrib: eclipse-plugin
    [javac] /home/hadoop/hadoop2x-eclipse-plugin-master/src/contrib/eclipse-plugin/build.xml:76: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds

jar:
      [jar] Building jar: /home/hadoop/hadoop2x-eclipse-plugin-master/build/contrib/eclipse-plugin/hadoop-eclipse-plugin-2.7.2.jar

BUILD SUCCESSFUL
Total time: 4 seconds
Then put your freshly built plugin into the plugins directory of your Eclipse installation.

Then restart Eclipse, or refresh it from the shell as shown below; launching from the shell also shows Eclipse's running output in the terminal, so if something goes wrong you can spot the cause right away.

hadoop@hadoop:eclipse-plugin$ cp /home/hadoop/hadoop2x-eclipse-plugin-master/build/contrib/eclipse-plugin/hadoop-eclipse-plugin-2.7.2.jar /home/hadoop/eclipse/plugins/
hadoop@hadoop:eclipse-plugin$ /home/hadoop/eclipse/eclipse -clean

Choose your workspace, enter Eclipse, open Window > Preferences, and in the list you will find a new Hadoop Map/Reduce entry; point it at your Hadoop installation directory.


A DFS Locations (distributed file system) entry appears in Eclipse's Project Explorer. Next click Window --> Show View and select MapReduce Tools.

This opens the Map/Reduce Locations view with its cute little elephant icon; choose to add a new M/R configuration and configure it as follows.


The location name can be whatever you like, but the Map/Reduce Master and DFS Master settings here must match, one for one, the core-site.xml and mapred-site.xml of the cluster (or pseudo-distributed setup) you configured. If they are wrong, a connection failure will appear.

My configuration is as follows: the host is hadoop (the master node's hostname; you can also write the master's IP address from your own configuration), and the port numbers are 9000 (the file system port) and 9001 (the JobTracker management node port).
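These two ports come straight from the cluster's configuration files. A minimal sketch of the relevant entries, assuming the hostname hadoop used above (older configs may use fs.default.name instead of fs.defaultFS):

core-site.xml:

 <property>
   <name>fs.defaultFS</name>
   <value>hdfs://hadoop:9000</value>
 </property>

mapred-site.xml:

 <property>
   <name>mapred.job.tracker</name>
   <value>hadoop:9001</value>
 </property>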



Then start the Hadoop cluster and test it in the shell, then try a file-transfer test through Eclipse's DFS Locations, and test programming against the FileSystem API and the MapReduce API. Here I just want to verify that the plugin works, so test HDFS yourself (it is very simple); what I will test here is an MR program that does telephone statistics. The format is shown below: the caller is on the left and the callee on the right. The job counts the calls each number receives and shows who the callers are. (Note that the sample includes one incomplete line; the code below skips such lines and counts them with a LINESKIP counter.)

11500001211 10086
11500001212 10010
15500001213
15500001214 11500001211
11500001212 10010
15500001213 10086
15500001214 110
The code section is as follows

package hdfs;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MR extends Configured implements Tool {

	enum Counter {
		LINESKIP, // counter for malformed input lines that get skipped
	}

	// Mapper: each input line is "caller callee"; emit <callee, caller>
	public static class WcMapper extends Mapper<LongWritable, Text, Text, Text> {
		@Override
		protected void map(LongWritable key, Text value, Context context)
				throws IOException, InterruptedException {
			String line = value.toString();
			try {
				String[] lineSplit = line.split(" ");
				String aNum = lineSplit[0]; // caller
				String bNum = lineSplit[1]; // callee
				context.write(new Text(bNum), new Text(aNum));
			} catch (Exception e) {
				context.getCounter(Counter.LINESKIP).increment(1); // error counter +1
			}
		}
	}

	// Reducer: for each callee, join all of its callers with "|"
	public static class IntSumReduce extends Reducer<Text, Text, Text, Text> {
		@Override
		protected void reduce(Text key, Iterable<Text> values, Context context)
				throws IOException, InterruptedException {
			String out = "";
			for (Text value : values) {
				out += value.toString() + "|";
			}
			context.write(key, new Text(out));
		}
	}

	public int run(String[] args) throws Exception {
		Configuration conf = new Configuration();
		String[] strs = new GenericOptionsParser(conf, args).getRemainingArgs();

		Job job = parseInputAndOutput(this, conf, args);
		if (job == null) {
			return -1;
		}
		job.setJarByClass(MR.class);
		FileInputFormat.addInputPath(job, new Path(strs[0]));
		FileOutputFormat.setOutputPath(job, new Path(strs[1]));
		job.setMapperClass(WcMapper.class);
		job.setInputFormatClass(TextInputFormat.class);
		// Note: unlike a sum reducer, this concatenating reducer must not be
		// reused as a combiner, or values would be joined twice.
		job.setReducerClass(IntSumReduce.class);
		job.setOutputKeyClass(Text.class);
		job.setOutputValueClass(Text.class);
		return job.waitForCompletion(true) ? 0 : 1;
	}

	public Job parseInputAndOutput(Tool tool, Configuration conf, String[] args) throws Exception {
		// Validate the argument count
		if (args.length != 2) {
			System.err.printf("Usage: %s [generic options] <input> <output>%n",
					tool.getClass().getSimpleName());
			return null;
		}
		// Create the job, named after the tool class
		return Job.getInstance(conf, tool.getClass().getSimpleName());
	}

	public static void main(String[] args) throws Exception {
		// Run the MapReduce job via ToolRunner
		int status = ToolRunner.run(new MR(), args);
		System.exit(status);
	}
}
Upload the input file; the directory structure is as follows:

hadoop@hadoop:~$ hdfs dfs -mkdir -p /user/hadoop/mr/wc/input
hadoop@hadoop:~$ hdfs dfs -put top.data /user/hadoop/mr/wc/input
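You can verify the upload before running the job:

hadoop@hadoop:~$ hdfs dfs -ls /user/hadoop/mr/wc/input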

Running the MR Program in Eclipse
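In Eclipse this is Run As > Run on Hadoop (or an ordinary Run Configuration). The program expects the two path arguments that run() reads; for example, matching the paths and the DFS master configured above (the output directory must not exist yet, or FileOutputFormat will refuse to run):

hdfs://hadoop:9000/user/hadoop/mr/wc/input hdfs://hadoop:9000/user/hadoop/mr/wc/output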




On successful execution, the job's progress is printed in the Eclipse console; then view the execution results.
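Given the sample data above, and the reducer's output format (the callee, then its callers joined by "|"), the result should look something like the following; the grouping follows directly from the code, though the caller order within a line is not guaranteed:

10010	11500001212|11500001212|
10086	11500001211|15500001213|
110	15500001214|
11500001211	15500001214|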


The plugin works without any problems.

