Hadoop 2.7.2 (hadoop2.x): using Ant to build the Eclipse plugin hadoop-eclipse-plugin-2.7.2.jar




Previously I described building a hadoop 2.7.2 cluster on CentOS 6.4 virtual machines under Ubuntu. To do MapReduce development you need Eclipse together with the matching Hadoop plugin, hadoop-eclipse-plugin-2.7.2.jar. Back in hadoop 1.x, the official Hadoop distribution shipped with an Eclipse plugin. But as the Eclipse versions used by developers multiplied and diverged, the plugin had to match each developer's particular tool, and no single plugin package could be compatible with them all. To simplify things, today's Hadoop distributions no longer include an Eclipse plugin; you have to compile it yourself against your own Eclipse.



This post shows how to build the Eclipse plugin yourself with Ant. First, my environment and tools:



OS: Ubuntu 14.04 (the OS does not really matter; Windows works too, and the steps are the same). IDE: eclipse-jee-mars-2-linux-gtk-x86_64.tar.gz









Ant (either way is fine here: a binary install or an apt-get install; then configure the environment variables)



export ANT_HOME=/usr/local/ant/apache-ant-1.9.7
export PATH=$PATH:$ANT_HOME/bin



If you get an error saying the ant-launcher.jar package cannot be found, add one more environment variable:



export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/jre/lib:$JAVA_HOME/lib/tools.jar:$ANT_HOME/lib/ant-launcher.jar


hadoop@ubuntu:~$ ant -version
Apache Ant(TM) version 1.9.7 compiled on April 9 2016
Building the Eclipse plugin with Ant also requires the hadoop2x-eclipse-plugin project, whose source is provided on GitHub:


https://github.com/winghc/hadoop2x-eclipse-plugin



Download it as a zip and unzip it to a suitable path. Note: the path's permissions and the directory owner must be the current user.
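For example (a sketch; it assumes the zip was saved to the hadoop user's home directory, and your zip file name may differ):

cd /home/hadoop
unzip hadoop2x-eclipse-plugin-master.zip
# if you unzipped as root, hand the tree back to the current user:
# sudo chown -R hadoop:hadoop hadoop2x-eclipse-plugin-master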



The paths to the three things the build needs (the plugin source, the Hadoop installation, and Eclipse) are as follows:





hadoop@ubuntu:~$ cd hadoop2x-eclipse-plugin-master
hadoop@ubuntu:hadoop2x-eclipse-plugin-master$ pwd
/home/hadoop/hadoop2x-eclipse-plugin-master
hadoop@ubuntu:hadoop2x-eclipse-plugin-master$ cd /opt/software/hadoop-2.7.2
hadoop@ubuntu:hadoop-2.7.2$ pwd
/opt/software/hadoop-2.7.2
hadoop@ubuntu:hadoop-2.7.2$ cd /home/hadoop/eclipse/
hadoop@ubuntu:eclipse$ pwd
/home/hadoop/eclipse

According to the "How to build" section of the GitHub README, the Ant procedure is:





Unzip the downloaded hadoop2x-eclipse-plugin, go into the directory hadoop2x-eclipse-plugin-master/src/contrib/eclipse-plugin/, and run the build there.





How to build
[hdpusr@demo hadoop2x-eclipse-plugin]$ cd src/contrib/eclipse-plugin
# Assume hadoop installation directory is /usr/share/hadoop
[hdpusr@demo eclipse-plugin]$ ant jar -Dversion=2.4.1 -Dhadoop.version=2.4.1 -Declipse.home=/opt/eclipse -Dhadoop.home=/usr/share/hadoop
final jar will be generated at directory
${hadoop2x-eclipse-plugin}/build/contrib/eclipse-plugin/hadoop-eclipse-plugin-2.4.1.jar





But what I need at this point is the 2.7.2 Eclipse plugin, and the hadoop2x-eclipse-plugin project as downloaded from GitHub is configured for a hadoop 2.6 build environment. So Ant's build.xml configuration file and the related files must be modified before running ant.



First file: hadoop2x-eclipse-plugin-master/src/contrib/eclipse-plugin/build.xml



At line 83, find the <target name="jar" depends="compile" unless="skip.contrib"> element, then add and modify copy sub-elements inside it.



That is, add the following below line 127:





    <copy file="${hadoop.home}/share/hadoop/common/lib/htrace-core-${htrace.version}-incubating.jar"  todir="${build.dir}/lib" verbose="true"/>
    <copy file="${hadoop.home}/share/hadoop/common/lib/servlet-api-${servlet-api.version}.jar"  todir="${build.dir}/lib" verbose="true"/>
    <copy file="${hadoop.home}/share/hadoop/common/lib/commons-io-${commons-io.version}.jar"  todir="${build.dir}/lib" verbose="true"/>





Then find the <attribute name="Bundle-ClassPath"> element and, in the comma-separated list in its value, add and modify the lib entries as follows:





 lib/servlet-api-${servlet-api.version}.jar,
 lib/commons-io-${commons-io.version}.jar,
 lib/htrace-core-${htrace.version}-incubating.jar"/>

Save and exit. Note: if you do not make these modifications, then even after the jar compiles and is placed into Eclipse, configuring the connection will report errors.
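Putting the two changes together, the relevant part of the jar target ends up looking roughly like this. This is only a sketch, abridged with "...": the surrounding elements come from the stock build.xml and may differ slightly between checkouts, so use it just to locate where the additions go.

<target name="jar" depends="compile" unless="skip.contrib">
  <mkdir dir="${build.dir}/lib"/>
  <!-- ... existing copy elements for the hadoop client jars ... -->

  <!-- the three copies added above, around line 127 -->
  <copy file="${hadoop.home}/share/hadoop/common/lib/htrace-core-${htrace.version}-incubating.jar" todir="${build.dir}/lib" verbose="true"/>
  <copy file="${hadoop.home}/share/hadoop/common/lib/servlet-api-${servlet-api.version}.jar" todir="${build.dir}/lib" verbose="true"/>
  <copy file="${hadoop.home}/share/hadoop/common/lib/commons-io-${commons-io.version}.jar" todir="${build.dir}/lib" verbose="true"/>

  <jar jarfile="${build.dir}/hadoop-${name}-${version}.jar" manifest="${root}/META-INF/MANIFEST.MF">
    <manifest>
      <attribute name="Bundle-ClassPath"
                 value="classes/,
 lib/hadoop-mapreduce-client-core-${hadoop.version}.jar,
 ... (existing lib entries) ...,
 lib/servlet-api-${servlet-api.version}.jar,
 lib/commons-io-${commons-io.version}.jar,
 lib/htrace-core-${htrace.version}-incubating.jar"/>
    </manifest>
    <fileset dir="${build.dir}" includes="classes/ lib/"/>
    <fileset dir="${root}" includes="resources/ plugin.xml"/>
  </jar>
</target>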





However, just adding and modifying these libs is not enough. Between hadoop 2.6 and hadoop 2.7, many of the jar versions under share/hadoop/common/lib/ differ, so the corresponding version numbers must be updated as well. It took me half a day to check and change them one by one.



Note that the file holding these version settings lives in the ivy directory under hadoop2x-eclipse-plugin-master, namely hadoop2x-eclipse-plugin-master/ivy/libraries.properties
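Rather than eyeballing each jar by hand, you can list the jars that ship with your Hadoop and read the version numbers straight off the file names (a quick check, using my install path):

ls /opt/software/hadoop-2.7.2/share/hadoop/common/lib/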



The final modifications are shown below






To make this easy for everyone, I have copied my whole file here; the commented-out entries (#) are the original settings that my values override.





hadoop.version=2.7.2
hadoop-gpl-compression.version=0.1.0

#These are the versions of our dependencies (in alphabetical order)
apacheant.version=1.7.0
ant-task.version=2.0.10

asm.version=3.2
aspectj.version=1.6.5
aspectj.version=1.6.11

checkstyle.version=4.2

commons-cli.version=1.2
commons-codec.version=1.4
# commons-collections.version=3.2.1
commons-collections.version=3.2.2
commons-configuration.version=1.6
commons-daemon.version=1.0.13
# commons-httpclient.version=3.0.1
commons-httpclient.version=3.1
commons-lang.version=2.6
# commons-logging.version=1.0.4
commons-logging.version=1.1.3
# commons-logging-api.version=1.0.4
commons-logging-api.version=1.1.3
# commons-math.version=2.1
commons-math.version=3.1.1
commons-el.version=1.0
commons-fileupload.version=1.2
# commons-io.version=2.1
commons-io.version=2.4
commons-net.version=3.1
core.version=3.1.1
coreplugin.version=1.3.2

# hsqldb.version=1.8.0.10
# htrace.version=3.0.4
hsqldb.version=2.0.0
htrace.version=3.1.0

ivy.version=2.1.0

jasper.version=5.5.12
jackson.version=1.9.13
#not able to figureout the version of jsp & jsp-api version to get it resolved throught ivy
# but still declared here as we are going to have a local copy from the lib folder
jsp.version=2.1
jsp-api.version=5.5.12
jsp-api-2.1.version=6.1.14
jsp-2.1.version=6.1.14
# jets3t.version=0.6.1
jets3t.version=0.9.0
jetty.version=6.1.26
jetty-util.version=6.1.26
# jersey-core.version=1.8
# jersey-json.version=1.8
# jersey-server.version=1.8
jersey-core.version=1.9
jersey-json.version=1.9
jersey-server.version=1.9
# junit.version=4.5
junit.version=4.11
jdeb.version=0.8
jdiff.version=1.0.9
json.version=1.0

kfs.version=0.1

log4j.version=1.2.17
lucene-core.version=2.3.1

mockito-all.version=1.8.5
jsch.version=0.1.42

oro.version=2.0.8

rats-lib.version=0.5.1

servlet.version=4.0.6
servlet-api.version=2.5
# slf4j-api.version=1.7.5
# slf4j-log4j12.version=1.7.5
slf4j-api.version=1.7.10
slf4j-log4j12.version=1.7.10


wagon-http.version=1.0-beta-2
xmlenc.version=0.52
# xerces.version=1.4.4
xerces.version=2.9.1

protobuf.version=2.5.0
guava.version=11.0.2
netty.version=3.6.2.Final
Once the modifications are finished, the preparation work is done and it is time to start ant.





Enter src/contrib/eclipse-plugin/ and execute the ant command, as follows:





hadoop@ubuntu:hadoop2x-eclipse-plugin-master$ cd src/contrib/eclipse-plugin/
hadoop@ubuntu:eclipse-plugin$ ls
build.properties  build.xml.bak  ivy.xml      META-INF    resources
build.xml         ivy            makePlus.sh  plugin.xml  src
hadoop@ubuntu:eclipse-plugin$ ant jar -Dhadoop.version=2.7.2 -Declipse.home=/home/hadoop/eclipse -Dhadoop.home=/opt/software/hadoop-2.7.2
This process is slow the first time and quick on later runs.





When the final output looks like the following, the Ant build succeeded:





compile:
     [echo] contrib: eclipse-plugin
    [javac] /home/hadoop/hadoop2x-eclipse-plugin-master/src/contrib/eclipse-plugin/build.xml:76: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds

jar:
      [jar] Building jar: /home/hadoop/hadoop2x-eclipse-plugin-master/build/contrib/eclipse-plugin/hadoop-eclipse-plugin-2.7.2.jar

BUILD SUCCESSFUL
Total time: 4 seconds
hadoop@ubuntu:eclipse-plugin$
Then put the plugin you just built into Eclipse's plugins directory.





Then restart Eclipse, or relaunch it from the shell as shown below to refresh it; running it from the shell also displays Eclipse's runtime output, so if an error occurs you can find the cause promptly.


hadoop@ubuntu:eclipse-plugin$ cp /home/hadoop/hadoop2x-eclipse-plugin-master/build/contrib/eclipse-plugin/hadoop-eclipse-plugin-2.7.2.jar /home/hadoop/eclipse/plugins/
hadoop@ubuntu:eclipse-plugin$ /home/hadoop/eclipse/eclipse -clean

Choose your own workspace and enter Eclipse. Click Window and select Preferences; in the list you will find a Hadoop Map/Reduce entry. Select your Hadoop installation directory there.










A DFS Locations entry (the distributed file system) appears in Eclipse's Project Explorer. Click Window --> Show View and select the MapReduce Tools view.





Open the Map/Reduce Locations window (the one with the friendly elephant icon), then choose to add a new M/R location and configure it as follows.






The location name here can be anything you like. The Map/Reduce master and DFS master, however, must be configured item by item to match the core-site.xml and mapred-site.xml of your own Hadoop cluster (or pseudo-distributed setup); an incorrect configuration will show a connection failure.



My configuration is as follows. Host is hadoop (the master node's hostname); you can also write your master node's IP address here. The port numbers are 9000 (the file system host port) and 9001 (the JobTracker host port of the MapReduce management node).
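For reference, these ports come from the cluster configuration files; in a setup like mine the relevant entries look roughly like this (a sketch only; property names vary between 1.x-style and 2.x-style configs, so check your own files):

<!-- core-site.xml -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://hadoop:9000</value>
</property>

<!-- mapred-site.xml -->
<property>
  <name>mapred.job.tracker</name>
  <value>hadoop:9001</value>
</property>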







Then start the Hadoop cluster and test it briefly in the shell. Next, test file transfer through Eclipse's DFS Locations, and also run API tests: programming against the FileSystem interface, and MapReduce. The point here is just to verify that the plugin works. Test HDFS yourself (it is very simple); here I test an MR program that does phone-call statistics. The data format is below: the left column is the calling number, the right column is the called number. The job tallies and ranks the calls each number received, and shows the callers.





11500001211 10086
11500001212 10010
15500001213 110
15500001214 120
11500001211 10010
11500001212 10010
15500001213 10086
15500001214 110
The code section is as follows







package hdfs;

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MR extends Configured implements Tool {
	
	enum Counter{
		LINESKIP,
	}

	public static class WCMapper extends Mapper<LongWritable, Text, Text, Text> {
		@Override
		protected void map(LongWritable key, Text value,Context context)
				throws IOException, InterruptedException {
			String line = value.toString();
			
			try {
				// each line is "caller callee", separated by a single space
				String[] lineSplit = line.split(" ");
				String anum = lineSplit[0]; // calling number
				String bnum = lineSplit[1]; // called number
				// emit (callee, caller) so the reducer groups all callers per callee
				context.write(new Text(bnum), new Text(anum));
			} catch (Exception e) {
				context.getCounter(Counter.LINESKIP).increment(1); // malformed line: error counter +1
				return;
			}
		}
	}

	public static class IntSumReduce extends Reducer<Text, Text, Text, Text> {
		@Override
		protected void reduce(Text key, Iterable<Text> values,Context context)
				throws IOException, InterruptedException {
			
			String valueString;
			String out = "";
			// concatenate every caller of this number, separated by "|"
			for (Text value : values) {
				valueString = value.toString();
				out += valueString + "|";
			}
			
			context.write(key, new Text(out));
		}
	}

	public int run(String[] args) throws Exception {
		Configuration conf = new Configuration();
		// strip generic options (-D and the like), keep only the remaining args
		String[] strs = new GenericOptionsParser(conf, args).getRemainingArgs();
		Job job = parseInputAndOutput(this, conf, strs);
		if (job == null) {
			return -1;
		}

		job.setJarByClass(MR.class);
		FileInputFormat.addInputPath(job, new Path(strs[0]));
		FileOutputFormat.setOutputPath(job, new Path(strs[1]));

		job.setMapperClass(WCMapper.class);
		job.setInputFormatClass(TextInputFormat.class);
		//job.setCombinerClass(IntSumReduce.class);
		job.setReducerClass(IntSumReduce.class);
		job.setOutputKeyClass(Text.class);
		job.setOutputValueClass(Text.class);
		return job.waitForCompletion(true) ? 0 : 1;
	}

	public Job parseInputAndOutput(Tool tool, Configuration conf, String[] args) throws Exception {

		// validate: expect exactly an input path and an output path
		if (args.length != 2) {
			System.err.printf("Usage: %s [generic options] <input> <output>\n", tool.getClass().getSimpleName());
			return null;
		}

		// step 2:create job
		Job job = Job.getInstance(conf, tool.getClass().getSimpleName());
		return job;
	}

	public static void main(String[] args) throws Exception {
		// run map reduce
		int status = ToolRunner.run(new MR(), args);
		// step 5 exit
		System.exit(status);
	}

}
Upload the test file; the file structure is as follows:







hadoop@ubuntu:~$ hdfs dfs -mkdir -p /user/hadoop/mr/wc/input
hadoop@ubuntu:~$ hdfs dfs -put top.data /user/hadoop/mr/wc/input
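To double-check the upload (assuming the sample records above were saved as top.data):

hdfs dfs -cat /user/hadoop/mr/wc/input/top.data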
 





Run the MR program in Eclipse.
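When launching from Eclipse, the input and output paths go in the run configuration's program arguments. A sketch of what I would pass, given the paths used above (the output directory is hypothetical and must not already exist, because FileOutputFormat refuses to write over an existing directory):

hdfs://hadoop:9000/user/hadoop/mr/wc/input hdfs://hadoop:9000/user/hadoop/mr/wc/output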















Execution succeeds, the execution steps are printed in the Eclipse console, and you can view the results.
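For the sample data above, the job's output file (part-r-00000) should look roughly like this: the key is the called number, then a tab, then the "|"-separated list of callers. The order of callers within a line is not guaranteed:

10010	11500001212|11500001211|11500001212|
10086	11500001211|15500001213|
110	15500001213|15500001214|
120	15500001214|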






This shows the plugin works without any problems.








