(iii) Configuring the Hadoop 1.2.1 + Eclipse (Juno) development environment and running the WordCount program


First, Requirements

Using the Eclipse IDE for Hadoop development on Ubuntu requires installing the Hadoop development plug-in into Eclipse. The Hadoop 1.x release includes the source code for this Eclipse plug-in, so you can compile a plug-in that matches your own Eclipse version. The following sections describe in detail how to build and install the plug-in and how to configure it in Eclipse.

Second, Environment

    1. VMware Workstation 10.04
    2. Ubuntu 14.04 (32-bit)
    3. Java JDK 1.6.0
    4. Hadoop 1.2.1
    5. Eclipse Juno Service Release 2

Third, Compiling the Eclipse (Juno) Plug-in for Hadoop 1.2.1

1) Install Ant

sudo apt-get install ant
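
To confirm the installation worked, you can check the version Ant reports:

ant -version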

2) Modify the compilation configuration file

In the directory where Hadoop was unpacked, open src/contrib/eclipse-plugin/build.xml and modify the following lines:

<path id= "Hadoop-core-jar" >

<fileset dir= "${hadoop.root}/" >

<include name= "Hadoop*.jar"/>

</fileset>

</path>

<!--Override classpath to include Eclipse SDK jars--

<path id= "Classpath" >

<pathelement location= "${build.classes}"/>

<pathelement location= "${hadoop.root}/build/classes"/>

<path refid= "Eclipse-sdk-jars"/>

<path refid= "Hadoop-core-jar"/>

</path>

......

<target name= "Jar" depends= "compile" unless= "Skip.contrib" >

<mkdir dir= "${build.dir}/lib"/>

<copy file= "${hadoop.root}/hadoop-core-${version}.jar" tofile= "${build.dir}/lib/hadoop-core.jar" verbose= "true "/>

<copy file= "${hadoop.root}/lib/commons-cli-1.2.jar" todir= "${build.dir}/lib" verbose= "true"/>

<copy file= "${hadoop.root}/lib/commons-lang-2.4.jar" todir= "${build.dir}/lib" verbose= "true"/>

<copy file= "${hadoop.root}/lib/commons-configuration-1.6.jar" todir= "${build.dir}/lib" verbose= "true"/>

<copy file= "${hadoop.root}/lib/jackson-mapper-asl-1.8.8.jar" todir= "${build.dir}/lib" verbose= "true"/>

<copy file= "${hadoop.root}/lib/jackson-core-asl-1.8.8.jar" todir= "${build.dir}/lib" verbose= "true"/>

<copy file= "${hadoop.root}/lib/commons-httpclient-3.0.1.jar" todir= "${build.dir}/lib" verbose= "true"/>

<jar

Jarfile= "${build.dir}/hadoop-${name}-${version}.jar"

Manifest= "${root}/meta-inf/manifest. MF ">

<fileset dir= "${build.dir}" includes= "classes/lib/"/>

<fileset dir= "${root}" includes= "Resources/plugin.xml"/>

</jar>

</target>

Then find src/contrib/build-contrib.xml and add the following lines:

<property name="version" value="1.2.1"/>
<property name="ivy.version" value="2.1.0"/>
<property name="eclipse.home" location="..."/>

Replace the "..." with the path where Eclipse is installed on your host.
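
For example, if Eclipse were unpacked under the home directory (a hypothetical path; substitute your own), the line would look like:

<property name="eclipse.home" location="/home/binbin/eclipse"/>  <!-- hypothetical path -->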

3) Then open a terminal, change into src/contrib/eclipse-plugin, and run ant. If all goes well, the plug-in compiles.
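
For example, assuming Hadoop was unpacked to /home/binbin/hadoop-1.2.1 (an assumed path; use your own), and using the jar target shown above:

cd /home/binbin/hadoop-1.2.1/src/contrib/eclipse-plugin
ant jar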

Finally, you can find the compiled plug-in under build/contrib/eclipse-plugin in the Hadoop root directory.

4) Several points of note:

The build requires network access. If you need to go through an HTTP proxy, you can add the following target to src/contrib/build-contrib.xml (fill in your proxy's host, port, and credentials):

<target name= "proxy" >
<property name= "Proxy.host" value= ""/>
<property name= "Proxy.port" value= "/>"
<property name= "Proxy.user" value= ""/>
<property name= "Proxy.pass" value= ""/>
<setproxy proxyhost= "${proxy.host}" proxyport= "${proxy.port}"
Proxyuser= "${proxy.user}" proxypassword= "${proxy.pass}"/>
</target>

Then, in the same file, make the Ivy download target depend on the proxy target above, so the download goes through the proxy:

<target name= "Ivy-download" depends= "proxy" description= "To download Ivy" unless= "Offline" >
<get src= "${ivy_repo_url}" dest= "${ivy.jar}" usetimestamp= "true"/>
</target>

If the compiler reports a class version mismatch, make sure the Java version used for the build is at least 1.6.

Fourth, Configuring the Hadoop 1.2.1 Eclipse Development Environment

Once you have the Hadoop 1.2.1 Eclipse plug-in (a jar file), place it in the eclipse/plugins directory and restart Eclipse. One thing to note: Eclipse sometimes fails to load the plug-in. If that happens, start Eclipse with the -clean option. When the plug-in loads, a blue elephant logo appears in the upper right corner of Eclipse on startup.
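
For example, assuming Eclipse is installed at /home/binbin/eclipse (an assumed path) and the plug-in was built as in the previous section (the jar name follows the hadoop-${name}-${version}.jar pattern from build.xml):

cp build/contrib/eclipse-plugin/hadoop-eclipse-plugin-1.2.1.jar /home/binbin/eclipse/plugins/
/home/binbin/eclipse/eclipse -clean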

Fifth, Running the WordCount Program

After starting Eclipse, choose File -> New -> Project. If a Map/Reduce Project option appears, select it, click Next, enter a project name, and click Finish: the plug-in is installed successfully. If the Map/Reduce Project option appears but Next produces an error, the plug-in you are using does not work with this Eclipse.

Next, configure the Hadoop installation directory under Window -> Preferences.

Then start Hadoop, open the Map/Reduce Locations view at the bottom of Eclipse (the yellow elephant icon), right-click in the empty area below, and choose New Hadoop location.

In the dialog, the host and port of the Map/Reduce Master on the left must match the settings in conf/mapred-site.xml under your Hadoop installation directory, and the DFS Master on the right must match conf/core-site.xml. Once this is set up, you can browse and manipulate HDFS from within Eclipse.
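
For reference, a typical pseudo-distributed configuration (example values only; copy whatever your own files actually contain) would give:

conf/core-site.xml (DFS Master -> localhost, port 9000):
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:9000</value>
</property>

conf/mapred-site.xml (Map/Reduce Master -> localhost, port 9001):
<property>
  <name>mapred.job.tracker</name>
  <value>localhost:9001</value>
</property>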


Now let's try running the WordCount example.

Right-click the src folder of the Map/Reduce project you just created and choose New -> Class.


Then copy the code from WordCount.java under src/examples/org/apache/hadoop/examples in the Hadoop installation directory into the WordCount.java of your project.

Note the first line, the package declaration: it must match your project's package, so adjust or remove it. Save.

In the Documents folder on Ubuntu, create a new file named input with the content:

My name is Sun Bin Bin,what is your name?
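
You can also create it from a terminal (assuming the same /home/binbin/documents path used in the upload command below):

echo "My name is Sun Bin Bin,what is your name?" > /home/binbin/documents/input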

Then upload the input file to HDFS:

bin/hadoop fs -put /home/binbin/documents/input .

Pay attention to the trailing dot: it is the destination path, the current user's home directory in HDFS.

Once the file is uploaded to HDFS, you can see it under the myhadoop location in the DFS Locations view in Eclipse.
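
You can also verify the upload from the terminal in the Hadoop installation directory; with no path argument, this lists your HDFS home directory:

bin/hadoop fs -ls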


Then run it: right-click the WordCount.java you just created and choose Run As -> Run Configurations.

In the left pane, right-click Java Application and choose New.

On the Arguments tab, set the program arguments: the HDFS input path and the output path, separated by a space.

Make sure the output directory does not already exist in HDFS; otherwise an exception is thrown. Then click Run on Hadoop.
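
For example, with the pseudo-distributed addresses assumed earlier and the user binbin (hypothetical paths; match them to your own HDFS layout), the two program arguments would be:

hdfs://localhost:9000/user/binbin/input hdfs://localhost:9000/user/binbin/output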

When the run finishes, you can see the output under DFS Locations/myhadoop on the left (right-click and refresh), or view it from the terminal via the command line.
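
From the terminal, for example (assuming the output directory above; part-r-00000 is the usual name of the first reducer's output file):

bin/hadoop fs -cat /user/binbin/output/part-r-00000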


Because the WordCount example treats only whitespace as a word separator, "Bin,what" is counted as a single word.
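
The example's mapper splits each line with a StringTokenizer, which breaks only on whitespace. The following standalone sketch (a hypothetical TokenizeDemo class, not part of the Hadoop example) shows the difference between that behavior and splitting on punctuation as well:

import java.util.StringTokenizer;

public class TokenizeDemo {
    public static void main(String[] args) {
        String line = "My name is Sun Bin Bin,what is your name?";
        // WordCount's behavior: StringTokenizer splits on whitespace only,
        // so "Bin,what" comes out as a single token.
        StringTokenizer itr = new StringTokenizer(line);
        while (itr.hasMoreTokens()) {
            System.out.println(itr.nextToken());
        }
        // Alternative sketch: also treat punctuation as a separator.
        for (String word : line.split("\\W+")) {
            System.out.println(word);
        }
    }
}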

