Question about Hadoop-0.20.203.0 connection programming through Eclipse

Source: Internet
Author: User
For Hadoop developers, programming against the Java API is the first step in distributed development with MapReduce. Eclipse does not support MapReduce programming out of the box, so a few simple setup steps are required.

1. Install Hadoop.

Hadoop in this article runs in pseudo-distributed mode on a virtual machine. The software environment is as follows:

JDK: Sun JDK 1.6.0_30

Hadoop: hadoop-0.20.203.0

Linux: Ubuntu 11.10

Eclipse: Eclipse Indigo 3.7.1

The detailed configuration of the Hadoop pseudo-distributed environment is not described here. As for settings, fs.default.name in core-site.xml is set to hdfs://localhost:9000.
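
For reference, the setting mentioned above could be written out roughly as follows. This is a minimal sketch that writes a sample core-site.xml into a scratch directory rather than into the real Hadoop conf directory. The variable name conf_dir and the scratch location are assumptions made for illustration, and a real configuration file may contain other properties as well.

```shell
# Scratch directory for the sample file (an assumption; the real file
# lives in the conf directory of the Hadoop installation).
conf_dir=$(mktemp -d)

# Write a core-site.xml that sets fs.default.name to the value used
# in this article.
cat > "$conf_dir/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF

# Print the file so the value can be checked by eye.
cat "$conf_dir/core-site.xml"
```

The same host and port are what the Eclipse plug-in needs later when a DFS location is defined.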

 

2. Set the environment

Enter

$ sudo vim /etc/profile

Modify the running environment and add the following content at the end of the file:

export HADOOP_HOME=/home/wangyucao/hadoop-0.20.203.0 (this is the Hadoop installation directory)
export PATH=$PATH:$HADOOP_HOME/bin
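
To try the two settings without editing /etc/profile, they can be applied to the current shell session first. The sketch below assumes the installation path used in this article. The hadoop_on_path variable is a name introduced here purely for the check.

```shell
# Installation directory taken from the article; adjust to your own path.
HADOOP_HOME=/home/wangyucao/hadoop-0.20.203.0
export HADOOP_HOME

# Append the Hadoop bin directory to PATH for this session only.
PATH="$PATH:$HADOOP_HOME/bin"
export PATH

# Check that the Hadoop bin directory now appears as a PATH entry.
case ":$PATH:" in
  *":$HADOOP_HOME/bin:"*) hadoop_on_path=yes ;;
  *) hadoop_on_path=no ;;
esac
echo "hadoop bin on PATH: $hadoop_on_path"
```

Once the lines are in /etc/profile, they take effect at the next login, or immediately after running "source /etc/profile" in the current shell.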

 

3. Install Eclipse

Search for Eclipse in the Ubuntu Software Center and install it from there, or download the Eclipse archive from the official website. The file downloaded from the official website is:

eclipse-jee-indigo-SR1-linux-gtk.tar.gz

Decompress the file and move it into the /usr directory:

$ tar -zxvf eclipse-jee-indigo-SR1-linux-gtk.tar.gz

$ sudo mv eclipse /usr/

This completes the installation.

 

4. Install the hadoop-eclipse plug-in

The Hadoop release includes the hadoop-eclipse-plugin plug-in. Before developing a Hadoop application in Eclipse, you must first install this plug-in.

Online tutorials generally say to copy hadoop-eclipse-plugin-0.20.203.0.jar from HADOOP_HOME/lib directly into the plugins directory of the Eclipse installation. In my experience, however, copying the hadoop-0.20.203.0 version of this package unmodified into the Eclipse plug-in directory causes an error when connecting to DFS. The message shown is "error: failure to login", and the pop-up error box reads: An internal error occurred during: "Connecting to DFS hadoop". org/apache/commons/configuration/Configuration. The Eclipse log shows that a jar package is missing. Further research showed that the lib directory inside the unmodified hadoop-eclipse-plugin-0.20.203.0.jar lacks several required jar packages.

The installation method below, pieced together from information found online, avoids this problem:

First, modify hadoop-eclipse-plugin-0.20.203.0.jar. Opening the package with the archive manager shows that its lib directory contains only two jars, commons-cli-1.2.jar and hadoop-core.jar. Copy commons-configuration-1.6.jar, commons-httpclient-3.0.1.jar, commons-lang-2.4.jar, jackson-core-asl-1.0.1.jar and jackson-mapper-asl-1.0.1.jar from the HADOOP_HOME/lib directory into the lib directory of hadoop-eclipse-plugin-0.20.203.0.jar.
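
Before copying, it can help to confirm that the five extra jars are actually present in the Hadoop lib directory. The following is a rough sketch only: check_extra_jars, extra_jars and missing_count are hypothetical names introduced here, and the fallback path is the example installation directory from this article.

```shell
# Jars the article says to add to the plug-in's lib directory.
extra_jars="commons-configuration-1.6.jar commons-httpclient-3.0.1.jar commons-lang-2.4.jar jackson-core-asl-1.0.1.jar jackson-mapper-asl-1.0.1.jar"

# Report which of those jars exist in the directory given as $1.
# Sets missing_count to the number of jars that were not found.
check_extra_jars() {
  missing_count=0
  for j in $extra_jars; do
    if [ -f "$1/$j" ]; then
      echo "found:   $j"
    else
      echo "missing: $j"
      missing_count=$((missing_count + 1))
    fi
  done
}

# HADOOP_HOME is expected to be set as in step 2; the fallback is the
# article's example location.
check_extra_jars "${HADOOP_HOME:-/home/wangyucao/hadoop-0.20.203.0}/lib"
echo "jars missing: $missing_count"
```

If any jar is reported as missing, the version numbers in the lib directory may differ from the ones listed here, in which case the file names in the manifest would need to match the jars that are actually copied.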

Then, modify MANIFEST.MF under the META-INF directory of the package and change Bundle-ClassPath to the following:

Bundle-ClassPath: classes/,lib/hadoop-core.jar,lib/commons-cli-1.2.jar,lib/commons-httpclient-3.0.1.jar,lib/jackson-core-asl-1.0.1.jar,lib/jackson-mapper-asl-1.0.1.jar,lib/commons-configuration-1.6.jar,lib/commons-lang-2.4.jar
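
The header value is simply classes/ followed by one lib/ entry for every jar that now sits in the plug-in's lib directory. As an illustration, the value can be assembled from the list of jar names, which makes it easier to avoid typos in such a long line. The variable names plugin_jars and bundle_classpath are assumptions introduced for this sketch.

```shell
# Jar names in the order used by the Bundle-ClassPath value above.
plugin_jars="hadoop-core.jar commons-cli-1.2.jar commons-httpclient-3.0.1.jar jackson-core-asl-1.0.1.jar jackson-mapper-asl-1.0.1.jar commons-configuration-1.6.jar commons-lang-2.4.jar"

# Start with the classes/ entry, then append each jar under lib/.
bundle_classpath="classes/"
for j in $plugin_jars; do
  bundle_classpath="$bundle_classpath,lib/$j"
done

# Print the complete header line for pasting into MANIFEST.MF.
echo "Bundle-ClassPath: $bundle_classpath"
```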

This completes the modification to the hadoop-eclipse-plugin-0.20.203.0.jar.

Finally, copy the hadoop-eclipse-plugin-0.20.203.0.jar to the Eclipse plugins directory:

$ cd ~/hadoop-0.20.203.0/lib

$ sudo cp hadoop-eclipse-plugin-0.20.203.0.jar /usr/eclipse/plugins/
