The Beauty of Java [From Rookie to Master]: Connecting Eclipse to a Hadoop Cluster


Erqing
Personal site: zhangerqing.cn | Email: [Email protected] | Weibo: http://weibo.com/xtfggef

Preparation tools:

    • Ubuntu 14.10 Desktop (64-bit)
    • Eclipse 4.3 (Kepler), JEE edition
    • Hadoop 2.6.0
    • Hadoop Eclipse Plugin 2.6.0
At first I planned to run this experiment on Win7, but I hit a very strange NullPointerException. I searched the internet for a long time without finding a working fix; many reported problems looked similar but involved different versions, so their solutions did not necessarily apply. For convenience I switched to the Ubuntu Desktop edition, which made things much easier. Install Ubuntu 14.10 Desktop in VirtualBox, configure the JDK, and download Eclipse Kepler.

1. Copy the compiled Hadoop Eclipse plugin into the plugins directory under the Eclipse installation directory and restart Eclipse. (Source address)

2. Open Window → Preferences → Hadoop Map/Reduce and point the "Hadoop installation directory" entry on the right to the Hadoop installation directory. Note: this does not have to be the actual Hadoop installation directory on the Linux cluster; you can simply copy the Hadoop installation package to a directory on the current system and unzip it there.

3. Open Window → Open Perspective → Other → Map/Reduce; the Map/Reduce view then appears at the bottom of Eclipse.

Click the blue elephant icon in the top-right corner of that view to define a new Hadoop location.

Fill in a location name, set Host to the IP of the Hadoop cluster's master node, set the first port (Map/Reduce Master) to the port defined in mapred-site.xml, e.g. 9001, and set the second port (DFS Master) to the port defined in core-site.xml, e.g. 9000. Restart Eclipse when done.
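If the connection does not behave as expected, a quick way to check that the DFS Master address is reachable is to list the HDFS root from a small Java program. This is only an illustrative sketch: it assumes the master host is reachable as "master" and that the DFS port is 9000, matching the core-site.xml example above; adjust both to your own cluster.

    import java.net.URI;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsConnectionCheck {
        public static void main(String[] args) throws Exception {
            // Same address as the DFS Master (core-site.xml); adjust host/port to your cluster.
            String hdfsUri = "hdfs://master:9000";

            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(URI.create(hdfsUri), conf);

            // List the HDFS root; if this prints paths, the connection settings are correct.
            for (FileStatus status : fs.listStatus(new Path("/"))) {
                System.out.println(status.getPath());
            }
            fs.close();
        }
    }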


4. Open the settings page of the location created above and switch to the Advanced Parameters tab, then modify the following items:
dfs.replication: the default is 3; set it according to how many datanodes you have. If you only have 2 datanodes, write 2; if only one, write 1; if you have 3 or more, write 3.
hadoop.tmp.dir: this must match the hadoop.tmp.dir set in core-site.xml of the Hadoop cluster configured in the previous section; here we fill in /root/hadoop/tmp. Restart Eclipse afterwards.
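These Advanced Parameters are ordinary Hadoop configuration keys, which is why they must match the cluster's own settings. Purely as an illustration (the values below are the ones used in this article, not defaults), the same keys could be set programmatically like this:

    import org.apache.hadoop.conf.Configuration;

    public class ClusterSettingsExample {
        // Illustrative only: the plugin's Advanced Parameters tab edits the same Hadoop
        // configuration keys that a job could otherwise set in code or in the XML files.
        public static Configuration clusterSettings() {
            Configuration conf = new Configuration();
            conf.set("dfs.replication", "1");               // match the number of datanodes (3 at most, as above)
            conf.set("hadoop.tmp.dir", "/root/hadoop/tmp"); // must equal hadoop.tmp.dir in the cluster's core-site.xml
            return conf;
        }
    }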
5. The DFS Locations entry now appears in the Project Explorer on the left.

Right-click a folder to upload or download files or directories (a programmatic equivalent is sketched after the note below). Note: if an error like Exception in thread "main" org.apache.hadoop.security.AccessControlException: Permission denied is reported, you need to relax the DFS access permissions: edit the hdfs-site.xml file and add:
    <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
    </property>

Note: This is not recommended for production environments.
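For completeness, here is roughly what the same upload and download operations look like with the HDFS FileSystem API. The file paths below are hypothetical placeholders; adjust them and the HDFS address to your own setup.

    import java.net.URI;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsCopyExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(URI.create("hdfs://master:9000"), conf);

            // Upload a local file into HDFS (equivalent to the right-click "upload" action).
            // Placeholder paths: replace with your own local file and HDFS target.
            fs.copyFromLocalFile(new Path("/home/adam/input.txt"), new Path("/input/input.txt"));

            // Download a file from HDFS back to the local file system.
            fs.copyToLocalFile(new Path("/input/input.txt"), new Path("/home/adam/input-copy.txt"));

            fs.close();
        }
    }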
6. Run the WordCount example yourself: create a new Map/Reduce project, go to the /home/adam/hadoop/share/hadoop/mapreduce/sources directory, unzip the hadoop-mapreduce-examples-2.6.0-sources.jar package, copy the WordCount.java inside it into the new project, and open Run Configurations to set the startup parameters (the input and output paths). The output folder must not exist beforehand; it is created automatically, otherwise a "directory already exists" error is reported.
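For reference, the shipped example is essentially the classic WordCount program. The condensed sketch below is not the exact source from the jar, but it shows the shape of the code and how the two program arguments entered in Run Configurations (the input path and the output path) are consumed:

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

        // Emits (word, 1) for every token in each input line.
        public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            @Override
            protected void map(Object key, Text value, Context context)
                    throws IOException, InterruptedException {
                StringTokenizer itr = new StringTokenizer(value.toString());
                while (itr.hasMoreTokens()) {
                    word.set(itr.nextToken());
                    context.write(word, ONE);
                }
            }
        }

        // Sums the counts emitted for each word.
        public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            private final IntWritable result = new IntWritable();

            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable val : values) {
                    sum += val.get();
                }
                result.set(sum);
                context.write(key, result);
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Job job = Job.getInstance(conf, "word count");
            job.setJarByClass(WordCount.class);
            job.setMapperClass(TokenizerMapper.class);
            job.setCombinerClass(IntSumReducer.class);
            job.setReducerClass(IntSumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            // args[0] and args[1] are the values entered in Run Configurations:
            // the HDFS input directory and the (not yet existing) output directory.
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }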

Finally, choose Run As → Run on Hadoop. After execution, look in the output1 directory on HDFS to see the results.
Eclipse can now connect to the Hadoop cluster and run Map/Reduce jobs.
If you have any questions, please feel free to contact me:
Erqing
Personal site: zhangerqing.cn | Email: [Email protected] | Weibo: http://weibo.com/xtfggef
