Running the Hadoop WordCount example from Eclipse

Source: Internet
Author: User
Tags hadoop fs

Network setup

Eclipse is installed on my Windows machine. To connect to Hadoop from Eclipse, the virtual machine's address and the Windows host's address must be on the same network segment. Setting the virtual machine's address was covered in an earlier article. To change the Windows host's IP address, open "Network and Sharing Center", click "Change adapter settings" in the left menu, and edit the IPv4 properties of the relevant network connection. My virtual machine's address is 192.168.3.137.
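To make "same network segment" concrete, here is a minimal Java sketch that checks whether two IPv4 addresses share a network prefix. The class name SubnetCheck, the Windows address 192.168.3.10 and the /24 prefix length are assumptions for illustration; only 192.168.3.137 comes from my setup.

```java
// Hypothetical helper, not part of Hadoop or Eclipse.
class SubnetCheck {
    // Convert a dotted IPv4 literal such as "192.168.3.137" to a 32-bit int.
    static int toInt(String ipv4) {
        String[] parts = ipv4.trim().split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Not an IPv4 address: " + ipv4);
        }
        int value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("Octet out of range: " + part);
            }
            value = (value << 8) | octet;
        }
        return value;
    }

    // True when both addresses have the same first prefixLength bits.
    static boolean sameSubnet(String a, String b, int prefixLength) {
        if (prefixLength < 0 || prefixLength > 32) {
            throw new IllegalArgumentException("Prefix length must be 0..32");
        }
        // A shift by 32 wraps around in Java, so prefix 0 is handled separately.
        int mask = prefixLength == 0 ? 0 : -1 << (32 - prefixLength);
        return (toInt(a) & mask) == (toInt(b) & mask);
    }

    public static void main(String[] args) {
        String vm = "192.168.3.137";          // virtual machine address from this article
        String windowsHost = "192.168.3.10";  // assumed Windows host address
        System.out.println(sameSubnet(vm, windowsHost, 24));
    }
}
```

If the check fails for your own addresses and subnet mask, change the Windows IPv4 settings as described above before going further.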


Preparatory work

After the addresses are configured, install the Hadoop plugin in Eclipse (you can also modify the plugin yourself, using its source code as a reference).

Open the plugins directory under the Eclipse installation path (\eclipse\plugins) and copy hadoop-eclipse-plugin-1.1.2.jar into it.

On Windows, create a new directory (mine is E:\hadoopMapReduceDir) and copy all the jars from the Hadoop installation on Linux into it for later use.

Configuration work

Open Eclipse and click Window -> Show View -> Other in the menu bar to open the Show View dialog.

Locate the elephant icon in the MapReduce Tools folder and dock that view at the bottom of the Eclipse window, next to the Console.

Click the elephant view, then right-click the blank area below it and select New Hadoop location.

In the configuration window that opens, set the connection information. The location name is a local label and can be anything you like. Fill in the two ports and the user name; the ports are the Hadoop default ports and must match the ones configured on the Hadoop side.

Click Window -> Preferences in the Eclipse menu bar, find Hadoop Map/Reduce, and on the right select the path from which to import the Hadoop jars. After that, every new Hadoop project loads its jars automatically from this path. Use the directory created in the preparatory work above.

Create a project

Click File -> New -> Other to open the New dialog box and create a Map/Reduce Project.

Once the project is created, you will see that the jars have been loaded into it automatically.

Copy the example Java sources from the src directory of the Hadoop installation on Linux into the project you just created. WordCount.java is among them.

Start Hadoop on Linux (use jps to check that it has started). Once Eclipse connects, the HDFS directory tree is displayed under the Hadoop location.

Modify Code

Now that the setup is basically complete, the next step is to modify WordCount.java and configure its HDFS paths.

Open WordCount.java (it may show errors).

Change the main method as follows:

If GenericOptionsParser is reported as an error, add hadoop-core-1.1.2.jar to the project (or to the jar path configured earlier).

Create a new file a.txt, enter some text, and save it.

Under the user -> hadoop directory of the Hadoop location, create an input folder and upload a.txt to it in HDFS. Do not create an output directory, or the job will fail. If an output directory already exists in HDFS, delete it on Linux with the command hadoop fs -rmr /output.

Select WordCount.java, right-click it and choose Run As -> Run Configurations, open the Arguments tab, and enter the input and output paths (note: the two paths must be separated by a space). Here I count the occurrences of each word across all files under input.

Right-click WordCount.java and choose Run As -> Run on Hadoop to execute it.

After execution finishes, refresh DFS Locations/user/hadoop.

The output folder is generated automatically; click it to see the execution results. The final result is stored in part-r-00000; double-click the file to view it.
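The space matters because the text in the Arguments tab is split on whitespace into the args array that main receives. The sketch below imitates that split in plain Java and checks that exactly two paths come out. The class name RunArguments is hypothetical, the port 9000 in the URIs is an assumption about the NameNode port (use whatever your own Hadoop location is configured with), and real launchers also honor quoting, which this sketch ignores.

```java
// Hypothetical helper that mimics how unquoted program arguments are split.
class RunArguments {
    // Split the text typed into the Arguments tab on runs of whitespace.
    static String[] split(String programArguments) {
        String trimmed = programArguments.trim();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    // Return {input, output}, or fail when the count is not exactly two.
    static String[] inputAndOutput(String programArguments) {
        String[] args = split(programArguments);
        if (args.length != 2) {
            throw new IllegalArgumentException("Usage: wordcount <in> <out>");
        }
        return args;
    }

    public static void main(String[] args) {
        // Assumed example value: host from this article, port 9000 is a guess.
        String typed = "hdfs://192.168.3.137:9000/user/hadoop/input "
                     + "hdfs://192.168.3.137:9000/user/hadoop/output";
        String[] paths = inputAndOutput(typed);
        System.out.println("input  = " + paths[0]);
        System.out.println("output = " + paths[1]);
    }
}
```

If the space is forgotten, the two paths arrive as a single argument and a check like this rejects them.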
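To give an idea of what to expect inside part-r-00000, here is a small plain-Java sketch that counts words locally, without Hadoop. It is not the Hadoop job itself: the class name LocalWordCount and the sample text are made up for illustration. It only imitates the shape of the output, assuming the stock WordCount example (whitespace-separated tokens, one "word, tab, count" line per word, sorted by word).

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Local stand-in for the WordCount job, for illustration only.
class LocalWordCount {
    // Count whitespace-separated tokens; TreeMap keeps the words sorted.
    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        StringTokenizer tokens = new StringTokenizer(text);
        while (tokens.hasMoreTokens()) {
            counts.merge(tokens.nextToken(), 1, Integer::sum);
        }
        return counts;
    }

    // Render one "word<TAB>count" line per word.
    static String format(Map<String, Integer> counts) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            out.append(entry.getKey()).append('\t')
               .append(entry.getValue()).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Made-up contents standing in for a.txt.
        String sample = "hello hadoop\nhello eclipse\n";
        System.out.print(format(count(sample)));
        // Prints:
        // eclipse	1
        // hadoop	1
        // hello	2
    }
}
```

Comparing the file in HDFS against a local count like this is a quick way to confirm the job read the input you intended.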
