Build a hadoop application development environment in Windows 7 with eclipse

Source: Internet
Author: User
I. Overview

I recently started building a cloud platform for colleges and universities. A few days ago I shared my experience installing and configuring a Hadoop cluster test environment; this article mainly describes how to build a development environment in which Eclipse 4.2 on 64-bit Windows 7 connects to a remote hadoop-1.2.0 cluster running on RedHat Linux 5.

II. Environment

1. Windows 7, 64-bit
2. Eclipse 4.2
3. RedHat Linux 5
4. hadoop-1.2.0

III. Install and configure the Hadoop cluster

See my articles:
http://blog.csdn.net/shan9liang/article/details/9841933
http://www.jialinblog.com/?p=176

IV. Install and configure the Hadoop plug-in for Eclipse

1. Compile the eclipse-hadoop plug-in.
2. Copy the compiled plug-in into the plugins directory under the Eclipse installation directory and restart Eclipse.
3. Configure:

(1) Decompress hadoop to a directory in the Windows file system.
(2) Open Eclipse and set the workspace.

Open Window --> Preferences and you will find the Hadoop Map/Reduce option. There, set the Hadoop installation directory to the directory where you decompressed hadoop in step (1). After the configuration is complete, exit the dialog.

(3) Select Window > Open Perspective > Other... and choose Map/Reduce (the entry with the elephant icon). The Map/Reduce development perspective opens, and a Map/Reduce Locations view appears in the lower right corner.

 

Click New. In the window that opens, enter:

 

Location name: an arbitrary name for this location.

Map/Reduce Master: the JobTracker address of the hadoop cluster; it must match the mapred.job.tracker setting in mapred-site.xml.

DFS Master: the address of the hadoop master (NameNode); it must match the fs.default.name setting in core-site.xml.

After setting, click Finish to apply the setting.
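For reference, the two addresses come from the cluster's own configuration files. A minimal example of the relevant entries (the host name "master" and the ports are placeholders; use the values from your cluster):

```xml
<!-- core-site.xml on the cluster: DFS Master must match fs.default.name -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://master:9000</value>
</property>

<!-- mapred-site.xml on the cluster: Map/Reduce Master must match mapred.job.tracker -->
<property>
  <name>mapred.job.tracker</name>
  <value>master:9001</value>
</property>
```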

The DFS directory is then displayed in the Project Explorer on the far left.

V. Test

Create a new project: File --> New --> Other --> Map/Reduce Project. Name the project as you like, for example hadoop_test_01. The Hadoop dependency packages are added to the project automatically.

 

You can run the WordCount example that comes with Hadoop:

```java
/**
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package com.jialin.hadoop;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {

    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length != 2) {
      System.err.println("Usage: wordcount <in> <out>");
      System.exit(2);
    }
    Job job = new Job(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
    FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Runtime parameter settings: right-click WordCount and select Run As > Run Configurations.
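Before running on the cluster, the counting logic itself can be sanity-checked outside Hadoop. The sketch below is a plain-Java mirror of the map (tokenize) and reduce (sum per word) steps; the LocalWordCount class and its sample inputs are illustrative, not part of Hadoop:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringTokenizer;

// Plain-Java sanity check mirroring the WordCount job:
// the "map" step tokenizes each input, the "reduce" step sums counts per word.
public class LocalWordCount {

    public static Map<String, Integer> count(String... inputs) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String input : inputs) {
            StringTokenizer itr = new StringTokenizer(input);
            while (itr.hasMoreTokens()) {
                // merge() stores 1 for a new word, or adds 1 to the old count
                counts.merge(itr.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Two sample "files": input1 = "hello world", input2 = "hello hadoop"
        count("hello world", "hello hadoop")
            .forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

With these two inputs the result is hello 2, world 1, hadoop 1, which is what the cluster run below should produce.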

 

The input directory contains two files, input1 and input2, containing "hello world" and "hello hadoop" respectively. Run: right-click WordCount and select Run As > Run on Hadoop, then check the file content in the output directory:

hello 2
hadoop 1
world 1

VI. Problems encountered during testing

1. Permission problem. If the user name currently logged on to Windows differs from the user name used by the hadoop cluster, the user is not authorized to access the cluster and an error is reported. For now, I have disabled permission authentication on the hadoop cluster during development; when the cluster is officially released, create a user on the server with the same username as the hadoop cluster rather than modifying the master's permission policy. For details, see my articles:
http://blog.csdn.net/shan9liang/article/details/9734693
http://www.jialinblog.com/?p=172

2. Windows 0700 problem. This one tangled me for days. I finally fixed it by patching the FileUtil class in the Hadoop source, recompiling hadoop-core-1.2.0.jar, and replacing the original jar. For details, see my articles:
http://blog.csdn.net/shan9liang/article/details/9734677
http://www.jialinblog.com/?p=174

VII. Summary

The basic hadoop cluster development environment for the college cloud platform is now in place; what remains is to enrich it on this basis. For simple tests, standalone or pseudo-distributed hadoop mode is recommended. I chose neither, because I wanted to simulate the real environment as closely as possible; choose as your needs dictate.
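As an appendix to problem 2 above: the 0700 error is raised by FileUtil.checkReturnValue in hadoop-core when permission bits cannot be set on Windows, and the widely circulated workaround patches that method to log a warning instead of throwing. The fragment below is a sketch from memory, assuming the hadoop-1.x method signature; verify it against the actual hadoop-1.2.0 source tree before recompiling:

```java
// In org/apache/hadoop/fs/FileUtil.java (hadoop-1.2.0) -- patched sketch only;
// check the real signature and field names in your source tree.
private static void checkReturnValue(boolean rv, File p,
                                     FsPermission permission) throws IOException {
  if (!rv) {
    // The original method threw IOException("Failed to set permissions of path ...")
    // here; on Windows we only warn, so local job submission can proceed.
    LOG.warn("Failed to set permissions of path: " + p
        + " to " + String.format("%04o", permission.toShort()));
  }
}
```

After editing, rebuild hadoop-core-1.2.0.jar and replace the jar used by your project, as described above.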

 
