Connecting MyEclipse to a Hadoop Cluster: Programming and Problem Solving

I originally thought that setting up a local environment for writing and testing Hadoop programs would be simple. It turned out to involve a lot of trouble, so I am sharing the steps and the problems I ran into here; I hope it goes smoothly for everyone.

I. Connecting to a Hadoop cluster and coding against it requires the following preparation:

1. A remote Hadoop cluster (my master's address is 192.168.85.2)

2. A local MyEclipse installation plus the MyEclipse plug-in for connecting to Hadoop

3. A local copy of Hadoop (I am using hadoop-2.7.2)

First download the hadoop-eclipse-plugin; I used hadoop-eclipse-plugin-2.6.0.jar. Place it in the "MyEclipse Professional 2014\dropins" directory. After restarting MyEclipse, you will find a Map/Reduce option among the perspectives and views.


Switch to the Map/Reduce perspective, then open the MapReduce Tools view.


II. Next, add the Hadoop server and start configuring the connection. To do so, you need to consult the cluster's Hadoop configuration files:

1. In hadoop/etc/hadoop/mapred-site.xml, look up the IP and port in mapred.job.tracker to fill in the Map/Reduce Master.

2. In hadoop/etc/hadoop/core-site.xml, look up the IP and port in fs.default.name to fill in the DFS Master.

3. For the user name, simply enter the user that operates Hadoop. (A sketch of the two properties follows this list.)
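For reference, here is a minimal sketch of the two properties the plug-in asks about. The DFS Master address matches the hdfs://192.168.85.2:9000 URI used later in this article; the Map/Reduce Master port 9001 is only an assumed example, since mapred.job.tracker is a Hadoop 1.x property that a 2.x/YARN cluster may not set at all — in that case, enter values consistent with your cluster.

core-site.xml (DFS Master):

<property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.85.2:9000</value>
</property>

mapred-site.xml (Map/Reduce Master):

<property>
    <name>mapred.job.tracker</name>
    <value>192.168.85.2:9001</value>
</property>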


That completes the configuration; if everything went well, you should see the cluster's HDFS tree in the DFS Locations view.


Create a new Hadoop project.

File → New → Project... → Map/Reduce Project → Next; Project Name: wordcount; click "Configure Hadoop install directory..." and set Hadoop installation directory: D:\nlsoftware\hadoop\hadoop-2.7.2 → Apply → OK → Next → check "Allow output folders for source folders" → Finish.

Three classes were created under the project: the mapper, the reducer, and the main class.

TestMapper

package bb;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class TestMapper extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    // Tokenize each input line and emit (word, 1) for every token.
    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        StringTokenizer itr = new StringTokenizer(value.toString());
        while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, one);
        }
    }
}

TestReducer

package bb;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class TestReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

    private IntWritable result = new IntWritable();

    // Sum all counts for a word and emit (word, total).
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable val : values) {
            sum += val.get();
        }
        result.set(sum);
        context.write(key, result);
    }
}

WordCount

package bb;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: wordcount <in> <out>");
            System.exit(2);
        }
        Job job = new Job(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TestMapper.class);
        // Summing is associative, so the reducer can double as a combiner.
        job.setCombinerClass(TestReducer.class);
        job.setReducerClass(TestReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

I had already created two text files under /input in HDFS, which can be used for the test (any other files work as well). So my program arguments were:
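In case the input still needs to be created, here is a minimal sketch using the HDFS shell, assuming the hdfs command is on the PATH and file1.txt / file2.txt are hypothetical local test files:

# create the input directory and upload two local test files
hdfs dfs -mkdir -p /input
hdfs dfs -put file1.txt file2.txt /input/

# verify the upload
hdfs dfs -ls /input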


hdfs://192.168.85.2:9000/input/* hdfs://192.168.85.2:9000/output6

-Xms512m -Xmx1024m -XX:MaxPermSize=256m

A brief explanation: the two program arguments are the input files and the output directory. Make sure the input path is correct; the name output6 was picked casually, and the directory is created automatically (it must not exist beforehand, or the job fails). The second line contains JVM options and belongs in the VM arguments box of the run configuration.

Then came the final and most critical step. When I chose Run on Hadoop, I got the

Server IPC version 9 cannot communicate with client version 4

error. This message signals a version mismatch: on inspection, the remote Hadoop version differed from the version of the jars in my project. The remote cluster is 2.7.2, so I replaced the project's Hadoop jars with that version (any 2.x version should work; if not, use the closest one).
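If the project is managed with Maven rather than by copying jars (an alternative to the manual step above, not what the original setup used), pinning hadoop-client to the cluster's version avoids this mismatch:

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.7.2</version>
</dependency>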

Then the next error appeared.

Some searching revealed that this was because Hadoop on Windows lacks winutils.exe; Hadoop's native code path on Windows calls this program. So download the winutils.exe built for 2.7 and put it in the bin directory of the local Hadoop installation.

After that download, it turned out the hadoop.dll file is required as well. Stunned. Downloaded once more and placed it in the C:\windows\System32 directory.
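As a side note, here is a small sketch of a code-level alternative: setting hadoop.home.dir before any Hadoop class is loaded tells the Windows client where to look for winutils.exe, without editing environment variables. The path below is the install directory used earlier in this article, and the assumption is that winutils.exe sits in its bin subdirectory.

import org.apache.hadoop.conf.Configuration;

public class WinutilsCheck {
    public static void main(String[] args) {
        // winutils.exe is expected at %hadoop.home.dir%\bin\winutils.exe
        System.setProperty("hadoop.home.dir", "D:\\nlsoftware\\hadoop\\hadoop-2.7.2");
        Configuration conf = new Configuration();
        System.out.println("Hadoop configuration loaded: " + conf);
    }
}

The same System.setProperty line can also simply be placed at the top of main in the WordCount driver above.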

However, my winutils.exe still would not start. This turned out to be a problem with my own machine, but I suspect others will run into it too, so briefly:

The error was a missing msvcr120.dll. After downloading that, the next prompt was "the application was unable to start correctly (0xc000007b)".

That error is usually caused by mixing 32-bit and 64-bit libraries. Downloading the DirectX_repair tool and repairing DirectX finally solved the problem, and the Hadoop program started successfully.

Some readers may get winutils.exe to start and still be unable to run the application because of errors; in that case, try disabling permission validation.

Modify hadoop/etc/hadoop/hdfs-site.xml on the cluster and add the following:

<property>
    <name>dfs.permissions</name>
    <value>false</value>
</property>

This cancels permission validation. (In Hadoop 2.x the preferred property name is dfs.permissions.enabled; the old name is still accepted.) Restart HDFS for the change to take effect.
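If disabling permission checking cluster-wide is not an option, a commonly used client-side alternative is to run as the user that owns the HDFS paths. This is a sketch, not part of the original walkthrough; "hadoop" is an assumed account name.

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class UserOverrideSketch {
    public static void main(String[] args) throws Exception {
        // With no Kerberos login, Hadoop's UserGroupInformation honors
        // HADOOP_USER_NAME (as an environment variable or system property).
        System.setProperty("HADOOP_USER_NAME", "hadoop");
        FileSystem fs = FileSystem.get(new URI("hdfs://192.168.85.2:9000"), new Configuration());
        System.out.println(fs.exists(new Path("/input")));
    }
}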
