Building a Windows 32-bit Eclipse Remote Hadoop Development Environment (Java)


This article assumes that the Hadoop environment is on a remote machine (such as a Linux server), and the Hadoop version is 2.5.2

Note: This article is mainly based on "Eclipse/IntelliJ IDEA Remote Debugging Hadoop 2.6.0" and adjusts that setup for the environment described here.

Since I prefer to install 32-bit software on 64-bit Win7 (such as a 32-bit JDK and 32-bit Eclipse), the operating system used in this article is 64-bit Windows 7, but all of the software is 32-bit.

Software versions:

Operating system: Win7 64-bit

Eclipse: eclipse-jee-mars-2-win32

Java: 1.8.0_77 32-bit

Hadoop: 2.5.2

I. Install Hadoop

1. On Win7, extract hadoop-2.5.2.tar.gz to a directory of your choice, such as D:\app\hadoop-2.5.2\

2. Configure Environment variables

HADOOP_HOME = D:\app\hadoop-2.5.2\
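If you would rather set this from a command prompt than through the System Properties dialog, a one-liner like the following should work (the path is an assumption matching the extraction directory above; setx only takes effect in consoles opened afterwards):

 setx HADOOP_HOME "D:\app\hadoop-2.5.2"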

II. Install the Hadoop Eclipse Plugin

1. Download hadoop-eclipse-plugin

hadoop-eclipse-plugin is an Eclipse plug-in for Hadoop that lets you browse HDFS directories and files directly from the IDE. Its source code is hosted on GitHub at https://github.com/winghc/hadoop2x-eclipse-plugin; download hadoop-eclipse-plugin-2.6.0.jar from the release folder, copy it into Eclipse's plugins directory, and restart Eclipse.

2. Download the Hadoop support files for the Windows 32-bit platform (hadoop.dll, winutils.exe)

Since our software environment is 32-bit, we need the 32-bit hadoop.dll and winutils.exe; you can find download links by searching Baidu for "hadoop.dll 32".

For example, download this one: http://xiazai.jb51.net/201607/yuanma/eclipse-hadoop(jb51.net).rar

Copy winutils.exe into the %HADOOP_HOME%\bin directory, and copy hadoop.dll into the C:\Windows\SysWOW64 directory. (Note: because our operating system is 64-bit while the software is 32-bit, the DLL has to go into SysWOW64; if your operating system is 32-bit, copy it directly into C:\Windows\System32 instead.)

3. Configure the hadoop-eclipse-plugin

Start Eclipse, open Window -> Preferences -> Hadoop Map/Reduce, and point it at the Hadoop root directory on Win7 (that is, %HADOOP_HOME%).

Switch to the Map/Reduce view:

Window -> Show View -> Other... -> Map/Reduce Locations

Then add a new location in the Map/Reduce Locations panel at the bottom.

Configure it as follows:

Location Name: any name you like.

Map/Reduce (V2) Master Host: the IP address of the Hadoop master in the virtual machine; the port next to it corresponds to the port specified by the dfs.datanode.ipc.address property in hdfs-site.xml.

DFS Master Port: corresponds to the port specified by fs.defaultFS in core-site.xml.
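For reference, here is a minimal sketch of the server-side settings these two ports come from (the host and port values are assumptions matching the example cluster in this article; check your own config files):

 <!-- core-site.xml: fs.defaultFS supplies the DFS Master port -->
 <property>
 <name>fs.defaultFS</name>
 <value>hdfs://192.168.1.6:9000</value>
 </property>

 <!-- hdfs-site.xml: dfs.datanode.ipc.address supplies the Map/Reduce (V2) Master port (50020 is the Hadoop 2.x default) -->
 <property>
 <name>dfs.datanode.ipc.address</name>
 <value>0.0.0.0:50020</value>
 </property>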

User name: the same user name that runs Hadoop in the virtual machine. I run Hadoop as the user hadoop, so I fill in hadoop here; if you installed with root, change it to root.

Once these parameters are filled in, click Finish; Eclipse now knows how to connect to Hadoop. If everything went well, you can see the HDFS directories and files in the Project Explorer panel.

You can right-click a file and choose Delete to try it out. Usually the first attempt fails with a long error message that boils down to insufficient permissions, because the current Win7 login user is not the user running Hadoop in the virtual machine. There are many solutions; for example, you could create a new hadoop administrator user on Win7, log in to Win7 as that user, and then develop in Eclipse, but that is too much hassle. The easiest way:

Add in Hdfs-site.xml

 <property>
 <name>dfs.permissions.enabled</name>
 <value>false</value>
 </property>

In short, this turns off HDFS permission checking (fine for the learning phase, but never do this in production). Finally, restart Hadoop, go back to Eclipse, and repeat the file-delete experiment; it should now succeed.
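Alternatively (a suggestion of mine, not from the original article), you can leave permission checking on and instead tell the Hadoop client which user to act as, by setting the HADOOP_USER_NAME environment variable in the Eclipse run configuration:

 HADOOP_USER_NAME=hadoop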

Note: if you cannot connect, try telnet 192.168.1.6 9000 (swap in your own Hadoop server's IP and port) to make sure the port is reachable.

If telnet fails, the problem may be the value of fs.defaultFS in core-site.xml; for example, if it is configured as localhost:9000, consider replacing localhost with the host name.
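For example (the host name master is an assumption; use your own), on the Hadoop server change core-site.xml from

 <value>hdfs://localhost:9000</value>

to

 <value>hdfs://master:9000</value>

then make sure that host name resolves from the Windows machine (for instance via an entry in C:\Windows\System32\drivers\etc\hosts) and restart Hadoop.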

III. Write the WordCount Example

1. Create a new project and choose Map/Reduce Project

Click through the wizard with the defaults, then create a new class WordCount.java with the following code:

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

  // Mapper: split each input line into tokens and emit (word, 1) for every token.
  public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer (also used as the combiner): sum up the counts for each word.
  public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length < 2) {
      System.err.println("Usage: wordcount <in> [<in>...] <out>");
      System.exit(2);
    }
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    // Every argument except the last is an input path; the last is the output path.
    for (int i = 0; i < otherArgs.length - 1; ++i) {
      FileInputFormat.addInputPath(job, new Path(otherArgs[i]));
    }
    FileOutputFormat.setOutputPath(job, new Path(otherArgs[otherArgs.length - 1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Then create a log4j.properties under the src directory with the following contents (this makes the job's log output visible when you run it):

log4j.rootLogger=INFO, stdout

#log4j.logger.org.springframework=INFO
#log4j.logger.org.apache.activemq=INFO
#log4j.logger.org.apache.activemq.spring=WARN
#log4j.logger.org.apache.activemq.store.journal=INFO
#log4j.logger.org.activeio.journal=INFO

log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{ABSOLUTE} | %-5.5p | %-16.16t | %-32.32c{1} | %-32.32C %4L | %m%n

The final directory structure puts WordCount.java and log4j.properties together under the src directory.

2. Configure running parameters

Because WordCount reads an input file to count its words and then writes the results to another folder, it needs two arguments. In the run configuration, under Program arguments, enter:

hdfs://192.168.1.6:9000/user/nub1.txt
hdfs://192.168.1.6:9000/user/output

Note that if the /user/nub1.txt file does not exist yet, upload it manually (right-click in Eclipse's DFS Locations view), and that the /user/output directory must not already exist; otherwise the job will fail because the output directory is already there.
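If you prefer the command line, the equivalent steps on the Hadoop server would look something like this (file names and paths are assumptions matching the arguments above):

 bin/hdfs dfs -put nub1.txt /user/nub1.txt
 bin/hdfs dfs -rm -r /user/output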

All right, run it.

That is the entire content of this article; I hope it helps you learn.
