Next "Ubuntu Kylin system Installation Hadoop2.6.0"
In the previous article, Hadoop Pseudo-distributed is basically well-equipped.
The next step is to run a mapreduce program, taking WordCount as an example:
1. Create the implementation class:
cd /usr/local/hadoop
mkdir workspace
cd workspace
gedit WordCount.java
Copy and paste the following code:
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {

    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
A detailed analysis of the code is given in the next article.
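Before the full analysis, the core logic is easy to see without Hadoop: the mapper tokenizes each line of text, and the reducer sums one count per occurrence of each word. A minimal plain-Java sketch of that idea (the class name WordCountSketch and the count helper are illustrative, not part of the Hadoop code above):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

public class WordCountSketch {
    // Mirrors the two phases of the Hadoop job in ordinary Java:
    // tokenizing each line corresponds to the map step, and summing
    // a count of 1 per token corresponds to the combine/reduce step.
    static Map<String, Integer> count(String[] lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            StringTokenizer itr = new StringTokenizer(line);
            while (itr.hasMoreTokens()) {
                counts.merge(itr.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts =
            count(new String[] { "hello world", "hello hadoop" });
        System.out.println(counts.get("hello")); // prints 2
        System.out.println(counts.get("world")); // prints 1
    }
}
```

Hadoop distributes exactly this work: map tasks emit (word, 1) pairs in parallel, and reduce tasks receive all pairs for one word and add them up.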
2. Compiling
(1) Set JAVA_HOME:
export JAVA_HOME=/usr/lib/jvm/java-8u5-sun
If you have forgotten where the JDK is installed, check the current value with:
echo $JAVA_HOME
(2) Add the bin folder under the JDK directory to the PATH:
export PATH=$JAVA_HOME/bin:$PATH
(3) Add HADOOP_CLASSPATH to the environment:
export HADOOP_CLASSPATH=$JAVA_HOME/lib/tools.jar
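Note that these export lines only last for the current terminal session. As a convenience, one way to make them permanent is to append them to ~/.bashrc (a sketch; the JDK path is the one assumed earlier in this article and must match your actual installation):

```shell
# Persist the environment variables from steps (1)-(3) by appending
# them to ~/.bashrc so they apply to every new shell session.
# NOTE: the JDK path below is the one used in this article -- replace
# it with the path of the JDK actually installed on your machine.
cat >> ~/.bashrc <<'EOF'
export JAVA_HOME=/usr/lib/jvm/java-8u5-sun
export PATH=$JAVA_HOME/bin:$PATH
export HADOOP_CLASSPATH=$JAVA_HOME/lib/tools.jar
EOF
source ~/.bashrc   # reload so the current shell picks them up too
```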
Compile the WordCount.java file:
../bin/hadoop com.sun.tools.javac.Main WordCount.java
Here com.sun.tools.javac.Main is the entry point of the JDK's built-in Java compiler (it lives in tools.jar, which is why HADOOP_CLASSPATH was set above).
The command generates three class files: WordCount.class, WordCount$TokenizerMapper.class, and WordCount$IntSumReducer.class.
Package the three classes into a jar:
jar cf WordCount.jar WordCount*.class
This produces WordCount.jar.
3. Running
bin/hdfs dfs -mkdir /user
bin/hdfs dfs -mkdir /user/hadoop
Construct the input:
bin/hdfs dfs -put etc/hadoop input
Here etc/hadoop is used as sample input and can be replaced with any other files.
Run the job (the output directory must not already exist; WordCount is the main class compiled above):
bin/hadoop jar workspace/WordCount.jar WordCount input output
View the run results:
bin/hdfs dfs -cat output/*
4. Stopping Hadoop
sbin/stop-dfs.sh