Use the terminal to run a Hadoop program in Ubuntu

Source: Internet
Author: User
Tags: hdfs dfs

Next "Ubuntu Kylin system Installation Hadoop2.6.0"

In the previous article, a pseudo-distributed Hadoop installation was set up.

The next step is to run a MapReduce program, taking WordCount as an example:

1. Create the implementation class

cd /usr/local/hadoop
mkdir workspace
cd workspace
gedit WordCount.java

Copy and paste the following code:

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {

    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "Word Count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

A detailed analysis of this code is left to the next article.

2. Compiling

(1) Set JAVA_HOME:

export JAVA_HOME=/usr/lib/jvm/java-8u5-sun

If you have forgotten where your JDK is installed, you can check the current value with:

echo $JAVA_HOME

(2) Add the JDK's bin directory to PATH:

export PATH=$JAVA_HOME/bin:$PATH

(3) Set the HADOOP_CLASSPATH environment variable:

export HADOOP_CLASSPATH=$JAVA_HOME/lib/tools.jar
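
These exports only last for the current shell session. If you want them to persist, one option (a sketch, reusing the same JDK path as above; adjust it to your actual installation) is to append them to ~/.bashrc:

echo 'export JAVA_HOME=/usr/lib/jvm/java-8u5-sun' >> ~/.bashrc
echo 'export PATH=$JAVA_HOME/bin:$PATH' >> ~/.bashrc
echo 'export HADOOP_CLASSPATH=$JAVA_HOME/lib/tools.jar' >> ~/.bashrc
source ~/.bashrc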

Compile the WordCount.java file (from inside the workspace directory):

../bin/hadoop com.sun.tools.javac.Main WordCount.java

Here com.sun.tools.javac.Main is the entry point of the Java compiler shipped in tools.jar, which is why HADOOP_CLASSPATH was pointed at $JAVA_HOME/lib/tools.jar above; the hadoop command runs it with all the Hadoop jars already on the classpath.
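
If you prefer calling javac directly, an equivalent approach (a sketch, assuming you are still inside /usr/local/hadoop/workspace) is to put the Hadoop jars on the compile classpath yourself using the hadoop classpath command:

javac -classpath "$(../bin/hadoop classpath)" WordCount.java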

The command above generates three class files: WordCount.class, WordCount$TokenizerMapper.class, and WordCount$IntSumReducer.class.

Package these class files into a .jar file:

jar cf wordcount.jar WordCount*.class

This generates wordcount.jar.
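
Optionally, you can confirm what was packaged by listing the jar's contents:

jar tf wordcount.jar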

3. Running

Change back to the Hadoop installation directory (/usr/local/hadoop); the following commands use paths relative to it. First create the HDFS user directory:

bin/hdfs dfs -mkdir /user
bin/hdfs dfs -mkdir /user/hadoop

Upload the input data to HDFS:

bin/hdfs dfs -put etc/hadoop /input

Here etc/hadoop (the Hadoop configuration directory) is only sample input and can be replaced with any other files. The job then has to be submitted, as sketched below.
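
The original walkthrough does not show the command that actually submits the job. A minimal sketch, assuming the jar built above is in /usr/local/hadoop/workspace, the input was uploaded to /input as above, and the output goes to /output (the path read back below); note that /output must not exist before the run:

bin/hadoop jar workspace/wordcount.jar WordCount /input /output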

View the run results:

bin/hdfs dfs -cat /output/*
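
You can also copy the result out of HDFS and read it locally (standard hdfs dfs usage, not shown in the original article):

bin/hdfs dfs -get /output output
cat output/*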

4. Stop Hadoop

sbin/stop-dfs.sh

  
