Pseudo-distributed
Three ways to install Hadoop:
- Local (Standalone) Mode
- Pseudo-distributed Mode
- Fully-distributed Mode
Required before installation
$ sudo apt-get install ssh
$ sudo apt-get install rsync
See: http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html
Pseudo-distributed configuration
Modify the following configuration files:
etc/hadoop/core-site.xml:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
etc/hadoop/hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
Configuring SSH
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
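You can verify that passphraseless SSH works by logging in to localhost; it should not prompt for a password:
$ ssh localhost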
If you want to run MapReduce jobs on YARN, you need to follow the steps below:
- Configure parameters as follows:
etc/hadoop/mapred-site.xml:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
etc/hadoop/yarn-site.xml:
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
- Start ResourceManager daemon and NodeManager daemon:
$ sbin/start-yarn.sh
- Browse the web interface for the ResourceManager; by default it is available at:
- ResourceManager: http://localhost:8088/
- Run a MapReduce job.
- When you're done, stop the daemons with:
$ sbin/stop-yarn.sh
After you start YARN, enter http://localhost:8088/ in a browser and you can see the ResourceManager web interface.
- Format the filesystem:
$ bin/hdfs namenode -format
- Start NameNode daemon and DataNode daemon:
$ sbin/start-dfs.sh
The Hadoop daemon log output is written to the $HADOOP_LOG_DIR directory (defaults to $HADOOP_HOME/logs).
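To confirm the daemons actually started, you can check with jps (the JDK's process lister). With both HDFS and YARN running, the output should look roughly like the following; the process IDs will differ on your machine:
$ jps
3088 NameNode
3215 DataNode
3406 SecondaryNameNode
3571 ResourceManager
3702 NodeManager
3850 Jps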
- Browse the web interface for the NameNode; by default it is available at:
- NameNode: http://localhost:50070/
Entering that URL should bring up the NameNode web interface. Then run a test:
- Make the HDFS directories required to execute MapReduce jobs:
$ bin/hdfs dfs -mkdir /user
$ bin/hdfs dfs -mkdir /user/<username>
- Copy the input files into the distributed filesystem:
$ bin/hdfs dfs -put etc/hadoop input
- Run some of the examples provided:
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar grep input output 'dfs[a-z.]+'
- Examine the output files:
Copy the output files from the distributed filesystem to the local filesystem and examine them:
$ bin/hdfs dfs -get output output
$ cat output/*
Or
View the output files on the distributed filesystem:
$ bin/hdfs dfs -cat output/*
Check the job's progress and view the results. Once the test runs successfully, you can start writing your own code locally.
Using the Eclipse plugin with Hadoop 2.6
Download the plugin source:
git clone https://github.com/winghc/hadoop2x-eclipse-plugin.git
To compile the plugin:
cd src/contrib/eclipse-plugin
ant jar -Dversion=2.6.0 -Declipse.home=/usr/local/eclipse -Dhadoop.home=/usr/local/hadoop-2.6.0   # adjust the paths to your own installation
- Copy the compiled jar into the Eclipse plugins directory and restart Eclipse
- Configure the Hadoop installation directory
Window -> Preferences -> Hadoop Map/Reduce -> Hadoop installation directory
- Configure the Map/Reduce view
Window -> Open Perspective -> Other -> Map/Reduce, click "OK"
Window -> Show View -> Other -> Map/Reduce Locations, click "OK"
- The console area will now show an extra tab, "Map/Reduce Locations"
On the "Map/reduce Locations" tab, click the icon < elephant +> or right click on the blank, select "New Hadoop location ...", and the dialog box "new Hadoop locations ..." pops up, Configure the following: Change HA1 to your own Hadoop user
Note: The MR Master and DFS Master configurations must be consistent with configuration files such as mapred-site.xml and core-site.xml.
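As a rough sketch of what the location dialog might contain (field names vary slightly by plugin version, and the Map/Reduce Master port 9001 below is only an assumption to adjust for your own setup):
Location name:      my-hadoop          (any name you like)
Map/Reduce Master:  Host: localhost    Port: 9001   (assumed; match your MapReduce settings)
DFS Master:         Host: localhost    Port: 9000   (must match fs.defaultFS in core-site.xml)
User name:          your Hadoop user   (HA1 in the screenshot)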
Open the Project Explorer to view the HDFS file system.
File -> New -> Project -> Map/Reduce Project -> Next
Write the WordCount class (remember to start the Hadoop services first):
package com.zongtui;

import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;

/**
 * ClassName: WordCount <br/>
 * Function: TODO ADD FUNCTION. <br/>
 *
 * @author zhangfeng
 * @version
 * @since JDK 1.7
 */
public class WordCount {

    public static class Map extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value,
                OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                output.collect(word, one);
            }
        }
    }

    public static class Reduce extends MapReduceBase
            implements Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterator<IntWritable> values,
                OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next().get();
            }
            output.collect(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(WordCount.class);
        conf.setJobName("WordCount");

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        conf.setMapperClass(Map.class);
        conf.setReducerClass(Reduce.class);

        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}
user/admin123/input/hadoop is the HDFS folder (which you create yourself) into which you upload the files you want to process; output1 receives the output results.
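When running the job from Eclipse, the input and output paths go in as the two program arguments (args[0] and args[1] in WordCount's main). Assuming the folders above, one plausible argument line in Run Configurations -> Arguments would be (the exact URIs depend on your fs.defaultFS and user name):
hdfs://localhost:9000/user/admin123/input/hadoop hdfs://localhost:9000/user/admin123/output1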
Run the program on the Hadoop cluster: right-click --> Run As --> Run on Hadoop. The final output will appear in the corresponding folder in HDFS. At this point, the Ubuntu hadoop-2.6.0 Eclipse plugin configuration is complete.
An exception you may encounter:
Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://localhost:9000/output already exists
    at org.apache.hadoop.mapred.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:132)
    at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:564)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:432)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1296)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1293)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1293)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:833)
    at com.zongtui.WordCount.main(WordCount.java:83)
1. Change the output path, or
2. Delete the existing output directory and run the job again.
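For option 2, removing the stale output directory from the command line (using the /output path shown in the exception; adjust if yours differs) would look like:
$ bin/hdfs dfs -rm -r /output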
After the run is finished, look at the results:
Follow me through Hadoop (1): hadoop2.6 installation and use