Come with Me into Hadoop (1): Hadoop 2.6 Installation and Use


Pseudo-distributed

Three ways to install Hadoop:

    • Local (Standalone) Mode
    • Pseudo-distributed Mode
    • Fully-distributed Mode
Prerequisites before installation:

$ sudo apt-get install ssh
$ sudo apt-get install rsync

See: http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html

Pseudo-distributed configuration

Modify the following files:

etc/hadoop/core-site.xml:

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
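On the client side, fs.defaultFS is the address application code uses to reach HDFS. As a quick sanity check, here is a minimal sketch (not from the original article; the class name HdfsConnectCheck is made up) that connects to the address configured above:

    import java.net.URI;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    public class HdfsConnectCheck {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // If core-site.xml is not on the classpath, set the address explicitly.
            conf.set("fs.defaultFS", "hdfs://localhost:9000");
            FileSystem fs = FileSystem.get(URI.create("hdfs://localhost:9000"), conf);
            System.out.println("Connected to " + fs.getUri());
            fs.close();
        }
    }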

etc/hadoop/hdfs-site.xml:

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
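Setting dfs.replication to 1 means every HDFS block is stored as a single copy, which is the right choice for a one-node pseudo-distributed setup. A hedged sketch (class name ReplicationCheck is made up; pass the HDFS path to inspect, e.g. input) that prints the replication factor of files under a path:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ReplicationCheck {
        public static void main(String[] args) throws Exception {
            // Assumes core-site.xml is on the classpath so this reaches HDFS.
            FileSystem fs = FileSystem.get(new Configuration());
            // With dfs.replication=1 each file should report replication=1.
            for (FileStatus status : fs.listStatus(new Path(args[0]))) {
                System.out.println(status.getPath() + " replication=" + status.getReplication());
            }
            fs.close();
        }
    }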
Configuring SSH
  $ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
  $ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
If you want to run on YARN, follow the steps below:
    1. Configure parameters as follows:

      etc/hadoop/mapred-site.xml:

      <configuration>
          <property>
              <name>mapreduce.framework.name</name>
              <value>yarn</value>
          </property>
      </configuration>

      etc/hadoop/yarn-site.xml:

      <configuration>
          <property>
              <name>yarn.nodemanager.aux-services</name>
              <value>mapreduce_shuffle</value>
          </property>
      </configuration>
    2. Start ResourceManager daemon and NodeManager daemon:
        $ sbin/start-yarn.sh
    3. Browse the web interface for the ResourceManager; by default it is available at:
      • ResourceManager - http://localhost:8088/
    4. Run a MapReduce job.
    5. When you're done, stop the daemons with:
        $ sbin/stop-yarn.sh

Open http://localhost:8088/ in a browser and you can see the ResourceManager web UI.
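The same check can also be done from code. A minimal sketch (class name YarnCheck is made up; it assumes the yarn-client jar and yarn-site.xml are on the classpath) that lists the applications the ResourceManager knows about:

    import org.apache.hadoop.yarn.api.records.ApplicationReport;
    import org.apache.hadoop.yarn.client.api.YarnClient;
    import org.apache.hadoop.yarn.conf.YarnConfiguration;

    public class YarnCheck {
        public static void main(String[] args) throws Exception {
            YarnClient yarn = YarnClient.createYarnClient();
            yarn.init(new YarnConfiguration());
            yarn.start();
            // Each submitted MapReduce job shows up here, just as on the 8088 web UI.
            for (ApplicationReport app : yarn.getApplications()) {
                System.out.println(app.getApplicationId() + " " + app.getYarnApplicationState());
            }
            yarn.stop();
        }
    }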

After YARN is started, bring up HDFS:

    1. Format the filesystem:
        $ bin/hdfs namenode -format
    2. Start the NameNode daemon and DataNode daemon:
        $ sbin/start-dfs.sh

      The Hadoop daemon log output is written to the $HADOOP_LOG_DIR directory (defaults to $HADOOP_HOME/logs).

    3. Browse the web interface for the NameNode; by default it is available at:
      • NameNode - http://localhost:50070/

Visiting that address shows the NameNode status page.
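If you prefer to verify from code rather than the browser, here is a small sketch (class name DfsStatus is made up) that prints the capacity figures the NameNode UI reports:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.FsStatus;

    public class DfsStatus {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            FsStatus status = fs.getStatus();   // cluster-wide byte counts
            System.out.println("capacity=" + status.getCapacity()
                    + " used=" + status.getUsed()
                    + " remaining=" + status.getRemaining());
            fs.close();
        }
    }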

Then run a test job:

    1. Make the HDFS directories required to execute MapReduce jobs:
        $ bin/hdfs dfs -mkdir /user
        $ bin/hdfs dfs -mkdir /user/<username>
    2. Copy the input files into the distributed filesystem:
        $ bin/hdfs dfs -put etc/hadoop input
    3. Run some of the examples provided:
        $ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar grep input output 'dfs[a-z.]+'
    4. Examine the output files:

      Copy the output files from the distributed filesystem to the local filesystem and examine them:

        $ bin/hdfs dfs -get output output
        $ cat output/*

      Or

      View the output files on the distributed filesystem:

        $ bin/hdfs dfs -cat output/*
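The programmatic equivalent of bin/hdfs dfs -cat output/* looks roughly like the sketch below (class name CatOutput is made up; it assumes the job wrote to the output directory under your HDFS home):

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IOUtils;

    public class CatOutput {
        public static void main(String[] args) throws IOException {
            FileSystem fs = FileSystem.get(new Configuration());
            for (FileStatus status : fs.listStatus(new Path("output"))) {
                if (!status.isDirectory()) {
                    // Streams part-00000 etc. to stdout; the empty _SUCCESS marker prints nothing.
                    IOUtils.copyBytes(fs.open(status.getPath()), System.out, 4096, false);
                }
            }
            fs.close();
        }
    }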

Check the job's progress in the ResourceManager UI and view the results.

Once the test runs successfully, you can move on to writing your own code locally.

Using the Eclipse Hadoop 2.6 plugin

Download Source:

git clone https://github.com/winghc/hadoop2x-eclipse-plugin.git


To compile the plugin:

cd src/contrib/eclipse-plugin
ant jar -Dversion=2.6.0 -Declipse.home=/usr/local/eclipse -Dhadoop.home=/usr/local/hadoop-2.6.0   # adjust both paths to your own installation

    • Copy the compiled jar into the Eclipse plugins directory and restart Eclipse
    • Configure the Hadoop installation directory

Window -> Preferences -> Hadoop Map/Reduce -> set the Hadoop installation directory

    • Configure the Map/Reduce view

Window -> Open Perspective -> Other -> Map/Reduce, click "OK"

Window -> Show View -> Other -> Map/Reduce Locations, click "OK"

    • The console will show an additional "Map/Reduce Locations" tab

On the "Map/reduce Locations" tab, click the icon < elephant +> or right click on the blank, select "New Hadoop location ...", and the dialog box "new Hadoop locations ..." pops up, Configure the following: Change HA1 to your own Hadoop user

Note: the MR Master and DFS Master settings must be consistent with configuration files such as mapred-site.xml and core-site.xml.

Open Project Explorer to view the HDFS file system.

    • Create a new Map/Reduce project

File -> New -> Project -> Map/Reduce Project -> Next

Write the WordCount class (remember to get the services up first):

package com.zongtui;

import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;

/**
 * ClassName: WordCount <br/>
 * Function: TODO ADD FUNCTION. <br/>
 * Date: Jun, 5:34:18 AM <br/>
 *
 * @author zhangfeng
 * @version
 * @since JDK 1.7
 */
public class WordCount {

    public static class Map extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value,
                OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {
            // Split each input line into tokens and emit a (word, 1) pair per token.
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                output.collect(word, one);
            }
        }
    }

    public static class Reduce extends MapReduceBase
            implements Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterator<IntWritable> values,
                OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {
            // Sum the counts emitted for each word.
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next().get();
            }
            output.collect(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(WordCount.class);
        conf.setJobName("wordcount");

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        conf.setMapperClass(Map.class);
        conf.setReducerClass(Reduce.class);

        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}

/user/admin123/input/hadoop is the HDFS folder (which you create yourself) into which you upload the files to be processed; output1 holds the results.

Run the program on the Hadoop cluster: right-click the project -> Run As -> Run on Hadoop. The final output appears in the corresponding folder in HDFS. At this point, the Ubuntu hadoop-2.6.0 Eclipse plugin configuration is complete.

An exception you may encounter:

Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://localhost:9000/output already exists
    at org.apache.hadoop.mapred.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:132)
    at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:564)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:432)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1296)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1293)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1293)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:833)
    at com.zongtui.WordCount.main(WordCount.java:83)

Two ways to fix it:

    1. Change the output path, or
    2. Delete the existing output directory and run the job again.
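Fix 2 can also be done in code. A hedged sketch (class name CleanOutput is made up): delete the stale directory before resubmitting; the same fs.exists/fs.delete pair can be placed in WordCount.main() right before JobClient.runJob(conf):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class CleanOutput {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            Path output = new Path(args[0]);    // e.g. "output"
            if (fs.exists(output)) {
                fs.delete(output, true);        // true = delete recursively
                System.out.println("Deleted " + output);
            }
            fs.close();
        }
    }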

After the run finishes, view the results in the output directory.
