Hadoop-Setup and configuration


    • Hadoop Modes
    • Pre-install Setup
      • Creating a user
      • SSH Setup
      • Installing Java
    • Install Hadoop
      • Install in Standalone Mode
        • Let's do a test
      • Install in Pseudo distributed Mode
        • Hadoop Setup
        • Hadoop Configuration
        • YARN Configuration

This section configures a Linux-based Hadoop environment.

Hadoop Modes

Hadoop supports three modes:

    • Local/Standalone Mode: the default; Hadoop runs as a single Java process.
    • Pseudo-Distributed Mode: simulates a distributed deployment on a single machine. Each Hadoop daemon (HDFS, YARN, MapReduce, etc.) runs as a separate Java process.
    • Fully Distributed Mode: two or more machines form a cluster for true distribution.
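The mode is determined by configuration, not by a separate install. As a rough sketch (the sample config below is illustrative; in practice you would point it at your own $HADOOP_HOME/etc/hadoop/core-site.xml), the value of fs.defaultFS is what distinguishes the modes:

```shell
# Sketch: classify the Hadoop mode from the fs.defaultFS value in core-site.xml.
# A sample file is generated here so the snippet is self-contained.
conf=$(mktemp)
cat > "$conf" <<'EOF'
<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://localhost:9000</value></property>
</configuration>
EOF

classify_mode() {
  # crude extraction of the <value> element; fine for a single-property file
  fs=$(sed -n 's:.*<value>\(.*\)</value>.*:\1:p' "$1")
  case "$fs" in
    hdfs://localhost*|hdfs://127.0.0.1*) echo pseudo-distributed ;;
    hdfs://*)                            echo fully-distributed ;;
    *)                                   echo standalone ;;
  esac
}

classify_mode "$conf"   # -> pseudo-distributed
```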
Pre-install Setup

Creating a User

It is recommended to create a separate user for Hadoop and to adjust the directory permissions accordingly:

$ su
  password:
# useradd hadoop
# passwd hadoop
  New passwd:
  Retype new passwd:
# chown -R hadoop /usr/hadoop
SSH Setup
$ ssh-keygen -t rsa
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys

The current shell can now SSH to localhost without prompting for a password:

$ ssh localhost
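If you still get a password prompt, a common cause is that sshd silently ignores key files whose permissions are too loose. A small check, assuming GNU stat (on BSD/macOS the flag would be `stat -f '%Lp'` instead):

```shell
# Print a file's permission bits in octal (GNU stat syntax)
perm_of() { stat -c '%a' "$1"; }

# sshd expects roughly: ~/.ssh -> 700, ~/.ssh/authorized_keys -> 600
check_ssh_perms() {
  [ "$(perm_of "$HOME/.ssh")" = "700" ] || echo "~/.ssh should be 700"
  [ "$(perm_of "$HOME/.ssh/authorized_keys")" = "600" ] || echo "authorized_keys should be 600"
}

# demo on a throwaway file so the snippet runs anywhere
demo=$(mktemp); chmod 600 "$demo"
perm_of "$demo"   # -> 600
```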
Installing Java
$ java -version

If this command correctly prints the Java version, Java is already installed; if not, be sure to install Java first.

    • Step 1: Download the Java JDK (jdk-*u**-os-x64.tar.gz) here.
    • Step 2: Switch to the folder containing the download and unpack it.
$ cd Downloads/
$ tar zxf jdk-7u71-linux-x64.tar.gz
$ ls
jdk1.7.0_71  jdk-7u71-linux-x64.tar.gz
    • Step 3: To make Java available to all users, move it to /usr/local (or wherever else you want to install it).
$ su
  password:
# mv jdk1.7.0_71 /usr/local/
    • Step 4
      Add the following to ~/.bashrc:
export JAVA_HOME=/usr/local/jdk1.7.0_71
export PATH=$PATH:$JAVA_HOME/bin

$ source ~/.bashrc
    • Step 5
      To simplify management, register Java with the system's alternatives mechanism (on Ubuntu the command is update-alternatives):
# alternatives --install /usr/bin/java java /usr/local/jdk1.7.0_71/bin/java 2
# alternatives --install /usr/bin/javac javac /usr/local/jdk1.7.0_71/bin/javac 2
# alternatives --install /usr/bin/jar jar /usr/local/jdk1.7.0_71/bin/jar 2
# alternatives --set java /usr/local/jdk1.7.0_71/bin/java
# alternatives --set javac /usr/local/jdk1.7.0_71/bin/javac
# alternatives --set jar /usr/local/jdk1.7.0_71/bin/jar
Install Hadoop

Find the version you need here, download Hadoop, and unpack it. I downloaded hadoop-2.7.1.

# cd /usr/local
# wget http://apache.claz.org/hadoop/common/hadoop-2.7.1/hadoop-2.7.1.tar.gz
# tar xzf hadoop-2.7.1.tar.gz
# chmod -R 777 /usr/local/hadoop-2.7.1
Install in Standalone Mode

There are no daemons in this mode; everything runs in a single JVM.
Add the following to ~/.bashrc:

export HADOOP_HOME=/usr/local/hadoop-2.7.1
export PATH=$HADOOP_HOME/bin:$PATH

$ source ~/.bashrc

Then check to see if Hadoop works correctly:

$ hadoop version

If the installation succeeds, a result similar to the following is displayed (this is my output):

Hadoop 2.7.1
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 15ecc87ccf4a0228f35af08fc56de536e6ce657a
Compiled by jenkins on 2015-06-29T06:04Z
Compiled with protoc 2.5.0
From source with checksum fc0a1a23fc1868e4d5ee7fa2b28a58a
This command was run using /usr/local/hadoop-2.7.1/share/hadoop/common/hadoop-common-2.7.1.jar
Let's do a test

Now, let's run a small WordCount test with Hadoop in standalone mode.

$ cd $HADOOP_HOME
$ mkdir input
$ cp etc/hadoop/*.xml input   # give wordcount something to count
$ hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount input output   # check the jar file name for the version you installed
$ cat output/*

Then, barring any problems, you should see a count for each word in the input files.
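If you want a feel for what that output looks like before running the job, the same "word TAB count" shape can be emulated in plain shell on a sample file. This is only an illustration, not a substitute for the MapReduce job:

```shell
# Emulate wordcount's "word<TAB>count" output on a small sample file
dir=$(mktemp -d)
printf 'hello world\nhello hadoop\n' > "$dir/sample.txt"

# count whitespace-separated words, print one "word<TAB>count" line each
awk '{ for (i = 1; i <= NF; i++) count[$i]++ }
     END { for (w in count) printf "%s\t%d\n", w, count[w] }' "$dir/sample.txt" | sort
# -> hadoop  1
#    hello   2
#    world   1
```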

Install in Pseudo-Distributed Mode

Hadoop Setup

Add the following to ~/.bashrc:

export HADOOP_HOME=/usr/local/hadoop
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_LOG_DIR=$HADOOP_HOME/logs   # this is the default location; you can also customize it

$ source ~/.bashrc
Hadoop Configuration

Modify $HADOOP_HOME/etc/hadoop/hadoop-env.sh to set the JAVA_HOME environment variable:

#export JAVA_HOME=${JAVA_HOME}
export JAVA_HOME=/opt/jdk1.8.0_81

Modify the $HADOOP_HOME/etc/hadoop/core-site.xml file:

<configuration>
    <property>
        <name>fs.defaultFS</name>  <!-- can also be written as fs.default.name -->
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>

Modify the $HADOOP_HOME/etc/hadoop/hdfs-site.xml file:

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.name.dir</name>
        <!-- note the three slashes after "file:"; I typed two, and the NameNode would not start -->
        <value>file:///home/hadoop/hadoopinfra/hdfs/namenode</value>
    </property>
    <property>
        <name>dfs.data.dir</name>
        <value>file:///home/hadoop/hadoopinfra/hdfs/datanode</value>
    </property>
</configuration>
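The directories named in dfs.name.dir and dfs.data.dir must exist and be writable by the hadoop user before formatting. A sketch, shown against a throwaway base directory so it is safe to try as-is; in a real setup the base would be /home/hadoop/hadoopinfra:

```shell
# Create the HDFS storage directories referenced by hdfs-site.xml.
# base is a temp dir here purely for illustration.
base=$(mktemp -d)
mkdir -p "$base/hdfs/namenode" "$base/hdfs/datanode"
# chown -R hadoop "$base"   # needed when running these commands as root
ls "$base/hdfs"             # lists: datanode namenode
```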

Format HDFS File system

$ hdfs namenode -format

Start the NameNode and DataNode daemons:

$ start-dfs.sh
$ jps   # check whether everything started normally
169799 SecondaryNameNode
34918  Nailgun
169311 NameNode
169483 DataNode
$ stop-dfs.sh

Note: if you installed from the binary hadoop-*.tar.gz package as described in this post, and your machine is 64-bit, you will see a warning like the following:

WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
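One quick way to confirm whether the bundled library really is 32-bit is to read the ELF class byte directly. A minimal sketch; the libhadoop path follows the text below, so adjust HADOOP_HOME to your own layout:

```shell
# Byte 5 of an ELF file (EI_CLASS, at offset 4) is 1 for 32-bit and 2 for 64-bit.
check_elf_bits() {
  case "$(od -An -j4 -N1 -tu1 "$1" | tr -d ' ')" in
    1) echo 32 ;;
    2) echo 64 ;;
    *) echo unknown ;;
  esac
}

lib="${HADOOP_HOME:-/usr/local/hadoop-2.7.1}/lib/native/libhadoop.so.1.0.0"
[ -f "$lib" ] && check_elf_bits "$lib" || echo "library not found (adjust HADOOP_HOME)"
```

A 32 here on a 64-bit kernel explains the warning; a 64 means the library should load.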

This happens because $HADOOP_HOME/lib/native/libhadoop.so.1.0.0, the native Hadoop library shipped in the package, was compiled on a 32-bit machine. You can either:
1. Ignore it: it is only a warning and does not affect Hadoop's functionality.
2. If you are worried it will cause instability, download the Hadoop source package hadoop-*-src.tar.gz and recompile.
For details, refer to "Link 1" "Link 2".

Before stopping the daemons, you can view their web UIs in a browser. The NameNode's default port is 50070; the DataNode's is 50075:
+ http://localhost:50070
+ http://localhost:50075

If the daemons do not start properly, check the corresponding log files in the $HADOOP_LOG_DIR directory for debugging.
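The relevant error is usually near the end of the most recently written log. A small helper for finding it (relies on `ls -t` ordering files by modification time):

```shell
# Print the most recently modified .log file under a log directory
newest_log() { ls -t "$1"/*.log 2>/dev/null | head -n 1; }

# usage, with the variable set earlier in this post:
# tail -n 50 "$(newest_log "$HADOOP_LOG_DIR")"
```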

YARN Configuration

Modify the $HADOOP_HOME/etc/hadoop/mapred-site.xml file (if it does not exist, copy it from mapred-site.xml.template):

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

Modify the $HADOOP_HOME/etc/hadoop/yarn-site.xml file:

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
$ start-yarn.sh
$ jps   # check whether everything started normally
152423 JournalNode
173170 Jps
34918  Nailgun
172778 ResourceManager
172956 NodeManager
$ stop-yarn.sh

The ResourceManager's default web port is 8088; before stopping YARN you can view it in a browser:
+ http://localhost:8088

Alternatively, you can use the start-all.sh / stop-all.sh scripts to manage all the daemons at once:

$ start-all.sh
$ stop-all.sh

Copyright notice: this is the blogger's original article; please do not reproduce it without the blogger's permission.
