- Hadoop Modes
- Pre-install Setup
- Creating a user
- SSH Setup
- Installing Java
- Install Hadoop
- Install in Standalone Mode
- Install in Pseudo distributed Mode
- Hadoop Setup
- Hadoop Configuration
- YARN Configuration
This section configures a Linux-based Hadoop environment.
Hadoop Modes
Hadoop supports three modes:
- Local/Standalone Mode: the default mode. Hadoop runs as a single Java process.
- Pseudo Distributed Mode: simulates a distributed cluster on a single machine. Each Hadoop daemon (HDFS, YARN, MapReduce, etc.) runs as a separate Java process.
- Fully Distributed Mode: two or more machines form a cluster for true distribution.
Pre-install Setup
Creating a user
It is recommended to create a separate user for Hadoop and give that user ownership of the Hadoop directory:
$ su
password:
# useradd hadoop
# passwd hadoop
New passwd:
# chown -R hadoop /usr/hadoop
SSH Setup
$ ssh-keygen -t rsa
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys
After this, you can SSH to localhost from the current shell without entering a password.
$ ssh localhost
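If you rerun the key setup, appending the public key again leaves duplicate entries in the authorized keys file. The following is a minimal sketch of an idempotent version of that step; the function name authorize_key is hypothetical and not part of OpenSSH.

```shell
# Hypothetical helper: append a public key to an authorized-keys file only if
# it is not already listed, then restrict the file's permissions.
authorize_key() {
  pub_key_file=$1   # e.g. ~/.ssh/id_rsa.pub
  auth_file=$2      # e.g. ~/.ssh/authorized_keys
  mkdir -p "$(dirname "$auth_file")"
  touch "$auth_file"
  # -F: fixed strings, -x: whole-line match, -f: read the patterns from a file
  if ! grep -qxFf "$pub_key_file" "$auth_file"; then
    cat "$pub_key_file" >> "$auth_file"
  fi
  chmod 0600 "$auth_file"
}
```

A possible call is authorize_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys. Afterwards, ssh -o BatchMode=yes localhost true should succeed without prompting; BatchMode makes ssh fail instead of asking for a password.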
Installing Java
$ java -version
If this command prints the Java version, Java is already installed. If not, install Java first.
- Step 1: Download the JDK archive (jdk-*u**-os-x64.tar.gz).
- Step 2: Switch to the folder containing the download and extract it.
$ cd Downloads/
$ tar zxf jdk-7u71-linux-x64.gz
$ ls
jdk1.7.0_71 jdk-7u71-linux-x64.gz
- Step 3: To make Java available to all users, move it to "/usr/local" or another location of your choice.
$ su
password:
# mv jdk1.7.0_71 /usr/local/
- Step 4: Add the following to ~/.bashrc, then reload it:
export JAVA_HOME=/usr/local/jdk1.7.0_71
export PATH=$PATH:$JAVA_HOME/bin
source ~/.bashrc
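A mistyped JAVA_HOME is easy to miss until Hadoop fails to start. As a quick sanity check, here is a small sketch that tests whether a directory looks like a JDK home; the function name is_jdk_home is hypothetical.

```shell
# Hypothetical helper: succeed only if the given directory looks like a JDK
# home, i.e. it contains executable bin/java and bin/javac.
is_jdk_home() {
  [ -x "$1/bin/java" ] && [ -x "$1/bin/javac" ]
}
```

For example: is_jdk_home "$JAVA_HOME" || echo "JAVA_HOME does not point at a JDK: $JAVA_HOME" >&2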
- Step 5: To make Java versions easier to manage, register Java with the alternatives system (on Ubuntu the command is update-alternatives). The paths below assume the JDK is reachable at /usr/local/java; substitute your actual JDK directory, such as /usr/local/jdk1.7.0_71.
# alternatives --install /usr/bin/java java /usr/local/java/bin/java 2
# alternatives --install /usr/bin/javac javac /usr/local/java/bin/javac 2
# alternatives --install /usr/bin/jar jar /usr/local/java/bin/jar 2
# alternatives --set java /usr/local/java/bin/java
# alternatives --set javac /usr/local/java/bin/javac
# alternatives --set jar /usr/local/java/bin/jar
Install Hadoop
Find the version you need, download Hadoop, and extract it. I downloaded hadoop-2.7.1.
# cd /usr/local
# wget http://apache.claz.org/hadoop/common/hadoop-2.7.1/hadoop-2.7.1.tar.gz
# tar xzf hadoop-2.7.1.tar.gz
# chmod -R 777 /usr/local/hadoop-2.7.1
Install in Standalone Mode
No daemons run in this mode; everything runs in a single JVM.
Add the following lines to ~/.bashrc:
export HADOOP_HOME=/usr/local/hadoop-2.7.1
export PATH=$HADOOP_HOME/bin:$PATH
source ~/.bashrc
Then check to see if Hadoop works correctly:
$ hadoop version
If the installation succeeds, a result similar to the following is displayed (this is my output):
Hadoop 2.7.1
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 15ecc87ccf4a0228f35af08fc56de536e6ce657a
Compiled by jenkins on 2015-06-29T06:04Z
Compiled with protoc 2.5.0
From source with checksum fc0a1a23fc1868e4d5ee7fa2b28a58a
This command was run using /usr/local/hadoop-2.7.1/share/hadoop/common/hadoop-common-2.7.1.jar
Let's do a test
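Now let's run a small WordCount example with Hadoop in standalone mode.
$ cd $HADOOP_HOME
$ mkdir input
$ cp $HADOOP_HOME/*.txt input # copy some text files to count, here the *.txt files shipped with Hadoop
$ hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount input output # check the jar file name for the version you installed
$ cat output/*
If all goes well, you should see how many times each word occurs in the input files.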
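To cross-check the job's result without Hadoop, the same count can be approximated with standard Unix tools. This is only a sketch: it assumes words are separated by whitespace, which should match the example job closely but is not guaranteed to be byte-for-byte identical. The function name local_wordcount is hypothetical.

```shell
# Count whitespace-separated words across all files in a directory and print
# "word<TAB>count" lines sorted by word.
local_wordcount() {
  cat "$1"/* |
    tr -s '[:space:]' '\n' |      # put one word on each line
    grep -v '^$' |                # drop any empty line left at the start
    LC_ALL=C sort |               # byte-order sort so equal words are adjacent
    uniq -c |                     # prefix each distinct word with its count
    awk '{ print $2 "\t" $1 }'    # reorder to word<TAB>count
}
```

Running local_wordcount input from $HADOOP_HOME and comparing it with the output of cat output/* is a quick way to confirm that the job did what you expected.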
Install in Pseudo Distributed Mode
Hadoop Setup
Add the following lines to ~/.bashrc:
export HADOOP_HOME=/usr/local/hadoop # adjust to your install directory, e.g. /usr/local/hadoop-2.7.1
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_LOG_DIR=$HADOOP_HOME/logs # this is the default location; you can customize it
source ~/.bashrc
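With this many variables, a typo in one export is easy to overlook. The sketch below lists any variable from a given set that is still unset or empty in the current shell; the function name unset_vars is hypothetical.

```shell
# Print the name of every listed variable that is unset or empty.
unset_vars() {
  for name in "$@"; do
    # Indirect lookup: copy the value of the variable called $name into value
    eval "value=\${$name:-}"
    [ -n "$value" ] || echo "$name"
  done
}
```

For example: unset_vars HADOOP_HOME HADOOP_MAPRED_HOME HADOOP_COMMON_HOME HADOOP_HDFS_HOME YARN_HOME HADOOP_LOG_DIR. No output means every listed variable has a value.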
Hadoop Configuration
Modify $HADOOP_HOME/etc/hadoop/hadoop-env.sh to set the JAVA_HOME environment variable (use the path of your own JDK):
#export JAVA_HOME=${JAVA_HOME}
export JAVA_HOME=/opt/jdk1.8.0_81
Modify the $HADOOP_HOME/etc/hadoop/core-site.xml file:
<configuration>
  <property>
    <name>fs.defaultFS</name> <!-- can also be written as fs.default.name, the older deprecated key -->
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
Modify the $HADOOP_HOME/etc/hadoop/hdfs-site.xml file:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <!-- Note the three slashes after "file:". I typed two, and the NameNode would not start. -->
    <value>file:///home/hadoop/hadoopinfra/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>file:///home/hadoop/hadoopinfra/hdfs/datanode</value>
  </property>
</configuration>
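Before formatting HDFS, it can help to confirm that the values you think you set are the ones actually in the site files. The following is a rough sketch, not a real XML parser: the function name site_prop is hypothetical, and it assumes each <name> and <value> element sits on its own line or on the same line within one property.

```shell
# Hypothetical helper: print the <value> that follows <name>KEY</name> in a
# Hadoop *-site.xml file. Prints nothing if the key is not present.
site_prop() {
  awk -v key="$2" '
    /<name>/ {
      n = $0
      sub(/.*<name>[ \t]*/, "", n)       # strip everything up to the name
      sub(/[ \t]*<\/name>.*/, "", n)     # strip the closing tag and the rest
      found = (n == key)
    }
    /<value>/ && found {
      v = $0
      sub(/.*<value>[ \t]*/, "", v)
      sub(/[ \t]*<\/value>.*/, "", v)
      print v
      exit
    }
  ' "$1"
}
```

For example: site_prop $HADOOP_HOME/etc/hadoop/hdfs-site.xml dfs.replication. Once Hadoop is on the PATH, hdfs getconf -confKey dfs.replication reports the effective value, including defaults, and is the more reliable check.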
Format the HDFS file system:
$ hdfs namenode -format
Start the NameNode and DataNode daemons:
$ start-dfs.sh
$ jps # check that the daemons started
169799 SecondaryNameNode
34918 Nailgun
169311 NameNode
169483 DataNode
$ stop-dfs.sh
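Reading the jps listing by eye works, but the check can also be scripted. The sketch below reads jps-style output (PID, then class name) on standard input and prints each expected daemon that is absent; the function name missing_daemons is hypothetical.

```shell
# Read "PID Name" lines on stdin and print each expected daemon name (given
# as arguments) that does not appear in the second column.
missing_daemons() {
  out=$(cat)
  for d in "$@"; do
    printf '%s\n' "$out" |
      awk -v d="$d" '$2 == d { ok = 1 } END { exit !ok }' || echo "$d"
  done
}
```

For example, after start-dfs.sh: jps | missing_daemons NameNode DataNode SecondaryNameNode. Empty output means all three daemons are running.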
Note: If you downloaded the hadoop-*.tar.gz binary package as described in this post and your machine is 64-bit, you will see a warning like the following:
WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
This is because the native Hadoop library in the installation package, $HADOOP_HOME/lib/native/libhadoop.so.1.0.0, was compiled on a 32-bit machine. You can handle this in one of two ways:
1. Ignore it. It is only a warning and does not affect Hadoop's functionality.
2. If you are worried that it may cause instability, download the Hadoop source package hadoop-*-src.tar.gz and recompile it.
Before stopping the daemons, you can view the NameNode in a browser; its default port is 50070. The default web port for the DataNode in Hadoop 2.x is 50075 (50030 was the JobTracker port in Hadoop 1.x).
+ http://localhost:50070
+ http://localhost:50075
If a daemon does not start properly, check the corresponding log file in the HADOOP_LOG_DIR directory to debug the problem.
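When a daemon fails, the relevant lines are usually the last few errors in its log. Here is a small sketch for pulling them out; the function name recent_log_errors is hypothetical, and it assumes the default log layout in which the level (ERROR or FATAL) appears as a space-delimited word on each line.

```shell
# Show the most recent ERROR/FATAL lines from the *.log files in a directory.
recent_log_errors() {
  log_dir=$1
  max_lines=${2:-20}   # how many lines to show, 20 if not given
  # -h omits file names so the output is just the log lines themselves
  grep -hE ' (ERROR|FATAL) ' "$log_dir"/*.log 2>/dev/null | tail -n "$max_lines"
}
```

For example: recent_log_errors "$HADOOP_LOG_DIR" 5. Empty output means no matching lines were found, or the directory contains no .log files.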
YARN Configuration
Modify the $HADOOP_HOME/etc/hadoop/mapred-site.xml file (if it does not exist, create it by copying mapred-site.xml.template from the same directory):
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
Modify the $HADOOP_HOME/etc/hadoop/yarn-site.xml file:
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
$ start-yarn.sh
$ jps # check that the daemons started
152423 JournalNode
173170 Jps
34918 Nailgun
172778 ResourceManager
172956 NodeManager
$ stop-yarn.sh
The ResourceManager web UI uses port 8088 by default. Before stopping YARN, you can view it in a browser:
+ http://localhost:8088
Alternatively, use the start-all.sh and stop-all.sh scripts to manage all the daemons at once:
$ start-all.sh
$ stop-all.sh
Copyright notice: This is an original article by the blog author and may not be reproduced without the author's permission.
Hadoop-Setup and configuration