Install and configure Hadoop 2.7.1 on RedHat Linux 6.5
1. Build a Linux environment
The environment I prepared is a 64-bit RedHat Linux 6.5 VM.
Set a fixed IP Address
vim /etc/sysconfig/network-scripts/ifcfg-eth0
Set the IP address to 192.168.38.128.
Modify the hostname mapping: vim /etc/hosts
Map the host name itbuilder1 to the fixed IP address.
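Putting the two steps above together, the relevant file contents look roughly like this. The netmask and gateway values are assumptions for a typical 192.168.38.0/24 VM network; adjust them to your own setup.

```shell
# /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=static
IPADDR=192.168.38.128
NETMASK=255.255.255.0      # assumption: /24 network
GATEWAY=192.168.38.2       # assumption: typical VMware NAT gateway
```

```shell
# /etc/hosts -- map the fixed IP to the hostname used throughout this guide
192.168.38.128  itbuilder1
```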
2. Install JDK
Configure JDK Environment Variables
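A minimal sketch of the JDK environment-variable setup. The JDK path is the one this tutorial uses later (/usr/java/jdk1.8.0_20); substitute the directory your JDK actually installed to. The exports are staged in a local snippet file here so the block is self-contained; in practice, append the same two lines to /etc/profile and run `source /etc/profile`.

```shell
# Assumed JDK install directory -- adjust to your own.
JDK_DIR=/usr/java/jdk1.8.0_20

# Stage the two export lines (in practice, append them to /etc/profile).
cat > jdk-profile.sh <<EOF
export JAVA_HOME=$JDK_DIR
export PATH=\$PATH:\$JAVA_HOME/bin
EOF

# Load the variables into the current shell and confirm.
. ./jdk-profile.sh
echo "JAVA_HOME is $JAVA_HOME"
```

After reloading the profile, `java -version` should print the installed JDK version.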
3. Install the Hadoop Environment
Download the Hadoop 2.7.1 core package from the Apache archive site.
Address: http://archive.apache.org/dist/hadoop/core/stable2/hadoop-2.7.1.tar.gz
3.1 Decompress the installation package to the target directory
First create the directory: mkdir /usr/local/hadoop
Extract the archive into /usr/local/hadoop: tar -zxvf hadoop-2.7.1.tar.gz -C /usr/local/hadoop
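The `-C` flag is what sends the extracted tree into the target directory. The sketch below demonstrates this on a stand-in archive so it can be run anywhere; with the real tarball, substitute hadoop-2.7.1.tar.gz and /usr/local/hadoop for the placeholder names.

```shell
# Build a tiny stand-in archive shaped like the Hadoop tarball
# (a single top-level hadoop-2.7.1/ directory).
mkdir -p demo-src/hadoop-2.7.1
echo "placeholder" > demo-src/hadoop-2.7.1/README.txt
tar -zcf hadoop-demo.tar.gz -C demo-src hadoop-2.7.1

# target-dir plays the role of /usr/local/hadoop.
mkdir -p target-dir
tar -zxvf hadoop-demo.tar.gz -C target-dir

# The archive's top-level directory now sits inside the target directory.
ls target-dir/hadoop-2.7.1
```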
3.2 Modify the configuration files
Hadoop 2.7.1 requires changes to five configuration files, listed below:
1. hadoop-env.sh
2. core-site.xml
3. hdfs-site.xml
4. mapred-site.xml (created by copying mapred-site.xml.template)
5. yarn-site.xml
All five files are in the etc directory under the Hadoop home; the full path is /usr/local/hadoop/hadoop-2.7.1/etc/hadoop/
3.2.1 Modify the environment variables (hadoop-env.sh)
Open the hadoop-env.sh file with vim.
Set JAVA_HOME to the JDK root directory, for example:
export JAVA_HOME=/usr/java/jdk1.8.0_20
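As a non-interactive alternative to editing the file in vim, the JAVA_HOME line can be rewritten with sed. The block below demonstrates the edit on a local stand-in copy of hadoop-env.sh so it is runnable anywhere; in practice, point the sed command at /usr/local/hadoop/hadoop-2.7.1/etc/hadoop/hadoop-env.sh.

```shell
# Stand-in for the shipped default line in hadoop-env.sh.
printf 'export JAVA_HOME=${JAVA_HOME}\n' > hadoop-env.sh

# Replace the default with the concrete JDK path used in this tutorial.
sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/usr/java/jdk1.8.0_20|' hadoop-env.sh

# Show the result of the edit.
cat hadoop-env.sh
```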
3.2.2 core-site.xml: specify the HDFS NameNode address and the temporary file directory
<configuration>
  <!-- Specify the address of the HDFS master (NameNode) -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://itbuilder1:9000</value>
  </property>
  <!-- Specify the directory for files generated while Hadoop is running -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/hadoop-2.7.1/tmp</value>
  </property>
</configuration>
3.2.3 hdfs-site.xml (set the number of replicas)
<configuration>
  <!-- Number of copies of each block stored in HDFS -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
3.2.4 mapred-site.xml: tell Hadoop that MapReduce jobs will run on YARN
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
3.2.5 yarn-site.xml
<configuration>
  <!-- Tell the NodeManager to use mapreduce_shuffle as its auxiliary service -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <!-- Set the ResourceManager host -->
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>itbuilder1</value>
  </property>
</configuration>
4. Add Hadoop to the environment variables
vim /etc/profile
export JAVA_HOME=/usr/java/jdk1.8.0_20
export HADOOP_HOME=/usr/local/hadoop/hadoop-2.7.1
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
# Reload /etc/profile
source /etc/profile
5. Initialize (format) the HDFS file system
# hadoop namenode -format   (deprecated)
hdfs namenode -format   (the current command; formatting can take a while)
6. Start Hadoop (HDFS and YARN)
./start-all.sh is deprecated, and it prompts repeatedly for confirmation and the Linux password unless passwordless SSH is configured. Use these two commands from $HADOOP_HOME/sbin instead:
./start-dfs.sh
./start-yarn.sh
Run the jps command to check the running Java processes:
[root@bkjia ~]# jps
3461 ResourceManager
3142 DataNode
3751 NodeManager
3016 NameNode
5034 Jps
3307 SecondaryNameNode
Access the management interfaces:
http://192.168.38.128:50070 (HDFS management interface)
http://192.168.38.128:8088 (YARN/MapReduce management interface)
If both pages load, the installation succeeded.