Complete steps for installing Hadoop 2.7 on CentOS 7


The general idea: prepare one master and two slave servers, configure passwordless SSH login from the master to the slaves, unpack and install the JDK, unpack and install Hadoop, and configure the master-slave relationships for HDFS and MapReduce.

1. Environment: three 64-bit CentOS 7 machines; Hadoop 2.7 requires 64-bit Linux (a quick check is shown after the list). The CentOS 7 Minimal ISO is only about 600 MB, and the operating system can be installed in about 10 minutes.
Master 192.168.0.182
Slave1 192.168.0.183
Slave2 192.168.0.184
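A quick way to confirm a machine really is 64-bit (a minimal check; it should print x86_64):
uname -m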

2. SSH passwordless login. Hadoop needs to log on to each node over SSH to manage it. I use the root user; each server generates its own key pair, and the public keys are then merged into authorized_keys.
(1) By default, CentOS does not enable passwordless SSH login. On every server, uncomment the following two lines in /etc/ssh/sshd_config (and restart sshd with systemctl restart sshd.service so the change takes effect):
RSAAuthentication yes
PubkeyAuthentication yes
(2) Enter the command ssh-keygen -t rsa to generate the key pair; do not enter a passphrase, just press Enter at every prompt. The keys are placed in the /root/.ssh folder. Do this on every server.
(3) Merge the public keys into the authorized_keys file. On the Master server, go to the /root/.ssh directory and merge them with the following commands:
cat id_rsa.pub >> authorized_keys
ssh root@192.168.0.183 cat ~/.ssh/id_rsa.pub >> authorized_keys
ssh root@192.168.0.184 cat ~/.ssh/id_rsa.pub >> authorized_keys
(4) Copy authorized_keys and known_hosts from the Master server to the /root/.ssh directory of each Slave server (for example with scp, as shown below).
(5) Done. ssh root@192.168.0.183 and ssh root@192.168.0.184 now work without a password.
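One way to copy the files in step (4) from the Master is (a sketch, assuming root and the IPs above):
scp /root/.ssh/authorized_keys /root/.ssh/known_hosts root@192.168.0.183:/root/.ssh/
scp /root/.ssh/authorized_keys /root/.ssh/known_hosts root@192.168.0.184:/root/.ssh/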

3. Install the JDK. Hadoop 2.7 requires JDK 7. Since my CentOS is a minimal installation without OpenJDK, you can simply unpack the downloaded JDK and configure the environment variables. The same JDK is needed on every node (see the note after these steps).
(1) Download "jdk-7u79-linux-x64.gz" and put it in the /home/java directory.
(2) Extract it; enter the command tar -zxvf jdk-7u79-linux-x64.gz
(3) Edit /etc/profile and add the following lines:
export JAVA_HOME=/home/java/jdk1.7.0_79
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
(4) To make the configuration take effect, enter the command source /etc/profile
(5) Enter the command java -version to verify the installation.
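The slave nodes need the same JDK at the same path. A sketch for pushing it from the Master (assuming the /home/java layout above; repeat the /etc/profile edit on each slave):
scp -r /home/java 192.168.0.183:/home/
scp -r /home/java 192.168.0.184:/home/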

4. Install Hadoop 2.7. Unpack the package on the Master server; it will be copied to the Slave servers later.
(1) Download "hadoop-2.7.0.tar.gz" and put it in the /home/hadoop directory.
(2) Extract it; enter the command tar -xzvf hadoop-2.7.0.tar.gz
(3) Create the data storage folders tmp, hdfs, hdfs/data, and hdfs/name in the /home/hadoop directory (see the commands below).
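A minimal way to create those folders (a sketch, assuming the /home/hadoop layout above):
mkdir -p /home/hadoop/tmp /home/hadoop/hdfs/data /home/hadoop/hdfs/name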

5. Configure core-site.xml under the /home/hadoop/hadoop-2.7.0/etc/hadoop directory:
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://192.168.0.182:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/home/hadoop/tmp</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>131702</value>
    </property>
</configuration>

6. Configure hdfs-site.xml under the /home/hadoop/hadoop-2.7.0/etc/hadoop directory (the name and data paths match the folders created in step 4):
<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/home/hadoop/hdfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/home/hadoop/hdfs/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>192.168.0.182:9001</value>
    </property>
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
</configuration>

7. Configure mapred-site.xml under the /home/hadoop/hadoop-2.7.0/etc/hadoop directory (copy it from mapred-site.xml.template first, since only the template ships with the release):
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>192.168.0.182:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>192.168.0.182:19888</value>
    </property>
</configuration>


8. Configure yarn-site.xml under the /home/hadoop/hadoop-2.7.0/etc/hadoop directory:
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>192.168.0.182:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>192.168.0.182:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>192.168.0.182:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>192.168.0.182:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>192.168.0.182:8088</value>
    </property>
    <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>768</value>
    </property>
</configuration>

9. Configure JAVA_HOME in hadoop-env.sh and yarn-env.sh under the /home/hadoop/hadoop-2.7.0/etc/hadoop directory; if it is not set, Hadoop cannot start:
export JAVA_HOME=/home/java/jdk1.7.0_79
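One quick way to append that line to both files (a sketch; a later assignment overrides the placeholder value already present in hadoop-env.sh):
cd /home/hadoop/hadoop-2.7.0/etc/hadoop
echo 'export JAVA_HOME=/home/java/jdk1.7.0_79' >> hadoop-env.sh
echo 'export JAVA_HOME=/home/java/jdk1.7.0_79' >> yarn-env.sh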

10. Configure the slaves file under the /home/hadoop/hadoop-2.7.0/etc/hadoop directory: delete the default localhost and add the two slave nodes:
192.168.0.183
192.168.0.184

11. Copy the configured Hadoop to the corresponding location on each slave node via scp:
scp -r /home/hadoop 192.168.0.183:/home/
scp -r /home/hadoop 192.168.0.184:/home/

12. Start Hadoop on the Master server; the slave nodes are started automatically. Enter the /home/hadoop/hadoop-2.7.0 directory.
(1) Initialize: enter the command bin/hdfs namenode -format
(2) Start everything with sbin/start-all.sh, or start the pieces separately with sbin/start-dfs.sh and sbin/start-yarn.sh
(3) To stop, enter the command sbin/stop-all.sh
(4) Enter the command jps to see the running daemons (see the example below).
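With this configuration, jps typically shows something like the following (process IDs omitted; an illustrative sketch, not captured output):
Master: NameNode, SecondaryNameNode, ResourceManager
Slaves: DataNode, NodeManager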

13. For Web access, open the required ports or simply disable the firewall (a firewalld example follows the list).
(1) Enter the command systemctl stop firewalld.service
(2) Open http://192.168.0.182:8088/ in the browser (YARN ResourceManager UI)
(3) Open http://192.168.0.182:50070/ in the browser (HDFS NameNode UI)
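If you prefer to keep firewalld running, opening just the two web ports would look roughly like this (a sketch; add any other ports your clients need, such as 9000 for HDFS):
firewall-cmd --permanent --add-port=8088/tcp
firewall-cmd --permanent --add-port=50070/tcp
firewall-cmd --reload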

14. The installation is complete. This is only the beginning of the big data application; later we will write programs against the Hadoop APIs to make real use of HDFS and MapReduce. A quick smoke test is sketched below.
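Before writing any code, you can exercise both HDFS and MapReduce with the example job bundled with the release (a sketch, run from /home/hadoop/hadoop-2.7.0; the jar name matches the 2.7.0 release):
bin/hdfs dfs -mkdir -p /input
bin/hdfs dfs -put etc/hadoop/*.xml /input
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.0.jar wordcount /input /output
bin/hdfs dfs -cat /output/part-r-00000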

