1. Environment
Operating system: Ubuntu 16
JDK: 1.8
Hadoop: 2.9.1
Machines: 3 (master: 192.168.199.88, node1: 192.168.199.89, node2: 192.168.199.90)
2. Setup steps
2.1 Modify the hostname. On all three machines execute the command below, then fill in master, node1, and node2 respectively:
sudo vim /etc/hostname
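The file should contain just the machine's hostname; on the first machine, for example, its entire content is:
master
The new name takes effect after a reboot.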
2.2 Modify the hosts file. On all three machines execute:
sudo vim /etc/hosts
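Add the hostname mappings from section 1 so the machines can reach each other by name:
192.168.199.88 master
192.168.199.89 node1
192.168.199.90 node2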
2.3 Modify the environment variables on all three machines in turn.
Edit /etc/profile with vim, then run source /etc/profile to make the changes take effect.
JAVA_HOME is the Java installation path. If you do not know your Java installation path, proceed as follows:
which java points to the Java executable, which is a symlink rather than the installation path; follow it with ls -lrt twice, and the final output is the installation path.
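A minimal sketch of the /etc/profile additions, assuming an Ubuntu OpenJDK 8 path and the Hadoop location used later in this guide (replace both with your own paths):
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export HADOOP_HOME=/home/wang/hadoop-2.9.1
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Tracing the installation path on Ubuntu looks like this:
which java                        # prints e.g. /usr/bin/java, a symlink
ls -lrt /usr/bin/java             # usually points to /etc/alternatives/java
ls -lrt /etc/alternatives/java    # points to the real installation directory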
2.4 Configure passwordless SSH login from master to node1 and node2.
The goal is that typing ssh node1 on master logs you into node1 directly; otherwise, when the cluster services start, master cannot connect to the nodes and reports connection refused.
2.4.1 Before configuring, make sure openssh-server is installed; it is not installed by default.
Run dpkg --list | grep ssh; if openssh-server is missing, install it with:
sudo apt-get install openssh-server
2.4.2 On each machine execute ssh-keygen -t rsa, then press Enter at each prompt.
The generated key pair is saved under ~/.ssh.
2.4.3 On master, append the public key to authorized_keys with the following command:
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
2.4.4 Copy master's authorized_keys to the other machines:
scp ~/.ssh/authorized_keys wang@node1:~/.ssh/
scp ~/.ssh/authorized_keys wang@node2:~/.ssh/
2.4.5 Test whether it succeeded.
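From master, for example:
ssh node1    # should log in with no password prompt
exit
ssh node2
exit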
2.5 Download Hadoop and modify the configuration files. Do this on master only, then copy the result to the other machines after modifying.
wget https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/stable/hadoop-2.9.1.tar.gz (the stable version I downloaded here is 2.9.1; if it has been updated, download the corresponding tar.gz package)
Extract it: tar zxf hadoop-2.9.1.tar.gz
2.6 Create the HDFS storage directories.
Enter the extracted folder: cd hadoop-2.9.1
mkdir hdfs
cd hdfs
mkdir name data tmp
./hdfs/name -- stores the NameNode files
./hdfs/data -- stores the data blocks
./hdfs/tmp -- stores temporary files
2.7 Modify the configuration files.
The files that need to be modified are under hadoop-2.9.1/etc/hadoop/.
There are six main files to modify:
hadoop-env.sh
core-site.xml
hdfs-site.xml
mapred-site.xml
yarn-site.xml
slaves
2.7.1 vim hadoop-env.sh: fill in the Java installation path.
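For example (the OpenJDK path is illustrative; use your own path from section 2.3):
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64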
2.7.2 vim core-site.xml: insert content inside the <configuration> tag, along the lines of the sketch below.
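A minimal sketch; port 9000 is the conventional HDFS port, and the tmp path assumes Hadoop lives under /home/wang as in section 2.8 (adjust both to your setup):
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://master:9000</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/wang/hadoop-2.9.1/hdfs/tmp</value>
</property>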
2.7.3 vim hdfs-site.xml: likewise inside the <configuration> tag.
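A minimal sketch pointing at the directories created in section 2.6, with replication set to 2 since there are two DataNodes (paths assume the same /home/wang layout):
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/home/wang/hadoop-2.9.1/hdfs/name</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/home/wang/hadoop-2.9.1/hdfs/data</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>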
2.7.4 vim mapred-site.xml.
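The tarball ships only mapred-site.xml.template, so copy it first: cp mapred-site.xml.template mapred-site.xml. A minimal sketch that runs MapReduce on YARN:
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>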
2.7.5 vim yarn-site.xml.
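A minimal sketch naming master as the ResourceManager and enabling the MapReduce shuffle service:
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>master</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>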
2.7.6 vim slaves: delete the localhost entry and write in the slave node hostnames.
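For this cluster the file contains exactly:
node1
node2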
2.8 Remotely copy the Hadoop folder to the slave nodes:
scp -r hadoop-2.9.1 wang@node1:/home/wang/
scp -r hadoop-2.9.1 wang@node2:/home/wang/
2.9 Start Hadoop
2.9.1 Before starting, run the format command: hadoop namenode -format
Because master is the NameNode and node1 and node2 are DataNodes, this is executed on master only.
If the output contains "successfully formatted", the format succeeded, and you will see a current folder under hdfs/name.
2.9.2 After the format succeeds, execute start-all.sh on master. The jps command shows the running processes: master should have four (NameNode, SecondaryNameNode, ResourceManager, and Jps itself) and each node three (DataNode, NodeManager, and Jps).
If not, check the configuration files from beginning to end for misspellings.