First, a quick description of my setup. The cluster consists of one master node and three slave nodes:
master 192.168.1.2
slave1 192.168.1.3
slave2 192.168.1.4
slave3 192.168.1.5
Disclaimer: I built this cluster as the root superuser rather than as an ordinary user, so every command below is run as root.
One, installing the virtual machines
a) Download CentOS 7 from the CentOS website (the DVD ISO image is fine).
b) Install the four virtual machines in VMware Workstation 11; I will not repeat the specific installation steps here.
Two, pre-configuring the virtual machines
a) After the four virtual machines are installed, give each one a static IP, set its hostname, and add the IP-to-hostname mappings:
1) Modify the hostname (on the four machines, set it to master, slave1, slave2, and slave3 respectively):
vi /etc/hostname
2) Modify the IP address and make it static:
vi /etc/sysconfig/network-scripts/ifcfg-eno16777736 (the virtual machine's network interface is usually named eno16777736 by default)
Add the following content:
BOOTPROTO=static # use a static IP
ONBOOT=yes # bring the interface up at boot
IPADDR=192.168.1.2 # this node's IP; use the four addresses listed above (this one is the master)
NETMASK=255.255.255.0 # subnet mask
GATEWAY=192.168.1.1 # gateway
DNS1=8.8.8.8 # DNS servers; these two are Google's public resolvers
DNS2=8.8.4.4
3) Add the IP and hostname mappings:
vi /etc/hosts
Add the following content:
192.168.1.2 master
192.168.1.3 slave1
192.168.1.4 slave2
192.168.1.5 slave3
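This mapping step can also be scripted. A minimal sketch, using the IPs and hostnames chosen above, that prints the lines to append to /etc/hosts on every node:

```shell
# Print the /etc/hosts entries for the four nodes (IPs/hostnames from above)
hosts_entries() {
  cat <<'EOF'
192.168.1.2 master
192.168.1.3 slave1
192.168.1.4 slave2
192.168.1.5 slave3
EOF
}

hosts_entries                    # preview the lines
# hosts_entries >> /etc/hosts    # as root, append on each machine
```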
4) After completing the steps above, reboot all four virtual machines:
reboot
Three, SSH password-free login
a) As root on each virtual machine, generate an SSH key pair:
1) ssh-keygen -t rsa # press Enter at every prompt; the key pair (and its randomart image) is generated automatically
2) On master, enter the .ssh directory and append each node's public key to authorized_keys:
cd ~/.ssh/
ssh master cat ~/.ssh/id_rsa.pub >> authorized_keys # enter the root password when prompted
ssh slave1 cat ~/.ssh/id_rsa.pub >> authorized_keys # enter the root password when prompted
ssh slave2 cat ~/.ssh/id_rsa.pub >> authorized_keys # enter the root password when prompted
ssh slave3 cat ~/.ssh/id_rsa.pub >> authorized_keys # enter the root password when prompted
3) Restrict the permissions on authorized_keys so that it cannot be modified:
chmod 400 authorized_keys
4) From master's .ssh directory, copy the generated known_hosts and authorized_keys to each slave node's .ssh directory:
scp authorized_keys root@slave1:/root/.ssh/ # enter the root password when prompted
scp authorized_keys root@slave2:/root/.ssh/ # enter the root password when prompted
scp authorized_keys root@slave3:/root/.ssh/ # enter the root password when prompted
scp known_hosts root@slave1:/root/.ssh/
scp known_hosts root@slave2:/root/.ssh/
scp known_hosts root@slave3:/root/.ssh/
5) Verify SSH password-free login:
On master, run ssh slave1, ssh slave2, and ssh slave3; if none of them asks for a password, the password-free SSH configuration succeeded.
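The manual check above can be scripted as well. This sketch only prints the verification commands; remove the `echo` to actually run them. The `-o BatchMode=yes` option makes ssh fail immediately instead of falling back to a password prompt, so a broken key setup fails fast in scripts:

```shell
# Print (not run) one non-interactive login check per slave node
for host in slave1 slave2 slave3; do
  echo ssh -o BatchMode=yes "$host" hostname
done
```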
Four, installation of the JDK
Five, installation of Scala
Six, decompression and installation of Hadoop 2.6
a) Download the hadoop-2.6.0-cdh5 tarball from the Cloudera website.
b) On the master node, unpack the Hadoop tarball into /root/app/:
tar -xzvf hadoop-2.6.0-cdh5.4.0.tar.gz
Seven, configuring Hadoop for the cluster environment
On master, go to Hadoop's configuration directory:
cd hadoop-2.6.0-cdh5.4.0/etc/hadoop
a) Configure slaves:
vi slaves
Delete localhost and add the three slave nodes:
slave1
slave2
slave3
b) Configure core-site.xml:
Before this, you need to create the data/tmp directory tree in the Hadoop directory:
mkdir data
cd data
mkdir tmp
cd tmp
mkdir dfs
cd dfs
mkdir data
mkdir name
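Assuming you are inside the hadoop-2.6.0-cdh5.4.0 directory, the same tree (with dfs under tmp, matching the dfs.namenode.name.dir and dfs.datanode.data.dir paths configured below) can be created in one step with mkdir -p, which creates all missing parent directories:

```shell
# Create the directory tree the HDFS configuration expects, in one command
mkdir -p data/tmp/dfs/name data/tmp/dfs/data
find data -type d   # list what was created
```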
Then, back under hadoop-2.6.0-cdh5.4.0/etc/hadoop:
vi core-site.xml
Add the following to core-site.xml:
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:8020</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/root/app/hadoop-2.6.0-cdh5.4.0/data/tmp</value>
</property>
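Note that these property blocks must sit inside the file's single <configuration> root element; the same applies to the other *-site.xml files edited below. A complete core-site.xml would therefore look roughly like this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:8020</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/root/app/hadoop-2.6.0-cdh5.4.0/data/tmp</value>
  </property>
</configuration>
```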
c) Configure hdfs-site.xml:
vi hdfs-site.xml
Add to hdfs-site.xml:
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>master:50090</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/root/app/hadoop-2.6.0-cdh5.4.0/data/tmp/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/root/app/hadoop-2.6.0-cdh5.4.0/data/tmp/dfs/data</value>
</property>
d) Configure mapred-site.xml:
vi mapred-site.xml (if only mapred-site.xml.template exists, create the file first with: cp mapred-site.xml.template mapred-site.xml)
Add to mapred-site.xml:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
e) Configure yarn-site.xml:
vi yarn-site.xml
Add to yarn-site.xml:
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
f) After configuring Hadoop on master, copy the Hadoop directory to the same location on each slave node:
scp -r /root/app/hadoop-2.6.0-cdh5.4.0 slave1:/root/app/
scp -r /root/app/hadoop-2.6.0-cdh5.4.0 slave2:/root/app/
scp -r /root/app/hadoop-2.6.0-cdh5.4.0 slave3:/root/app/
Eight, starting Hadoop
a) In the Hadoop directory on the master node:
bin/hdfs namenode -format
sbin/start-dfs.sh
sbin/start-yarn.sh
Then run jps on the master node; you should see four processes: NameNode, SecondaryNameNode, ResourceManager, and Jps.
Run jps on each slave node; you should see three processes: DataNode, NodeManager, and Jps.
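The check can be scripted by grepping the jps output for the expected daemons. The jps output below is simulated for illustration; on a real master node, replace it with out="$(jps)":

```shell
# Simulated jps output from the master (on a real node: out="$(jps)")
out="1234 NameNode
2345 SecondaryNameNode
3456 ResourceManager
4567 Jps"

# Report each expected daemon that is present in the output
for proc in NameNode SecondaryNameNode ResourceManager; do
  echo "$out" | grep -q "$proc" && echo "$proc running"
done
```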
b) If all of the processes above appear, congratulations, your Hadoop environment is up and running!