Hadoop 2.6.0 cluster construction on CentOS 7


My setup: the cluster environment consists of one master node and three slave nodes:

master 192.168.1.2

slave1 192.168.1.3

slave2 192.168.1.4

slave3 192.168.1.5

Disclaimer: I am not building this cluster as an ordinary user; all operations below are performed as the root superuser.

One, virtual machine installation

a) Download CentOS 7 from the official CentOS website (the DVD ISO image is sufficient).

b) Install four virtual machines in VMware Workstation 11; the specific installation steps are not repeated here.

Two, virtual machine pre-configuration

a) After the four virtual machines are installed, give each one a static IP, modify its hostname, and add IP-to-hostname mappings:

1) Modify the hostname (set the four virtual machines to master, slave1, slave2, and slave3 respectively):

vi /etc/hostname

2) Modify the IP address and set it to a static IP:

vi /etc/sysconfig/network-scripts/ifcfg-eno16777736 (the virtual machine's NIC is usually ifcfg-eno16777736 by default; if yours is named differently, check the interface name with ip addr)

Add the following content:

BOOTPROTO=static # use a static IP

ONBOOT=yes # bring the interface up at boot

IPADDR=192.168.1.2 # the IP address; use the address listed above for each machine (this is master's)

NETMASK=255.255.255.0 # subnet mask

GATEWAY=192.168.1.1 # gateway

DNS1=8.8.8.8 # DNS servers; these two are Google's public resolvers

DNS2=8.8.4.4
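After saving, restart the network service so the new settings take effect (a quick check, assuming the stock CentOS 7 network service manages the interface):

systemctl restart network # re-read the ifcfg-* files

ip addr show eno16777736 # confirm the static address is assigned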

3) Add IP and host mappings:

vi /etc/hosts

Add the following content:

192.168.1.2 master

192.168.1.3 slave1

192.168.1.4 slave2

192.168.1.5 slave3
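A quick sanity check is to ping each node from master by name; the names should resolve to the addresses above:

ping -c 3 slave1

ping -c 3 slave2

ping -c 3 slave3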

4) After completing the above steps, reboot all four virtual machines:

reboot

Three, SSH password-free login

a) As root on each virtual machine, generate an SSH key pair:

1) ssh-keygen -t rsa # press Enter at each prompt; the key pair (with its randomart image) is generated automatically

2) On master, enter the .ssh directory and append each node's public key to authorized_keys:

cd ~/.ssh/

ssh master cat ~/.ssh/id_rsa.pub >> authorized_keys # requires the root password

ssh slave1 cat ~/.ssh/id_rsa.pub >> authorized_keys # requires the root password

ssh slave2 cat ~/.ssh/id_rsa.pub >> authorized_keys # requires the root password

ssh slave3 cat ~/.ssh/id_rsa.pub >> authorized_keys # requires the root password

3) Change the permissions on authorized_keys so that it cannot be modified:

chmod 400 authorized_keys

4) From master's .ssh directory, copy the generated known_hosts and authorized_keys into /root/.ssh/ on each slave node (the files only take effect there):

scp authorized_keys root@slave1:/root/.ssh/ # requires the root password

scp authorized_keys root@slave2:/root/.ssh/ # requires the root password

scp authorized_keys root@slave3:/root/.ssh/ # requires the root password

scp known_hosts root@slave1:/root/.ssh/

scp known_hosts root@slave2:/root/.ssh/

scp known_hosts root@slave3:/root/.ssh/

5) Verify SSH password-free login:

On master, run ssh slave1, ssh slave2, and ssh slave3. If no password is requested, the password-free SSH configuration succeeded.
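For example:

ssh slave1 # should log straight into slave1 with no password prompt

exit # return to master, then repeat for slave2 and slave3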

Four, JDK installation
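A minimal sketch of a typical tarball installation, assuming JDK 7; the archive name and the install path /root/app/jdk1.7.0_79 are placeholders, so substitute your own:

tar -xzvf jdk-7u79-linux-x64.tar.gz -C /root/app/ # extract the JDK (placeholder archive name)

vi /etc/profile # append the two export lines below

export JAVA_HOME=/root/app/jdk1.7.0_79 # placeholder install path

export PATH=$PATH:$JAVA_HOME/bin

source /etc/profile # apply to the current shell

java -version # verify the installation

Repeat this on every node (or scp the extracted directory and the /etc/profile lines).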

Five, Scala installation
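A minimal sketch along the same lines, assuming Scala 2.10.4; again, the version and path are placeholders:

tar -xzvf scala-2.10.4.tgz -C /root/app/ # placeholder Scala version

vi /etc/profile # append the two export lines below

export SCALA_HOME=/root/app/scala-2.10.4

export PATH=$PATH:$SCALA_HOME/bin

source /etc/profile

scala -version # verify the installation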

Six, Hadoop 2.6 extraction and installation

a) Download the hadoop-2.6.0-cdh5 package from the official Cloudera website.

b) On the master node, extract the Hadoop package into /root/app/:

tar -xzvf hadoop-2.6.0-cdh5.4.0.tar.gz
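Hadoop must also be told where the JDK lives, a step usually required even though it is not shown here: set JAVA_HOME in etc/hadoop/hadoop-env.sh (the path below is the placeholder JDK location from the JDK section):

vi /root/app/hadoop-2.6.0-cdh5.4.0/etc/hadoop/hadoop-env.sh

export JAVA_HOME=/root/app/jdk1.7.0_79 # replace with your actual JDK directory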

Seven, Hadoop cluster configuration

On master, go to Hadoop's configuration directory:

cd hadoop-2.6.0-cdh5.4.0/etc/hadoop

a) Configure slaves:

vi slaves

Delete localhost and add the three slave nodes:

slave1

slave2

slave3

b) Configure core-site.xml:

Before editing it, create the data/tmp directory tree in the Hadoop directory (these paths must match the ones referenced in core-site.xml and hdfs-site.xml below):

mkdir data

cd data

mkdir tmp

cd tmp

mkdir dfs

cd dfs

mkdir data

mkdir name
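Equivalently, the whole tree can be created in one command:

mkdir -p /root/app/hadoop-2.6.0-cdh5.4.0/data/tmp/dfs/name /root/app/hadoop-2.6.0-cdh5.4.0/data/tmp/dfs/data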

Then, back under hadoop-2.6.0-cdh5.4.0/etc/hadoop:

vi core-site.xml

Add the following inside the <configuration> element of core-site.xml:

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://master:8020</value>
</property>

<property>
  <name>hadoop.tmp.dir</name>
  <value>/root/app/hadoop-2.6.0-cdh5.4.0/data/tmp</value>
</property>

c) Configure hdfs-site.xml:

vi hdfs-site.xml

Add the following inside the <configuration> element:

<property>
  <name>dfs.namenode.secondary.http-address</name>
  <value>master:50090</value>
</property>

<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>

<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/root/app/hadoop-2.6.0-cdh5.4.0/data/tmp/dfs/name</value>
</property>

<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/root/app/hadoop-2.6.0-cdh5.4.0/data/tmp/dfs/data</value>
</property>

d) Configure mapred-site.xml:
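If mapred-site.xml does not exist yet (some Hadoop 2.6 tarballs, including the Apache one and possibly this CDH one, ship only a template), create it from the template first:

cp mapred-site.xml.template mapred-site.xml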

vi mapred-site.xml

Add the following inside the <configuration> element:

<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>

e) Configure yarn-site.xml:

vi yarn-site.xml

Add the following inside the <configuration> element:

<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>master</value>
</property>

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>

f) After configuring Hadoop on master, copy the Hadoop directory to the same location on each slave node:

scp -r /root/app/hadoop-2.6.0-cdh5.4.0 slave1:/root/app/

scp -r /root/app/hadoop-2.6.0-cdh5.4.0 slave2:/root/app/

scp -r /root/app/hadoop-2.6.0-cdh5.4.0 slave3:/root/app/

Eight, starting Hadoop

a) In the Hadoop directory on the master node, format HDFS (needed only on the first start) and bring up the daemons:

bin/hdfs namenode -format

sbin/start-dfs.sh

sbin/start-yarn.sh

Then run jps on the master node; you should see four processes: NameNode, SecondaryNameNode, ResourceManager, and Jps.

Running jps on each slave node should show three processes: DataNode, NodeManager, and Jps.
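You can also verify from a browser, assuming Hadoop 2.6's default ports: the HDFS NameNode web UI is at http://master:50070 and the YARN ResourceManager web UI is at http://master:8088.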

b) If all of the above processes appear, congratulations, your Hadoop environment is set up!
