Hadoop Distributed Cluster Setup (2.9.1)

1. Environment

Operating system: Ubuntu 16

JDK: 1.8

Hadoop: 2.9.1

Machines: 3 (master: 192.168.199.88, node1: 192.168.199.89, node2: 192.168.199.90)

2. Setup steps

  2.1 Modify the hostname. On each of the three machines, run the following command and set the contents to master, node1, or node2 respectively.

sudo vim /etc/hostname
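For example, after editing, /etc/hostname on the first machine contains only the line

master

and node1 and node2 each contain just their own name.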

    

  2.2 Modify the hosts file. Run the following on each of the three machines.

sudo vim /etc/hosts
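Using the IPs listed in section 1, the entries to add on every machine are:

192.168.199.88 master
192.168.199.89 node1
192.168.199.90 node2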

    

  2.3 Modify environment variables on each of the three machines in turn.

vim /etc/profile, then source /etc/profile to make it take effect.
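For example, assuming Hadoop will be unpacked to /home/wang/hadoop-2.9.1 (the destination used in section 2.7) and a typical Ubuntu OpenJDK 8 path for the JDK (adjust both to your actual paths), the lines to add are roughly:

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export HADOOP_HOME=/home/wang/hadoop-2.9.1
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin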

    

    JAVA_HOME is the Java installation path. If you do not know your Java installation path, refer to the following:

    which java only points to the path of the java executable (usually a symlink), not the installation path; follow the symlink with ls -lrt twice, and the final target is the installation path.
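For example (the paths below are typical for Ubuntu and may differ on your machine):

which java
# /usr/bin/java
ls -lrt /usr/bin/java
# /usr/bin/java -> /etc/alternatives/java
ls -lrt /etc/alternatives/java
# /etc/alternatives/java -> /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java
# so the installation path (JAVA_HOME) is /usr/lib/jvm/java-8-openjdk-amd64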

  2.3 Configure passwordless SSH login from master to node1 and node2

The goal is that running ssh node1 on master logs into node1 without a password; otherwise, when the cluster services start, master and the nodes cannot connect and will report connection refused.

2.3.1 Before configuring, make sure openssh-server is installed; it is not installed by default.

Run dpkg --list | grep ssh; if openssh-server is not in the output, install it with the following command:

sudo apt-get install openssh-server

2.3.2 On each machine, run ssh-keygen -t rsa and press Enter at each prompt.

The generated public key is saved under ~/.ssh

2.3.3 On master, put the public key into authorized_keys with the following command:

cat ~/.ssh/id_rsa.pub > ~/.ssh/authorized_keys

2.3.4 Copy master's authorized_keys to the other machines:

scp ~/.ssh/authorized_keys wang@node1:~/.ssh/

scp ~/.ssh/authorized_keys wang@node2:~/.ssh/

2.3.5 Test whether it works:
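For example, from master:

ssh node1
# should log in without a password prompt
exit
ssh node2
exit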

      

  2.4 Download Hadoop and modify the configuration files. Do this on master, then copy the result to the other machines after modification.

     wget https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/stable/hadoop-2.9.1.tar.gz (the stable version I downloaded here is 2.9.1; if it has been updated, download the corresponding tar.gz package instead)

Unpack it: tar zxf hadoop-2.9.1.tar.gz

  2.5 Create an HDFS storage directory

Enter the unpacked folder: cd hadoop-2.9.1

mkdir hdfs

cd hdfs

mkdir name data tmp

./hdfs/name -- stores NameNode metadata

./hdfs/data -- stores DataNode data

./hdfs/tmp -- stores temporary files

  2.6 Modify the configuration files

The files that need to be modified are under hadoop-2.9.1/etc/hadoop/.

There are 6 files to modify:

hadoop-env.sh

core-site.xml

hdfs-site.xml

mapred-site.xml

yarn-site.xml

slaves

     2.6.1 vim hadoop-env.sh and fill in the Java installation path
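The line to change is the JAVA_HOME export; for example, with the same (illustrative) JDK path as in 2.3:

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64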

      

    2.6.2 vim core-site.xml and insert the following inside the <configuration> tag
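A minimal example for this cluster (the property names are standard Hadoop 2.x; the port 9000 and the tmp path are assumptions, so adjust them to your own layout):

<property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
</property>
<property>
    <name>hadoop.tmp.dir</name>
    <value>/home/wang/hadoop-2.9.1/hdfs/tmp</value>
</property>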

2.6.3 vim hdfs-site.xml
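A minimal example (standard property names; a replication factor of 2 matches the two DataNodes, and the paths are the directories created in 2.5, assuming Hadoop lives under /home/wang):

<property>
    <name>dfs.replication</name>
    <value>2</value>
</property>
<property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/wang/hadoop-2.9.1/hdfs/name</value>
</property>
<property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/wang/hadoop-2.9.1/hdfs/data</value>
</property>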

     

2.6.4 vim mapred-site.xml (if only mapred-site.xml.template exists, copy it to mapred-site.xml first)
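A minimal example, telling MapReduce to run on YARN (a standard Hadoop 2.x property):

<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>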

     

2.6.5 vim yarn-site.xml
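A minimal example (standard property names; master as the ResourceManager hostname matches this cluster):

<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
</property>
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>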

     

2.6.6 vim slaves, delete the localhost entry and write in the slave node hostnames
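For this cluster the file contains just:

node1
node2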

     

  2.7 Remotely copy the Hadoop folder to the slave nodes

scp -r hadoop-2.9.1 wang@node1:/home/wang/

scp -r hadoop-2.9.1 wang@node2:/home/wang/

2.8 Starting Hadoop

2.8.1 Format the NameNode before starting, with the command: hadoop namenode -format

Because master is the NameNode and node1 and node2 are DataNodes, this command is executed only on master.

    If "successfully formatted" appears in the output, the format succeeded, and you will see a current folder under hdfs/name.

2.8.2 After the format succeeds, execute start-all.sh on master. The jps command can then be used to view the running processes; master should show four processes and each node three.
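The expected jps output looks roughly like this (process IDs omitted; SecondaryNameNode runs on master by default when no separate host is configured):

On master:
NameNode
SecondaryNameNode
ResourceManager
Jps

On node1 and node2:
DataNode
NodeManager
Jps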

    

    

    If not, check the configuration files from beginning to end for typos.
