Ubuntu 14.04: Building a Hadoop Cluster (Distributed) Environment


The virtual machines used in this article already have a pseudo-distributed configuration in place; that setup is not repeated here, please refer to my earlier post: http://www.cnblogs.com/VeryGoodVeryGood/p/8507795.html

This article mainly draws on the blog post "Hadoop Cluster Installation and Configuration Tutorial_Hadoop2.6.0_Ubuntu/CentOS" and the book "Hadoop Application Development Technology" (Liu Gang).

This article uses three virtual machines to build the Hadoop distributed environment; the topology of the three machines is shown in the diagram below.

The roles for each node in the Hadoop cluster are shown in the following table

Host Name | Hadoop Role | IP Address | jps Command Results | Hadoop User | Hadoop Installation Directory
Master | Master | 192.168.8.210 | Jps, NameNode, SecondaryNameNode, ResourceManager, JobHistoryServer | hadoop | /usr/local/hadoop
Slave1 | Slave | 192.168.8.211 | Jps, DataNode, NodeManager | hadoop | /usr/local/hadoop
Slave2 | Slave | 192.168.8.212 | Jps, DataNode, NodeManager | hadoop | /usr/local/hadoop
Windows | Development environment | 192.168.0.169 | - | - | -

First, network settings

1. Set the virtual machines to bridged networking mode

For the network configuration method, see: http://blog.csdn.net/zhongyoubing/article/details/71081464

2. Modify the hostname of each machine according to the table above; the configuration file is /etc/hostname.
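For example, a minimal way to set it on the Master node (use Slave1 and Slave2 on the respective slaves); the name is also applied by the reboot in step 4:

sudo sh -c 'echo Master > /etc/hostname'    # persist the new hostname
sudo hostname Master                        # apply it to the running session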

3. Set the IP mappings in the configuration file /etc/hosts; use the same configuration on all nodes:

127.0.0.1       localhost
192.168.8.210   Master
192.168.8.211   Slave1
192.168.8.212   Slave2

4. Reboot, then test that the nodes can ping each other:

ping Master -c 3
ping Slave1 -c 3
ping Slave2 -c 3

Second, passwordless SSH login between nodes

Master :

rm -r ~/.ssh                            # remove any old keys
ssh localhost                           # log in once so that ~/.ssh is recreated, then exit
cd ~/.ssh
ssh-keygen -t rsa                       # press Enter at every prompt
cat ./id_rsa.pub >> ./authorized_keys
scp ~/.ssh/id_rsa.pub hadoop@Slave1:/home/hadoop/
scp ~/.ssh/id_rsa.pub hadoop@Slave2:/home/hadoop/

Slave1 & Slave2 :

rm -r ~/.ssh
mkdir ~/.ssh
cat ~/id_rsa.pub >> ~/.ssh/authorized_keys
rm ~/id_rsa.pub

Master :

Log in to node Slave2 (no password should be required), then exit:

ssh Slave2
exit
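As a quick check from Master (assuming the hadoop user on every node), each of these logins should print the slave's hostname without prompting for a password:

for node in Slave1 Slave2; do ssh "$node" hostname; done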

Third, configure the distributed environment on the Master node

The configuration files are in the directory /usr/local/hadoop/etc/hadoop/.

slaves

Slave1
Slave2

core-site.xml

<configuration>
        <property>
                <name>hadoop.tmp.dir</name>
                <value>file:/usr/local/hadoop/tmp</value>
        </property>
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://Master:9000</value>
        </property>
</configuration>

hdfs-site.xml

<configuration>
        <property>
                <name>dfs.replication</name>
                <value>2</value>
        </property>
        <property>
                <name>dfs.namenode.secondary.http-address</name>
                <value>Master:50090</value>
        </property>
        <property>
                <name>dfs.namenode.name.dir</name>
                <value>file:/usr/local/hadoop/tmp/dfs/name</value>
        </property>
        <property>
                <name>dfs.datanode.data.dir</name>
                <value>file:/usr/local/hadoop/tmp/dfs/data</value>
        </property>
</configuration>

mapred-site.xml

<configuration>
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
        </property>
        <property>
                <name>mapreduce.jobhistory.address</name>
                <value>Master:10020</value>
        </property>
        <property>
                <name>mapreduce.jobhistory.webapp.address</name>
                <value>Master:19888</value>
        </property>
</configuration>

yarn-site.xml

<configuration>
        <property>
                <name>yarn.resourcemanager.hostname</name>
                <value>Master</value>
        </property>
        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
        </property>
</configuration>
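After editing the files, one way to confirm which values Hadoop actually picks up is the standard hdfs getconf command, for example:

hdfs getconf -confKey fs.defaultFS        # expect hdfs://Master:9000
hdfs getconf -confKey dfs.replication     # expect 2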

Fourth, configure the distributed environment on the other nodes

Master :

cd /usr/local
sudo rm -r ./hadoop/tmp                  # delete the temporary files
sudo rm -r ./hadoop/logs/*               # delete the log files
tar -zcf ~/hadoop.master.tar.gz ./hadoop
cd ~
scp ./hadoop.master.tar.gz Slave1:/home/hadoop
scp ./hadoop.master.tar.gz Slave2:/home/hadoop

Slave1 & Slave2 :

sudo rm -r /usr/local/hadoop             # delete the old installation (if any)
sudo tar -zxf ~/hadoop.master.tar.gz -C /usr/local
sudo chown -R hadoop /usr/local/hadoop

Fifth, start Hadoop

Master :

hdfs namenode -format                    # format only before the first start
start-all.sh
mr-jobhistory-daemon.sh start historyserver
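For later reference, the counterpart commands to stop the cluster from Master are:

mr-jobhistory-daemon.sh stop historyserver
stop-all.sh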

View the processes:

jps

View the DataNodes:

hdfs dfsadmin -report
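With the configuration above, the report should show two live DataNodes (Slave1 and Slave2). A quick way to filter the relevant lines, assuming the Hadoop 2.x report format:

hdfs dfsadmin -report | grep -E 'Live datanodes|Name:'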

Slave1 & Slave2 :

View the processes:

jps
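Optionally, the cluster can also be checked from the Windows development machine through the web UIs, using the default Hadoop 2.x ports: the HDFS NameNode at http://192.168.8.210:50070 and the YARN ResourceManager at http://192.168.8.210:8088.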

Seventh, run a distributed instance

1. Create a file test.txt with the following content:

hello world
hello world
hello world
hello world
hello world
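One quick way to create this file (just one option; any text editor works equally well):

for i in 1 2 3 4 5; do echo "hello world"; done > test.txt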

2. Create a user directory in HDFS

hdfs dfs -mkdir -p /user/hadoop

3. Create the input directory

hdfs dfs -mkdir input

4. Copy the local file into input

hdfs dfs -put ./test.txt input

5. See if the upload was successful

hdfs dfs -ls /user/hadoop/input

6. Run the job

hdfs dfs -rm -r output    # the output directory must not already exist when Hadoop runs a job, otherwise an error is reported
hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar wordcount /user/hadoop/input/test.txt /user/hadoop/output    # run from the Hadoop installation directory so the relative jar path resolves

7. View the running results

hdfs dfs -cat output/*
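With the five-line test.txt above, the word counts should come out as:

hello	5
world	5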

8. Retrieve the running results locally

rm -r ./output                      # remove any old local copy first
hdfs dfs -get output ./output       # copy the output directory from HDFS to the local machine
cat ./output/*

9. Delete the output directory

hdfs dfs -rm -r output              # delete the directory in HDFS
rm -r ./output                      # delete the local copy

That's all for the setup.
