The virtual machines used in this article start from a pseudo-distributed Hadoop configuration; that setup is not repeated here. For details, see my earlier post: http://www.cnblogs.com/VeryGoodVeryGood/p/8507795.html
This article mainly draws on the blog post "Hadoop cluster installation and configuration tutorial_Hadoop2.6.0_Ubuntu/CentOS" and the book "Hadoop Application Development Technology" (Liu Gang).
Three virtual machines are used to build the distributed Hadoop environment; their topology is shown in the figure below.
The role of each node in the Hadoop cluster is shown in the following table.
| Host name | Hadoop role | IP address | jps results | Hadoop user | Hadoop installation directory |
| --- | --- | --- | --- | --- | --- |
| Master | Master | 192.168.8.210 | Jps, NameNode, SecondaryNameNode, ResourceManager, JobHistoryServer | hadoop | /usr/local/hadoop |
| Slave1 | Slave | 192.168.8.211 | Jps, DataNode, NodeManager | hadoop | /usr/local/hadoop |
| Slave2 | Slave | 192.168.8.212 | Jps, DataNode, NodeManager | hadoop | /usr/local/hadoop |
| Windows | Development environment | 192.168.0.169 | | | |
I. Network settings
1. Set the virtual machines' network to bridged mode
For the network configuration steps, see: http://blog.csdn.net/zhongyoubing/article/details/71081464
2. Set each node's hostname according to the table above by editing /etc/hostname
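For example, on the master node /etc/hostname contains just this single line (use Slave1 or Slave2 on the corresponding slave nodes):

Master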
3. Set the IP-to-hostname mapping in /etc/hosts; use the same configuration on all nodes:

127.0.0.1      localhost
192.168.8.210  Master
192.168.8.211  Slave1
192.168.8.212  Slave2
4. Reboot, then check that the nodes can ping each other:

ping Master -c 3
ping Slave1 -c 3
ping Slave2 -c 3
II. Passwordless SSH login between nodes
On Master:

rm -r ~/.ssh                                       # remove any old keys
ssh localhost                                      # log in once to recreate ~/.ssh, then exit
cd ~/.ssh
ssh-keygen -t rsa                                  # press Enter at every prompt
cat ./id_rsa.pub >> ./authorized_keys
scp ~/.ssh/id_rsa.pub hadoop@Slave1:/home/hadoop/
scp ~/.ssh/id_rsa.pub hadoop@Slave2:/home/hadoop/
On Slave1 and Slave2:

rm -r ~/.ssh
mkdir ~/.ssh
cat ~/id_rsa.pub >> ~/.ssh/authorized_keys
rm ~/id_rsa.pub
On Master, test a passwordless login to node Slave2, then return:

ssh Slave2
exit
III. Configure the distributed environment on the Master node
All of the following configuration files are under /usr/local/hadoop/etc/hadoop/.
slaves (the hosts listed here will run the DataNode and NodeManager daemons)

Slave1
Slave2
core-site.xml

<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/local/hadoop/tmp</value>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://Master:9000</value>
    </property>
</configuration>
hdfs-site.xml

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>Master:50090</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/data</value>
    </property>
</configuration>
mapred-site.xml

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>Master:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>Master:19888</value>
    </property>
</configuration>
yarn-site.xml

<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>Master</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
IV. Configure the distributed environment on the other nodes
On Master:

cd /usr/local
sudo rm -r ./hadoop/tmp
sudo rm -r ./hadoop/logs/*
tar -zcf ~/hadoop.master.tar.gz ./hadoop
cd ~
scp ./hadoop.master.tar.gz Slave1:/home/hadoop    # do the same for Slave2
On Slave1 and Slave2:

sudo rm -r /usr/local/hadoop
sudo tar -zxf ~/hadoop.master.tar.gz -C /usr/local
sudo chown -R hadoop /usr/local/hadoop
V. Start Hadoop
On Master:

hdfs namenode -format    # only needed before the first start
start-all.sh
mr-jobhistory-daemon.sh start historyserver
Check the running processes:

jps
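If the cluster started correctly, the jps output on Master should match the table above, roughly as follows (process IDs omitted):

NameNode
SecondaryNameNode
ResourceManager
JobHistoryServer
Jps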
Check the DataNodes (with both slaves running, the report should list two live DataNodes):

hdfs dfsadmin -report
On Slave1 and Slave2, check the running processes (jps should show DataNode and NodeManager):

jps
VII. Run a distributed example
1. Create a file test.txt with the following content:

hello world
hello world
hello world
hello world
hello world
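If you would rather create the file from the shell than an editor, a minimal sketch (assuming five identical lines is all that is needed) is:

for i in $(seq 1 5); do echo "hello world"; done > test.txt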
2. Create a user directory in HDFS:

hdfs dfs -mkdir -p /user/hadoop
3. Create the input directory:

hdfs dfs -mkdir input
4. Copy the local file into input:

hdfs dfs -put ./test.txt input
5. Check that the upload succeeded:

hdfs dfs -ls /user/hadoop/input
6. Run the WordCount example:

hdfs dfs -rm -r output    # the output directory must not already exist when the job runs, otherwise Hadoop reports an error
hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar wordcount /user/hadoop/input/test.txt /user/hadoop/output
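While the job is running, its progress can also be followed in the ResourceManager web UI (by default at http://Master:8088); once it finishes, the JobHistoryServer web UI configured above is available at http://Master:19888.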
7. View the results:

hdfs dfs -cat output/*
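Assuming test.txt contains the five "hello world" lines shown above, the WordCount result should look roughly like this (word and count separated by a tab):

hello   5
world   5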
8. Retrieve the results to the local filesystem:

rm -r ./output
hdfs dfs -get output ./output
cat ./output/*
9. Delete the output directories:

hdfs dfs -rm -r output
rm -r ./output
That's all.
Building a Hadoop cluster (distributed) environment on Ubuntu 14.04