After laying the groundwork in the previous posts, I finally got Hadoop deployed in a cluster environment today and successfully ran the official example.
 
The setup is as follows:
 
Two machines:
 
Namenode: a netbook with 3 GB of memory, hostname: yp-x100e, IP: 192.168.101.130.
 
Datanode: a virtual machine, Ubuntu 14 running in VMware 10 on Windows 7, hostname: ph-v370, IP: 192.168.101.110.
 
Make sure the two machines can ping each other. Configure each machine's /etc/hosts and /etc/hostname files according to the hostnames and IPs above. My hosts configuration is as follows, with a quick connectivity check after it:
 
127.0.0.1 localhost  
192.168.101.130 yp-x100e  
192.168.101.110 ph-v370
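To confirm that name resolution and connectivity work before going further, a check like the following can be run on each machine (a sketch; the hostnames are the ones listed above):

# on yp-x100e: confirm the hostname and reach the datanode by name  
hostname        # should print yp-x100e  
ping -c 3 ph-v370

# on ph-v370: the reverse check  
hostname        # should print ph-v370  
ping -c 3 yp-x100e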
 
For the basic installation environment, please refer to:
 
UBUNTU14 Hadoop Development <1> Foundation Environment installation
 
UBUNTU14 Hadoop Development <2> compilation 64-bit Hadoop2.4
 
For the various configuration files, please refer to the following (a rough sketch of the key settings appears after these links):
 
Hadoop 2.4.0 fully distributed platform build, configuration, and installation
 
Setting up a Single Node Cluster
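For reference, here is a minimal sketch of the two pieces of configuration that matter most for this two-machine setup; the values are just my hostnames, and the posts above cover the full set of files.

etc/hadoop/core-site.xml on both machines:

<configuration>
  <property>
    <!-- address the datanode uses to reach the namenode; this is the port 9000 mentioned below -->
    <name>fs.defaultFS</name>
    <value>hdfs://yp-x100e:9000</value>
  </property>
</configuration>

etc/hadoop/slaves on the master, one datanode hostname per line:

ph-v370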
 
Precautions:
 
A: Passwordless SSH from master to slave. On the master, type in a terminal:
 
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
 
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
 
Then copy the file to the slave node:
 
scp ~/.ssh/authorized_keys ph-v370:~/.ssh/
 
At first this kept failing because the user names on the two machines were not consistent, so the SSH login did not work. In the end I created a new hadoop user on both machines and granted it permission on the Hadoop folder:
 
useradd -m hadoop  
passwd hadoop  
chown hadoop:hadoop hadoop-2.4.0
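A quick way to confirm that the hadoop user and the passwordless login line up on both machines (a sketch; assumes the key from step A now lives under the hadoop user's ~/.ssh):

ssh hadoop@ph-v370 hostname     # should print ph-v370 without asking for a password  
ls -ld hadoop-2.4.0             # should show hadoop:hadoop as the owner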
 
 
From here on, keep using the hadoop user (starting the Hadoop services is also best done as this user).
 
B: When executing start-dfs.sh, the slave node reported the exception "WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Problem connecting to server: yp-x100e/192.168.101.130:9000".
 
This was a sticking point: you need to edit the hosts file on the master node and comment out the 127.0.1.1 line. Before commenting it out, you can run:
 
netstat -an | grep 9000
 
You will see that port 9000 is bound to 127.0.1.1, which is why the exception occurs.
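For clarity, this is roughly what the master's /etc/hosts looks like after the change; the commented line is the one that caused port 9000 to be bound to 127.0.1.1:

127.0.0.1 localhost  
# 127.0.1.1 yp-x100e  
192.168.101.130 yp-x100e  
192.168.101.110 ph-v370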
 
C: The command to format the file system is:
 
hdfs namenode -format
 
D: The HDFS services and the YARN services need to be started separately:
 
start-dfs.sh  
start-yarn.sh
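Once both scripts have run, jps gives a quick sanity check that the right daemons came up (a sketch; process IDs will differ):

# on the master (yp-x100e)  
jps     # expect NameNode, SecondaryNameNode, ResourceManager  
# on the slave (ph-v370)  
jps     # expect DataNode, NodeManager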
 
E: Edit all the configuration files on the master node, then copy them directly to the slave node, for example as sketched below.
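A sketch of that copy step; the installation path ~/hadoop-2.4.0 is an assumption, adjust it to wherever Hadoop lives on both machines:

scp -r ~/hadoop-2.4.0/etc/hadoop/* hadoop@ph-v370:~/hadoop-2.4.0/etc/hadoop/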
 
F: Unlike in the single-node example, I had to give a full path when putting files into HDFS, like this:
 
Originally I ran directly  
$ bin/hdfs dfs -put etc/hadoop input  
but now I need to run  
$ bin/hdfs dfs -put etc/hadoop /user/chenph/input
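To run the official grep example against that path, the steps look roughly like this (a sketch: the parent directory has to exist in HDFS first, and the jar name is the examples jar shipped with 2.4.0):

$ bin/hdfs dfs -mkdir -p /user/chenph  
$ bin/hdfs dfs -put etc/hadoop /user/chenph/input  
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.0.jar grep /user/chenph/input /user/chenph/output 'dfs[a-z.]+'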
 
G: To check processes: ps -ef | grep '<text to search>'. To kill a process: kill -s 9 <process number>. To check the firewall: sudo ufw status. For example:
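(The process number 12345 below is only a placeholder.)

ps -ef | grep DataNode      # find the DataNode process  
kill -s 9 12345             # force-kill it by process number  
sudo ufw status             # check whether the firewall is interfering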
 
H: Visit http://YP-X100e:50070 to view the Hadoop status on the master node. As shown in the figure below, there is one active slave node, which is the Ubuntu VM.
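If the web page cannot be reached, the same information is available on the command line (a sketch, run on the master):

bin/hdfs dfsadmin -report   # lists the datanodes; ph-v370 should show up as a live node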
 
 
Author: CSDN blog Yueritian