After the groundwork in the previous posts, today I finally deployed Hadoop in a cluster environment and successfully ran the official example.
The work was as follows:
Two machines:
Namenode: a netbook with 3 GB of memory, hostname: yp-x100e, IP: 192.168.101.130.
Datanode: a virtual machine, Ubuntu 14 running in VMware 10 on Win7, hostname: ph-v370, IP: 192.168.101.110.
Make sure the two machines can ping each other, and configure each machine's /etc/hosts and /etc/hostname files according to the host names and IPs above. My hosts configuration is as follows:
127.0.0.1 localhost
192.168.101.130 yp-x100e
192.168.101.110 ph-v370
For the installation environment, refer to:
UBUNTU14 Hadoop Development <1> Basic Environment Installation
UBUNTU14 Hadoop Development <2> Compiling 64-bit Hadoop 2.4
For the various configuration files, refer to the following (a minimal sketch of the key setting is given below):
Hadoop 2.4.0 Fully Distributed Platform Setup, Configuration, and Installation
Setting up a Single Node Cluster.
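As a minimal sketch only (assuming Hadoop is unpacked as hadoop-2.4.0 and that the NameNode should listen on port 9000, as in note B below; the referenced posts cover the full set of files), the key property that points the cluster at the master, plus the slaves file, can be written like this:
# Minimal core-site.xml pointing HDFS at the master yp-x100e on port 9000
# (written with a shell heredoc; adjust the hadoop-2.4.0 path to your install)
cat > hadoop-2.4.0/etc/hadoop/core-site.xml << 'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://yp-x100e:9000</value>
  </property>
</configuration>
EOF
# List the slave host name so start-dfs.sh knows where to start a DataNode
echo ph-v370 > hadoop-2.4.0/etc/hadoop/slaves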
Notes:
A: Master-to-slave SSH setup. On the master, enter in a terminal:
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
Then copy the file to the slave node:
scp ~/.ssh/authorized_keys ph-v370:~/.ssh/
At first this kept failing because the user accounts on the two machines were inconsistent, so the ssh login could not succeed. I later created a new hadoop user and granted it ownership of the Hadoop folder:
useradd -m hadoop
passwd hadoop
chown hadoop:hadoop hadoop-2.4.0
From then on, keep using the hadoop user (including for starting the Hadoop services; it is best to do everything with this user).
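To verify that the passwordless login works, a quick check from the master as the hadoop user (it should print the slave's hostname without asking for a password):
ssh ph-v370 hostname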
B: When executing start-dfs.sh, the slave node reported the exception "WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Problem connecting to server: yp-x100e/192.168.101.130:9000".
This was a sticking point: you need to modify the hosts file on the master node and comment out the 127.0.1.1 line. Before commenting it out, you can run
netstat -an | grep 9000
and you will see that port 9000 is bound to 127.0.1.1, which is why the exception occurs.
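One way to comment out that line and re-check the listener (a sketch; you can just as well edit /etc/hosts by hand, and HDFS must be restarted for the change to take effect):
sudo sed -i 's/^127\.0\.1\.1/#127.0.1.1/' /etc/hosts
netstat -an | grep 9000    # after restarting dfs, should show 192.168.101.130:9000 instead of 127.0.1.1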
C: The command to format the file system should be
hdfs namenode -format
D: The HDFS and YARN services need to be started separately:
start-dfs.sh
start-yarn.sh
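After both scripts finish, a quick sanity check (assuming the JDK's jps tool is on the PATH) is to list the Java daemons on each node:
jps    # master: NameNode, SecondaryNameNode, ResourceManager; slave: DataNode, NodeManager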
E: Edit all the configuration files on the master node, then copy them directly to the slave node, for example:
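A sketch of that copy, assuming Hadoop is unpacked at ~/hadoop-2.4.0 on both machines (adjust the paths to your install):
scp ~/hadoop-2.4.0/etc/hadoop/* ph-v370:~/hadoop-2.4.0/etc/hadoop/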
F: Unlike in the single-node example, I had to give an explicit HDFS path when copying files in, like this:
Originally I executed
$ bin/hdfs dfs -put etc/hadoop input
but now I need to execute
$ bin/hdfs dfs -put etc/hadoop /user/chenph/input
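If the /user/chenph directory does not exist in HDFS yet, it can be created first (chenph is simply my user directory from the command above):
$ bin/hdfs dfs -mkdir -p /user/chenph
$ bin/hdfs dfs -put etc/hadoop /user/chenph/input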
G: To check a process, the command is: ps -ef | grep '<search term>'; to kill a process: kill -s 9 <process number>; to check the firewall: sudo ufw status.
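For example, to find and force-kill a stuck DataNode (12345 here is just a placeholder for the process number that ps reports):
ps -ef | grep DataNode
kill -s 9 12345
sudo ufw status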
H: Open http://YP-X100e:50070 on the master node to view the Hadoop status. As shown in the figure below, there is one live slave node, which is the Ubuntu virtual machine.
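The same information is available from the command line with the dfsadmin report, run on the master as the hadoop user; ph-v370 should appear in the list of live datanodes:
$ bin/hdfs dfsadmin -report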
Author: Yueritian (CSDN blog)