Building on the groundwork from earlier posts, today I finally deployed Hadoop in a cluster environment and successfully ran the official example. The setup is as follows:
Two machines:
NameNode: a netbook with 3 GB of RAM, hostname yp-x100e, IP 192.168.101.130.
DataNode: a virtual machine, Ubuntu 14 running in VMware 10 on Win7, hostname ph-v370, IP 192.168.101.110.
Make sure the two machines can ping each other, and configure each machine's /etc/hosts and /etc/hostname files according to the hostnames and IPs. My hosts configuration is as follows:
127.0.0.1 localhost
192.168.101.130 yp-x100e
192.168.101.110 ph-v370
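As for /etc/hostname, each machine's file simply holds its own name; on my two machines that would look like this (a minimal sketch, not copied from the original notes). On yp-x100e, /etc/hostname contains just:
yp-x100e
and on ph-v370:
ph-v370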
For the installation environment, refer to my earlier posts: Hadoop development on Ubuntu 14 <1> Basic environment installation, and Hadoop development on Ubuntu 14 <2> Compiling 64-bit Hadoop 2.4.
For the various configuration files, refer to Hadoop 2.4.0 fully distributed platform setup, configuration and installation, and to the official guide Setting up a Single Node Cluster.
Notes:
A: SSH setup between master and slave. In a terminal on the master, run:
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
Then copy the file to the slave node:
scp ~/.ssh/authorized_keys ph-v370:~/.ssh/
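After the copy, passwordless login can be checked with a plain ssh to the slave; assuming the same user name exists on both machines, this should log in without prompting for a password:
ssh ph-v370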
At first the ssh login failed because the user names on the two machines were different. I then created a new hadoop user and granted it ownership of the Hadoop folder:
useradd -m hadoop
passwd hadoop
chown hadoop:hadoop hadoop-2.4.0
From then on, do everything as the hadoop user (including starting the Hadoop services; it is best to use this user throughout).
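A typical session would therefore start by switching to that user and confirming the ownership change took effect (a small sketch of my own; the install path is an assumption):
su - hadoop
ls -ld /path/to/hadoop-2.4.0   # should show hadoop:hadoop as owner and group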
B: When executing start-dfs.sh, the slave node reported the exception "WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Problem connecting to server: yp-x100e/192.168.101.130:9000". This was a sticking point: the fix is to edit the master node's hosts file and comment out the 127.0.1.1 line. Before commenting it out, you can run:
netstat -an | grep 9000
and you will see that port 9000 is bound to 127.0.1.1, which is what causes the exception.
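For reference, Ubuntu's default /etc/hosts usually has a line mapping 127.0.1.1 to the machine's own hostname; commenting it out on the master means something like this (my assumption of what that default line looks like):
#127.0.1.1 yp-x100e
which leaves the NameNode free to bind port 9000 on 192.168.101.130, matching the hosts content listed at the top.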
C: The command to format the file system is:
hdfs namenode -format
D: The HDFS and YARN services need to be started separately:
start-dfs.sh
start-yarn.sh
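To confirm the daemons actually came up, running jps on each node is a quick check; in a typical Hadoop 2.x two-node layout (general knowledge, not from the run above) the master should show NameNode, SecondaryNameNode and ResourceManager, and the slave DataNode and NodeManager:
jps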
E: All the configuration files set up on the master node can simply be copied to the slave node as-is.
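A minimal sketch of that copy, assuming Hadoop is installed under ~/hadoop-2.4.0 on both machines (the path is my assumption):
scp ~/hadoop-2.4.0/etc/hadoop/* ph-v370:~/hadoop-2.4.0/etc/hadoop/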
F: Unlike in the single-node example, when copying files into HDFS I now need to give the exact path. Originally I could run directly:
$ bin/hdfs dfs -put etc/hadoop input
but now I need to run:
$ bin/hdfs dfs -put etc/hadoop /user/chenph/input
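If the target directory does not exist yet, it can be created first; this is an aside of my own, not part of the original run:
$ bin/hdfs dfs -mkdir -p /user/chenph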
G: To check a process, the command is ps -ef | grep 'search term'; to kill a process, kill -s 9 <process id>; to check the firewall, sudo ufw status.
H: Visit http://YP-X100e:50070 to view Hadoop's status on the master node. As the figure below shows, there is one active slave node, namely the Ubuntu machine inside my VM.
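If the slave does not show up as active on the 50070 page, the commands from G help narrow things down (my own illustration, not part of the original notes):
ps -ef | grep DataNode   # on the slave, confirm the DataNode process is running
sudo ufw status          # on both machines, confirm the firewall is not blocking ports 9000/50070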