Hadoop Cluster Build
Assume the IP address of the master machine is 192.168.1.1, slaves1 is 192.168.1.2, and slaves2 is 192.168.1.3.
The user on each machine is redmap, and the Hadoop root directory is /hadoop (that is, Hadoop is installed directly under the root of the filesystem).
(1) 192.168.1.1 redmap-master as NameNode, JobTracker, SecondaryNameNode
(2) 192.168.1.2 redmap-slaves1 as DataNode, TaskTracker
(3) 192.168.1.3 redmap-slaves2 as DataNode, TaskTracker
Configuration
First, configure mutual access between the machines:
1. Following the layout above, edit the hosts file on every machine (master and slaves) so the nodes can reach each other by name. The relevant section of the hosts file on the master machine is:
127.0.0.1 localhost
192.168.1.1 redmap-master
192.168.1.2 redmap-slaves1
192.168.1.3 redmap-slaves2
The hosts files on the other machines should contain the same three entries.
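The three entries above can be appended in one shot with a heredoc. As a hedged sketch, HOSTS_FILE defaults to a scratch file here so the snippet can be tried anywhere; on the real nodes it would be /etc/hosts, run as root:

```shell
# Append the cluster name/IP entries (same as the section above).
# HOSTS_FILE=./hosts.cluster is an illustration stand-in for /etc/hosts.
HOSTS_FILE=${HOSTS_FILE:-./hosts.cluster}
cat >> "$HOSTS_FILE" <<'EOF'
192.168.1.1 redmap-master
192.168.1.2 redmap-slaves1
192.168.1.3 redmap-slaves2
EOF
```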
2. Configure passwordless SSH login (on the master machine):
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
When it completes, two files are generated in ~/.ssh/: id_dsa and id_dsa.pub.
Then append id_dsa.pub to the authorized keys file (there is no authorized_keys file yet):
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
Once done, you can log in without a password:
$ ssh localhost
3. Append the master's id_dsa.pub to the authorized_keys file on slaves1 and slaves2 (taking the slaves1 node as an example):
# Copy the master's id_dsa.pub file (performed on the master machine)
$ scp ~/.ssh/id_dsa.pub redmap@192.168.1.2:/home/redmap/
Note: ssh-keygen only needs to be run on the master node. Once the ~/.ssh directory exists on the other nodes, the key generated on the master can be copied into the same directory on each slave via scp.
In practice, we copied id_dsa.pub to the two slave nodes by hand rather than with the scp command: copying manually makes it easier to keep the file's permissions consistent.
Log in to 192.168.1.2, change to the home directory, and run:
$ cat id_dsa.pub >> .ssh/authorized_keys
You can then ssh from master to slaves1 without entering a password.
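Because sshd silently rejects keys whose files are too permissive, here is a minimal sketch for normalizing the permissions the note above mentions, run as the redmap user on each slave:

```shell
# Ensure .ssh and authorized_keys have permissions sshd will accept:
# neither the directory nor the key file may be group/world writable.
mkdir -p "$HOME/.ssh"
chmod 700 "$HOME/.ssh"
touch "$HOME/.ssh/authorized_keys"
chmod 600 "$HOME/.ssh/authorized_keys"
```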
4. Configure the cluster node files
The following is done on the master machine (the NameNode, 192.168.1.1).
5. Modify the masters file in the conf/ directory; its contents are:
redmap-master (NameNode)
6. Modify the slaves file in the conf/ directory; its contents are:
redmap-slaves1 (DataNode)
redmap-slaves2 (DataNode)
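The two topology files from steps 5 and 6 can be written in one go. In this sketch CONF_DIR defaults to ./conf so it runs anywhere; on the cluster it would be Hadoop's conf/ directory under /hadoop:

```shell
# Write the masters and slaves files (contents from steps 5 and 6).
# CONF_DIR=./conf is a stand-in for Hadoop's real conf/ directory.
CONF_DIR=${CONF_DIR:-./conf}
mkdir -p "$CONF_DIR"
printf 'redmap-master\n' > "$CONF_DIR/masters"
printf 'redmap-slaves1\nredmap-slaves2\n' > "$CONF_DIR/slaves"
```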
7. Set the environment variables in conf/hadoop-env.sh (the master and slaves should use the same values):
# The java implementation to use. Required.
export JAVA_HOME=/usr/java/jdk1.6
8. Modify the conf/core-site.xml configuration file; its contents are:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://redmap-master:9000</value>
  </property>
</configuration>
9. Modify the conf/hdfs-site.xml configuration file; its contents are:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <!-- Number of copies kept of each data block -->
    <value>1</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/hadoop/hadoopfs/data</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/hadoop/hadoopfs/name</value>
  </property>
</configuration>
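Hadoop creates dfs.name.dir itself when the NameNode is formatted, but pre-creating the directories keeps their ownership with the redmap user. As a sketch, HADOOP_FS defaults to a local scratch path here; on the cluster it is the /hadoop/hadoopfs path configured above:

```shell
# Pre-create the HDFS metadata and block directories from hdfs-site.xml.
# HADOOP_FS=./hadoopfs is a stand-in for the real /hadoop/hadoopfs path.
HADOOP_FS=${HADOOP_FS:-./hadoopfs}
mkdir -p "$HADOOP_FS/name" "$HADOOP_FS/data"
chmod 755 "$HADOOP_FS/name" "$HADOOP_FS/data"
```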
10. Modify the conf/mapred-site.xml configuration file; its contents are: