Use 3 machines to build a fully distributed HDFS cluster: 201 (NameNode), 202 (DataNode), 203 (DataNode)
Overall architecture
NameNode (192.168.1.201)
DataNode (192.168.1.202,192.168.1.203)
SecondaryNameNode (192.168.1.202)
1. Download the Hadoop package from the official website and upload it to the Linux system
hadoop-1.2.1.tar.gz
Extract
tar -zxvf hadoop-1.2.1.tar.gz (the Linux server also needs a JDK environment installed)
Because the name is long, you can add a soft link
ln -sf /root/hadoop-1.2.1 /home/hadoop-1.2
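A quick sanity check after this step (the JDK path /usr/java/jdk1.7.0_75 is just the one used later in hadoop-env.sh in this walkthrough; adjust to your own install):
java -version            # confirms a JDK is available on the node
ls -l /home/hadoop-1.2   # the soft link should point to /root/hadoop-1.2.1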
2. Modify the core-site.xml configuration file
vi /home/hadoop-1.2/conf/core-site.xml
Configure the NameNode host and port, and the working directory
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.1.201:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop-1.2</value>
  </property>
</configuration>
The default working directory is under the /tmp directory, and /tmp is emptied when the Linux system reboots.
After extracting the Hadoop package, the default values can be seen in
hadoop-1.2.1/docs/core-default.html
which shows that the HDFS working directory is based on the /tmp temporary directory by default, so hadoop.tmp.dir should be changed.
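As a rough illustration of why hadoop.tmp.dir matters, these are the relevant defaults in the Hadoop 1.x core-default.xml / hdfs-default.xml (verify against your own docs/core-default.html):
hadoop.tmp.dir = /tmp/hadoop-${user.name}      (base for temporary directories)
dfs.name.dir   = ${hadoop.tmp.dir}/dfs/name    (NameNode metadata)
dfs.data.dir   = ${hadoop.tmp.dir}/dfs/data    (DataNode block storage)
With the defaults, the NameNode metadata would sit under /tmp and be lost on reboot; pointing hadoop.tmp.dir at /opt/hadoop-1.2 avoids this.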
3. Configure conf/hdfs-site.xml
Configure dfs.replication, the number of replicas. Since 202 and 203 are the DataNodes, the number of replicas must be <= 2
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
4. Configure the DataNode nodes
vi /home/hadoop-1.2/conf/slaves (host names can be used instead of IP addresses)
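For this cluster the slaves file would simply list the two DataNodes, one per line (shown with IP addresses here; resolvable host names work too):
192.168.1.202
192.168.1.203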
5. Configure the SecondaryNameNode; note that it must not be on the same machine as the NameNode
vi /home/hadoop-1.2/conf/masters
192.168.1.202
6. Configure Password-free login
With password-free login you can enter a command on one machine and start the processes on all machines
Without password-free login, you would have to enter the start command on each machine separately
Configure password-free login on 201
Generate the key pair on 201
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
The keys are generated under the ~/.ssh directory
# ls ~/.ssh
authorized_keys  id_dsa  id_dsa.pub
id_dsa is the private key, id_dsa.pub is the public key
Configure password-free login for a single machine
Execute the following command
" -F ~/.ssh/~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
Configure password-free logon across nodes
First execute
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
to generate the id_dsa.pub public key (already done above on 201)
Copy id_dsa.pub to the machine you want to log in to without a password
scp ~/.ssh/id_dsa.pub root@192.168.1.202:~
Append id_dsa.pub to the authorized_keys file on 192.168.1.202
$ cat ~/id_dsa.pub >> ~/.ssh/authorized_keys
Use more authorized_keys to check the contents
From 201, log in to 202 with ssh 192.168.1.202 (port 22)
You need to set up the local password-free login first, and then the cross-node password-free login
The result of this configuration is 201 --> 202 and 201 --> 203; if the opposite direction is also needed, repeat the above process in reverse
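A compact sketch of the cross-node setup for 203, mirroring the 202 steps above (assumes the root user on every node, as in the prompts above; if ~/.ssh does not yet exist on the target, create it first with mkdir -p ~/.ssh):
scp ~/.ssh/id_dsa.pub root@192.168.1.203:~
ssh root@192.168.1.203 'cat ~/id_dsa.pub >> ~/.ssh/authorized_keys'   # prompts for the password one last time
ssh 192.168.1.203                                                     # should now log in without a password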
7. All nodes are configured identically
Copy the compressed package to the other nodes
scp ~/hadoop-1.2.1.tar.gz root@192.168.1.202:~/ (repeat for 192.168.1.203)
Extract
tar -zxvf hadoop-1.2.1.tar.gz
Create a soft link
ln -sf /root/hadoop-1.2.1 /home/hadoop-1.2
Format the NameNode (on 201, from the bin directory)
# cd /home/hadoop-1.2/bin
# ./hadoop namenode -format
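If the format succeeds, the NameNode metadata directory should appear under the working directory configured in core-site.xml (with hadoop.tmp.dir = /opt/hadoop-1.2 and the default dfs.name.dir = ${hadoop.tmp.dir}/dfs/name):
ls /opt/hadoop-1.2/dfs/name/current   # expect files such as VERSION, fsimage, edits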
Configure JAVA_HOME in conf/hadoop-env.sh
# Set Hadoop-specific environment variables here.
# The only required environment variable is JAVA_HOME.  All others are
# optional.  When running a distributed configuration it is best to
# set JAVA_HOME in this file, so that it is correctly defined on
# remote nodes.

# The java implementation to use.  Required.
export JAVA_HOME=/usr/java/jdk1.7.0_75

# Extra Java CLASSPATH elements.  Optional.
# export HADOOP_CLASSPATH=

# The maximum amount of heap to use, in MB. Default is 1000.
# export HADOOP_HEAPSIZE=2000

# Extra Java runtime options.  Empty by default.
# export HADOOP_OPTS=-server

# Command specific options appended to HADOOP_OPTS when specified
Copy the configured configuration files to the other machines (copy to 202 and 203)
(from the conf directory on 201) scp ./* root@192.168.1.202:/home/hadoop-1.2/conf/
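A small sketch to push the conf directory to both DataNodes in one pass (run from /home/hadoop-1.2/conf on 201; assumes the root user and the same directory layout on every node):
for host in 192.168.1.202 192.168.1.203; do
  scp ./* root@$host:/home/hadoop-1.2/conf/
done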
Start
(on 201, from the bin directory) ./start-dfs.sh
The firewall needs to be shut down before starting
service iptables stop
After starting, you can use jps to check whether the processes started successfully.
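With the architecture above, jps should roughly report the following on each node (process IDs will differ):
201: NameNode
202: DataNode, SecondaryNameNode
203: DataNode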