Installation steps:
1. Install CentOS 5.5
Add each machine's hostname and corresponding IP address to /etc/hosts:
For example:
192.168.1.107 hdfs1 master
192.168.1.108 hdfs2 slave
192.168.1.109 hdfs3 slave
2. Enable the SSH service
Install the OpenSSH server: $ yum -y install openssh-server
Machine name  IP             Roles
hdfs1         192.168.1.107  namenode, master, jobtracker
hdfs2         192.168.1.108  datanode, slave, tasktracker
hdfs3         192.168.1.109  datanode, slave, tasktracker
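On CentOS the sshd service should also be running and set to start on boot; a quick check with the stock service tools:
$ service sshd status
$ chkconfig sshd on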
3. Set up SSH password-less login
(1) Enable password-less login to the local machine on the namenode:
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
Press Enter, and two files are generated under ~/.ssh: id_dsa and id_dsa.pub. The two come as a pair, like a lock and its key. Append id_dsa.pub to the authorized keys (there is no authorized_keys file at first):
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
After this you can log in to the local machine without a password: $ ssh localhost. Besides doing this on the master, it also needs to be done on each slave.
(2) Enable password-less login from the namenode to the datanodes:
Append the namenode's id_dsa.pub file to the authorized_keys of each datanode (the namenode is the 192.168.1.107 node):
a. Copy the namenode's id_dsa.pub file:
$ scp id_dsa.pub root@192.168.1.108:/home/hadoop/
b. Log in to 192.168.1.108 and run: $ cat id_dsa.pub >> .ssh/authorized_keys
Perform the same operation on the other datanodes.
Note: If the configuration is complete but the namenode still cannot log in to a datanode, try changing the permissions of authorized_keys: $ chmod 600 authorized_keys.
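To confirm the setup works, test the login from the namenode (using an IP from the table above); if no password prompt appears, the key exchange succeeded:
$ ssh 192.168.1.108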
4. Disable the firewall
$ sudo ufw disable
Note: This step is very important. If the firewall is not disabled, the datanodes cannot be found.
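Note that ufw is the Ubuntu firewall front end; on the CentOS 5.5 system installed in step 1, the equivalent is to stop iptables:
$ service iptables stop
$ chkconfig iptables off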
5. Install JDK 1.6
After installation, add the following lines to /etc/profile:
export JAVA_HOME=/home/hadoop/jdk1.6.0_22
export JRE_HOME=/home/hadoop/jdk1.6.0_22/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
Note: It is best to keep the Java environment consistent across all machines. If the installation is interrupted, switch to root and perform the installation.
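After editing /etc/profile, reload it and verify that the JDK is picked up:
$ source /etc/profile
$ java -version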
6. Install Hadoop
Download hadoop-0.20.2.tar.gz and unpack it:
$ tar -zvxf hadoop-0.20.2.tar.gz
Add the Hadoop installation path to /etc/profile:
export HADOOP_HOME=/home/hexianghui/hadoop-0.20.2
export PATH=$HADOOP_HOME/bin:$PATH
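Reload /etc/profile again and confirm that the hadoop command resolves:
$ source /etc/profile
$ hadoop version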
7. Configure Hadoop
Hadoop's main configuration files are under hadoop-0.20.2/conf.
(1) Configure the Java environment in conf/hadoop-env.sh (the namenode and datanodes use the same configuration):
$ gedit hadoop-env.sh
export JAVA_HOME=/home/hexianghui/jdk1.6.0_14
(2) Configure the conf/masters and conf/slaves files (only on the namenode):
masters: 192.168.1.107
slaves:
192.168.1.108
192.168.1.109
(3) Configure conf/core-site.xml, conf/hdfs-site.xml, and conf/mapred-site.xml (a simple configuration; the datanodes use the same one)
core-site.xml:
<configuration>
<!-- global properties -->
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hexianghui/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<!-- file system properties -->
<property>
<name>fs.default.name</name>
<value>hdfs://192.168.1.107:9000</value>
</property>
</configuration>
hdfs-site.xml (dfs.replication defaults to 3; if it is left unchanged with fewer than three datanodes, an error is reported):
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
mapred-site.xml:
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>192.168.1.107:9001</value>
</property>
</configuration>
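Since the datanodes use the same configuration, one convenient way to keep them in sync is to copy the whole conf directory from the namenode to each datanode (using the paths and root user from earlier steps):
$ scp -r /home/hexianghui/hadoop-0.20.2/conf root@192.168.1.108:/home/hexianghui/hadoop-0.20.2/
$ scp -r /home/hexianghui/hadoop-0.20.2/conf root@192.168.1.109:/home/hexianghui/hadoop-0.20.2/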
8. Run Hadoop
Go to hadoop-0.20.2/bin and first format the file system: $ hadoop namenode -format
Start Hadoop: $ start-all.sh
View the cluster status: $ hadoop dfsadmin -report
Hadoop web view: http://192.168.1.107:50070
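Another quick sanity check is to run jps (shipped with the JDK) on each machine. After start-all.sh you would expect to see NameNode, SecondaryNameNode, and JobTracker on the master, and DataNode and TaskTracker on each slave:
$ jps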
9. Run the WordCount.java example program
(1) Create two input files, file01 and file02, on the local disk:
$ echo "Hello World bye world" > file01
$ echo "Hello hadoop goodbye hadoop" > file02
(2) Create an input directory in HDFS: $ hadoop fs -mkdir input
(3) Copy file01 and file02 into HDFS:
$ hadoop fs -copyFromLocal /home/hexianghui/soft/file0* input
(4) Run wordcount:
$ hadoop jar hadoop-0.20.2-examples.jar wordcount input output
(5) View the result after the job completes:
$ hadoop fs -cat output/part-r-00000
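For the two input lines above, the counts (WordCount is case-sensitive, with keys sorted) should come out as:
Hello	2
World	1
bye	1
goodbye	1
hadoop	2
world	1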
A simple and direct method is to configure one machine fully and test it, then copy the setup to the other machines and modify only the few machine-specific details.