The following installation manual was written for an early version of Hadoop and is not fully consistent with the current version.
I. Preparations
Download Hadoop:
http://hadoop.apache.org/core/releases.html
http://hadoop.apache.org/common/releases.html
http://www.apache.org/dyn/closer.cgi/hadoop/core/
http://labs.xiaonei.com/apache-mirror/hadoop/core/hadoop-0.20.1/hadoop-0.20.1.tar.gz
http://labs.xiaonei.com/apache-mirror/hadoop/
II. Hardware environment
There are three machines in total, all running CentOS; Java is jdk1.6.0.
III. Install Java 6
sudo apt-get install sun-java6-jdk
Open /etc/environment and add the following (entries are separated by a colon; on Windows the separator would be a semicolon):
CLASSPATH=.:/usr/local/java/lib
JAVA_HOME=/usr/local/java
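To verify the JDK is picked up (a quick check, assuming you have logged out and back in so /etc/environment is re-read):
java -version
echo $JAVA_HOME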
IV. Configure the host table
[root@hadoop ~]# vi /etc/hosts
127.0.0.1 localhost
192.168.13.100 namenode
192.168.13.108 datanode1
192.168.13.110 datanode2
[root@test ~]# vi /etc/hosts
127.0.0.1 localhost
192.168.13.100 namenode
192.168.13.108 datanode1
[root@test2 ~]# vi /etc/hosts
127.0.0.1 localhost
192.168.13.100 namenode
192.168.13.110 datanode2
Add users and user groups:
addgroup hadoop
adduser hadoop
usermod -a -G hadoop hadoop
passwd hadoop
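One way to confirm the account and group were created correctly (a quick sanity check):
id hadoop
The output should list the hadoop group among the user's groups.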
Configure ssh:
Server:
su hadoop
ssh-keygen -t rsa
cd /home/hadoop/.ssh
cp id_rsa.pub authorized_keys
Client
Chmod 700/home/hadoop
Chmod 755/home/hadoop/. ssh
Su hadoop
Cd/home
Mkdir. ssh
Server:
chmod 644 /home/hadoop/.ssh/authorized_keys
scp authorized_keys datanode1:/home/hadoop/.ssh/
scp authorized_keys datanode2:/home/hadoop/.ssh/
ssh datanode1
ssh datanode2
If ssh is configured correctly, the following message is displayed on the first connection:
The authenticity of host [dbrg-2] can't be established.
Key fingerprint is 1024 5f:a0:0b:65:d3:82:df:ab:44:62:6d:98:9c:fe:e9:52.
Are you sure you want to continue connecting (yes/no)?
OpenSSH is telling you that it does not recognize this host. Since this is your first login to it, simply type "yes". The host's identification is then added to the ~/.ssh/known_hosts file, and the prompt will not appear on subsequent connections.
But don't forget to test ssh to the local machine as well: ssh dbrg-1
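As a quick check that passwordless login works from the namenode to every datanode, a minimal loop (assuming the hostnames from /etc/hosts above):
for host in datanode1 datanode2; do ssh $host hostname; done
Each ssh should print the remote hostname without asking for a password.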
Next, install Hadoop:
mkdir /home/hadoop/HadoopInstall
tar -zxvf hadoop-0.20.1.tar.gz -C /home/hadoop/HadoopInstall/
cd /home/hadoop/HadoopInstall/
ln -s hadoop-0.20.1 hadoop
export JAVA_HOME=/usr/local/java
export CLASSPATH=.:/usr/local/java/lib
export HADOOP_HOME=/home/hadoop/HadoopInstall/hadoop
export HADOOP_CONF_DIR=/home/hadoop-conf
export PATH=$HADOOP_HOME/bin:$PATH
cd $HADOOP_HOME/conf/
mkdir /home/hadoop-conf
cp hadoop-env.sh core-site.xml hdfs-site.xml mapred-site.xml masters slaves /home/hadoop-conf
vi $HADOOP_CONF_DIR/hadoop-env.sh
# The java implementation to use. Required. -- change this to the JDK installation directory.
export JAVA_HOME=/usr/local/java
export HADOOP_CLASSPATH=.:/usr/local/java/lib
# The maximum amount of heap to use, in MB. Default is 1000. -- adjust according to your memory size.
export HADOOP_HEAPSIZE=200
vi /home/hadoop/.bashrc
export JAVA_HOME=/usr/local/java
export CLASSPATH=.:/usr/local/java/lib
export HADOOP_HOME=/home/hadoop/HadoopInstall/hadoop
export HADOOP_CONF_DIR=/home/hadoop-conf
export PATH=$HADOOP_HOME/bin:$PATH
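To apply the new variables in the current shell and confirm the hadoop command is on the PATH (a quick check):
source /home/hadoop/.bashrc
echo $HADOOP_HOME
which hadoop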
Configuration
On the namenode:
# vi $HADOOP_CONF_DIR/slaves
192.168.13.108
192.168.13.110
# vi $HADOOP_CONF_DIR/core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.13.100:9000</value>
  </property>
</configuration>
# vi $HADOOP_CONF_DIR/hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
    <description>Default block replication.
    The actual number of replications can be specified when the file is created.
    The default is used if replication is not specified in create time.
    </description>
  </property>
</configuration>
# vi $HADOOP_CONF_DIR/mapred-site.xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>192.168.13.100:11000</value>
  </property>
</configuration>
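The masters file was copied to /home/hadoop-conf as well; in this Hadoop version it lists the host that runs the secondary namenode. A minimal sketch, assuming it runs on the namenode machine:
# vi $HADOOP_CONF_DIR/masters
192.168.13.100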
The configuration files on the slaves are as follows (hdfs-site.xml does not need to be configured):
[root@test12 conf]# cat core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode:9000</value>
  </property>
</configuration>
[root@test12 conf]# cat mapred-site.xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>namenode:11000</value>
  </property>
</configuration>
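If the same paths exist on every node, one way to distribute the configuration directory from the namenode is scp (a sketch; adjust the slave copies afterwards, since their core-site.xml and mapred-site.xml differ slightly, as shown above):
scp -r /home/hadoop-conf datanode1:/home/
scp -r /home/hadoop-conf datanode2:/home/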
Start
export PATH=$HADOOP_HOME/bin:$PATH
hadoop namenode -format
start-all.sh
To stop the cluster: stop-all.sh
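After start-all.sh, one way to confirm the daemons are running is the JDK's jps tool (assuming it is on the PATH):
jps
On the namenode you should see NameNode, SecondaryNameNode, and JobTracker; on each datanode, DataNode and TaskTracker.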
Create the danchentest folder on HDFS and upload a file to it:
$HADOOP_HOME/bin/hadoop fs -mkdir danchentest
$HADOOP_HOME/bin/hadoop fs -put $HADOOP_HOME/README.txt danchentest
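To confirm the upload succeeded (a quick check):
$HADOOP_HOME/bin/hadoop fs -ls danchentest
The listing should show README.txt.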
Run the wordcount example:
cd $HADOOP_HOME
hadoop jar hadoop-0.20.1-examples.jar wordcount /user/hadoop/danchentest/README.txt output1
09/12/21 18:31:44 INFO input.FileInputFormat: Total input paths to process: 1
09/12/21 18:31:45 INFO mapred.JobClient: Running job: job_2009122111__0002
09/12/21 18:31:46 INFO mapred.JobClient: map 0% reduce 0%
09/12/21 18:31:53 INFO mapred.JobClient: map 100% reduce 0%
09/12/21 18:32:05 INFO mapred.JobClient: map 100% reduce 100%
09/12/21 18:32:07 INFO mapred.JobClient: Job complete: job_2009122111__0002
09/12/21 18:32:07 INFO mapred.JobClient: Counters: 17
09/12/21 18:32:07 INFO mapred.JobClient: Job Counters
09/12/21 18:32:07 INFO mapred.JobClient: Launched reduce tasks=1
View the output result files on HDFS:
[root@test11 hadoop]# hadoop fs -ls output1
Found 2 items
drwxr-xr-x   - root supergroup          0 /user/root/output1/_logs
-rw-r--r--   3 root supergroup       1306 /user/root/output1/part-r-00000
[root@test11 hadoop]# hadoop fs -cat output1/part-r-00000
(BIS), 1
(ECCN) 1
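To copy the result back to the local filesystem (a sketch; the local path is just an example):
hadoop fs -get output1/part-r-00000 /tmp/wordcount-result.txt
cat /tmp/wordcount-result.txt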
To check the running status of HDFS, open http://192.168.13.100:50070/dfshealth.jsp in a browser; for map/reduce information, open http://192.168.13.100:50030/jobtracker.jsp. The same information can also be obtained from the command line.
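From the command line, the cluster state can be checked with (a standard command in this Hadoop version):
hadoop dfsadmin -report
It prints the configured capacity and the list of live datanodes.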
If you instead see output like the following, the namenode has not been formatted:
08/01/25 16:31:40 INFO ipc.Client: Retrying connect to server: foo.bar.com/1.1.1.53567. Already tried 1 time(s).
Run hadoop namenode -format and start the cluster again.