Hadoop 1.0.3 Installation Process on CentOS 6.2 [a complete record of a personal installation]

// Install SSH

[root@localhost /]# yum install openssh-server openssh-clients
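
// If sshd is not already running, the standard CentOS 6 service commands will start it and enable it at boot:

[root@localhost /]# service sshd start
[root@localhost /]# chkconfig sshd on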

 

// Generate the key

[root@localhost /]# ssh-keygen    (you can press Enter all the way through; this generates two files: /root/.ssh/id_rsa and /root/.ssh/id_rsa.pub)

 

[root@localhost .ssh]# cd /root/.ssh/

 

// In a real multi-node setup, you copy the public key to the other machine and append it to that machine's authorized_keys file, as sketched below. For a single-node setup, appending it locally is enough:

[root@localhost .ssh]# cat ./id_rsa.pub > ./authorized_keys
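
// For an actual multi-node cluster, one way to push the key to another node (192.168.1.102 is only a placeholder for your second machine):

[root@localhost .ssh]# ssh-copy-id root@192.168.1.102

// Either way, verify that passwordless login works:

[root@localhost .ssh]# ssh localhost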

 

[root@localhost .ssh]# cd /home

// Configure JDK Environment Variables

[root@localhost opt]# vi /etc/profile

 

export JAVA_HOME=/opt/jdk1.6.0_31
export PATH=$JAVA_HOME/bin:$PATH:.

// Make the configuration take effect

[root@localhost opt]# source /etc/profile
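
// A quick check that the JDK is picked up (the version string should show 1.6.0_31):

[root@localhost opt]# java -version
[root@localhost opt]# echo $JAVA_HOME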

 

// Install hadoop 1.0.3

[root@localhost opt]# rpm -i hadoop-1.0.3-1.x86_64.rpm

 

// View the hadoop version number after installation

[root@localhost opt]# hadoop version

 

(If an error is reported, check the JAVA_HOME setting in hadoop-env.sh.)

// Modify the Hadoop configuration files (under /etc/hadoop)

[root@localhost hadoop]# cd /etc/hadoop
[root@localhost hadoop]# vi hadoop-env.sh

 

export JAVA_HOME=/opt/jdk1.6.0_31

 

 

[root@localhost hadoop]# vi core-site.xml

 

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.1.101:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/hadoop</value>
  </property>
</configuration>

 

[root@localhost hadoop]# vi hdfs-site.xml

 

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

 

 

[root@localhost hadoop]# vi mapred-site.xml

 

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>192.168.1.101:9001</value>
  </property>
</configuration>

 

// Format the File System

[root@localhost opt]# hadoop namenode -format

 

 

// Start all Hadoop-related services (the scripts are in /usr/sbin)

[root@localhost sbin]# start-all.sh

// or

[root@localhost opt]# /usr/sbin/start-all.sh

 

(If the scripts cannot be executed, set execute permission on the .sh files under /usr/sbin, as shown below.) The scripts are: start-all.sh, stop-all.sh, start-dfs.sh, stop-dfs.sh, start-mapred.sh, stop-mapred.sh, slaves.sh
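
// One way to grant execute permission on all of them at once:

[root@localhost sbin]# chmod +x /usr/sbin/start-*.sh /usr/sbin/stop-*.sh /usr/sbin/slaves.sh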

 

// Use jps to view the running service processes

[root@localhost hadoop]# jps

 

5131 NameNode
5242 DataNode
5361 SecondaryNameNode
5583 TaskTracker
5463 JobTracker
6714 Jps

 

Ports to open in the firewall: 9000, 9001, 50010 (see the iptables sketch below)
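
// CentOS 6 uses iptables; a minimal sketch for opening these ports (add 50070 and 50030 as well if you browse the web UIs from another machine):

[root@localhost ~]# iptables -I INPUT -p tcp --dport 9000 -j ACCEPT
[root@localhost ~]# iptables -I INPUT -p tcp --dport 9001 -j ACCEPT
[root@localhost ~]# iptables -I INPUT -p tcp --dport 50010 -j ACCEPT
[root@localhost ~]# service iptables save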

 

 

(Check the web UIs at http://192.168.1.101:50070 and http://192.168.1.101:50030)

[root@localhost hadoop]# hadoop dfsadmin -report

 

 

// Prepare input for the wordcount example

[root@localhost opt]# hadoop fs -mkdir input

 

[root@localhost opt]# echo "Hello World Bye World" > file01
[root@localhost opt]# echo "Hello Hadoop Goodbye Hadoop" > file02

 

[root@localhost opt]# hadoop fs -copyFromLocal ./file0* input
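
// A quick sanity check that both files landed in HDFS:

[root@localhost opt]# hadoop fs -ls input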

 

 

// Run the wordcount example

[root@localhost opt]# hadoop jar /usr/share/hadoop/hadoop-examples-1.0.3.jar wordcount input output

 

12/08/11 12:00:30 INFO input.FileInputFormat: Total input paths to process : 2
12/08/11 12:00:30 INFO util.NativeCodeLoader: Loaded the native-hadoop library
12/08/11 12:00:30 WARN snappy.LoadSnappy: Snappy native library not loaded
12/08/11 12:00:31 INFO mapred.JobClient: Running job: job_201208111137_0001
12/08/11 12:00:32 INFO mapred.JobClient:  map 0% reduce 0%
12/08/11 12:01:05 INFO mapred.JobClient:  map 100% reduce 0%
12/08/11 12:01:20 INFO mapred.JobClient:  map 100% reduce 100%
12/08/11 12:01:25 INFO mapred.JobClient: Job complete: job_201208111137_0001
12/08/11 12:01:25 INFO mapred.JobClient: Counters: 29
12/08/11 12:01:25 INFO mapred.JobClient:   Job Counters
12/08/11 12:01:25 INFO mapred.JobClient:     Launched reduce tasks=1
12/08/11 12:01:25 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=49499
12/08/11 12:01:25 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
12/08/11 12:01:25 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
12/08/11 12:01:25 INFO mapred.JobClient:     Launched map tasks=2
12/08/11 12:01:25 INFO mapred.JobClient:     Data-local map tasks=2
12/08/11 12:01:25 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=12839
12/08/11 12:01:25 INFO mapred.JobClient:   File Output Format Counters
12/08/11 12:01:25 INFO mapred.JobClient:     Bytes Written=41
12/08/11 12:01:25 INFO mapred.JobClient:   FileSystemCounters
12/08/11 12:01:25 INFO mapred.JobClient:     FILE_BYTES_READ=79
12/08/11 12:01:25 INFO mapred.JobClient:     HDFS_BYTES_READ=276
12/08/11 12:01:25 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=64705
12/08/11 12:01:25 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=41
12/08/11 12:01:25 INFO mapred.JobClient:   File Input Format Counters
12/08/11 12:01:25 INFO mapred.JobClient:     Bytes Read=50
12/08/11 12:01:25 INFO mapred.JobClient:   Map-Reduce Framework
12/08/11 12:01:25 INFO mapred.JobClient:     Map output materialized bytes=85
12/08/11 12:01:25 INFO mapred.JobClient:     Map input records=2
12/08/11 12:01:25 INFO mapred.JobClient:     Reduce shuffle bytes=85
12/08/11 12:01:25 INFO mapred.JobClient:     Spilled Records=12
12/08/11 12:01:25 INFO mapred.JobClient:     Map output bytes=82
12/08/11 12:01:25 INFO mapred.JobClient:     CPU time spent (ms)=4770
12/08/11 12:01:25 INFO mapred.JobClient:     Total committed heap usage (bytes)=246751232
12/08/11 12:01:25 INFO mapred.JobClient:     Combine input records=8
12/08/11 12:01:25 INFO mapred.JobClient:     SPLIT_RAW_BYTES=226
12/08/11 12:01:25 INFO mapred.JobClient:     Reduce input records=6
12/08/11 12:01:25 INFO mapred.JobClient:     Reduce input groups=5
12/08/11 12:01:25 INFO mapred.JobClient:     Combine output records=6
12/08/11 12:01:25 INFO mapred.JobClient:     Physical memory (bytes) snapshot=391634944
12/08/11 12:01:25 INFO mapred.JobClient:     Reduce output records=5
12/08/11 12:01:25 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=3159781376
12/08/11 12:01:25 INFO mapred.JobClient:     Map output records=8

 

 

// View the result

[root@localhost opt]# hadoop fs -cat output/part-r-00000

 

Bye     1
Goodbye 1
Hadoop  2
Hello   2
World   2
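
// If you want the result as a local file, -get copies it out of HDFS (the target filename here is arbitrary):

[root@localhost opt]# hadoop fs -get output/part-r-00000 ./wordcount-result.txt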

 

 

//---------------------------------------

Job log storage directory:

/var/log/hadoop/root/userlogs/
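
// Each job gets its own subdirectories with per-attempt stdout/stderr/syslog files; to browse them (the exact layout can vary by version):

[root@localhost opt]# ls -R /var/log/hadoop/root/userlogs/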

 

 

// ---------------------------------------
After installing hadoop-1.0.3-1, files are stored in the following locations:
/etc/hadoop
/var/run/hadoop
/var/log/hadoop
/usr/share/doc/hadoop
/usr/etc/hadoop
/usr/bin/hadoop (file)
/usr/include/hadoop

 

 

 

 

 
