Open Source Cloud Computing Technology Series (VII): Cloudera (Hadoop 0.20)

Source: Internet
Author: User

Set up a virtual machine running CentOS 5.3.

Download jdk-6u16-linux-i586-rpm.bin, make it executable, and run it to install the JDK:

[root@hadoop ~]# chmod +x jdk-6u16-linux-i586-rpm.bin

[root@hadoop ~]# ./jdk-6u16-linux-i586-rpm.bin

[root@hadoop ~]# java -version
java version "1.6.0"
OpenJDK Runtime Environment (build 1.6.0-b09)
OpenJDK Client VM (build 1.6.0-b09, mixed mode)
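
The output above reports OpenJDK rather than the JDK 6u16 that was just installed, which suggests that a different java binary comes first on the PATH. As a rough sketch, the two helpers below parse text in the format shown above so that a script can check which runtime it is getting. The function names java_version_from and is_openjdk are made up for this illustration and are not part of any tool. Note that java -version prints to standard error, hence the 2>&1 in the usage comment.

```shell
# Print the quoted version string from `java -version` style text on stdin.
java_version_from() {
    sed -n 's/^.* version "\([^"]*\)".*$/\1/p' | head -n 1
}

# Succeed (exit status 0) when the text on stdin mentions OpenJDK.
is_openjdk() {
    grep -q 'OpenJDK'
}

# Usage (not executed here):
#   java -version 2>&1 | java_version_from
#   java -version 2>&1 | is_openjdk && echo "OpenJDK is first on the PATH"
```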

Add the Cloudera testing repository definition under /etc/yum.repos.d:

[root@hadoop yum.repos.d]# wget http://archive.cloudera.com/redhat/cdh/cloudera-testing.repo

[root@hadoop yum.repos.d]# ls
CentOS-Base.repo  CentOS-Base.repo.bak  CentOS-Media.repo  cloudera-testing.repo

Install Hadoop 0.20:

[root@hadoop ~]# yum install hadoop-0.20 -y
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
Setting up Install Process
Parsing package install arguments
Resolving Dependencies
--> Running transaction check
---> Package hadoop-0.20.noarch 0:0.20.0+69-1 set to be updated
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package          Arch          Version            Repository              Size
================================================================================
Installing:
 hadoop-0.20      noarch        0.20.0+69-1        cloudera-testing        18 M

Transaction Summary
================================================================================
Install      1 Package(s)
Update       0 Package(s)
Remove       0 Package(s)

Total download size: 18 M
Downloading Packages:
hadoop-0.20-0.20.0+69-1.noarch.rpm                       |  18 MB     01:34
Running rpm_check_debug
Running Transaction Test
Finished Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing     : hadoop-0.20                                            [1/1]

Installed: hadoop-0.20.noarch 0:0.20.0+69-1
Complete!

Install the pseudo-distributed configuration package:

[root@hadoop conf]# yum install hadoop-0.20-conf-pseudo -y
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
Setting up Install Process
Parsing package install arguments
Resolving Dependencies
--> Running transaction check
---> Package hadoop-0.20-conf-pseudo.noarch 0:0.20.0+69-1 set to be updated
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package                   Arch      Version        Repository             Size
================================================================================
Installing:
 hadoop-0.20-conf-pseudo   noarch    0.20.0+69-1    cloudera-testing       11 k

Transaction Summary
================================================================================
Install      1 Package(s)
Update       0 Package(s)
Remove       0 Package(s)

Total download size: 11 k
Downloading Packages:
hadoop-0.20-conf-pseudo-0.20.0+69-1.noarch.rpm           |  11 kB     00:00
Running rpm_check_debug
Running Transaction Test
Finished Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing     : hadoop-0.20-conf-pseudo                                [1/1]

Installed: hadoop-0.20-conf-pseudo.noarch 0:0.20.0+69-1
Complete!

After installation, list the files that the package provides:

[root@hadoop conf.pseudo]# rpm -ql hadoop-0.20-conf-pseudo
/etc/hadoop-0.20/conf.pseudo
/etc/hadoop-0.20/conf.pseudo/README
/etc/hadoop-0.20/conf.pseudo/capacity-scheduler.xml
/etc/hadoop-0.20/conf.pseudo/configuration.xsl
/etc/hadoop-0.20/conf.pseudo/core-site.xml
/etc/hadoop-0.20/conf.pseudo/fair-scheduler.xml
/etc/hadoop-0.20/conf.pseudo/hadoop-env.sh
/etc/hadoop-0.20/conf.pseudo/hadoop-metrics.properties
/etc/hadoop-0.20/conf.pseudo/hadoop-policy.xml
/etc/hadoop-0.20/conf.pseudo/hdfs-site.xml
/etc/hadoop-0.20/conf.pseudo/log4j.properties
/etc/hadoop-0.20/conf.pseudo/mapred-site.xml
/etc/hadoop-0.20/conf.pseudo/masters
/etc/hadoop-0.20/conf.pseudo/slaves
/etc/hadoop-0.20/conf.pseudo/ssl-client.xml.example
/etc/hadoop-0.20/conf.pseudo/ssl-server.xml.example
/var/lib/hadoop-0.20
/var/lib/hadoop-0.20/cache

[root@hadoop conf.pseudo]# pwd
/etc/hadoop-0.20/conf.pseudo

The core-site.xml in this directory points the default file system at HDFS on localhost:

[root@hadoop conf.pseudo]# more core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:8020</value>
</property>

<property>
<name>hadoop.tmp.dir</name>
<value>/var/lib/hadoop-0.20/cache/${user.name}</value>
</property>
</configuration>
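
To read a single setting back out of such a file from a script, something like the sketch below may be enough. It assumes the simple layout shown above, with each name and value element on its own line. It is a line-oriented awk filter, not a real XML parser, so it would miss values that span several lines. The function name hadoop_conf_get is hypothetical.

```shell
# Print the <value> that follows a given <name> in a Hadoop *-site.xml file.
# Assumes one element per line, as in the core-site.xml listing above.
hadoop_conf_get() {
    file="$1"
    name="$2"
    awk -v want="$name" '
        /<name>/ {
            n = $0
            gsub(/.*<name>|<\/name>.*/, "", n)   # keep only the text inside <name>
            hit = (n == want)
        }
        /<value>/ && hit {
            v = $0
            gsub(/.*<value>|<\/value>.*/, "", v) # keep only the text inside <value>
            print v
            exit
        }
    ' "$file"
}

# Usage (not executed here):
#   hadoop_conf_get /etc/hadoop-0.20/conf.pseudo/core-site.xml fs.default.name
```

The value of hadoop.tmp.dir is printed literally, including ${user.name}, because that placeholder is expanded by Hadoop and not by the shell.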

Start the Hadoop services:

[root@hadoop conf.pseudo]# for service in /etc/init.d/hadoop-0.20-*
> do
> sudo $service start
> done
Starting Hadoop datanode daemon (hadoop-datanode): starting datanode, logging to /usr/lib/hadoop-0.20/bin/../logs/hadoop-hadoop-datanode-hadoop.out
[  OK  ]
Starting Hadoop jobtracker daemon (hadoop-jobtracker): starting jobtracker, logging to /usr/lib/hadoop-0.20/bin/../logs/hadoop-hadoop-jobtracker-hadoop.out
[  OK  ]
Starting Hadoop namenode daemon (hadoop-namenode): starting namenode, logging to /usr/lib/hadoop-0.20/bin/../logs/hadoop-hadoop-namenode-hadoop.out
[  OK  ]
Starting Hadoop secondarynamenode daemon (hadoop-secondarynamenode): starting secondarynamenode, logging to /usr/lib/hadoop-0.20/bin/../logs/hadoop-hadoop-secondarynamenode-hadoop.out
[  OK  ]
Starting Hadoop tasktracker daemon (hadoop-tasktracker): starting tasktracker, logging to /usr/lib/hadoop-0.20/bin/../logs/hadoop-hadoop-tasktracker-hadoop.out
[  OK  ]
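
The same loop can be wrapped in a small function so that one action is applied to every Hadoop init script in a single call. This is only a sketch: the name hadoop_services and the optional directory argument are additions for illustration, and whether the scripts accept actions other than start (stop, for example) is an assumption based on common init script conventions. It omits sudo because the session above is already running as root.

```shell
# Run one action against every hadoop-0.20-* init script.
# $1: action to pass to each script (the transcript above uses "start")
# $2: directory holding the scripts; defaults to /etc/init.d
hadoop_services() {
    action="$1"
    init_dir="${2:-/etc/init.d}"
    rc=0
    for script in "$init_dir"/hadoop-0.20-*; do
        # Skip the unexpanded pattern when nothing matches, and non-executables.
        [ -x "$script" ] || continue
        "$script" "$action" || rc=1
    done
    return "$rc"
}

# Usage (not executed here):
#   hadoop_services start
#   hadoop_services stop
```

The function returns a non-zero status if any script fails, which makes it easier to use in provisioning scripts than the bare loop.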

Verify that the daemons started. Each of the five services runs as its own Java process owned by the hadoop user (the command lines below are truncated):

hadoop 3503 1  8 18:33 ? 00:00:03 /usr/java/jdk1.6.0_16/bin/java -Xmx1000m -Dcom.sun.manage
hadoop 3577 1 10 18:33 ? 00:00:04 /usr/java/jdk1.6.0_16/bin/java -Xmx1000m -Dcom.sun.manage
hadoop 3657 1 15 18:33 ? 00:00:05 /usr/java/jdk1.6.0_16/bin/java -Xmx1000m -Dcom.sun.manage
hadoop 3734 1 11 18:33 ? 00:00:04 /usr/java/jdk1.6.0_16/bin/java -Xmx1000m -Dcom.sun.manage
hadoop 3827 1  7 18:33 ? 00:00:02 /usr/java/jdk1.6.0_16/bin/java -Xmx1000m -Dhadoop.log.di
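
The command that produced this listing is not shown. The columns match the layout of ps -ef, so the sketch below assumes that format. It counts the Java processes owned by a given user, which gives a quick check that all five daemons came up. The function name count_java_procs is hypothetical.

```shell
# Count lines of `ps -ef` style input on stdin whose first column is the
# given user and whose command line contains "/bin/java".
count_java_procs() {
    user="$1"
    awk -v u="$user" '$1 == u && /\/bin\/java/ { n++ } END { print n + 0 }'
}

# Usage (not executed here); five daemons were started above:
#   [ "$(ps -ef | count_java_procs hadoop)" -eq 5 ] && echo "all daemons running"
```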

Test the installation with an example. First copy the configuration files into HDFS as input:

[root@hadoop conf.pseudo]# hadoop-0.20 fs -mkdir input
[root@hadoop conf.pseudo]# hadoop-0.20 fs -put /etc/hadoop-0.20/conf/*.xml input
[root@hadoop conf.pseudo]# hadoop-0.20 fs -ls input
Found 6 items
-rw-r--r--   1 root supergroup       6275 2009-08-25 18:34 /user/root/input/capacity-scheduler.xml
-rw-r--r--   1 root supergroup        338 2009-08-25 18:34 /user/root/input/core-site.xml
-rw-r--r--   1 root supergroup       3032 2009-08-25 18:34 /user/root/input/fair-scheduler.xml
-rw-r--r--   1 root supergroup       4190 2009-08-25 18:34 /user/root/input/hadoop-policy.xml
-rw-r--r--   1 root supergroup        496 2009-08-25 18:34 /user/root/input/hdfs-site.xml
-rw-r--r--   1 root supergroup        213 2009-08-25 18:34 /user/root/input/mapred-site.xml

Then run the bundled grep example, which counts the strings in the input files that match the regular expression dfs[a-z.]+ and writes the result to the output directory. The log shows two MapReduce jobs running in sequence:

[root@hadoop conf.pseudo]# hadoop-0.20 jar /usr/lib/hadoop-0.20/hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
09/08/25 18:34:59 INFO mapred.FileInputFormat: Total input paths to process : 6
09/08/25 18:35:00 INFO mapred.JobClient: Running job: job_200908251833_0001
09/08/25 18:35:01 INFO mapred.JobClient:  map 0% reduce 0%
09/08/25 18:35:20 INFO mapred.JobClient:  map 33% reduce 0%
09/08/25 18:35:33 INFO mapred.JobClient:  map 66% reduce 11%
09/08/25 18:35:42 INFO mapred.JobClient:  map 66% reduce 22%
09/08/25 18:35:45 INFO mapred.JobClient:  map 100% reduce 22%
09/08/25 18:35:57 INFO mapred.JobClient:  map 100% reduce 100%
09/08/25 18:35:59 INFO mapred.JobClient: Job complete: job_200908251833_0001
09/08/25 18:35:59 INFO mapred.JobClient: Counters: 18
09/08/25 18:35:59 INFO mapred.JobClient:   Job Counters
09/08/25 18:35:59 INFO mapred.JobClient:     Launched reduce tasks=1
09/08/25 18:35:59 INFO mapred.JobClient:     Launched map tasks=6
09/08/25 18:35:59 INFO mapred.JobClient:     Data-local map tasks=6
09/08/25 18:35:59 INFO mapred.JobClient:   FileSystemCounters
09/08/25 18:35:59 INFO mapred.JobClient:     FILE_BYTES_READ=100
09/08/25 18:35:59 INFO mapred.JobClient:     HDFS_BYTES_READ=14544
09/08/25 18:35:59 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=422
09/08/25 18:35:59 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=204
09/08/25 18:35:59 INFO mapred.JobClient:   Map-Reduce Framework
09/08/25 18:35:59 INFO mapred.JobClient:     Reduce input groups=4
09/08/25 18:35:59 INFO mapred.JobClient:     Combine output records=4
09/08/25 18:35:59 INFO mapred.JobClient:     Map input records=364
09/08/25 18:35:59 INFO mapred.JobClient:     Reduce shuffle bytes=124
09/08/25 18:35:59 INFO mapred.JobClient:     Reduce output records=4
09/08/25 18:35:59 INFO mapred.JobClient:     Spilled Records=8
09/08/25 18:35:59 INFO mapred.JobClient:     Map output bytes=86
09/08/25 18:35:59 INFO mapred.JobClient:     Map input bytes=14544
09/08/25 18:35:59 INFO mapred.JobClient:     Combine input records=4
09/08/25 18:35:59 INFO mapred.JobClient:     Map output records=4
09/08/25 18:35:59 INFO mapred.JobClient:     Reduce input records=4
09/08/25 18:35:59 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
09/08/25 18:35:59 INFO mapred.FileInputFormat: Total input paths to process : 1
09/08/25 18:36:00 INFO mapred.JobClient: Running job: job_200908251833_0002
09/08/25 18:36:01 INFO mapred.JobClient:  map 0% reduce 0%
09/08/25 18:36:12 INFO mapred.JobClient:  map 100% reduce 0%
09/08/25 18:36:24 INFO mapred.JobClient:  map 100% reduce 100%
09/08/25 18:36:26 INFO mapred.JobClient: Job complete: job_200908251833_0002
09/08/25 18:36:26 INFO mapred.JobClient: Counters: 18
09/08/25 18:36:26 INFO mapred.JobClient:   Job Counters
09/08/25 18:36:26 INFO mapred.JobClient:     Launched reduce tasks=1
09/08/25 18:36:26 INFO mapred.JobClient:     Launched map tasks=1
09/08/25 18:36:26 INFO mapred.JobClient:     Data-local map tasks=1
09/08/25 18:36:26 INFO mapred.JobClient:   FileSystemCounters
09/08/25 18:36:26 INFO mapred.JobClient:     FILE_BYTES_READ=100
09/08/25 18:36:26 INFO mapred.JobClient:     HDFS_BYTES_READ=204
09/08/25 18:36:26 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=232
09/08/25 18:36:26 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=62
09/08/25 18:36:26 INFO mapred.JobClient:   Map-Reduce Framework
09/08/25 18:36:26 INFO mapred.JobClient:     Reduce input groups=1
09/08/25 18:36:26 INFO mapred.JobClient:     Combine output records=0
09/08/25 18:36:26 INFO mapred.JobClient:     Map input records=4
09/08/25 18:36:26 INFO mapred.JobClient:     Reduce shuffle bytes=0
09/08/25 18:36:26 INFO mapred.JobClient:     Reduce output records=4
09/08/25 18:36:26 INFO mapred.JobClient:     Spilled Records=8
09/08/25 18:36:26 INFO mapred.JobClient:     Map output bytes=86
09/08/25 18:36:26 INFO mapred.JobClient:     Map input bytes=118
09/08/25 18:36:26 INFO mapred.JobClient:     Combine input records=0
09/08/25 18:36:26 INFO mapred.JobClient:     Map output records=4
09/08/25 18:36:26 INFO mapred.JobClient:     Reduce input records=4

[root@hadoop conf.pseudo]# hadoop-0.20 fs -ls
Found 2 items
drwxr-xr-x   - root supergroup          0 2009-08-25 18:34 /user/root/input
drwxr-xr-x   - root supergroup          0 2009-08-25 18:36 /user/root/output

[root@hadoop conf.pseudo]# hadoop-0.20 fs -ls output
Found 2 items
drwxr-xr-x   - root supergroup          0 2009-08-25 18:36 /user/root/output/_logs
-rw-r--r--   1 root supergroup            2009-08-25 18:36 /user/root/output/part-00000

[root@hadoop conf.pseudo]# hadoop-0.20 fs -cat output/part-00000 | head
1       dfs.name.dir
1       dfs.permissions
1       dfs.replication
1       dfsadmin
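
As a sanity check, a similar count can be produced locally with standard tools, without going through MapReduce. The sketch below is a local approximation and not how the Hadoop example is implemented: it extracts every match of an extended regular expression from the given files, counts the occurrences of each distinct match, and prints them with the highest count first. The function name count_matches is hypothetical.

```shell
# Count distinct regex matches across files, highest count first.
# $1: extended regular expression; remaining arguments: files to search.
# Output format: count<TAB>match, similar to part-00000 above.
count_matches() {
    pattern="$1"
    shift
    grep -ohE -- "$pattern" "$@" \
        | LC_ALL=C sort \
        | uniq -c \
        | awk '{ print $1 "\t" $2 }' \
        | LC_ALL=C sort -k1,1nr -k2,2
}

# Usage (not executed here):
#   count_matches 'dfs[a-z.]+' /etc/hadoop-0.20/conf/*.xml
```

LC_ALL=C is set on the sort calls so that ties are ordered bytewise and the result does not depend on the system locale.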
