Open Source Cloud Computing Technology Series (7): Cloudera (Hadoop 0.20)


Set up a virtual machine running CentOS 5.3.

Download jdk-6u16-linux-i586-rpm.bin, then make it executable and run it:

[root@hadoop ~]# chmod +x jdk-6u16-linux-i586-rpm.bin

[root@hadoop ~]# ./jdk-6u16-linux-i586-rpm.bin

[root@hadoop ~]# java -version
java version "1.6.0"
OpenJDK Runtime Environment (build 1.6.0-b09)
OpenJDK Client VM (build 1.6.0-b09, mixed mode)

[root@hadoop yum.repos.d]# wget http://archive.cloudera.com/redhat/cdh/cloudera-testing.repo

[root@hadoop yum.repos.d]# ls
CentOS-Base.repo  CentOS-Base.repo.bak  CentOS-Media.repo  cloudera-testing.repo

[root@hadoop ~]# yum install hadoop-0.20 -y
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
Setting up Install Process
Parsing package install arguments
Resolving Dependencies
--> Running transaction check
---> Package hadoop-0.20.noarch 0:0.20.0+69-1 set to be updated
--> Finished Dependency Resolution

Dependencies Resolved

=========================================================================================================
Package                  Arch                Version                    Repository                     Size
=========================================================================================================
Installing:
hadoop-0.20              noarch              0.20.0+69-1                cloudera-testing               18 M

Transaction Summary
=========================================================================================================
Install      1 Package(s)         
Update       0 Package(s)         
Remove       0 Package(s)         

Total download size: 18 M
Downloading Packages:
hadoop-0.20-0.20.0+69-1.noarch.rpm                                                |  18 MB     01:34
Running rpm_check_debug
Running Transaction Test
Finished Transaction Test
Transaction Test Succeeded
Running Transaction
Installing     : hadoop-0.20                                       [1/1]

Installed: hadoop-0.20.noarch 0:0.20.0+69-1
Complete!

[root@hadoop conf]# yum install hadoop-0.20-conf-pseudo -y
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
Setting up Install Process
Parsing package install arguments
Resolving Dependencies
--> Running transaction check
---> Package hadoop-0.20-conf-pseudo.noarch 0:0.20.0+69-1 set to be updated
--> Finished Dependency Resolution

Dependencies Resolved

=========================================================================================================
Package                           Arch             Version                Repository                 Size
=========================================================================================================
Installing:
hadoop-0.20-conf-pseudo          noarch           0.20.0+69-1             cloudera-testing           11 k

Transaction Summary
=========================================================================================================
Install      1 Package(s)         
Update       0 Package(s)         
Remove       0 Package(s)         

Total download size: 11 k
Downloading Packages:
hadoop-0.20-conf-pseudo-0.20.0+69-1.noarch.rpm                                    |  11 kB     00:00
Running rpm_check_debug
Running Transaction Test
Finished Transaction Test
Transaction Test Succeeded
Running Transaction
Installing     : hadoop-0.20-conf-pseudo                           [1/1]

Installed: hadoop-0.20-conf-pseudo.noarch 0:0.20.0+69-1
Complete!

After installation, the files installed by the package can be found in the following directories:

[root@hadoop conf.pseudo]# rpm -ql hadoop-0.20-conf-pseudo
/etc/hadoop-0.20/conf.pseudo
/etc/hadoop-0.20/conf.pseudo/README
/etc/hadoop-0.20/conf.pseudo/capacity-scheduler.xml
/etc/hadoop-0.20/conf.pseudo/configuration.xsl
/etc/hadoop-0.20/conf.pseudo/core-site.xml
/etc/hadoop-0.20/conf.pseudo/fair-scheduler.xml
/etc/hadoop-0.20/conf.pseudo/hadoop-env.sh
/etc/hadoop-0.20/conf.pseudo/hadoop-metrics.properties
/etc/hadoop-0.20/conf.pseudo/hadoop-policy.xml
/etc/hadoop-0.20/conf.pseudo/hdfs-site.xml
/etc/hadoop-0.20/conf.pseudo/log4j.properties
/etc/hadoop-0.20/conf.pseudo/mapred-site.xml
/etc/hadoop-0.20/conf.pseudo/masters
/etc/hadoop-0.20/conf.pseudo/slaves
/etc/hadoop-0.20/conf.pseudo/ssl-client.xml.example
/etc/hadoop-0.20/conf.pseudo/ssl-server.xml.example
/var/lib/hadoop-0.20
/var/lib/hadoop-0.20/cache

[root@hadoop conf.pseudo]# pwd
/etc/hadoop-0.20/conf.pseudo

[root@hadoop conf.pseudo]# more core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:8020</value>
</property>

<property>
<name>hadoop.tmp.dir</name>
<value>/var/lib/hadoop-0.20/cache/${user.name}</value>
</property>
</configuration>
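
The properties in this file can also be read from a script. The following is a minimal sketch, assuming each <name> and <value> element sits on its own line as in the listing above; the function name get_hadoop_property is hypothetical and not part of Hadoop, and the code is a line-oriented filter rather than a real XML parser.

```shell
#!/bin/sh
# Print the <value> of a named property from a Hadoop-style *-site.xml file.
# Assumption: <name> and <value> each occupy their own line (as in the
# core-site.xml shown above). This is not a general XML parser.
get_hadoop_property() {
    conf_file=$1
    prop_name=$2
    awk -v want="$prop_name" '
        /<name>/ {
            # Remember the most recent property name.
            line = $0
            sub(/.*<name>/, "", line)
            sub(/<\/name>.*/, "", line)
            current = line
        }
        /<value>/ && current == want {
            # Print the value that follows the requested name, then stop.
            line = $0
            sub(/.*<value>/, "", line)
            sub(/<\/value>.*/, "", line)
            print line
            exit
        }
    ' "$conf_file"
}

# Example:
#   get_hadoop_property /etc/hadoop-0.20/conf.pseudo/core-site.xml fs.default.name
```

For the file shown above, asking for fs.default.name should print the NameNode URI, hdfs://localhost:8020.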

Start the Hadoop services:

[root@hadoop conf.pseudo]# for service in /etc/init.d/hadoop-0.20-*
> do
> sudo $service start
> done
Starting Hadoop datanode daemon (hadoop-datanode): starting datanode, logging to /usr/lib/hadoop-0.20/bin/../logs/hadoop-hadoop-datanode-hadoop.out
                                                           [  OK  ]
Starting Hadoop jobtracker daemon (hadoop-jobtracker): starting jobtracker, logging to /usr/lib/hadoop-0.20/bin/../logs/hadoop-hadoop-jobtracker-hadoop.out
                                                           [  OK  ]
Starting Hadoop namenode daemon (hadoop-namenode): starting namenode, logging to /usr/lib/hadoop-0.20/bin/../logs/hadoop-hadoop-namenode-hadoop.out
                                                           [  OK  ]
Starting Hadoop secondarynamenode daemon (hadoop-secondarynamenode): starting secondarynamenode, logging to /usr/lib/hadoop-0.20/bin/../logs/hadoop-hadoop-secondarynamenode-hadoop.out
                                                           [  OK  ]
Starting Hadoop tasktracker daemon (hadoop-tasktracker): starting tasktracker, logging to /usr/lib/hadoop-0.20/bin/../logs/hadoop-hadoop-tasktracker-hadoop.out
                                                           [  OK  ]
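
The same loop can be wrapped in a small function so that other actions, such as stopping the daemons, reuse it. This is a sketch only: the function name hadoop_services and the INIT_DIR variable are hypothetical, and it assumes the init scripts accept the usual stop action in addition to the start action shown above. It is meant to be run as root, so sudo is omitted.

```shell
#!/bin/sh
# Run one action (start, stop, ...) against every Hadoop 0.20 init script.
# INIT_DIR and hadoop_services are names invented for this sketch; the
# source only shows the "start" loop over /etc/init.d/hadoop-0.20-*.
INIT_DIR=${INIT_DIR:-/etc/init.d}

hadoop_services() {
    action=$1
    for service in "$INIT_DIR"/hadoop-0.20-*; do
        # Skip the unexpanded glob pattern when no init script matches.
        [ -x "$service" ] || continue
        "$service" "$action" || echo "failed: $service $action" >&2
    done
}

# Example (as root):
#   hadoop_services stop
```

Keeping the directory in a variable makes it possible to try the function against a scratch directory of dummy scripts before pointing it at /etc/init.d.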

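Verify that the daemons started successfully (the process list shows five Java processes running as the hadoop user):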

hadoop    3503     1  8 18:33 ?        00:00:03 /usr/java/jdk1.6.0_16/bin/java -Xmx1000m -Dcom.sun.manage
hadoop    3577     1 10 18:33 ?        00:00:04 /usr/java/jdk1.6.0_16/bin/java -Xmx1000m -Dcom.sun.manage
hadoop    3657     1 15 18:33 ?        00:00:05 /usr/java/jdk1.6.0_16/bin/java -Xmx1000m -Dcom.sun.manage
hadoop    3734     1 11 18:33 ?        00:00:04 /usr/java/jdk1.6.0_16/bin/java -Xmx1000m -Dcom.sun.manage
hadoop    3827     1  7 18:33 ?        00:00:02 /usr/java/jdk1.6.0_16/bin/java -Xmx1000m -Dhadoop.log.di
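
A check like this can be scripted. The sketch below counts Java processes owned by a given user in a process listing read from standard input. It assumes the eight-column `ps -ef` layout seen in the listing above (user first, command eighth); the function name count_java_procs is hypothetical.

```shell
#!/bin/sh
# Count Java processes owned by a user in `ps -ef`-style input on stdin.
# Assumption: column 1 is the user and column 8 is the executable path,
# as in the listing above. count_java_procs is a name made up for this sketch.
count_java_procs() {
    user=${1:-hadoop}
    awk -v user="$user" '$1 == user && $8 ~ /\/java$/ { n++ } END { print n + 0 }'
}

# Example:
#   ps -ef | count_java_procs hadoop
```

With the five daemons started above (datanode, jobtracker, namenode, secondarynamenode, tasktracker), the expected count is 5.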

Run a few test examples:

[root@hadoop conf.pseudo]# hadoop-0.20 fs -mkdir input
[root@hadoop conf.pseudo]# hadoop-0.20 fs -put /etc/hadoop-0.20/conf/*.xml input
[root@hadoop conf.pseudo]# hadoop-0.20 fs -ls input
Found 6 items
-rw-r--r--   1 root supergroup       6275 2009-08-25 18:34 /user/root/input/capacity-scheduler.xml
-rw-r--r--   1 root supergroup        338 2009-08-25 18:34 /user/root/input/core-site.xml
-rw-r--r--   1 root supergroup       3032 2009-08-25 18:34 /user/root/input/fair-scheduler.xml
-rw-r--r--   1 root supergroup       4190 2009-08-25 18:34 /user/root/input/hadoop-policy.xml
-rw-r--r--   1 root supergroup        496 2009-08-25 18:34 /user/root/input/hdfs-site.xml
-rw-r--r--   1 root supergroup        213 2009-08-25 18:34 /user/root/input/mapred-site.xml

[root@hadoop conf.pseudo]# hadoop-0.20 jar /usr/lib/hadoop-0.20/hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
09/08/25 18:34:59 INFO mapred.FileInputFormat: Total input paths to process : 6
09/08/25 18:35:00 INFO mapred.JobClient: Running job: job_200908251833_0001
09/08/25 18:35:01 INFO mapred.JobClient:  map 0% reduce 0%
09/08/25 18:35:20 INFO mapred.JobClient:  map 33% reduce 0%
09/08/25 18:35:33 INFO mapred.JobClient:  map 66% reduce 11%
09/08/25 18:35:42 INFO mapred.JobClient:  map 66% reduce 22%
09/08/25 18:35:45 INFO mapred.JobClient:  map 100% reduce 22%
09/08/25 18:35:57 INFO mapred.JobClient:  map 100% reduce 100%
09/08/25 18:35:59 INFO mapred.JobClient: Job complete: job_200908251833_0001
09/08/25 18:35:59 INFO mapred.JobClient: Counters: 18
09/08/25 18:35:59 INFO mapred.JobClient:   Job Counters
09/08/25 18:35:59 INFO mapred.JobClient:     Launched reduce tasks=1
09/08/25 18:35:59 INFO mapred.JobClient:     Launched map tasks=6
09/08/25 18:35:59 INFO mapred.JobClient:     Data-local map tasks=6
09/08/25 18:35:59 INFO mapred.JobClient:   FileSystemCounters
09/08/25 18:35:59 INFO mapred.JobClient:     FILE_BYTES_READ=100
09/08/25 18:35:59 INFO mapred.JobClient:     HDFS_BYTES_READ=14544
09/08/25 18:35:59 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=422
09/08/25 18:35:59 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=204
09/08/25 18:35:59 INFO mapred.JobClient:   Map-Reduce Framework
09/08/25 18:35:59 INFO mapred.JobClient:     Reduce input groups=4
09/08/25 18:35:59 INFO mapred.JobClient:     Combine output records=4
09/08/25 18:35:59 INFO mapred.JobClient:     Map input records=364
09/08/25 18:35:59 INFO mapred.JobClient:     Reduce shuffle bytes=124
09/08/25 18:35:59 INFO mapred.JobClient:     Reduce output records=4
09/08/25 18:35:59 INFO mapred.JobClient:     Spilled Records=8
09/08/25 18:35:59 INFO mapred.JobClient:     Map output bytes=86
09/08/25 18:35:59 INFO mapred.JobClient:     Map input bytes=14544
09/08/25 18:35:59 INFO mapred.JobClient:     Combine input records=4
09/08/25 18:35:59 INFO mapred.JobClient:     Map output records=4
09/08/25 18:35:59 INFO mapred.JobClient:     Reduce input records=4
09/08/25 18:35:59 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
09/08/25 18:35:59 INFO mapred.FileInputFormat: Total input paths to process : 1
09/08/25 18:36:00 INFO mapred.JobClient: Running job: job_200908251833_0002
09/08/25 18:36:01 INFO mapred.JobClient:  map 0% reduce 0%
09/08/25 18:36:12 INFO mapred.JobClient:  map 100% reduce 0%
09/08/25 18:36:24 INFO mapred.JobClient:  map 100% reduce 100%
09/08/25 18:36:26 INFO mapred.JobClient: Job complete: job_200908251833_0002
09/08/25 18:36:26 INFO mapred.JobClient: Counters: 18
09/08/25 18:36:26 INFO mapred.JobClient:   Job Counters
09/08/25 18:36:26 INFO mapred.JobClient:     Launched reduce tasks=1
09/08/25 18:36:26 INFO mapred.JobClient:     Launched map tasks=1
09/08/25 18:36:26 INFO mapred.JobClient:     Data-local map tasks=1
09/08/25 18:36:26 INFO mapred.JobClient:   FileSystemCounters
09/08/25 18:36:26 INFO mapred.JobClient:     FILE_BYTES_READ=100
09/08/25 18:36:26 INFO mapred.JobClient:     HDFS_BYTES_READ=204
09/08/25 18:36:26 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=232
09/08/25 18:36:26 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=62
09/08/25 18:36:26 INFO mapred.JobClient:   Map-Reduce Framework
09/08/25 18:36:26 INFO mapred.JobClient:     Reduce input groups=1
09/08/25 18:36:26 INFO mapred.JobClient:     Combine output records=0
09/08/25 18:36:26 INFO mapred.JobClient:     Map input records=4
09/08/25 18:36:26 INFO mapred.JobClient:     Reduce shuffle bytes=0
09/08/25 18:36:26 INFO mapred.JobClient:     Reduce output records=4
09/08/25 18:36:26 INFO mapred.JobClient:     Spilled Records=8
09/08/25 18:36:26 INFO mapred.JobClient:     Map output bytes=86
09/08/25 18:36:26 INFO mapred.JobClient:     Map input bytes=118
09/08/25 18:36:26 INFO mapred.JobClient:     Combine input records=0
09/08/25 18:36:26 INFO mapred.JobClient:     Map output records=4
09/08/25 18:36:26 INFO mapred.JobClient:     Reduce input records=4

[root@hadoop conf.pseudo]# hadoop-0.20 fs -ls
Found 2 items
drwxr-xr-x   - root supergroup          0 2009-08-25 18:34 /user/root/input
drwxr-xr-x   - root supergroup          0 2009-08-25 18:36 /user/root/output

[root@hadoop conf.pseudo]# hadoop-0.20 fs -ls output
Found 2 items
drwxr-xr-x   - root supergroup          0 2009-08-25 18:36 /user/root/output/_logs
-rw-r--r--   1 root supergroup         62 2009-08-25 18:36 /user/root/output/part-00000

[root@hadoop conf.pseudo]# hadoop-0.20 fs -cat output/part-00000 | head
1       dfs.name.dir
1       dfs.permissions
1       dfs.replication
1       dfsadmin
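
To see what the grep example computes, it can help to run the same kind of count on local files without Hadoop. The sketch below is a single-machine analogue, not the example's actual implementation: it extracts every match of the pattern, counts each distinct match, and lists the most frequent first. It assumes a grep that supports the -o and -h options (GNU grep does), and the function name local_grep_count is hypothetical.

```shell
#!/bin/sh
# Local analogue of the Hadoop "grep" example: extract every match of an
# extended regular expression from the given files, count each distinct
# match, and print the counts in descending order.
# local_grep_count is a name invented for this sketch.
local_grep_count() {
    pattern=$1
    shift
    # -o prints only the matched text, -h suppresses file names,
    # -E selects extended regular expressions.
    grep -ohE -e "$pattern" "$@" | sort | uniq -c | sort -k1,1nr -k2
}

# Example:
#   local_grep_count 'dfs[a-z.]+' /etc/hadoop-0.20/conf/*.xml
```

Each output line is a count followed by the matched string, which is the same layout as part-00000 above. Exact results may differ from the MapReduce job where Java and grep regular expression syntax diverge, although this particular pattern is written the same way in both.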
