Original article. If you reprint it, please credit 工學1號館.
You are welcome to follow my personal blog, www.wuyudong.com, for more articles on cloud computing and big data.
Unlike version 0.20.2, hadoop-1.0 ships no ready-made eclipse-plugin package; instead, the plugin's source is placed under HADOOP_HOME/src/contrib/eclipse-plugin. In this article I record in detail how I compiled that source to produce an eclipse plugin usable with Hadoop 1.0.
1. Installation environment
Operating system: Ubuntu 14.04
Software:
eclipse
java
Hadoop 1.0
2. Build steps
(1) First, download the ant and ivy packages.
Extract both packages to a directory of your choice, then copy ivy-2.2.0.jar from the ivy package into the lib directory of the ant installation. Next, add the following lines to /etc/profile to set up the environment:
export ANT_HOME=/home/wu/opt/apache-ant-1.8.3
export PATH="$ANT_HOME/bin:$PATH"
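The steps above can be sketched as a short shell session. The archive names and target directory are assumptions derived from the ANT_HOME path used in this article; adjust them to wherever you downloaded the packages:

```shell
# Assumed archive names and install prefix (matching ANT_HOME above):
# tar xzf apache-ant-1.8.3-bin.tar.gz -C /home/wu/opt
# tar xzf apache-ivy-2.2.0-bin.tar.gz -C /home/wu/opt
# cp /home/wu/opt/apache-ivy-2.2.0/ivy-2.2.0.jar /home/wu/opt/apache-ant-1.8.3/lib/

# The same environment settings as in /etc/profile:
ANT_HOME=/home/wu/opt/apache-ant-1.8.3
PATH="$ANT_HOME/bin:$PATH"

# The first PATH entry is now ant's bin directory:
echo "${PATH%%:*}"
```

After sourcing /etc/profile (or opening a new terminal), `ant -version` should print the ant version, confirming the setup.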
(2) In a terminal, change to the hadoop installation directory and run ant compile. The result looks like this:
……………………
compile:
[echo] contrib: vaidya
[javac] /home/wu/opt/hadoop-1.0.1/src/contrib/build-contrib.xml:185: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds
[javac] Compiling 14 source files to /home/wu/opt/hadoop-1.0.1/build/contrib/vaidya/classes
[javac] Note: /home/wu/opt/hadoop-1.0.1/src/contrib/vaidya/src/java/org/apache/hadoop/vaidya/statistics/job/JobStatistics.java uses unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.
compile-ant-tasks:
[javac] /home/wu/opt/hadoop-1.0.1/build.xml:2170: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds
[javac] Compiling 5 source files to /home/wu/opt/hadoop-1.0.1/build/ant
compile:
BUILD SUCCESSFUL
Total time: 12 minutes 29 seconds
As you can see, the build succeeds. It takes quite a while, so this is a good moment to brew a pot of tea.
(3) Next, change the terminal to HADOOP_HOME/src/contrib/eclipse-plugin and run the following command:
ant -Declipse.home=/home/wu/opt/eclipse -Dversion=1.0.1 jar
When the build finishes, you will find the eclipse plugin.
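A sketch of locating and installing the freshly built plugin. The jar path below follows the usual contrib build layout for version 1.0.1 and the `-clean` restart is a common eclipse tip; neither is stated explicitly in the article, so treat both as assumptions:

```shell
# Assumed output location of the contrib build (relative to HADOOP_HOME):
PLUGIN_JAR=build/contrib/eclipse-plugin/hadoop-eclipse-plugin-1.0.1.jar

# Copy it into eclipse's plugins directory and restart eclipse:
# cp "$PLUGIN_JAR" /home/wu/opt/eclipse/plugins/
# /home/wu/opt/eclipse/eclipse -clean   # -clean makes eclipse rescan its plugins

echo "$PLUGIN_JAR"
```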
3. Installation steps
(1) The pseudo-distributed configuration is also simple: only a few files need to be changed, all of them in the conf folder of the distribution. I will not go through every detail; here is my configuration:
core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/wu/hadoop-0.20.2/tmp</value>
  </property>
</configuration>
hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
mapred-site.xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hdfs://localhost:9001</value>
  </property>
</configuration>
Then, still in the conf folder, edit hadoop-env.sh: uncomment the JAVA_HOME line and point it at the correct Java installation path.
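After the edit, hadoop-env.sh contains a line like the following. The JDK path shown is only an assumption for illustration; use the actual location of your Java installation:

```shell
# conf/hadoop-env.sh -- uncommented and pointed at an assumed JDK location:
export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-amd64
```

If JAVA_HOME is left commented out, the start-up scripts will complain that JAVA_HOME is not set.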
(2) Running hadoop
Change to the hadoop directory. On the first run you must format the file system:
bin/hadoop namenode -format
Then start all the daemons:
bin/start-all.sh
To shut hadoop down, use:
bin/stop-all.sh
Finally, verify that hadoop is working: open a browser and visit:
http://localhost:50030/ (the MapReduce web page)
http://localhost:50070/ (the HDFS web page)
Run the jps command to see which java processes are running; if the list looks like the following, everything is normal:
$ jps
4113 SecondaryNameNode
4318 TaskTracker
3984 DataNode
3429
3803 NameNode
4187 JobTracker
4415 Jps
With the system up and running, let us run a program:
$ mkdir input
$ cd input
$ echo "hello world" > test1.txt
$ echo "hello hadoop" > test2.txt
$ cd ..
$ bin/hadoop dfs -put input in
$ bin/hadoop jar hadoop-examples-1.0.1.jar wordcount in out
$ bin/hadoop dfs -cat out/*
A long run log appears:
****hdfs://localhost:9000/user/wu/in
15/05/29 10:51:41 INFO input.FileInputFormat: Total input paths to process : 2
15/05/29 10:51:42 INFO mapred.JobClient: Running job: job_201505291029_0001
15/05/29 10:51:43 INFO mapred.JobClient: map 0% reduce 0%
15/05/29 10:52:13 INFO mapred.JobClient: map 100% reduce 0%
15/05/29 10:52:34 INFO mapred.JobClient: map 100% reduce 100%
15/05/29 10:52:39 INFO mapred.JobClient: Job complete: job_201505291029_0001
15/05/29 10:52:39 INFO mapred.JobClient: Counters: 29
15/05/29 10:52:39 INFO mapred.JobClient: Job Counters
15/05/29 10:52:39 INFO mapred.JobClient: Launched reduce tasks=1
15/05/29 10:52:39 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=43724
15/05/29 10:52:39 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
15/05/29 10:52:39 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
15/05/29 10:52:39 INFO mapred.JobClient: Launched map tasks=2
15/05/29 10:52:39 INFO mapred.JobClient: Data-local map tasks=2
15/05/29 10:52:39 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=20072
15/05/29 10:52:39 INFO mapred.JobClient: File Output Format Counters
15/05/29 10:52:39 INFO mapred.JobClient: Bytes Written=25
15/05/29 10:52:39 INFO mapred.JobClient: FileSystemCounters
15/05/29 10:52:39 INFO mapred.JobClient: FILE_BYTES_READ=55
15/05/29 10:52:39 INFO mapred.JobClient: HDFS_BYTES_READ=239
15/05/29 10:52:39 INFO mapred.JobClient: FILE_BYTES_WRITTEN=64837
15/05/29 10:52:39 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=25
15/05/29 10:52:39 INFO mapred.JobClient: File Input Format Counters
15/05/29 10:52:39 INFO mapred.JobClient: Bytes Read=25
15/05/29 10:52:39 INFO mapred.JobClient: Map-Reduce Framework
15/05/29 10:52:39 INFO mapred.JobClient: Map output materialized bytes=61
15/05/29 10:52:39 INFO mapred.JobClient: Map input records=2
15/05/29 10:52:39 INFO mapred.JobClient: Reduce shuffle bytes=61
15/05/29 10:52:39 INFO mapred.JobClient: Spilled Records=8
15/05/29 10:52:39 INFO mapred.JobClient: Map output bytes=41
15/05/29 10:52:39 INFO mapred.JobClient: CPU time spent (ms)=7330
15/05/29 10:52:39 INFO mapred.JobClient: Total committed heap usage (bytes)=247275520
15/05/29 10:52:39 INFO mapred.JobClient: Combine input records=4
15/05/29 10:52:39 INFO mapred.JobClient: SPLIT_RAW_BYTES=214
15/05/29 10:52:39 INFO mapred.JobClient: Reduce input records=4
15/05/29 10:52:39 INFO mapred.JobClient: Reduce input groups=3
15/05/29 10:52:39 INFO mapred.JobClient: Combine output records=4
15/05/29 10:52:39 INFO mapred.JobClient: Physical memory (bytes) snapshot=338845696
15/05/29 10:52:39 INFO mapred.JobClient: Reduce output records=3
15/05/29 10:52:39 INFO mapred.JobClient: Virtual memory (bytes) snapshot=1139433472
15/05/29 10:52:39 INFO mapred.JobClient: Map output records=4
Check the out folder:
$ bin/hadoop dfs -cat out/*
hadoop 1
hello 2
world 1
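As a sanity check, the same counts can be reproduced locally with standard Unix tools, without touching HDFS at all:

```shell
# Recreate the contents of test1.txt and test2.txt and count words the Unix way:
printf 'hello world\nhello hadoop\n' | tr ' ' '\n' | sort | uniq -c
# Counts: 1 hadoop, 2 hello, 1 world -- matching the MapReduce result above.
```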