Detailed Steps for Setting Up a Hadoop 2.x Pseudo-Distributed Environment

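This article walks through the complete process of setting up a Hadoop 2.x pseudo-distributed environment, for your reference. The steps are as follows.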

1. Edit hadoop-env.sh, yarn-env.sh, and mapred-env.sh

Method: open these three files with Notepad++ (as the beifeng user).

Add the line: export JAVA_HOME=/opt/modules/jdk1.7.0_67
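For readers who would rather script this change than use an editor, a minimal sketch follows. It is not part of the original walkthrough: the function name set_java_home is hypothetical, and the configuration directory etc/hadoop under /opt/modules/hadoop-2.5.0 is assumed from the standard Hadoop 2.x layout.

```shell
# Append the JAVA_HOME export to each env script unless the exact line
# is already present, so running the function twice is harmless.
set_java_home() {
  conf_dir="$1"    # assumed: /opt/modules/hadoop-2.5.0/etc/hadoop
  java_home="$2"   # from the article: /opt/modules/jdk1.7.0_67
  line="export JAVA_HOME=$java_home"
  for f in hadoop-env.sh yarn-env.sh mapred-env.sh; do
    grep -qxF "$line" "$conf_dir/$f" 2>/dev/null ||
      printf '%s\n' "$line" >> "$conf_dir/$f"
  done
}

# Example call (paths are the assumptions noted above):
# set_java_home /opt/modules/hadoop-2.5.0/etc/hadoop /opt/modules/jdk1.7.0_67
```

Appending works because a later export in the same script overrides any earlier JAVA_HOME assignment.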

2. Edit the configuration files core-site.xml, hdfs-site.xml, yarn-site.xml, and mapred-site.xml

1) Edit core-site.xml

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://Hadoop-senior02.beifeng.com:8020</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/modules/hadoop-2.5.0/data</value>
  </property>
</configuration>

2) Edit hdfs-site.xml

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.http-address</name>
    <value>Hadoop-senior02.beifeng.com:50070</value>
  </property>
</configuration>

3) Edit yarn-site.xml

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>Hadoop-senior02.beifeng.com</value>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>86400</value>
  </property>
</configuration>

4) Edit mapred-site.xml

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>0.0.0.0:19888</value>
  </property>
</configuration>
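One way to confirm that the edited files are actually being picked up is to ask Hadoop for the effective values. The sketch below assumes it is run from /opt/modules/hadoop-2.5.0. The names show_conf and HDFS_CMD are hypothetical helpers introduced here; hdfs getconf -confKey is the stock subcommand. Only keys from core-site.xml and hdfs-site.xml are queried.

```shell
# Command used to reach HDFS; overridable so the function can be tried
# against a different install (HDFS_CMD is a name made up for this sketch).
HDFS_CMD="${HDFS_CMD:-bin/hdfs}"

# Print "key = value" for each configuration key given as an argument.
show_conf() {
  for key in "$@"; do
    printf '%s = %s\n' "$key" "$("$HDFS_CMD" getconf -confKey "$key")"
  done
}

# Example call, using keys set in the files above:
# show_conf fs.defaultFS hadoop.tmp.dir dfs.replication dfs.namenode.http-address
```

If a value printed here differs from what was written in the XML, the file was probably edited in a different directory from the one Hadoop reads.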

3. Start HDFS

1) Format the NameNode: $ bin/hdfs namenode -format

2) Start the NameNode: $ sbin/hadoop-daemon.sh start namenode

3) Start the DataNode: $ sbin/hadoop-daemon.sh start datanode

4) HDFS monitoring web page: http://hadoop-senior02.beifeng.com:50070

4. Start YARN

1) Start the ResourceManager: $ sbin/yarn-daemon.sh start resourcemanager

2) Start the NodeManager: $ sbin/yarn-daemon.sh start nodemanager

3) YARN monitoring web page: http://hadoop-senior02.beifeng.com:8088
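Besides the web pages, the JDK's jps tool lists running Java processes, which gives a quick check that the four daemons started so far are alive. The helper below is a hypothetical sketch (missing_daemons is not a Hadoop command): it reads jps output on standard input and prints every expected daemon name that is absent.

```shell
# Read `jps` output ("<pid> <name>" per line) from stdin and print each
# expected daemon name that does not appear in it.
missing_daemons() {
  running="$(cat)"
  for name in "$@"; do
    # Compare the second column exactly, so "NameNode" is not satisfied
    # by a line such as "SecondaryNameNode".
    printf '%s\n' "$running" |
      awk -v n="$name" '$2 == n { found = 1 } END { exit !found }' ||
      printf '%s\n' "$name"
  done
}

# Example call after steps 3 and 4:
# jps | missing_daemons NameNode DataNode ResourceManager NodeManager
```

Empty output means every listed daemon was found. A name that is printed points to the daemon whose log is worth reading first.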

5. Test the wordcount example JAR

1) Change to the directory: /opt/modules/hadoop-2.5.0

2) Run the test: $ bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0.jar wordcount /input/sort.txt /output6/

Run log:

16/05/08 06:39:13 INFO client.RMProxy: Connecting to ResourceManager at Hadoop-senior02.beifeng.com/192.168.241.130:8032
16/05/08 06:39:15 INFO input.FileInputFormat: Total input paths to process : 1
16/05/08 06:39:15 INFO mapreduce.JobSubmitter: number of splits:1
16/05/08 06:39:15 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1462660542807_0001
16/05/08 06:39:16 INFO impl.YarnClientImpl: Submitted application application_1462660542807_0001
16/05/08 06:39:16 INFO mapreduce.Job: The url to track the job: http://Hadoop-senior02.beifeng.com:8088/proxy/application_1462660542807_0001/
16/05/08 06:39:16 INFO mapreduce.Job: Running job: job_1462660542807_0001
16/05/08 06:39:36 INFO mapreduce.Job: Job job_1462660542807_0001 running in uber mode : false
16/05/08 06:39:36 INFO mapreduce.Job: map 0% reduce 0%
16/05/08 06:39:48 INFO mapreduce.Job: map 100% reduce 0%
16/05/08 06:40:04 INFO mapreduce.Job: map 100% reduce 100%
16/05/08 06:40:04 INFO mapreduce.Job: Job job_1462660542807_0001 completed successfully
16/05/08 06:40:04 INFO mapreduce.Job: Counters: 49

3) View the result: $ bin/hdfs dfs -text /output6/par*

Output:

hadoop 2
jps 1
mapreduce 2
yarn 1
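The run above presumes that /input/sort.txt already exists in HDFS and that the output directory does not, since a MapReduce job refuses to write into an existing output directory. The article does not show that preparation, so the following is only a sketch of how it might be done. The function name prepare_wordcount and the HDFS_CMD variable are hypothetical, and the local file name sort.txt is assumed.

```shell
# Command used to reach HDFS, run from /opt/modules/hadoop-2.5.0
# (HDFS_CMD is a name made up for this sketch).
HDFS_CMD="${HDFS_CMD:-bin/hdfs}"

# Upload a local file to an HDFS input directory and clear a stale
# output directory so the job can create it fresh.
prepare_wordcount() {
  local_file="$1"; input_dir="$2"; output_dir="$3"
  "$HDFS_CMD" dfs -mkdir -p "$input_dir"
  "$HDFS_CMD" dfs -put "$local_file" "$input_dir/"   # fails if already uploaded
  if "$HDFS_CMD" dfs -test -e "$output_dir"; then
    "$HDFS_CMD" dfs -rm -r "$output_dir"
  fi
}

# Example call matching the paths used above:
# prepare_wordcount sort.txt /input /output6
```

Removing the old output directory is destructive, so it is worth checking its contents first if earlier results still matter.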

6. MapReduce history server

1) Start it: $ sbin/mr-jobhistory-daemon.sh start historyserver

2) Web UI: http://hadoop-senior02.beifeng.com:19888

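7. What HDFS, YARN, and MapReduce do

1) HDFS: a distributed file system that is highly fault-tolerant and suited to deployment on low-cost machines.

HDFS has a master/slave architecture made up of a NameNode and DataNodes. The NameNode manages the file system namespace, and the DataNodes provide the storage. DataNodes store data as blocks, each 128 MB by default.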
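As a small worked example of the block size, the number of blocks a file occupies is its size divided by the block size, rounded up. The sketch below uses shell integer arithmetic. The function name block_count and the 300 MB file are hypothetical.

```shell
# Number of HDFS blocks needed for a file: ceil(size / block_size).
block_count() {
  size_bytes="$1"
  block_bytes="${2:-134217728}"   # 128 MB = 128 * 1024 * 1024 bytes
  echo $(( (size_bytes + block_bytes - 1) / block_bytes ))
}

# A hypothetical 300 MB file (300 * 1024 * 1024 bytes) needs 3 blocks:
# two full 128 MB blocks plus one partial block of 44 MB.
block_count 314572800
```

With dfs.replication set to 1 as in hdfs-site.xml above, each of those blocks is stored exactly once.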

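2) YARN: a general-purpose resource management system that provides unified resource management and scheduling for the applications running on top of it.

YARN consists of a ResourceManager and NodeManagers. The ResourceManager handles resource scheduling and allocation across the cluster. Each NodeManager runs the tasks assigned to its node and manages that node's resources.

3) MapReduce: a computation model with two phases, Map and Reduce.

The map phase processes each line of input and emits the result as key-value pairs, which are passed to reduce. The reduce phase aggregates and summarizes the data received from map.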
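The map and reduce roles can be imitated on a single machine with an ordinary shell pipeline, which may help when reading the wordcount output from step 5. This is an analogy and not how Hadoop runs the job. The function name wordcount is hypothetical, and so are the three sample lines, since the article does not show the real contents of sort.txt.

```shell
# Local imitation of the wordcount job's three stages.
wordcount() {
  tr -s '[:space:]' '\n' |        # "map": emit one word (key) per line
    LC_ALL=C sort |               # "shuffle": bring identical keys together
    uniq -c |                     # "reduce": count the lines in each group
    awk '{ print $2 "\t" $1 }'    # print as key<TAB>count
}

# Hypothetical sample input:
printf 'hadoop yarn\nhadoop mapreduce\nmapreduce jps\n' | wordcount
```

For this sample the pipeline prints hadoop 2, jps 1, mapreduce 2, and yarn 1, each as a tab-separated pair, which is the same shape as the job output shown in step 5.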

That is all for this article. I hope it helps with your studies.
