Installing Hadoop 2.2 on CentOS 6.5


1. Configure passwordless SSH login between the cluster machines

(1) Generate a DSA key pair:

ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

(2) Append the id_dsa.pub public key to the authorized keys. This command adds the public key to the file used for authentication; authorized_keys is that file:

cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

(3) This puts the local login key into the authorized keys, so logging in to the local machine no longer requires a password. The machines in the cluster, however, still cannot log in to one another without a password; for that, the id_dsa.pub key file of every other machine must also be added to authorized_keys. The cluster here consists of three machines: master, slave1 and slave2. Run the two commands above on all three hosts, so that each generates its own id_dsa.pub, then append the id_dsa.pub contents of slave1 and slave2 to the authorized_keys file on master. After this, master's authorized_keys looks like the following:

ssh-dss AAAAB3NzaC1kc3MAAACBAKpCe9woQHMTehKLJA+6GdseAMPGnykirGIzbqqwhU/dHVNMyaxwGrK42c0Sxrtg9Q/zeaAmvbtxjmtVIJ9EImWGH7U0/ijs+PVspGpp1RZoI+5eSBwCUDRF93yT9/hVm/X9mP+k/bETwC7zi1mei+ai/V6re6fTelwS9dkiYHsfAAAAFQCoai5Gh74xcauX8ScXqCZK8FOHVwAAAIAajMwOhEnRSANAtjfFo0Fx2Dhq8VZqGlJzT2xqKQv0VkxqJgE8WNv4IMIIehdhl0kSFE6640zi3B2CZ3muTQxNOK4kxWxi36HhffvLpzcVrme6HVhOGnZFrbqpmo0cLZdK99aMF/TkEF2UhRb6pL2QWAyZgIrZbWm5iGq8W47UsgAAAIAGB3DfhF9GjnrZKIIsIeSrETo1ebJfZK1z7hf3CIHWb51I+gNHVtLZuuljeLIS8oTtKu0IZcI3zvCWWGi+anAhAK+9N/VWppzC75q7Tp+XPw0OAwHeC7OjHnj4oIUYnV8+QQDgK51njl8pwQNcW5ytAr1GXMxfPnq1Do29JW5FDQ== [email protected]
ssh-dss AAAAB3NzaC1kc3MAAACBAJN2NYZap/VXLECMgCFXWyvz2uY9ciLwhOhTqnLeX5giJUWfEvvlzpuxzhrMmJdo40Rn6h/ggf2qgrCDo0NM7aaoo3nG2cW3e1mrpkDgpI+qYrNUwtdZ6a2jWs//gourBa359v/8NQgkdPZXw1JCnE3qzLxJQ2YfTPLFMmV7yv01AAAAFQDoIbKLeHjrtgHuCCT6CHbmV69jJwAAAIEAgj9piFkKUDAVeP60YQy3+CI2RSaU1JBopXOuzLJcYZcsZm+z1+b4HKgF23MsK0nEpl0UgnlicGk6GgiulBHTAMoq/GO6Hn5I1tEtXjDKlWG1PaGoH8Wua6GlziyxrZ/0OKjTdJaOirctVFnD/yyoO3xE8jpGzJwqWuScW44W3zQAAACADGFDYzG34Jr3M+BUkB11vGcv6NKeyU/CP/OSx5LGjQwwwD2f0UdSYEAuqvvkccNB9MB10H0OJCSFNGtbULA8kpDXM03q2VkJcJXQcRx+C9QoHCtF1EaM7GFmSuAEegzvv2UR122qXsxsxZIiJXhKZKzbznTIoipm0KEAqp0cz48= [email protected]
ssh-dss AAAAB3NzaC1kc3MAAACBAOLxtxe3HLhc01szJFXktBJUfjnQwan/EvXcalvHv/DX9jsp5OroEclNE9NLzeL+NU9Ax0Jh7zYbyvQ2xK/lW9syfkJWntdwXcpeTBRrH1NX+dV1LentHyvgAj411LHZLfnkYaztXPWB/ux8JK9F6GB16uVWTG1KjCQwo44q5MtFAAAAFQDw/590kNub5MXnQCMBe4ggfK8dmQAAAIAg2GEhEPak+ETd9UekWL/k5168ng9SmA7sWvABs/dVePFdpP2WY+WNOOmyryvvtpsBfEyAM/NCaTsrMWcGorOdAJ4IKyMDl3QLTolelnjBaC8pcHEZ1igKR2JPGDIQSSlBkvB/Q8+qVmwYlHIQnEoYgGOoEokdtmHVMwOR053/hAAAAIB/kGh9FN4ie+5zRmQLiYTDES3ztm/Ik3UU0fOoNWkdeTVAXvp1xXotkQIkeh3bGFHwGfDUjNtTlrS+qqvAQqCpcj8LR8+pQh0UbxT2rZ1AsGviUVoK8mbosJ3eUjcigCCbF3SChy8TYIU7fsAynavqFubsbmV/6HpbHJNyC1+MAA== [email protected]

Then overwrite the authorized_keys file under ~/.ssh/ on slave1 and slave2 with master's processed authorized_keys, so that every host in the cluster can log in to every other one without a password. Reboot, pick any one host, and ssh to each of the other two; if you can log in without entering a password, the configuration succeeded.
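The manual appending and copying described above can also be scripted from master. A minimal sketch, run after all three hosts have generated their keys; it assumes the hostnames slave1 and slave2 resolve and that SSH as root still prompts for a password at this stage:

# gather the slaves' public keys into master's authorized_keys
for host in slave1 slave2; do
  ssh root@$host "cat ~/.ssh/id_dsa.pub" >> ~/.ssh/authorized_keys
done
# push the combined file back out to both slaves
for host in slave1 slave2; do
  scp ~/.ssh/authorized_keys root@$host:~/.ssh/authorized_keys
done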
2. Edit the Hadoop configuration files

Unpack the Hadoop archive into the /cloud directory, then:

(1) Edit hadoop-env.sh to point JAVA_HOME at the JDK. First check the JAVA_HOME location:

echo $JAVA_HOME

which here prints /usr/lib/jvm/java-1.7.0-openjdk.x86_64. Set that value in the file:

vi /cloud/hadoop-2.2/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk.x86_64

(2) Edit core-site.xml:

vi /cloud/hadoop-2.2/etc/hadoop/core-site.xml

and add the following:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/cloud/hadoopData</value>
  </property>
</configuration>

① This sets the HDFS access address to hdfs://110.64.76.130:9000, i.e. hdfs://master:9000. ② Temporary files are stored under /cloud/hadoopData; take care to create this directory. (Note that dfs.replication is set again in hdfs-site.xml below; since hdfs-site.xml is loaded after core-site.xml, its value of 2 is the one the HDFS daemons actually use.)

(3) Edit hdfs-site.xml:

vi /cloud/hadoop-2.2/etc/hadoop/hdfs-site.xml

and add the following:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/cloud/hadoopData/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/cloud/hadoopData/data</value>
  </property>
</configuration>
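Since hadoop.tmp.dir, dfs.namenode.name.dir and dfs.datanode.data.dir all live under /cloud/hadoopData, that directory tree has to exist on every node before the first start. A small sketch, assuming the root SSH access set up in step 1:

# create the data directories on all three nodes
for host in master slave1 slave2; do
  ssh root@$host "mkdir -p /cloud/hadoopData/name /cloud/hadoopData/data"
done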
(4) Edit yarn-site.xml:

vi /cloud/hadoop-2.2/etc/hadoop/yarn-site.xml

and add the following:

<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>master:8031</value>
    <description>host is the hostname of the resource manager and
    port is the port on which the NodeManagers contact the Resource Manager.</description>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>master:8030</value>
    <description>host is the hostname of the resourcemanager and port is the port
    on which the Applications in the cluster talk to the Resource Manager.</description>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
    <description>In case you do not want to use the default scheduler</description>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>master:8032</value>
    <description>the host is the hostname of the ResourceManager and the port is the port on
    which the clients can talk to the Resource Manager.</description>
  </property>
  <property>
    <name>yarn.nodemanager.address</name>
    <value>0.0.0.0:8034</value>
    <description>the nodemanagers bind to this port</description>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>10240</value>
    <description>the amount of memory on the NodeManager in MB</description>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
    <description>shuffle service that needs to be set for Map Reduce to run</description>
  </property>
</configuration>

(5) Edit the slaves file so that it contains:

slave1
slave2

3. Add Hadoop to the environment variables

Add the following to /etc/profile:

export HADOOP_HOME=/cloud/hadoop-2.2
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

Then run the following command to make the settings take effect:

source /etc/profile

4. Copy the configured Hadoop installation to the other hosts in the cluster

cd /cloud
scp -r hadoop-2.2 [email protected]:/cloud
scp -r hadoopData [email protected]:/cloud
scp -r hadoop-2.2 [email protected]:/cloud
scp -r hadoopData [email protected]:/cloud

5. Format the HDFS filesystem

Perform the following on master (this only needs to run once):

cd /cloud/hadoop-2.2/bin
hdfs namenode -format

6. Start the Hadoop services on each node

cd /cloud/hadoop-2.2/sbin

On master:

./start-dfs.sh
./start-yarn.sh

On slave1 and slave2: nothing needs to be done. In Hadoop 2.x a MapReduce job no longer needs a dedicated daemon: when a job starts, a NodeManager launches a MapReduce Application Master (in effect a slimmed-down JobTracker), which is shut down automatically when the job finishes. The DataNode and NodeManager daemons on slave1 and slave2 are started over SSH by the two scripts above, based on the slaves file, so no commands have to be run on the slaves themselves.
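A quick sanity check that is not part of the original walkthrough: jps (shipped with the JDK) lists the running Java daemons, so after the two start scripts finish you would expect roughly the following on each host:

jps
# on master, expect: NameNode, SecondaryNameNode, ResourceManager
# on slave1 and slave2, expect: DataNode, NodeManager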

7. Test the Hadoop cluster

You can open the web UIs of the NameNode, the ResourceManager and each NodeManager in a browser:

- NameNode web UI, http://master:50070/
- ResourceManager web UI, http://master:8088/
- NodeManager web UI, http://slave1:8042
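Besides the web UIs, the cluster can be checked from the command line. For example, the following standard HDFS admin command (using the PATH set up in step 3; not part of the original steps) should report two live DataNodes if everything is running:

hdfs dfsadmin -report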


You can also start the JobHistory Server, which lets you browse the cluster's historical jobs in a web page. Run the following command:

mr-jobhistory-daemon.sh start historyserver

By default it uses port 19888; visit http://master:19888/ to view the job history.

To stop the JobHistory Server, run the following command:

mr-jobhistory-daemon.sh stop historyserver

8. Run the wordcount example program

hdfs dfs -mkdir /user

hdfs dfs -mkdir /user/root    (creates the user's home directory; from now on, paths given without a leading / are stored under it by default)

hdfs dfs -put ./test.txt input    (copies the file test.txt from the local directory into the user's path as the input file)

hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount input output    (run from /cloud/hadoop-2.2, since the jar is given by a relative path)

hdfs dfs -cat output/*
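The output lines are a word, a tab, then its count. For a hypothetical test.txt containing "hello hadoop hello world", the cat above would print something like:

hadoop	1
hello	2
world	1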

9. Stop the Hadoop cluster

Run on master:

cd /cloud/hadoop-2.2/sbin

./stop-yarn.sh

./stop-dfs.sh
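As a final sanity check (again with jps, which is not part of the original steps), after both scripts finish no Hadoop daemons should remain on any host:

jps    # expect only Jps itself, plus JobHistoryServer if you left it running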


