See first: "hadoop-2.3.0-cdh5.1.0 pseudo-distributed installation (based on CentOS)"
http://blog.csdn.net/jameshadoop/article/details/39055493
Note: this walkthrough is done as the root user.
1. Environment
Operating system: CentOS 6.5, 64-bit
Note: Hadoop 2.0 and above requires JDK 1.7; uninstall the JDK that ships with the Linux distribution and install a fresh one.
Download: http://www.oracle.com/technetwork/java/javase/downloads/index.html
Software versions: hadoop-2.3.0-cdh5.1.0.tar.gz, zookeeper-3.4.5-cdh5.1.0.tar.gz
Download: http://archive.cloudera.com/cdh5/cdh/5/
c1:192.168.58.11
c2:192.168.58.12
c3:192.168.58.13
2. Install the JDK (omitted; see the reference article above)
3. Configure the environment variables (for both the JDK and Hadoop; a minimal sketch follows)
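For step 3, a minimal sketch of the /etc/profile additions, assuming the JDK is installed at /usr/local/java/jdk1.7.0_67 (the path used in yarn-env.sh below) and Hadoop is unpacked to /usr/local/cdh/hadoop (matching the data paths in the configuration files below); adjust to your layout:
# append to /etc/profile on every node (paths are assumptions)
export JAVA_HOME=/usr/local/java/jdk1.7.0_67
export HADOOP_HOME=/usr/local/cdh/hadoop
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
# apply in the current shell:
source /etc/profile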
4. System configuration
1. Disable the firewall
chkconfig iptables off   (disables it permanently, taking effect from the next boot)
Configure the hostname and the hosts file, as sketched below.
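A minimal sketch for CentOS 6, using the three hosts listed above; the sed edit is one way to persist the hostname and should be adjusted per node:
service iptables stop      # also stop the running firewall now (chkconfig only affects the next boot)
# append to /etc/hosts on all three nodes:
192.168.58.11 c1
192.168.58.12 c2
192.168.58.13 c3
# on c1 (repeat with c2/c3 on the other nodes):
hostname c1                # takes effect immediately
sed -i 's/^HOSTNAME=.*/HOSTNAME=c1/' /etc/sysconfig/network   # persists across reboots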
2. Passwordless SSH setup
Hadoop manages its daemons remotely while running: the NameNode connects to each DataNode over SSH (Secure Shell) to start and stop their processes, and those logins must not prompt for a password. So we configure passwordless SSH from the NameNode to the DataNodes, and likewise from each DataNode back to the NameNode.
Configure this on every machine:
Open /etc/ssh/sshd_config with vi and enable:
RSAAuthentication yes       # enable RSA authentication
PubkeyAuthentication yes    # enable public/private key-pair authentication
On master01 (the master node, c1 here), run: ssh-keygen -t rsa -P '' and just press Enter at the prompt (no passphrase).
By default the keys are stored under /root/.ssh.
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[root@master01 .ssh]# ls
authorized_keys  id_rsa  id_rsa.pub  known_hosts
Copy the key file to the other nodes:
scp authorized_keys c2:~/.ssh/
scp authorized_keys c3:~/.ssh/
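A quick check that the setup works; after restarting sshd to pick up the sshd_config changes, each command should run without a password prompt:
service sshd restart
ssh c2 hostname            # should print c2, no password prompt
ssh c3 hostname            # should print c3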
5. Configuration files (identical on every node)
5.1. hadoop/etc/hadoop/hadoop-env.sh, add:
# set to the root of your Java installation
export JAVA_HOME=/usr/java/latest
# assuming your installation directory is /usr/local/hadoop
export HADOOP_PREFIX=/usr/local/hadoop
5.2. etc/hadoop/core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://c1:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/cdh/hadoop/data/tmp</value>
  </property>
</configuration>
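A quick sanity check once the environment variables point at this configuration; note that the host c1 in the URI must resolve on every node (see the hosts file above):
bin/hdfs getconf -confKey fs.defaultFS     # should print hdfs://c1:9000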
5.3. etc/hadoop/hdfs-site.xml
<configuration>
  <property>
    <!-- enable WebHDFS -->
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/usr/local/cdh/hadoop/data/dfs/name</value>
    <description>local directory where the NameNode stores the name table (fsimage); change to suit your layout</description>
  </property>
  <property>
    <name>dfs.namenode.edits.dir</name>
    <value>${dfs.namenode.name.dir}</value>
    <description>local directory where the NameNode stores the transaction file (edits); change to suit your layout</description>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/usr/local/cdh/hadoop/data/dfs/data</value>
    <description>local directory where the DataNode stores blocks; change to suit your layout</description>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
</configuration>
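Hadoop generally creates these directories on format/first start, but preparing them up front avoids permission surprises; a sketch assuming the paths above:
# on c1 (NameNode):
mkdir -p /usr/local/cdh/hadoop/data/dfs/name
# on c2 and c3 (DataNodes):
mkdir -p /usr/local/cdh/hadoop/data/dfs/data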
5.4 etc/hadoop/mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
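Note: 2.x tarballs often ship only a template for this file; if etc/hadoop/mapred-site.xml is missing, create it from the template first (an assumption about this distribution):
cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml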
5.5 etc/hadoop/yarn-env.sh
# some Java parameters
export JAVA_HOME=/usr/local/java/jdk1.7.0_67
5.6 etc/hadoop/yarn-site.xml
<configuration>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>c1:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>c1:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>c1:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>c1:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>c1:8088</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
5.7. etc/hadoop/slaves
c2
c3
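Since the files above must be identical on every node, one way to push them from c1, assuming Hadoop is installed at /usr/local/cdh/hadoop on all nodes:
scp -r /usr/local/cdh/hadoop/etc/hadoop c2:/usr/local/cdh/hadoop/etc/
scp -r /usr/local/cdh/hadoop/etc/hadoop c3:/usr/local/cdh/hadoop/etc/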
6. Start up and verify the installation
Format HDFS first (run once, on the NameNode):
bin/hdfs namenode -format
Then start the daemons:
sbin/start-dfs.sh
sbin/start-yarn.sh
[root@c1 hadoop]# jps
3250 Jps
2491 ResourceManager
2343 SecondaryNameNode
2170 NameNode
On the DataNodes:
[root@c2 ~]# jps
4196 Jps
2061 DataNode
2153 NodeManager
1. Open the NameNode web UI in a browser: http://localhost:50070/
2. Create the HDFS directories:
$ bin/hdfs dfs -mkdir /user
$ bin/hdfs dfs -mkdir /user/<username>
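As a final smoke test of HDFS and YARN together, the ResourceManager web UI configured above is at http://c1:8088/, and the bundled examples jar can be run; the jar file name below is an assumption and may differ in your build:
$ bin/hdfs dfs -ls /                # /user should now be listed
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.3.0-cdh5.1.0.jar pi 2 10   # computes pi on YARN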