Two test virtual machines running RHEL 5.3 x64, each with the latest JDK installed and passwordless SSH login configured correctly.
Server 1: 192.168.56.101 dev1
Server 2: 192.168.56.102 dev2
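The configuration files later refer to the servers by name (dev1, dev2), so both machines need to resolve those names. A minimal sketch, assuming the names are not already provided by DNS and that /etc/hosts is used instead; the function names are made up for illustration:

```shell
# Print the /etc/hosts lines for the two test servers (addresses from the list above).
hosts_entries() {
    printf '%s %s\n' 192.168.56.101 dev1
    printf '%s %s\n' 192.168.56.102 dev2
}

# Append each entry to the given hosts file unless the host name is already listed.
# The default target, /etc/hosts, can only be written by root.
add_hosts_entries() {
    target="${1:-/etc/hosts}"
    hosts_entries | while read -r ip name; do
        if ! grep -qw -- "$name" "$target" 2>/dev/null; then
            printf '%s %s\n' "$ip" "$name" >> "$target"
        fi
    done
}

# Example (run as root on both dev1 and dev2):
#   add_hosts_entries
```

Running it a second time adds nothing, because names that are already present are skipped.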
Download hadoop-0.20.1.tar.gz from http://apache.freelamp.com/hadoop/core/hadoop-0.20.1/ and copy it to the "/usr/software/hadoop" directory on dev1. Log in to dev1 and run the following commands:
# cd /usr/software/hadoop
# tar zxvf hadoop-0.20.1.tar.gz
# cp -a hadoop-0.20.1 /usr/hadoop
# cd /usr/hadoop/conf
Edit the Hadoop environment configuration file hadoop-env.sh:
# vi hadoop-env.sh
Add the following line:
export JAVA_HOME=/usr/java/jdk1.6.0_16
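The same edit can be scripted instead of done in vi. A rough sketch, assuming the JDK path contains no "|" character; set_java_home is a hypothetical helper, not part of Hadoop:

```shell
# Set JAVA_HOME in a hadoop-env.sh file: replace an existing uncommented
# export line if there is one, otherwise append a new line.
set_java_home() {
    env_file="$1"
    java_home="$2"
    if grep -q '^export JAVA_HOME=' "$env_file" 2>/dev/null; then
        # Rewrite through a temporary file, then copy back so the
        # original file keeps its permissions.
        sed "s|^export JAVA_HOME=.*|export JAVA_HOME=$java_home|" "$env_file" > "$env_file.tmp" &&
            cat "$env_file.tmp" > "$env_file" &&
            rm -f "$env_file.tmp"
    else
        printf 'export JAVA_HOME=%s\n' "$java_home" >> "$env_file"
    fi
}

# Example:
#   set_java_home /usr/hadoop/conf/hadoop-env.sh /usr/java/jdk1.6.0_16
```

Commented-out lines such as "# export JAVA_HOME=..." are left untouched.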
Edit Hadoop's main configuration file core-site.xml:
# vi core-site.xml
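Add the following (adjust the values to your own needs). Note that in Hadoop 0.20 the dfs.* properties are normally placed in hdfs-site.xml rather than core-site.xml: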
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://dev1</value>
<description>The name of the default file system. Either the literal string "local" or a host:port for DFS.</description>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>dfs.name.dir</name>
<value>/usr/hadoop/filesystem/name</value>
<description>Determines where on the local filesystem the DFS name node should store the name table. If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy. </description>
</property>
<property>
<name>dfs.data.dir</name>
<value>/usr/hadoop/filesystem/data</value>
<description>
Determines where on the local filesystem a DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.
</description>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
<description>Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified at create time.</description>
</property>
</configuration>
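After editing, it can help to read a value back from the file and confirm it is what you intended. A small sketch, assuming each <name> and <value> element sits on its own line as in the listing above; get_property is a hypothetical helper and not a real XML parser:

```shell
# Print the value of one property from a Hadoop configuration file.
#   $1 = configuration file, $2 = property name
get_property() {
    awk -v want="$2" '
        /<name>/ {
            # Remember the most recent property name.
            name = $0
            sub(/.*<name>/, "", name)
            sub(/<\/name>.*/, "", name)
        }
        /<value>/ && name == want {
            # Print the value that follows the wanted name, then stop.
            value = $0
            sub(/.*<value>/, "", value)
            sub(/<\/value>.*/, "", value)
            print value
            exit
        }
    ' "$1"
}

# Example:
#   get_property /usr/hadoop/conf/core-site.xml fs.default.name
```

If the property is not in the file, nothing is printed.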
Edit Hadoop's mapred-site.xml file:
# vi mapred-site.xml
Add the following:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>dev1:9001</value>
<description>
The host and port that the MapReduce job tracker runs at. If "local", then jobs are run in-process as a single map and reduce task.
</description>
</property>
</configuration>
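Edit the masters file. Despite its name, in Hadoop 0.20 this file lists the host that runs the secondary namenode; the namenode itself runs on the machine where the start scripts are executed (dev1 here):
# vi masters
Add the following: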
dev1
Edit the slaves file, which lists the hosts that run a datanode:
# vi slaves
Add the following:
dev2
Install Hadoop on dev2 by following the same steps.
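Instead of retyping the configuration on dev2, the conf directory from dev1 can be copied over. A dry-run sketch, assuming Hadoop is already unpacked to /usr/hadoop on every slave and that scp works through the passwordless SSH setup; sync_commands is a hypothetical helper that only prints the commands:

```shell
# Print one scp command per host listed in a slaves file.
#   $1 = slaves file, $2 = Hadoop directory (defaults to /usr/hadoop)
sync_commands() {
    slaves_file="$1"
    hadoop_home="${2:-/usr/hadoop}"
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while IFS= read -r host || [ -n "$host" ]; do
        case "$host" in
            ''|'#'*) continue ;;   # skip blank lines and comments
        esac
        printf 'scp -r %s/conf %s:%s/\n' "$hadoop_home" "$host" "$hadoop_home"
    done < "$slaves_file"
}

# Example: review the commands first, then run them by piping to sh.
#   sync_commands /usr/hadoop/conf/slaves
#   sync_commands /usr/hadoop/conf/slaves | sh
```

Printing the commands first makes it easy to check the host list before anything is copied.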
Format the namenode on dev1 (the hadoop script is in the bin directory):
# cd /usr/hadoop/bin
# ./hadoop namenode -format
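Formatting wipes the existing name table, so it is worth guarding against running it twice. A cautious sketch, assuming a formatted name directory contains a current/VERSION file and that the directory you pass in is the one the namenode actually uses; both function names are made up for illustration:

```shell
# Succeed if the given name directory already holds a formatted name table.
name_dir_is_formatted() {
    [ -f "$1/current/VERSION" ]
}

# Format only when the name directory has not been formatted yet.
#   $1 = name directory, $2 = hadoop script (defaults to /usr/hadoop/bin/hadoop)
format_if_needed() {
    name_dir="$1"
    hadoop_bin="${2:-/usr/hadoop/bin/hadoop}"
    if name_dir_is_formatted "$name_dir"; then
        echo "skip: $name_dir is already formatted"
    else
        "$hadoop_bin" namenode -format
    fi
}

# Example:
#   format_if_needed /usr/hadoop/filesystem/name
```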
Installation and configuration are now complete.
Run the following commands on dev1 to start Hadoop:
# cd /usr/hadoop/bin
# ./start-all.sh
Once startup completes, run the following command to check the basic state of the cluster:
# ./hadoop dfsadmin -report
Or open http://192.168.56.101:50070/dfshealth.jsp in a browser.
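Another quick check is to list the running Java processes with the JDK's jps tool and see whether the expected daemons are there. A small sketch, assuming jps is on the PATH; missing_daemons is a hypothetical helper that reads jps output from standard input:

```shell
# Print every expected daemon name that does not appear in the jps output.
#   arguments = expected daemon class names, stdin = output of `jps`
missing_daemons() {
    # jps prints "<pid> <class name>" per line; keep the second column.
    seen=$(awk '{print $2}')
    for daemon in "$@"; do
        printf '%s\n' "$seen" | grep -qx -- "$daemon" || printf '%s\n' "$daemon"
    done
}

# Example on dev1:
#   jps | missing_daemons NameNode SecondaryNameNode JobTracker
# Example on dev2:
#   jps | missing_daemons DataNode TaskTracker
```

No output means every listed daemon is running; any name that is printed points to a daemon whose log under the Hadoop logs directory is worth checking.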