My environment: Ubuntu 14.04 + Hadoop 2.6.0 + JDK 1.8.0_25
The official 2.6.0 installation tutorial: http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-common/SingleCluster.html
To keep the configuration simple, I use a dedicated hadoop user on every machine, which is very convenient.
Node information (distributed cluster architecture: master is the primary node, the rest are slave nodes):
Machine name | IP              | Role
master       | 122.205.135.254 | NameNode and JobTracker
slave1       | 122.205.135.212 | DataNode and TaskTracker
1. Install the JDK
First, Hadoop requires Java at runtime, so the JDK must be installed on every node in the cluster.
For a detailed walkthrough, see my article on installing jdk1.7.0_60. Note: it is a good idea to install the JDK in the same directory on all cluster machines, for easy configuration. In practice I generally install the JDK under /usr/java, but this is not required.
2. Configure the Hosts file
Modify /etc/hosts on every machine in the cluster; the command to open the file is:
sudo gedit /etc/hosts
Add (example):
122.205.135.254 master
122.205.135.212 slave1
Note: master, slave1, slave2, etc. here are machine names (run the command hostname to see a machine's name). This matters: if the entries do not match the actual machine names there will be problems, and every node in the cluster must have a distinct machine name.
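As a quick sanity check, the example entries above can be written out and verified like this (a sketch using a temporary file; on a real node the target is /etc/hosts itself, which needs sudo):

```shell
# Sketch: write the example host entries to a temp file and confirm both lines
# are present. /tmp/hosts.example is a stand-in for /etc/hosts.
HOSTS_FILE=/tmp/hosts.example
cat > "$HOSTS_FILE" <<'EOF'
122.205.135.254 master
122.205.135.212 slave1
EOF
grep -c -E 'master|slave1' "$HOSTS_FILE"   # expect 2 matching lines
```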
3. Passwordless SSH login
For the details of configuring passwordless master-slave SSH login for Hadoop, see my other article.
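The core of that setup can be sketched as follows (the key path and the slave1 hostname are illustrative; ssh-copy-id pushes the master's public key to a slave so that logging in no longer asks for a password):

```shell
# Sketch of passwordless SSH from master to the slaves. The key path below is a
# demo location; in practice the default ~/.ssh/id_rsa is normally used.
ssh-keygen -t rsa -N '' -q -f /tmp/id_rsa_demo   # generate a passphrase-less key pair
# On the real cluster, copy the public key to each slave (hostname assumed):
#   ssh-copy-id -i /tmp/id_rsa_demo.pub hadoop@slave1
#   ssh slave1    # should now log in without a password
```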
4. Hadoop installation and configuration
(1). Download and unzip the Hadoop stable version
I'm using hadoop-2.6.0, downloaded from http://mirrors.hust.edu.cn/apache/hadoop/common/stable/.
Copy the downloaded archive to the master node and unzip it there (here, under the hadoop user's home directory). Configure Hadoop on the master node first and then copy it to the other servers; in general, Hadoop is installed in the same directory on every machine in the cluster.
Unzip the command as follows:
tar -xzvf hadoop-2.6.0.tar.gz
(2). Configure Hadoop
1. Modify hadoop-2.6.0/etc/hadoop/hadoop-env.sh to add JDK support:
export JAVA_HOME=/usr/java/jdk1.8.0_25
If you do not know your JDK directory, check with the command echo $JAVA_HOME.
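If echo $JAVA_HOME prints nothing, the JDK home can often be recovered from the java binary on the PATH. A small sketch (jdk_home_of is a helper name introduced here for illustration):

```shell
# Resolve the JDK home from a java binary: follow symlinks with readlink -f,
# then strip the trailing /bin/java. Works for any JDK laid out as <home>/bin/java.
jdk_home_of() {
    readlink -f "$1" | sed 's:/bin/java$::'
}
# e.g. jdk_home_of "$(which java)"   # prints something like /usr/java/jdk1.8.0_25
```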
2. Modify hadoop-2.6.0/etc/hadoop/core-site.xml
Note: the properties must be added inside the <configuration></configuration> element.
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/hadoop-2.6.0/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
</property>
</configuration>
3. Modify hadoop-2.6.0/etc/hadoop/hdfs-site.xml
<property>
<name>dfs.name.dir</name>
<value>/home/hadoop/hadoop-2.6.0/dfs/name</value>
<description>Path on the local filesystem where the NameNode persistently stores the namespace and transaction logs.</description>
</property>
<property>
<name>dfs.data.dir</name>
<value>/home/hadoop/hadoop-2.6.0/dfs/data</value>
<description>Comma-separated list of paths on the local filesystem of a DataNode where it should store its blocks.</description>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
4. Modify hadoop-2.6.0/etc/hadoop/mapred-site.xml (in 2.6.0 this file does not exist by default; create it by copying etc/hadoop/mapred-site.xml.template)
<property>
<name>mapred.job.tracker</name>
<value>master:9001</value>
<description>host or IP and port of jobtracker.</description>
</property>
5. Modify hadoop-2.6.0/etc/hadoop/masters
List all the master nodes:
master
6. Modify hadoop-2.6.0/etc/hadoop/slaves
List all the DataNode machines, for example:
slave1
slave2
slave3
slave4
7. Copy the configured Hadoop folder on the master node to all slave nodes
Taking slave1 as an example, the command is:
scp -r ~/hadoop-2.6.0 hadoop@slave1:~/
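To push the tree to every slave in one go, a small loop helps. The node names below match the slaves file above; DRY_RUN=echo only previews the commands (clear it on the real cluster to actually copy):

```shell
# Sketch: copy the configured hadoop-2.6.0 directory to all slaves.
# Assumes passwordless SSH (section 3) and a hadoop user on each node.
DRY_RUN=echo   # set DRY_RUN= (empty) to really execute the copies
for node in slave1 slave2 slave3 slave4; do
    $DRY_RUN scp -r ~/hadoop-2.6.0 "hadoop@${node}:~/"
done
```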
After the installation is complete, we format HDFS and then start all the nodes in the cluster.
5. Start Hadoop
1. Format the NameNode of the HDFS filesystem
(it is easiest to enter the hadoop-2.6.0 directory first):
cd hadoop-2.6.0             # enter the hadoop-2.6.0 directory
bin/hdfs namenode -format   # format the NameNode
2. Start the Hadoop cluster
The command to start HDFS is:
sbin/start-dfs.sh   # start the HDFS daemons
When the start succeeds, running jps should list the Hadoop processes (NameNode and SecondaryNameNode on the master, DataNode on each slave).
In addition, the command to stop the HDFS cluster is:
sbin/stop-dfs.sh
We can also check via the web UI whether the installation and configuration are correct, at: http://master:50070/
Ubuntu 14.04: install and configure Hadoop 2.6.0 (fully distributed) and run the WordCount example
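The WordCount run the title refers to can be sketched as follows (run from the hadoop-2.6.0 directory on the master once HDFS is up; the HDFS paths are illustrative, and HADOOP=echo previews the commands rather than executing them — clear it on a live cluster):

```shell
# Sketch of a WordCount run against the cluster built above.
HADOOP=echo   # set HADOOP= (empty) on a live cluster to actually execute
$HADOOP bin/hdfs dfs -mkdir -p /user/hadoop/input
$HADOOP bin/hdfs dfs -put etc/hadoop/core-site.xml /user/hadoop/input
$HADOOP bin/hadoop jar \
    share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar \
    wordcount /user/hadoop/input /user/hadoop/output
$HADOOP bin/hdfs dfs -cat /user/hadoop/output/part-r-00000
```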