Ubuntu 14.04: Installing and Configuring Hadoop 2.6.0 (Fully Distributed) and Running the WordCount Example

My environment: Ubuntu 14.04 + Hadoop 2.6.0 + jdk1.8.0_25

Official Hadoop 2.6.0 installation tutorial: http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-common/SingleCluster.html

To keep the configuration simple, I used a hadoop user on every machine, which is very convenient.

Node information (distributed cluster architecture: master is the primary node, the rest are slave nodes):

Machine name    IP                 Function
master          122.205.135.254    NameNode and JobTracker
slave1          122.205.135.212    DataNode and TaskTracker

1. Installing the JDK

First, Hadoop requires Java at runtime, so the JDK must be installed on every node in the cluster.

For detailed installation steps, see my article on installing jdk1.7.0_60. Note: it is a good idea to install the JDK in the same directory on all cluster machines for easier configuration. In practice I usually install the JDK under /usr/java, but this is not required.
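As a quick sanity check, you can verify on every node that the JDK is installed and visible; the version and path below assume the jdk1.8.0_25 under /usr/java used in this article:

java -version        # should report the installed JDK version, e.g. 1.8.0_25
echo $JAVA_HOME      # should print the JDK directory, e.g. /usr/java/jdk1.8.0_25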

2. Configure the Hosts file

Modify /etc/hosts on all machines in the cluster. The command to open the file is as follows:

sudo gedit /etc/hosts

Add (example):

122.205.135.254 master

122.205.135.212 slave1

Note: master, slave1, slave2, and so on here are machine names (use the hostname command to see yours). Remember: if these do not match the actual machine names there will be problems, and every node in the cluster must have a different machine name.
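To confirm that the hosts entries resolve correctly, you can ping each node by machine name; slave1 below matches the example entries above:

ping -c 3 slave1     # should resolve to 122.205.135.212 and receive replies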

3. Passwordless SSH Login

For details on installing and configuring passwordless master-slave SSH login for Hadoop, see my other article.
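For reference, a minimal sketch of the usual key-based setup, assuming the hadoop user exists on every machine (the linked article has the details):

ssh-keygen -t rsa            # on the master, accept the defaults
ssh-copy-id hadoop@slave1    # append the public key to each slave's authorized_keys
ssh slave1                   # should now log in without a password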

4. Installing and Configuring Hadoop

(1). Download and unzip the Hadoop stable version

I am using hadoop-2.6.0, downloaded from: http://mirrors.hust.edu.cn/apache/hadoop/common/stable/.

Copy the downloaded Hadoop archive to the master server and unzip it into the hadoop user's home directory (configure it on the master node first and then copy it to the other servers; in general, Hadoop should be installed in the same directory on every machine in the cluster).

The unzip command is as follows:

tar -xzvf hadoop-2.6.0.tar.gz

(2). Configure Hadoop

1. Modify hadoop-2.6.0/etc/hadoop/hadoop-env.sh to add JDK support:

export JAVA_HOME=/usr/java/jdk1.8.0_25

If you do not know your JDK directory, check it with the command echo $JAVA_HOME.

2. Modify hadoop-2.6.0/etc/hadoop/core-site.xml

Note: the properties must be added inside the <configuration></configuration> element.

<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/hadoop-2.6.0/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
  </property>
</configuration>

3. Modify hadoop-2.6.0/etc/hadoop/hdfs-site.xml

<property>
  <name>dfs.name.dir</name>
  <value>/home/hadoop/hadoop-2.6.0/dfs/name</value>
  <description>Path on the local filesystem where the NameNode stores the namespace and transaction logs persistently.</description>
</property>
<property>
  <name>dfs.data.dir</name>
  <value>/home/hadoop/hadoop-2.6.0/dfs/data</value>
  <description>Comma-separated list of paths on the local filesystem of a DataNode where it should store its blocks.</description>
</property>
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>

4. Modify hadoop-2.6.0/etc/hadoop/mapred-site.xml (if the file does not exist, copy it from mapred-site.xml.template first)

<property>
  <name>mapred.job.tracker</name>
  <value>master:9001</value>
  <description>Host or IP and port of the JobTracker.</description>
</property>

5. Modify hadoop-2.6.0/etc/hadoop/masters

List all the master nodes:

master

6. Modify hadoop-2.6.0/etc/hadoop/slaves

List all the DataNode machines, for example:

slave1
slave2
slave3
slave4

7. Copy the configured Hadoop folder on the master node to all slave nodes

Taking slave1 as an example, the command is as follows:

scp -r ~/hadoop-2.6.0 hadoop@slave1:~/

After the installation is complete, we format HDFS and then start all the nodes of the cluster.

5. Start Hadoop

1. Format the NameNode of the HDFS file system

(it is best to enter the hadoop-2.6.0 directory first):

cd hadoop-2.6.0        # enter the hadoop-2.6.0 directory

bin/hdfs namenode -format        # format the NameNode

2. Start the Hadoop cluster

The command to start HDFS is as follows:

sbin/start-dfs.sh        # start the HDFS daemons

If startup succeeded, running the jps command will list the running Hadoop daemon processes.
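For illustration, the expected processes look roughly like this (an assumption based on a standard HDFS-only startup, not the author's actual output; process IDs will differ):

jps
# expected on the master: NameNode, SecondaryNameNode (plus Jps itself)
# expected on each slave: DataNode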

In addition, the command to stop the HDFS cluster is as follows:

sbin/stop-dfs.sh

We can also use the web interface to check whether the installation and configuration succeeded; the address is as follows: http://master:50070/
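Finally, to run the WordCount example mentioned in the title: below is a minimal sketch using the examples jar bundled with Hadoop 2.6.0, run from the hadoop-2.6.0 directory on the master; the HDFS input/output paths are illustrative:

bin/hdfs dfs -mkdir -p /user/hadoop/input                   # create an input directory in HDFS
bin/hdfs dfs -put etc/hadoop/*.xml /user/hadoop/input       # upload some text files as sample input
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount /user/hadoop/input /user/hadoop/output
bin/hdfs dfs -cat /user/hadoop/output/part-r-00000          # print the resulting word counts

Note that the output directory must not already exist, or the job will fail.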
