Ubuntu Install Hadoop (pseudo distribution mode)

Source: Internet
Author: User
Tags: hadoop, fs

This builds on the configuration from "Installing Hadoop 2.4.0 on Ubuntu 14.04 (standalone mode)".

First, configure core-site.xml

/usr/local/hadoop/etc/hadoop/core-site.xml contains configuration information that Hadoop reads when it starts.

Open this file in an editor:

sudo gedit /usr/local/hadoop/etc/hadoop/core-site.xml

Add the following between the <configuration> and </configuration> tags in the file:

<property>

<name>fs.default.name</name>

<value>hdfs://localhost:9000</value>

</property>

Save the file and close the editor.

The final modified file looks like this (XML header and comments omitted):

<configuration>

<property>

<name>fs.default.name</name>

<value>hdfs://localhost:9000</value>

</property>

</configuration>

Note that fs.default.name is the older, deprecated name for this setting; newer Hadoop releases prefer fs.defaultFS, but both are accepted in 2.4.0.
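Every Hadoop *-site.xml file is just a list of name/value property pairs, so a mistyped tag is easy to catch by parsing the file. A minimal sketch using Python's standard library (the XML snippet below mirrors the configuration above):

```python
import xml.etree.ElementTree as ET

# Minimal core-site.xml content as configured above (header omitted).
CORE_SITE = """<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>"""

def parse_hadoop_config(xml_text):
    """Return a dict of property name -> value from a *-site.xml document."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value")
            for p in root.iter("property")}

props = parse_hadoop_config(CORE_SITE)
print(props["fs.default.name"])  # hdfs://localhost:9000
```

The same helper works unchanged on yarn-site.xml, mapred-site.xml, and hdfs-site.xml, since they all share the property-list layout.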

Second, configure yarn-site.xml

/usr/local/hadoop/etc/hadoop/yarn-site.xml contains the YARN configuration that MapReduce uses when it runs.

Open this file in an editor (it is in /usr/local/hadoop/etc/hadoop/):

sudo gedit yarn-site.xml

Add the following between the <configuration> and </configuration> tags in the file:

<property>

<name>yarn.nodemanager.aux-services</name>

<value>mapreduce_shuffle</value>

</property>

<property>

<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>

<value>org.apache.hadoop.mapred.ShuffleHandler</value>

</property>

Save the file and close the editor.

The final modified file looks like this (XML header and comments omitted):

<configuration>

<property>

<name>yarn.nodemanager.aux-services</name>

<value>mapreduce_shuffle</value>

</property>

<property>

<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>

<value>org.apache.hadoop.mapred.ShuffleHandler</value>

</property>

</configuration>

Third, create and configure mapred-site.xml

By default there is a mapred-site.xml.template file in the /usr/local/hadoop/etc/hadoop/ folder. Copy this file and name it mapred-site.xml; it is used to specify the framework that MapReduce runs on.

Copy and rename:

cp mapred-site.xml.template mapred-site.xml

Open the new file in an editor:

sudo gedit mapred-site.xml

Add the following between the <configuration> and </configuration> tags in the file:

<property>

<name>mapreduce.framework.name</name>

<value>yarn</value>

</property>

Save the file and close the editor.

The final modified file looks like this (XML header and comments omitted):

<configuration>

<property>

<name>mapreduce.framework.name</name>

<value>yarn</value>

</property>

</configuration>

Fourth, configure hdfs-site.xml

/usr/local/hadoop/etc/hadoop/hdfs-site.xml configures each host in the cluster, specifying which directories on the host are used by the NameNode and the DataNode.

Create the folders that will hold the NameNode and DataNode data:

sudo mkdir -p /usr/local/hadoop/hdfs/name /usr/local/hadoop/hdfs/data

(Make sure the user that runs Hadoop owns these folders.) You can also create the folders under another path with different names, but they must match the paths configured in hdfs-site.xml.

Open hdfs-site.xml in an editor:

sudo gedit /usr/local/hadoop/etc/hadoop/hdfs-site.xml

Add the following between the <configuration> and </configuration> tags in the file:

<property>

<name>dfs.replication</name>

<value>1</value>

</property>

<property>

<name>dfs.namenode.name.dir</name>

<value>file:/usr/local/hadoop/hdfs/name</value>

</property>

<property>

<name>dfs.datanode.data.dir</name>

<value>file:/usr/local/hadoop/hdfs/data</value>

</property>

Save the file and close the editor.

The final modified file looks like this (XML header and comments omitted):

<configuration>

<property>

<name>dfs.replication</name>

<value>1</value>

</property>

<property>

<name>dfs.namenode.name.dir</name>

<value>file:/usr/local/hadoop/hdfs/name</value>

</property>

<property>

<name>dfs.datanode.data.dir</name>

<value>file:/usr/local/hadoop/hdfs/data</value>

</property>

</configuration>
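The dfs.*.dir values carry a file: URI prefix; stripping it yields the local paths that the folder-creation step above must have produced. A small Python sketch of that mapping (values copied from the configuration above):

```python
import xml.etree.ElementTree as ET

# hdfs-site.xml properties as configured above (header omitted).
HDFS_SITE = """<configuration>
<property><name>dfs.replication</name><value>1</value></property>
<property><name>dfs.namenode.name.dir</name><value>file:/usr/local/hadoop/hdfs/name</value></property>
<property><name>dfs.datanode.data.dir</name><value>file:/usr/local/hadoop/hdfs/data</value></property>
</configuration>"""

root = ET.fromstring(HDFS_SITE)
props = {p.findtext("name"): p.findtext("value") for p in root.iter("property")}

# Strip the "file:" scheme to get the local directories that must exist.
local_dirs = [props[k][len("file:"):]
              for k in ("dfs.namenode.name.dir", "dfs.datanode.data.dir")]
print(local_dirs)  # ['/usr/local/hadoop/hdfs/name', '/usr/local/hadoop/hdfs/data']
```

If these paths do not exist or are not writable by the Hadoop user, the NameNode or DataNode will fail to start after formatting.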

Fifth, format HDFS

bin/hdfs namenode -format

This only needs to be executed once; if you run it again after Hadoop has been used, all data on HDFS will be erased.

Sixth, start Hadoop

With the configuration done as described above, you can start this single-node cluster.

Execute the startup command:

sbin/start-dfs.sh

If this command produces a yes/no prompt (for the SSH host key), type yes and press Enter.

Next, execute:

sbin/start-yarn.sh

After executing these two commands, Hadoop is started and running.

Run the jps command and you should see the Hadoop-related Java processes, such as NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager (alongside Jps itself).

Open http://localhost:50070/ in a browser to see the HDFS administration page.

Open http://localhost:8088/ in a browser to see the cluster and application management page of the YARN ResourceManager.

Seventh, WordCount validation

Create an input directory on HDFS:

bin/hadoop fs -mkdir -p input

Copy README.txt from the Hadoop directory into the new input directory on HDFS:

bin/hadoop fs -copyFromLocal README.txt input

Run WordCount:

bin/hadoop jar share/hadoop/mapreduce/sources/hadoop-mapreduce-examples-2.4.0-sources.jar org.apache.hadoop.examples.WordCount input output

(If the sources jar gives a class-not-found error, run the compiled examples jar instead: bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.0.jar wordcount input output.)

You can watch the job's progress in the console output.

When the job finishes, view the word-count results:

bin/hadoop fs -cat output/*
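Conceptually, WordCount just splits each line on whitespace and tallies the tokens. A local Python sketch of the same computation (not the Hadoop implementation) is handy for predicting what the output directory should contain:

```python
from collections import Counter

def word_count(text):
    """Same tokenization as the classic WordCount example: split on whitespace."""
    return Counter(text.split())

sample = "hello hadoop hello hdfs"
counts = word_count(sample)

# Hadoop's output is one "word<TAB>count" line per word.
for word in sorted(counts):
    print(f"{word}\t{counts[word]}")
# hadoop  1
# hdfs    1
# hello   2
```

Running this sketch over README.txt locally should agree with the counts that `hadoop fs -cat output/*` prints for the job above.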

