Installing Hadoop 1.0.4 in Pseudo-Distributed Mode


I. Environment


Operating system: Ubuntu 12.04

Hadoop version: 1.0.4


II. The official Hadoop installation manual
Single-machine installation comes in two modes, "standalone" and "pseudo-distributed"; I use pseudo-distributed mode.

The manual's installation steps are very detailed, so there is no need to repeat them here. For the sake of generality, however, the manual glosses over a few key issues. These are covered in section III.


III. Shortcomings of the manual
The manual leaves two main questions unanswered:

1. Which directory should Hadoop be installed in?

There are two common choices: /usr/local and the home directory. I chose the home directory because, as someone just starting to play with Hadoop, it avoids permission hassles; for a real application deployment, /usr/local would be the better choice.
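As a sketch, a home-directory install amounts to unpacking the release tarball and pointing HADOOP_HOME at the result (the tarball name below assumes you have already downloaded the 1.0.4 release into your home directory):

```shell
# Sketch: unpack the release into the home directory (assumes the
# tarball hadoop-1.0.4.tar.gz has already been downloaded to $HOME).
cd "$HOME"
if [ -f hadoop-1.0.4.tar.gz ]; then
    tar -xzf hadoop-1.0.4.tar.gz
fi

# Point HADOOP_HOME at the unpacked tree; add this export line to
# ~/.bashrc to make it permanent.
export HADOOP_HOME="$HOME/hadoop-1.0.4"
echo "$HADOOP_HOME"
```

Nothing here needs root, which is exactly the permission convenience mentioned above.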

2. How should the configuration files be set up?

This is the most important step in installing Hadoop. All configuration files live in the ${HADOOP_HOME}/conf directory.

If you follow the manual's configuration exactly, your Hadoop may often run into situations where the NameNode or DataNode cannot start. The main difference is in core-site.xml.

The manual is configured as follows:


<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

My configuration is as follows:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:8020</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/zhengeek/hadoop-tmp</value>
    <description>a base for other temporary directories.</description>
  </property>
</configuration>


There are two differences:

1. Both port 9000 and port 8020 work, as long as the port you pick is not already occupied.
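A quick way to see whether a port is already occupied is to try to connect to it; the sketch below uses bash's /dev/tcp pseudo-device (a bash feature, not a real file — `netstat -tln` or `ss -tln` work just as well):

```shell
# Check whether anything is already listening on the chosen port.
port=8020
if (exec 3<>"/dev/tcp/localhost/$port") 2>/dev/null; then
    echo "port $port is in use"
else
    echo "port $port is free"
fi
```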

2. The easiest place to go wrong is right here; many people run into exactly this problem.

While Hadoop runs, HDFS and MapReduce have a lot of data to store. The locations can be set through dfs.name.dir and dfs.data.dir. If they are not set, the data is stored under /tmp by default; when the machine restarts, everything under /tmp is lost, and the NameNode naturally cannot start.
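If you prefer to control these locations explicitly instead of relying on hadoop.tmp.dir, you can set them in hdfs-site.xml; the paths below are examples, so substitute your own:

```xml
<!-- hdfs-site.xml: example paths, substitute your own -->
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/home/zhengeek/hadoop-name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/zhengeek/hadoop-data</value>
  </property>
</configuration>
```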

If hadoop.tmp.dir is set, the data is stored under that directory by default, because dfs.name.dir and dfs.data.dir default to subdirectories of hadoop.tmp.dir.
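The directory that hadoop.tmp.dir points to should exist and be writable by the user running Hadoop; creating it is a one-liner ($HOME/hadoop-tmp mirrors the example configuration above):

```shell
# Create the directory that hadoop.tmp.dir points to and confirm it is
# writable by the current user.
mkdir -p "$HOME/hadoop-tmp"
if [ -w "$HOME/hadoop-tmp" ]; then
    echo "hadoop-tmp is ready"
fi
```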

The hadoop.tmp.dir setting is discussed at greater length elsewhere online.

IV. Starting Hadoop

1. Format HDFS

$ bin/hadoop namenode -format

2. Start the daemons

$ bin/start-all.sh

3. Check for success

$ jps

If everything worked, you will see five processes: NameNode, SecondaryNameNode, DataNode, JobTracker, and TaskTracker.
