Configuring Hadoop 2.5.1 on Ubuntu (pseudo-distributed mode)



One: Install JDK

Hadoop is written in Java, so the JDK must be installed on the machine beforehand; the JDK installation itself is not covered here.
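Before continuing it is worth confirming that Java is visible from the shell (a quick check, assuming java is already on your PATH):

$ java -version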

Two: Create a Hadoop user

Create a dedicated user for Hadoop and put all of Hadoop's work under this user.

$ sudo adduser hadoop

Enter a password for the new hadoop user. After the password is set, adduser prompts for some optional information to confirm:

Changing the user information for hadoop
Enter the new value, or press ENTER for the default
Full Name []:
Room Number []:
Work Phone []:
Home Phone []:
Other []:
Is the information correct? [Y/n]

Pressing Enter at each prompt keeps the defaults.
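Since the rest of the setup happens under this user, you can switch to it now (a plain su invocation; logging in as hadoop directly works just as well):

$ su - hadoop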

Three: Download Hadoop

Download a stable release tarball from the Apache Hadoop releases page; this guide uses hadoop-2.5.1: http://hadoop.apache.org/releases.html
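If you prefer to fetch the tarball from the command line, something like the following works (the archive URL is an assumption based on the usual Apache mirror layout; pick a mirror from the releases page if it differs):

$ wget https://archive.apache.org/dist/hadoop/common/hadoop-2.5.1/hadoop-2.5.1.tar.gz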

Configure ~/.bashrc (to be verified)

$ gedit ~/.bashrc

Append the following JDK settings at the end of the file:

export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_67
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
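For the new variables to take effect in the current shell (new terminals pick them up automatically), re-read the file:

$ source ~/.bashrc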

Four: Install and configure SSH

1. Install SSH:

$ sudo apt-get install ssh openssh-server



2. Passwordless SSH login

1) Generate an SSH key pair for the current user:

$ ssh-keygen -t rsa -P ""

2) Append ~/.ssh/id_rsa.pub to the target machine's ~/.ssh/authorized_keys file (here, the local machine's):

$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys



3) Use the

$ ssh localhost

command to verify that you can now log on locally without a password.


Five: Extract the Hadoop package

In a terminal, cd into the directory holding the package downloaded in step Three, then copy it to /home/hadoop:

$ cp hadoop-2.5.1.tar.gz /home/hadoop

Under /home/hadoop, extract hadoop-2.5.1.tar.gz into the current directory:

$ tar -xzvf hadoop-2.5.1.tar.gz

Six: Configure hadoop-env.sh, core-site.xml, mapred-site.xml, and hdfs-site.xml under hadoop-2.5.1/etc/hadoop

1. Configure hadoop-env.sh. On the command line:

$ gedit /home/hadoop/hadoop-2.5.1/etc/hadoop/hadoop-env.sh



Find the original lines:

# The java implementation to use.
export JAVA_HOME=${JAVA_HOME}

and change ${JAVA_HOME} to your own JDK path, for example:

# The java implementation to use.
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_67


2. Configure core-site.xml

$ gedit /home/hadoop/hadoop-2.5.1/etc/hadoop/core-site.xml

Create a new hadoop_tmp directory under /home/hadoop/hadoop-2.5.1/, then add the following between <configuration> and </configuration> in core-site.xml:

<?xml version= "1.0" encoding= "UTF-8"? ><?xml-stylesheet type= "text/xsl" href= "configuration.xsl"? ><! --Put Site-specific property overrides in this file. --><configuration>       <property>            <name>fs.default.name</name>            <value >hdfs://localhost:9000</value>        </property>    <property>          <name> hadoop.tmp.dir</name>          <value>/home/hadoop/hadoop-2.5.1/hadoop_tmp</value>         < Description>a base for other temporary directories.</description>    </property></configuration >


3. Configure mapred-site.xml
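Note that a fresh Hadoop 2.5.1 unpack may not include mapred-site.xml; if it is missing, it can be created from the template shipped in the same directory (assuming the standard 2.x layout):

$ cp /home/hadoop/hadoop-2.5.1/etc/hadoop/mapred-site.xml.template /home/hadoop/hadoop-2.5.1/etc/hadoop/mapred-site.xml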

Add the following between <configuration> and </configuration>:

<configuration>
    <property>
        <name>mapred.job.tracker</name>
        <value>localhost:9001</value>
    </property>
</configuration>


4. Configure hdfs-site.xml

hdfs-site.xml configures HDFS on each host in the cluster, specifying the local directories used by the NameNode and DataNode.

Create an hdfs folder under /home/hadoop/hadoop-2.5.1:

$ cd /home/hadoop/hadoop-2.5.1

then:

$ mkdir hdfs
$ mkdir hdfs/name
$ mkdir hdfs/data

Open hdfs-site.xml with gedit:

$ gedit /home/hadoop/hadoop-2.5.1/etc/hadoop/hdfs-site.xml

Add the following between <configuration> and </configuration> in the file:

<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/home/hadoop/hadoop-2.5.1/hdfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/home/hadoop/hadoop-2.5.1/hdfs/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>


Save and close the editor.

Seven: Format HDFS

cd into the hadoop-2.5.1 directory, then:

$ bin/hadoop namenode -format


Eight: Start Hadoop

From the hadoop-2.5.1 directory:

$ sbin/start-dfs.sh

Run the jps command and you should see the Hadoop-related processes:

$ jps

The output will look something like this:

11409 NameNode
11760 SecondaryNameNode
11874 Jps
11569 DataNode


Open http://localhost:50070/ in a browser to see the HDFS administration page.

To shut down Hadoop you can use:

$ sbin/stop-all.sh

(Since only HDFS was started here, $ sbin/stop-dfs.sh is also enough.)

At this point, the Hadoop pseudo-distributed setup is basically complete.

A summary of common issues:

(1) One reason passwordless SSH login can still fail after it has been configured:

Permission problems on ~/.ssh and the files under it:

1. The parent directory of .ssh, i.e. the user's home directory under /home, should have permission 755.

2. The .ssh directory itself should be 700, the DSA/RSA private keys 600, and the remaining files 644.
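A minimal sketch of commands to set these permissions (file names assume the RSA key generated earlier; adjust to the keys you actually have):

$ chmod 755 ~
$ chmod 700 ~/.ssh
$ chmod 600 ~/.ssh/id_rsa
$ chmod 644 ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys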

