Configuring Hadoop 2.5.1 on Ubuntu (pseudo-distributed mode)



One: Install JDK

Hadoop is written in Java, so the JDK must be installed on the machine beforehand; the JDK installation itself is not covered here.
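Before continuing it is worth confirming that Java is visible from the shell (a quick check, assuming java is already on your PATH):

$ java -version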

Two: Create a Hadoop user

Create a dedicated user for Hadoop and put all of Hadoop's work under this user.

$ sudo adduser hadoop

Enter a password for the new hadoop user. After the password is set, adduser prompts for some optional information to confirm:

Changing the user information for hadoop
Enter the new value, or press ENTER for the default
Full Name []:
Room Number []:
Work Phone []:
Home Phone []:
Other []:
Is the information correct? [Y/n]

Pressing Enter at each prompt keeps the defaults.
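Since the rest of the setup happens under this user, you can switch to it now (a plain su invocation; logging in as hadoop directly works just as well):

$ su - hadoop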

Three: Download Hadoop

Download a stable release tarball from the Apache Hadoop releases page; this guide uses hadoop-2.5.1: http://hadoop.apache.org/releases.html
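If you prefer to fetch the tarball from the command line, something like the following works (the archive URL is an assumption based on the usual Apache mirror layout; pick a mirror from the releases page if it differs):

$ wget https://archive.apache.org/dist/hadoop/common/hadoop-2.5.1/hadoop-2.5.1.tar.gz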

Configure ~/.bashrc (to be verified)

$ gedit ~/.bashrc

Append the following JDK settings at the end of the file:

export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_67
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
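For the new variables to take effect in the current shell (new terminals pick them up automatically), re-read the file:

$ source ~/.bashrc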

Four: Install and configure SSH

1. Install SSH:

$ sudo apt-get install ssh openssh-server



2. Passwordless SSH login

1) Generate an SSH key pair for the current user:

$ ssh-keygen -t rsa -P ""

2) Append ~/.ssh/id_rsa.pub to the target machine's ~/.ssh/authorized_keys file (here, the local machine's):

$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys



3) Use the

$ ssh localhost

command to verify that you can now log on locally without a password.


Five: Extract the Hadoop package

In a terminal, cd into the directory holding the package downloaded in step Three, then copy it to /home/hadoop:

$ cp hadoop-2.5.1.tar.gz /home/hadoop

Under /home/hadoop, extract hadoop-2.5.1.tar.gz into the current directory:

$ tar -xzvf hadoop-2.5.1.tar.gz

Six: Configure hadoop-env.sh, core-site.xml, mapred-site.xml, and hdfs-site.xml under hadoop-2.5.1/etc/hadoop

1. Configure hadoop-env.sh. On the command line:

$ gedit /home/hadoop/hadoop-2.5.1/etc/hadoop/hadoop-env.sh



Find the original lines:

# The java implementation to use.
export JAVA_HOME=${JAVA_HOME}

and change ${JAVA_HOME} to your own JDK path, for example:

# The java implementation to use.
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_67


2. Configure core-site.xml

$ gedit /home/hadoop/hadoop-2.5.1/etc/hadoop/core-site.xml

Create a new hadoop_tmp directory under /home/hadoop/hadoop-2.5.1/, then add the following between <configuration> and </configuration> in core-site.xml:

<?xml version= "1.0" encoding= "UTF-8"? ><?xml-stylesheet type= "text/xsl" href= "configuration.xsl"? ><! --Put Site-specific property overrides in this file. --><configuration>       <property>            <name>fs.default.name</name>            <value >hdfs://localhost:9000</value>        </property>    <property>          <name> hadoop.tmp.dir</name>          <value>/home/hadoop/hadoop-2.5.1/hadoop_tmp</value>         < Description>a base for other temporary directories.</description>    </property></configuration >


3. Configure mapred-site.xml
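Note that a fresh Hadoop 2.5.1 unpack may not include mapred-site.xml; if it is missing, it can be created from the template shipped in the same directory (assuming the standard 2.x layout):

$ cp /home/hadoop/hadoop-2.5.1/etc/hadoop/mapred-site.xml.template /home/hadoop/hadoop-2.5.1/etc/hadoop/mapred-site.xml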

Add the following between <configuration> and </configuration>:

<configuration>
    <property>
        <name>mapred.job.tracker</name>
        <value>localhost:9001</value>
    </property>
</configuration>


4. Configure hdfs-site.xml

hdfs-site.xml configures HDFS on each host in the cluster, specifying the local directories used by the NameNode and DataNode.

Create an hdfs folder under /home/hadoop/hadoop-2.5.1:

$ cd /home/hadoop/hadoop-2.5.1

then:

$ mkdir hdfs
$ mkdir hdfs/name
$ mkdir hdfs/data

Open hdfs-site.xml with gedit:

$ gedit /home/hadoop/hadoop-2.5.1/etc/hadoop/hdfs-site.xml

Add the following between <configuration> and </configuration> in the file:

<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/home/hadoop/hadoop-2.5.1/hdfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/home/hadoop/hadoop-2.5.1/hdfs/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>


Save and close the editor.

Seven: Format HDFS

cd into the hadoop-2.5.1 directory, then:

$ bin/hadoop namenode -format


Eight: Start Hadoop

From the hadoop-2.5.1 directory:

$ sbin/start-dfs.sh

Run the jps command and you should see the Hadoop-related processes:

$ jps

The output will look something like this:

11409 NameNode
11760 SecondaryNameNode
11874 Jps
11569 DataNode


Open http://localhost:50070/ in a browser to see the HDFS administration page.

To shut down Hadoop you can use:

$ sbin/stop-all.sh

(Since only HDFS was started here, $ sbin/stop-dfs.sh is also enough.)

At this point, the Hadoop pseudo-distributed setup is basically complete.

A summary of common issues:

(1) One reason passwordless SSH login can still fail after it has been configured:

Permission problems on ~/.ssh and the files under it:

1. The parent directory of .ssh, i.e. the user's home directory under /home, should have permission 755.

2. The .ssh directory itself should be 700, the DSA/RSA private keys 600, and the remaining files 644.
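A minimal sketch of commands to set these permissions (file names assume the RSA key generated earlier; adjust to the keys you actually have):

$ chmod 755 ~
$ chmod 700 ~/.ssh
$ chmod 600 ~/.ssh/id_rsa
$ chmod 644 ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys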

