One: Install JDK
Hadoop is written in Java, so you need to install the JDK on your local machine first. JDK installation is not covered in detail here.
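A quick way to check that a JDK is already available on the PATH (any recent JDK works; this guide later assumes jdk1.7.0_67):
$ java -version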
Two: Create a Hadoop user
Create a dedicated user for Hadoop and put all of Hadoop's work under this user.
$ sudo adduser hadoop
After entering and confirming a password for the new hadoop user, you will be prompted for some optional information:
Changing the user information for hadoop
Enter the new value, or press ENTER for the default
Full Name []:
Room Number []:
Work Phone []:
Home Phone []:
Other []:
Is this information correct? [y/n]
Press Enter at each prompt to accept the defaults.
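The remaining steps assume you are working as this new user; a minimal way to switch to it:
$ su - hadoop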
Three: Download Hadoop
Download a stable release package from Apache Hadoop; this guide uses hadoop-2.5.1: http://hadoop.apache.org/releases.html
Configure ~/.bashrc (to be verified)
$ sudo gedit ~/.bashrc
Append the following JDK settings at the end of the file:
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_67
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
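To make the new variables take effect in the current shell, reload the file and check the result (the path shown assumes the JDK location above):
$ source ~/.bashrc
$ echo $JAVA_HOME
/usr/lib/jvm/jdk1.7.0_67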
Four: Install and configure SSH
1. Install SSH:
$ sudo apt-get install ssh openssh-server
2. Configure passwordless SSH login
1) Generate an SSH key pair for the current user:
$ ssh-keygen -t rsa -P ""
2) Add ~/.ssh/id_rsa.pub to the target machine's ~/.ssh/authorized_keys file
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
3) Verify that you can log in locally without a password:
$ ssh localhost
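Note: the very first connection will ask you to confirm the host key; answer yes. You can then leave the test session with:
$ exit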
Five: Unpack the Hadoop release package
In a terminal, cd into the directory containing the Hadoop package downloaded in step Three, then copy the package to /home/hadoop:
$ cp hadoop-2.5.1.tar.gz /home/hadoop
Under /home/hadoop, unpack hadoop-2.5.1.tar.gz into the current folder:
$ tar -xzvf hadoop-2.5.1.tar.gz
Six: Configure hadoop-env.sh, core-site.xml, mapred-site.xml, and hdfs-site.xml under hadoop-2.5.1/etc/hadoop
1. Configure hadoop-env.sh. From the command line:
$ gedit /home/hadoop/hadoop-2.5.1/etc/hadoop/hadoop-env.sh
Find the following original lines:
# The java implementation to use.
export JAVA_HOME=${JAVA_HOME}
and change ${JAVA_HOME} to your own JDK path, for example:
# The java implementation to use.
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_67
2. Configure core-site.xml
$ gedit /home/hadoop/hadoop-2.5.1/etc/hadoop/core-site.xml
First create a new hadoop_tmp directory under /home/hadoop/hadoop-2.5.1/.
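A one-line sketch of the command, using the paths from this guide:
$ mkdir /home/hadoop/hadoop-2.5.1/hadoop_tmp
Then, in core-site.xml, add the following between <configuration> and </configuration>: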
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/hadoop-2.5.1/hadoop_tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
</configuration>
3. Configure mapred-site.xml
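Note: in Hadoop 2.x releases this file may not exist out of the box; if etc/hadoop only contains mapred-site.xml.template, copy it first (paths as used throughout this guide):
$ cp /home/hadoop/hadoop-2.5.1/etc/hadoop/mapred-site.xml.template /home/hadoop/hadoop-2.5.1/etc/hadoop/mapred-site.xml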
Add the following between <configuration> and </configuration>:
<configuration>
    <property>
        <name>mapred.job.tracker</name>
        <value>localhost:9001</value>
    </property>
</configuration>
4. Configure hdfs-site.xml
hdfs-site.xml configures each host in the cluster, specifying the local directories used by the NameNode and DataNode.
Create an hdfs folder (with name and data subdirectories) under /home/hadoop/hadoop-2.5.1:
$ cd /home/hadoop/hadoop-2.5.1
$ mkdir hdfs
$ mkdir hdfs/name
$ mkdir hdfs/data
Open hdfs-site.xml with gedit:
$ gedit /home/hadoop/hadoop-2.5.1/etc/hadoop/hdfs-site.xml
Add the following between <configuration> and </configuration>:
<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/home/hadoop/hadoop-2.5.1/hdfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/home/hadoop/hadoop-2.5.1/hdfs/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
Save and close the editor.
Seven: Format HDFS
cd into the hadoop-2.5.1 directory, then:
$ bin/hadoop namenode -format
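Note: in Hadoop 2.x the bin/hadoop namenode form still works but is marked deprecated; the current equivalent is:
$ bin/hdfs namenode -format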
Eight: Start Hadoop
cd into the hadoop-2.5.1 directory, then:
$ sbin/start-dfs.sh
Run the jps command and you should see the Hadoop-related processes:
$ jps
Output similar to the following should appear:
hadoop@<hostname>:~/hadoop-2.5.1$ jps
11409 NameNode
11760 SecondaryNameNode
11874 Jps
11569 DataNode
Open http://localhost:50070/ in a browser and you will see the HDFS administration page.
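As a further check, you can exercise HDFS itself from the hadoop-2.5.1 directory; the /test path here is just an arbitrary example:
$ bin/hdfs dfs -mkdir /test
$ bin/hdfs dfs -ls /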
To stop Hadoop, use:
$ sbin/stop-all.sh
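Note: stop-all.sh is deprecated in Hadoop 2.x; since only HDFS was started above, the matching command is:
$ sbin/stop-dfs.sh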
At this point, the Hadoop pseudo-distributed setup is basically complete.
Summary of common issues:
(1) One reason passwordless SSH login may fail after configuration:
Permission problems with .ssh and the files under it:
1. The permissions on the user's home directory (the parent of .ssh, under /home) should be 755.
2. The .ssh directory's permissions should be 700, the DSA and RSA private keys should be 600, and the remaining files should be 644, for example:
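A sketch of the matching commands, run as the hadoop user (the key file names assume the RSA key generated earlier):
$ chmod 755 /home/hadoop
$ chmod 700 ~/.ssh
$ chmod 600 ~/.ssh/id_rsa
$ chmod 644 ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys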