Hadoop learning notes (1) Environment setup


My environment: Hadoop 1.0.0 on Ubuntu 11.10 (single machine, pseudo-distributed mode).

Install SSH

apt-get install ssh
 
Install rsync
apt-get install rsync
 
Configure passwordless SSH login
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
 
Verify that it works
ssh localhost
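
If ssh localhost still asks for a password, file permissions are the usual cause; tightening them (a common fix, not part of the original notes) generally resolves it:
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys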
 
Install Hadoop 1.0.0 and the JDK

Open a Linux terminal and create an app directory; both Java and Hadoop will be installed under it.
 
mkdir /home/app
 
Next, install the JDK and unpack Hadoop:

cd /home/app
chmod +x jdk-6u31-linux-i586.bin
./jdk-6u31-linux-i586.bin
tar zxf hadoop-1.0.0-bin.tar.gz
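
After these steps /home/app should contain a jdk1.6.0_31 directory (the self-extracting .bin unpacks into the current directory) and a hadoop-1.0.0 directory; a quick check:
ls /home/app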
 
 
Configure the JDK environment variables

vi /etc/profile
 
 
Add the following lines at the end:

export JAVA_HOME=/home/app/jdk1.6.0_31
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
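
These variables take effect in new shells; to apply them in the current session and confirm the JDK is found (a quick sanity check):
source /etc/profile
java -version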
 
 
Configure Hadoop

Go to the hadoop directory

cd /home/app/hadoop-1.0.0
 
 
Edit the environment configuration file and set the JDK installation path:
vi conf/hadoop-env.sh
export JAVA_HOME=/home/app/jdk1.6.0_31
 
 
Edit Hadoop's core configuration file, core-site.xml, which sets the HDFS address and port:
vi conf/core-site.xml

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
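
Optionally (not part of the original notes), hadoop.tmp.dir can be set in the same file so that HDFS data is not kept under the default location in /tmp, which may be cleared on reboot; the directory below is only an example path:

<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/app/hadoop-tmp</value>
</property>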
 
 
Edit the HDFS configuration, hdfs-site.xml. The default replication factor is 3; since this is a single-machine installation, change it to 1:
vi conf/hdfs-site.xml

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
 
 
Edit the MapReduce configuration file, which sets the jobtracker address and port:

vi conf/mapred-site.xml

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
 
 
Next, start Hadoop. Before starting it for the first time, format the HDFS file system: go to the hadoop directory and run
bin/hadoop namenode -format
 
 
Start Hadoop with
bin/start-all.sh
 
 
This command starts all services.
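
A quick way to confirm the daemons are running (an extra check, not in the original notes) is the JDK's jps tool, which should list NameNode, DataNode, SecondaryNameNode, JobTracker, and TaskTracker:
jps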
 
 
Finally, verify that Hadoop was installed successfully. Open a browser and visit:

http://localhost:50030 (the MapReduce/JobTracker web page)

http://localhost:50070 (the HDFS/NameNode web page)

If both pages load and show cluster status, the installation succeeded.
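
As an additional smoke test (not from the original notes), a basic HDFS command run from the hadoop directory should also work, for example:
bin/hadoop fs -ls /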

Hadoop divides hosts into two roles from each of three perspectives:

First, hosts are divided into master and slave.

Second, from the HDFS perspective, hosts are divided into namenode and datanode (in a distributed file system, managing the directory namespace is critical; the namenode acts as that directory manager, i.e. the master for HDFS).

Third, from the MapReduce perspective, hosts are divided into jobtracker and tasktracker (a job is usually split into multiple tasks, which makes the relationship between the two easy to see).
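
In Hadoop 1.x these roles also map onto two plain-text files under conf/: conf/slaves lists the hosts that run the datanode and tasktracker daemons, while conf/masters lists where the secondary namenode runs. In this pseudo-distributed setup both simply contain localhost, which can be confirmed with:
cat conf/masters conf/slaves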
