1. Install Ubuntu 14.04 Desktop LTS
- Download the ISO file, ubuntu-14.04.1-desktop-amd64.iso.
- Create a new virtual machine in VirtualBox or VMware and set the ISO file as the startup disc.
- For installing Ubuntu 14.04 in a VMware Workstation 10 virtual machine under Windows 7, see http://www.linuxidc.com/Linux/2014-04/100473.htm
- When prompted for a user name, enter linuxidc, then continue until the installation completes.
- Restart the virtual machine and log in as linuxidc. To set the root password, run the sudo passwd command:
- linuxidc@ubuntu:~$ sudo passwd
- [sudo] password for linuxidc: (enter your own login password)
- Enter new UNIX password: (set the root password)
- Retype new UNIX password: (enter the root password again)
- passwd: password updated successfully
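- To confirm the new root password works (a quick check, not part of the original tutorial), switch to root and back:
su -       # enter the root password just set
exit       # return to the linuxidc session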
- Install the guest additions in Ubuntu (VirtualBox Guest Additions or VMware Tools) to make the later steps easier.
- To add the user to the sudo group:
- Command: sudo adduser linuxidc sudo. If the user is already in the group, a message will tell you so.
- Add a group hadoop and add linuxidc to it (use the groups command to see which groups a user belongs to):
- su
- addgroup hadoop
- usermod -a -G hadoop linuxidc
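- To confirm the group changes (a quick check, not part of the original steps; you may need to log out and back in for them to show up):
groups linuxidc      # should list sudo and hadoop among the groups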
2. Configure SSH password-free login
- Install the SSH server: sudo apt-get install openssh-server
- cd ~/.ssh/                             # if this directory does not exist, run ssh localhost once first
ssh-keygen -t rsa                        # press Enter at every prompt
cat ./id_rsa.pub >> ./authorized_keys    # add the key to the list of authorized keys
- Now try ssh localhost; you should be able to log in directly without a password.
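- To confirm passwordless login works (a quick check, not from the original; BatchMode makes ssh fail instead of asking for a password):
ssh -o BatchMode=yes localhost 'echo passwordless SSH OK'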
3. Configuring the JDK
- Download JDK 1.7 from the Oracle website, http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html. The file is jdk-7u79-linux-x64.tar.gz.
- Because the JDK will be installed under /usr/lib/jvm, first create the jvm folder under /usr/lib/:
- cd /usr/lib/
- sudo mkdir jvm
- Extract the downloaded package into the jvm folder (it unpacks as jdk1.7.0_79):
- sudo tar zxvf /home/linuxidc/Desktop/jdk-7u79-linux-x64.tar.gz -C /usr/lib/jvm
- Set JAVA_HOME:
- nano ~/.bashrc
- Add before the first line: export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_79, then save and exit.
- Make the setting take effect immediately: source ~/.bashrc
- Test whether JAVA_HOME is set correctly with echo $JAVA_HOME; if it prints the path set above, it succeeded.
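- A fuller check (a sketch; assumes the jdk1.7.0_79 path used above):
echo $JAVA_HOME                  # should print /usr/lib/jvm/jdk1.7.0_79
$JAVA_HOME/bin/java -version     # should report java version "1.7.0_79"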
4. Installing Hadoop 2.7.1
- Download the file from http://hadoop.apache.org/releases.html, choosing the 2.7.1 binary release. The file name is hadoop-2.7.1.tar.gz.
- We choose to install Hadoop under /usr/local/:
- sudo tar -zxvf /home/linuxidc/Desktop/hadoop-2.7.1.tar.gz -C /usr/local
- cd /usr/local/
sudo mv ./hadoop-2.7.1/ ./hadoop      # rename the folder to hadoop
sudo chown -R linuxidc ./hadoop       # change the owner of the files
- Test whether the installation succeeded: run /usr/local/hadoop/bin/hadoop; if the command help appears, it worked.
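- Printing the version is another quick check (not part of the original steps):
/usr/local/hadoop/bin/hadoop version     # should report Hadoop 2.7.1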
- Configure the environment variables
- nano ~/.bashrc
- After the file opens, add the following after the JAVA_HOME line configured earlier:
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
- Note: also configure export HADOOP_HOME=/usr/local/hadoop
- For the "ssh: Could not resolve hostname" and similar error prompts when starting Hadoop (seen since Hadoop 2.6.0), see the fix at http://shisiyou.blog.51cto.com/9864147/1608060
Adding the following lines to ~/.bash_profile or /etc/profile also works:
# vi /etc/profile   (or vi ~/.bash_profile)
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
Then reload the file with source so it takes effect:
# source /etc/profile   (or source ~/.bash_profile)
- Make the settings take effect immediately: source ~/.bashrc
- Check whether the installation succeeded by running hdfs; if the command help appears, it worked.
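- A short sanity check of the environment (a sketch, not from the original):
source ~/.bashrc
which hadoop hdfs      # both should point into /usr/local/hadoop/bin
hadoop version         # should report Hadoop 2.7.1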
5. Configuring pseudo-distributed mode
- Switch to the configuration directory: cd /usr/local/hadoop/etc/hadoop
- Configure core-site.xml, execute the command: nano core-site.xml. Replace the <configuration></configuration> section in the file with:
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/usr/local/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
- Configure hdfs-site.xml, execute the command: nano hdfs-site.xml, and modify it to:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/hadoop/tmp/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/hadoop/tmp/dfs/data</value>
  </property>
</configuration>
A note on the Hadoop configuration items: although only fs.defaultFS and dfs.replication are strictly required to run (this is what the official tutorial configures), if you do not set the hadoop.tmp.dir parameter, the default temporary directory is /tmp/hadoop-hadoop. That directory may be cleaned out by the system on reboot, which would force you to reformat the NameNode. So we set it explicitly, and we also specify dfs.namenode.name.dir and dfs.datanode.data.dir, otherwise you may run into errors in the following steps.
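Optionally, these directories can be created ahead of time (a sketch, not in the original; Hadoop will otherwise create them during formatting and startup):
mkdir -p /usr/local/hadoop/tmp/dfs/name /usr/local/hadoop/tmp/dfs/data   # linuxidc already owns /usr/local/hadoop after the chown step above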
- Configure yarn-site.xml, execute the command: nano yarn-site.xml, and modify it to:
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
- Configure mapred-site.xml
- Create the file from the template, execute the command: mv mapred-site.xml.template mapred-site.xml
- Execute the command: nano mapred-site.xml
- Change the <configuration> section to:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
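- To catch typos before starting Hadoop, the edited files can be checked for well-formed XML (a sketch; assumes xmllint from the libxml2-utils package is installed):
xmllint --noout core-site.xml hdfs-site.xml yarn-site.xml mapred-site.xml   # prints nothing if every file parses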
- After the configuration is complete, format the NameNode: hdfs namenode -format. If "Exiting with status 0" appears near the end of the output, the format succeeded; "Exiting with status 1" indicates an error.
- Start all of the Hadoop daemons: start-all.sh
- To check whether each process started normally, execute: jps. If everything is OK, you will see output like the following:
2583 DataNode
2970 ResourceManager
3461 Jps
3177 NodeManager
2361 NameNode
2840 SecondaryNameNode
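- If you prefer a scripted check (a sketch, not part of the original tutorial):
for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
  jps | grep -qw "$d" && echo "$d is running" || echo "$d is NOT running"
done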
- "Note" If you are prompted after performing JPS:
The
program ' JPS ' is already included in the following packages: * default-jdk* ecj* gcj-4.6-jdk* openjdk-6-jdk* gcj-4.5-jdk* openjdk-7-jdk try: sudo apt-get insta ll < selected packages >
then, manually set the system default JDK by executing the following command:sudo update-alternatives--install/usr/bin/jps JPS/usr/lib/jvm/jdk1.7.0_79/bin/jps 1 sudo update-alternatives--install/usr/bin/javac javac/usr/lib/jvm/Jdk1.7.0_79/bin/javac sudo update-alternatives--install/usr/bin/java java/usr/lib/jvm/Jdk1.7.0_79/bin/java
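- Afterwards, a quick check that the commands now resolve to the Oracle JDK (not part of the original):
jps              # should now run without the package prompt
java -version    # should report java version "1.7.0_79"
javac -version   # should report javac 1.7.0_79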
This completes the deployment of Hadoop 2.7.1 on Ubuntu 14.04 LTS.