[Hadoop] Installing Hadoop (standalone mode) on Ubuntu, step by step

Source: Internet
Author: User

1 Creating Hadoop user groups and Hadoop users

  STEP1: Create a Hadoop user group:

~$ sudo addgroup hadoop

  STEP2: Create a hadoop user in that group:

~$ sudo adduser --ingroup hadoop hadoop

Enter a password when prompted. This becomes the password of the new hadoop user; type it twice, pressing Enter each time.
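As a quick check of the two steps above, the following sketch reports whether the new user exists and belongs to the group. The helper name user_in_group is made up for this illustration; it relies only on the standard id, tr and grep utilities.

```shell
# Return success if the given user exists and belongs to the given group.
# (user_in_group is a hypothetical helper name, not part of any tool.)
user_in_group() {
    user="$1"
    group="$2"
    # id -nG prints the user's group names separated by spaces.
    id -nG "$user" 2>/dev/null | tr ' ' '\n' | grep -qxF "$group"
}

if user_in_group hadoop hadoop; then
    echo "user 'hadoop' is a member of group 'hadoop'"
else
    echo "user 'hadoop' is missing or not in group 'hadoop'"
fi
```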

 

  STEP3: Grant sudo privileges to the hadoop user:

~$ sudo gedit /etc/sudoers

Pressing Enter opens the sudoers file. Find the line

root	ALL=(ALL:ALL) ALL

and add the following line after it:

hadoop	ALL=(ALL:ALL) ALL

  Note: the separator after "hadoop" is a tab ("\t"), not a space. Edit carefully: a mistake in the sudoers file can have serious consequences, such as the sudo command no longer working, in which case the file can only be repaired with root privileges.
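Because the separator in that entry is easy to get wrong, here is a small sketch that builds the line with printf and confirms that a tab follows the user name. The function name sudoers_entry is hypothetical. The commented-out last line uses visudo's check mode, which validates the installed file and needs root.

```shell
# Print a sudoers entry for the given user, with a literal tab after the name.
# (sudoers_entry is a hypothetical helper name.)
sudoers_entry() {
    printf '%s\tALL=(ALL:ALL) ALL\n' "$1"
}

entry=$(sudoers_entry hadoop)
tab=$(printf '\t')

# Confirm that the character right after the user name is a tab.
case "$entry" in
    "hadoop${tab}"*) echo "separator is a tab" ;;
    *)               echo "separator is not a tab" ;;
esac

# Syntax check of the installed sudoers file (needs root), left commented out:
# sudo visudo -c
```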

  

2 Logging in to Ubuntu as the new hadoop user

~$ su - hadoop

Enter the password of the hadoop user when prompted.

3 Installing SSH

  STEP4: Install the SSH server that Hadoop needs for communication:

~$ sudo apt-get install openssh-server

When the installation is complete, start the service:

~$ sudo /etc/init.d/ssh start

After starting it, you can confirm that the service is running with the following command:

~$ ps -e | grep ssh
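The same check can be wrapped in a small function that matches the process name exactly, which avoids false positives from unrelated processes whose names merely contain "ssh". This is a minimal sketch; is_running is a hypothetical helper name, and it assumes a ps that supports the -o comm= output format.

```shell
# Return success if a process with exactly this command name is running.
# (is_running is a hypothetical helper name.)
is_running() {
    ps -e -o comm= 2>/dev/null | grep -qxF "$1"
}

if is_running sshd; then
    echo "sshd is running"
else
    echo "sshd is not running"
fi
```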

SSH is a secure communication protocol and normally asks for a password, so we set up password-free login by generating a private key and a public key:

~$ ssh-keygen -t rsa -P ""

Two files are generated under /home/hadoop/.ssh: id_rsa and id_rsa.pub. The former is the private key and the latter is the public key. Now append the public key to authorized_keys (this file holds the public keys of all clients that are allowed to log in over SSH as the current user):

~$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
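Running the append command twice leaves a duplicate entry in the file. A possible way to make the step repeatable is sketched below: it appends the key only when it is not already present and tightens the permissions, since sshd may ignore an authorized_keys file that other users can write to. The function name add_authorized_key is an assumption for this example.

```shell
# Append a public key to an authorized_keys file only if it is not there yet.
# (add_authorized_key is a hypothetical helper name.)
add_authorized_key() {
    pub="$1"
    auth="$2"
    dir=$(dirname "$auth")
    mkdir -p "$dir"
    touch "$auth"
    key=$(cat "$pub")
    # -F fixed string, -x whole line, -q quiet: true if the key is already listed.
    grep -qxF -e "$key" "$auth" || printf '%s\n' "$key" >> "$auth"
    chmod 700 "$dir"
    chmod 600 "$auth"
}

# Usage for the key generated above (left commented out):
# add_authorized_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
```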

You can now log in to SSH to confirm that you do not need to enter a password:

~$ ssh localhost

  Exit:

~$ exit

4 Installing Java

  STEP5: Install Java (OpenJDK 7, which matches the java-7-openjdk-amd64 path used in the configuration steps below):

~$ sudo apt-get install openjdk-7-jdk

  After installation, you can check the Java version with the following command:

~$ java -version
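If the version needs to be checked from a script, note that java -version writes to standard error and that older releases report themselves as "1.x". The sketch below extracts the major version under those assumptions; java_major is a hypothetical helper name.

```shell
# Extract the major Java version from a version string
# such as "1.7.0_79" (major 7) or "11.0.2" (major 11).
# (java_major is a hypothetical helper name.)
java_major() {
    case "$1" in
        1.*) echo "$1" | cut -d. -f2 ;;
        *)   echo "$1" | cut -d. -f1 ;;
    esac
}

if command -v java >/dev/null 2>&1; then
    # java -version prints to stderr; the version is the quoted token on line 1.
    v=$(java -version 2>&1 | head -n 1 | cut -d'"' -f2)
    echo "Java major version: $(java_major "$v")"
else
    echo "java not found on PATH"
fi
```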

5 Installing and configuring Hadoop

  STEP6: Installing Hadoop:

1) Download:

The latest version at the time of writing is 2.7.0; you can download a different version of Hadoop according to your needs.

2) Unzip:

~$ sudo tar xzf hadoop-2.7.0.tar.gz

3) Move Hadoop to the /usr/local/hadoop directory:

~$ sudo mv hadoop-2.7.0 /usr/local/hadoop

4) Make the hadoop user the owner of the directory, so that all later operations can be done as user hadoop:

~$ sudo chown -R hadoop:hadoop /usr/local/hadoop
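The unpack and move steps above must use the same version number in the archive name and in the extracted directory name. One way to keep them in sync is to take the version as a parameter, as in this sketch. The function name unpack_hadoop is hypothetical, and the function assumes that the archive hadoop-<version>.tar.gz sits in the current directory and extracts to a directory named hadoop-<version>. It does not call sudo itself, so for a destination under /usr/local it would have to run from a root shell.

```shell
# Unpack hadoop-<version>.tar.gz from the current directory and move the
# extracted directory to <destination>.
# (unpack_hadoop is a hypothetical helper name.)
unpack_hadoop() {
    version="$1"
    dest="$2"
    archive="hadoop-${version}.tar.gz"
    if [ ! -f "$archive" ]; then
        echo "missing $archive" >&2
        return 1
    fi
    tar xzf "$archive" && mv "hadoop-${version}" "$dest"
}

# Usage matching the steps above (left commented out; needs root for /usr/local):
# unpack_hadoop 2.7.0 /usr/local/hadoop
# chown -R hadoop:hadoop /usr/local/hadoop
```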

  STEP7: Configuring Hadoop:

1) Configure .bashrc:

Before editing the file, you need to know the Java installation path, which you can find with the following command:

~$ update-alternatives --config java

The output lists the full path of the java binary, for example /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java. JAVA_HOME is only the leading part of that path: /usr/lib/jvm/java-7-openjdk-amd64.
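Trimming the path by hand can also be expressed with shell parameter expansion. The following is a minimal sketch, assuming the binary path ends in /bin/java with an optional /jre component before it; java_home_from_path is a hypothetical helper name.

```shell
# Derive a JAVA_HOME candidate from the full path of the java binary.
# (java_home_from_path is a hypothetical helper name.)
java_home_from_path() {
    p="${1%/bin/java}"          # drop the trailing /bin/java
    printf '%s\n' "${p%/jre}"   # drop a trailing /jre, if present
}

if command -v java >/dev/null 2>&1; then
    # readlink -f follows the /etc/alternatives symlinks to the real binary.
    java_home_from_path "$(readlink -f "$(command -v java)")"
else
    echo "java not found on PATH"
fi
```

For the path shown above, java_home_from_path /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java prints /usr/lib/jvm/java-7-openjdk-amd64.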

Modify the .bashrc file:

~$ sudo gedit ~/.bashrc

This command opens the file in an editor. Append the following lines to the end of the file, then save it and close the editor.

#HADOOP VARIABLES START

export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64

export HADOOP_INSTALL=/usr/local/hadoop

export PATH=$PATH:$HADOOP_INSTALL/bin

export PATH=$PATH:$HADOOP_INSTALL/sbin

export HADOOP_MAPRED_HOME=$HADOOP_INSTALL

export HADOOP_COMMON_HOME=$HADOOP_INSTALL

export HADOOP_HDFS_HOME=$HADOOP_INSTALL

export YARN_HOME=$HADOOP_INSTALL

export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native

export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"

#HADOOP VARIABLES END

To make the new environment variables take effect in the current shell:

~$ source ~/.bashrc
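After reloading the file, it can be worth confirming that the Hadoop directories really ended up on PATH. The sketch below does an exact match against each PATH entry; path_contains is a hypothetical helper name, and the two directories are the ones assumed throughout this article.

```shell
# Return success if directory $1 is an entry of the colon-separated list $2.
# (path_contains is a hypothetical helper name.)
path_contains() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

for d in /usr/local/hadoop/bin /usr/local/hadoop/sbin; do
    if path_contains "$d" "$PATH"; then
        echo "$d is on PATH"
    else
        echo "$d is NOT on PATH"
    fi
done
```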

2) Configure hadoop-env.sh

Open the hadoop-env.sh file:

~$ sudo gedit /usr/local/hadoop/etc/hadoop/hadoop-env.sh

Locate the JAVA_HOME variable and change the line to:

export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64

To make the configuration take effect:

~$ source /usr/local/hadoop/etc/hadoop/hadoop-env.sh
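The JAVA_HOME edit can also be scripted instead of done in an editor. This is a sketch under the assumption that hadoop-env.sh contains a line beginning with "export JAVA_HOME="; the function name set_java_home is hypothetical. It rewrites the file contents in place so that ownership and permissions stay unchanged.

```shell
# Replace the "export JAVA_HOME=..." line in a file with a fixed path.
# (set_java_home is a hypothetical helper name.)
set_java_home() {
    file="$1"
    home="$2"
    tmp=$(mktemp)
    # Rewrite only lines that start with "export JAVA_HOME=".
    sed "s|^export JAVA_HOME=.*|export JAVA_HOME=${home}|" "$file" > "$tmp" \
        && cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Usage for the file edited above (left commented out):
# set_java_home /usr/local/hadoop/etc/hadoop/hadoop-env.sh /usr/lib/jvm/java-7-openjdk-amd64
```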

At this point, the standalone-mode installation of Hadoop is complete.

6 Hadoop Test

  To check that Hadoop was installed correctly, we can run one of the bundled examples (such as WordCount).

1) Create the input folder under the /usr/local/hadoop path

~$ cd /usr/local/hadoop

~$ mkdir input

2) Copy README.txt to the input folder

~$ cp README.txt input

3) Run the WordCount example

~$ bin/hadoop jar share/hadoop/mapreduce/sources/hadoop-mapreduce-examples-2.7.0-sources.jar org.apache.hadoop.examples.WordCount input output

If the job finishes without errors and writes its results to the output folder, congratulations: Hadoop has been installed successfully.
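To look at the result rather than only the job log, the output folder can be inspected directly. The sketch below assumes the usual MapReduce output layout, namely a _SUCCESS marker file plus part-r-* files holding one "word<TAB>count" pair per line; top_words is a hypothetical helper name.

```shell
# Print the N most frequent words from a WordCount output directory.
# (top_words is a hypothetical helper name.)
top_words() {
    dir="$1"
    n="${2:-10}"
    if [ ! -f "$dir/_SUCCESS" ]; then
        echo "no _SUCCESS marker in $dir" >&2
        return 1
    fi
    # Sort numerically on the second, tab-separated field, largest first.
    cat "$dir"/part-r-* | sort -t "$(printf '\t')" -k2,2nr | head -n "$n"
}

# Usage after the WordCount run above (left commented out):
# top_words /usr/local/hadoop/output 10
```

Note that Hadoop refuses to write into an existing output directory, so the output folder has to be removed before the example is run again.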

7 Conclusion

Installing Hadoop on Ubuntu shows that something which looks difficult before you try it can turn out to be simple. As long as you are willing to learn and proceed step by step, there is always a way to solve the problems that come up. Let us encourage each other.

