Hadoop Pseudo-Distributed Setup


Environment

Virtual machine: VirtualBox

Ubuntu: 14.04

Hadoop: 2.6

Installation

1. Create a Hadoop user

sudo useradd -m hadoop -s /bin/bash

"Ubuntu terminal copy and paste shortcut keys"

"In the Ubuntu Terminal window, the copy and paste shortcuts need to be added with shift, that is, paste is ctrl+shift+v. 】

Use the following command to set a password for the hadoop user, entering the password twice as prompted:

sudo passwd hadoop

Add administrator privileges to the hadoop user to ease deployment and avoid some tricky permission issues for newcomers:

sudo adduser hadoop sudo
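To verify that the hadoop user is now in the sudo group, you can check its group membership:

groups hadoop    # should list the sudo group, e.g.: hadoop : hadoop sudo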

2. Switch to the hadoop user

su hadoop

3. Install SSH server and configure passwordless SSH login

Both cluster and single-node modes require SSH logins (similar to remote login: you log on to a Linux machine and run commands on it). Ubuntu has the SSH client installed by default; the SSH server also needs to be installed:

sudo apt-get install openssh-server

After installation, you can use the following command to log in to the local machine:

ssh localhost


At this point you will see a prompt (the first-time SSH login prompt); enter yes. Then enter the hadoop user's password as prompted, and you will be logged in to the machine.

However, this kind of login requires entering the password every time; configuring passwordless SSH login is more convenient.

First exit the SSH session you just opened and return to the original terminal window, then use ssh-keygen to generate a key and add it to the authorized keys:

exit                                    # quit the ssh localhost session
cd ~/.ssh/                              # if this directory does not exist, run ssh localhost first
ssh-keygen -t rsa                       # press Enter at every prompt
cat ./id_rsa.pub >> ./authorized_keys   # add the key to the authorized list

Now you can log in directly with the ssh localhost command, without entering a password.
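If passwordless login still prompts for a password, a common cause is that the key files are too permissive for sshd to accept them; a minimal fix, assuming the default key locations:

chmod 700 ~/.ssh                    # sshd ignores keys if the directory is group/world writable
chmod 600 ~/.ssh/authorized_keys    # the authorized keys file must be private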


4. Installing the Java Environment

The Java environment can be either Oracle's JDK or OpenJDK. According to http://wiki.apache.org/hadoop/HadoopJavaVersions, newer releases of OpenJDK 1.7 are fine. For convenience, this guide installs OpenJDK 7 directly from the command line.

sudo apt-get install openjdk-7-jre openjdk-7-jdk

The default OpenJDK installation location is /usr/lib/jvm/java-7-openjdk-amd64 (on a 32-bit system it is /usr/lib/jvm/java-7-openjdk-i386; you can check with dpkg -L openjdk-7-jdk). It is ready to use after installation; you can verify with java -version.
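If you are unsure of the exact path, one way to discover it is to resolve the java compiler binary through the alternatives symlinks:

readlink -f /usr/bin/javac | sed "s:/bin/javac::"    # prints the JDK home, e.g. /usr/lib/jvm/java-7-openjdk-amd64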

Next, configure the JAVA_HOME environment variable. For convenience, set it in ~/.bashrc (further reading: ways to set Linux environment variables and their differences):

vi ~/.bashrc

Add a single line at the top of the file (note that there must be no spaces around the = sign), then save:

export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64

(If the file did not exist before, it will be empty; that does not matter.)

[Figure: configuring the JAVA_HOME variable]

Then make the environment variable take effect by executing the following commands:

source ~/.bashrc    # make the variable setting take effect
echo $JAVA_HOME     # check whether it is set correctly

When set correctly, echo $JAVA_HOME prints the path you configured, e.g. /usr/lib/jvm/java-7-openjdk-amd64.

[Figure: JAVA_HOME variable configured successfully]

5. Install Hadoop

Go to the directory containing the Hadoop archive and unpack it to /usr/local/hadoop:

sudo tar -zxvf ./hadoop-2.6.0.tar.gz -C /usr/local   # unpack to /usr/local
cd /usr/local/
sudo mv ./hadoop-2.6.0/ ./hadoop                     # rename the folder to hadoop
sudo chown -R hadoop:hadoop ./hadoop                 # change the owner to the hadoop user

Hadoop is ready to use once unpacked. Enter the following commands to check whether Hadoop is usable; on success they print the command usage:

cd ./hadoop

./bin/hadoop
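You can also print the version information as a quick sanity check:

./bin/hadoop version    # prints the Hadoop version and build details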

6. Hadoop Pseudo-distribution configuration

The configuration files for Hadoop are located in /usr/local/hadoop/etc/hadoop/. Pseudo-distributed mode requires modifying the configuration files core-site.xml and hdfs-site.xml (plus mapred-site.xml and yarn-site.xml below if you want to run YARN). Hadoop configuration files are in XML format; each setting is declared as a property with a name and a value.

Open the configuration file core-site.xml (vim /usr/local/hadoop/etc/hadoop/core-site.xml) and change

<configuration>
</configuration>

into the following configuration:

<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/local/hadoop/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>

Similarly, modify the configuration file hdfs-site.xml:

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/data</value>
    </property>
</configuration>
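The next two files enable YARN. In Hadoop 2.6, mapred-site.xml ships only as a template, so it usually needs to be created first (a minimal sketch, assuming the default /usr/local/hadoop layout used above):

cd /usr/local/hadoop
cp ./etc/hadoop/mapred-site.xml.template ./etc/hadoop/mapred-site.xml   # create mapred-site.xml from the bundled template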
Modify the configuration file mapred-site.xml (the mapreduce.framework.name property belongs here rather than in yarn-site.xml):

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

Then modify the configuration file yarn-site.xml:

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>

After the configuration is complete, format the NameNode:

./bin/hdfs namenode -format

On success you will see a "successfully formatted" prompt, and a few lines further down "Exiting with status 0"; "Exiting with status 1" indicates an error.
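One pitfall worth noting: re-running the format later assigns a new cluster ID, and stale DataNode data left over from the old one can keep the DataNode from starting. If a reformat is ever needed, the usual remedy (which erases all HDFS data, given the hadoop.tmp.dir configured above) is:

rm -rf /usr/local/hadoop/tmp        # deletes all HDFS data and metadata
./bin/hdfs namenode -format         # then reformat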

Then start the daemons:

sbin/start-dfs.sh

sbin/start-yarn.sh

At this point, everything is installed and all services are started.
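To confirm the daemons are actually running, use jps (shipped with the JDK), which lists the running Java processes; with the setup above you would expect roughly the following:

jps
# expected: NameNode, DataNode, SecondaryNameNode,
# ResourceManager, NodeManager (plus Jps itself)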

Verify

http://127.0.0.1:8088 (YARN ResourceManager web UI)

http://localhost:50070 (HDFS NameNode web UI)

http://127.0.0.1:19888 (MapReduce JobHistory server web UI)
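Note that the JobHistory server behind port 19888 is not launched by start-dfs.sh or start-yarn.sh; if that page is unreachable, it can be started separately with the standard Hadoop 2.x script:

./sbin/mr-jobhistory-daemon.sh start historyserver   # serves the web UI on port 19888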


Tips
Each time you enter the virtual machine system, you must switch to the hadoop user (su hadoop) before starting the services, or you will get errors.
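For reference, the full start-up sequence after entering the system (a minimal recap of the steps above):

su hadoop
cd /usr/local/hadoop
./sbin/start-dfs.sh
./sbin/start-yarn.sh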

Reference article: www.powerxing.com/install-hadoop/


Hadoop fully Distributed Deployment reference: http://blog.csdn.net/gamer_gyt/article/details/51991893

Hadoop stand-alone deployment reference: http://blog.csdn.net/gamer_gyt/article/details/46545303
