Building a Hadoop Environment on Ubuntu [illustrated]

Source: Internet
Author: User

Objective

This article describes how to build a Hadoop platform on the Ubuntu Kylin operating system.

Configuration

1. Operating system: Ubuntu Kylin 14.04

2. Programming language support: JDK 1.8

3. Communication protocol support: SSH

4. Cloud computing project: Hadoop 1.2.1

Step One: Install the latest version of the JDK (skip this step if it is already installed)

1. Download JDK 1.8 from the official website and extract it (package used here: jdk-8u25-linux-x64.gz)

2. Copy the extracted JDK directory to the /usr/lib/jvm directory (create the jvm directory yourself if it does not exist)

3. Open the /etc/profile file as an administrator and add the following lines at the bottom of the file:

#set Java Environment
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_25
export CLASSPATH=".:$JAVA_HOME/lib:$CLASSPATH"
export PATH="$JAVA_HOME/bin:$PATH"

4. Execute the following command to make the configuration take effect immediately:

source /etc/profile

5. Verify that the JDK is successfully installed by executing the following command:

java -version

If the JDK version information is displayed, the installation is complete.
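When java -version fails, the cause is often that JAVA_HOME points at the wrong directory. The following is a minimal sketch of a check for that case. check_jdk_dir is a hypothetical helper written for this article, not part of the JDK, and it only tests that an executable named java exists under the given directory.

```shell
# Sketch: sanity-check a JDK directory such as the one exported in /etc/profile.
# check_jdk_dir is a hypothetical helper, not a JDK tool.
check_jdk_dir() {
    local jdk_dir="$1"
    if [ -z "$jdk_dir" ]; then
        echo "JAVA_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$jdk_dir/bin/java" ]; then
        echo "no executable java found under $jdk_dir/bin" >&2
        return 1
    fi
    echo "JDK found at $jdk_dir"
}

# Example call: prints a hint instead of aborting when the check fails.
check_jdk_dir "${JAVA_HOME:-}" || echo "check /etc/profile, then run: source /etc/profile"
```

If the check reports a problem, compare the directory name under /usr/lib/jvm with the JAVA_HOME value in /etc/profile.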

Step Two: Configure passwordless SSH login

1. Run the following command to install SSH:

sudo apt-get install ssh

2. Check whether a hidden .ssh folder exists in your home directory; if it does not, create one.

3. Execute the following commands to configure passwordless SSH login (see the SSH documentation for what each command does):

ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

4. Execute the following command to verify that SSH is installed and configured correctly:

ssh localhost

When prompted, type yes. If the terminal then logs you in without asking for a password, the SSH configuration was successful.
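Appending the public key with cat and >> adds a duplicate line each time the command is repeated. The sketch below shows one way to make that step repeatable. authorize_key is a hypothetical helper, not part of OpenSSH, and the directory and file permissions it sets are an assumption about a typical single-user setup.

```shell
# Sketch: add a public key to authorized_keys only if it is not there yet.
# authorize_key is a hypothetical helper, not part of OpenSSH.
# $1 = public key file, $2 = .ssh directory to update
authorize_key() {
    local pub_key_file="$1"
    local ssh_dir="$2"
    local auth_file="$ssh_dir/authorized_keys"
    local key

    # Keep the directory and file private; sshd can reject keys kept in
    # group- or world-writable locations.
    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"
    touch "$auth_file"
    chmod 600 "$auth_file"

    key=$(cat "$pub_key_file") || return 1

    # Append the key only if an identical line is not already present.
    if grep -qxF -- "$key" "$auth_file"; then
        echo "key already authorized"
    else
        printf '%s\n' "$key" >> "$auth_file"
        echo "key added"
    fi
}

# Example, using the key file generated in this step:
#   authorize_key ~/.ssh/id_dsa.pub ~/.ssh
```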

Step Three: Install and run Hadoop

Description: Hadoop has three modes of operation: standalone (single-machine), pseudo-distributed, and fully distributed. The first two are used mainly for testing and debugging programs. This article covers the pseudo-distributed configuration; the fully distributed configuration will be explained later.

1. Download Hadoop and extract it into the current directory (package used here: hadoop-1.2.1.tar.gz)

2. Go to the conf subdirectory and modify the following configuration files:

A. hadoop-env.sh

Set the Java path at the end of the file:

export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_25

B. core-site.xml

Configure it as follows:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

C. hdfs-site.xml

Configure it as follows:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

D. mapred-site.xml

Configure it as follows:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
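A typo in one of these three files is an easy mistake to make, so it can help to read the values back before going further. The sketch below does that with awk. get_property is a hypothetical helper, and it assumes the simple layout used in this article, where each name element is followed by its value element on the same or a later line. It is not a general XML parser.

```shell
# Sketch: print the value of one property from a Hadoop *-site.xml file.
# get_property is a hypothetical helper; it assumes each <name> element is
# followed by its <value> element, as in the files shown in this step.
# $1 = configuration file, $2 = property name
get_property() {
    awk -v name="$2" '
        index($0, "<name>" name "</name>") { found = 1 }
        found && match($0, /<value>[^<]*<\/value>/) {
            # Skip the 7 characters of "<value>" and drop the 8 of "</value>".
            print substr($0, RSTART + 7, RLENGTH - 15)
            exit
        }
    ' "$1"
}

# Examples, run from the Hadoop folder:
#   get_property conf/core-site.xml fs.default.name
#   get_property conf/hdfs-site.xml dfs.replication
#   get_property conf/mapred-site.xml mapred.job.tracker
```

If a call prints nothing, the property name was not found in that file.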

3. Go to the Hadoop folder and execute the following command to format the Hadoop file system (HDFS):

bin/hadoop namenode -format

4. Execute the following command to start all Hadoop processes:

bin/start-all.sh

5. Verify that Hadoop was installed successfully:

A. Open a browser and go to http://localhost:50030 to view the MapReduce web page.

B. Open a browser and go to http://localhost:50070 to view the HDFS web page.

If both pages display correctly, the Hadoop environment is set up.
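If one of the pages does not load, a quick way to narrow the problem down is to see which daemons are running. In Hadoop 1.x, start-all.sh starts five of them: NameNode, DataNode, SecondaryNameNode, JobTracker, and TaskTracker. The sketch below filters the output of the JDK's jps tool for these names. missing_daemons is a hypothetical helper written for this article, not a Hadoop command.

```shell
# Sketch: report which Hadoop 1.x daemons are absent from `jps` output.
# missing_daemons is a hypothetical helper; it reads jps output on stdin
# and prints one line per expected daemon that is not running.
missing_daemons() {
    local jps_output daemon
    jps_output=$(cat)
    for daemon in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        # jps prints "<pid> <main class>"; compare the class name exactly so
        # that SecondaryNameNode is not mistaken for NameNode.
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { ok = 1 } END { exit !ok }'; then
            echo "$daemon"
        fi
    done
}

# Example, after running bin/start-all.sh:
#   jps | missing_daemons
```

No output means all five daemons were found. For any name that is printed, the matching file in the logs folder of the Hadoop directory is the place to look next.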

Summary

1. The architecture and mechanisms of pseudo-distributed mode are the same as those of a fully distributed deployment; the difference is that in pseudo-distributed mode the master and the slave are the same machine.

2. Building a fully distributed environment will be covered in a future article, which will set up a virtual network of virtual machines to run a truly distributed program.
