Building Hadoop on Ubuntu [illustrated]

Last Update:2014-10-29 Source: Internet

Author: User

Objective

This article describes how to build a Hadoop platform on the Ubuntu Kylin operating system.

Configuration

1. Operating system: Ubuntu Kylin 14.04

2. Programming language support: JDK 1.8

3. Communication protocol support: SSH

4. Cloud computing project: Hadoop 1.2.1

Step One: Install the latest version of the JDK (skip this step if it is already installed)

1. Download JDK 1.8 from the official website and extract it (installation package used here: jdk-8u25-linux-x64.tar.gz)

2. Copy the extracted directory to /usr/lib/jvm (you need to create the jvm directory yourself)

3. Open the /etc/profile file as an administrator and add the following lines at the bottom of the file:

#set Java Environment
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_25
export CLASSPATH=".:$JAVA_HOME/lib:$CLASSPATH"
export PATH="$JAVA_HOME/bin:$PATH"

4. Execute the following command to make the configuration file effective immediately:

source /etc/profile

5. Verify that the JDK is successfully installed by executing the following command:

java -version

If the command prints the Java version information (1.8.0_25 for the package used here), the installation is complete.
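The directory assigned to JAVA_HOME in step 3 can be sanity-checked with a small shell function before relying on it. The following is only a minimal sketch: `check_jdk_dir` is a hypothetical helper name introduced here for illustration and is not part of the JDK or Hadoop. It does nothing more than confirm that the given directory contains an executable bin/java.

```shell
#!/usr/bin/env bash
# Sketch: check that a directory looks like a JDK before using it as JAVA_HOME.
# check_jdk_dir is a hypothetical helper, not part of the JDK or Hadoop.

check_jdk_dir() {
    local jdk_dir="$1"

    # Exit status 2: the caller passed no directory at all.
    if [ -z "$jdk_dir" ]; then
        echo "no JDK directory given" >&2
        return 2
    fi

    # Exit status 1: the directory has no executable bin/java.
    if [ ! -x "$jdk_dir/bin/java" ]; then
        echo "no executable found at $jdk_dir/bin/java" >&2
        return 1
    fi

    echo "found $jdk_dir/bin/java"
}

# Example call, using the path from the /etc/profile entry in this article:
#   check_jdk_dir /usr/lib/jvm/jdk1.8.0_25 && java -version
```

A non-zero exit status usually means the archive was extracted somewhere else or the version directory name differs from the one written in /etc/profile.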

Step Two: Configure passwordless SSH login

1. Enter the following command to install SSH:

sudo apt-get install ssh

2. Check whether a hidden .ssh folder exists in your home directory; if it does not, create it yourself.

3. Execute the following commands to configure passwordless SSH login (see the SSH documentation for what each command does):

ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

4. Execute the following command to verify that the SSH installation configuration is successful:

ssh localhost

Type yes when prompted to confirm the host key. If the terminal then logs you in without asking for a password, the SSH configuration was successful.
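Running the `cat ... >> authorized_keys` command from step 3 more than once appends the same key repeatedly, and sshd may ignore the file if its permissions are too open. The sketch below shows one way to make that step repeatable. `authorize_key` is a hypothetical helper name introduced here, and the permission values (700 for the directory, 600 for the file) are the commonly recommended ones rather than something stated in the original article.

```shell
#!/usr/bin/env bash
# Sketch: add a public key to authorized_keys without creating duplicates.
# authorize_key is a hypothetical helper, not part of OpenSSH.
#
# Usage: authorize_key PUBKEY_FILE [SSH_DIR]   (SSH_DIR defaults to ~/.ssh)

authorize_key() {
    local pubkey_file="$1"
    local ssh_dir="${2:-$HOME/.ssh}"
    local auth_file="$ssh_dir/authorized_keys"

    if [ ! -r "$pubkey_file" ]; then
        echo "cannot read public key file: $pubkey_file" >&2
        return 1
    fi

    # Create the directory and file if needed, with restrictive permissions.
    mkdir -p "$ssh_dir" || return 1
    chmod 700 "$ssh_dir" || return 1
    touch "$auth_file" || return 1
    chmod 600 "$auth_file" || return 1

    # Append only if the exact key line is not already present
    # (-x: whole-line match, -F: fixed string, -f: read patterns from file).
    if ! grep -qxF -f "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
}

# Example call with the key generated in step 3:
#   authorize_key ~/.ssh/id_dsa.pub
```

Calling the function a second time with the same key leaves authorized_keys unchanged.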

Step Three: Install and run Hadoop

Note: Hadoop has three modes of operation: standalone (single-machine), pseudo-distributed, and fully distributed. The first two are mainly used for testing and debugging programs. This article covers the pseudo-distributed configuration; the fully distributed configuration will be explained in a later article.

1. Download the latest version of Hadoop and extract it into the current directory (installation package used here: hadoop-1.2.1.tar.gz)

2. Go to the conf subdirectory and modify the following configuration files:

A. hadoop-env.sh

Set the Java path at the end of the file:

export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_25

B. core-site.xml

Configured to:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

C. hdfs-site.xml

Configured to:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

D. mapred-site.xml

Configured to:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>

3. Go to the Hadoop folder and execute the following command to format the Hadoop file system, HDFS:

bin/hadoop namenode -format

4. Execute the following command to start all Hadoop processes:

bin/start-all.sh

5. Verify that Hadoop is installed successfully:

A. Open a browser and go to http://localhost:50030 to view the MapReduce web page.

B. Open a browser and go to http://localhost:50070 to view the HDFS web page.

If both pages display correctly, the Hadoop environment is set up.
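Besides the two web pages, the running daemons can be checked from the terminal with `jps`, the JDK tool that lists Java processes as "pid name" pairs. A pseudo-distributed Hadoop 1.x installation normally runs five daemons: NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker. The sketch below reports which of them are absent; `missing_daemons` is a hypothetical helper name introduced here, and it reads the listing from standard input so that it can be tried on saved output as well.

```shell
#!/usr/bin/env bash
# Sketch: report which Hadoop 1.x daemons are absent from a jps listing.
# missing_daemons is a hypothetical helper, not part of Hadoop or the JDK.
#
# Input (stdin): lines of the form "<pid> <name>", as printed by jps.
# Output: one line per expected daemon that was not found.

missing_daemons() {
    local listing daemon
    listing="$(cat)"

    for daemon in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        # Compare the second field exactly, so that "NameNode" does not
        # match inside "SecondaryNameNode".
        if ! printf '%s\n' "$listing" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}

# Example call after bin/start-all.sh:
#   jps | missing_daemons
```

Empty output means all five daemons were listed. If a name is printed, the corresponding log file under the Hadoop logs directory is usually the first place to look.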

Summary

1. A pseudo-distributed setup uses the same architecture and mechanisms as a truly distributed one; the only difference is that the master and the slave are the same machine.

2. Building a truly distributed environment will be covered in a future article, in which a virtual network of virtual machines will be set up to run truly distributed programs.

This article is an English version of an article originally published in Chinese on aliyun.com.