Complete Hadoop Installation and Configuration

Source: Internet
Author: User

* Hadoop is an open-source distributed computing framework from the Apache open-source organization. It is used by many large websites, such as Amazon, Facebook, and Yahoo. My most recent use case is log analysis on a service integration platform: the platform's log volume is large, which fits the typical scenarios for distributed computing (log analysis and index creation are two major application scenarios).

Today we are going to build Hadoop 2.2.0. The environment is CentOS 5.8, a mainstream server operating system.

I. Actual environment
System version: CentOS 5.8 x86_64
Java version: JDK 1.7.0_25
Hadoop version: hadoop-2.2.0
192.168.149.128 namenode (acts as namenode, secondary namenode, and ResourceManager)
192.168.149.129 datanode1 (acts as datanode and nodemanager)
192.168.149.130 datanode2 (acts as datanode and nodemanager)

II. System Preparation

1. The latest Hadoop 2.2 release can be downloaded from the official Apache website. Officially, only 32-bit Linux executables are provided; to deploy on a 64-bit system, you need to download the source code and compile it yourself. (In a real production environment, build a 64-bit Hadoop version to avoid many problems; here we use the 32-bit release.)

Hadoop download:

http://apache.claz.org/hadoop/common/hadoop-2.2.0/

Java download:

http://www.oracle.com/technetwork/java/javase/downloads/index.html
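For reference, the Hadoop tarball can be fetched directly from the mirror above. This is a sketch assuming the standard Apache release file name; the JDK must be downloaded from Oracle's page in a browser because of the license acceptance prompt.

```shell
# Fetch the Hadoop 2.2.0 release tarball from the mirror given above
# (file name assumed to follow standard Apache release naming) and unpack it.
wget http://apache.claz.org/hadoop/common/hadoop-2.2.0/hadoop-2.2.0.tar.gz
tar -xzf hadoop-2.2.0.tar.gz
```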

2. Here we use three CentOS servers to build the Hadoop cluster; their roles are as listed above.

Step 1: Set the corresponding host names in /etc/hosts on all three servers as follows (in a real environment, intranet DNS resolution can be used instead):

[root@node1 hadoop]# cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
192.168.149.128 node1
192.168.149.129 node2
192.168.149.130 node3

(Note: hosts resolution must be configured on both the namenode and datanode servers.)
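A quick way to confirm that all three entries are present before continuing is to grep each hostname out of the hosts file. This is a small sketch with a hypothetical `check_hosts` helper; it is demonstrated against a scratch copy mirroring the article's entries, but you would point it at the real /etc/hosts on each node.

```shell
# check_hosts: verify that each expected hostname appears in a hosts-format file.
check_hosts() {
  file="$1"; shift
  for h in "$@"; do
    # -w matches the hostname as a whole word on its line
    grep -qw "$h" "$file" && echo "$h: ok" || echo "$h: missing"
  done
}

# Demonstrate against a sample file mirroring the article's entries.
cat > /tmp/hosts.sample <<'EOF'
127.0.0.1 localhost.localdomain localhost
192.168.149.128 node1
192.168.149.129 node2
192.168.149.130 node3
EOF
check_hosts /tmp/hosts.sample node1 node2 node3
```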

Step 2: Configure passwordless SSH login from the namenode to each datanode server as follows:
Run ssh-keygen on namenode 192.168.149.128 and press Enter at each prompt.
Copy the public key /root/.ssh/id_rsa.pub to each datanode server:
ssh-copy-id -i .ssh/id_rsa.pub root@192.168.149.129
ssh-copy-id -i .ssh/id_rsa.pub root@192.168.149.130
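Once the keys are distributed, passwordless login can be confirmed from the namenode with a short loop; a sketch, assuming the hostnames configured in /etc/hosts above:

```shell
# Verify passwordless SSH from the namenode to each datanode.
# BatchMode=yes makes ssh fail instead of prompting when key auth is not set up.
for h in node2 node3; do
  ssh -o BatchMode=yes root@"$h" hostname \
    && echo "$h: passwordless login ok" \
    || echo "$h: key auth NOT configured" >&2
done
```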

 

III. Java installation and configuration
tar -xvzf jdk-7u25-linux-x64.tar.gz && mkdir -p /usr/java/; mv jdk1.7.0_25 /usr/java/
After installation, configure the Java environment variables by adding the following lines at the end of /etc/profile:
export JAVA_HOME=/usr/java/jdk1.7.0_25/
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:./
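The three export lines can be sanity-checked before touching the system-wide /etc/profile by sourcing them from a scratch file and confirming the variables expand as intended. This is a sketch using the paths from the article; the JDK itself does not need to be installed for this expansion check.

```shell
# Write the profile additions to a scratch file and source it in this shell.
cat > /tmp/java_profile.sh <<'EOF'
export JAVA_HOME=/usr/java/jdk1.7.0_25/
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:./
EOF
. /tmp/java_profile.sh

# Confirm JAVA_HOME is set and that PATH now starts with the JDK's bin directory.
echo "JAVA_HOME=$JAVA_HOME"
case "$PATH" in
  "$JAVA_HOME"/bin:*) echo "PATH ok" ;;
  *)                  echo "PATH wrong" ;;
esac
```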

Save and exit, then run source /etc/profile for the changes to take effect. If java -version runs on the command line, the installation succeeded:
[root@node1 ~]# java -version
java version "1.7.0_25"
Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)

(Note: the Java JDK must be installed on both the namenode and datanode servers.)

