Hadoop 2.2.0 Installation and Configuration Manual: Building a Fully Distributed Hadoop Cluster

Source: Internet
Author: User
Tags ssh server

After more than a week, I finally got a cluster running on the latest version, Hadoop 2.2. Along the way I ran into all kinds of problems, which was painful for a beginner, but it was exciting when wordcount finally produced its results. (If you find any errors or have questions, please point them out so we can learn from each other.)

Feel free to leave a comment if you run into problems during configuration, or to share your own solutions. The comments already cover some problems and their fixes, so please refer to them as well.

 

Part 1 Hadoop 2.2 download

Download the latest version, Hadoop 2.2, from the official Apache website. The official release currently provides binaries built for 32-bit Linux only, so to deploy on a 64-bit system you need to download the src source code and compile it yourself (a link to a solution is given in comment #10).

Download: http://apache.claz.org/hadoop/common/hadoop-2.2.0/

Download the file that was highlighted in red in the original screenshot. If you want to compile Hadoop yourself, download src.tar.gz.

 

 

 

Part 2 Cluster Environment Construction

1. Here we build a cluster of three machines:

192.168.0.1 hduser/passwd cloud001 nn/snn/rm CentOS 6 64-bit

192.168.0.2 hduser/passwd cloud002 dn/nm Ubuntu 13.04 32-bit

192.168.0.3 hduser/passwd cloud003 dn/nm Ubuntu 13.04 32-bit

1.1 The columns above are: IP address, user/password, hostname, roles in the cluster (nn = namenode, snn = secondary namenode, dn = datanode, rm = resourcemanager, nm = nodemanager), and operating system.

1.2 The hostname can be changed in /etc/hostname (this is the path on Ubuntu; RedHat is slightly different).

1.3 We create an account named hduser on each machine, and each account needs sudo permissions. (Switch to the root account, edit the /etc/sudoers file, and add: hduser ALL=(ALL) ALL)

2. Edit the /etc/hosts file and add the mappings between the IP addresses and hostnames of the three hosts.

192.168.0.1 cloud001

192.168.0.2 cloud002

192.168.0.3 cloud003
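The three mappings above can also be added with a small script that is safe to run more than once. This is only a minimal sketch: the helper name add_host_entry and the HOSTS_FILE variable are made up for illustration, and the example writes to a scratch file by default so it can be tried before touching /etc/hosts (which requires root).

```shell
#!/bin/sh
# Hypothetical helper: append "IP hostname" to a hosts file unless that
# exact line is already present.
add_host_entry() {
    hosts_file=$1
    ip=$2
    name=$3
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex)
    if ! grep -qxF "$ip $name" "$hosts_file" 2>/dev/null; then
        printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
    fi
}

# Example on a scratch file; as root, set HOSTS_FILE=/etc/hosts instead.
HOSTS_FILE=${HOSTS_FILE:-$(mktemp)}
add_host_entry "$HOSTS_FILE" 192.168.0.1 cloud001
add_host_entry "$HOSTS_FILE" 192.168.0.2 cloud002
add_host_entry "$HOSTS_FILE" 192.168.0.3 cloud003
cat "$HOSTS_FILE"
```

Because the helper checks for an existing line first, running it again on every machine does not create duplicate entries.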

3. SSH password-less login from cloud001 to cloud002 and cloud003

3.1 Install ssh

Generally, ssh is installed by default. If it is missing or the version is old, you can reinstall it:

sudo apt-get install ssh

3.2 Set up local password-less login

After installation, the ~ directory (the current user's home directory, i.e. /home/hduser) contains a hidden folder named .ssh (ls -a shows hidden files). If the folder does not exist, create it yourself (mkdir .ssh).

The procedure is as follows:

1. Enter the .ssh folder

2. Run ssh-keygen -t rsa and press Enter at each prompt (this generates the key pair)

3. Append id_rsa.pub to the authorized keys (cat id_rsa.pub >> authorized_keys)

4. Restart the SSH service so the change takes effect: service sshd restart (the service is named sshd on RedHat and ssh on Ubuntu)

Now ssh localhost should log you in without a password.

[Note]: The steps above must be performed on every machine.
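Step 3 of the procedure can be sketched as a small helper. This is an illustration, not part of the original steps: the function name authorize_key is hypothetical, and the two chmod calls are my addition (with its default StrictModes setting, OpenSSH can refuse key login when .ssh or authorized_keys is writable by other users).

```shell
#!/bin/sh
# Hypothetical helper: add a public key to authorized_keys exactly once
# and tighten the permissions on the .ssh directory.
authorize_key() {
    pubkey_file=$1
    ssh_dir=$2
    mkdir -p "$ssh_dir"
    touch "$ssh_dir/authorized_keys"
    # Append only if this exact key line is not already present.
    if ! grep -qxF -e "$(cat "$pubkey_file")" "$ssh_dir/authorized_keys"; then
        cat "$pubkey_file" >> "$ssh_dir/authorized_keys"
    fi
    chmod 700 "$ssh_dir"
    chmod 600 "$ssh_dir/authorized_keys"
}

# Typical use after ssh-keygen -t rsa has created ~/.ssh/id_rsa.pub:
#   authorize_key ~/.ssh/id_rsa.pub ~/.ssh
#   ssh localhost    # should no longer ask for a password
```

If ssh localhost still asks for a password after step 4, the permissions on ~/.ssh are one of the first things worth checking.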

3.3 Set up remote password-less login

Here only cloud001 is a master. If there are multiple namenodes or resourcemanagers, every master needs password-less login to all the other nodes. (Append the authorized_keys of 001 to the authorized_keys of 002 and 003.)

Enter the .ssh directory on 001

scp authorized_keys hduser@cloud002:~/.ssh/authorized_keys_from_cloud001

Enter the .ssh directory on 002

cat authorized_keys_from_cloud001 >> authorized_keys

Now ssh hduser@cloud002 works from 001 without a password. Repeat the same steps for 003.
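With more than a couple of workers, the copy-and-append steps are easier as a loop. The following is a rough sketch under my own assumptions: the names run, distribute_keys, WORKERS, REMOTE_USER and DRY_RUN are all hypothetical, and by default the script only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical sketch: copy cloud001's authorized_keys to each worker and
# append it there. With DRY_RUN=1 (the default) commands are only printed.
WORKERS=${WORKERS:-"cloud002 cloud003"}
REMOTE_USER=${REMOTE_USER:-hduser}

run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

distribute_keys() {
    for host in $WORKERS; do
        # Remote paths are relative to the remote user's home directory.
        run scp "$HOME/.ssh/authorized_keys" \
            "$REMOTE_USER@$host:.ssh/authorized_keys_from_cloud001"
        run ssh "$REMOTE_USER@$host" \
            'cat .ssh/authorized_keys_from_cloud001 >> .ssh/authorized_keys'
    done
}

distribute_keys
```

Run it on cloud001 with DRY_RUN=0 once the printed commands look right. Each worker still asks for the hduser password during this run, since the key is not in place yet.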

 

4. Install the JDK (using the same JAVA_HOME path on every machine is recommended)

Note: download the JDK and install it manually instead of installing it from the package repository (apt-get install).

 

4.1 Download the JDK (http://www.Oracle.com/technetwork/java/javase/downloads/index.html)

4.1.1 For a 32-bit system, download one of the two Linux x86 packages (uname -a shows the system architecture)

4.1.2 For a 64-bit system, download Linux x64 (x64.rpm and x64.tar.gz)
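Since this cluster mixes 32-bit and 64-bit machines, it helps to check each one before downloading. Below is a small sketch of that check. The function name jdk_suffix is hypothetical, and the x64 archive name is my assumption, formed by analogy with the i586 archive used in this article.

```shell
#!/bin/sh
# Hypothetical helper: map a machine string (as printed by `uname -m`)
# to the suffix used in the JDK 7 Linux archive names.
jdk_suffix() {
    machine=${1:-$(uname -m)}
    case "$machine" in
        x86_64|amd64)        echo "x64" ;;
        i386|i486|i586|i686) echo "i586" ;;
        *) echo "unsupported machine type: $machine" >&2; return 1 ;;
    esac
}

# Example: print the archive name to look for on the download page.
if suffix=$(jdk_suffix); then
    echo "jdk-7u40-linux-$suffix.tar.gz"
fi
```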

 

 

 

 

 

4.2 Install the JDK (using the .tar.gz on a 32-bit system as an example)

For the installation method, see http://docs.oracle.com/javase/7/docs/webnotes/install/linux/linux-jdk.html

4.2.1 Choose where to install Java, for example under /usr/, and create a folder named java there (mkdir java)

4.2.2 Move the jdk-7u40-linux-i586.tar.gz file to /usr/java

4.2.3 Extract it: tar -zxvf jdk-7u40-linux-i586.tar.gz

4.2.4 Delete jdk-7u40-linux-i586.tar.gz (to save space)
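Steps 4.2.1 to 4.2.4 can be wrapped in one function so that every machine gets the same layout. This is a sketch only: the name install_jdk is hypothetical, and instead of moving the archive first it extracts directly into the target directory with tar -C, which gives the same result.

```shell
#!/bin/sh
# Hypothetical helper: unpack a JDK archive into a target directory and
# delete the archive afterwards (steps 4.2.1 to 4.2.4).
install_jdk() {
    archive=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # -C extracts into $dest without changing the current directory.
    tar -zxf "$archive" -C "$dest" || return 1
    # Only reached when extraction succeeded, so the archive is safe to drop.
    rm -f "$archive"
}

# Typical use (as root, since /usr/java is not writable by hduser):
#   install_jdk jdk-7u40-linux-i586.tar.gz /usr/java
```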

 

The JDK is now installed. Next, configure the environment variables.

4.3 Open /etc/profile (vim /etc/profile)

 

Add the following content at the end:

 

JAVA_HOME=/usr/java/jdk1.7.0_40

CLASSPATH=.:$JAVA_HOME/lib/tools.jar

PATH=$JAVA_HOME/bin:$PATH

export JAVA_HOME CLASSPATH PATH

(Adjust the version number 1.7.0_40 in JAVA_HOME to match the JDK you actually downloaded. There must be no spaces around the = signs.)
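A typo in /etc/profile affects every login, so it can be worth trying the lines in a scratch file first. Here is a minimal sketch of that idea. The variable name PROFILE_FRAGMENT is made up, the JDK path is the one assumed throughout this article, and tools.jar is expected under $JAVA_HOME/lib.

```shell
#!/bin/sh
# Write the four lines to a scratch file so they can be checked before
# being appended to /etc/profile. Adjust the JDK path to your install.
PROFILE_FRAGMENT=$(mktemp)
cat > "$PROFILE_FRAGMENT" <<'EOF'
JAVA_HOME=/usr/java/jdk1.7.0_40
CLASSPATH=.:$JAVA_HOME/lib/tools.jar
PATH=$JAVA_HOME/bin:$PATH
export JAVA_HOME CLASSPATH PATH
EOF

# Load the fragment into the current shell and show the result.
. "$PROFILE_FRAGMENT"
echo "JAVA_HOME=$JAVA_HOME"
echo "CLASSPATH=$CLASSPATH"
```

If the printed values look right, the same four lines can be appended to the end of /etc/profile.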

 

4.4 Apply the changes: source /etc/profile

4.5 Verify that the installation succeeded: java -version

[Note] Perform the same steps on every machine and install Java at the same path on all of them (not required, but it makes the later configuration much easier).

 

5. Disable the firewall for each machine

 

RedHat:

 

/etc/init.d/iptables stop (stops the firewall)

chkconfig iptables off (keeps the firewall from starting at boot)

Ubuntu:

ufw disable (stops the firewall and keeps it disabled after a restart)
