Preparations for Hadoop: Building a Hadoop Distributed Cluster on an x86 Computer


Basic software and hardware configuration:

An x86 desktop running 64-bit Windows 7 with VirtualBox (the desktop needs at least 4 GB of memory to run 3 virtual machines at once)

CentOS 6.4 as the guest operating system

hadoop-1.1.2.tar.gz

jdk-6u24-linux-i586.bin
1. Configuration as the root user

A) Modify the hostname: vi /etc/sysconfig/network

Set it to master, slave1, or slave2 on the respective machine.
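A minimal sketch of /etc/sysconfig/network on the master (the slaves are identical apart from their own HOSTNAME values):

NETWORKING=yes
HOSTNAME=master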

B) Map IP addresses to hostnames: vi /etc/hosts

192.168.8.100 master

192.168.8.101 slave1

192.168.8.102 slave2

C) Network setup:

Use bridged networking for the virtual machines and configure each one's IP to match the hosts file above.

After changing the configuration, remember to run: service network restart

Ensure that the three VMs can ping each other.
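A quick check from the master, once the hosts entries from step B are in place:

ping -c 3 slave1
ping -c 3 slave2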

D) Disable the firewall.

View: service iptables status

Disable: service iptables stop

Check whether the firewall starts automatically at boot:

chkconfig --list | grep iptables

Disable auto-start:

chkconfig iptables off

2. Configuration as the yao user

A) Create the user yao, set a password, and switch to that user:

useradd yao

passwd yao    (then type the password, e.g. 123456, twice at the prompt)

B) Generate a public/private key pair on the master node:

ssh-keygen -t rsa

1) Copy id_rsa.pub to authorized_keys (both live in ~/.ssh):

cp id_rsa.pub authorized_keys

2) Copy the master's public key to /home on slave1:

scp id_rsa.pub root@192.168.8.101:/home

3) On slave1, append the key copied from the master to slave1's own authorized_keys; do the same on slave2. In the end, every node's authorized_keys contains the public keys of all machines, as in the sketch below.
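A minimal sketch of the whole exchange, assuming the yao user exists on all three nodes and everything runs as yao with the default ~/.ssh paths:

ssh-keygen -t rsa                                # run on every node, accept the defaults
cd ~/.ssh
cp id_rsa.pub authorized_keys                    # on the master: seed the file with its own key
ssh yao@slave1 'cat ~/.ssh/id_rsa.pub' >> authorized_keys    # append slave1's key
ssh yao@slave2 'cat ~/.ssh/id_rsa.pub' >> authorized_keys    # append slave2's key
scp authorized_keys yao@slave1:~/.ssh/           # push the merged file back out
scp authorized_keys yao@slave2:~/.ssh/
chmod 700 ~/.ssh; chmod 600 ~/.ssh/authorized_keys    # sshd rejects lax permissions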

4) Copy (extract) Hadoop to /home/yao/Documents/ on the master.

As root, configure the environment variables: vi /etc/profile

export HADOOP_HOME=/home/yao/Documents/hadoop

export HADOOP_HOME_WARN_SUPPRESS=1

export PATH=.:$PATH:$HADOOP_HOME/bin

Note: su <username> switches to another user; run source /etc/profile afterwards so the new variables take effect.

5) Install the JDK. The installer must be given execute permission before it will run:

chmod u+x jdk-6u24-linux-i586.bin

Run it to unpack the JDK (./jdk-6u24-linux-i586.bin), then move the unpacked directory to /usr/local/jdk.

Configure the environment variables: vi /etc/profile

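A minimal sketch of the JDK entries, assuming the JDK now lives at /usr/local/jdk as above:

export JAVA_HOME=/usr/local/jdk
export PATH=.:$PATH:$JAVA_HOME/bin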

6) Modify the configuration files under hadoop/conf.

Modify core-site.xml
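A minimal core-site.xml sketch for Hadoop 1.x; the namenode port (9000) and the tmp directory are illustrative choices, not requirements:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/yao/Documents/hadoop/tmp</value>
  </property>
</configuration>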


Modify hdfs-site.xml
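A minimal hdfs-site.xml sketch; replication is set to 2 here on the assumption that both slaves run datanodes:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>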

Modify mapred-site.xml
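A minimal mapred-site.xml sketch; port 9001 for the jobtracker is conventional, not mandatory:

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>master:9001</value>
  </property>
</configuration>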


7) Modify the hadoop/conf/hadoop-env.sh file, which is where the JDK path is specified:

export JAVA_HOME=/usr/local/jdk

8) Modify hadoop/conf/masters and hadoop/conf/slaves, entering the virtual machine names so Hadoop knows which node is the master and which nodes are datanodes:

masters: master

slaves: slave1, slave2
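A sketch of the resulting files (the slaves file takes one hostname per line):

cat conf/masters
master

cat conf/slaves
slave1
slave2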

3. Copy hadoop

The Hadoop configuration on the master is now essentially complete. Because the Hadoop configuration is identical on every node, we now copy the hadoop directory from the master to slave1 and slave2.

Command:

scp -r ./hadoop yao@slave1:/home/yao/

scp -r ./hadoop yao@slave2:/home/yao/

After the copy completes, run the following commands from the hadoop directory on the master:

Format the namenode: bin/hadoop namenode -format

Then start the cluster:

bin/start-all.sh
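If startup succeeds, jps on the master of a Hadoop 1.x cluster with this masters/slaves layout typically shows the three master-side daemons (the process IDs below are illustrative):

jps
2786 NameNode
2941 SecondaryNameNode
3021 JobTracker
3120 Jps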


On slave1, run jps: the DataNode and TaskTracker processes should be listed.

Similarly, running jps on slave2 gives the same result, as sketched below.
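A sketch of the expected output on the slaves (PIDs again illustrative):

jps
2314 DataNode
2398 TaskTracker
2476 Jps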

Summary:

To configure a fully distributed hadoop cluster, perform the following steps:

1) Configure the hosts file

2) Create a Hadoop running account

3) Configure SSH password-free login

4) Download and extract the Hadoop installation package

5) Configure the namenode and modify the site files

6) Configure hadoop-env.sh

7) Configure the masters and slaves files

8) Copy Hadoop to the other nodes

9) Format the namenode

10) Start Hadoop

11) Use jps to check whether the background processes started successfully

Note: none of this comes easily. From the installation stage onward, every step runs into problems that have to be solved, and working through them is how you become familiar with the commands and with Hadoop's file mechanisms.

Pseudo-distributed

Building a pseudo-distributed cluster is very simple: because it is a single node, the steps above reduce to:

1) Create a Hadoop running account

2) Configure SSH password-free login (on a single node, you only need to copy id_rsa.pub into authorized_keys; see the sketch after this list)

3) Download and extract the Hadoop installation package

4) Download, extract, and install the JDK

5) Modify the site files

6) Configure hadoop-env.sh

7) Format the namenode

8) Start Hadoop

9) Use jps to check whether the background processes started successfully
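A minimal sketch of the single-node key setup from step 2, assuming the default key location:

ssh-keygen -t rsa
cd ~/.ssh
cat id_rsa.pub >> authorized_keys
chmod 600 authorized_keys
ssh localhost    # should now log in without a password prompt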

OK, that is the Hadoop build process in outline. Once it is understood, both the pseudo-distributed and the fully distributed setups are straightforward.
