Preparations for Hadoop: Building a Hadoop Distributed Cluster on an x86 Computer


Basic software and hardware configuration:

An x86 desktop running 64-bit Windows 7 with VirtualBox (the desktop needs at least 4 GB of memory to run 3 virtual machines at once)

CentOS 6.4 as the guest operating system

hadoop-1.1.2.tar.gz

jdk-6u24-linux-i586.bin
1. Configuration as the root user

A) Modify the hostname: vi /etc/sysconfig/network

Set it to master, slave1, or slave2 on the respective machine.
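A minimal sketch of /etc/sysconfig/network on the master (the slaves are identical apart from their own HOSTNAME values):

NETWORKING=yes
HOSTNAME=master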

B) Map IP addresses to hostnames: vi /etc/hosts

192.168.8.100 master

192.168.8.101 slave1

192.168.8.102 slave2

C) Network setup:

Use bridged networking for the virtual machines and configure each one's IP to match the hosts file above.

After changing the configuration, remember to run: service network restart

Ensure that the three VMs can ping each other.
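A quick check from the master, once the hosts entries from step B are in place:

ping -c 3 slave1
ping -c 3 slave2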

D) Disable the firewall.

View: service iptables status

Disable: service iptables stop

Check whether the firewall starts automatically at boot:

chkconfig --list | grep iptables

Disable auto-start:

chkconfig iptables off

2. Configuration as the yao user

A) Create the user yao, set a password, and switch to that user:

useradd yao

passwd yao    (then type the password, e.g. 123456, twice at the prompt)

B) Generate a public/private key pair on the master node:

ssh-keygen -t rsa

1) Copy id_rsa.pub to authorized_keys (both live in ~/.ssh):

cp id_rsa.pub authorized_keys

2) Copy the master's public key to /home on slave1:

scp id_rsa.pub root@192.168.8.101:/home

3) On slave1, append the key copied from the master to slave1's own authorized_keys; do the same on slave2. In the end, every node's authorized_keys contains the public keys of all machines, as in the sketch below.
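A minimal sketch of the whole exchange, assuming the yao user exists on all three nodes and everything runs as yao with the default ~/.ssh paths:

ssh-keygen -t rsa                                # run on every node, accept the defaults
cd ~/.ssh
cp id_rsa.pub authorized_keys                    # on the master: seed the file with its own key
ssh yao@slave1 'cat ~/.ssh/id_rsa.pub' >> authorized_keys    # append slave1's key
ssh yao@slave2 'cat ~/.ssh/id_rsa.pub' >> authorized_keys    # append slave2's key
scp authorized_keys yao@slave1:~/.ssh/           # push the merged file back out
scp authorized_keys yao@slave2:~/.ssh/
chmod 700 ~/.ssh; chmod 600 ~/.ssh/authorized_keys    # sshd rejects lax permissions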

4) Copy (extract) Hadoop to /home/yao/Documents/ on the master.

As root, configure the environment variables: vi /etc/profile

export HADOOP_HOME=/home/yao/Documents/hadoop

export HADOOP_HOME_WARN_SUPPRESS=1

export PATH=.:$PATH:$HADOOP_HOME/bin

Note: su <username> switches to another user; run source /etc/profile afterwards so the new variables take effect.

5) Install the JDK. The installer must be given execute permission before it will run:

chmod u+x jdk-6u24-linux-i586.bin

Run it to unpack the JDK (./jdk-6u24-linux-i586.bin), then move the unpacked directory to /usr/local/jdk.

Configure the environment variables: vi /etc/profile

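A minimal sketch of the JDK entries, assuming the JDK now lives at /usr/local/jdk as above:

export JAVA_HOME=/usr/local/jdk
export PATH=.:$PATH:$JAVA_HOME/bin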

6) Modify the configuration files under hadoop/conf.

Modify core-site.xml
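A minimal core-site.xml sketch for Hadoop 1.x; the namenode port (9000) and the tmp directory are illustrative choices, not requirements:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/yao/Documents/hadoop/tmp</value>
  </property>
</configuration>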


Modify hdfs-site.xml
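A minimal hdfs-site.xml sketch; replication is set to 2 here on the assumption that both slaves run datanodes:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>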

Modify mapred-site.xml
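A minimal mapred-site.xml sketch; port 9001 for the jobtracker is conventional, not mandatory:

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>master:9001</value>
  </property>
</configuration>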


7) Modify the hadoop/conf/hadoop-env.sh file, which is where the JDK path is specified:

export JAVA_HOME=/usr/local/jdk

8) Modify hadoop/conf/masters and hadoop/conf/slaves, entering the virtual machine names so Hadoop knows which node is the master and which nodes are datanodes:

masters: master

slaves: slave1, slave2
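A sketch of the resulting files (the slaves file takes one hostname per line):

cat conf/masters
master

cat conf/slaves
slave1
slave2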

3. Copy hadoop

The Hadoop configuration on the master is now essentially complete. Because the Hadoop configuration is identical on every node, we now copy the hadoop directory from the master to slave1 and slave2.

Command:

scp -r ./hadoop yao@slave1:/home/yao/

scp -r ./hadoop yao@slave2:/home/yao/

After the copy completes, run the following commands from the hadoop directory on the master:

Format the namenode: bin/hadoop namenode -format

Then start the cluster:

bin/start-all.sh
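If startup succeeds, jps on the master of a Hadoop 1.x cluster with this masters/slaves layout typically shows the three master-side daemons (the process IDs below are illustrative):

jps
2786 NameNode
2941 SecondaryNameNode
3021 JobTracker
3120 Jps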


On slave1, run jps: the DataNode and TaskTracker processes should be listed.

Similarly, running jps on slave2 gives the same result, as sketched below.
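A sketch of the expected output on the slaves (PIDs again illustrative):

jps
2314 DataNode
2398 TaskTracker
2476 Jps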

Summary:

To configure a fully distributed hadoop cluster, perform the following steps:

1) Configure the hosts file

2) Create a Hadoop running account

3) Configure SSH password-free login

4) Download and extract the Hadoop installation package

5) Configure the namenode and modify the site files

6) Configure hadoop-env.sh

7) Configure the masters and slaves files

8) Copy Hadoop to the other nodes

9) Format the namenode

10) Start Hadoop

11) Use jps to check whether the background processes started successfully

Note: none of this comes easily. From the installation stage onward, every step runs into problems that have to be solved, and working through them is how you become familiar with the commands and with Hadoop's file mechanisms.

Pseudo-distributed

Building a pseudo-distributed cluster is very simple: because it is a single node, the steps above reduce to:

1) Create a Hadoop running account

2) Configure SSH password-free login (on a single node, you only need to copy id_rsa.pub into authorized_keys; see the sketch after this list)

3) Download and extract the Hadoop installation package

4) Download, extract, and install the JDK

5) Modify the site files

6) Configure hadoop-env.sh

7) Format the namenode

8) Start Hadoop

9) Use jps to check whether the background processes started successfully
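A minimal sketch of the single-node key setup from step 2, assuming the default key location:

ssh-keygen -t rsa
cd ~/.ssh
cat id_rsa.pub >> authorized_keys
chmod 600 authorized_keys
ssh localhost    # should now log in without a password prompt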

OK, that is the Hadoop build process in outline. Once it is understood, both the pseudo-distributed and the fully distributed setups are straightforward.
