Building an experimental environment for the Hadoop series

Last Update:2014-10-13 Source: Internet

Author: User

Basic configuration of the experiment environment

Hardware: each node has a single 50 GB hard disk, 1 GB of memory, and a single CPU core.

Operating System: CentOS 6.4, 64-bit

Hadoop: 2.2.0, 64-bit (compiled)

JDK: jdk1.7

Disk Partition:

/	5 GB
/boot	100 MB
/usr	5 GB
/tmp	500 MB
swap	2 GB
/var	1 GB
/home	Remaining space

Linux installation Configuration

No desktop (Minimal)

Base System → Base, Compatibility Libraries, Performance Tools, Perl Support

Development → Development Tools

Languages → Chinese Support

Create a Hadoop user

useradd hadoop

passwd hadoop

Network configuration: modify the IP address

vim /etc/sysconfig/network-scripts/ifcfg-eth0

Save the file and restart the network: service network restart

Modify the host name

vim /etc/sysconfig/network

Bind host names to IP addresses

vim /etc/hosts
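
As an illustration, the bindings for the six nodes used in this experiment could be generated with a small script. This is only a sketch: the addresses are the ones listed in this article's host table, and render_hosts is a hypothetical helper name, not a system command.

```shell
#!/bin/sh
# Sketch: print the /etc/hosts bindings for the six nodes of this experiment.
# render_hosts is a hypothetical helper name used only for this illustration.
render_hosts() {
    cat <<'EOF'
172.20.53.151 Master
172.20.53.171 Slave1
172.20.53.21 Slave2
172.20.53.37 Slave3
172.20.53.174 Slave4
172.20.53.177 Slave5
EOF
}

# Print the entries for review. As root, they could be appended with:
#   render_hosts >> /etc/hosts
render_hosts
```

The same entries are needed on every node, so that each host can resolve all the others by name.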

Disable the firewall

View the firewall status: service iptables status

Stop the firewall: service iptables stop

View whether the firewall starts at boot: chkconfig iptables --list

Disable the firewall at boot: chkconfig iptables off

Disable SELinux

vim /etc/sysconfig/selinux

setenforce 0 (switches SELinux to permissive mode immediately)

getenforce (shows the current mode)
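
The edit to /etc/sysconfig/selinux usually amounts to setting SELINUX=disabled, which takes effect after a reboot. A minimal sketch of doing this without an editor follows; set_selinux_disabled is a hypothetical helper name, and the demo runs on a scratch file rather than the real configuration.

```shell
#!/bin/sh
# Sketch: set SELINUX=disabled in an SELinux config file without an editor.
# set_selinux_disabled is a hypothetical helper name.
set_selinux_disabled() {
    cfg=$1
    tmp=$(mktemp)
    # Rewrite only the SELINUX= line; comments and SELINUXTYPE= are left alone.
    # The result is written back through the original path instead of
    # replacing the file, so a symlinked config keeps its link.
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$cfg" > "$tmp" && cat "$tmp" > "$cfg"
    rm -f "$tmp"
}

# Demo on a scratch file; on a real node the argument would be
# /etc/sysconfig/selinux and the script would run as root.
demo_cfg=$(mktemp)
printf '# sample\nSELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$demo_cfg"
set_selinux_disabled "$demo_cfg"
cat "$demo_cfg"
```

Writing back with cat instead of an in-place sed is a deliberate choice here: on CentOS the file under /etc/sysconfig is typically a symbolic link to /etc/selinux/config, and replacing the link with a regular file would leave the real configuration unchanged.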

Password-free SSH login settings

As the hadoop user, generate a public/private key pair: ssh-keygen -t rsa

Copy the public key to Slave1 through Slave5, for example: ssh-copy-id Slave1

Similarly, set up password-free login from Slave1 through Slave5 to the Master.

To ensure communication between Slave1 and the Master, Slave1 in particular must be able to log in to the Master without a password.
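
The key distribution can be scripted as a loop over the slave hosts. The sketch below assumes it is run as the hadoop user on the Master; distribute_key and COPY_CMD are hypothetical names introduced only for this illustration, and an empty passphrase is assumed for the key.

```shell
#!/bin/sh
# Sketch: create a key pair once and copy the public key to each slave.
# distribute_key and COPY_CMD are hypothetical names for this illustration.
SLAVES="Slave1 Slave2 Slave3 Slave4 Slave5"
COPY_CMD=${COPY_CMD:-ssh-copy-id}   # override (for example with echo) for a dry run

distribute_key() {
    # Generate the RSA key pair only if none exists yet (empty passphrase)
    [ -f "$HOME/.ssh/id_rsa" ] || ssh-keygen -t rsa -N '' -f "$HOME/.ssh/id_rsa"
    for host in $SLAVES; do
        # ssh-copy-id prompts once for the remote user's password on each host
        $COPY_CMD "$host"
    done
}

# Run as the hadoop user on the Master:
#   distribute_key
```

For the reverse direction, the same function could be run on a slave with SLAVES set to Master.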

Install JDK

Decompress jdk1.7 to the /usr/local/ directory and rename it jdk.

Modify the /etc/profile file

Host Name	IP	Installed Software	Running Processes
Master	172.20.53.151	JDK, Hadoop	NameNode, DFSZKFailoverController
Slave1	172.20.53.171	JDK, Hadoop	NameNode, DFSZKFailoverController
Slave2	172.20.53.21	JDK, Hadoop	ResourceManager
Slave3	172.20.53.37	JDK, Hadoop, ZooKeeper	DataNode, NodeManager, JournalNode, QuorumPeerMain
Slave4	172.20.53.174	JDK, Hadoop, ZooKeeper	DataNode, NodeManager, JournalNode, QuorumPeerMain
Slave5	172.20.53.177	JDK, Hadoop, ZooKeeper	DataNode, NodeManager, JournalNode, QuorumPeerMain

Zookeeper Installation

Install ZooKeeper on the Slave3, Slave4, and Slave5 nodes:

Log on to Slave3 as the root user and decompress ZooKeeper to /usr/local:

tar -zxvf zookeeper-3.4.5.tar.gz -C /usr/local/

Go to the ZooKeeper directory and configure it.
Rename zoo_sample.cfg in the conf directory to zoo.cfg, the file that ZooKeeper reads at startup:

mv zoo_sample.cfg zoo.cfg

Create the data directory /usr/local/zookeeper-3.4.5/data and, inside it, a file named myid containing the server id: 1
In zoo.cfg, set the data storage path (dataDir) to /usr/local/zookeeper-3.4.5/data.

Add one line per server at the end of the file, in the form server.<id>=<host>:<port>:<election port>, built from the following information:

Server ID: server.1 through server.3

ZooKeeper host: Slave3 through Slave5

Port: 2888

Election port: 3888

Send the configured ZooKeeper directory to Slave4 and Slave5 with scp:

scp -r /usr/local/zookeeper-3.4.5/ root@Slave4:/usr/local/

scp -r /usr/local/zookeeper-3.4.5/ root@Slave5:/usr/local/

Do not forget to change the server number in the myid file on each node (2 on Slave4, 3 on Slave5).
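
The per-node ZooKeeper configuration can be collected into a small script. The following is a minimal sketch, not the original author's procedure: configure_zk is a hypothetical helper, the mapping of ids 1 to 3 onto Slave3 to Slave5 follows the settings above, and the fallback values written when zoo_sample.cfg is missing are the ones found in ZooKeeper's stock sample file.

```shell
#!/bin/sh
# Sketch: prepare zoo.cfg and myid for one ZooKeeper node.
# configure_zk is a hypothetical helper, not part of ZooKeeper.
configure_zk() {
    zk_home=$1   # e.g. /usr/local/zookeeper-3.4.5
    my_id=$2     # 1 on Slave3, 2 on Slave4, 3 on Slave5

    mkdir -p "$zk_home/conf" "$zk_home/data"
    # myid holds only the numeric server id of this node
    echo "$my_id" > "$zk_home/data/myid"

    if [ -f "$zk_home/conf/zoo_sample.cfg" ]; then
        # Keep everything from the sample except its dataDir line
        grep -v '^dataDir=' "$zk_home/conf/zoo_sample.cfg" > "$zk_home/conf/zoo.cfg"
    else
        # Fallback: the values found in the stock sample file
        printf 'tickTime=2000\ninitLimit=10\nsyncLimit=5\nclientPort=2181\n' \
            > "$zk_home/conf/zoo.cfg"
    fi

    # Data directory plus one server.<id>=<host>:<port>:<election port> line per node
    cat >> "$zk_home/conf/zoo.cfg" <<EOF
dataDir=$zk_home/data
server.1=Slave3:2888:3888
server.2=Slave4:2888:3888
server.3=Slave5:2888:3888
EOF
}

# Demo against a scratch directory instead of /usr/local/zookeeper-3.4.5
demo_zk=$(mktemp -d)
configure_zk "$demo_zk" 1
cat "$demo_zk/conf/zoo.cfg"
```

On the real nodes the call would be, for example, configure_zk /usr/local/zookeeper-3.4.5 2 on Slave4. Only the id argument differs between nodes; the server lines are identical everywhere.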

Start ZooKeeper on the three nodes:

Call the zkServer.sh script in the bin directory: ./zkServer.sh start

View the status: ./zkServer.sh status

Exactly one of the three nodes is the leader, and the other two are followers.

Install hadoop

Upload the compiled hadoop-2.2.0.tar.gz file to the Master, decompress it to the /usr directory as the root user, and rename it hadoop.

Create a tmp folder in the hadoop directory:

mkdir tmp

Set the owner of the hadoop directory to hadoop:

chown -R hadoop:hadoop hadoop

Add Hadoop to the environment variables by appending the following lines to /etc/profile (vim /etc/profile):

export JAVA_HOME=/usr/local/jdk

export HADOOP_HOME=/usr/hadoop

export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin

Configure the other nodes in the same way.

Configure hadoop

Configure HDFS (all configuration files of hadoop2.0 are in the $HADOOP_HOME/etc/hadoop directory)

Modify the configuration files in the /usr/hadoop/etc/hadoop/ directory

Configure the Hadoop runtime environment by setting JAVA_HOME in hadoop-env.sh (here, /usr/local/jdk).

Modify core-site.xml

Although the hadoop.tmp.dir parameter is called a temporary directory, HDFS data is also saved under it.
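
The article does not list the contents of core-site.xml, so the following is only a sketch of a file consistent with the other settings used here. The property names are standard Hadoop 2 ones, but the exact set the author used is unknown: the ns1 nameservice and the tmp directory come from this article, while ZooKeeper client port 2181 is an assumed default and write_core_site is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: write a core-site.xml consistent with the HA settings in this article.
# write_core_site is a hypothetical helper. Port 2181 is ZooKeeper's default
# client port and is an assumption, as is the exact set of properties.
write_core_site() {
    conf_dir=$1   # e.g. /usr/hadoop/etc/hadoop
    cat > "$conf_dir/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- Default file system: the ns1 nameservice defined in hdfs-site.xml -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://ns1</value>
  </property>
  <!-- Base directory for Hadoop data (the tmp folder created earlier) -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/hadoop/tmp</value>
  </property>
  <!-- ZooKeeper ensemble used for automatic NameNode failover -->
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>Slave3:2181,Slave4:2181,Slave5:2181</value>
  </property>
</configuration>
EOF
}

# Demo: write into a scratch directory and show the result
demo_conf=$(mktemp -d)
write_core_site "$demo_conf"
cat "$demo_conf/core-site.xml"
```

Because fs.defaultFS names the logical nameservice instead of a single host, clients keep working when the active NameNode changes.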

Modify hdfs-site.xml

<configuration>

  <!-- Specify the nameservice of HDFS as ns1, which must be consistent with core-site.xml -->
  <property>
    <name>dfs.nameservices</name>
    <value>ns1</value>
  </property>

  <!-- There are two NameNodes under ns1, namely nn1 and nn2 -->
  <property>
    <name>dfs.ha.namenodes.ns1</name>
    <value>nn1,nn2</value>
  </property>

  <!-- RPC communication address of nn1 -->
  <property>
    <name>dfs.namenode.rpc-address.ns1.nn1</name>
    <value>Master:9000</value>
  </property>

  <!-- HTTP communication address of nn1 -->
  <property>
    <name>dfs.namenode.http-address.ns1.nn1</name>
    <value>Master:50070</value>
  </property>

  <!-- RPC communication address of nn2 -->
  <property>
    <name>dfs.namenode.rpc-address.ns1.nn2</name>
    <value>Slave1:9000</value>
  </property>

  <!-- HTTP communication address of nn2 -->
  <property>
    <name>dfs.namenode.http-address.ns1.nn2</name>
    <value>Slave1:50070</value>
  </property>

  <!-- Specify where the NameNode metadata (edit log) is stored on the JournalNodes -->
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://Slave3:8485;Slave4:8485;Slave5:8485/ns1</value>
  </property>

  <!-- Specify where each JournalNode stores its data on the local disk -->
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/usr/hadoop/journal</value>
  </property>

  <!-- Enable automatic failover when a NameNode fails -->
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>

  <!-- Failover proxy provider that clients use to find the active NameNode -->
  <property>
    <name>dfs.client.failover.proxy.provider.ns1</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>

  <!-- Configure the fencing (isolation) method -->
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>

  <!-- sshfence requires password-free SSH login; specify the private key to use -->
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/hadoop/.ssh/id_rsa</value>
  </property>

</configuration>

Rename mapred-site.xml.template to mapred-site.xml and configure it so that the MapReduce framework runs on YARN.

Configure the worker node file: slaves

DataNodes: Slave3, Slave4, Slave5
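
Both files are short. A minimal sketch of their contents, generated from a script, might look as follows; write_mr_and_slaves is a hypothetical helper, and the files are written to a scratch directory here rather than to /usr/hadoop/etc/hadoop.

```shell
#!/bin/sh
# Sketch: write mapred-site.xml and the slaves file described above.
# write_mr_and_slaves is a hypothetical helper name.
write_mr_and_slaves() {
    conf_dir=$1   # e.g. /usr/hadoop/etc/hadoop
    cat > "$conf_dir/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- Run MapReduce jobs on YARN -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF
    # The slaves file lists one worker host per line
    printf '%s\n' Slave3 Slave4 Slave5 > "$conf_dir/slaves"
}

# Demo: write into a scratch directory and show the slaves file
demo_mr=$(mktemp -d)
write_mr_and_slaves "$demo_mr"
cat "$demo_mr/slaves"
```

The slaves file is also what scripts such as hadoop-daemons.sh read to decide which hosts to contact over SSH.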

Copy the configured hadoop directory to the other nodes (as root), for example:

scp -r /usr/hadoop/ Slave1:/usr/

After copying, fix the ownership on each node: chown -R hadoop:hadoop hadoop

Start hadoop

Start ZooKeeper (on Slave3, Slave4, and Slave5):

bin/zkServer.sh start

Start the JournalNodes (all JournalNodes are started from the Master):

cd /usr/hadoop

sbin/hadoop-daemons.sh start journalnode

This script starts the process on multiple nodes at once over SSH.

(Run the jps command on Slave3, Slave4, and Slave5 to check that a JournalNode process has been added.)

Format HDFS

Run the following command on the Master: hadoop namenode -format

Copy the tmp directory on the Master to /usr/hadoop/ on Slave1:

scp -r /usr/hadoop/tmp Slave1:/usr/hadoop/

Format ZK (executed on the Master): hdfs zkfc -formatZK

At this point, running ./zkCli.sh in the bin directory of ZooKeeper on any of the Slave3 through Slave5 nodes shows that a hadoop-ha node has been created to store the HA data.

Start HDFS (run on the Master):

sbin/start-dfs.sh

If a NameNode fails, restart it with the command sbin/hadoop-daemon.sh start namenode. Both NameNodes must be able to log in to each other over SSH without a password.

Start YARN on Slave2: sbin/start-yarn.sh
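
Once everything is started, the jps listing on each node can be compared with the processes that node is expected to run. The helper below is a sketch for that check: expect_daemons is a hypothetical name, and it only reads jps output from standard input, so it can be tried without a running cluster.

```shell
#!/bin/sh
# Sketch: check that the expected daemons appear in jps output.
# expect_daemons is a hypothetical helper; it reads jps output on stdin.
expect_daemons() {
    listing=$(cat)
    missing=0
    for name in "$@"; do
        # jps prints "<pid> <main class>", one process per line
        if ! printf '%s\n' "$listing" | grep -q "[0-9] $name\$"; then
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"
}

# Example on a worker node such as Slave3:
#   jps | expect_daemons DataNode NodeManager JournalNode QuorumPeerMain
```

The function returns a non-zero status when any expected process is absent, so it can be used in a loop over the nodes or in a simple monitoring script.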

Note:

About modifying virtual machine NICs

Modify the /etc/udev/rules.d/70-persistent-net.rules file:

Delete the eth0 entry, then rename the second entry from eth1 to eth0.

Change the MAC address of eth0 in /etc/sysconfig/network-scripts/ifcfg-eth0 to the one recorded in /etc/udev/rules.d/70-persistent-net.rules.
