Build an experimental environment for the Hadoop series

Source: Internet
Author: User

Basic configuration of the experiment environment

Hardware: each node has a single 50 GB hard disk, 1 GB of memory, and a single core.

Operating System: CentOS6.4 64bit

Hadoop: 2.2.0 64bit (compiled)

JDK: jdk1.7

Disk Partition:

Mount point    Size
/              5 GB
/boot          100 MB
/usr           5 GB
/tmp           500 MB
swap           2 GB
/var           1 GB
/home          Remaining space

 

Linux installation Configuration

No desktop (Minimal)

Base System -> Base, Compatibility libraries, Performance Tools, Perl Support

Development -> Development Tools

Languages -> Chinese Support

 

Create a Hadoop user

useradd hadoop

passwd hadoop

Network configuration: modify the IP address

vim /etc/sysconfig/network-scripts/ifcfg-eth0

Save the file and restart the network: service network restart

Modify the host name

vim /etc/sysconfig/network

Bind host names to IP addresses

vim /etc/hosts
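As a rough illustration of the host name binding, the entries below use the addresses from the host table later in this article. The sketch writes to a scratch file (the variable name hosts_file is made up for the example) so it can be tried without root; on a real node the same lines would go into /etc/hosts.

```shell
# Scratch file standing in for /etc/hosts, so the sketch runs without root.
hosts_file="$(mktemp)"

# Host names and addresses as listed in the host table of this article.
cat >> "$hosts_file" <<'EOF'
172.20.53.151 Master
172.20.53.171 Slave1
172.20.53.21  Slave2
172.20.53.37  Slave3
172.20.53.174 Slave4
172.20.53.177 Slave5
EOF

# Quick sanity check: look up the address bound to one host name.
awk '$2 == "Slave3" { print $1 }' "$hosts_file"
```

The same six lines need to be present on every node so that each host can resolve all the others by name.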

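Disable the firewall

View the firewall status: service iptables status

Stop the firewall: service iptables stop

View whether the firewall starts at boot: chkconfig --list iptables

Disable firewall startup at boot: chkconfig iptables off

Disable SELinux

Edit the configuration file: vim /etc/sysconfig/selinux

Switch to permissive mode immediately, without a reboot: setenforce 0

Check the current mode: getenforce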
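The article does not show which line to change in /etc/sysconfig/selinux. A minimal sketch, assuming the usual approach of setting SELINUX=disabled, is shown below. It edits a scratch copy with assumed sample contents rather than the real file.

```shell
# Scratch stand-in for /etc/sysconfig/selinux (sample contents are assumed).
selinux_cfg="$(mktemp)"
printf '%s\n' 'SELINUX=enforcing' 'SELINUXTYPE=targeted' > "$selinux_cfg"

# Rewrite only the SELINUX= line; the change takes effect at the next boot.
sed 's/^SELINUX=.*/SELINUX=disabled/' "$selinux_cfg" > "$selinux_cfg.new"
mv "$selinux_cfg.new" "$selinux_cfg"

# Show the resulting setting.
grep '^SELINUX=' "$selinux_cfg"
```

Until the node is rebooted, setenforce 0 keeps SELinux in permissive mode for the running system.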

Password-free SSH login

As the hadoop user, generate a public/private key pair: ssh-keygen -t rsa

Send the public key to Slave1 through Slave5: ssh-copy-id Slave1

Similarly, set up password-free login from Slave1 through Slave5 to the Master.

Slave1 in particular must be able to log in to the Master without a password, so that the two can communicate in both directions.
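Distributing the key to all five slaves is repetitive, so a loop helps. The sketch below only prints the commands, which keeps it safe to try anywhere; the function name print_key_copy_commands is made up for this example.

```shell
# Print one ssh-copy-id command per slave instead of running it.
# To actually run them, pipe the output to sh as the hadoop user on the Master.
print_key_copy_commands() {
    for i in 1 2 3 4 5; do
        echo "ssh-copy-id Slave$i"
    done
}

print_key_copy_commands
```

Each ssh-copy-id call prompts once for the hadoop user's password on the target host; after that, ssh to that host should no longer ask for it.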

Install JDK

Decompress jdk1.7 to the /usr/local/ directory and rename it jdk.

Modify the /etc/profile file to set JAVA_HOME.

Cluster plan:

Host Name  IP             Installed Software      Running Process
Master     172.20.53.151  jdk, hadoop             NameNode, DFSZKFailoverController
Slave1     172.20.53.171  jdk, hadoop             NameNode, DFSZKFailoverController
Slave2     172.20.53.21   jdk, hadoop             ResourceManager
Slave3     172.20.53.37   jdk, hadoop, zookeeper  DataNode, NodeManager, JournalNode, QuorumPeerMain
Slave4     172.20.53.174  jdk, hadoop, zookeeper  DataNode, NodeManager, JournalNode, QuorumPeerMain
Slave5     172.20.53.177  jdk, hadoop, zookeeper  DataNode, NodeManager, JournalNode, QuorumPeerMain

Zookeeper Installation

Install zookeeper on the Slave3, Slave4, and Slave5 nodes:

  • Log on to Slave3 as the root user and decompress zookeeper to /usr/local:

tar -zxvf zookeeper-3.4.5.tar.gz -C /usr/local/

  • Go to the zookeeper directory and configure it.
  • Rename zoo_sample.cfg in the conf directory to zoo.cfg:

mv zoo_sample.cfg zoo.cfg (zookeeper reads zoo.cfg when it starts)

  • Create the data directory /usr/local/zookeeper-3.4.5/data, create a file named myid in it, and write the server id into the file: 1
  • In zoo.cfg, change the data storage path (dataDir) to /usr/local/zookeeper-3.4.5/data.

  • Add one entry per node at the end of the file, in the form server.id=host:port:election-port:

Server ID: server.1 on Slave3, with a different id on each of the other nodes

Zookeeper host: Slave3 through Slave5

Port: 2888

Election port: 3888
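Put together, the zoo.cfg changes and the myid file described above might look like the sketch below. It works in a scratch directory (zk_home is a stand-in for /usr/local/zookeeper-3.4.5). The assignment of ids 2 and 3 to Slave4 and Slave5 is an assumption, since the article only states id 1 for Slave3.

```shell
# Scratch directory standing in for /usr/local/zookeeper-3.4.5.
zk_home="$(mktemp -d)"
mkdir -p "$zk_home/conf" "$zk_home/data"

# Only the lines discussed in this article are written here; the other
# settings inherited from zoo_sample.cfg would stay unchanged on a real node.
cat > "$zk_home/conf/zoo.cfg" <<EOF
dataDir=$zk_home/data
server.1=Slave3:2888:3888
server.2=Slave4:2888:3888
server.3=Slave5:2888:3888
EOF

# The myid file holds this node's id and must match its server.N entry
# (1 on Slave3; 2 on Slave4 and 3 on Slave5 are assumed).
echo 1 > "$zk_home/data/myid"

cat "$zk_home/conf/zoo.cfg"
```

The zoo.cfg file is identical on all three nodes; only the number in data/myid differs from node to node.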

 

  • Send the configured zookeeper to Slave4 and Slave5 with scp:

scp -r /usr/local/zookeeper-3.4.5/ root@Slave4:/usr/local/zookeeper-3.4.5/

scp -r /usr/local/zookeeper-3.4.5/ root@Slave5:/usr/local/zookeeper-3.4.5/

Don't forget to change the server id in the myid file on each node.

  • Start zookeeper on the three nodes:

Run the zkServer.sh script in the bin directory: ./zkServer.sh start

  • View the status: ./zkServer.sh status

Only one of the three nodes is the leader; the other two are followers.

 

 

Install hadoop

Upload the compiled hadoop-2.2.0.tar.gz file to the Master, decompress it to the /usr directory as the root user, and rename it hadoop.

  • Create a tmp folder in the hadoop directory (optional):

mkdir tmp

  • Set the owner of the hadoop directory to the hadoop user:

chown -R hadoop:hadoop hadoop

  • Add hadoop to the environment variables: vim /etc/profile

Configure the other nodes in the same way.

Configure hadoop
  • Configure HDFS (all configuration files of hadoop 2.0 are in the $HADOOP_HOME/etc/hadoop directory)

The environment variables set in /etc/profile are:

export JAVA_HOME=/usr/local/jdk

export HADOOP_HOME=/usr/hadoop

export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
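One way to add these exports without duplicating them on a second run is sketched below. It appends to a scratch file (profile is a stand-in for /etc/profile) and then loads the file into the current shell, much as logging in again would.

```shell
# Scratch file standing in for /etc/profile.
profile="$(mktemp)"

# Append the block only if it is not there yet, so re-running is harmless.
if ! grep -q 'HADOOP_HOME=' "$profile"; then
    cat >> "$profile" <<'EOF'
export JAVA_HOME=/usr/local/jdk
export HADOOP_HOME=/usr/hadoop
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
EOF
fi

# Load the settings into the current shell and confirm one of them.
. "$profile"
echo "$HADOOP_HOME"
```

On a real node, running source /etc/profile after editing has the same effect as the last step here.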

Modify the configuration files in the /usr/hadoop/etc/hadoop/ directory
  • Configure the hadoop runtime environment: set JAVA_HOME in hadoop-env.sh

  • Modify core-site.xml

Although the hadoop.tmp.dir parameter is called a temporary directory, HDFS data is also stored under it.
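The article does not reproduce core-site.xml itself. A plausible minimal version for this cluster is sketched below, written to a scratch directory (conf_dir stands in for /usr/hadoop/etc/hadoop). The property names are standard Hadoop ones, but the values are assumptions inferred from the rest of the article: the ns1 nameservice, the tmp folder under /usr/hadoop, and zookeeper on Slave3 through Slave5 with the default client port 2181.

```shell
# Scratch directory standing in for /usr/hadoop/etc/hadoop.
conf_dir="$(mktemp -d)"

cat > "$conf_dir/core-site.xml" <<'EOF'
<configuration>
  <!-- Default file system: the ns1 nameservice defined in hdfs-site.xml -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://ns1</value>
  </property>
  <!-- Base directory; HDFS data ends up under it as well -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/hadoop/tmp</value>
  </property>
  <!-- ZooKeeper quorum used for automatic failover (port 2181 is assumed) -->
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>Slave3:2181,Slave4:2181,Slave5:2181</value>
  </property>
</configuration>
EOF

# Count the properties that were written.
grep -c '<property>' "$conf_dir/core-site.xml"
```

The nameservice name in fs.defaultFS has to match dfs.nameservices in hdfs-site.xml, otherwise clients cannot resolve the active NameNode.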

  • Modify hdfs-site.xml

<configuration>
    <!-- Specify the nameservice of hdfs as ns1; it must be consistent with core-site.xml -->
    <property>
        <name>dfs.nameservices</name>
        <value>ns1</value>
    </property>
    <!-- There are two NameNodes under ns1: nn1 and nn2 -->
    <property>
        <name>dfs.ha.namenodes.ns1</name>
        <value>nn1,nn2</value>
    </property>
    <!-- RPC address of nn1 -->
    <property>
        <name>dfs.namenode.rpc-address.ns1.nn1</name>
        <value>Master:9000</value>
    </property>
    <!-- HTTP address of nn1 -->
    <property>
        <name>dfs.namenode.http-address.ns1.nn1</name>
        <value>Master:50070</value>
    </property>
    <!-- RPC address of nn2 -->
    <property>
        <name>dfs.namenode.rpc-address.ns1.nn2</name>
        <value>Slave1:9000</value>
    </property>
    <!-- HTTP address of nn2 -->
    <property>
        <name>dfs.namenode.http-address.ns1.nn2</name>
        <value>Slave1:50070</value>
    </property>
    <!-- Where the NameNode metadata (edit log) is stored on the JournalNodes -->
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://Slave3:8485;Slave4:8485;Slave5:8485/ns1</value>
    </property>
    <!-- Where each JournalNode stores its data on the local disk -->
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/usr/hadoop/journal</value>
    </property>
    <!-- Enable automatic failover when a NameNode fails -->
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <!-- Class that clients use to find the active NameNode after a failover -->
    <property>
        <name>dfs.client.failover.proxy.provider.ns1</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <!-- Configure the fencing (isolation) method -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
    </property>
    <!-- sshfence requires password-free ssh login -->
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/hadoop/.ssh/id_rsa</value>
    </property>
</configuration>

 

  • Rename mapred-site.xml.template to mapred-site.xml and configure it.

Description: the MR framework is set to run on yarn.
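Running MapReduce on yarn comes down to a single property, mapreduce.framework.name. A minimal sketch of the resulting file, written to a scratch directory (conf_dir stands in for /usr/hadoop/etc/hadoop), could look like this:

```shell
# Scratch directory standing in for /usr/hadoop/etc/hadoop.
conf_dir="$(mktemp -d)"

cat > "$conf_dir/mapred-site.xml" <<'EOF'
<configuration>
  <!-- Run MapReduce jobs on yarn instead of the local runner -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF

cat "$conf_dir/mapred-site.xml"
```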

  • Configure the slave node file: slaves

DataNodes: Slave3, Slave4, Slave5
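The slaves file is plain text with one host name per line. As a small sketch, again using a scratch directory (conf_dir) in place of /usr/hadoop/etc/hadoop:

```shell
# Scratch directory standing in for /usr/hadoop/etc/hadoop.
conf_dir="$(mktemp -d)"

# One DataNode host per line.
printf '%s\n' Slave3 Slave4 Slave5 > "$conf_dir/slaves"

cat "$conf_dir/slaves"
```

The start scripts read this file to decide on which hosts to launch the DataNode and NodeManager processes.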

  • Copy the configured hadoop to the other nodes (as root):

scp -r /usr/hadoop/ Slave1:/usr/hadoop/

After copying, reset the owner on each node: chown -R hadoop:hadoop hadoop

 

Start hadoop
  • Start zookeeper (on Slave3, Slave4, and Slave5):

bin/zkServer.sh start

  • Start the journalnodes (all of them are started from the Master):

cd /usr/hadoop

sbin/hadoop-daemons.sh start journalnode (starts the process on multiple nodes at once over ssh)

(Run the jps command on the nodes to check that a JournalNode process has been added.)
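Checking jps output by eye gets tedious across several nodes. The helper below is a hypothetical sketch (has_process is not part of hadoop) that reads jps-style output, one "PID name" pair per line, and reports whether a given process is listed. The sample output and its PIDs are made up for illustration.

```shell
# Succeed if the jps-style text on stdin lists the given process name.
has_process() {
    awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

# Hypothetical jps output from one of the journalnode hosts.
sample_jps='2481 QuorumPeerMain
2630 JournalNode
2755 Jps'

if printf '%s\n' "$sample_jps" | has_process JournalNode; then
    echo "JournalNode is running"
else
    echo "JournalNode is missing"
fi
```

On a real node the check would read from the command itself, for example jps | has_process JournalNode.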

  • Format HDFS

Run the following command on the Master: hadoop namenode -format

Copy the tmp directory on the Master to /usr/hadoop/ on Slave1:

scp -r /usr/hadoop/tmp Slave1:/usr/hadoop/

  • Format ZK (run on the Master): hdfs zkfc -formatZK

Afterwards, run ./zkCli.sh in the bin directory of zookeeper on any of Slave3 through Slave5; you will find that a hadoop-ha node has been created to store the HA data.

  • Start HDFS (run on the Master):

sbin/start-dfs.sh

If a NameNode fails, restart it with sbin/hadoop-daemon.sh start namenode. Both NameNodes must be able to log in to each other over ssh without a password.

  • Start yarn on Slave2: sbin/start-yarn.sh

 

 

 

 

Note:

About modifying virtual machine NICs
  • Modify the /etc/udev/rules.d/70-persistent-net.rules file

Delete the eth0 entry and rename the second NIC, eth1, to eth0.

  • Change the MAC address of eth0 in /etc/sysconfig/network-scripts/ifcfg-eth0 to match the one in /etc/udev/rules.d/70-persistent-net.rules.
