Hadoop 2.2.0 Installation and Configuration


This Hadoop 2.2.0 environment was set up by following articles found online. The details are as follows.

Environment Introduction:

I use two laptops, each running a Fedora 10 system installed in VMware.

VM 1: IP 192.168.1.105, hostname: cloud001, user: root

VM 2: IP 192.168.1.106, hostname: cloud002, user: root

 

Preparations:

1. Configure the /etc/hosts file and add the following two lines:

192.168.1.105 cloud001
192.168.1.106 cloud002
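
A quick way to confirm the entries resolve, assuming the same two lines were added on both machines:

# quick sanity check that the hostnames resolve (run on either machine)
ping -c 1 cloud001
ping -c 1 cloud002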

2. Run service iptables stop to disable the iptables service.

3. Install JDK jdk-6u45-linux-i586.bin: place it under /opt and run it there to unpack it.

Configure /etc/profile and add:

export JAVA_HOME=/opt/jdk1.6.0_45
export CLASSPATH=.:$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$PATH

Then run source /etc/profile.

You can run the env command to check whether the environment variables are set correctly.
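
For example, a minimal check of the variables set above:

# confirm the variables from /etc/profile took effect in the current shell
echo $JAVA_HOME     # should print /opt/jdk1.6.0_45
java -version       # should report java version "1.6.0_45"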

4. Configure SSH password-free login, which is important

1) Run ssh-keygen -t rsa on both machines to generate a key pair: a private key (id_rsa) and a public key (id_rsa.pub);

2) Copy the public key on host 192.168.1.105 to the /root/.ssh/ directory on host 192.168.1.106:

scp ./id_rsa.pub root@192.168.1.106:/root/.ssh/authorized_keys

3) Copy the public key on host 192.168.1.106 to the /root/.ssh/ directory on host 192.168.1.105:

scp ./id_rsa.pub root@192.168.1.105:/root/.ssh/authorized_keys

4) On both machines, enter the /root/.ssh directory and run cat id_rsa.pub >> authorized_keys to append the local public key.

5) After configuration, ssh cloud001 and ssh cloud002 should log in without a password from either host.
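
Putting the four sub-steps together, a minimal sketch of the sequence as run on cloud001 (the same commands apply on cloud002 with the target IP swapped; the chmod line is an extra precaution not in the original steps):

# generate the key pair (accept the defaults, empty passphrase)
ssh-keygen -t rsa
# send the local public key to the peer as its authorized_keys
scp ~/.ssh/id_rsa.pub root@192.168.1.106:/root/.ssh/authorized_keys
# append the local public key to the local authorized_keys as well
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
# sshd may ignore authorized_keys files with loose permissions
chmod 600 ~/.ssh/authorized_keys
# should now log in without a password
ssh cloud002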

 

Configure Hadoop 2.2.0

1. Download hadoop-2.2.0.tar.gz (I use the 32-bit build) and extract it to the /opt directory on both machines;

2. Go to the /opt/hadoop-2.2.0/etc/hadoop directory and modify hadoop-env.sh:

export JAVA_HOME=/opt/jdk1.6.0_45

3. Modify yarn-env.sh on 192.168.1.105

export JAVA_HOME=/opt/jdk1.6.0_45

4. On 192.168.1.105, modify core-site.xml. Note that the file header must contain <?xml version="1.0"?>, and all directories configured in the XML files must already exist; if they do not, create them first (see the sketch after the hdfs-site.xml listing below).

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://cloud001:9000</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop-2.2.0/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>hadoop.proxyuser.hduser.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hduser.groups</name>
    <value>*</value>
  </property>
</configuration>

5. Modify mapred-site.xml on 192.168.1.105

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>cloud001:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>cloud001:19888</value>
  </property>
</configuration>
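
Note that the stock 2.2.0 tarball usually ships only mapred-site.xml.template in etc/hadoop; if mapred-site.xml is not present, it can be created from the template before making the edits above (a small sketch):

cd /opt/hadoop-2.2.0/etc/hadoop
# the template ships with the distribution; copy it before editing
cp mapred-site.xml.template mapred-site.xml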

6. Modify hdfs-site.xml on 192.168.1.105

<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>cloud001:9001</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/opt/hadoop-2.2.0/tmp/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/opt/hadoop-2.2.0/tmp/dfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>
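
As noted in step 4, the directories referenced above must exist before the NameNode is formatted; a minimal sketch, using the paths from the listings above and run on both nodes:

# hadoop.tmp.dir plus the name/data directories from hdfs-site.xml
mkdir -p /opt/hadoop-2.2.0/tmp/dfs/name
mkdir -p /opt/hadoop-2.2.0/tmp/dfs/data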

7. Modify the yarn-site.xml file on 192.168.1.105

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>cloud001:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>cloud001:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>cloud001:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>cloud001:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>cloud001:8088</value>
  </property>
</configuration>

8. Modify the slaves file on 192.168.1.105 and add a line

cloud002

9. On 192.168.1.105, create a script Sc2slave.sh (vim Sc2slave.sh) to copy the configuration to 192.168.1.106:

scp /opt/hadoop-2.2.0/etc/hadoop/slaves root@cloud002:/opt/hadoop-2.2.0/etc/hadoop/slaves
scp /opt/hadoop-2.2.0/etc/hadoop/core-site.xml root@cloud002:/opt/hadoop-2.2.0/etc/hadoop/core-site.xml
scp /opt/hadoop-2.2.0/etc/hadoop/hdfs-site.xml root@cloud002:/opt/hadoop-2.2.0/etc/hadoop/hdfs-site.xml
scp /opt/hadoop-2.2.0/etc/hadoop/mapred-site.xml root@cloud002:/opt/hadoop-2.2.0/etc/hadoop/mapred-site.xml
scp /opt/hadoop-2.2.0/etc/hadoop/yarn-site.xml root@cloud002:/opt/hadoop-2.2.0/etc/hadoop/yarn-site.xml

10. Run Sc2slave.sh on 192.168.1.105 to copy the configuration files to 192.168.1.106;

11. Modify /etc/profile on 192.168.1.105 so that commands such as hadoop can be used directly on the command line;

export HADOOP_HOME=/opt/hadoop-2.2.0
export PATH=.:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$JAVA_HOME/bin:$PATH
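
A quick way to pick up and verify the change (a small sketch):

# reload the profile and confirm the hadoop command is on the PATH
source /etc/profile
hadoop version      # should report Hadoop 2.2.0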

12. Execute hadoop namenode -format;
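
The hadoop namenode form still works in 2.x but is marked deprecated; the equivalent current form is via the hdfs command (run once, on cloud001 only, as it wipes existing NameNode metadata):

# format the HDFS NameNode on cloud001
hdfs namenode -format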

13. Run the start-all.sh script under /opt/hadoop-2.2.0/sbin.
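
In 2.x, start-all.sh is itself deprecated and simply delegates to the HDFS and YARN start scripts, so the two can also be run directly on cloud001:

# start HDFS (NameNode, SecondaryNameNode, and the DataNodes listed in slaves)
/opt/hadoop-2.2.0/sbin/start-dfs.sh
# start YARN (ResourceManager and NodeManagers)
/opt/hadoop-2.2.0/sbin/start-yarn.sh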

14. After it finishes, run jps on 192.168.1.105:

10531 Jps
9444 SecondaryNameNode
9579 ResourceManager
9282 NameNode

Run jps on 192.168.1.106:

4463 DataNode
4941 Jps
4535 NodeManager

15. Run hdfs dfsadmin -report on 192.168.1.105:
Configured Capacity: 13460701184 (12.54 GB)
Present Capacity: 5762686976 (5.37 GB)
DFS Remaining: 5762662400 (5.37 GB)
DFS Used: 24576 (24 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 1 (1 total, 0 dead)

Live datanodes:
Name: 192.168.1.106:50010 (cloud002)
Hostname: localhost
Decommission Status: Normal
Configured Capacity: 13460701184 (12.54 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 7698014208 (7.17 GB)
DFS Remaining: 5762662400 (5.37 GB)
DFS Used %: 0.00%
DFS Remaining %: 42.81%
Last contact: Mon Feb 17 05:52:18 PST 2014.
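
Since dfs.webhdfs.enabled is true and the ResourceManager web address is cloud001:8088, the cluster can also be checked over HTTP; a small sketch (port 50070 is the 2.2.0 NameNode web UI default, not set explicitly above):

# list the HDFS root directory over WebHDFS
curl "http://cloud001:50070/webhdfs/v1/?op=LISTSTATUS"
# ResourceManager web UI (also reachable in a browser)
curl "http://cloud001:8088/cluster"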

The configuration process went smoothly. Take care that the XML configuration files contain no mistakes, as most problems in the steps above come from there. If an error occurs, check the Hadoop logs under /opt/hadoop-2.2.0/logs to locate the problem.
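
For example, a hedged sketch of checking the logs (the exact file names depend on the user and hostname, so the one below is illustrative):

ls /opt/hadoop-2.2.0/logs
# e.g. the NameNode log on cloud001
tail -n 100 /opt/hadoop-2.2.0/logs/hadoop-root-namenode-cloud001.log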

Hadoop should now be configured successfully, and you can start learning more!

