The Hadoop 2.2.0 environment below was set up by following articles found online. The details are as follows.
Environment:
Two laptops, each running a Fedora 10 guest under VMware.
VM 1: IP 192.168.1.105, hostname cloud001, user root
VM 2: IP 192.168.1.106, hostname cloud002, user root
Preparations:
1. Edit the /etc/hosts file and add the following two lines:
192.168.1.105 cloud001
192.168.1.106 cloud002
2. Run service iptables stop to disable the iptables firewall service.
3. Install the JDK: run jdk-6u45-linux-i586.bin under /opt to unpack it.
Edit /etc/profile and add:
export JAVA_HOME=/opt/jdk1.6.0_45
export CLASSPATH=.:$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$PATH
Then run source /etc/profile.
You can run the env command to check whether the environment variables were set successfully.
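The three profile lines can be tried safely before touching /etc/profile. This is a minimal sketch that runs them in a subshell (so the current environment is untouched) and prints the resulting CLASSPATH; the JDK path matches the install location above:

```shell
# Run the exports in a subshell so trying this cannot disturb the
# current shell's environment; the echo shows the composed CLASSPATH.
classpath_check=$(
  export JAVA_HOME=/opt/jdk1.6.0_45
  export CLASSPATH=.:$JAVA_HOME/lib/tools.jar
  export PATH=$JAVA_HOME/bin:$PATH
  echo "$CLASSPATH"
)
echo "$classpath_check"
```

If the output is .:/opt/jdk1.6.0_45/lib/tools.jar, the lines are safe to add to /etc/profile.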
4. Configure passwordless SSH login; this step is important.
1) Run ssh-keygen -t rsa on both machines to generate a key pair: a private key (id_rsa) and a public key (id_rsa.pub).
2) Copy the public key on host 192.168.1.105 into the /root/.ssh/ directory on host 192.168.1.106:
scp ./id_rsa.pub root@192.168.1.106:/root/.ssh/authorized_keys
3) Copy the public key on host 192.168.1.106 into the /root/.ssh/ directory on host 192.168.1.105:
scp ./id_rsa.pub root@192.168.1.105:/root/.ssh/authorized_keys
4) On both machines, enter the /root/.ssh directory and run cat id_rsa.pub >> authorized_keys to append the local key as well.
5) After configuration, ssh cloud001 and ssh cloud002 should log in from either host without a password.
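Step 4 must use >> rather than >, because the scp in steps 2 and 3 already wrote the other host's key as authorized_keys, and the local key has to be appended without clobbering it. A sketch with stand-in key strings in a scratch directory:

```shell
# Simulate the key exchange with placeholder strings (no real keys).
d=$(mktemp -d)
echo "remote-host-key" > "$d/authorized_keys"   # what scp left behind
echo "local-host-key" > "$d/id_rsa.pub"         # this host's own public key
cat "$d/id_rsa.pub" >> "$d/authorized_keys"     # append, do not overwrite
key_count=$(grep -c '' "$d/authorized_keys")    # count lines in the file
echo "$key_count"
```

Both keys survive (the count is 2); with a single > the remote key would have been lost.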
Configure Hadoop 2.2.0
1. Download hadoop-2.2.0.tar.gz (I use the 32-bit build) and extract it to the /opt directory on both machines.
2. Go to the /opt/hadoop-2.2.0/etc/hadoop directory and edit hadoop-env.sh, setting:
export JAVA_HOME=/opt/jdk1.6.0_45
3. On 192.168.1.105, edit yarn-env.sh and set:
export JAVA_HOME=/opt/jdk1.6.0_45
4. On 192.168.1.105, edit core-site.xml. Note that the file must keep its <?xml version="1.0"?> header, and every directory configured in the XML must already exist; create any that do not.
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://cloud001:9000</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/hadoop-2.2.0/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>hadoop.proxyuser.hduser.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hduser.groups</name>
<value>*</value>
</property>
</configuration>
5. Modify mapred-site.xml on 192.168.1.105
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>cloud001:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>cloud001:19888</value>
</property>
</configuration>
6. Modify hdfs-site.xml on 192.168.1.105
<configuration>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>cloud001:9001</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/opt/hadoop-2.2.0/tmp/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/opt/hadoop-2.2.0/tmp/dfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>
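As noted in step 4, the directories named in core-site.xml and hdfs-site.xml must exist before the namenode is formatted. A sketch that creates them under a scratch prefix so it can be tried anywhere; on the real nodes the prefix is /opt/hadoop-2.2.0:

```shell
# Create the tmp/name/data directory tree under a scratch prefix
# (substitute /opt/hadoop-2.2.0 on the actual machines).
prefix=$(mktemp -d)
mkdir -p "$prefix/tmp/dfs/name" "$prefix/tmp/dfs/data"
created=$(ls "$prefix/tmp/dfs" | sort | tr '\n' ' ')
echo "$created"
```

mkdir -p is idempotent, so the line is safe to rerun on nodes where some of the tree already exists.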
7. Modify the yarn-site.xml file on 192.168.1.105
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>cloud001:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>cloud001:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>cloud001:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>cloud001:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>cloud001:8088</value>
</property>
</configuration>
8. On 192.168.1.105, edit the slaves file and add the line:
cloud002
9. On 192.168.1.105, create a script Sc2slave.sh (vim Sc2slave.sh) to copy the configuration to 192.168.1.106:
scp /opt/hadoop-2.2.0/etc/hadoop/slaves root@cloud002:/opt/hadoop-2.2.0/etc/hadoop/slaves
scp /opt/hadoop-2.2.0/etc/hadoop/core-site.xml root@cloud002:/opt/hadoop-2.2.0/etc/hadoop/core-site.xml
scp /opt/hadoop-2.2.0/etc/hadoop/hdfs-site.xml root@cloud002:/opt/hadoop-2.2.0/etc/hadoop/hdfs-site.xml
scp /opt/hadoop-2.2.0/etc/hadoop/mapred-site.xml root@cloud002:/opt/hadoop-2.2.0/etc/hadoop/mapred-site.xml
scp /opt/hadoop-2.2.0/etc/hadoop/yarn-site.xml root@cloud002:/opt/hadoop-2.2.0/etc/hadoop/yarn-site.xml
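The five scp lines can equivalently be written as a loop. This sketch is shown as a dry run (echo) so the commands can be inspected first; dropping the echo performs the actual copy:

```shell
# Dry-run form of Sc2slave.sh: print one scp command per config file.
cfg=/opt/hadoop-2.2.0/etc/hadoop
cmd_count=0
for f in slaves core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml; do
  echo scp "$cfg/$f" "root@cloud002:$cfg/$f"
  cmd_count=$((cmd_count + 1))
done
echo "$cmd_count"
```

The loop form also makes it easy to add more files (for example hadoop-env.sh) later without duplicating the long paths.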
10. Run Sc2slave.sh on 192.168.1.105 to copy the configuration files to 192.168.1.106.
11. Edit /etc/profile on 192.168.1.105 so the hadoop commands can be used directly on the command line:
export HADOOP_HOME=/opt/hadoop-2.2.0
export PATH=.:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$JAVA_HOME/bin:$PATH
12. Run hadoop namenode -format.
13. Run the start-all.sh script under /opt/hadoop-2.2.0/sbin.
14. After startup, check jps on 192.168.1.105:
10531 Jps
9444 SecondaryNameNode
9579 ResourceManager
9282 NameNode
And jps on 192.168.1.106:
4463 DataNode
4941 Jps
4535 NodeManager
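A quick way to confirm all master daemons came up is to grep the jps output for each expected name. This sketch scans the sample captured above rather than live jps output, so it runs anywhere:

```shell
# Count how many expected master daemons are missing from a jps listing.
sample="10531 Jps
9444 SecondaryNameNode
9579 ResourceManager
9282 NameNode"
missing=0
for d in NameNode SecondaryNameNode ResourceManager; do
  printf '%s\n' "$sample" | grep -q "$d" || missing=$((missing + 1))
done
echo "$missing"
```

On the real master, replace the sample variable with $(jps); a nonzero count means a daemon failed to start and its log under /opt/hadoop-2.2.0/logs should be checked.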
15. Run hdfs dfsadmin -report on 192.168.1.105:
Configured Capacity: 13460701184 (12.54 GB)
Present Capacity: 5762686976 (5.37 GB)
DFS Remaining: 5762662400 (5.37 GB)
DFS Used: 24576 (24 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Datanodes available: 1 (1 total, 0 dead)
Live datanodes:
Name: 192.168.1.106:50010 (cloud002)
Hostname: localhost
Decommission Status: Normal
Configured Capacity: 13460701184 (12.54 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 7698014208 (7.17 GB)
DFS Remaining: 5762662400 (5.37 GB)
DFS Used%: 0.00%
DFS Remaining%: 42.81%
Last contact: Mon Feb 17 05:52:18 PST 2014
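When scripting health checks, individual fields can be pulled out of a saved dfsadmin report with awk. A small sketch using the datanode-count line copied from the output above:

```shell
# Extract the live-datanode count from a dfsadmin report line.
line="Datanodes available: 1 (1 total, 0 dead)"
live=$(echo "$line" | awk '{print $3}')
echo "$live"
```

Comparing the extracted number against the number of entries in the slaves file is a simple automated check that every datanode registered.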
The configuration process went smoothly. Take care not to make mistakes in the XML files. If an error occurs, check the Hadoop logs under /opt/hadoop-2.2.0/logs to locate the problem.
With this, Hadoop should be configured successfully, and you can move on to learning more!