Questions Guide
1. Judging from the actual deployment in this article, what is the difference between local mode, pseudo-distributed mode, and distributed mode?
2. Is a single-machine installation the same thing as a pseudo-distributed one?
3. Can the local mode run MapReduce?
Source: About Cloud
http://www.aboutyun.com/thread-12798-1-1.html
Hadoop 2.7 has been released. This release is not suitable for production environments, but that does not affect learning. Hadoop has three installation modes, and each one can be set up on top of the configuration of the previous one:
- Local mode
- Pseudo-distributed
- Distributed
###############################################
1. Preparation
Install JDK 1.7. For reference, see:
Linux (Ubuntu) install Java JDK environment variable settings and applet test
Test:
java -version
Installing SSH
sudo apt-get install ssh
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
$ export HADOOP_PREFIX=/usr/local/hadoop
You should now be able to log in without a password:
ssh localhost
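If ssh localhost still asks for a password after these steps, one common cause is file permissions: with its default StrictModes setting, sshd refuses an authorized_keys file that group or others can write to. A minimal sketch that tightens the permissions; the function name fix_ssh_perms is hypothetical, and the modes 700 and 600 are the conventional choice rather than something this article specifies.

```shell
# Tighten permissions on an SSH directory so sshd will accept its authorized_keys.
# fix_ssh_perms is a hypothetical helper name; the default directory is ~/.ssh.
fix_ssh_perms() {
    dir="${1:-$HOME/.ssh}"
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    chmod 700 "$dir"                          # only the owner may enter the directory
    if [ -f "$dir/authorized_keys" ]; then
        chmod 600 "$dir/authorized_keys"      # only the owner may read or write the key list
    fi
}
```

A possible use is fix_ssh_perms && ssh -o BatchMode=yes localhost true, where BatchMode=yes makes ssh fail instead of prompting, so a zero exit status means the passwordless login works.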
Installing rsync
sudo apt-get install rsync
Modify the hostname mapping (on Ubuntu these entries live in /etc/hosts):
Comment out
127.0.1.1 ubuntu
Add a new mapping
10.0.0.81 ubuntu
You must make this change, or you will run into connection-refused problems later.
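To confirm that the change took, it can help to list any loopback entries that still carry the hostname. A sketch, assuming the mapping lives in /etc/hosts; the function name loopback_entries is hypothetical.

```shell
# Print every uncommented line of a hosts file that maps the given hostname
# to a 127.x.x.x loopback address. Returns 0 if at least one such line exists.
# loopback_entries is a hypothetical helper name.
loopback_entries() {
    hosts_file="$1"
    host="$2"
    awk -v h="$host" '
        /^[[:space:]]*#/ { next }                 # skip commented-out lines
        $1 ~ /^127\./ {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break              # ignore trailing comments
                if (tolower($i) == tolower(h)) { print; found = 1; break }
            }
        }
        END { exit(found ? 0 : 1) }
    ' "$hosts_file"
}
```

For example, loopback_entries /etc/hosts "$(hostname)" prints nothing and returns non-zero once the 127.0.1.1 line has been commented out.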
2. Installation
Go to the configuration file directory. Mine is:
~/hadoop-2.7.0/etc/hadoop
Modify the configuration file:
etc/hadoop/hadoop-env.sh
Add JAVA_HOME and HADOOP_COMMON_HOME:
export JAVA_HOME=/usr/jdk
export HADOOP_COMMON_HOME=~/hadoop-2.7.0
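A wrong JAVA_HOME tends to show up only later, when the hdfs command fails to start. A small sketch for checking the directory before continuing; the function name check_java_home is hypothetical, and /usr/jdk is simply the path used in this article.

```shell
# Return 0 if the given directory looks like a usable JAVA_HOME,
# i.e. it contains an executable bin/java. Defaults to $JAVA_HOME.
# check_java_home is a hypothetical helper name.
check_java_home() {
    jh="${1:-${JAVA_HOME:-}}"
    if [ -n "$jh" ] && [ -x "$jh/bin/java" ]; then
        echo "JAVA_HOME ok: $jh"
    else
        echo "JAVA_HOME is unset or has no bin/java: '$jh'" >&2
        return 1
    fi
}
```

Running check_java_home /usr/jdk before editing hadoop-env.sh shows whether the path is worth writing into the file.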
Configure the environment variables:
sudo nano /etc/environment
Add the Hadoop directories by appending the following to the PATH variable:
/home/aboutyun/hadoop-2.7.0/bin:/home/aboutyun/hadoop-2.7.0/sbin:
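Since /etc/environment is read at login, the new PATH is only visible in a fresh session. A sketch for checking whether a directory made it into PATH; the function name path_contains is hypothetical.

```shell
# Return 0 if directory $1 is one of the colon-separated entries of $2
# (default: the current $PATH). path_contains is a hypothetical helper name.
path_contains() {
    case ":${2:-$PATH}:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}
```

With the paths from this article, path_contains /home/aboutyun/hadoop-2.7.0/bin and path_contains /home/aboutyun/hadoop-2.7.0/sbin should both succeed after logging in again.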
########################################################
3. Local mode validation [optional]
Local mode means that a program such as WordCount runs against the local disk rather than HDFS.
With the configuration above in place, we can test it by running the following commands one at a time:
Note: bin/hadoop is a relative path, so these commands must be run from HADOOP_HOME.
$ mkdir input
$ cp etc/hadoop/*.xml input
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.0.jar grep input output 'dfs[a-z.]+'
$ cat output/*
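To make concrete what the example job computes: it scans the files in input for strings that match the regular expression and counts how often each distinct match occurs. The same computation on local files can be sketched with standard tools. This illustrates the logic only, not how Hadoop implements it, and the function name count_matches is hypothetical.

```shell
# Count occurrences of each distinct regex match across the files in a directory,
# most frequent first. count_matches is a hypothetical helper name.
count_matches() {
    dir="$1"
    pattern="$2"
    # -o prints only the matched text, -h drops file names, -E enables extended regex
    grep -ohE "$pattern" "$dir"/* | sort | uniq -c | sort -rn
}
```

Running count_matches input 'dfs[a-z.]+' from HADOOP_HOME gives a list that can be compared against the contents of output/*.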
##################################################################
Local mode works, as shown above. Next we configure pseudo-distributed mode.
4. Pseudo-distributed mode: configuration
The full path on my machine is /home/aboutyun/hadoop-2.7.0/etc/hadoop
Modify the file etc/hadoop/core-site.xml
Meaning: the address and RPC port on which the NameNode accepts client connections for file system metadata.
Add the following content:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
Modify etc/hadoop/hdfs-site.xml:
Meaning: each block is stored as a single replica.
Add the following content:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
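A typo in either file tends to surface only later as a startup failure, so it can help to read the values back before formatting. A rough sketch that pulls a property's value out of a *-site.xml file with awk. It assumes each <name> precedes its <value> and that neither element is split across lines, as in the snippets above; the function name get_prop is hypothetical.

```shell
# Print the <value> of the named property in a Hadoop *-site.xml file.
# Returns non-zero if the property is not found. get_prop is a hypothetical helper name.
get_prop() {
    awk -v name="$2" '
        /<name>/  { n = $0; sub(/.*<name>/, "", n); sub(/<\/name>.*/, "", n) }
        /<value>/ { v = $0; sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v)
                    if (n == name) { print v; found = 1; exit } }
        END { exit(found ? 0 : 1) }
    ' "$1"
}
```

For example, get_prop etc/hadoop/core-site.xml fs.defaultFS should print hdfs://localhost:9000. On a working installation, hdfs getconf -confKey fs.defaultFS reports the value Hadoop itself resolves.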
5. Pseudo-distributed mode: format and start
1. Format the NameNode
hdfs namenode -format
Some guides use
bin/hdfs namenode -format
If you have added Hadoop's bin directory to PATH, you can run hdfs namenode -format directly.
2. Start the cluster
start-dfs.sh
The single-node pseudo-distributed installation is now complete.
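Before opening the web page, the daemons can be checked with jps, the JDK tool that lists running Java processes. In pseudo-distributed mode start-dfs.sh starts a NameNode, a DataNode, and a SecondaryNameNode. A sketch that reports which of them are missing from the jps listing; the function name missing_daemons is hypothetical.

```shell
# Read `jps` output on stdin and print each expected daemon that is not listed.
# Returns 0 only when every expected daemon is present.
# missing_daemons is a hypothetical helper name.
missing_daemons() {
    listing=$(cat)
    rc=0
    for daemon in "$@"; do
        # jps prints "<pid> <MainClass>" per line; compare the class name exactly,
        # because "NameNode" is also a substring of "SecondaryNameNode"
        if ! printf '%s\n' "$listing" |
             awk -v d="$daemon" '$2 == d { ok = 1 } END { exit(ok ? 0 : 1) }'; then
            echo "not running: $daemon"
            rc=1
        fi
    done
    return "$rc"
}
```

A typical call would be jps | missing_daemons NameNode DataNode SecondaryNameNode, which prints nothing when all three are up.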
Verify
Open the following address in a browser:
http://localhost:50070/
If Hadoop is installed in a virtual machine but you are browsing from the host machine, use the virtual machine's IP address instead.
Here the virtual machine's IP address is 10.0.0.81
So in my case the address is
http://10.0.0.81:50070/
With the configuration so far we can already run wordcount, but MapReduce does not run on YARN. If you want your programs to run on YARN, continue with the following configuration.
#####################################################
6. Configure YARN
1. Modify the configuration files
Modify the configuration file mapred-site.xml
There is no mapred-site.xml in etc/hadoop, so first copy mapred-site.xml.template:
cp mapred-site.xml.template mapred-site.xml
Then edit mapred-site.xml and add:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
As in the files above, the property goes inside the <configuration> element.
Modify the configuration file yarn-site.xml
Add the following content:
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
2. Start YARN
start-yarn.sh
(Since I have added Hadoop's sbin directory to PATH, I can run start-yarn.sh from anywhere.)
If you have not configured the environment variable, go to HADOOP_HOME and run:
sbin/start-yarn.sh
3. Verification
After starting YARN, open
http://localhost:8088/
You should see the ResourceManager web interface.
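On a machine without a browser, the same verification can be done from the command line. A sketch that probes the two web addresses used in this article with curl; it assumes curl is installed, and the function names ui_up and check_hadoop_uis are hypothetical.

```shell
# Return 0 if the URL answers an HTTP request within 5 seconds.
# ui_up is a hypothetical helper name; it requires curl.
ui_up() {
    curl --silent --fail --max-time 5 --output /dev/null "$1"
}

# Print "up" or "down" for each URL given as an argument.
# check_hadoop_uis is a hypothetical helper name.
check_hadoop_uis() {
    for url in "$@"; do
        if ui_up "$url"; then
            echo "up: $url"
        else
            echo "down: $url"
        fi
    done
}
```

For this setup the call would be check_hadoop_uis http://localhost:50070/ http://localhost:8088/, with localhost replaced by the virtual machine's IP address when checking from the host machine.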
Next article: hadoop2.7 run WordCount
Problems encountered
Problem 1:
Error: Could not find or load main class org.apache.hadoop.hdfs.server.namenode.NameNode
Workaround:
Add to ~/hadoop-2.7.0/etc/hadoop/hadoop-env.sh
export HADOOP_COMMON_HOME=~/hadoop-2.7.0
Restart for the change to take effect.
Problem 2:
JAVA_HOME is not found when formatting:
bin/hdfs namenode -format
Add to /etc/environment
export JAVA_HOME=/usr/jdk
Make it take effect:
source /etc/environment
If that is still not enough, reboot:
sudo init 6
hadoop2.7 [single node]: standalone, pseudo-distributed, and distributed installation guide