Most of the single-node Hadoop installation articles you can find online are, frankly, rubbish: the bloggers copy each other and clearly never ran the installation themselves. I couldn't stand it any longer, so I wrote my own summary. If you have any questions, feel free to contact me!
e-mail: [Email protected]
I will use version 2.2.0 as the example. Note that this is a single-node, pseudo-distributed setup! If that's not what you want, stop reading here.
Hadoop 2.2.0 (YARN) Installation: Detailed Process
One. Running the Hadoop Pseudo-Distributed Platform in Single-Node Mode
1. Software Environment
1. Operating system:
Because Linux is open source and secure, existing Hadoop clusters run on Linux platforms. Here I use Ubuntu 12.04 LTS to build the Hadoop pseudo-distributed platform; Windows does not let developers fully experience the advantages and performance of the Hadoop framework. The installation steps are essentially the same on other Linux distributions such as Red Hat, Debian, Gentoo, SuSE, and Mandriva.
2. Java environment:
I use the OpenJDK 7 package recommended on Ubuntu. If you use Sun's (Oracle's) JDK instead, you need to set the environment variables yourself, which I will not repeat here.
3. SSH: The Hadoop platform involves communication between the ResourceManager and the NodeManagers, between the ApplicationMaster and its containers, and for user data transfer. To support this, SSH must be configured for passwordless login.
4. Hadoop version: 2.2.0 (the steps should apply to the other 2.x releases on the Apache website as well)
All of the software above can be downloaded from the corresponding open-source official websites; on Ubuntu, the Java environment and SSH can be obtained with the apt-get install command.
Note: the file-editing steps below use vim.
Install the required software from these sources:

Name | Source
Ubuntu | http://www.ubuntu.org.cn/download
Hadoop 2.2.0 | http://www.apache.org/dyn/closer.cgi/hadoop/common/
MyEclipse | http://www.myeclipseide.cn
Java JDK | Ubuntu console command: sudo apt-get install openjdk-7-jdk
SSH | Ubuntu console command: sudo apt-get install openssh-server
vim | Ubuntu console command: sudo apt-get install vim
Two. The Installation Process
2.1 Pre-installation preparation
Before configuring YARN, install the Java JDK, SSH, and vim, and download the Hadoop package from the Apache website.
Test that the Java JDK installed successfully by running java -version.
Test that vim installed successfully by running vim.
SSH will be tested later, when passwordless login is configured.
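The three checks above can also be scripted. A minimal sketch (it only checks that the java, ssh, and vim commands are on the PATH, not their versions):

```shell
#!/bin/sh
# Report whether each prerequisite command is available on the PATH.
# Presence check only; run `java -version` etc. by hand for version info.
report=""
for tool in java ssh vim; do
  if command -v "$tool" >/dev/null 2>&1; then
    report="$report$tool: present
"
  else
    report="$report$tool: MISSING - install it before continuing
"
  fi
done
printf '%s' "$report"
```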
2.2 Give the hadoop user root (sudo) privileges (during the Ubuntu 12.04 installation the user name was set to hadoop)
Console input: hadoop@ubuntu:~$ sudo vim /etc/sudoers
Add hadoop's privileges under the root entry:
hadoop ALL=(ALL:ALL) ALL
If Ubuntu is already installed and you still need to add the hadoop user and user group, do the following:
1. Add the hadoop user group: sudo addgroup hadoop
2. Add the hadoop user and place it in the hadoop group: sudo adduser --ingroup hadoop hadoop
3. Give the hadoop user root privileges in /etc/sudoers as shown above.
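A safety note on step 3: /etc/sudoers should only ever be edited through sudo visudo, which checks the syntax before saving. The sketch below just illustrates the exact format of the entry by writing it to a temporary file, not the real sudoers file:

```shell
#!/bin/sh
# Illustrate the sudoers entry format without touching /etc/sudoers.
# In practice: run `sudo visudo` and add the line under the root entry.
tmp=$(mktemp)
printf 'hadoop ALL=(ALL:ALL) ALL\n' >> "$tmp"
# The three ALLs mean: any host, any (user:group) identity, any command.
entry=$(grep '^hadoop' "$tmp")
echo "$entry"
rm -f "$tmp"
```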
2.3 Configure passwordless SSH access
Once passwordless access is configured, the user no longer has to type a password for each login. The console commands are as follows:
1. Create an SSH key using the common RSA encryption scheme: ssh-keygen -t rsa -P ""
Press Enter at the prompts; two paired files are generated under ~/.ssh/: id_rsa and id_rsa.pub.
2. Change into the ~/.ssh directory:
hadoop@ubuntu:~$ cd ~/.ssh
3. Append id_rsa.pub to the authorized_keys authorization file (there is no authorized_keys file at first):
hadoop@ubuntu:~/.ssh$ cat id_rsa.pub >> authorized_keys
Test the passwordless login:
hadoop@ubuntu:~$ ssh localhost
If you are logged in without being asked for a password, the configuration succeeded.
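One pitfall the steps above skip: sshd silently ignores keys when the permissions on ~/.ssh or authorized_keys are too open, so passwordless login keeps asking for a password. A sketch of the expected permissions, using a temporary directory to stand in for the real home directory:

```shell
#!/bin/sh
# Demonstrate the permissions sshd expects: 700 on ~/.ssh and 600 on
# authorized_keys. A temp dir stands in for the real home directory here;
# on the real system run the chmod commands against ~/.ssh directly.
home=$(mktemp -d)
mkdir -p "$home/.ssh"
chmod 700 "$home/.ssh"
touch "$home/.ssh/authorized_keys"
chmod 600 "$home/.ssh/authorized_keys"
dirmode=$(stat -c %a "$home/.ssh")
filemode=$(stat -c %a "$home/.ssh/authorized_keys")
echo "dir=$dirmode file=$filemode"
rm -rf "$home"
```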
2.4 Install the second-generation Hadoop framework
Unpack the downloaded hadoop-2.2.0.tar.gz with tar xvzf, place it under /usr/local (renaming the directory to hadoop), and then change into /usr/local/hadoop/etc/hadoop.
The specific file configuration is as follows:
1. hadoop-env.sh configuration
Add the Java environment to hadoop-env.sh (this machine uses OpenJDK 7):
Open: hadoop@ubuntu:~$ vim /usr/local/hadoop/etc/hadoop/hadoop-env.sh
Add: export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-i386
2. yarn-env.sh configuration
Add the Java environment to yarn-env.sh:
Open: hadoop@ubuntu:~$ vim /usr/local/hadoop/etc/hadoop/yarn-env.sh
Add: export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-i386
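Both env files receive the same export line. This sketch appends it to stand-in copies in a temp directory; the real files live in /usr/local/hadoop/etc/hadoop, and the JDK path is the i386 OpenJDK 7 path used in this article (adjust it to wherever your JDK actually is):

```shell
#!/bin/sh
# Append the JAVA_HOME export to stand-ins for hadoop-env.sh / yarn-env.sh.
# On the real system, point $conf at /usr/local/hadoop/etc/hadoop instead.
conf=$(mktemp -d)
jdk=/usr/lib/jvm/java-7-openjdk-i386   # path from this article; adjust for your JDK
for f in hadoop-env.sh yarn-env.sh; do
  touch "$conf/$f"
  printf 'export JAVA_HOME=%s\n' "$jdk" >> "$conf/$f"
done
# Count how many files now define JAVA_HOME (should be both).
hits=$(grep -l '^export JAVA_HOME=' "$conf"/*.sh | wc -l)
echo "files updated: $hits"
rm -rf "$conf"
```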
3. core-site.xml configuration
Open: hadoop@ubuntu:~$ vim /usr/local/hadoop/etc/hadoop/core-site.xml
Add:
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
4. hdfs-site.xml configuration
Open: hadoop@ubuntu:~$ vim /usr/local/hadoop/etc/hadoop/hdfs-site.xml
Add:
<property>
<name>dfs.namenode.name.dir</name>
<value>/usr/local/hadoop/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/usr/local/hadoop/dfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
5. mapred-site.xml configuration
Open: hadoop@ubuntu:~$ vim /usr/local/hadoop/etc/hadoop/mapred-site.xml
Add:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
6. yarn-site.xml configuration
Open: hadoop@ubuntu:~$ vim /usr/local/hadoop/etc/hadoop/yarn-site.xml
Add:
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
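One pitfall with all four files: each <property> fragment above must go inside the file's existing <configuration>...</configuration> element, or the daemons will fail to parse the file. A sketch that writes a complete core-site.xml to a temp directory (localhost:9000 as in this article; the other three files follow the same pattern):

```shell
#!/bin/sh
# Write a complete core-site.xml to show that the <property> fragment
# must be wrapped in <configuration> tags. Temp dir stands in for
# /usr/local/hadoop/etc/hadoop.
conf=$(mktemp -d)
cat > "$conf/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF
# Sanity check: exactly one opening <configuration> element.
ok=$(grep -c '<configuration>' "$conf/core-site.xml")
echo "configuration elements: $ok"
rm -rf "$conf"
```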
After the YARN configuration is complete, start the services one at a time as follows:
1. Start HDFS:
hadoop@ubuntu:/usr/local/hadoop$ bin/hdfs namenode -format
hadoop@ubuntu:/usr/local/hadoop$ sbin/hadoop-daemon.sh start namenode
hadoop@ubuntu:/usr/local/hadoop$ sbin/hadoop-daemon.sh start datanode
2. Start YARN:
hadoop@ubuntu:/usr/local/hadoop$ sbin/yarn-daemon.sh start resourcemanager
hadoop@ubuntu:/usr/local/hadoop$ sbin/yarn-daemon.sh start nodemanager
(Or start everything at once: hadoop@ubuntu:/usr/local/hadoop$ sbin/start-all.sh, though in 2.x this script is deprecated in favour of start-dfs.sh and start-yarn.sh.)
Check the running daemons:
hadoop@ubuntu:~$ jps
If a daemon fails to start, check its log: the logs live in the /usr/local/hadoop/logs directory, and the NameNode, DataNode, NodeManager, and ResourceManager are each recorded in a separate log file to make searching easier.
If jps lists the NameNode, DataNode, ResourceManager, and NodeManager processes, the daemons started successfully.
Enter http://localhost:8088 in the browser to check that the YARN web interface is up.
To check that HDFS is working normally, enter http://localhost:50070 in the browser.
Click "Browse the filesystem" to enter the HDFS directory listing; if it loads, the HDFS installation is complete.
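The two browser checks above can also be done from the shell. A sketch (assumes curl is installed; the UP/DOWN wording is mine, not Hadoop's):

```shell
#!/bin/sh
# Probe the YARN (8088) and HDFS (50070) web UIs and print UP or DOWN.
probe() {
  if command -v curl >/dev/null 2>&1 &&
     curl -s --max-time 3 -o /dev/null "http://localhost:$1"; then
    echo "port $1: UP"
  else
    echo "port $1: DOWN (or curl missing)"
  fi
}
result="$(probe 8088)
$(probe 50070)"
printf '%s\n' "$result"
```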
With the configuration above done and the tests passing, our second-generation Hadoop platform (pseudo-distributed) is fully installed. Now we can use it to do a lot of interesting things.