Hadoop 2.x pseudo-distributed installation details ["non-plagiarism, original!"]


There are plenty of articles online about single-node Hadoop installation. Frankly, most of them are rubbish: the bloggers copy from one another without knowing whether their own install even works. I couldn't bear to read any more of them, so I had to write my own summary. If you have any questions, feel free to contact me!

e-mail: [Email protected]

I'll use 2.2.0 as the example. Note that this is a single node!! Pseudo-distributed!!

Hadoop 2.2.0 (YARN) installation: the detailed process

Part One: The Hadoop pseudo-distributed platform in single-machine mode

1. Software environment

1. Operating system:

Because of Linux's openness, security, and other strengths, existing Hadoop clusters run on Linux; here I use the Ubuntu 12.04 LTS release to build the Hadoop pseudo-distributed platform. Using Windows does not let developers experience the advantages of the Hadoop framework and its efficient performance as well. Among Linux systems, the installation is the same on Red Hat, Ubuntu, Debian, Gentoo, SuSE, and Mandriva.

2. Java Environment:

I use the java-7-openjdk packages recommended on Ubuntu. If you use Sun's JDK instead, you need to add the environment variables to the system yourself, which I won't repeat here.

3. SSH: On the Hadoop platform, the ResourceManager communicates with the NodeManagers, the ApplicationMaster communicates with its containers, user data moves between processes, and so on. For this communication to work, you need to configure passwordless SSH.

4. Hadoop version: hadoop-2.2.0 (other releases on the Apache website install in much the same way).

All of the software above can be downloaded from the corresponding open-source official sites; on Ubuntu, the Java environment and SSH can be obtained with the apt-get install command.

Note: the configuration edits below use vim as the editor.

Sources for the required software:

Name          Where to get it
Ubuntu        http://www.ubuntu.org.cn/download
Hadoop 2.2.0  http://www.apache.org/dyn/closer.cgi/hadoop/common/
MyEclipse     http://www.myeclipseide.cn
Java JDK      Ubuntu console command: sudo apt-get install openjdk-7-jdk
SSH           Ubuntu console command: sudo apt-get install openssh-server
vim           Ubuntu console command: sudo apt-get install vim
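On Ubuntu, the three apt packages above can be pulled in with one command; a minimal sketch, assuming Ubuntu 12.04's standard repositories:

sudo apt-get update
# openjdk-7-jdk provides the Java 7 JDK; openssh-server provides the sshd daemon
sudo apt-get install -y openjdk-7-jdk openssh-server vim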

Part Two: The installation process

2.1 Pre-installation preparation

Before configuring YARN, you need to install the Java JDK, SSH, and vim, and download the Hadoop package from the Apache website to the local machine.

To test that the JDK installed successfully, run java -version and confirm that version output appears.

To test vim, simply launch vim and confirm the editor opens.

SSH is exercised below, when passwordless login is configured. The exact check commands are collected after this list.
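A minimal set of sanity checks, assuming the apt packages above installed cleanly:

java -version               # prints the OpenJDK 7 version string
ssh -V                      # prints the OpenSSH version (on stderr)
vim --version | head -n 1   # prints vim's version line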

2.2 Give the hadoop user root permissions (during the Ubuntu 12.04 installation, the user name was set to hadoop)

Console input: sudo vim /etc/sudoers

Under the root entry, add the same permissions for hadoop:

hadoop ALL=(ALL:ALL) ALL

If Ubuntu is already installed and you still need to add the hadoop user and group, do the following (collected into one sketch after this list):

1. Add the hadoop user group: sudo addgroup hadoop

2. Add the hadoop user and put it in the hadoop group: sudo adduser --ingroup hadoop hadoop

3. Give hadoop root permissions, as described above.
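The three steps in one place; the last line is an alternative to editing /etc/sudoers by hand, since on Ubuntu members of the sudo group already have root privileges:

sudo addgroup hadoop                  # 1. create the hadoop group
sudo adduser --ingroup hadoop hadoop  # 2. create the hadoop user inside it
sudo adduser hadoop sudo              # 3. grant root rights via the sudo group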

2.3 Configure SSH passwordless access

After passwordless access is configured, the user no longer has to keep typing the access password. The console commands are as follows:

1. Create an SSH key, using the common RSA encryption scheme: ssh-keygen -t rsa -P ""

Press Enter at the prompt and two paired files are generated under ~/.ssh/: id_rsa and id_rsa.pub.

2. Enter the ~/.ssh directory:

cd ~/.ssh

3. Append id_rsa.pub to the authorized_keys authorization file (at the start there is no authorized_keys file):

cat id_rsa.pub >> authorized_keys

Test that the login works:

ssh localhost

If you get a shell without being asked for a password, passwordless SSH is set up correctly.
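The whole exchange from start to finish; the chmod is a precaution worth taking, since sshd refuses key files that are group- or world-readable on some setups:

ssh-keygen -t rsa -P ""            # accept the default file (~/.ssh/id_rsa) with Enter
cd ~/.ssh
cat id_rsa.pub >> authorized_keys  # authorize our own public key
chmod 600 authorized_keys          # tighten permissions so sshd accepts the file
ssh localhost                      # should log in without asking for a password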

2.4 Installing the second-generation Hadoop framework

Unpack the downloaded hadoop-2.2.0 tar.gz with tar xvzf, place it under /usr/local, and then change into /usr/local/hadoop/etc/hadoop; the full sequence is sketched below.
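A minimal command sequence, assuming the tarball was saved as hadoop-2.2.0.tar.gz in the current directory:

sudo tar -xvzf hadoop-2.2.0.tar.gz -C /usr/local   # unpack under /usr/local
sudo mv /usr/local/hadoop-2.2.0 /usr/local/hadoop  # rename to the path used below
sudo chown -R hadoop:hadoop /usr/local/hadoop      # let the hadoop user own the tree
cd /usr/local/hadoop/etc/hadoop                    # all the config files live here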

The specific file configuration is as follows:

1. hadoop-env.sh configuration

Add the Java environment in hadoop-env.sh (this machine uses OpenJDK 7 as the example).

Open: vim /usr/local/hadoop/etc/hadoop/hadoop-env.sh

Add: export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-i386

2. yarn-env.sh configuration

Add the Java environment to yarn-env.sh.

Open: vim /usr/local/hadoop/etc/hadoop/yarn-env.sh

Add: export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-i386
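If you prefer the console to an editor, the same export can be appended to both scripts in one command; the JVM path assumes a 32-bit Ubuntu OpenJDK 7 install, so check ls /usr/lib/jvm for the right directory on your machine:

echo 'export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-i386' | sudo tee -a /usr/local/hadoop/etc/hadoop/hadoop-env.sh /usr/local/hadoop/etc/hadoop/yarn-env.sh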

3. core-site.xml configuration

Open: vim /usr/local/hadoop/etc/hadoop/core-site.xml

Add:

<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:9000</value>
</property>
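Keep in mind that every property in these files must sit inside the top-level <configuration> element; a complete minimal core-site.xml therefore looks like this (fs.default.name still works in 2.2.0, though fs.defaultFS is its newer name):

<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>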

4. hdfs-site.xml configuration

Open: vim /usr/local/hadoop/etc/hadoop/hdfs-site.xml

Add (note the property name is dfs.replication, and on a single node one replica is the sensible value):

<property>
  <name>dfs.namenode.name.dir</name>
  <value>/usr/local/hadoop/dfs/name</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/usr/local/hadoop/dfs/data</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
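The NameNode format step below creates the name directory itself, but it does no harm to create both directories up front as the hadoop user, so ownership is right from the start:

mkdir -p /usr/local/hadoop/dfs/name /usr/local/hadoop/dfs/data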

5. mapred-site.xml configuration

Open: vim /usr/local/hadoop/etc/hadoop/mapred-site.xml

Add:

<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
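One 2.2.0-specific catch: the release ships only a template for this file, so if mapred-site.xml does not exist yet, copy it from the template before editing:

cd /usr/local/hadoop/etc/hadoop
cp mapred-site.xml.template mapred-site.xml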

6. yarn-site.xml configuration

Open: vim /usr/local/hadoop/etc/hadoop/yarn-site.xml

Add (the class property key is derived from the aux-service name, so it must read mapreduce_shuffle):

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>

Once the YARN configuration is complete, start the services one at a time, as follows:

1. Start HDFS (run these from /usr/local/hadoop):

bin/hdfs namenode -format

sbin/hadoop-daemon.sh start namenode

sbin/hadoop-daemon.sh start datanode

2. Start YARN:

sbin/yarn-daemon.sh start resourcemanager

sbin/yarn-daemon.sh start nodemanager

(Or start everything with one command: sbin/start-all.sh)

Then list the running Java processes:

jps

If a daemon does not start, go look at the logs, which live under the /usr/local/hadoop/logs directory; the NameNode, DataNode, NodeManager, and ResourceManager are each recorded in separate log files, which makes them easy to search.

The installation succeeded if jps lists all four daemons.
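The output should show one of each daemon plus jps itself, something like this (the process IDs will of course differ):

4866 NameNode
5012 DataNode
5399 ResourceManager
5621 NodeManager
5734 Jps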

Enter http://localhost:8088 in the browser to test that the YARN web port is up and serving.

To test that the HDFS port is working normally, enter http://localhost:50070 in the browser.

Click "Browse the FileSystem" to enter the HDFs directory and the HDFs installation is complete, as shown in:

With the configuration above done and the tests passing, our second-generation Hadoop platform (pseudo-distributed) is fully installed. Now we can use it to do a lot of interesting things.
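As a first interesting thing to try, the examples jar bundled with the release makes a handy smoke test; if this small pi-estimation job runs to completion and prints an estimate, HDFS and YARN are working end to end:

cd /usr/local/hadoop
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar pi 2 5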
