Deploying the Hadoop Environment on Ubuntu 14.04 (learning notes)

First step: Prepare three virtual machines and create a hadoop user on each (covered in the next step). Modify the hosts file on every machine: sudo vim /etc/hosts
127.0.0.1       localhost  
#127.0.1.1      ubuntu-14.04-server     ubuntu-14  # be sure to comment this line out
10.0.83.201 CDH  
10.0.83.202 CDH1  
10.0.83.173 CDH2
Also modify the hostname of each host: sudo vim /etc/hostname
CDH
(use CDH1 and CDH2 on the other two machines, respectively)
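A quick check that the names resolve correctly from each machine (host names as defined in the hosts file above):

ping -c 2 CDH1
ping -c 2 CDH2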
Second step: Create a new user on all three hosts and set up passwordless SSH login between them. First, create a user called hadoop on each host, as follows:
sudo useradd -m hadoop -s /bin/bash
sudo passwd hadoop
sudo adduser hadoop sudo
sudo gpasswd -a hadoop root
# hadoop is an arbitrary user name; just make sure all 3 servers use the same one
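To confirm the account and its group membership, id is a standard command for this:

id hadoop
# the output should include the sudo and root groups added above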
Next, install SSH on each machine: sudo apt-get install openssh-server
Then set up passwordless login; see http://blog.csdn.net/thinkpadshi/article/details/46518457
or my own write-up: http://blog.csdn.net/u012969412/article/details/60961161
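The posts above cover the details; a minimal sketch of the usual approach, run as the hadoop user on CDH (assuming ssh-copy-id is available), is:

ssh-keygen -t rsa              # accept the defaults, empty passphrase
ssh-copy-id hadoop@CDH         # authorize login to the local machine as well
ssh-copy-id hadoop@CDH1
ssh-copy-id hadoop@CDH2
ssh CDH1                       # should now log in without a password

Repeat on CDH1 and CDH2 if you want passwordless login in every direction.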
Third step: Install JDK 1.8. Reference: http://blog.csdn.net/u012969412/article/details/58056270
Install the JDK into the directory /usr/local/java.
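A minimal sketch, assuming the JDK tarball is named jdk-8u121-linux-x64.tar.gz (the exact file name depends on the build you download):

sudo mkdir -p /usr/local/java
sudo tar -zxvf jdk-8u121-linux-x64.tar.gz -C /usr/local/java   # assumed file name
ls /usr/local/java             # should show jdk1.8.0_121, matching JAVA_HOME below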
Fourth step: Download the Hadoop installation file. Download address: http://mirrors.hust.edu.cn/apache/hadoop/common/
Download it to the hadoop user's home directory (/home/hadoop/). All three hosts need to install Hadoop.
Run: wget -O hadoop-2.7.3.tar.gz "http://mirrors.hust.edu.cn/apache/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz"
Be sure to run the decompression command as the hadoop user: tar -zxvf hadoop-2.7.3.tar.gz, which unpacks Hadoop into /home/hadoop/hadoop-2.7.3.

Add the following environment variables (JAVA_HOME, HADOOP_HOME, etc.) to /etc/profile:

# Java ENV
export JAVA_HOME=/usr/local/java/jdk1.8.0_121
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
# Hadoop ENV
export HADOOP_HOME=/home/hadoop/hadoop-2.7.3
export HADOOP_PREFIX=${HADOOP_HOME}
export PATH=$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin
export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
export HADOOP_YARN_HOME=${HADOOP_PREFIX}

On each machine, run source /etc/profile to make the environment variables take effect.
On each machine, run hadoop version to check that Hadoop was installed successfully.
Fifth step: Turn off the firewall

$ sudo apt-get install ufw
$ sudo ufw disable
$ sudo ufw status
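After disabling, ufw status should report Status: inactive.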
Sixth step: Create some directories under the hadoop-2.7.3 directory.
1. Create the hadoop.tmp.dir directory from core-site.xml: hadoop-2.7.3/tmp  # start-dfs.sh does not create this directory automatically
2. Create the dfs.namenode.name.dir directory from hdfs-site.xml: hadoop-2.7.3/dfs/name  # start-dfs.sh creates this directory automatically
3. Create the dfs.datanode.data.dir directory from hdfs-site.xml: hadoop-2.7.3/dfs/data  # start-dfs.sh creates this directory automatically
4. Create the dfs.journalnode.edits.dir directory from hdfs-site.xml: hadoop-2.7.3/dfs/journal  # start-dfs.sh creates this directory automatically

5. Create the JournalNode log directory: hadoop-2.7.3/logs  # start-dfs.sh creates this directory automatically (a sketch of creating all of these directories follows this list)
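A minimal sketch of creating the directories above as the hadoop user (per the notes, only tmp strictly has to exist in advance, but creating them all does no harm):

$ cd /home/hadoop/hadoop-2.7.3
$ mkdir -p tmp dfs/name dfs/data dfs/journal logs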
Seventh step: Modify the Hadoop configuration files; make the same configuration on the other two machines.
(1) hadoop-env.sh
Add the following JAVA_HOME configuration:

export JAVA_HOME=/usr/local/java/jdk1.8.0_121
(2) core-site.xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://cdh:8020</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/hadoop-2.7.3/tmp</value>
    </property>
</configuration>
(3) hdfs-site.xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
</configuration>
This keeps three replicas of each data block.
(4) mapred-site.xml (this file does not exist by default; create it yourself, using the template shipped with Hadoop as a starting point)
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
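A minimal sketch of creating the file, assuming the stock mapred-site.xml.template that ships with Hadoop 2.7.3, then adding the property above:

$ cd /home/hadoop/hadoop-2.7.3/etc/hadoop
$ cp mapred-site.xml.template mapred-site.xml   # template name assumed from a stock install
$ vim mapred-site.xml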
(5) yarn-env.sh
Add the JAVA_HOME configuration:
export JAVA_HOME=/usr/local/java/jdk1.8.0_121
(6) yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>cdh</value>
    </property>
</configuration>

(7) slaves

CDH1
CDH2
CDH (the master) is also listed, so it serves as both NameNode and DataNode.
Make the same configuration on CDH1 and CDH2

scp /home/hadoop/hadoop-2.7.3/etc/hadoop/*  hadoop@10.0.83.202:/home/hadoop/hadoop-2.7.3/etc/hadoop/   # then adjust the highlighted (blue) values on CDH1
scp /home/hadoop/hadoop-2.7.3/etc/hadoop/*  hadoop@10.0.83.173:/home/hadoop/hadoop-2.7.3/etc/hadoop/   # then adjust the highlighted (blue) values on CDH2
Eighth step: Start HDFS
Starting the HDFS cluster for the first time:

1. Execute the following command:

$ start-dfs.sh

The goal is to start the JournalNode on all nodes so that they can exchange information with one another.
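A quick way to see which daemons came up on a node is jps, which ships with the JDK; on a correctly configured HA node the JournalNode process should appear in its output:

$ jps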

2. On the nn1 node, format the NameNode metadata and start the NameNode of nn1:

$ hdfs namenode -format
$ start-dfs.sh

3. On the other NameNode nodes (nn2, nn3, etc.), synchronize the metadata that nn1 just initialized, then start the NameNodes on those nodes:

$ hdfs namenode -bootstrapStandby
# then, on the nn1 node, run: $ start-dfs.sh

4. Switch the nn1 node from the standby state to the active state:

$ hdfs haadmin -transitionToActive nn1

5. Check the HDFS service state:

$ hdfs haadmin -getServiceState nn1
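If everything went well this should print active for nn1 (and standby for the other NameNodes).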

The order of these steps must not be changed.

6. Create a working directory for the hadoop user in HDFS:

$ hdfs dfs -mkdir -p /user/hadoop
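To confirm the directory was created (ls is a standard hdfs dfs subcommand):

$ hdfs dfs -ls /user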

When it is not the first start of the HDFS cluster, just run:

$ start-dfs.sh
$ hdfs haadmin -transitionToActive nn1

Ninth step: Work with HDFS and configure the jar files

Reference URL: http://blog.csdn.net/u012969412/article/details/64126714

When working with HDFS from Java, classes are often not found; the jars under hadoop-2.7.3/share/hadoop/hdfs/ need to be added to the CLASSPATH environment variable:

for f in $HADOOP_HOME/share/hadoop/hdfs/*.jar; do
    CLASSPATH=${CLASSPATH}:$f
done
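An alternative sketch: the standard hadoop classpath subcommand prints the full classpath Hadoop itself uses (common, hdfs, and the other modules), which can be reused directly; MyHdfsClient below is a hypothetical client class:

$ export CLASSPATH=$(hadoop classpath):$CLASSPATH
$ javac MyHdfsClient.java      # hypothetical client class
$ java MyHdfsClient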
