hadoop-2.3.0-cdh5.1.0 Pseudo-Distributed Installation (CentOS-based)

I. Environment

Operating system: CentOS 6.5, 64-bit

Note: Hadoop 2.0 and above requires JDK 1.7. Uninstall the JDK that ships with the Linux distribution and install JDK 1.7 instead.

Download Address: http://www.oracle.com/technetwork/java/javase/downloads/index.html

Software versions: hadoop-2.3.0-cdh5.1.0.tar.gz, zookeeper-3.4.5-cdh5.1.0.tar.gz

Download Address: http://archive.cloudera.com/cdh5/cdh/5/
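For example, the tarballs can usually be fetched straight from that archive with wget; the exact file paths below are an assumption, so verify them against the archive listing:

  wget http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.3.0-cdh5.1.0.tar.gz
  wget http://archive.cloudera.com/cdh5/cdh/5/zookeeper-3.4.5-cdh5.1.0.tar.gz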

Now start the installation.

II. JDK Installation

1. Check whether a JDK is already installed

rpm -qa | grep jdk

java-1.6.0-openjdk-1.6.0.0-1.45.1.11.1.el6.i686

2. Uninstall the bundled JDK

yum -y remove java-1.6.0-openjdk-1.6.0.0-1.45.1.11.1.el6.i686

Or:

rpm -e --nodeps java-1.7.0-openjdk-1.7.0.45-2.4.3.3.el6.x86_64
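After the removal, running the query from step 1 again is a quick sanity check; it should print nothing:

  rpm -qa | grep jdk    # no output means the bundled JDK is gone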

3. Install jdk-7u55-linux-x64.tar.gz

Create a java folder under /usr, then run tar -zxvf jdk-7u55-linux-x64.tar.gz inside that folder to extract the JDK into /usr/java.
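Put together, the steps above look roughly like this (it is assumed the tarball has been copied into /usr/java first; adjust the path to wherever the archive actually sits):

  mkdir /usr/java
  cd /usr/java
  # copy or move jdk-7u55-linux-x64.tar.gz here first, then extract it
  tar -zxvf jdk-7u55-linux-x64.tar.gz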

[root@master01 java]# ls

jdk1.7.0_55

III. Configure Environment Variables

Run vim /etc/profile:

#/etc/profile
# System wide environment and startup programs, for login setup
# Functions and aliases go in /etc/bashrc

export JAVA_HOME=/usr/java/jdk1.7.0_55
export JRE_HOME=/usr/java/jdk1.7.0_55/jre
export CLASSPATH=/usr/java/jdk1.7.0_55/lib
export PATH=$JAVA_HOME/bin:$PATH


Save the changes, then run source /etc/profile to reload the environment variables.

Run java -version:

[root@master01 java]# java -version

java version "1.7.0_55"

Java(TM) SE Runtime Environment (build 1.7.0_55-b13)

Java HotSpot(TM) 64-Bit Server VM (build 24.55-b03, mixed mode)

The JDK is now configured successfully.

IV. System Configuration

1. Turn off the firewall

chkconfig iptables off (turns the firewall off permanently)
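chkconfig only takes effect from the next boot; on CentOS 6 the firewall can also be stopped immediately for the current session:

  service iptables stop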

Also configure the hostname and the /etc/hosts file, as sketched below.
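A minimal sketch of that step, assuming the hostname master01 and a placeholder IP of 192.168.1.100 (replace both with your own values):

  # /etc/sysconfig/network
  HOSTNAME=master01

  # /etc/hosts
  192.168.1.100   master01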

2. Passwordless SSH configuration

When Hadoop runs, the NameNode must manage the Hadoop daemons on remote machines: it connects to each DataNode over SSH (Secure Shell) to start or stop their processes, so SSH must work without a password prompt. We therefore set up passwordless communication from the NameNode to the DataNodes, and each DataNode likewise needs to be configured to connect back to the NameNode without a password.

Configure on each machine:

Open /etc/ssh/sshd_config with vi and enable:

RSAAuthentication yes       # enable RSA authentication
PubkeyAuthentication yes    # enable public/private key pair authentication

On master01, run ssh-keygen -t rsa -P "" and just press Enter at the prompt (no passphrase).

The keys are stored in the /root/.ssh directory by default.

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

[root@master01 .ssh]# ls

authorized_keys  id_rsa  id_rsa.pub  known_hosts

If you are not installing as the root user (this guide assumes a root installation), you also need to set the permissions with the following commands:

chmod 755 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
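To confirm passwordless login works before continuing, SSH to the local machine; it should not ask for a password (the first connection may ask you to accept the host key). In a real multi-node cluster the public key would also be copied to every DataNode, for example with ssh-copy-id and a placeholder hostname such as slave01:

  ssh localhost date           # should print the date without a password prompt
  # ssh-copy-id root@slave01   # only for a real cluster; slave01 is a placeholder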

V. Hadoop Pseudo-Distributed Configuration

5.1 Edit etc/hadoop/hadoop-env.sh (note: if JAVA_HOME already has a value, replace it with your own JAVA_HOME):

  # set to the root of your Java installation
  export JAVA_HOME=/usr/java/latest

  # assuming your installation directory is /usr/local/hadoop
  export HADOOP_PREFIX=/usr/local/hadoop


5.2 Add the Hadoop environment variable

export HADOOP_HOME=/usr/local/cdh/hadoop
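This line also goes into /etc/profile alongside the JDK variables. Adding the Hadoop bin and sbin directories to the PATH as well (not part of the original steps, but convenient) lets the commands in the following sections be run from any directory:

  export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

Run source /etc/profile again afterwards.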

5.3 Edit etc/hadoop/core-site.xml:

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>


Edit etc/hadoop/hdfs-site.xml (the /usr/local/cdh/hadoop/data/dfs/name directory must be created manually before formatting the NameNode, otherwise the format step fails; see the mkdir commands after this file):

<configuration>
    <property>
        <!-- enable WebHDFS -->
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/usr/local/cdh/hadoop/data/dfs/name</value>
        <description>Local directory where the NameNode stores the name table (fsimage); modify as needed</description>
    </property>
    <property>
        <name>dfs.namenode.edits.dir</name>
        <value>${dfs.namenode.name.dir}</value>
        <description>Local directory where the NameNode stores the transaction file (edits); modify as needed</description>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/usr/local/cdh/hadoop/data/dfs/data</value>
        <description>Local directory where the DataNode stores blocks; modify as needed</description>
    </property>
</configuration>
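As noted above, the NameNode and DataNode directories referenced in this file must exist before the NameNode is formatted, so create them first:

  mkdir -p /usr/local/cdh/hadoop/data/dfs/name
  mkdir -p /usr/local/cdh/hadoop/data/dfs/data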


Edit etc/hadoop/mapred-site.xml:

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
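Note: if etc/hadoop/mapred-site.xml does not exist in the unpacked tarball, it can usually be created from the shipped template first (an assumption; skip this if the file is already there):

  cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml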


Edit etc/hadoop/yarn-site.xml:

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>


VI. Start and Verify the Installation

First, format HDFS:

bin/hdfs namenode -format

Start:

  sbin/start-dfs.sh
  sbin/start-yarn.sh

Check the running processes with jps:

7448 ResourceManager

8277 SecondaryNameNode

7547 NodeManager

8079 DataNode

7975 NameNode

8401 Jps
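With all of these processes up, HDFS can also be checked from the command line; this is a quick sanity check rather than part of the original steps:

  bin/hdfs dfsadmin -report    # should list one live DataNode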

 
 
 
1.   Open a browser and go to the NameNode web UI: http://localhost:50070/
2.   Create the HDFS folders:
  $ bin/hdfs dfs -mkdir /user
  $ bin/hdfs dfs -mkdir /user/<username>
3.   Copy the input files:
  $ bin/hdfs dfs -put etc/hadoop input
4.   Run the example job:
  $ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.3.0-cdh5.1.0.jar grep input output 'dfs[a-z.]+'
5.   View the output:
  $ bin/hdfs dfs -get output output
  $ cat output/*
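When you are finished, the daemons can be stopped with the matching stop scripts:

  sbin/stop-yarn.sh
  sbin/stop-dfs.sh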
