Hadoop Installation and Configuration


I. System and Software Environment

1. Operating system

CentOS Release 6.5 (Final)

Kernel version: 2.6.32-431.el6.x86_64

master.fansik.com:192.168.83.118

node1.fansik.com:192.168.83.119

node2.fansik.com:192.168.83.120

2. JDK version: 1.7.0_75

3. Hadoop version: 2.7.2

II. Pre-installation Preparation

1. Turn off firewall and SELinux

# setenforce 0

# service iptables stop

2. Configure the hosts file (/etc/hosts) on all three machines with the following entries:

192.168.83.118 master.fansik.com

192.168.83.119 node1.fansik.com

192.168.83.120 node2.fansik.com
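The entries can simply be appended to /etc/hosts on each of the three machines; a minimal sketch, assuming the names are not already present:

# echo "192.168.83.118 master.fansik.com" >> /etc/hosts
# echo "192.168.83.119 node1.fansik.com" >> /etc/hosts
# echo "192.168.83.120 node2.fansik.com" >> /etc/hosts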

3. Generate the SSH key

On master.fansik.com, run the following and press Enter at every prompt:

# ssh-keygen

# scp ~/.ssh/id_rsa.pub node1.fansik.com:/root/.ssh/authorized_keys

# scp ~/.ssh/id_rsa.pub node2.fansik.com:/root/.ssh/authorized_keys

# chmod 600 /root/.ssh/authorized_keys  (run this on node1 and node2)
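To confirm that passwordless login works from the master, a quick check (not part of the original steps):

# ssh node1.fansik.com hostname
# ssh node2.fansik.com hostname

Each command should print the remote hostname without prompting for a password.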

4. Install the JDK

# tar xf jdk-7u75-linux-x64.tar.gz

# mv jdk1.7.0_75 /usr/local/jdk1.7

# vim /etc/profile.d/java.sh and add the following:

export JAVA_HOME=/usr/local/jdk1.7

export JRE_HOME=/usr/local/jdk1.7/jre

export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

export PATH=$PATH:$JAVA_HOME/bin

# source /etc/profile
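To verify that the new JDK is picked up, a quick check:

# java -version

This should report java version "1.7.0_75".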

5. Synchronize the time (otherwise there may be problems when analyzing files later)

# ntpdate 202.120.2.101  (an NTP server at Shanghai Jiao Tong University)
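If the clocks tend to drift, the sync can be repeated periodically; a minimal sketch using cron (the 30-minute interval is an assumption, adjust as needed), added via crontab -e:

*/30 * * * * /usr/sbin/ntpdate 202.120.2.101 >/dev/null 2>&1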

III. Install Hadoop

Download the appropriate version from the official Hadoop release page: http://hadoop.apache.org/releases.html

Perform the following operations on all three machines:

# tar xf hadoop-2.7.2.tar.gz

# mv hadoop-2.7.2 /usr/local/hadoop

# cd /usr/local/hadoop/

# mkdir tmp dfs dfs/data dfs/name

IV. Configure Hadoop

Perform the following configuration on master.fansik.com.

# vim /usr/local/hadoop/etc/hadoop/core-site.xml

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://192.168.83.118:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/usr/local/hadoop/tmp</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>121702</value>
  </property>
</configuration>

# vim /usr/local/hadoop/etc/hadoop/hdfs-site.xml

<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/hadoop/dfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>192.168.83.118:9001</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>

# cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml

# vim !$  (i.e. /usr/local/hadoop/etc/hadoop/mapred-site.xml)

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>192.168.83.118:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>192.168.83.118:19888</value>
  </property>
</configuration>

# vim /usr/local/hadoop/etc/hadoop/yarn-site.xml

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>192.168.83.118:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>192.168.83.118:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>192.168.83.118:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>192.168.83.118:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>192.168.83.118:8088</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>2048</value>
  </property>
</configuration>

# vim /usr/local/hadoop/etc/hadoop/slaves

192.168.83.119

192.168.83.120

Synchronize the etc directory from the master to node1 and node2:

# rsync -av /usr/local/hadoop/etc/ node1.fansik.com:/usr/local/hadoop/etc/

# rsync -av /usr/local/hadoop/etc/ node2.fansik.com:/usr/local/hadoop/etc/
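To confirm that the configuration files arrived unchanged, a dry-run comparison can be used (a sketch, not part of the original procedure):

# rsync -avcn /usr/local/hadoop/etc/ node1.fansik.com:/usr/local/hadoop/etc/

With -n (dry run) and -c (checksum), an empty file list means both sides are identical.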

Perform the following operations on master.fansik.com; the two slave nodes will be started automatically.

Configure environment variables for Hadoop

# vim /etc/profile.d/hadoop.sh

export PATH=/usr/local/hadoop/bin:/usr/local/hadoop/sbin:$PATH

# source /etc/profile
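To confirm that the new PATH is active, a quick check:

# hadoop version

This should print Hadoop 2.7.2 along with its build information.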

Initialize the NameNode:

# hdfs namenode -format

Check the exit status to see whether the format succeeded (0 means no error):

# echo $?

Start the services:

# start-all.sh

Stop the services:

# stop-all.sh

After the services start, the web interfaces are available at the following addresses:

http://192.168.83.118:8088

http://192.168.83.118:50070
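To verify that all daemons came up, jps can be run on each machine (a quick check; the exact list depends on the configuration above):

# jps

On the master this should typically show NameNode, SecondaryNameNode, and ResourceManager; on node1 and node2, DataNode and NodeManager.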

V. Test Hadoop

The following operations are performed on master.fansik.com.

# hdfs dfs -mkdir /fansik

If the following warning appears when creating the directory (it is harmless, but can be fixed):

16/07/29 17:38:27 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Workaround:

Go to the following sites to download the appropriate version:

http://dl.bintray.com/sequenceiq/sequenceiq-bin/

# tar -xvf hadoop-native-64-2.7.0.tar -C /usr/local/hadoop/lib/native/
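After extracting the native libraries, support can be verified with the standard checknative utility:

# hadoop checknative -a

The hadoop entry should now report true together with the path to libhadoop.so.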

If prompted: copyFromLocal: Cannot create directory /123/. Name node is in safe mode

this means that Hadoop is in safe mode. The workaround:

# hdfs dfsadmin -safemode leave
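The current state can also be checked before and after leaving safe mode:

# hdfs dfsadmin -safemode get

This prints "Safe mode is ON" or "Safe mode is OFF".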

Copy myservicce.sh to the /fansik directory:

# hdfs dfs -copyFromLocal ./myservicce.sh /fansik

Check that the myservicce.sh file is in the /fansik directory:

# hdfs dfs -ls /fansik

Analyze the file using wordcount:

# hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount /fansik/myservicce.sh /zhangshan/

To view the output files:

# hdfs dfs -ls /zhangshan/

Found 2 items

-rw-r--r--   2 root supergroup          0 2016-08-02 15:19 /zhangshan/_SUCCESS

-rw-r--r--   2 root supergroup        415 2016-08-02 15:19 /zhangshan/part-r-00000

To view the analysis results:

# hdfs dfs -cat /zhangshan/part-r-00000
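The wordcount output is a tab-separated list of tokens and their counts, one token per line; the exact contents depend on myservicce.sh, so the following is only an illustrative sketch:

#!/bin/bash	1
case	2
esac	1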
