First, system and software environment
1. Operating system
CentOS Release 6.5 (Final)
Kernel version: 2.6.32-431.el6.x86_64
master.fansik.com:192.168.83.118
node1.fansik.com:192.168.83.119
node2.fansik.com:192.168.83.120
2. JDK version: 1.7.0_75
3. Hadoop version: 2.7.2
Second, pre-installation preparation
1. Turn off firewall and SELinux
# setenforce 0
# service iptables stop
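These two commands only affect the running system; to keep iptables and SELinux off after a reboot (a sketch, assuming the stock CentOS 6 layout), something like the following can also be done:
# chkconfig iptables off
# vim /etc/selinux/config and set: SELINUX=disabled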
2. Configure the hosts file
192.168.83.118 master.fansik.com
192.168.83.119 node1.fansik.com
192.168.83.120 node2.fansik.com
3. Generate the SSH key pair
On master.fansik.com execute # ssh-keygen and press Enter at every prompt
# scp ~/.ssh/id_rsa.pub node1.fansik.com:/root/.ssh/authorized_keys
# scp ~/.ssh/id_rsa.pub node2.fansik.com:/root/.ssh/authorized_keys
# chmod 600 /root/.ssh/authorized_keys (on node1 and node2)
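To verify that passwordless login works, the following should print the node hostnames without asking for a password:
# ssh node1.fansik.com hostname
# ssh node2.fansik.com hostname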
4. Install the JDK
# tar xf jdk-7u75-linux-x64.tar.gz
# mv jdk1.7.0_75 /usr/local/jdk1.7
# vim /etc/profile.d/java.sh and add the following:
export JAVA_HOME=/usr/local/jdk1.7
export JRE_HOME=/usr/local/jdk1.7/jre
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
# source /etc/profile
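To confirm the JDK is on the PATH (it should report version 1.7.0_75):
# java -version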
5. Synchronize the time (otherwise there may be problems when analyzing files later)
# ntpdate 202.120.2.101 (an NTP server at Shanghai Jiao Tong University)
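To keep the clocks in sync automatically, a cron entry such as the following can be added on each machine (the interval is only an example):
# crontab -e
*/30 * * * * /usr/sbin/ntpdate 202.120.2.101 > /dev/null 2>&1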
Third, install Hadoop
Hadoop official download site, where you can choose the appropriate version: http://hadoop.apache.org/releases.html
Perform the following operations on all three machines:
# tar xf hadoop-2.7.2.tar.gz
# mv hadoop-2.7.2 /usr/local/hadoop
# cd /usr/local/hadoop/
# mkdir tmp dfs dfs/data dfs/name
Fourth, configuring Hadoop
Perform the configuration on master.fansik.com
# vim /usr/local/hadoop/etc/hadoop/core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://192.168.83.118:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/usr/local/hadoop/tmp</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>121702</value>
  </property>
</configuration>
# vim /usr/local/hadoop/etc/hadoop/hdfs-site.xml
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/hadoop/dfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>192.168.83.118:9001</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>
# cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml
# vim /usr/local/hadoop/etc/hadoop/mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>192.168.83.118:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>192.168.83.118:19888</value>
  </property>
</configuration>
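The jobhistory addresses configured above are only served if the JobHistory server is running; start-all.sh (used later) does not start it, so it can be launched separately if needed (assuming the default sbin layout):
# /usr/local/hadoop/sbin/mr-jobhistory-daemon.sh start historyserver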
# vim /usr/local/hadoop/etc/hadoop/yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>192.168.83.118:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>192.168.83.118:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>192.168.83.118:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>192.168.83.118:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>192.168.83.118:8088</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>2048</value>
  </property>
</configuration>
# vim /usr/local/hadoop/etc/hadoop/slaves
192.168.83.119
192.168.83.120
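Since /etc/hosts already maps the hostnames, the slaves file could equally list the hostnames instead of the IP addresses:
node1.fansik.com
node2.fansik.com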
Synchronize the etc directory on the master to node1 and node2:
# rsync -av /usr/local/hadoop/etc/ node1.fansik.com:/usr/local/hadoop/etc/
# rsync -av /usr/local/hadoop/etc/ node2.fansik.com:/usr/local/hadoop/etc/
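To confirm the sync worked, the copied configuration can be spot-checked from the master, for example:
# ssh node1.fansik.com ls /usr/local/hadoop/etc/hadoop/core-site.xml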
Perform the following operations on master.fansik.com; the two nodes will be started automatically.
Configure environment variables for Hadoop
# vim /etc/profile.d/hadoop.sh
export PATH=/usr/local/hadoop/bin:/usr/local/hadoop/sbin:$PATH
# source /etc/profile
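To confirm the environment variable took effect (it should report Hadoop 2.7.2):
# hadoop version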
Initialization (format the NameNode)
# hdfs namenode -format
Check the exit status to see whether an error occurred (0 means success):
# echo $?
Start the service
# start-all.sh
Stop the service
# stop-all.sh
After the services are started, they can be accessed at the following addresses:
http://192.168.83.118:8088
http://192.168.83.118:50070
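To confirm the daemons actually started, jps can be run on each machine; with this configuration the master should roughly show NameNode, SecondaryNameNode and ResourceManager, and each node should show DataNode and NodeManager (process IDs will differ):
# jps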
Fifth, testing Hadoop
Operate on master.fansik.com
# hdfs dfs -mkdir /fansik
If the following warning appears when creating the directory, it can be ignored:
16/07/29 17:38:27 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Workaround:
Go to the following site and download the appropriate version:
http://dl.bintray.com/sequenceiq/sequenceiq-bin/
# tar -xvf hadoop-native-64-2.7.0.tar -C /usr/local/hadoop/lib/native/
If prompted: copyFromLocal: Cannot create directory /123/. Name node is in safe mode
This means HDFS safe mode is enabled; the workaround is:
# hdfs dfsadmin -safemode leave
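The current safe mode status can be checked before and after with:
# hdfs dfsadmin -safemode get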
Copy myservicce.sh to the /fansik directory
# hdfs dfs -copyFromLocal ./myservicce.sh /fansik
Check that myservicce.sh is now in the /fansik directory
# hdfs dfs -ls /fansik
Analyze the file using wordcount
# hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount /fansik/myservicce.sh /zhangshan/
View the output files:
# hdfs dfs -ls /zhangshan/
Found 2 items
-rw-r--r--   2 root supergroup          0 2016-08-02 15:19 /zhangshan/_SUCCESS
-rw-r--r--   2 root supergroup        415 2016-08-02 15:19 /zhangshan/part-r-00000
View the analysis results:
# hdfs dfs -cat /zhangshan/part-r-00000
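Each line of part-r-00000 is a word followed by the number of times it appears in myservicce.sh, separated by a tab; the exact contents depend on the input file.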