Hadoop is used for processing big data; its core components are HDFS and MapReduce. Although my current work does not require it, extra technical skills are never a burden, and after many attempts on virtual machines I finally got a Hadoop 2.5.2 environment built successfully.
First, prepare a CentOS machine, change the hostname to master, and add a line mapping the native IP address to master in /etc/hosts.
Linux Basic Configuration
vi /etc/sysconfig/network
# edit the file and set HOSTNAME=master
vi /etc/hosts
# add the following line (192.168.1.112 is the native IP used throughout this walkthrough)
192.168.1.112 master
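A quick way to confirm the mapping works (note the hostname change itself only takes effect after a re-login or reboot):
ping -c 1 master
# should resolve to the native IP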
Then stop iptables and disable it at boot.
service iptables stop
chkconfig iptables off
Reboot the system, then configure passwordless SSH login, so that Hadoop can later be started without entering a password.
SSH Login Without a Password
vi /etc/ssh/sshd_config
# uncomment the following 4 lines
HostKey /etc/ssh/ssh_host_rsa_key
RSAAuthentication yes
PubkeyAuthentication yes
AuthorizedKeysFile .ssh/authorized_keys
# save, then restart sshd
service sshd restart
# generate the passwordless login key pair
ssh-keygen -t rsa
# just press Enter at every prompt; 2 files will then be generated in the .ssh folder of the current user's home directory
# enter the .ssh directory
cat id_rsa.pub >> authorized_keys
# you can now ssh into the system without a password
ssh localhost
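If ssh still prompts for a password, the usual culprit is directory permissions; tightening them as below generally fixes it:
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys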
JDK Installation and Configuration (omitted)
The version used is jdk-7u79-linux-x64.
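The JDK setup itself is skipped here; as a minimal sketch, assuming the Oracle RPM package jdk-7u79-linux-x64.rpm has been downloaded:
rpm -ivh jdk-7u79-linux-x64.rpm
# installs to /usr/java/jdk1.7.0_79 by default
java -version
# should report java version "1.7.0_79"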
Installing and Configuring Hadoop 2.5.2
Upload the downloaded tar.gz package to the environment.
tar -zxvf hadoop-2.5.2.tar.gz -C /usr
vi /etc/profile
# append the following at the end of the file
export JAVA_HOME=/usr/java/jdk1.7.0_79
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export HADOOP_HOME=/usr/hadoop-2.5.2
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
# save, then run: source /etc/profile
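Once the profile is sourced, a quick way to confirm that PATH and HADOOP_HOME are picked up:
hadoop version
# should print Hadoop 2.5.2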
# configure Hadoop
# create the name and data directories for HDFS
mkdir -p /usr/hdfs/name
mkdir -p /usr/hdfs/data
mkdir -p /usr/hdfs/tmp
cd /usr/hadoop-2.5.2/etc/hadoop
Set JAVA_HOME in the following files:
hadoop-env.sh yarn-env.sh
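For example, in hadoop-env.sh (and likewise in yarn-env.sh), replace the default ${JAVA_HOME} reference with the explicit path used above:
export JAVA_HOME=/usr/java/jdk1.7.0_79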
vi core-site.xml
# add the following configuration inside the <configuration> element; note that the IP should be changed to the native IP
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/hdfs/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<!-- file system properties -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://192.168.1.112:9000</value>
</property>
vi hdfs-site.xml
# likewise, add the following configuration inside the <configuration> element
<property>
<name>dfs.namenode.name.dir</name>
<value>/usr/hdfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/usr/hdfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
# copy mapred-site.xml from the template
cp mapred-site.xml.template mapred-site.xml
vi mapred-site.xml
# likewise, add the following configuration inside the <configuration> element
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
vi yarn-site.xml
# likewise, add the following configuration inside the <configuration> element; pay attention to changing the IP address to the native machine's IP
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<description>The address of the applications manager interface in the RM.</description>
<name>yarn.resourcemanager.address</name>
<value>192.168.1.112:18040</value>
</property>
<property>
<description>The address of the scheduler interface.</description>
<name>yarn.resourcemanager.scheduler.address</name>
<value>192.168.1.112:18030</value>
</property>
<property>
<description>The address of the RM web application.</description>
<name>yarn.resourcemanager.webapp.address</name>
<value>192.168.1.112:18088</value>
</property>
<property>
<description>The address of the resource tracker interface.</description>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>192.168.1.112:8025</value>
</property>
At this point the basic Hadoop environment is configured; the NameNode must be formatted before the first start.
Enter the command "hadoop namenode -format".
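Note that in Hadoop 2.x this form still works but is deprecated; the equivalent current command is:
hdfs namenode -format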
Start command:
start-dfs.sh
start-yarn.sh
Stop command:
stop-dfs.sh
stop-yarn.sh
When startup is complete, open http://192.168.1.112:50070 and http://192.168.1.112:18088 in a browser to verify the installation.
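You can also check from the shell with jps, which for this single-node setup should list the HDFS and YARN daemons:
jps
# expected (PIDs will vary):
# NameNode
# DataNode
# SecondaryNameNode
# ResourceManager
# NodeManager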
Test Hadoop
Verify that the installation is correct by running the wordcount example that ships with Hadoop.
Go to the Hadoop installation directory and enter the following commands.
mkdir example
cd example
Create file1.txt and file2.txt:
vi file1.txt
hello zhm
hello hadoop
hello cz
vi file2.txt
hadoop is ok
hadoop is newbee
hadoop 2.5.2
cd ..
hadoop fs -mkdir /data
hadoop fs -put -f example/file1.txt example/file2.txt /data
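To confirm the upload, list the directory:
hadoop fs -ls /data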
# run the wordcount example
hadoop jar ./share/hadoop/mapreduce/sources/hadoop-mapreduce-examples-2.5.2-sources.jar org.apache.hadoop.examples.WordCount /data /output
# view the results
hadoop fs -cat /output/part-r-00000
# the output is as follows:
2.5.2 1
cz 1
hadoop 4
hello 3
is 2
newbee 1
ok 1
zhm 1
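Note that MapReduce refuses to write to an existing output directory, so to rerun the job you first need to delete /output:
hadoop fs -rm -r /output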
That completes the environment setup; next comes using Maven to develop Hadoop projects.
You are bound to hit problems during installation; a careful search online will usually turn up the answer you need.