Ubuntu 16.04: Install hadoop-2.8.1.tar.gz and Set Up a Cluster


Environment Description:

IP address       User name   Machine name           Role

192.168.3.150    donny       donny-lenovo-b40-80    Master + Slave

192.168.3.167    cqb         cqb-lenovo-b40-80      Slave

The Master machine is configured with the NameNode and JobTracker roles and is responsible for distributing data and breaking down tasks; the Slave machines are configured with the DataNode and TaskTracker roles and are responsible for storing distributed data and executing tasks. Ideally there should also be one standby Master machine to guard against the master server going down.

Note: Hadoop requires the same deployment directory structure on every machine (at startup, the other task nodes are started from the same directory as the primary node) and an identical user account on all of them. Most references recommend creating a hadoop user on every machine and using that account for password-free authentication. For convenience, this blogger did not recreate a hadoop user on the three machines (the dedicated hadoop user is strongly recommended; do not copy this poor blogger's shortcut).
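If you do follow the recommended approach, a minimal sketch of creating that shared account looks like the commands below, run on every node (the user name hadoop and the sudo group membership are assumptions, not part of the original setup):

sudo adduser hadoop              # create the same account on each machine
sudo usermod -aG sudo hadoop     # optional: let the account use sudo for the setup steps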

Environment configuration

Modify the hostname with vim /etc/hostname, then run hostname to check that the change took effect.

Add host entries with vim /etc/hosts:

192.168.3.150 donny-lenovo-b40-80
192.168.3.167 cqb-lenovo-b40-80

SSH configuration

ssh-keygen -t rsa

ssh-copy-id -i ~/.ssh/id_rsa.pub cqb@192.168.3.167
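A quick sanity check, assuming the key was copied to the slave as above: logging in from the master should no longer prompt for a password.

ssh cqb@192.168.3.167            # should open a shell without asking for a password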

Hadoop configuration

vim /etc/hadoop/core-site.xml
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/data/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://donny-Lenovo-B40-80:9000</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
</configuration>
vim /etc/hadoop/hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/data/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/data/hadoop/dfs/data</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>
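The directories referenced in core-site.xml and hdfs-site.xml above must exist and be writable by the account that runs Hadoop. A minimal sketch, assuming that account is the current login user on each node:

sudo mkdir -p /data/hadoop/tmp /data/hadoop/dfs/name /data/hadoop/dfs/data
sudo chown -R $USER:$USER /data/hadoop     # give the Hadoop account ownership of the data tree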


vim /etc/hadoop/mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>donny-Lenovo-B40-80:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>donny-Lenovo-B40-80:19888</value>
  </property>
</configuration>
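Note that the stock hadoop-2.8.1 tarball ships only a mapred-site.xml.template, so the file may need to be created first (a sketch, assuming the same configuration directory used above):

cp /etc/hadoop/mapred-site.xml.template /etc/hadoop/mapred-site.xml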


vim /etc/hadoop/yarn-site.xml
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>donny-Lenovo-B40-80</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>

vim /etc/hadoop/masters

donny-lenovo-b40-80

vim /etc/hadoop/slaves

donny-lenovo-b40-80
cqb-lenovo-b40-80

Send the configuration from the NameNode to the DataNode.

scp -r /etc/hadoop/ cqb@192.168.3.167:/tmp/
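The original steps stop at /tmp/, so moving the files into the slave's configuration directory is left implicit; a sketch of that step, run on the slave and assuming the same /etc/hadoop layout there:

sudo cp -r /tmp/hadoop/* /etc/hadoop/      # scp -r above creates /tmp/hadoop on the slave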

Start

Format HDFS on first use: hdfs namenode -format

Executed on the NameNode:

hadoop-daemon.sh start namenode

hadoop-daemon.sh start datanode (optional)

yarn-daemon.sh start nodemanager

mr-jobhistory-daemon.sh start historyserver

yarn-daemon.sh start resourcemanager

Executed on the DataNode:

hadoop-daemon.sh start datanode

yarn-daemon.sh start nodemanager

hadoop-daemon.sh start secondarynamenode (optional)
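On either machine, jps gives a quick view of which daemons are actually running; the names below are the standard daemon process names, and the exact list depends on which optional daemons were started:

jps
# master: NameNode, DataNode, NodeManager, JobHistoryServer, ResourceManager
# slave:  DataNode, NodeManager, SecondaryNameNode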

Verify

Verify the cluster: hdfs dfs -ls / can be executed on any single machine.

Visit 192.168.3.150:50070 to check whether the DataNodes are alive and healthy.

Submit a job: hadoop jar /opt/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.1.jar wordcount /tmp/order.data /output
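Once the job completes, the word counts can be read back from HDFS (a sketch, assuming the /output path used above; wordcount writes its results to part-r-* files):

hdfs dfs -cat /output/part-r-00000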

Visit 192.168.3.150:8088 to view the results of the submitted job.
