Installing hadoop-2.8.1.tar.gz on Ubuntu 16.04: Cluster Setup

Environment Description:
IP address       User name   Machine name           Machine role
192.168.3.150    donny       donny-lenovo-b40-80    Master + Slave
192.168.3.167    cqb         cqb-lenovo-b40-80      Slave
The master machine runs the NameNode and JobTracker roles and is responsible for distributing data and decomposing tasks; the slave machines run the DataNode and TaskTracker roles and are responsible for storing the distributed data and executing the tasks. Ideally there would also be one standby master machine here to guard against the master server going down.
Note: Hadoop requires the same deployment directory structure on all machines (because the other nodes are started, at startup, from the same directory as the primary node), and an identical user account on every machine. The various references all suggest creating a dedicated hadoop user on each machine and using that account for passwordless authentication. For convenience, no such hadoop user was re-established on these machines here; the existing accounts were used instead. (Creating the dedicated user is strongly recommended; do not follow this lazy blogger's example.)
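For readers following that recommendation, a minimal sketch of creating the identical account on every machine (run as a sudoer; the user name hadoop is illustrative):

# Run on every machine in the cluster
sudo useradd -m -s /bin/bash hadoop
sudo passwd hadoop
# Optional: allow the hadoop user to use sudo
sudo usermod -aG sudo hadoop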
Environment configuration
Modify the hostname: vim /etc/hostname. After editing, run hostname to check that the change took effect.
Add hosts entries: vim /etc/hosts
192.168.3.150 donny-lenovo-b40-80
192.168.3.167 cqb-lenovo-b40-80
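A quick way to confirm the entries resolve (hostnames from the table above):

ping -c 1 donny-lenovo-b40-80
ping -c 1 cqb-lenovo-b40-80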
SSH configuration
ssh-keygen -t rsa
ssh-copy-id -i ~/.ssh/id_rsa.pub cqb@192.168.3.167
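If the key copy succeeded, logging in to the slave should no longer prompt for a password; a quick check (user and address from the environment table):

ssh cqb@192.168.3.167 hostname
# Should print cqb-lenovo-b40-80 without asking for a password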
Hadoop configuration
vim /etc/hadoop/core-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/data/hadoop/tmp</value>
<description>abase for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://donny-lenovo-b40-80:9000</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
</configuration>
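The hadoop.tmp.dir path above must exist and be writable by the account that runs the daemons before anything is started; a sketch, assuming that account is named hadoop:

sudo mkdir -p /data/hadoop/tmp
sudo chown -R hadoop:hadoop /data/hadoop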
vim /etc/hadoop/hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/data/hadoop/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/data/hadoop/dfs/data</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>
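Likewise, the NameNode and DataNode directories from hdfs-site.xml need to exist on the respective machines; a sketch under the same ownership assumption:

sudo mkdir -p /data/hadoop/dfs/name /data/hadoop/dfs/data
sudo chown -R hadoop:hadoop /data/hadoop/dfs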
vim /etc/hadoop/mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>donny-lenovo-b40-80:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>donny-lenovo-b40-80:19888</value>
</property>
</configuration>
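A stock hadoop-2.8.1 tarball ships only mapred-site.xml.template, so if the file does not exist yet, create it from the template first (assuming the same configuration directory used throughout this post):

cp /etc/hadoop/mapred-site.xml.template /etc/hadoop/mapred-site.xml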
vim /etc/hadoop/yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>donny-lenovo-b40-80</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
vim /etc/hadoop/masters
donny-lenovo-b40-80
vim /etc/hadoop/slaves
donny-lenovo-b40-80
cqb-lenovo-b40-80
Send the configuration from the NameNode to the DataNode:
scp -r /etc/hadoop/ cqb@192.168.3.167:/tmp/
Start
First-time formatting (run once; reformatting later wipes the NameNode metadata):
hdfs namenode -format
Execute on the NameNode:
hadoop-daemon.sh start namenode
hadoop-daemon.sh start datanode (optional)
yarn-daemon.sh start nodemanager
mr-jobhistory-daemon.sh start historyserver
yarn-daemon.sh start resourcemanager
Execute on the DataNode:
hadoop-daemon.sh start datanode
yarn-daemon.sh start nodemanager
hadoop-daemon.sh start secondarynamenode (optional)
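With passwordless SSH and the slaves file in place, the bundled scripts can start everything from the master instead of starting each daemon by hand; a sketch, assuming $HADOOP_HOME/sbin is on the PATH:

start-dfs.sh        # NameNode, SecondaryNameNode, and all DataNodes
start-yarn.sh       # ResourceManager and all NodeManagers
mr-jobhistory-daemon.sh start historyserver
jps                 # list the Java daemons running on this machine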
Verify
Verify the cluster: hdfs dfs -ls / can be run on any single machine.
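Two more health checks worth running from the master:

hdfs dfsadmin -report   # per-DataNode capacity and status
yarn node -list         # NodeManagers registered with the ResourceManager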
Visit 192.168.3.150:50070 to check whether the DataNodes are healthy.
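Before submitting the example job below, the input file must exist in HDFS; a sketch with made-up sample data:

echo "hello hadoop hello world" > order.data
hdfs dfs -mkdir -p /tmp
hdfs dfs -put order.data /tmp/order.data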
Submit a job:
hadoop jar /opt/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.1.jar wordcount /tmp/order.data /output
Visit 192.168.3.150:8088 to view the results of the submitted job.
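The word counts can also be read straight from HDFS (output path from the job above):

hdfs dfs -cat /output/part-r-*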