Build a pseudo-distributed environment
- Upload the compiled hadoop-2.7.0 package and extract it into the /zzy directory
mkdir /zzy
Extract:
tar -zxvf hadoop-2.7.0.tar.gz -C /zzy
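Before extracting, `tar -tzf` can list the archive to catch a corrupt upload early. A self-contained sketch: the stand-in archive below only makes it runnable anywhere; on a real node, skip the first two lines and run `tar -tzf` against the uploaded tarball.

```shell
# Build a tiny stand-in archive, then list it the way you would the real one.
mkdir -p demo && touch demo/README
tar -czf hadoop-2.7.0.tar.gz demo           # stand-in for the uploaded package
tar -tzf hadoop-2.7.0.tar.gz > /dev/null && echo "archive OK"
```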
Note: the Hadoop 2.x configuration files live in $HADOOP_HOME/etc/hadoop.
Pseudo-distributed mode requires modifying five configuration files.
First: hadoop-env.sh
        vim hadoop-env.sh
        # line 27
        export JAVA_HOME=/usr/java/jdk1.7.0_79
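The JAVA_HOME edit can also be done non-interactively with sed instead of vim. A sketch, demonstrated on a stand-in copy of hadoop-env.sh; on a real node, point HADOOP_ENV at $HADOOP_HOME/etc/hadoop/hadoop-env.sh and drop the line that creates the stand-in.

```shell
# Set JAVA_HOME in hadoop-env.sh without opening an editor.
HADOOP_ENV=./hadoop-env.sh                   # real file: $HADOOP_HOME/etc/hadoop/hadoop-env.sh
printf 'export JAVA_HOME=${JAVA_HOME}\n' > "$HADOOP_ENV"   # stand-in for the stock line
sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/usr/java/jdk1.7.0_79|' "$HADOOP_ENV"
grep '^export JAVA_HOME=' "$HADOOP_ENV"
```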
Second: core-site.xml
        <!-- Specify the default filesystem (the NameNode address).
             fs.defaultFS replaces the deprecated fs.default.name. -->
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://zzy:9000</value>
        </property>
        <!-- Specify the storage directory for files generated at Hadoop runtime -->
        <property>
            <name>hadoop.tmp.dir</name>
            <value>/zzy/hadoop-2.7.0/tmp</value>
        </property>
Third: hdfs-site.xml
        <!-- Specify the number of HDFS replicas -->
        <property>
            <name>dfs.replication</name>
            <value>1</value>
        </property>
Fourth: mapred-site.xml (created by renaming the template)
        mv mapred-site.xml.template mapred-site.xml
        vim mapred-site.xml
        <!-- Specify that MapReduce runs on YARN -->
        <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
        </property>
Fifth: yarn-site.xml
        <!-- Specify the address of YARN's master (the ResourceManager) -->
        <property>
            <name>yarn.resourcemanager.hostname</name>
            <value>zzy</value>
        </property>
        <!-- How reducers fetch data -->
        <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
        </property>
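A quick grep pass can confirm that each of the four XML files names the key property set above. The stand-in files below only make the sketch self-contained; on a real node, delete those four lines and run the loop from $HADOOP_HOME/etc/hadoop against the files just edited.

```shell
# Stand-in config files so the sketch runs anywhere; remove on a real node.
echo '<name>fs.defaultFS</name>'                  > core-site.xml
echo '<name>dfs.replication</name>'               > hdfs-site.xml
echo '<name>mapreduce.framework.name</name>'      > mapred-site.xml
echo '<name>yarn.resourcemanager.hostname</name>' > yarn-site.xml
# Check that each file mentions its key property.
for pair in core-site.xml:fs.defaultFS \
            hdfs-site.xml:dfs.replication \
            mapred-site.xml:mapreduce.framework.name \
            yarn-site.xml:yarn.resourcemanager.hostname; do
  f=${pair%%:*}; k=${pair##*:}
  grep -q "$k" "$f" && echo "$f: $k OK" || echo "$f: $k MISSING"
done
```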
- Add Hadoop to the environment variables
vim /etc/profile
Add the following:
export JAVA_HOME=/usr/java/jdk1.7.0_79
export HADOOP_HOME=/zzy/hadoop-2.7.0
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

Make the configuration take effect:
source /etc/profile
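After sourcing the profile, both bin/ and sbin/ should be on PATH. A self-contained check that repeats the export lines above (paths are the ones this guide assumes) and verifies the result:

```shell
# Reproduce the profile exports, then confirm the Hadoop dirs landed on PATH.
export JAVA_HOME=/usr/java/jdk1.7.0_79
export HADOOP_HOME=/zzy/hadoop-2.7.0
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
for d in "$HADOOP_HOME/bin" "$HADOOP_HOME/sbin"; do
  case ":$PATH:" in
    *":$d:"*) echo "$d is on PATH" ;;
    *)        echo "$d is MISSING from PATH" ;;
  esac
done
```

On a real node, running `hadoop version` afterwards is the simplest end-to-end confirmation.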
- Format the NameNode (initializes the NameNode)
hdfs namenode -format (or the older hadoop namenode -format)
?
- Start Hadoop
?
sbin/start-dfs.sh
????????
???? sbin/start-yarn.sh
?
- Verify that startup succeeded
Verify with the jps command:
27408 NameNode
28218 Jps
27643 SecondaryNameNode
28066 NodeManager
27803 ResourceManager
27512 DataNode
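The jps check can be scripted by grepping the listing for the five expected daemons. A sketch: the JPS_OUT sample mirrors the output above; on a live node, replace it with JPS_OUT="$(jps)".

```shell
# Sample jps output standing in for JPS_OUT="$(jps)" on a live node.
JPS_OUT='27408 NameNode
27643 SecondaryNameNode
28066 NodeManager
27803 ResourceManager
27512 DataNode'
# -w matches whole words, so "NameNode" does not match "SecondaryNameNode".
for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
  echo "$JPS_OUT" | grep -qw "$d" && echo "$d: running" || echo "$d: NOT FOUND"
done
```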
http://192.168.0.2:50070 (HDFS management interface)
http://192.168.0.2:8088 (YARN/MapReduce management interface)
Hadoop environment tests
        hadoop fs -help <cmd>
# Upload a file
        hadoop fs -put <local path> <hdfs path>
# View file contents
        hadoop fs -cat <hdfs path>
# List files
        hadoop fs -ls /
# Download a file
        hadoop fs -get <hdfs path> <local path>
- Upload a file to the HDFS filesystem
hadoop fs -put <local path> <hdfs path>
For example: hadoop fs -put /root/install.log hdfs://zzy:9000/
hadoop fs -rmr hdfs://zzy:9000/install.log
Note: if files can be uploaded and deleted correctly, HDFS is working.
- Test YARN
hadoop fs -put words.txt hdfs://zzy:9000/
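The words.txt being uploaded is assumed to exist; a minimal sample can be created first (the contents are illustrative):

```shell
# Create a small sample input for the wordcount test.
printf 'hello world\nhello hadoop\nhadoop yarn\n' > words.txt
cat words.txt
```

Then upload it with the `hadoop fs -put` line above.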
- Have YARN compute word counts for the file.
cd $HADOOP_HOME/share/hadoop/mapreduce/
# Test command
hadoop jar hadoop-mapreduce-examples-2.7.0.jar wordcount hdfs://zzy:9000/words.txt hdfs://zzy:9000/wc
Note: if the job creates the output directory and writes the statistics into it, YARN is working.
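To know what the job's output under /wc should look like, the same counts can be computed locally with standard tools, formatted like MapReduce's word<TAB>count lines. This uses the illustrative words.txt contents assumed earlier, not anything the job produced.

```shell
# Local word count with coreutils, formatted like the MapReduce output
# (word, tab, count). The input is the illustrative sample from the test above.
printf 'hello world\nhello hadoop\nhadoop yarn\n' > words.txt
tr -s ' ' '\n' < words.txt | sort | uniq -c | awk '{print $2 "\t" $1}'
```

Compare the result against `hadoop fs -cat hdfs://zzy:9000/wc/part-r-00000` on the cluster.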