I. Download Hadoop
Use version 2.7.6, since that is the version running in the company's production environment.
cd /opt
wget http://mirrors.hust.edu.cn/apache/hadoop/common/hadoop-2.7.6/hadoop-2.7.6.tar.gz
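After the download completes, unpack the archive in place (a minimal sketch; the resulting /opt/hadoop-2.7.6 directory is what $HADOOP_HOME refers to below):

cd /opt
tar -xzf hadoop-2.7.6.tar.gz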
II. Configuration files
Reference documentation: https://hadoop.apache.org/docs/r2.7.6
Configure the following 7 files in the $HADOOP_HOME/etc/hadoop directory:
1.core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://pangu10:9000</value>
        <description>NameNode URI; the port on which HDFS serves external requests</description>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/hdfs/tmp</value>
        <description>When HDFS is reformatted (e.g. after adding a new DataNode), this temporary directory must be deleted</description>
    </property>
</configuration>
2.hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/opt/hdfs/name</value>
        <description>Where the NameNode stores the HDFS namespace metadata</description>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/opt/hdfs/data</value>
        <description>Physical storage location of data blocks on the DataNode</description>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
        <description>Number of DFS replicas; the default is 3 if not set</description>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>pangu11:50090</value>
        <description>Host and port of the SecondaryNameNode</description>
    </property>
</configuration>
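Creating the storage paths up front on each node avoids permission surprises later; a minimal sketch (paths taken from the configuration above):

mkdir -p /opt/hdfs/tmp /opt/hdfs/name /opt/hdfs/data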
3.yarn-site.xml
<?xml version="1.0"?>
<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>pangu10</value>
        <description>Hostname of the node running the ResourceManager</description>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
        <description>Auxiliary service run by the NodeManager; must be set to mapreduce_shuffle for MapReduce programs to run</description>
    </property>
    <property>
        <name>yarn.nodemanager.pmem-check-enabled</name>
        <value>false</value>
        <description>Disable the physical memory check, which can otherwise kill containers on small-memory nodes</description>
    </property>
    <property>
        <name>yarn.nodemanager.vmem-check-enabled</name>
        <value>false</value>
        <description>Disable the virtual memory check for the same reason</description>
    </property>
</configuration>
4.mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
        <description>Run MapReduce on the YARN framework</description>
    </property>
</configuration>
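Note that the Hadoop 2.7.x distribution ships only mapred-site.xml.template in this directory, so the file may need to be created first:

cd $HADOOP_HOME/etc/hadoop
cp mapred-site.xml.template mapred-site.xml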
5.slaves
The slaves file lists every host that runs a DataNode and NodeManager; here the master pangu10 also acts as a worker:
pangu10
pangu11
pangu12
6.yarn-env.sh
Find line 23:
# export JAVA_HOME=/home/y/libexec/jdk1.6.0/
and replace it with:
export JAVA_HOME=/opt/jdk1.8.0_181/
7.hadoop-env.sh
Find line 25:
export JAVA_HOME=${JAVA_HOME}
and replace it with:
export JAVA_HOME=/opt/jdk1.8.0_181/
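Both edits can also be scripted; a hedged sketch, assuming the stock 2.7.6 files where only these lines set JAVA_HOME:

sed -i 's|^# export JAVA_HOME=.*|export JAVA_HOME=/opt/jdk1.8.0_181/|' $HADOOP_HOME/etc/hadoop/yarn-env.sh
sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/opt/jdk1.8.0_181/|' $HADOOP_HOME/etc/hadoop/hadoop-env.sh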
III. Copy to the slave nodes
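The configured Hadoop directory must be identical on every node. A minimal sketch, assuming the slave hostnames from the slaves file and a root account with SSH access (both assumptions, not from the original):

scp -r /opt/hadoop-2.7.6 root@pangu11:/opt/
scp -r /opt/hadoop-2.7.6 root@pangu12:/opt/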
IV. Format HDFS
Execute the following command in the shell:
hadoop namenode -format
Formatting has succeeded if the log output contains the "has been successfully formatted" line (highlighted in red in the original output):
18/10/12 12:38:33 INFO util.GSet: capacity      = 2^15 = 32768 entries
18/10/12 12:38:33 INFO namenode.FSImage: Allocated new BlockPoolId: BP-1164998719-192.168.56.10-1539362313584
18/10/12 12:38:33 INFO common.Storage: Storage directory /opt/hdfs/name has been successfully formatted.
18/10/12 12:38:33 INFO namenode.FSImageFormatProtobuf: Saving image file /opt/hdfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
18/10/12 12:38:33 INFO namenode.FSImageFormatProtobuf: Image file /opt/hdfs/name/current/fsimage.ckpt_0000000000000000000 of size ... bytes saved in 0 seconds.
18/10/12 12:38:33 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
18/10/12 12:38:33 INFO util.ExitUtil: Exiting with status 0
18/10/12 12:38:33 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at pangu10/192.168.56.10
************************************************************/
V. Start Hadoop
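The original stops at the heading; a minimal sketch of the usual next step with the stock Hadoop 2.7 scripts, run on the master (pangu10), followed by jps to verify that the NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager processes are up:

$HADOOP_HOME/sbin/start-dfs.sh
$HADOOP_HOME/sbin/start-yarn.sh
jps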