Hardware environment:
hddcluster1 10.0.0.197 RedHat 7
hddcluster2 10.0.0.228 CentOS 7 (this one is the Master)
hddcluster3 10.0.0.202 RedHat 7
hddcluster4 10.0.0.181 CentOS 7
Software Environment:
Turn off all firewalls (firewalld on RHEL 7 / CentOS 7; see the commands after this list)
openssh-clients
openssh-server
java-1.8.0-openjdk
java-1.8.0-openjdk-devel
hadoop-2.7.3.tar.gz
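A minimal sketch of disabling the firewall on each node, assuming firewalld is the active firewall on these RHEL 7 / CentOS 7 hosts:
sudo systemctl stop firewalld      # stop the running firewall
sudo systemctl disable firewalld   # keep it from starting at boot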
Process:
- Select a machine as the Master
- Configure the hadoop user on the Master node, install the SSH server, and install the Java environment
- Install Hadoop on the Master node and complete the configuration
- Configure the hadoop user, install the SSH server, and install the Java environment on the other Slave nodes
- Copy the /usr/local/hadoop directory on the Master node to the other Slave nodes
- Start Hadoop on the Master node
# Mapping of node names to their IP addresses
[hadoop@hddcluster2 ~]$ cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
10.0.0.228  hddcluster2
10.0.0.197  hddcluster1
10.0.0.202  hddcluster3
10.0.0.181  hddcluster4
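An optional sanity check (not part of the original steps) to confirm that each hostname resolves on every node:
getent hosts hddcluster1 hddcluster2 hddcluster3 hddcluster4   # should print the four IP/name pairs from /etc/hosts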
# Create the hadoop user (logged in as root)
su                                # switch to the root user
useradd -m hadoop -s /bin/bash    # create the new user hadoop
passwd hadoop                     # set the hadoop password
visudo                            # below the line "root ALL=(ALL) ALL", add "hadoop ALL=(ALL) ALL"
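A quick optional check (not in the original write-up) that the new account and its sudo entry work:
su - hadoop
sudo whoami    # should print "root" after entering the hadoop password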
# Log in as the hadoop user, install SSH, and configure passwordless SSH login
[hadoop@hddcluster2 ~]$ rpm -qa | grep ssh
[hadoop@hddcluster2 ~]$ sudo yum install openssh-clients
[hadoop@hddcluster2 ~]$ sudo yum install openssh-server
[hadoop@hddcluster2 ~]$ cd ~/.ssh/                    # if this directory does not exist, run "ssh localhost" first
[hadoop@hddcluster2 ~]$ ssh-keygen -t rsa             # press Enter at every prompt
[hadoop@hddcluster2 ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub localhost    # add the key to the authorized list
[hadoop@hddcluster2 ~]$ chmod 600 ./authorized_keys   # fix the file permissions
[hadoop@hddcluster2 ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hddcluster1
[hadoop@hddcluster2 ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hddcluster3
[hadoop@hddcluster2 ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hddcluster4
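To verify the passwordless setup (an optional check), each of the following should print the remote hostname without asking for a password:
[hadoop@hddcluster2 ~]$ ssh hddcluster1 hostname
[hadoop@hddcluster2 ~]$ ssh hddcluster3 hostname
[hadoop@hddcluster2 ~]$ ssh hddcluster4 hostname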
# Extract the Hadoop archive to /usr/local/hadoop
[hadoop@hddcluster2 ~]$ sudo tar -zxf hadoop-2.7.3.tar.gz -C /usr/local/
[hadoop@hddcluster2 ~]$ sudo mv /usr/local/hadoop-2.7.3 /usr/local/hadoop
[hadoop@hddcluster2 ~]$ sudo chown -R hadoop:hadoop /usr/local/hadoop
cd /usr/local/hadoop
./bin/hadoop version

# Install the Java environment
[hadoop@hddcluster2 ~]$ sudo yum install java-1.8.0-openjdk java-1.8.0-openjdk-devel
[hadoop@hddcluster2 ~]$ rpm -ql java-1.8.0-openjdk-devel | grep '/bin/javac'
/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.111-2.b15.el7_3.x86_64/bin/javac
[hadoop@hddcluster2 ~]$ vim ~/.bashrc
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.111-2.b15.el7_3.x86_64
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_PREFIX=$HADOOP_HOME
export HADOOP_OPTS="-Djava.library.path=$HADOOP_PREFIX/lib:$HADOOP_PREFIX/lib/native"

# Test the Java environment
source ~/.bashrc
java -version
$JAVA_HOME/bin/java -version    # should print the same output as "java -version"
# Edit the Hadoop configuration files
[hadoop@hddcluster2 hadoop]$ pwd
/usr/local/hadoop/etc/hadoop

[hadoop@hddcluster2 hadoop]$ cat core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hddcluster2:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/local/hadoop/tmp</value>
        <description>Abase for other temporary directories.</description>
    </property>
</configuration>

[hadoop@hddcluster2 hadoop]$ cat hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>hddcluster2:50090</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/data</value>
    </property>
</configuration>

[hadoop@hddcluster2 hadoop]$ cat mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>hddcluster2:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>hddcluster2:19888</value>
    </property>
</configuration>

[hadoop@hddcluster2 hadoop]$ cat yarn-site.xml
<?xml version="1.0"?>
<!-- Site specific YARN configuration properties -->
<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hddcluster2</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>

[hadoop@hddcluster2 hadoop]$ cat slaves
hddcluster1
hddcluster2
hddcluster3
hddcluster4
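As an optional check (not in the original), once $HADOOP_HOME/bin is on the PATH via ~/.bashrc, hdfs getconf can confirm that these files are being picked up:
[hadoop@hddcluster2 hadoop]$ hdfs getconf -confKey fs.defaultFS      # should print hdfs://hddcluster2:9000
[hadoop@hddcluster2 hadoop]$ hdfs getconf -confKey dfs.replication   # should print 3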
$ cd /usr/local
$ sudo rm -r ./hadoop/tmp      # delete the Hadoop temporary files
$ sudo rm -r ./hadoop/logs/*   # delete the log files
$ tar -zcf ~/hadoop.master.tar.gz ./hadoop   # compress first, then copy
$ cd ~
$ scp ./hadoop.master.tar.gz hddcluster1:/home/hadoop
$ scp ./hadoop.master.tar.gz hddcluster3:/home/hadoop
$ scp ./hadoop.master.tar.gz hddcluster4:/home/hadoop
On each Slave node, install the same software environment (SSH, Java) and configure ~/.bashrc as on the Master (see the sketch after this step), then unpack the copied archive:
sudo tar -zxf ~/hadoop.master.tar.gz -C /usr/local
sudo chown -R hadoop /usr/local/hadoop
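A minimal sketch of the per-Slave preparation, assuming the same packages and the same export lines as on the Master:
sudo yum install openssh-clients openssh-server java-1.8.0-openjdk java-1.8.0-openjdk-devel
vim ~/.bashrc    # append the same JAVA_HOME / HADOOP_* export lines used on the Master
source ~/.bashrc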
[hadoop@hddcluster2 ~]$ hdfs namenode -format   # only needed on the first run; after that it is not required

Then start Hadoop. Run the start commands on the Master node:
$ start-dfs.sh
$ start-yarn.sh
$ mr-jobhistory-daemon.sh start historyserver

You can view the processes started on each node with the jps command. If everything is correct, the NameNode, ResourceManager, SecondaryNameNode and JobHistoryServer processes can be seen on the Master node. In addition, run hdfs dfsadmin -report on the Master node to check that the DataNodes have started: if "Live datanodes" is not 0, the cluster started successfully.

[hadoop@hddcluster2 ~]$ hdfs dfsadmin -report
Configured Capacity: 2125104381952 (1.93 TB)
Present Capacity: 1975826509824 (1.80 TB)
DFS Remaining: 1975824982016 (1.80 TB)
DFS Used: 1527808 (1.46 MB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
-------------------------------------------------
Live datanodes (4):

The status of the DataNodes and the NameNode can also be viewed through the web page: http://hddcluster2:50070/. If startup is not successful, troubleshoot by reading the startup logs.
On the Slave nodes, jps should show the DataNode and NodeManager processes.
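A sketch of this check (process IDs will differ; the daemon names are as described above):
[hadoop@hddcluster2 ~]$ jps   # Master: NameNode, SecondaryNameNode, ResourceManager, JobHistoryServer
                              # (plus DataNode and NodeManager, since hddcluster2 is also listed in slaves)
[hadoop@hddcluster1 ~]$ jps   # Slave: DataNode, NodeManager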
Testing a Hadoop distributed instance. First create a user directory on HDFS:
hdfs dfs -mkdir -p /user/hadoop
Copy the configuration files in /usr/local/hadoop/etc/hadoop to the distributed file system as the input files:
hdfs dfs -mkdir input
hdfs dfs -put /usr/local/hadoop/etc/hadoop/*.xml input
By viewing the status of the DataNodes (the change in used space) you can confirm that the input files were indeed copied to the DataNodes. Then run the MapReduce job:
hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar grep input output 'dfs[a-z.]+'
Wait for the job to finish, then view the output:
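One common way to read the grep job's result back from HDFS once it finishes (a minimal example, not from the original text):
hdfs dfs -cat output/*   # prints the matched terms and their counts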
Hadoop start commands:
start-dfs.sh
start-yarn.sh
mr-jobhistory-daemon.sh start historyserver
Hadoop shutdown commands:
stop-dfs.sh
stop-yarn.sh
mr-jobhistory-daemon.sh stop historyserver
PS: If one or two machines in the cluster fail to start, first try deleting the Hadoop temporary files:
cd /usr/local
sudo rm -r ./hadoop/tmp
sudo rm -r ./hadoop/logs/*
Then execute
hdfs namenode -format
and start again.
This article is based on the following site; the setup was tested successfully:
http://www.powerxing.com/install-hadoop-cluster/
This article is from the "Zen Sword as" blog; please be sure to keep this source: http://yanconggod.blog.51cto.com/1351649/1884998
Hadoop & Spark installation (Part 1)