Hadoop 2.6 cluster installation
Basic Environment
Sshd Configuration
Directory: /root/.ssh
The configuration involves four shell steps.
1. Run on each machine:
ssh-keygen -t rsa
This generates an SSH key pair. The generated files are:
id_rsa
id_rsa.pub
The .pub file is the public key; the file without .pub is the private key.
2. Run on each machine:
cp id_rsa.pub authorized_keys
Authorized_keys Error
3. Copy the public key to every other machine; do this once on each machine.
scp id_rsa.pub hadoop262:/root/.ssh/hadoop261.pub
This copies the local public key to the /root/.ssh directory on hadoop262, where the new file is named hadoop261.pub.
4. Append the public keys of the other machines to the local authorized_keys file, after which login works without a password.
cat hadoop261.pub >> authorized_keys
Besides the machine's own id_rsa.pub, append the xxxxx.pub file of every other machine in the same way.
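Steps 2 to 4 can be folded into one small helper. This is only a sketch: the function name collect_keys is made up for illustration, and it assumes all collected .pub files already sit in one directory (/root/.ssh in this article). It appends rather than overwrites, and skips keys that are already present.

```shell
#!/bin/sh
# Hypothetical helper: append every *.pub file in a directory to
# authorized_keys. The directory defaults to /root/.ssh as in the article.
collect_keys() {
    ssh_dir="${1:-/root/.ssh}"
    auth="$ssh_dir/authorized_keys"
    touch "$auth"
    for pub in "$ssh_dir"/*.pub; do
        [ -f "$pub" ] || continue
        # Skip keys that are already listed so reruns do not duplicate lines.
        if ! grep -qxF "$(cat "$pub")" "$auth"; then
            cat "$pub" >> "$auth"
        fi
    done
    # With StrictModes (the sshd default), overly open permissions can
    # cause sshd to ignore authorized_keys.
    chmod 700 "$ssh_dir"
    chmod 600 "$auth"
}

# Usage (as root on each machine): collect_keys /root/.ssh
```

Afterwards, running ssh hadoop262 from hadoop261 should not ask for a password if the keys were distributed correctly.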
Note:
The host mappings in /etc/hosts, the mapping for the local machine itself, and the local hostname in /etc/sysconfig/network generally cause no problems as long as they do not conflict with each other. If they do conflict, the result is confusing errors.
My configuration:
/etc/hosts --> configuration
# Master hadoop localhost
192.168.121.218 hadoop261
# Slave1 hadoop
192.168.121.228 hadoop262
# Slave2 hadoop
192.168.121.238 hadoop263
127.0.0.1 localhost.localdomain hadoop261
/etc/sysconfig/network --> configuration
NETWORKING=yes
HOSTNAME=hadoop261
GATEWAY=192.168.121.1
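To look for the kind of conflict the note warns about, a hosts file can be scanned for names that appear under more than one address. The sketch below is an illustration only; find_host_conflicts is a hypothetical name. Whether a reported name actually causes trouble depends on the setup.

```shell
#!/bin/sh
# Hypothetical helper: print each hostname that is mapped to more than one
# address in a hosts file (default: /etc/hosts).
find_host_conflicts() {
    awk '
        { sub(/#.*/, "") }              # strip comments; fields are re-split
        NF >= 2 {
            for (i = 2; i <= NF; i++) {
                if (!($i in addr)) addr[$i] = $1
                else if (addr[$i] != $1) dup[$i] = 1
            }
        }
        END { for (h in dup) print h }
    ' "${1:-/etc/hosts}" | sort
}

# Usage: find_host_conflicts /etc/hosts
```

An empty result means every name in the file maps to exactly one address.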
FTP configuration
I only want to upload tools conveniently, not run a professional FTP server, so I log in and upload as root. For that, root only needs to be removed from the user lists that vsftpd blocks, the firewall opened or disabled, and SELinux disabled.
In the following files the first entry is usually root; comment it out with #. The other entries do not need to change.
vim /etc/vsftpd/user_list
vim /etc/vsftpd/ftpusers
Run setup
Disable the firewall:
Use the arrow keys and press Enter to confirm. In the dialog that appears, remove the *, which marks the firewall as not running. Next, select [System Service] and check vsftpd first.
A * means the service runs; no * means it does not run. iptables can also be selected in this list; it is usually present.
Select vsftpd so that it starts with the system.
Disable SELinux:
Temporarily disable:
setenforce 0
Permanently disable (takes effect after a reboot):
sed -i 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
getenforce reports the current SELinux status:
Disabled means SELinux is permanently disabled.
Permissive means it is temporarily disabled.
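The permanent change can also be made with a backup, so the original file can be restored. This is a sketch under assumptions: set_selinux_disabled is a made-up name, and unlike the sed command above it anchors the pattern to the start of the line.

```shell
#!/bin/sh
# Hypothetical helper: switch SELINUX=enforcing to SELINUX=disabled in a
# config file (default: /etc/selinux/config), keeping a .bak copy.
set_selinux_disabled() {
    cfg="${1:-/etc/selinux/config}"
    cp "$cfg" "$cfg.bak"
    # Anchored with ^ so that commented-out lines are left alone.
    sed 's/^SELINUX=enforcing/SELINUX=disabled/' "$cfg.bak" > "$cfg"
    # Show the resulting setting for a quick visual check.
    grep '^SELINUX=' "$cfg"
}

# Usage (as root): set_selinux_disabled /etc/selinux/config
```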
If the IP address is configured correctly, you can now start or restart vsftpd. By default, vsftpd is not running.
service vsftpd restart
service vsftpd start
Environment Planning
Note:
Before installation, plan the host names, IP addresses, Hadoop directory, and Java directory.
Unpack the archives and move (rename) the directories:
JAVA_HOME
/home/jdk1.7
HADOOP_HOME
/home/hadoop_app/hadoop-2.6.0
Environment variable configuration
Repeat on all machines:
vim /etc/profile
export JAVA_HOME=/home/jdk1.7
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$CLASSPATH
export PATH=$JAVA_HOME/bin:$PATH
# Hadoop
export HADOOP_HOME=/home/hadoop/hadoop-2.6.0
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
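After editing /etc/profile it is worth confirming that each variable points at a directory that really contains the expected program. A minimal sketch, with check_home as a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: report whether <dir>/bin/<exe> exists and is
# executable for a given variable name.
check_home() {
    name="$1"; dir="$2"; exe="$3"
    if [ -x "$dir/bin/$exe" ]; then
        echo "$name ok: $dir"
    else
        echo "$name problem: $dir/bin/$exe not found or not executable"
        return 1
    fi
}

# Usage, after running `source /etc/profile`:
#   check_home JAVA_HOME "$JAVA_HOME" java
#   check_home HADOOP_HOME "$HADOOP_HOME" hadoop
```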
Hostname configuration:
vim /etc/hosts
Repeat on all machines:
# Master hadoop
192.168.121.218 hadoop261
# Slave1 hadoop
192.168.121.228 hadoop262
# Slave2 hadoop
192.168.121.238 hadoop263
127.0.0.1 localhost.localdomain hadoop262
#::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
/etc/sysconfig/network
NETWORKING=yes
HOSTNAME=hadoop262
GATEWAY=192.168.121.1
Hadoop Configuration
core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://192.168.121.218:9000</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>4096</value>
  </property>
</configuration>
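To double-check what a site file actually sets, a property value can be pulled out with a few lines of awk. This is a rough sketch, not a real XML parser: get_property is a hypothetical name, it assumes each name and value element sits on a single line, and it would also match properties inside XML comments.

```shell
#!/bin/sh
# Hypothetical helper: print the value of a property from a *-site.xml file.
# $1 = property name, $2 = path to the file.
get_property() {
    awk -v want="$1" '
        /<name>/ {
            n = $0
            sub(/.*<name>[ \t]*/, "", n); sub(/[ \t]*<\/name>.*/, "", n)
            found = (n == want)
        }
        found && /<value>/ {
            v = $0
            sub(/.*<value>[ \t]*/, "", v); sub(/[ \t]*<\/value>.*/, "", v)
            print v
            exit
        }
    ' "$2"
}

# Usage: get_property fs.defaultFS "$HADOOP_HOME/etc/hadoop/core-site.xml"
```

Once Hadoop is on the PATH, hdfs getconf -confKey fs.defaultFS should report the value that Hadoop itself resolves, which is the more reliable check.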
hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at
    http://www.apache.org/licenses/LICENSE-2.0
  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///home/hadoop_app/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///home/hadoop_app/dfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <!--
  <property>
    <name>dfs.nameservices</name>
    <value>hadoop-cluster1</value>
  </property>
  -->
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>192.168.121.218:50090</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>
mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
    <final>true</final>
  </property>
  <property>
    <name>mapreduce.jobtracker.http.address</name>
    <value>192.168.121.218:50030</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>192.168.121.218:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>192.168.121.218:19888</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>http://192.168.121.218:9001</value>
  </property>
</configuration>
yarn-site.xml
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop261</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>192.168.121.218:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>192.168.121.218:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>192.168.121.218:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>192.168.121.218:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>192.168.121.218:8088</value>
  </property>
</configuration>
vi hadoop-env.sh and yarn-env.sh
Add the following environment variable at the beginning. When I tried without it, I got an error saying that JAVA_HOME could not be found.
export JAVA_HOME=/home/java/jdk1.7
# The java implementation to use.
export JAVA_HOME=${JAVA_HOME}
In principle the variable can be read from the environment, but I added it anyway.
Standalone; repeat on all machines.
You can run hadoop dfsadmin -report to check whether an error is reported.
Modify slaves
# hadoop261
hadoop262
hadoop263
As you can see, this is quite simple: whether the setup is a cluster or not is decided only by the slaves file.
A convenient way to configure:
After the configuration is complete on one machine, copy the entire java and hadoop directories to the other machines, keeping the original directory layout, and add the environment variables there.
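The copy step can be scripted as a loop over the slaves. The sketch below is illustrative only: distribute and the DRY_RUN switch are made-up names, it assumes root SSH access as set up earlier, and the paths in the usage line are the ones from the Environment Planning section.

```shell
#!/bin/sh
# Hypothetical helper: copy directories to each host, keeping the same
# parent path. $1 = space-separated host list, remaining args = directories.
# Set DRY_RUN=1 to print the scp commands instead of running them.
distribute() {
    hosts="$1"; shift
    for host in $hosts; do
        for dir in "$@"; do
            parent=$(dirname "$dir")
            if [ "${DRY_RUN:-0}" = "1" ]; then
                echo "scp -r $dir root@$host:$parent/"
            else
                scp -r "$dir" "root@$host:$parent/"
            fi
        done
    done
}

# Usage from the master:
#   distribute "hadoop262 hadoop263" /home/jdk1.7 /home/hadoop_app/hadoop-2.6.0
```

Running it once with DRY_RUN=1 first shows exactly which commands would be executed.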
Format the file system:
hadoop namenode -format
If the NameNode is not running when you enter hadoop dfsadmin -report, the following error is reported:
report: Call From hadoop261/192.168.121.218 to hadoop261:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
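A connection error like this can be narrowed down by testing whether anything is listening on the NameNode port at all. The following is a sketch with assumptions: port_open is a hypothetical name, and it relies on bash's /dev/tcp redirection and the coreutils timeout command being available.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if a TCP connection to host:port succeeds
# within 3 seconds, non-zero otherwise.
# $1 = host, $2 = port.
port_open() {
    timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Usage:
#   port_open hadoop261 9000 && echo "port 9000 reachable" \
#       || echo "port 9000 not reachable"
```

If the port is not reachable, the NameNode is probably not started yet, or it is bound to a different address than the one the hostname resolves to.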
Verify installation:
Start:
sbin/start-all.sh
Enter:
[root@hadoop261 sbin]# jps
56745 Jps
56320 SecondaryNameNode
56465 ResourceManager
56129 NameNode
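Reading the jps listing by eye works, but the check can also be automated. A minimal sketch, with missing_daemons as a hypothetical helper that reads jps output on standard input and prints every expected daemon that is not listed:

```shell
#!/bin/sh
# Hypothetical helper: read `jps` output on stdin and print the names of
# the expected daemons (given as arguments) that are missing.
# Returns 0 when all are present, 1 otherwise.
missing_daemons() {
    running=$(awk '{ print $2 }')
    rc=0
    for d in "$@"; do
        # -x matches whole lines, so NameNode does not match SecondaryNameNode.
        if ! printf '%s\n' "$running" | grep -qx "$d"; then
            echo "$d"
            rc=1
        fi
    done
    return $rc
}

# Usage on the master:
#   jps | missing_daemons NameNode SecondaryNameNode ResourceManager
```

On the slave machines the same check would presumably be run with DataNode and NodeManager as the expected names.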
Open this address in a browser on the local machine:
http://192.168.121.218:8088/cluster
Result
Open the address: http://192.168.121.218:50070/
Datanode Information
I used 64-bit RHEL 6 here. When startup failed, I switched to a 64-bit build of Hadoop 2.6 and adjusted the configuration again.
I first used hostnames in the configuration; where that produced errors, I changed them to IP addresses.
I will continue with wordcount tomorrow. If you are interested, you can join QQ Group 208881891 for discussion.