This tutorial builds a Hadoop cluster on three Vultr VPS instances and covers Hadoop's two main components: HDFS and MapReduce. The three machines are:
45.32.90.100
45.32.92.47
45.32.89.205
I. Preparation
Log in to all three VPS instances over SSH.
On each machine, set the hostname by editing the following two files:
/etc/hosts
/etc/sysconfig/network
Then append the hostname mappings to the end of /etc/hosts:
45.32.90.100 master.hadoop
45.32.92.47 slave1.hadoop
45.32.89.205 slave2.hadoop
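For example, on the master node /etc/sysconfig/network would contain the following (assuming the hostnames above; run hostname master.hadoop or reboot for the change to take effect):
NETWORKING=yes
HOSTNAME=master.hadoop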
Disable the iptables firewall:
service iptables stop
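To keep iptables disabled across reboots, you can also run:
chkconfig iptables off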
II. Configure SSH
Configure passwordless SSH login using public/private key pairs.
Goal: the master can log in to every slave, every slave can log in to the master, and every machine can log in to itself.
Implementation: use ssh-keygen to generate a key pair, then append the public key id_rsa.pub to ~/.ssh/authorized_keys on each target machine.
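A minimal sketch of the key setup on the master, assuming everything runs as root (ssh-copy-id appends the local public key to the remote authorized_keys; repeat the same pattern on each slave for itself and the master):
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
ssh-copy-id root@master.hadoop
ssh-copy-id root@slave1.hadoop
ssh-copy-id root@slave2.hadoop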
From the master, SSH to itself and to every slave, answering "yes" to accept each host key:
ssh master.hadoop
ssh slave1.hadoop
ssh slave2.hadoop
From slave1, SSH to itself and to the master, answering "yes":
ssh master.hadoop
ssh slave1.hadoop
From slave2, SSH to itself and to the master, answering "yes":
ssh master.hadoop
ssh slave2.hadoop
III. Install the Java JDK
1. Download the Java JDK.
2. Extract it to /usr/lib/jdk.
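A sketch of the extraction step, assuming the JDK 8u66 tarball (matching the version checked below) was downloaded to the current directory:
tar -xzf jdk-8u66-linux-x64.tar.gz
mv jdk1.8.0_66 /usr/lib/jdk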
3. Configure the environment variables in /etc/profile:
export JAVA_HOME=/usr/lib/jdk
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib:$JRE_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
4. Reload the environment variables:
source /etc/profile
5. Check that Java installed successfully:
java -version
If the version number displays correctly, the configuration succeeded:
java version "1.8.0_66"
Java(TM) SE Runtime Environment (build 1.8.0_66-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.66-b17, mixed mode)
IV. Install Hadoop
1. Download Hadoop 1.2.1
wget https://archive.apache.org/dist/hadoop/common/hadoop-1.2.1/hadoop-1.2.1.tar.gz
2. Extract it to /usr/local/hadoop.
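A sketch of the extraction, assuming the tarball from the previous step is in the current directory:
tar -xzf hadoop-1.2.1.tar.gz
mv hadoop-1.2.1 /usr/local/hadoop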
3. Create the filesystem working directory /usr/local/hadoop/tmp:
mkdir /usr/local/hadoop/tmp
4. Configure the environment variables in /etc/profile:
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
Then reload the environment variables:
source /etc/profile
5. Check that Hadoop installed successfully:
hadoop version
V. Configure Hadoop
Enter the /usr/local/hadoop/conf directory.
1. Edit the masters file:
master.hadoop
2. Edit the slaves file:
slave1.hadoop
slave2.hadoop
3. Edit hadoop-env.sh and add the Java JDK path:
export JAVA_HOME=/usr/lib/jdk
4. Edit core-site.xml:
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/tmp</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master.hadoop:9000</value>
  </property>
</configuration>
5. Edit hdfs-site.xml. (Because there are only two slaves, dfs.replication is set to 1 here; with more machines the value can be raised.)
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
6. Edit mapred-site.xml. (The JobTracker address takes the form host:port, without a URL scheme.)
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>master.hadoop:9001</value>
  </property>
</configuration>
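Note that Hadoop 1.x expects the same installation and configuration on every node, so repeat the installation and configuration steps on each slave. A sketch of syncing the whole tree from the master instead, assuming root SSH access and the paths used above:
scp -r /usr/local/hadoop root@slave1.hadoop:/usr/local/
scp -r /usr/local/hadoop root@slave2.hadoop:/usr/local/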
VI. Start Hadoop
1. Format the HDFS filesystem (only needed once). Enter the /usr/local/hadoop/bin directory and run:
/usr/local/hadoop/bin/hadoop namenode -format
After a successful format, the /usr/local/hadoop/tmp directory will contain two subdirectories, dfs and mapred.
2. Start Hadoop:
/usr/local/hadoop/bin/start-all.sh
3. Stop Hadoop:
/usr/local/hadoop/bin/stop-all.sh
4. Check Hadoop's running state. On the master, run jps; the output should look like:
3798 Jps
3450 NameNode
3690 JobTracker
3596 SecondaryNameNode
On each slave, run jps:
2148 Jps
1988 DataNode
2072 TaskTracker
VII. Task Monitoring
1. HDFS status. In a browser, open:
http://<master IP>:50070
Click "Live Nodes" to view the datanodes.
As a test, create a 500 MB file and upload it to the HDFS filesystem:
dd if=/dev/zero of=/root/test bs=1k count=512000
hadoop fs -put ~/test test
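To confirm the upload, list the HDFS home directory (the file name test comes from the example above):
hadoop fs -ls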
2. MapReduce job status. In a browser, open:
http://<master IP>:50030
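To see a job appear in the JobTracker UI, you can run one of the examples bundled with the 1.2.1 tarball (jar path assumes the install location above):
hadoop jar /usr/local/hadoop/hadoop-examples-1.2.1.jar pi 10 100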