This article walks through the installation and configuration of a Hadoop cluster composed of three machines. The three machines are as follows:
Hostname  Role    IP address
DC01      master  192.168.68.57
dc02      slave   192.168.68.58
dc03      slave   192.168.68.59
All operations below are performed as the root user.
1. Install JDK 1.6 and configure the Java environment. For details, refer to: Install JDK and Tomcat 7 on CentOS 6.
2. Configure SSH password-less login on CentOS:
ssh-keygen -t dsa -P "" -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
Test whether the configuration is successful:
ssh localhost
The first run will ask whether you want to continue connecting; type yes and press Enter.
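To confirm that the login really is non-interactive, you can force SSH to fail rather than prompt for a password (a quick sanity check, assuming the default OpenSSH client):
# BatchMode=yes makes ssh error out instead of asking for a password
ssh -o BatchMode=yes localhost hostname
If this prints the machine name without any prompt, key-based login is working.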
Modify the machine name (taking DC01 as an example; the method is the same for the other two machines):
vi /etc/sysconfig/network-scripts/ifcfg-eth0
Add a line:
HOSTNAME=DC01
Save
vi /etc/sysconfig/network
Change HOSTNAME to DC01
Save
vi /etc/hosts
Add the following entries:
192.168.68.57 DC01
192.168.68.58 dc02
192.168.68.59 dc03
Save
At this point the hostname has been modified; restart the machine for the change to take effect.
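If you want the new name to apply to the current session without waiting for a reboot, the hostname command can set it on the spot (a temporary change; the file edits above are still what makes it permanent):
hostname DC01
hostname
The second command should now print DC01.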
Configure DC01 to log in to dc02 and dc03 without a password:
On DC01:
scp ~/.ssh/authorized_keys dc02:~/.ssh/
scp ~/.ssh/authorized_keys dc03:~/.ssh/
Test whether the password-less login configuration is successful:
ssh dc02
ssh dc03
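If the slaves still ask for a password, the usual culprit is SSH's permission check on the key files. A quick fix, assuming ~/.ssh already exists on dc02 and dc03:
ssh dc02 "chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys"
ssh dc03 "chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys"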
Download Hadoop; hadoop-1.0.3 is used here:
tar -xzvf hadoop-1.0.3.tar.gz
mv hadoop-1.0.3 /usr/local/hadoop
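To check that the unpacked tree is intact before configuring anything, you can ask Hadoop for its version (this assumes the JDK from step 1 is already on the PATH):
/usr/local/hadoop/bin/hadoop version
It should report 1.0.3.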
3. Configure Hadoop:
echo $JAVA_HOME
Copy the value of JAVA_HOME, then modify hadoop-env.sh:
vi /usr/local/hadoop/conf/hadoop-env.sh
Find:
# The java implementation to use.  Required.
# export JAVA_HOME=/usr/lib/j2sdk1.5-sun
Add a line below it:
export JAVA_HOME=/usr/java/jdk1.6.0_37
# Add the HBase jars to Hadoop's classpath:
# Set the HBase installation directory:
export HBASE_HOME=/usr/local/hbase
# Extra Java CLASSPATH elements.  Optional.
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HBASE_HOME/hbase-0.94.1.jar:$HBASE_HOME/hbase-0.94.1-test.jar:$HBASE_HOME/conf:${HBASE_HOME}/lib/zookeeper-3.4.3.jar:${HBASE_HOME}/lib/protobuf-java-2.4.0a.jar:${HBASE_HOME}/lib/*
Save
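To verify the edits without starting anything, you can source the file and echo the variables back (a simple sanity check; the paths shown are the ones configured above):
. /usr/local/hadoop/conf/hadoop-env.sh
echo $JAVA_HOME
echo $HADOOP_CLASSPATH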
Modify core-site.xml:
vi /usr/local/hadoop/conf/core-site.xml
Change the file to the following:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://DC01:9000</value>
</property>
</configuration>
Save
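fs.default.name tells every daemon and client where the NameNode lives, and a stray character in the XML will keep the daemons from starting, so it is worth checking the file is well formed (this assumes xmllint from libxml2 is installed, which it usually is on CentOS):
xmllint --noout /usr/local/hadoop/conf/core-site.xml
No output means the XML parses cleanly.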
Modify mapred-site.xml:
vi /usr/local/hadoop/conf/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>DC01:9001</value>
</property>
</configuration>
Save
Modify masters:
vi /usr/local/hadoop/conf/masters
Change it to:
DC01
Save
Modify slaves:
vi /usr/local/hadoop/conf/slaves
Change it to:
dc02
dc03
Save
Modify hdfs-site.xml:
vi /usr/local/hadoop/conf/hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.name.dir</name>
<value>/opt/hdfs/name</value>
<final>true</final>
</property>
<property>
<name>dfs.data.dir</name>
<value>/opt/hdfs/data</value>
<final>true</final>
</property>
</configuration>
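dfs.name.dir and dfs.data.dir must point at directories the daemons can write to, so create them up front on every node (the name directory matters on DC01 and the data directory on dc02 and dc03; creating both everywhere is harmless):
mkdir -p /opt/hdfs/name /opt/hdfs/data
ssh dc02 "mkdir -p /opt/hdfs/name /opt/hdfs/data"
ssh dc03 "mkdir -p /opt/hdfs/name /opt/hdfs/data"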
At this point DC01 is fully configured. Copy the Hadoop configuration files from DC01 to the other two machines:
scp /usr/local/hadoop/conf/* dc02:/usr/local/hadoop/conf/
scp /usr/local/hadoop/conf/* dc03:/usr/local/hadoop/conf/
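To be sure the copies landed, you can diff one of the files against the local original (any of the edited files will do; core-site.xml is used here as an example):
ssh dc02 "cat /usr/local/hadoop/conf/core-site.xml" | diff - /usr/local/hadoop/conf/core-site.xml
No output means the two copies are identical.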
Now that the configuration of the entire Hadoop cluster is complete, start Hadoop.
4. Start Hadoop:
cd /usr/local/hadoop/bin/
./hadoop namenode -format
./start-all.sh
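To confirm the daemons actually came up, jps (shipped with the JDK) lists the running Java processes. Roughly, you would expect something like this:
jps
# On DC01: NameNode, SecondaryNameNode, JobTracker
# On dc02/dc03: DataNode, TaskTracker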
Shut down the system firewall (it is best to stop the firewall service directly; alternatively you can open the specific ports that need to pass through, though I have not had time to track down the full list):
service iptables stop
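Stopping the service only lasts until the next reboot; on CentOS you can also disable it at boot (optional, and obviously not advisable on a machine exposed to untrusted networks):
chkconfig iptables off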
To view the status of the Hadoop cluster from a web page, add firewall rules for the web UI ports:
vi /etc/sysconfig/iptables
Add the following two lines:
-A INPUT -m state --state NEW -m tcp -p tcp --dport 50030 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 50070 -j ACCEPT
Save.
Restart the iptables service:
service iptables restart
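You can confirm the two rules are active before trying the browser (the grep pattern below just matches the two ports):
iptables -L -n | grep -E "50030|50070"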
Enter the following address in the browser to view the cluster (HDFS) status:
http://192.168.68.57:50070
Enter the following address in the browser to view the Map/Reduce status:
http://192.168.68.57:50030
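The same information is available from the command line, which is handy when the browser cannot reach the machine (a standard Hadoop 1.x command, run from /usr/local/hadoop/bin):
./hadoop dfsadmin -report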
When I first opened the pages they sat at an initializing message and never moved; I did not know why at the time (solved: turn off the firewall).
After success: (screenshot of the cluster status page)
To be continued