Installing Hadoop in Fully Distributed Mode on a Large Production Cluster
2013-3-7
Installation Environment
Platform: VMware 2
Operating system: Oracle Enterprise Linux 5.6
Software versions: hadoop-0.20.2, jdk-6u18
Cluster architecture: one master node (hotel01) and several slave nodes (hotel02, hotel03, ...)
| Host name | IP            | System version | Hadoop node | Hadoop processes     |
|-----------|---------------|----------------|-------------|----------------------|
| hotel01   | 192.168.2.111 | OEL 5.6        | Master      | NameNode, JobTracker |
| hotel02   | 192.168.2.112 | OEL 5.6        | Slave       | DataNode, TaskTracker|
| hotel03   | 192.168.2.113 | OEL 5.6        | Slave       | DataNode, TaskTracker|
| ...       |               |                |             |                      |
Note: only three hosts are used for this Hadoop test, but a real production cluster may contain hundreds of hosts or more. The steps below are therefore written, as far as possible, from the perspective of a large cluster, minimizing the manual operations performed on each individual server, since at that scale every per-server operation becomes a huge undertaking.
Installation Steps
1. Download Hadoop and the JDK:
http://mirror.bit.edu.cn/apache/hadoop/common/
e.g. hadoop-0.20.2
2. Configure DNS to resolve host names
Note: a production Hadoop cluster may contain many servers. Mapping machine names through DNS, rather than through /etc/hosts, avoids maintaining a hosts file on every node: when a new node joins, there is no need to update the hostname-to-IP mapping on each existing node. This reduces configuration steps and time, and simplifies management.
For detailed steps, see:
"Hadoop Study Notes - DNS Configuration"
http://blog.csdn.net/lichangzai/article/details/8645524
Configuration note: the DNS server runs on the hotel01 (master) node and resolves the host names of the hotel01, hotel02, and hotel03 nodes.
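For a small test cluster like the three nodes here, an /etc/hosts mapping is a workable fallback to DNS. A minimal sketch of the entries, written to a temporary file for review (in a real deployment these lines would be appended to /etc/hosts on every node):

```shell
# Hostname/IP entries taken from the cluster table above. DNS avoids
# having to maintain this file on every node as the cluster grows.
cat > /tmp/hadoop-hosts <<'EOF'
192.168.2.111  hotel01.licz.com  hotel01
192.168.2.112  hotel02.licz.com  hotel02
192.168.2.113  hotel03.licz.com  hotel03
EOF
cat /tmp/hadoop-hosts
```
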
3. Create the Hadoop user account
Create the user account that Hadoop will run under on all nodes:
[root@gc ~]# groupadd hadoop
[root@gc ~]# useradd -g hadoop grid    -- note: specify the group here, otherwise SSH trust may fail to build later
[root@gc ~]# id grid
uid=501(grid) gid=54326(hadoop) groups=54326(hadoop)
[root@gc ~]# passwd grid
Changing password for user grid.
New UNIX password:
BAD PASSWORD: it is too short
Retype new UNIX password:
passwd: all authentication tokens updated successfully.
Note: in a large Hadoop cluster, this step can instead be completed once in the base Linux image before the system is replicated to all machines (untested; a cloning tool such as Ghost should make this possible).
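To avoid logging in to every server, the account-creation commands can also be generated in batch from the slaves file, in the same spirit as the scp.sh script in step 7. A sketch (hostnames are this guide's examples; the generated script assumes root SSH access to each node and is only printed here for review, not executed):

```shell
# Build the node list (in practice, read it from hadoop-0.20.2/conf/slaves).
printf '%s\n' hotel02.licz.com hotel03.licz.com > /tmp/slaves

# Generate one ssh command per node that creates the group and the user.
awk '{print "ssh root@" $1 " \"groupadd hadoop; useradd -g hadoop grid\""}' \
    /tmp/slaves > /tmp/create_users.sh
chmod u+x /tmp/create_users.sh

# Review the generated commands before running the script.
cat /tmp/create_users.sh
```
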
4. Configure passwordless SSH via NFS
Note: with the authorized_keys file shared over NFS, a newly added node no longer has to distribute its public key to every other node separately. It simply appends its key to the shared authorized_keys file, and all other nodes point directly at that one, always-current file. This simplifies both public-key distribution and management.
For detailed steps, see:
"Hadoop Study Notes - NFS Configuration"
http://blog.csdn.net/lichangzai/article/details/8646227
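The layout this NFS approach produces can be sketched locally (the /tmp paths and key strings below are placeholders standing in for the real NFS mount and real public keys): every node appends its key to one shared authorized_keys file, and each node's ~/.ssh/authorized_keys is a symlink to it.

```shell
# /tmp/nfs-share stands in for the NFS mount; /tmp/grid-home for ~grid.
share=/tmp/nfs-share
home=/tmp/grid-home
mkdir -p "$share" "$home/.ssh"

# Each node appends its own public key to the shared file once.
echo "ssh-rsa PLACEHOLDER_KEY_1 grid@hotel02" >> "$share/authorized_keys"
echo "ssh-rsa PLACEHOLDER_KEY_2 grid@hotel03" >> "$share/authorized_keys"
chmod 600 "$share/authorized_keys"

# Every node links its authorized_keys to the shared copy, so a new
# node's key becomes visible cluster-wide with a single append.
ln -sf "$share/authorized_keys" "$home/.ssh/authorized_keys"
cat "$home/.ssh/authorized_keys"
```
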
5. Unpack the Hadoop installation package
-- Unpack and configure on one node first
[grid@hotel01 ~]$ ls -l
total 43580
-rw-r--r-- 1 grid hadoop 44575568 2012-11-19 hadoop-0.20.2.tar.gz
[grid@hotel01 ~]$ tar xzvf /home/grid/hadoop-0.20.2.tar.gz
[grid@hotel01 ~]$ ls -l
total 43584
drwxr-xr-x grid hadoop     4096 2010-02-19 hadoop-0.20.2
-rw-r--r-- 1 grid hadoop 44575568 2012-11-19 hadoop-0.20.2.tar.gz
-- Install the JDK on each node
[root@hotel01 ~]# ./jdk-6u18-linux-x64-rpm.bin
6. Edit the Hadoop configuration files
• Configure hadoop-env.sh
[root@gc conf]# pwd
/root/hadoop-0.20.2/conf
-- Set the JDK installation path
[root@gc conf]# vi hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.6.0_18
• Configure the NameNode: edit the site files
-- Edit core-site.xml
[grid@hotel01 conf]$ vi core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://hotel01.licz.com:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/grid/hadoop/tmp</value>
</property>
</configuration>
Note: fs.default.name is the NameNode's address (host and port). In fully distributed mode it must not be localhost; use the master node's IP address or host name.
-- Edit hdfs-site.xml
[grid@hotel01 hadoop-0.20.2]$ mkdir data
[grid@hotel01 conf]$ vi hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.data.dir</name>
<value>/home/grid/hadoop-0.20.2/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
Note: the dfs.data.dir directory must already exist and be readable and writable by the grid user.
-- Edit mapred-site.xml
[grid@hotel01 conf]$ vi mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>hotel01.licz.com:9001</value>
</property>
</configuration>
• Configure the masters and slaves files
[grid@hotel01 conf]$ vi masters
hotel01.licz.com
[grid@hotel01 conf]$ vi slaves
hotel02.licz.com
hotel03.licz.com
7. Copy Hadoop to each node (with an awk-generated script)
-- Copy the configured Hadoop directory from the hotel01.licz.com host to every node.
-- The straightforward way is one scp command per node:
[grid@hotel01 conf]$ scp -rp hadoop-0.20.2 hotel02.licz.com:/home/grid/
[grid@hotel01 conf]$ scp -rp hadoop-0.20.2 hotel03.licz.com:/home/grid/
-- From a large-cluster perspective, however, the method above is time-consuming. Instead, use awk to generate a batch script from the slaves file and run it once:
[grid@hotel01 ~]$ cat hadoop-0.20.2/conf/slaves | awk '{print "scp -rp hadoop-0.20.2 grid@"$1":/home/grid"}' > scp.sh
[grid@hotel01 ~]$ chmod u+x scp.sh
[grid@hotel01 ~]$ cat scp.sh
scp -rp hadoop-0.20.2 grid@hotel02.licz.com:/home/grid
scp -rp hadoop-0.20.2 grid@hotel03.licz.com:/home/grid
[grid@hotel01 ~]$ ./scp.sh
8. Format the NameNode
-- Format the NameNode on the master node
[grid@hotel01 bin]$ pwd
/home/grid/hadoop-0.20.2/bin
[grid@hotel01 bin]$ ./hadoop namenode -format
12/10/31 08:03:31 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = gc.localdomain/192.168.2.100
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri 08:07:34 UTC 2010
************************************************************/
12/10/31 08:03:31 INFO namenode.FSNamesystem: fsOwner=grid,hadoop
12/10/31 08:03:31 INFO namenode.FSNamesystem: supergroup=supergroup
12/10/31 08:03:31 INFO namenode.FSNamesystem: isPermissionEnabled=true
12/10/31 08:03:32 INFO common.Storage: Image file of size 94 saved in 0 seconds.
12/10/31 08:03:32 INFO common.Storage: Storage directory /tmp/hadoop-grid/dfs/name has been successfully formatted.
12/10/31 08:03:32 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at gc.localdomain/192.168.2.100
************************************************************/
9. Start Hadoop
-- Start the Hadoop daemons on the master node
[grid@hotel01 bin]$ pwd
/home/grid/hadoop-0.20.2/bin
[grid@hotel01 bin]$ ./start-all.sh
starting namenode, logging to /home/grid/hadoop-0.20.2/bin/../logs/hadoop-grid-namenode-gc.localdomain.out
rac2: starting datanode, logging to /home/grid/hadoop-0.20.2/bin/../logs/hadoop-grid-datanode-rac2.localdomain.out
rac1: starting datanode, logging to /home/grid/hadoop-0.20.2/bin/../logs/hadoop-grid-datanode-rac1.localdomain.out
The authenticity of host 'gc (192.168.2.100)' can't be established.
RSA key fingerprint is 8e:47:42:44:bd:e2:28:64:10:40:8e:b5:72:f9:6c:82.
Are you sure you want to continue connecting (yes/no)? yes
gc: Warning: Permanently added 'gc,192.168.2.100' (RSA) to the list of known hosts.
gc: starting secondarynamenode, logging to /home/grid/hadoop-0.20.2/bin/../logs/hadoop-grid-secondarynamenode-gc.localdomain.out
starting jobtracker, logging to /home/grid/hadoop-0.20.2/bin/../logs/hadoop-grid-jobtracker-gc.localdomain.out
rac2: starting tasktracker, logging to /home/grid/hadoop-0.20.2/bin/../logs/hadoop-grid-tasktracker-rac2.localdomain.out
rac1: starting tasktracker, logging to /home/grid/hadoop-0.20.2/bin/../logs/hadoop-grid-tasktracker-rac1.localdomain.out
10. Use jps to verify that the daemons started successfully
-- Check the daemons on the master node
[grid@hotel01 bin]$ /usr/java/jdk1.6.0_18/bin/jps
27462 NameNode
29012 Jps
27672 JobTracker
27607 SecondaryNameNode
-- Check the daemons on the slave nodes
[grid@rac1 conf]$ /usr/java/jdk1.6.0_18/bin/jps
16722 Jps
16672 TaskTracker
16577 DataNode
[grid@rac2 conf]$ /usr/java/jdk1.6.0_18/bin/jps
31451 DataNode
31547 TaskTracker
31608 Jps
11. Monitor Hadoop activity through the web interfaces
Monitor the JobTracker by pointing a browser at port 50030 on the JobTracker node:
http://192.168.2.111:50030/jobtracker.jsp
Monitor the cluster (HDFS) by pointing a browser at port 50070 on the NameNode node:
http://192.168.2.111:50070/dfshealth.jsp
12. Problems encountered during installation
1) SSH trust could not be established
When the user was created without specifying a group, SSH trust could not be built. The failing steps were:
[root@gc ~]# useradd grid
[root@gc ~]# passwd grid
Solution:
Create a group first, then create the user in that group:
[root@gc ~]# groupadd hadoop
[root@gc ~]# useradd -g hadoop grid
[root@gc ~]# id grid
uid=501(grid) gid=54326(hadoop) groups=54326(hadoop)
[root@gc ~]# passwd grid
2) After starting Hadoop, the slave nodes had no DataNode process
Symptom:
After Hadoop was started from the master node, the master's processes were normal, but the slave nodes had no DataNode process.
-- Master node normal:
[grid@hotel01 bin]$ /usr/java/jdk1.6.0_18/bin/jps
29843 Jps
29703 JobTracker
29634 SecondaryNameNode
29485 NameNode
-- Checking the two slave nodes showed no DataNode process:
[grid@rac1 bin]$ /usr/java/jdk1.6.0_18/bin/jps
5528 Jps
3213 TaskTracker
[grid@rac2 bin]$ /usr/java/jdk1.6.0_18/bin/jps
30518 TaskTracker
30623 Jps
Cause:
-- The DataNode log on a slave node shows why the process failed to start:
[grid@rac1 logs]$ pwd
/home/grid/hadoop-0.20.2/logs
[grid@rac1 logs]$ more hadoop-grid-datanode-rac1.localdomain.log
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = rac1.localdomain/192.168.2.101
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri 08:07:34 UTC 2010
************************************************************/
2012-11-18 07:43:33,513 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Invalid directory in dfs.data.dir: can not create directory: /usr/hadoop-0.20.2/data
2012-11-18 07:43:33,513 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: All directories in dfs.data.dir are invalid.
2012-11-18 07:43:33,571 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at rac1.localdomain/192.168.2.101
************************************************************/
-- The data directory configured in hdfs-site.xml had not been created.
Solution:
Create the HDFS data directory on every node, and make sure the path matches the hdfs-site.xml parameter:
[grid@hotel01 ~]$ mkdir -p /home/grid/hadoop-0.20.2/data
[grid@hotel01 conf]$ vi hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.data.dir</name>
<value>/home/grid/hadoop-0.20.2/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
Note: this directory must exist on each node and be readable and writable by the grid user.
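A single mkdir only covers one node, and the fix must reach every DataNode. Reusing the batching idea from step 7, a sketch that writes the per-slave commands to a script for review (hostnames and the data path are this guide's examples; the script is printed here, not executed):

```shell
# Emit one remote mkdir per slave so the fix can be reviewed, then run once.
for host in hotel02.licz.com hotel03.licz.com; do
    echo "ssh grid@$host 'mkdir -p /home/grid/hadoop-0.20.2/data'"
done > /tmp/mkdir_data.sh
chmod u+x /tmp/mkdir_data.sh
cat /tmp/mkdir_data.sh
```
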
-- Restart Hadoop; the slave processes come up normally:
[grid@hotel01 bin]$ ./stop-all.sh
[grid@hotel01 bin]$ ./start-all.sh