Original work by Inkfish; do not reprint for commercial purposes, and please credit the source when reproducing (http://blog.csdn.net/inkfish).
Hadoop is an open-source cloud computing platform project under the Apache Foundation; the latest version at the time of writing is Hadoop 0.20.1. The following takes Hadoop 0.20.1 as the basis and describes how to install it on Ubuntu Linux 9.10.
Supported platforms:
Linux: can be used as both a development and a production deployment platform;
Windows: can be used as a development platform.
Prerequisite software:
1. Java 1.6.x must be installed; the JDK released by Sun is recommended;
2. ssh must be installed and sshd must be running, since Hadoop manages its daemons over SSH (a quick check for both prerequisites is sketched after this list);
3. On Windows, Cygwin must also be installed to provide shell support.
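On Ubuntu, a minimal way to confirm these prerequisites, assuming the packages are already installed (the commands are illustrative, not part of the original steps):
$ java -version        # should report a 1.6.x version
$ ps -e | grep sshd    # should show a running sshd process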
Available installation modes:
1. Local (standalone) mode;
2. Pseudo-distributed mode;
3. Fully distributed mode.
Fully distributed mode installation steps (the steps below get Hadoop running; no tuning is covered):
1. Download Hadoop and extract it into the target directory on one server in the cluster.
2. Configure the /etc/hosts file
2.1 Confirm that every server in the cluster has a hostname, and add the hostname-to-IP mapping of every server in the cluster to each server's /etc/hosts file;
2.2 This also speeds up hostname resolution.
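As an illustration with hypothetical hostnames and addresses (adjust to your own cluster), each server's /etc/hosts might contain entries like:
192.168.1.10  master
192.168.1.11  slave1
192.168.1.12  slave2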
3. Configure passwordless SSH login
3.1 Run on each server:
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
3.2 Merge the ~/.ssh/authorized_keys file contents of every server into one combined authorized_keys file;
3.3 scp the combined authorized_keys file to every server, replacing each original authorized_keys file;
3.4 ssh from every machine to every other machine and confirm that SSH login no longer asks for a password.
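A minimal sketch of steps 3.2 and 3.3, assuming the slaves' public keys have already been copied to the master as id_dsa_slave1.pub and id_dsa_slave2.pub (the hostnames and file names are hypothetical):
$ cat id_dsa_slave1.pub id_dsa_slave2.pub >> ~/.ssh/authorized_keys
$ scp ~/.ssh/authorized_keys slave1:~/.ssh/authorized_keys
$ scp ~/.ssh/authorized_keys slave2:~/.ssh/authorized_keys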
4. Synchronize the time on all servers, ensuring that every server keeps the same time.
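One common way to do this (my own assumption, not part of the original steps) is to sync each machine against a public NTP server:
$ sudo ntpdate pool.ntp.org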
5. Configure Hadoop
5.1 Configure the conf/hadoop-env.sh file
Uncomment the JAVA_HOME line and set it to the correct path.
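For example, assuming the Sun JDK lives under /usr/lib/jvm/java-6-sun (this path is an assumption; adjust it to your installation), the line would read:
export JAVA_HOME=/usr/lib/jvm/java-6-sun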
5.2 Configure the conf/core-site.xml file
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://host:9000</value>
  </property>
</configuration>
Note: host here must be replaced with the hostname of the NameNode.
5.3 Configure the conf/hdfs-site.xml file
The defaults can be left as-is if you have nothing to change.
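If you do want to override something, one commonly adjusted property (shown purely as an illustration, not required by this guide) is the HDFS block replication factor:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>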
5.4 Configure the conf/mapred-site.xml file
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
Note: localhost here must be replaced with the hostname of the JobTracker node (in this guide it is the same machine as the NameNode).
6. Configure the conf/masters and conf/slaves files
In the slaves file, write the hostnames or IPs of the DataNodes; in the masters file, write the hostnames or IPs of the NameNode and secondary NameNode. Write one server per line; lines beginning with # are treated as comments. An example of both files follows.
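An illustrative layout with hypothetical hostnames (one entry per line):
conf/masters:
master
conf/slaves:
slave1
slave2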
7. Distribute Hadoop
Simply scp the entire Hadoop directory to the same directory on every server in the cluster.
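For example, assuming the installation sits in /home/hadoop/hadoop-0.20.1 and a slave is named slave1 (both the path and the hostname are assumptions):
$ scp -r /home/hadoop/hadoop-0.20.1 slave1:/home/hadoop/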
8. Format the Hadoop NameNode
Execute the command: $ bin/hadoop namenode -format
9. Start Hadoop
Execute the command: $ bin/start-all.sh
At this point, the fully distributed installation is complete. After startup it usually takes a little while for all servers to be fully recognized (about 5 minutes in my case), so be patient. On the NameNode machine, open a browser and go to http://localhost:50070/ to see the status of the whole HDFS cluster; the JobTracker status for the cluster can be viewed at http://localhost:50030/.
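A quick extra check, not part of the original steps, is to run jps (shipped with the Sun JDK) on each server to confirm the daemons are up; on the master you would expect to see NameNode, SecondaryNameNode, and JobTracker, and on the slaves DataNode and TaskTracker:
$ jps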