Hadoop Installation Memo


Refer to Liu Peng's book "Hadoop in Action" (实战Hadoop), which targets Hadoop 0.20.2; the notes below record a few points of attention from that installation procedure.

First, an overview of the background daemons in Hadoop: the NameNode, Secondary NameNode, JobTracker, TaskTracker, and DataNode.

NameNode: decides how files are split into data blocks and which nodes each block is placed on, and centrally manages the HDFS metadata; it is memory- and I/O-intensive.

This daemon is deployed on the master node and is a single point of failure: if it goes down, the whole cluster goes down.

Secondary NameNode: an auxiliary to the NameNode, one per cluster. It communicates with the NameNode and periodically saves snapshots of the HDFS metadata; when the NameNode fails, these snapshots can be used to help recover it (note that it is not a hot standby). It is also deployed on the master node.

JobTracker: responsible for scheduling jobs. It decides which nodes process which files, and listens for heartbeats sent by the TaskTrackers; when a TaskTracker's heartbeat stops arriving, its tasks are considered failed and are rescheduled elsewhere. There is only one JobTracker per cluster, and it is deployed on the master node.

The three daemons above are deployed on the master node, while a TaskTracker and a DataNode must be deployed on every slave node in the cluster.

DataNode: reads and writes HDFS data blocks on the node's local file system. When a client reads or writes a data block, the NameNode tells it which DataNode to use, and the client then communicates directly with that DataNode to operate on the block.

TaskTracker: also runs on the slave nodes and is responsible for executing individual tasks. Each slave node runs exactly one TaskTracker, but a TaskTracker can spawn multiple Java virtual machines to process map and reduce tasks in parallel. The TaskTracker also interacts with the JobTracker: the JobTracker assigns tasks and monitors TaskTracker heartbeats; if a heartbeat stops, that TaskTracker is considered to have crashed and its tasks are reassigned to other TaskTrackers.

The deployment diagram for each process is as follows:

[Figure: deployment diagram of the Hadoop daemons across the master and slave nodes]
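In Hadoop 1.x this master/slave layout is driven by two files under conf/. A minimal sketch for one master and two slaves is shown below; the hostnames master, slave1, and slave2 are placeholders, and the NameNode and JobTracker start on whichever host runs start-all.sh.

```
# conf/masters — host that runs the Secondary NameNode
master

# conf/slaves — hosts that each run a DataNode and a TaskTracker
slave1
slave2
```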

The detailed installation steps can be found in the reference linked below, but a few points deserve attention.

On both the master and the slaves, create a dedicated grid user to run Hadoop, and set up passwordless SSH login between the nodes (see http://chenlb.iteye.com/blog/211809 for reference). Merge the contents of every machine's public key file into a single authorized_keys file so that all the nodes can SSH into each other without a password.
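The key-merging step above can be sketched as follows. This is written against a scratch directory so it is safe to run on one machine; on a real cluster, replace SSH_DIR with ~/.ssh of the grid user, run the keygen step on every node, and append every node's public key to the shared authorized_keys file before copying it to all nodes.

```shell
# Scratch stand-in for the grid user's ~/.ssh directory (assumption for the sketch).
SSH_DIR="$(mktemp -d)/.ssh"
mkdir -p "$SSH_DIR"
chmod 700 "$SSH_DIR"

# Each node generates its own key pair with an empty passphrase.
ssh-keygen -t rsa -N "" -f "$SSH_DIR/id_rsa" -q

# Collect the public key(s) of all nodes into one authorized_keys file;
# on the cluster this file is then distributed to every node.
cat "$SSH_DIR"/*.pub >> "$SSH_DIR/authorized_keys"
chmod 600 "$SSH_DIR/authorized_keys"
```

sshd refuses keys when authorized_keys is writable by others, which is why the chmod 600 step matters.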

When starting Hadoop, be sure to log in as the grid user and work from the grid user's home directory. Permissions matter: on both the master and the slaves, the owner of the Hadoop directory must be set to the grid user and group. As root, run chown -R grid:grid /home/grid/hadoop-1.2.1 (replace the path with wherever Hadoop is installed).
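The ownership fix can be sketched like this. On the real cluster the command is run as root as chown -R grid:grid /home/grid/hadoop-1.2.1; here a scratch directory and the current user stand in so the sketch runs without root.

```shell
# Scratch stand-in for the Hadoop install directory (assumption for the sketch).
HADOOP_DIR="$(mktemp -d)/hadoop-1.2.1"
mkdir -p "$HADOOP_DIR/bin" "$HADOOP_DIR/conf"
touch "$HADOOP_DIR/conf/core-site.xml"

# Recursively hand the whole tree to the user and group that will run Hadoop
# (grid:grid on the real nodes; the current user here).
chown -R "$(id -un):$(id -gn)" "$HADOOP_DIR"

# Verify who owns the tree now.
stat -c '%U' "$HADOOP_DIR"
```

The -R flag is the important part: the start scripts write logs and pids under the install tree, so every file and subdirectory must belong to the grid user, not just the top-level folder.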

You can then run start-all.sh from the bin directory of the Hadoop folder; output like the following indicates that the startup succeeded.

[Figure: console output of bin/start-all.sh]

At this point you can also check which processes have started: run the JDK's jps tool on the master, and you should see something like the following.

[Figure: jps output on the master node]

Running the same command on a slave node shows:

[Figure: jps output on a slave node]

At this point, the installation of Hadoop has been successful.
