What Hadoop Is

Source: Internet
Author: User

What is Hadoop? Hadoop is a software platform for analyzing and processing large volumes of data. It is an open source Apache framework implemented in Java that performs distributed computing over massive data sets across large clusters of computers.

The core of Hadoop's design consists of HDFS and MapReduce. HDFS provides storage for massive amounts of data, and MapReduce provides computation over that data.

How Hadoop processes large data can be understood from the following simple diagram: data goes into the Hadoop cluster, the cluster processes it, and the result comes out.

HDFS: the Hadoop Distributed File System.

Large files are divided into blocks, 64 MB each by default, and the blocks are distributed across the machines in the cluster. The file data1 in the following figure is divided into 3 blocks, and redundant copies of each of the 3 blocks are stored on different machines.
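The block splitting described above can be sketched in a few lines of Python. This is only an illustration, not HDFS code: the function names, the DataNode names, the replication factor of 3, and the round-robin placement are assumptions made for the example (real HDFS placement takes rack topology into account).

```python
BLOCK_SIZE = 64 * 1024 * 1024  # 64 MB, the default block size mentioned in the text


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return a list of (offset, length) pairs, one per block of the file."""
    blocks = []
    offset = 0
    while offset < file_size:
        length = min(block_size, file_size - offset)  # last block may be shorter
        blocks.append((offset, length))
        offset += length
    return blocks


def place_replicas(num_blocks, nodes, replication=3):
    """Hypothetical round-robin placement of block copies on distinct nodes.

    Assumes replication <= len(nodes). Real HDFS uses a rack-aware policy.
    """
    placement = {}
    for b in range(num_blocks):
        placement[b] = [nodes[(b + r) % len(nodes)] for r in range(replication)]
    return placement


if __name__ == "__main__":
    # A 150 MB file (an assumed size for "data1") needs three blocks.
    blocks = split_into_blocks(150 * 1024 * 1024)
    nodes = ["dn1", "dn2", "dn3", "dn4"]  # assumed DataNode names
    for block_id, copies in place_replicas(len(blocks), nodes).items():
        print(block_id, blocks[block_id], copies)
```

With a 150 MB file the sketch yields two full 64 MB blocks and one 22 MB block, and each block's copies land on different machines.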

MapReduce: Hadoop creates one map task for each input split. The map task processes the records in its split sequentially and emits its results as key-value pairs. Hadoop then sorts and groups the map output by key and passes it to reduce as input. The output of the reduce tasks is the output of the entire job and is stored on HDFS.
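The flow described above (map over each split, group by key, then reduce) can be imitated on a single machine. The following is a minimal sketch that uses word counting as an assumed example job; `map_fn`, `reduce_fn`, and `run_job` are hypothetical names and are not part of the Hadoop API.

```python
from collections import defaultdict


def map_fn(record):
    """Map step: emit a (word, 1) pair for every word in one record."""
    for word in record.split():
        yield word.lower(), 1


def reduce_fn(key, values):
    """Reduce step: sum all counts collected for one key."""
    return key, sum(values)


def run_job(splits):
    """Simulate one job: a map task per split, then group by key, then reduce."""
    map_outputs = []
    for split in splits:          # one map task per input split
        for record in split:      # records are processed sequentially
            map_outputs.extend(map_fn(record))

    groups = defaultdict(list)    # stand-in for the sort-and-group phase
    for key, value in map_outputs:
        groups[key].append(value)

    return dict(reduce_fn(key, groups[key]) for key in sorted(groups))


if __name__ == "__main__":
    splits = [["hadoop stores data", "hadoop computes"], ["data is large"]]
    print(run_job(splits))
```

In a real cluster the map tasks run in parallel on different machines and the grouped output is moved over the network to the reduce tasks; the sketch only keeps the logical order of the steps.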

A Hadoop cluster is composed mainly of the NameNode, DataNodes, the Secondary NameNode, the JobTracker, and TaskTrackers, as the following illustration shows:

The NameNode records how files are split into blocks and which DataNodes store those blocks. The NameNode also holds status information about the running file system. The DataNodes store the split blocks. The Secondary NameNode helps the NameNode collect state information about the running file system. When a job is submitted to the Hadoop cluster, the JobTracker is responsible for executing it and for scheduling the TaskTrackers. Each TaskTracker is responsible for running map or reduce tasks.
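The bookkeeping role of the NameNode can be pictured as two lookup tables: one from file to blocks and one from block to DataNodes. The sketch below is a toy model under that assumption; the class name, method names, block ids, and node names are all hypothetical and do not reflect Hadoop's actual implementation.

```python
class NameNodeSketch:
    """Toy model of NameNode metadata: which blocks make up a file,
    and which DataNodes hold each block. It stores no file data itself."""

    def __init__(self):
        self.file_blocks = {}      # file name -> ordered list of block ids
        self.block_locations = {}  # block id -> list of DataNode names

    def add_file(self, name, block_to_nodes):
        """Register a file given a mapping of block id -> DataNode names."""
        self.file_blocks[name] = list(block_to_nodes)  # keeps insertion order
        self.block_locations.update(block_to_nodes)

    def locate(self, name):
        """Return (block id, DataNode names) pairs for a file, in block order."""
        return [(b, self.block_locations[b]) for b in self.file_blocks[name]]


if __name__ == "__main__":
    nn = NameNodeSketch()
    nn.add_file("data1", {
        "blk_1": ["dn1", "dn2"],  # assumed block ids and node names
        "blk_2": ["dn2", "dn3"],
        "blk_3": ["dn3", "dn1"],
    })
    for block_id, nodes in nn.locate("data1"):
        print(block_id, nodes)
```

A client in this model would ask the NameNode where the blocks live and then read the block contents from the DataNodes directly.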

Original link: http://blog.csdn.net/kkdelta/article/details/7696025
