Hadoop Basic Concepts: Hadoop Core Components

Source: Internet
Author: User

To learn Hadoop, we first have to understand what it is made of. Based on my own experience, I will introduce three aspects: the Hadoop components, the big data processing flow, and the Hadoop core.

    • Hadoop Components

[Figure: Hadoop components]

From the figure we can see that Hadoop consists of the underlying core components and the ecosystem built on top of them, and that the upper-level ecosystem depends on the lower-level storage and computation.

First, let's look at the core components: MapReduce and HDFS. They grew out of ideas published by Google: Google's GFS led to the HDFS we know today, and Google's MapReduce led to Hadoop's MapReduce. Google also had Bigtable, whose idea is to store all web data in a single table, and that led to HBase. HBase borrows only the architectural idea, however; its architecture is not exactly the same.
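To make the MapReduce idea concrete, here is a minimal single-process sketch in plain Python. It is not Hadoop's API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration and only mimic the map, shuffle, and reduce steps on an in-memory list of lines.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, as a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values collected for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["Hadoop stores data", "Hadoop processes data"]))
```

In a real cluster the map and reduce calls run in parallel on many machines and the input lines come from HDFS blocks; the sketch only shows how the three phases fit together.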

The upper-level ecosystem is built around the Hadoop core components and covers data integration, data mining, data security, data management, and user experience.

    • Big Data processing:

[Figure: Big Data processing]

[...] PDF, CSV). In an earlier project for a procuratorate, a large number of case files and legal instruments were PDF and CSV files. Hadoop was used to unify their structure, model them, and build a national index, which greatly improved efficiency.

Next is the data storage layer, where you can choose HDFS or HBase. Which of the two is the better choice? HDFS is generally the better fit when you have large datasets, because it provides high-throughput data access and is ideal for applications that work on large-scale data. HBase is the better fit when you need its random read and random write performance over massive amounts of data.
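The difference between the two access patterns can be illustrated with a toy sketch in plain Python. Nothing here is the HDFS or HBase API: `scan_total` and `KeyValueTable` are hypothetical names, and the data lives in memory. The point is only that one pattern streams through every record, while the other reads or writes a single row by key.

```python
def scan_total(records):
    """HDFS-style access: stream through every record once and aggregate."""
    return sum(amount for _key, amount in records)


class KeyValueTable:
    """HBase-style access: read or write one row directly by its key."""

    def __init__(self):
        self._rows = {}

    def put(self, row_key, value):
        """Write (or overwrite) a single row."""
        self._rows[row_key] = value

    def get(self, row_key):
        """Read a single row, or None if the key is absent."""
        return self._rows.get(row_key)


if __name__ == "__main__":
    records = [("case-1", 3), ("case-2", 5)]
    print(scan_total(records))  # full pass over the dataset

    table = KeyValueTable()
    table.put("case-2", 5)
    print(table.get("case-2"))  # direct lookup of one row
```

If the workload mostly looks like `scan_total`, HDFS is the natural choice; if it mostly looks like `put` and `get` on individual keys, HBase is.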

Then there are the data processing tools. The basic ones are Spark and MapReduce; the higher-level ones are Hive and Pig, which I will analyze in detail when I get the chance. After these processing tools, the results have to be integrated with BI and with existing, traditional data. For this we can use Impala to run ad hoc queries. What is queried has to be built in advance, with the dimensions and metrics worked out; Impala can then drill down, slice, and dice quickly. Search is the authoritative index: once the preceding work has been done, you can search to find the information you need.
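The terms drill, slice, and dice can be shown on a tiny in-memory dataset. This is a sketch in plain Python, not Impala: in practice Impala would run these operations as SQL over the stored data. The `SALES` rows and the function names `slice_rows`, `dice_rows`, and `drill_down` are invented for illustration, with `region`, `year`, and `product` as dimensions and `amount` as the metric.

```python
from collections import defaultdict

# Hypothetical fact rows: three dimensions and one metric.
SALES = [
    {"region": "north", "year": 2015, "product": "a", "amount": 10},
    {"region": "north", "year": 2016, "product": "b", "amount": 20},
    {"region": "south", "year": 2016, "product": "a", "amount": 5},
]


def slice_rows(rows, dimension, value):
    """Slice: fix one dimension to a single value."""
    return [r for r in rows if r[dimension] == value]


def dice_rows(rows, **allowed):
    """Dice: keep rows whose dimensions fall inside the allowed value sets."""
    return [r for r in rows if all(r[d] in vals for d, vals in allowed.items())]


def drill_down(rows, dimensions, metric):
    """Drill down: total the metric for each combination of the dimensions."""
    totals = defaultdict(int)
    for r in rows:
        totals[tuple(r[d] for d in dimensions)] += r[metric]
    return dict(totals)


if __name__ == "__main__":
    print(slice_rows(SALES, "region", "north"))
    print(dice_rows(SALES, region={"north"}, year={2016}))
    print(drill_down(SALES, ["region"], "amount"))
    print(drill_down(SALES, ["region", "year"], "amount"))
```

Calling `drill_down` first with `["region"]` and then with `["region", "year"]` moves from a coarse total to a finer one, which is the sense of drilling used above.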

Big data processing needs all of these components, but they work at different stages. Next, let's look at the core components.

    • Hadoop Core

[Figure: Hadoop core]

The main emphasis here is YARN. A cluster's resources are shared, so their use has to be controlled, and YARN's role is to control how the resources are allocated and how much of them is used.
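The idea of controlling a shared pool of resources can be sketched as follows. This is a toy model in plain Python under stated assumptions, not YARN's API or its scheduling policy: the class `ToyResourceManager` and its methods are hypothetical, and it simply grants a request when enough memory and CPU cores are free.

```python
class ToyResourceManager:
    """Tracks a shared pool of memory and CPU cores and grants requests that fit."""

    def __init__(self, memory_mb, vcores):
        self.free_memory_mb = memory_mb
        self.free_vcores = vcores

    def allocate(self, memory_mb, vcores):
        """Grant a container only if enough of both resources is free."""
        if memory_mb <= self.free_memory_mb and vcores <= self.free_vcores:
            self.free_memory_mb -= memory_mb
            self.free_vcores -= vcores
            return True
        return False

    def release(self, memory_mb, vcores):
        """Return a finished container's resources to the pool."""
        self.free_memory_mb += memory_mb
        self.free_vcores += vcores


if __name__ == "__main__":
    rm = ToyResourceManager(memory_mb=4096, vcores=4)
    print(rm.allocate(2048, 2))  # fits in the free pool
    print(rm.allocate(4096, 1))  # asks for more memory than is left
```

A real resource manager also has to handle queues, priorities, and many nodes; the sketch covers only the basic bookkeeping of granting and reclaiming resources.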

That concludes this introduction to the Hadoop components. I will share more about the role of each component in follow-up posts. If you are interested in big data, I recommend studying it further. I usually follow public accounts such as Big data cn and the Big Data Times Learning Center, which explain some topics very well and are worth a look. You can also read more books on the subject to keep improving your knowledge.


This article is from the "11872756" blog; reprinting is declined.
