Big Data Note 01: Introduction to Hadoop for big data

Source: Internet
Author: User

1. Background

With the advent of the big data era, people are generating and collecting more and more data. But how do you store and analyze big data?

Storing and analyzing data on a stand-alone PC runs into many bottlenecks, including storage capacity, read-write rate, and computational efficiency. A stand-alone PC cannot meet the requirements of big data.

2. To overcome these limits on storage capacity, read-write rate, and computational efficiency, Google developed three revolutionary big data technologies:

(1) MapReduce

(2) BigTable

(3) GFS

What made these technologies revolutionary:

Revolutionary change 01: Lower cost. Clusters can be built from ordinary PCs, without mainframes or high-end storage.

Revolutionary change 02: Software fault tolerance. Hardware failures are treated as normal, and reliability is ensured through software.

Revolutionary change 03: Simplified parallel distributed computing. Developers do not need to control node synchronization and data exchange themselves.

However, Google only published the relevant technical papers; it did not release the source code.

3. Fortunately, there is an open-source implementation modeled on Google's big data technologies: Hadoop.

The features and benefits of Hadoop are explained below:

(1) What is Hadoop?

Hadoop is an open-source platform for distributed storage and distributed computing.

(2) Why is Hadoop capable of distributed storage and distributed computing?

This is because Hadoop consists of two core components:

HDFS: a distributed file system that stores massive amounts of data

MapReduce: a framework for parallel processing that handles task decomposition and scheduling
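
To make the idea of task decomposition more concrete, here is a minimal single-process sketch of the MapReduce pattern, using word counting as the example. It is not Hadoop code and does not use any Hadoop API; the function names (map_phase, shuffle, reduce_phase, word_count) and the sample input are assumptions made up for illustration. In a real cluster, the framework would run the map and reduce steps on many machines and handle scheduling itself.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key (the word)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence (hypothetical driver)."""
    pairs = []
    for line in lines:  # each line stands in for one input split
        pairs.extend(map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical input: two small "splits" of a larger data set.
    splits = ["big data needs big storage", "hadoop stores big data"]
    print(word_count(splits))
```

Each call to map_phase depends only on its own line, and each call to reduce_phase depends only on its own key, which is what would allow a framework to spread this work across many nodes.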

(3) What can Hadoop do for you?

Build large data warehouses and provide storage, processing, analysis, statistics, and other services for petabytes of data.

(4) Advantages of Hadoop

• Advantage 1: High scalability (theoretically unlimited)

• Advantage 2: Low cost

• Advantage 3: Mature ecosystem (a very rich tool chain)

A large number of tools have grown up around Hadoop, and they make it more efficient and convenient to use.

(5) Application of Hadoop

At present, many large companies in China and abroad use Hadoop to build their big data platforms.

(6) Hadoop has become the first choice for big data platforms in the industry, and the demand for Hadoop talent is growing.
