1. Background
With the advent of the big data era, the volume of data being collected keeps growing. But how do you store and analyze big data?
A stand-alone PC runs into many bottlenecks when storing and analyzing data at this scale, including storage capacity, read/write speed, and computational efficiency, so a single machine cannot meet the requirements.
2. To overcome these limits on storage capacity, read/write speed, and computational efficiency, Google developed three revolutionary big data technologies:
(1) MapReduce
(2) BigTable
(3) GFS
What makes these technologies revolutionary:
Revolutionary change 01: Lower cost. Clusters can be built from ordinary PCs, without mainframes or high-end storage.
Revolutionary change 02: Software fault tolerance. Hardware failures are treated as normal, and reliability is ensured through software.
Revolutionary change 03: Simpler parallel distributed computing. Developers do not have to control node synchronization or data exchange themselves.
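To make the idea of ensuring reliability through software more concrete, here is a minimal sketch in Python. It is only a toy illustration, not how Google's systems or Hadoop actually schedule work: the function name run_with_retries, the node names, and the failed_workers set are all hypothetical and exist only to simulate machines that are down.

```python
def run_with_retries(task, workers, failed_workers):
    """Try a task on each worker in turn, skipping workers that have failed.

    `failed_workers` is a hypothetical set that simulates broken machines.
    Returns (result, worker_name) from the first healthy worker.
    """
    for worker in workers:
        if worker in failed_workers:
            # A failure is treated as a normal event: note it and move on.
            print(f"{worker} failed, rescheduling task")
            continue
        return task(), worker
    raise RuntimeError("no healthy worker available")


# node1 is simulated as broken, so the task is rescheduled onto node2.
result, used = run_with_retries(
    lambda: sum(range(10)),
    ["node1", "node2", "node3"],
    {"node1"},
)
print(result, used)
```

The caller never has to know that node1 was down; the retry logic in software hides the hardware failure.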
However, Google only published the relevant technical papers; it did not release the source code.
3. Fortunately, there is an open-source implementation modeled on Google's big data technologies:
Hadoop
The features and benefits of Hadoop are explained below:
(1) What is Hadoop?
Hadoop is an open-source platform for distributed storage and distributed computing.
(2) Why is Hadoop capable of distributed storage and distributed computing?
This is because Hadoop consists of two core components:
HDFS: a distributed file system that stores massive amounts of data
MapReduce: a framework for parallel processing that handles task decomposition and scheduling
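The MapReduce idea of breaking a job into small tasks can be sketched with the classic word-count example. The code below is a single-machine simulation in plain Python, not the Hadoop API: the function names map_phase, shuffle_phase, and reduce_phase and the sample input are assumptions made for illustration. In real Hadoop, the map and reduce tasks would run in parallel on many nodes, and the framework would do the scheduling and grouping.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # Each document stands in for one input split handled by one map task.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


splits = ["big data needs big storage", "hadoop stores big data"]
counts = word_count(splits)
print(counts)
```

Because each map task only looks at its own split and each reduce task only looks at one key, the tasks do not need to coordinate with each other, which is what makes the work easy to spread across many machines.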
(3) What can Hadoop do for you?
Build large data warehouses and provide storage, processing, analysis, statistics, and other services for petabytes of data.
(4) Advantages of Hadoop
• Advantage 1: High scalability (theoretically unlimited)
• Advantage 2: Low cost
• Advantage 3: Mature ecosystem (a very rich tool chain)
A large number of tools have grown up around Hadoop, and they make Hadoop more efficient and convenient to use.
(5) Application of Hadoop
At present, many large companies in China and abroad use Hadoop to build their big data platforms.
(6) Hadoop has become the first choice for big data platforms in the industry, and the need for Hadoop talent is growing.
Big Data Note 01: Introduction to Hadoop for big data