Hadoop is powerful, but not omnipotent (CSDN)

Last Update:2015-05-04 Source: Internet

Author: User


Hadoop is powerful, but before you adopt Hadoop or any big data technology, it is important to identify your goals and confirm that you have chosen the right tool, because Hadoop cannot do everything. This article lists several scenarios for which Hadoop is not suitable.

As Hadoop adoption has grown, many people have come to worship it blindly and believe it can solve every problem. Hadoop is a great framework for distributed computing over large data sets, but it is not a universal solution. The following scenarios are poor fits for Hadoop:

1. Low-Latency data access

Hadoop is not suited to data access that requires real-time queries and low latency. Databases reduce latency and respond quickly by indexing records, and Hadoop is simply not a substitute for that. If you do need to replace a real-time database, you can try HBase for real-time reads and writes.
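To make the role of indexing concrete, here is a minimal sketch that compares a full scan with an indexed lookup over in-memory records. It is a toy model, not Hadoop or database code, and every name in it (`records`, `scan_lookup`, `build_index`) is hypothetical.

```python
from typing import Optional

# Hypothetical records: (user_id, name) pairs standing in for table rows.
records = [(i, f"user-{i}") for i in range(100_000)]


def scan_lookup(rows, key) -> Optional[str]:
    """Full scan: examine rows one by one until a match is found (O(n))."""
    for row_key, value in rows:
        if row_key == key:
            return value
    return None


def build_index(rows) -> dict:
    """Build a hash index once so that later lookups are O(1) on average."""
    return {row_key: value for row_key, value in rows}


index = build_index(records)

# Both strategies return the same answer; only the work per query differs.
assert scan_lookup(records, 99_999) == index.get(99_999)
```

A batch-oriented system behaves more like `scan_lookup`: each query pays the cost of reading the data, whereas an indexed database pays the cost once when the index is built.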

2. Structured data

Hadoop is not well suited to structured data, but it is ideal for semi-structured and unstructured data. Unlike an RDBMS, Hadoop generally relies on distributed storage, so queries incur latency while the data is being processed.

3. When the amount of data is not large

How much data does Hadoop typically apply to? The answer is terabytes or petabytes. When your data amounts to only tens of gigabytes, there is no benefit to using Hadoop. Adopt Hadoop selectively, according to the needs of your business, and do not blindly follow the trend. Hadoop is powerful, but before you use it you need to be clear about your goals and confirm that you have chosen the right tool.

4. A large number of small files

Small files are files that are much smaller than the HDFS block size (64 MB by default in Hadoop 1.x, 128 MB in later releases). If you store a large number of small files in HDFS, each file occupies at least one block, and the NameNode consumes memory to hold the information for every block. If the number of small files is large enough, the memory required exceeds what current computer hardware can provide.
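The memory pressure can be estimated with simple arithmetic. The sketch below assumes a commonly quoted rule of thumb that is not stated in the article: each file and each block is tracked as an object of roughly 150 bytes in NameNode memory. The function names are hypothetical, and the figures are order-of-magnitude estimates only.

```python
# Assumed rule of thumb (not from the article): each file and each block
# is tracked as an object of roughly 150 bytes in NameNode heap.
BYTES_PER_OBJECT = 150


def namenode_bytes(num_files: int, blocks_per_file: int = 1) -> int:
    """Estimate NameNode memory: one object per file plus one per block."""
    objects = num_files * (1 + blocks_per_file)
    return objects * BYTES_PER_OBJECT


def files_needed(total_bytes: int, file_size: int) -> int:
    """Number of files needed to hold total_bytes (ceiling division)."""
    return -(-total_bytes // file_size)


ONE_TB = 1024 ** 4
small = files_needed(ONE_TB, 100 * 1024)       # 1 TB stored as 100 KB files
large = files_needed(ONE_TB, 64 * 1024 ** 2)   # 1 TB stored as 64 MB files

for label, count in (("100 KB files", small), ("64 MB files", large)):
    mib = namenode_bytes(count) / 1024 ** 2
    print(f"{label}: {count:,} files, about {mib:,.1f} MiB of metadata")
```

Under these assumptions, one terabyte stored as 100 KB files costs about 3 GiB of NameNode memory, while the same terabyte stored as 64 MB files costs under 5 MiB.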

5. Too many writes and file updates

HDFS follows a write-once, read-many access model. When files need to be updated frequently, Hadoop cannot support the workload.

6. MapReduce may not be the best choice

MapReduce is a simple parallel programming model and a powerful tool for parallel computation over big data, but many computational tasks, workloads, and algorithms are inherently ill-suited to the MapReduce framework.

If you need to share data in MapReduce, you can do so in these ways:

Iteration: Run multiple MapReduce jobs, using the output of one job as the input to the next.
Shared state information: State can be shared, but not through memory, because each MapReduce task runs in its own JVM.
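
The iteration pattern can be sketched with a toy, single-process stand-in for a MapReduce job. This is plain Python and not the Hadoop API; `run_job` and the mapper and reducer functions are hypothetical names chosen for illustration.

```python
from collections import defaultdict


def run_job(records, mapper, reducer):
    """Toy stand-in for one MapReduce job: map, shuffle by key, reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)   # shuffle: group values by key
    output = []
    for key in sorted(groups):
        output.extend(reducer(key, groups[key]))
    return output


# Job 1: count how often each word occurs.
def tokenize(line):
    for word in line.split():
        yield word.lower(), 1


def sum_counts(word, counts):
    yield word, sum(counts)


# Job 2: read job 1's output and group words by their count.
def invert(pair):
    word, count = pair
    yield count, word


def collect_words(count, words):
    yield count, sorted(words)


lines = ["Hadoop is powerful", "Hadoop is not everything"]
word_counts = run_job(lines, tokenize, sum_counts)      # output of job 1 ...
by_count = run_job(word_counts, invert, collect_words)  # ... is input to job 2
print(by_count)
```

Nothing is shared between the two jobs except the records that the first one emits, which mirrors the constraint described above: on a real cluster that hand-off goes through storage, not through memory.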

Original link: Hadoop isn't Silver Bullet


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
