A flowchart that tells you whether you need SQL or Hadoop

Source: Internet
Author: User

Many readers ask whether the currently popular Hadoop is suitable for their own projects: when to use SQL, when to use Hadoop, and how to choose between them. Aaron Cordova answers the question with a single picture that describes how to choose the right data storage and processing tool for different data scenarios. Aaron Cordova is a US expert on big data analytics and architecture, and the CTO and co-founder of Koverse.

On Twitter, @merv shared a blog post, "Counting Triangles."
The post is about counting the triangles in a graph, and it compares Vertica with MapReduce on Hadoop. On 1.3 GB of data, Vertica is 22 to 40 times faster than Hadoop, and it needs only three lines of SQL. The numbers show that Vertica is simpler and faster on 1.3 GB of data. But that result is not very interesting.
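To make the triangle-counting task concrete, here is a minimal sketch of how a short SQL self-join can count triangles. It is not the query from the post being discussed: the table name `edges`, the columns `src` and `dst`, and the helper `count_triangles` are assumptions made for illustration, and SQLite stands in for Vertica so the example runs anywhere.

```python
import sqlite3


def count_triangles(edges):
    """Count triangles in an undirected graph given as (a, b) node pairs.

    Illustrative only: the schema and the query are assumptions,
    not the SQL used in the benchmark described above.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edges (src INTEGER, dst INTEGER)")

    # Store each undirected edge once with src < dst,
    # dropping self-loops and duplicate edges.
    normalized = {(min(a, b), max(a, b)) for a, b in edges if a != b}
    conn.executemany("INSERT INTO edges VALUES (?, ?)", sorted(normalized))

    # For a triangle a < b < c, the only matching combination is
    # e1 = (a, b), e2 = (b, c), e3 = (a, c), so each triangle counts once.
    query = """
        SELECT COUNT(*)
        FROM edges e1
        JOIN edges e2 ON e1.dst = e2.src
        JOIN edges e3 ON e3.src = e1.src AND e3.dst = e2.dst
    """
    (count,) = conn.execute(query).fetchone()
    conn.close()
    return count


if __name__ == "__main__":
    # One triangle (1, 2, 3) plus a dangling edge (3, 4).
    print(count_triangles([(1, 2), (2, 3), (1, 3), (3, 4)]))
```

The query itself is only a few lines long, which is the kind of brevity the post credits to SQL. An equivalent MapReduce job would need separate code for each join step.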
Comparing the effort of writing the job is a different matter. Yes, SQL is very simple in this case, as everyone knows. SQL is much simpler than MapReduce, but for distributed computing, MapReduce is much simpler than SQL. MapReduce can also handle processing tasks that SQL cannot.
Using 1.3 GB of data as a benchmark for Vertica or Hadoop is like saying "we're going to hold a 50-metre race between a Boeing 737 and a DC-10." Neither plane would even need to take off in such a race. The comparison in that blog post has the same flaw: these technologies are clearly not designed for data sets of this size.
If a scalable system is also fast on small data, so much the better, but that is not the subject of this article. Whether the performance gap stays this large on large-scale data is far less obvious, and that is what is really worth demonstrating.
To help you choose the technology that fits your actual situation, I drew this flowchart:

Original link: http://aaroncordova.com/blog2/roncordova.com/2012/01/do-i-need-sql-or-hadoop-flowchart.html.

