Companies deploying Hadoop need to think carefully


In recent years, Hadoop has received a great deal of praise and has been touted as the go-to big data analytics engine. For many people, Hadoop is synonymous with big data technology. But in fact, the open source distributed processing framework may not solve every big data problem. Companies that want to deploy Hadoop therefore need to think carefully about when to use it and when to use other products.

For example, Hadoop has more than enough power for processing large volumes of unstructured or semi-structured data. But it is not known for its speed in handling small datasets. This has limited the use of Hadoop at Metamarkets Group, a San Francisco-based provider of real-time marketing analytics for online advertisers.

Metamarkets CEO Michael Driscoll said the company uses Hadoop to process large, distributed data when time is not a constraint, such as running an end-of-day report to review the day's transactions or scanning historical data going back several months.

But for the core of what it delivers to customers, running real-time analytics, Metamarkets does not use Hadoop. Driscoll said that is because Hadoop is optimized to run batch jobs that look at every file in a database. In the final analysis, this is a trade-off: Hadoop sacrifices speed in order to establish deep connections between data points. Driscoll said: "Using Hadoop is like having a pen pal. You write him a letter, send it, and wait a few days before getting a reply. This is a far cry from the experience of text messaging (SMS) or email."

Kelly Stirman, product marketing manager at 10gen, the developer of the MongoDB NoSQL database, says that fast response is critical for online applications, whereas Hadoop is constrained by its processing time. For example, online analytics applications such as product recommendation engines rely on fast processing of small amounts of information, which Hadoop does not do efficiently.

Not a database replacement

Because open source technology has greatly reduced technology costs, some companies may consider scrapping their traditional data warehouses in favor of Hadoop clusters. But Carl Olofson, a market research analyst at IDC, says the two are simply not comparable.

Olofson says the relational databases that power most data warehouses are built to accommodate data that flows in at regular intervals over time, such as transactions from daily business processes. Hadoop, on the other hand, is good at processing large amounts of accumulated data.

Related reading:

Hadoop 2.0 Setup Wizard (0.23.x) http://www.linuxidc.com/Linux/2012-05/61463.htm

Hadoop 1.2.1 Single node setup step http://www.linuxidc.com/Linux/2013-08/89377.htm

Installing Hadoop on CentOS http://www.linuxidc.com/Linux/2013-08/88600.htm

Ubuntu 12.04 Install Hadoop http://www.linuxidc.com/Linux/2013-08/88187.htm

CentOS 6.3 compatible installation and configuration Hadoop-1.0 http://www.linuxidc.com/Linux/2013-07/87959.htm

Getting started with Hadoop: Hadoop2 pseudo-distributed installation http://www.linuxidc.com/Linux/2013-06/86403.htm

Hadoop2.2.0 single node installation and testing: http://www.linuxidc.com/Linux/2013-10/91911.htm
