Perfect combination of Hadoop and Data Warehouse

Source: Internet
Author: User
Keywords: Data Warehouse, Hadoop

It seems that the phrase "big data" is on everyone's lips. The discussion around big data has almost completely drowned out the traditional data warehouse. Some big data enthusiasts even boldly predict that, in the near future, all enterprise data will be hosted on systems based on Apache Hadoop and that the enterprise data warehouse (EDW) will eventually die out.

In any case, there is no doubt that traditional data warehouse architectures are evolving; I have been writing articles and blog posts on the subject for the past year. But will the data warehouse really die out? I think the odds are slim. In fact, while everyone is talking about one technology or architecture trumping another, IBM has a different perspective.

At IBM, we tend to look at the problem from the perspective of "Hadoop and data warehousing," which can fairly be called a perfect match. For an enterprise with a traditional data warehouse, the opportunity in big data is the chance to take advantage of data that a traditional warehouse architecture could not make available.

But why can't the traditional data warehouse take on this responsibility? There are several reasons. First, the traditional data warehouse architecture uses structured data from business systems to analyze all aspects of the business. The data is cleaned, modeled, distributed, governed, and maintained in order to support historical analysis. The data we store in the data warehouse is predictable, both in its structure and in its ingestion rate.

By contrast, big data is unpredictable. Its structure is varied, and its volume is too large for the EDW. In particular, we tend to sift through large amounts of data to find the information we really need. Much of that data may be discarded soon afterward, and in some cases it only needs to be kept for a short period of time. If we do decide to keep all of it, we need a more economical solution than the EDW to store the unstructured data for future historical analysis (another argument for using Hadoop in conjunction with the data warehouse).
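As a rough illustration of this division of labor, the sketch below sends records that match a fixed, predictable schema to the warehouse and lands everything else in a cheaper raw store. It is a toy model under stated assumptions, not an IBM or Hadoop API: `WAREHOUSE_SCHEMA`, `Stores`, `conforms`, and `route` are hypothetical names, and the two Python lists merely stand in for an EDW table and a Hadoop-style landing zone.

```python
from dataclasses import dataclass, field
from typing import Any

# Hypothetical schema: the fixed, predictable columns a warehouse table expects.
WAREHOUSE_SCHEMA = {"order_id": int, "customer_id": int, "amount": float}


@dataclass
class Stores:
    """In-memory stand-ins for the two storage tiers (assumption for illustration)."""
    warehouse_rows: list = field(default_factory=list)    # structured, governed data
    raw_landing_zone: list = field(default_factory=list)  # anything else, kept cheaply


def conforms(record: Any, schema: dict) -> bool:
    """Return True if the record has exactly the schema's fields with matching types."""
    return (
        isinstance(record, dict)
        and record.keys() == schema.keys()
        and all(isinstance(record[name], typ) for name, typ in schema.items())
    )


def route(record: Any, stores: Stores) -> str:
    """Send predictable records to the warehouse and everything else to the raw store."""
    if conforms(record, WAREHOUSE_SCHEMA):
        stores.warehouse_rows.append(record)
        return "warehouse"
    stores.raw_landing_zone.append(record)
    return "raw"
```

For example, calling `route` with an order record that matches the schema returns `"warehouse"`, while a free-text customer comment returns `"raw"` and is kept in the landing zone for possible later analysis.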

Big data presents new opportunities for many customers, and Hadoop now lets us use new data sources to make analytics smarter. But this new frontier complements the traditional data warehouse architecture rather than replacing it. We still have to provide traditional analysis for every business area (finance, marketing, sales, customer service, and so on), and these analyses will not disappear any time soon. However, we should recognize the need to broaden the analytics menu to include new sources of insight and new tools that let us achieve goals that were not possible in the past, such as sentiment analysis.

I believe that big data will be one of the main drivers of change in EDW architecture, but not the only one. Continued growth in equipment, higher demands for realizing value, and the need for agility and even simplicity in our solutions will also play important roles in this change.

Stop and think about that: agility and simplicity? These are not words we often use when building an enterprise data warehouse! However, the facts are plain. Many large EDW projects fail to reach their full potential because they are too complex and far less agile than the business expects. Another fact is that companies that use analytics to drive decisions do better. Their revenue compound annual growth rate (CAGR) is 49% higher than that of other enterprises, their profit growth can reach 20 times that of other enterprises, and their return on invested capital is 30% higher. There is no doubt that most businesses are struggling to achieve this overall goal.

Figure text:
Revenue growth, 5-year CAGR (2004-2008): 49% higher
Profit growth, 5-year CAGR (2004-2008): 20 times higher
Return on invested capital, 5-year average (2004-2008): 30% higher
Groups compared: finance organizations with business insight versus all other enterprises
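For readers unfamiliar with the metric, CAGR is the constant yearly growth rate that would take a starting value to an ending value over a given number of years. The sketch below shows the standard formula and one possible way to express "X% higher" as a relative difference between two rates. The function names are hypothetical, and the article does not say whether its 49% figure is a relative or an absolute difference, so the relative reading here is an assumption.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate as a fraction (0.10 means 10% per year)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


def relative_uplift(rate: float, baseline_rate: float) -> float:
    """How much higher `rate` is than `baseline_rate`, as a fraction of the baseline.

    Assumes the comparison is relative (one possible reading of "49% higher").
    """
    if baseline_rate == 0:
        raise ValueError("baseline_rate must be non-zero")
    return rate / baseline_rate - 1.0
```

Under this reading, a group growing at 7.45% per year has a CAGR about 49% higher than a group growing at 5% per year. Those two rates are made-up inputs chosen only to show the arithmetic, not figures from the study.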

The secret to making this pairing work is a deep understanding of the types of analysis in place today and the needs of the future. In the past, our idea of the EDW was a thriving ecosystem. Today, we have shifted from an architecture that specializes in providing enterprise data to one that provides both enterprise data and intelligent analysis.

Think of all types of data and all types of analysis. That is what smart analytics means today!

We have made great strides. Let's move on!
