Keywords: big data, misconceptions, data warehouse
With so much hype surrounding big data, IT managers have a hard time knowing how to tap into its potential. Gartner has identified five common misconceptions about big data to help IT managers develop their information infrastructure strategy.
Alexander Linden, research director at Gartner, said: "Big data offers great opportunities, but it also poses even greater challenges. The sheer volume of data does not solve the problems inherent in the data. IT managers need to cut through the hype and base their actions on known facts and business-driven results."
Myth 1: Everyone else is ahead of us in adopting big data
Interest in big data technologies and services is at an unprecedented level, with 73 percent of respondents investing or planning to invest in big data. But most organizations are still in the early stages of adoption, and only 13 percent of respondents have actually deployed big data solutions (see Figure 1).
Figure 1: Stages of big data adoption, 2013 and 2014
Note: Gartner asked each respondent, "Which of the following five phases best describes the stage your organization has reached in using big data?" 2014: n = 302; 2013: n = 720. Source: Gartner (September 2014)
The biggest challenge for organizations is determining how to derive value from big data and deciding where to start. Many organizations are stuck in the pilot phase because they do not tie the technology to business processes or concrete use cases.
Myth 2: We have so much data that we don't need to worry about every small data flaw
IT managers believe that the sheer volume of data they now manage makes individual data quality flaws insignificant, thanks to the "law of large numbers." In this view, a single data quality flaw does not affect the overall result of an analysis, because each flaw is only a tiny fraction of the organization's large volume of data.
"In reality, although each individual flaw has a much smaller impact on the whole dataset than it did when there was less data, there are more flaws than before because there is more data," said Ted Friedman, vice president at Gartner. "As a result, the overall impact of poor-quality data on the whole dataset remains the same. In addition, much of the data that organizations use in a big data context comes from external sources or is of unknown structure and origin. This means that data quality problems are more likely than ever before, so data quality is actually becoming more important in the context of big data."
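Friedman's argument can be checked with simple arithmetic. The sketch below assumes a fixed defect rate of 1 percent and two made-up dataset sizes (neither figure comes from Gartner). It shows that as a dataset grows, each flaw's share of the whole shrinks, but the flawed share of the whole dataset stays the same.

```python
# Illustrative arithmetic only: the 1% defect rate and the dataset sizes
# are assumptions for this sketch, not figures from the Gartner survey.

def defect_summary(total_records: int, defect_rate: float) -> dict:
    """Return the number of flawed records, the share of the dataset that a
    single flaw represents, and the share that all flaws together represent."""
    flawed = round(total_records * defect_rate)
    return {
        "flawed_records": flawed,
        "single_flaw_share": 1 / total_records,
        "overall_flawed_share": flawed / total_records,
    }

small = defect_summary(10_000, 0.01)
large = defect_summary(10_000_000, 0.01)

for name, summary in (("small", small), ("large", large)):
    print(
        name,
        summary["flawed_records"],
        summary["single_flaw_share"],
        summary["overall_flawed_share"],
    )
```

In the larger dataset a single flaw weighs a thousand times less, but there are a thousand times more flaws, so the overall flawed share is unchanged.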
Myth 3: Big data technology will eliminate the need for data integration
The general view is that big data technologies, and in particular the potential to process information with a "schema on read" approach, will enable organizations to read the same data sources using multiple data models. Many people believe that this flexibility will let end users decide how to interpret any dataset on demand. They believe it will also provide data access tailored to the needs of individual users.
In reality, most information users rely heavily on "schema on write" scenarios, in which the data is described and its content is prescribed in advance, so that there is already agreement on the integrity of the data and on how it relates to the scenario.
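The difference between the two approaches can be sketched in a few lines. This is a minimal illustration, not a description of any particular product: the records, field names, and helper functions are all hypothetical.

```python
import json

# Hypothetical raw records, as they might land in a store with no enforced schema.
RAW_EVENTS = [
    '{"customer": "A17", "amount": "19.90", "country": "DE"}',
    '{"customer": "B02", "amount": "5.00"}',
]

# Schema on write: the structure is agreed on in advance and enforced
# before a record is stored, so every reader sees the same fields and types.
REQUIRED_FIELDS = {"customer": str, "amount": float, "country": str}

def write_with_schema(raw: str, table: list) -> bool:
    """Convert a record to the agreed schema; store it only if it conforms."""
    record = json.loads(raw)
    row = {}
    for field, cast in REQUIRED_FIELDS.items():
        if field not in record:
            return False  # rejected at write time
        row[field] = cast(record[field])
    table.append(row)
    return True

# Schema on read: the raw text is stored untouched and each reader
# applies its own interpretation at query time.
def read_as_revenue(raw: str) -> float:
    return float(json.loads(raw).get("amount", 0))

def read_as_country(raw: str) -> str:
    return json.loads(raw).get("country", "unknown")

warehouse_table = []
accepted = [write_with_schema(r, warehouse_table) for r in RAW_EVENTS]
revenue = sum(read_as_revenue(r) for r in RAW_EVENTS)
countries = [read_as_country(r) for r in RAW_EVENTS]
```

With schema on write, the incomplete second record is rejected once, at load time. With schema on read, every reader has to decide for itself how to treat the missing field, which is the integration work that does not go away.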
Myth 4: It is pointless to use a data warehouse for advanced analytics
Many information management leaders believe that building a data warehouse is time-consuming and pointless, because advanced analytics uses new types of data beyond what the data warehouse holds.
The reality is that many advanced analytics projects use a data warehouse during the analysis. In other cases, information managers must refine the new data types that are part of big data to make them suitable for analysis. They need to decide which data is relevant, how to aggregate it, and what level of data quality is required, and this data refinement can happen in many places, not only in the data warehouse.
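The three refinement decisions named above (relevance, quality level, aggregation) can be sketched as a small function. The records, field names, and the choice of "web" as the relevant source are assumptions made up for this illustration.

```python
from collections import defaultdict

# Hypothetical records; the field names are assumptions for illustration only.
raw_records = [
    {"source": "web", "region": "EU", "value": 12.0},
    {"source": "web", "region": "EU", "value": None},    # fails the quality check
    {"source": "sensor", "region": "EU", "value": 3.0},  # not relevant to this analysis
    {"source": "web", "region": "US", "value": 8.0},
    {"source": "web", "region": "EU", "value": 4.0},
]

def refine(records, relevant_source="web"):
    """Keep relevant, complete records and aggregate their values per region."""
    totals = defaultdict(float)
    for rec in records:
        if rec["source"] != relevant_source:   # 1. relevance
            continue
        if rec["value"] is None:               # 2. minimum quality level
            continue
        totals[rec["region"]] += rec["value"]  # 3. aggregation
    return dict(totals)

refined = refine(raw_records)
```

Nothing in this step requires a warehouse, which matches the point that refinement can happen elsewhere, but the refined result is the kind of data a warehouse is designed to hold.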
Myth 5: Data lakes will replace the data warehouse
Many vendors market the data lake as an enterprise-wide data management platform for analyzing data from a variety of sources in its raw format.
The reality is that it is misleading for vendors to position a data lake as a replacement for the data warehouse or as a key element of a customer's analytics infrastructure. The technologies underlying a data lake lack the maturity and breadth of features found in established data warehouse technologies. "Data warehouses already have the capabilities to support a wide range of users across the organization," said Nick Heudecker, research director at Gartner. "Information managers do not need to wait for data lakes to catch up."