Mission impossible? Data governance processes take on big data

Source: Internet
Author: User
Keywords: big data

"Big data" can be a tempting source of competitive advantage for an enterprise, offering the ability to unlock secrets about customers, understand how a website is used, and gain insight into other key elements of the business. But caution is in order: launched on enthusiasm alone, without a proper data governance process, a big data project can bring chaos and trouble, including false data and unexpected costs.

The role of data governance is to safeguard big data. But big data often involves large amounts of unstructured information, and for many enterprise IT departments it is still a very recent phenomenon. As a result, according to data management analysts, the governance of big data environments is still in its early stages, and there is as yet no widely agreed way to govern big data effectively.

"Big data is such a new field that so far no one has developed the relevant governance procedures and policies," said Boris Evelson, an analyst at Forrester Research in Cambridge, Massachusetts. "There are more questions than answers."

A fundamental problem is that big data pools are geared more toward data exploration and discovery than toward traditional business intelligence reporting and analysis, Evelson added. This, he says, creates a vicious circle: "Data cannot be governed until it is modeled, but it can only be modeled after it has been analyzed."

Data governance programs provide a framework for setting data usage policies and implementing controls to ensure that information remains accurate, consistent, and accessible. The major challenge in governing big data lies in classifying, modeling, and mapping the data as it is captured and stored, especially given how much of it is unstructured.

"To get meaningful business information from big data, we need to do all kinds of preparation, such as semantic analysis of the data, and then render it as conceptual models or ontologies," said Malcolm Chisholm, president of AskGet, a data management consultancy in Holmdel, New Jersey.

Finding clues in big data

The hard part is that everything in the big data governance process is so new. "When it comes to big data, there's a lot of immaturity, and most data managers are really clueless," Chisholm said.

Big data, which also includes large volumes of structured transaction data, has distinctive characteristics. It is usually defined by three words: volume, variety, and velocity. Forrester adds variability to its definition, and rival consulting firm Gartner adds complexity.

In addition, the data often comes from external sources whose accuracy cannot always be readily verified, and the meaning and context of text data are not necessarily clear. In many cases, it is stored in Hadoop file systems or NoSQL databases rather than in a traditional data warehouse. For many businesses, big data also draws in a broad cast of participants: IT managers, programmers, data architects, data modelers, and data governance professionals.

Rick Sherman, founder of the consultancy Athena IT Solutions in Stow, Massachusetts, says one of the biggest pitfalls in trying to govern massive amounts of data is losing sight of business priorities.

For example, most of the unstructured data that businesses capture comes from social media, and usually only a small fraction of it is valuable, according to Sherman. "It would be a big mistake to try to manage or control all the unstructured data," he said. He warns that businesses may end up wasting time and resources on data that does not matter.

Danette McGilvray, president of Granite Falls Consulting in Newark, California, said that if it is not handled sensibly, big data can simply become a time sink for data management and governance teams. "The only way we're going to tell if big data is worth governing is to know which business needs that data serves," McGilvray said. "When it comes to big data, we still have to remember that."

Gwen Thomas, founder and CEO of the Data Governance Institute, a consultancy and training company based in Orlando, Florida, advises that assessing the quality of incoming data should be one of the top priorities for data governance managers. She says proactive data quality checking can save a lot of time and trouble.

The importance of mapping new data to the reference data that a business uses to classify information is often underestimated, Thomas said. Aligning big data with existing reference data is "a huge detail problem," she said. "In fact, if it is not done, the information produced by processing big data may be misleading, inaccurate, or incomplete."

To help ensure proper data mapping, the task should be assigned to a senior data architect rather than left to a less experienced data modeler or to someone outside IT, says Thomas.

Chisholm says data governance managers should also make a priority of talking with the programmers, data modelers, and business users who often start big data installations. Such discussions should begin with an appreciation of Hadoop and NoSQL technologies and how they differ from relational databases, and should build an understanding of the need for a unified approach to governance.

Companies should avoid letting programmers and users build big data systems, and the data models and mappings those systems require, from a siloed perspective. Doing so can be costly: the resulting installations may fail to deliver the expected business benefits while money is wasted on unnecessary systems.

