Data silos in big data analytics: can you see them?

Source: Internet
Author: User

Ever since enterprises and CIOs first tried data mining, data silos (isolated "islands" of data) have held back the efficiency of business intelligence. Data silos are expensive, unwieldy databases that have to be maintained but are not compatible with each other, and expecting to extract great insight from them is all but impossible. In other words, the number of databases has nothing to do with the amount of knowledge that can be mined from them. As one business intelligence expert put it: garbage in, garbage out.

Big data analytics, or the "3 Vs" of data (variety, volume and velocity), is a buzzword that leaves most companies gasping for air. The reason, according to analyst Ted Friedman, is that data silos have spread exponentially, like a plague.

"There are data silos in your company, at any time and in any place. From a big data point of view, the whole universe is awash with data silos: inside the firewall, on the web, in the 'cloud', and in the data that comes from other businesses, customers and vendors," says Friedman, who is in charge of information management consulting. "All of this makes it harder to break down the silos and extract meaningful knowledge."

So what should a CIO do to make sense of big data? Like other IT challenges that businesses face, this problem and its solutions revolve around people, processes and technology. The CIO must not only build new skills in the workforce (including recruiting data scientists, analysts and architects) but also convince senior management that big data governance is an important issue that requires executive and even board attention.

Data management suddenly becomes fashionable

One way to deal with data silos in big data is to narrow the scope of the analysis and concentrate on a specific breakthrough. Gartner has looked specifically at an information valuation process that uses this approach. "In the vast ocean of data, different data have different value, so the goal of data mining becomes defining a problem space and then analyzing in depth within that space," Friedman said. "In my opinion, customers tend to define the boundaries of their analysis too broadly."

To find that focus, companies can start by asking themselves a few questions: What do we want to get out of the data? How does this data relate to our business? How can we use this data to generate a positive return?

As companies pay more attention to the value hidden in large volumes of data, Gartner notes that more and more of them are setting up data governance committees. These committees consist of business stakeholders who cover everything from which data sources matter and which technologies to invest in, to data-related issues such as data quality, data retention, data consolidation, data security and information privacy.

The dangers of exploring external data silos

To get the most value from big data, the right to explore it should be opened up to staff beyond a handful of IT professionals. Gartner and other experts worry, however, that many organizations are rushing to profit from big data while ignoring the risks of IT governance, and will pay the price for privacy violations, falsified data and similar failures.

"In business, complete data openness is unrealistic," says Boris Evelson, principal analyst at Massachusetts-based Forrester. "There are all kinds of regulatory issues and conflicts of interest. For example, the line between an investment bank's research analysts and its traders must never be crossed."

Protecting the integrity of the data is a huge challenge, says David Gallaher, IT services manager at the National Snow and Ice Data Center (NSIDC) at the University of Colorado, which works with NASA as a data-gathering partner. Gallaher's main task is to collect and manage petabytes of scientific data recording all the frozen regions of the world, and to make sure the data is distributed in a controlled manner to the researchers who need it. "We need to make it as easy as possible for people to get the data they need, but we have to make sure that they can't change any of it at will," says Gallaher, who trained as a geographer. At the same time, NSIDC's own scientists routinely update the data when they access it, so the governing principle of data management must be "the right people make the right changes," Gallaher stressed. NSIDC is currently working with the National Science Foundation to refine its data governance principles.

Data management: multiple views, not multiple copies

Not everyone agrees that big data must mean more data silos. Anjul Bhambhri, vice president of IBM's Big Data project, claims that big data can actually "help" CIOs.

"Now data silos can actually be cleaned up," said Bhambhri in an interview. She helps more than 200 companies a year clean up their data silos. One large enterprise had set up 13 data marts for mail archiving (8 of them used by the legal department) because staff could not wait for IT to process their requests when they wanted to access archived mail. Two departments of another company each made their own copies of the web caches. "You know they have 15 billion caches a day to deal with," Bhambhri said.

New technologies (including, of course, IBM's big data products) allow businesses to store and analyze huge amounts of data in a single data warehouse. As a result, each of the two companies now keeps only one active data archive, with no need to maintain 13 archive copies or duplicate 15 billion web caches. "Your data is only stored in one place, and multiple applications can access the data at the same time, because the data is kept unchanged in the storage hierarchy," Bhambhri said. However, even Bhambhri and other IT people who actively advocate big data analytics constantly remind companies that effective big data analysis requires a thorough overhaul of the existing IT architecture. "Being able to store data effectively is a big step in the right direction," she says, "but storing it is not enough, and effective analysis requires a lot of algorithms."
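
The "multiple views, not multiple copies" idea can be illustrated with a minimal sketch. It assumes a relational store and uses Python's built-in sqlite3 module rather than any IBM product. The table, view and column names are hypothetical and chosen only for illustration. Each department reads through its own view of one archive table, so a new record stored once becomes visible to every view without any copying.

```python
import sqlite3

# Hypothetical schema: one archive table stands in for the single data store.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mail_archive ("
    "id INTEGER PRIMARY KEY, department TEXT, subject TEXT)"
)
conn.executemany(
    "INSERT INTO mail_archive (department, subject) VALUES (?, ?)",
    [
        ("legal", "Contract review"),
        ("sales", "Q3 forecast"),
        ("legal", "Retention policy"),
    ],
)

# Each department gets a view (a stored query), not a physical copy.
conn.execute(
    "CREATE VIEW legal_mail AS "
    "SELECT id, subject FROM mail_archive WHERE department = 'legal'"
)
conn.execute(
    "CREATE VIEW sales_mail AS "
    "SELECT id, subject FROM mail_archive WHERE department = 'sales'"
)


def subjects(view_name):
    """Return the subjects visible through one of the views defined above."""
    # view_name comes from the fixed set of views in this sketch, not user input.
    rows = conn.execute(
        f"SELECT subject FROM {view_name} ORDER BY id"
    ).fetchall()
    return [row[0] for row in rows]


legal_before = subjects("legal_mail")

# A new message is written once, to the single archive table.
conn.execute(
    "INSERT INTO mail_archive (department, subject) "
    "VALUES ('legal', 'New filing')"
)

# The legal view reflects it immediately; no replica had to be refreshed.
legal_after = subjects("legal_mail")
```

A real deployment would add access controls on each view. The point of the sketch is only that the views share one stored copy of the data.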


Original content from TechTarget China. Original link: http://www.searchcio.com.cn/showcontent_65230.htm

(Responsible editor: Lu Guang)
