Best practices for big data analysis build on conventional rules

Source: Internet
Author: User
Keywords: big data, big data analysis, best practices

With the advent of new terminology, technologies, products, and providers, "big data" analysis may seem unfamiliar, but proven data management best practices still apply to this emerging discipline.

As with any business intelligence (BI) or data warehouse effort, experts say it is important to have a clear understanding of the organization's data management needs, and a clear strategy, before starting a big data analysis project. Big data analysis is widely discussed, and companies across industries are flooded with new data sources and growing volumes of information. But investing heavy resources in big data technology before it is clear what value it can actually bring to the company is, according to these experts, the most serious mistake users make.

David Menninger, an analyst at research firm Ventana, focuses on BI, analytics, and information management technology. Rather than rushing into the technology, he says, start from a business perspective: communicate with CIOs, data scientists, and business users to identify business goals and expected value, and only then begin.

The most critical part of the process is accurately cataloging the available data and determining how the organization can best use those resources. Menninger points out that CIOs, IT managers, and BI staff need to decide which data is retained, aggregated, and used, and weigh that against data to be discarded. They should also consider external data sources that are not yet involved but may be added later.

Menninger adds that even if companies are unsure when or how they will apply big data analysis, it is still useful to conduct such an assessment as early as possible. Starting data capture early also helps prepare for the eventual jump. "Even if you don't know what you're going to do with it, capture the data first," he said. "Otherwise, you lose the opportunity, because you won't have enough historical data to analyze."

Start small with big data

Big data analysis should likewise start with small, well-defined opportunities and build from there. As companies expand the range of data sources and information they analyze, and begin building analytical models to uncover patterns and dependencies in structured and unstructured data, they need to stay focused on the results that matter most to their business goals.

Gartner analyst Yvonne Genovese points out: "If you end up chasing new models that turn out to be useless, you have backed yourself into a corner."

ComScore specializes in tracking Internet usage, providing corporate customers with web analytics and sales intelligence services. The company long ago recognized the need for some kind of big data strategy, but it picked a few very specific starting points and then slowly built up its big data analysis project.

"We started out as a child--extracting data streams and transferring them to different systems," said Will Duckworth, ComScore's vice president for software engineering. If you can't reach a certain size, you can't do it overnight. ”

Given the volume of data the company handles, scale is what ComScore cares about most. In 2009, when it was collecting only 300 million records a day (it now ingests 23 billion records a day, and the number is still growing), Duckworth began looking for new systems and technology infrastructure to handle ComScore's data processing efficiently.

Don't forget that the ultimate goal is still big data

By leveraging open source Hadoop technology and new analytics tools, Duckworth tuned the open source environment so that SQL-oriented business analysts could adopt it more easily. He points out that when settling on a big data analysis implementation plan, a company must pay close attention to scale.
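The processing model behind Hadoop is MapReduce: a map phase emits key-value pairs from raw records, and a reduce phase aggregates them per key. As an illustration only (the log format and sample records below are invented, not ComScore's), the model can be sketched in-process like this:

```python
# A minimal in-process sketch of the MapReduce model that Hadoop
# popularized: count page views per URL from raw log records.
# The "user url" record format and the sample logs are illustrative.
from collections import defaultdict

def map_phase(records):
    """Emit one (url, 1) pair per page-view record."""
    for record in records:
        url = record.split()[1]  # assumes "user url" format
        yield url, 1

def reduce_phase(pairs):
    """Sum the counts per key, as a Hadoop reducer would."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

logs = ["u1 /home", "u2 /home", "u1 /search", "u3 /home"]
print(reduce_phase(map_phase(logs)))  # {'/home': 3, '/search': 1}
```

In a real Hadoop cluster the map and reduce phases run on many machines in parallel, with a shuffle step grouping pairs by key in between; the aggregation logic, however, is the same shape as this sketch.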

He explained: "You have to think about the changes: how much data you will need to work with six months from now, how many servers you will need to add, and whether the software can handle those tasks. Failing to account for data growth is a common oversight in production environments."
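Duckworth's six-month question can be turned into a rough back-of-the-envelope estimate. The sketch below is hypothetical: the growth rate, record size, retention window, and per-server capacity are illustrative assumptions, not ComScore figures.

```python
# Hypothetical capacity-planning sketch: project daily data volume and the
# number of storage servers needed some months out. All inputs below are
# illustrative assumptions for the sake of the arithmetic.

def project_capacity(records_per_day, monthly_growth, bytes_per_record,
                     server_capacity_tb, months=6):
    """Project records/day, daily ingest (TB), and servers needed."""
    projected_records = records_per_day * (1 + monthly_growth) ** months
    daily_tb = projected_records * bytes_per_record / 1e12
    # Assume a 90-day raw-data retention window (an arbitrary choice).
    retained_tb = daily_tb * 90
    servers = -(-retained_tb // server_capacity_tb)  # ceiling division
    return projected_records, daily_tb, int(servers)

records, daily_tb, servers = project_capacity(
    records_per_day=300_000_000,  # starting volume
    monthly_growth=0.20,          # assumed 20% month-over-month growth
    bytes_per_record=500,         # assumed average record size
    server_capacity_tb=20,        # assumed usable storage per server
)
print(f"Projected records/day after 6 months: {records:,.0f}")
print(f"Daily ingest: {daily_tb:.2f} TB; servers for 90-day retention: {servers}")
```

Even this crude compounding model makes the point: at 20% monthly growth, daily volume roughly triples in six months, which is exactly the kind of shift that catches unprepared production environments.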

Another point that many companies overlook after plunging into the "new normal" of big data: the "old normal" of data management still applies.

Marcus Collins, another Gartner analyst, points out: "Information management practices are as important for big data today as they were for earlier data warehouses. Even companies looking to increase flexibility should keep in mind that information is a business asset and should always be managed as one."

(Responsible editor: Lu Guang)
