Considerations for large data analysis projects

Source: Internet
Author: User
Keywords: big data, big data analysis, big data analysis project

"Big data" has become one of today's most popular buzzwords, alongside related terms such as business intelligence (BI), analytics, and data management. More and more companies are turning to business intelligence and analytics vendors to help them solve business problems in big data environments.

So what is big data? The IT publication eWEEK recently offered the following view, partly based on Gartner's terminology: big data involves the volume, variety, and velocity of structured and unstructured data that moves over the network between processors and storage devices and is turned into business insight for the enterprise.

This description covers the data management and analysis side, but it overlooks the fundamental business challenge surrounding big data: complexity. For example, big data deployments often involve information from social media networks, e-mail, sensors, network activity logs, and other data sources that cannot simply be integrated into a traditional data warehouse system.

In many cases, all of this disparate data has to be brought together before it becomes meaningful at a broader level. Doing so may have a significant impact on the business rules and other components of a big data analysis system. In terms of data storage and query management, this complexity is what sets big data apart from traditional data, and it is the main reason why database and data analysis software vendors have had to strengthen their products to help companies cope with big data.

Understanding big data is the first step in assessing your technical needs and developing a big data analysis plan. The second is to understand the market and current trends, as well as the business value and competitive advantage your company hopes to derive from increasingly large and diverse data sets.

The big agenda for big data analysis projects

Many companies have long had large data sets. But now more and more businesses are storing terabytes, if not petabytes, of data. In addition, they want to analyze critical data daily, or even in real time, replacing the traditional weekly or monthly BI review of historical data. They have to handle increasingly complex queries that span a variety of data sets. These may include enterprise resource planning and customer relationship management systems, plus social media and geospatial data, internal documents, and other forms of transactional information. More and more enterprises also want to give business users self-service BI capabilities that make it easier for them to understand the analysis results.

All of this can feed into a big data analysis strategy, and technology providers address these needs in different ways. Many database and data warehouse vendors focus on the ability to process large amounts of complex data in a timely manner. Some use columnar data storage to achieve faster query performance, some provide built-in query optimizers, and some add support for open source technologies such as Hadoop and MapReduce.
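To make the MapReduce idea mentioned above more concrete, here is a minimal single-process sketch of the map, shuffle, and reduce phases in plain Python. It does not use the Hadoop API; the sample records and all function names are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical sample records: (source, bytes) pairs standing in for log lines.
RECORDS = [
    ("web", 120),
    ("email", 40),
    ("web", 80),
    ("sensor", 5),
    ("email", 60),
]


def map_phase(records):
    """Emit one (key, value) pair per input record."""
    for source, size in records:
        yield source, size


def shuffle_phase(pairs):
    """Group emitted values by key, as a framework would do between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Collapse the values collected for each key into a single total."""
    return {key: sum(values) for key, values in groups.items()}


def total_bytes_by_source(records):
    """Run the three phases in sequence on one machine."""
    return reduce_phase(shuffle_phase(map_phase(records)))


if __name__ == "__main__":
    print(total_bytes_by_source(RECORDS))
```

In a real cluster the map and reduce steps would run in parallel on many nodes and the shuffle would move data across the network; the sketch only shows how the work is divided.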

In-memory analytics tools can help speed up analysis by reducing the need to read data from disk drives. Data virtualization software and other real-time data integration technologies can be used to assemble information from different data sources. Packaged analytical applications are available for vertical markets that routinely deal with big data, such as telecommunications, financial services, and the online gaming industry. Data visualization tools can simplify the presentation of query results from big data analysis and better serve executives and business managers.
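As a rough illustration of assembling information from different data sources in memory, the sketch below joins two small extracts on a shared key. The field names (`customer_id`, `mentions`), the sample rows, and the choice to default missing values to 0 are all assumptions made for this example, not features of any particular integration product.

```python
# Hypothetical in-memory extracts from two separate systems.
CRM_ROWS = [
    {"customer_id": 1, "name": "Acme", "region": "EU"},
    {"customer_id": 2, "name": "Globex", "region": "US"},
]
SOCIAL_MENTIONS = [
    {"customer_id": 1, "mentions": 14},
    {"customer_id": 3, "mentions": 2},
]


def combine_by_customer(crm_rows, social_rows):
    """Left-join social media counts onto CRM rows using customer_id."""
    # Build a lookup table once so each CRM row is matched in constant time.
    mentions = {row["customer_id"]: row["mentions"] for row in social_rows}
    combined = []
    for row in crm_rows:
        merged = dict(row)  # copy so the source rows stay untouched
        # Assumption: a customer with no social data counts as 0 mentions.
        merged["mentions"] = mentions.get(row["customer_id"], 0)
        combined.append(merged)
    return combined


if __name__ == "__main__":
    for row in combine_by_customer(CRM_ROWS, SOCIAL_MENTIONS):
        print(row)
```

Social records with no matching CRM customer (here, `customer_id` 3) are dropped by this left join; whether that is acceptable depends on the business rules in play.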

Before creating an implementation plan and choosing a big data infrastructure, enterprises whose data and analysis requirements fall into these categories should first consider the following questions:

-The timeliness of the data required, because not all databases support real-time data availability.

-The interconnectedness of the data and the complexity of the business rules. You will need to connect a variety of data sources to gain a broad understanding of enterprise performance, sales opportunities, customer behavior, risk factors, and other business indicators.

-The amount of historical data that needs to be analyzed. What if a data source contains only two years of data, but the analysis actually needs five?

-Which technology vendors have big data analysis experience in your industry, and do they have a relevant track record?

-Who is responsible for the various data in the enterprise, and how will these people take part in the big data analysis initiative?

These factors do not constitute a detailed requirements plan, but they can give organizations a starting point for identifying the right technology and deploying a big data analysis system.
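The historical data question in the list above can be turned into a simple check. The following sketch, with hypothetical function names and example dates, estimates how many years of history a source holds and how far it falls short of a stated requirement.

```python
from datetime import date


def years_of_history(earliest, latest):
    """Approximate span between two dates in years (365.25 days per year)."""
    return (latest - earliest).days / 365.25


def coverage_gap(earliest, latest, required_years):
    """Return the number of years missing, or 0.0 if the requirement is met."""
    return max(0.0, required_years - years_of_history(earliest, latest))


if __name__ == "__main__":
    # Hypothetical source holding two years of data when five are needed.
    gap = coverage_gap(date(2020, 1, 1), date(2022, 1, 1), required_years=5)
    print(f"Missing roughly {gap:.1f} years of history")
```

A check like this only covers the date range; it says nothing about whether the older data is complete or consistent with the newer records.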

(Responsible editor: The good of the Legacy)
