Find new big data analysis solutions to improve business agility and reduce costs

Source: Internet
Author: User
Keywords: big data, solutions, flexibility, cost reduction

Data analysis has always played an important role in realizing the benefits of storing information electronically. Organizations use data analysis solutions to gain insights that help them increase revenue, grow market share, reduce costs, and achieve scientific breakthroughs.

Today, as business processes become more automated, the scope of data analysis is expanding. Information in various formats that was previously kept in separate online and offline repositories can now be stored digitally, ready for consolidation and analysis. As a result, executives increasingly demand data and expect faster, more efficient solutions. Organizations also attach greater importance to data analysis, which puts more pressure on existing business analysts and IT teams.

Definition of big data

In a sense, big data is the frontier of data analysis. The earliest reference to the term "big data" can be traced back to Nutch, an open source project at apache.org. There, big data referred to large datasets that had to be processed or analyzed all at once to update a web search index, such as blogs totaling dozens to hundreds of TB. With Google's release of MapReduce and the Google File System (work that led to the Apache Hadoop open source project), big data came to mean more than a large amount of data; it also covers the speed at which data is processed. With the advent of new structured, unstructured, and multi-structured data types, big data also has a complexity dimension.

The Enterprise Strategy Group (ESG) found that vendors took "big data" literally to mean a lot of data. This is particularly evident among vendors that offer distributed parallel file systems such as GPFS and Lustre, workload-specific storage solutions such as EMC Isilon and Panasas, and databases designed for complex analysis, including Teradata Aster, HP Vertica, IBM Netezza, and EMC Greenplum. As shown in Table 1, ESG has updated its definition of big data to reflect current usage.

Big data is a dataset whose size exceeds the range of normal processing and forces users to adopt non-traditional processing methods.


Table 1. Definition of big data

Assessing the impact of big data on data analysis

ESG believes big data is not market hype. For many organizations across multiple vertical industries, big data is real, and it is changing data center architecture. Data volumes, required processing speeds, and the complexity of data types are all growing faster than standard front-end and back-end data processing capabilities, which forces IT teams to consider unconventional ways to meet business needs.

How can an organization use its current analysis platform and underlying IT architecture to handle growing volumes of data while easing the pressure to improve performance? Many organizations are trying to solve this problem. To better understand how organizations respond to the challenges of big data, and what they hope to achieve by deploying new analytics platforms to meet big data requirements, ESG recently surveyed 270 decision makers and stakeholders. The findings follow.

According to ESG's findings, organizations are more likely to face big data challenges if they have large amounts of data, growing database capacity, and data that comes from multiple sources. As more and more data sources are integrated into business intelligence and data processing tasks, conventional data analysis processes are no longer sufficient. These organizations recognize that improving their data analysis capabilities is just as important.

More than half of the respondents rated improving data analysis capabilities as one of their five most important IT priorities for the next 12-18 months (see Table 2). In addition, only 5% said that data analysis is not among their 20 most important IT priorities. More than half (54%) of enterprises (1,000 or more employees) consider data analysis one of their top five IT priorities, while only 42% of medium-sized organizations (500-999 employees) hold this view.

Table 2. The relative importance of data analysis

At present, no leading data analysis platform has emerged. More than half of the organizations still use custom data analysis solutions. General-purpose databases tuned for specific workloads are also widely used for data analysis. Organizations with at least TB-scale data are more likely to use cloud-based data analysis services, as well as massively parallel processing (MPP) or symmetric multiprocessing (SMP) analysis databases. Although workload-specific appliances (which bundle the analysis database with software, storage, server, and network resources) have been available for years, only 6% of organizations regard these solutions as their primary data analysis platform. The share is so small mainly because vendors offer a limited choice of appliances, and this limitation will remain for the next 12-18 months. The results show that organizations have been pushing the limits of their analysis platforms while also looking for a better framework to handle their growing data analysis tasks.

Data integration is the most common data analysis challenge: more than one-third of respondents found the data integration process too time-consuming (39%), too large (35%), or both. These problems can become more serious as the number of data sources feeding enterprise data integration increases.
