The Information Big Bang: How to Clean Up the Big Data Mess?


[Figure: 41% of enterprise data files have not been accessed in the last three years]

With the advent of cloud computing around 2008, DevOps culture became popular among developers, and in recent years companies have accelerated their transition to the Internet, resulting in a proliferation of developer files. Cloud computing has also driven huge changes in business models: a wave of mergers, acquisitions, layoffs, and startups has increased employee mobility and caused "legacy" data to spike. In addition, with the rise of smartphones, the explosion of video and image files has become a heavy burden on enterprises.

Developer files are the most numerous file type in today's global enterprise data environment, accounting for 20.13% of all files and 9.17% of total storage, according to the Data Genomics Index, a public report published by information management solution provider Veritas Technologies. The report also points out that when employees change roles or leave, their legacy files often become siloed data, creating security risks and long-term storage costs for the enterprise.

Developer files, unknown files (including orphaned data), and image files have become the main targets of the data cleanup that accompanies enterprise transformation. The Data Genomics Index says 41% of enterprise data has not been modified in the past three years. Increasingly, companies create data but never look after it.
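The "not modified in three years" figure suggests an obvious first step in any cleanup: scanning a file tree for stale files. A minimal sketch in Python follows; the three-year threshold and the mtime-based definition of "stale" are illustrative assumptions, not the report's methodology:

```python
# Hypothetical sketch: list files whose last modification is older than
# a cutoff, in the spirit of the report's "41% never modified" finding.
import os
import time

THREE_YEARS = 3 * 365 * 24 * 3600  # seconds, approximate

def stale_files(root, max_age=THREE_YEARS, now=None):
    """Yield (path, age_in_days) for files not modified within max_age seconds."""
    now = time.time() if now is None else now
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                age = now - os.path.getmtime(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if age > max_age:
                yield path, age / 86400  # seconds -> days
```

A real audit would also consider access time (atime) and ownership, since the report ties stale data to departed employees, but mtime is the most portable signal.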

Confusion and loss of control in enterprise transformation

Enterprise IT infrastructure has undergone significant changes that reach deep into daily operations. In China in particular, with the development of "Internet Plus", enterprises are focused on Internet transformation, Internet technology has penetrated traditional enterprise IT, and the "software-defined" concept is everywhere, "which leaves most enterprises both excited and confused about the overall IT environment."

Li Gang, senior director of technical support for Veritas Greater China, shared what he has seen in two years of working with domestic companies: "Companies don't know how to implement software-defined infrastructure, how to build hybrid cloud models into their IT planning, how to apply their knowledge and skill reserves to meet the challenges, or which partners can help them grow." These problems now confront CIOs and IT executives directly, leaving businesses confused.

"We have found in our interactions with customers that enterprise IT managers have always managed IT through the infrastructure. In the past, infrastructure was the most manageable part of a company's IT build-out. After 'software-defined', all the hardware becomes a commodity. In the hybrid cloud era, enterprise applications can drift back and forth between clouds. So the loss of control an enterprise experiences is really a loss of control at the IT infrastructure layer."

He said Veritas believes enterprises need to change their thinking: instead of obsessing over IT infrastructure, they should pay more attention to the data itself. The enterprise of the future will be a software enterprise and a data enterprise, so what enterprise IT decision-makers should really care about is control of the data. For enterprise IT to regain control of its core competencies, it should strengthen data control; data is the core asset of the enterprise.

How to transform data management thinking?

Veritas serves the information and data management needs of more than 50,000 companies around the world, including 86% of the Fortune 500. The Data Genomics Index is the first time since its founding in 1989 that Veritas has analyzed tens of billions of files in depth. The report grew directly out of the conflict between the data explosion of recent years and limited storage resources.

A 2012 Gartner study put the cost of storing 1 PB of data at around $5 million. With the development of technology, that cost has since fallen to about $500,000, while storing 1 PB in the cloud for a year currently costs at least 2.5 to 3 million yuan. Big data has become a big burden for companies before it can generate any value, and data keeps growing. Ken, a Veritas chief information governance expert, said he had met a newly established company with only a few terabytes of total storage, whose business units nonetheless came up with petabyte-scale demands.
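The per-petabyte figures above lend themselves to a quick back-of-the-envelope comparison. A minimal sketch using the article's quoted prices (illustrative only, not current market quotes):

```python
# Back-of-the-envelope check of the per-PB figures quoted above.
# Both price points are the article's, not current vendor quotes.
COST_PER_PB_2012 = 5_000_000   # USD per PB, per the cited Gartner study
COST_PER_PB_NOW = 500_000      # USD per PB, roughly 10x cheaper today

def storage_cost(petabytes, cost_per_pb):
    """Total storage cost for a given capacity at a given per-PB price."""
    return petabytes * cost_per_pb

# A 10 PB estate at the two price points:
cost_2012 = storage_cost(10, COST_PER_PB_2012)  # $50 million
cost_now = storage_cost(10, COST_PER_PB_NOW)    # $5 million
```

Even at the lower price, a growing petabyte-scale estate is a multi-million-dollar line item, which is the burden the article describes.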

Enterprise data keeps expanding, so how can companies transform their data management thinking as early as possible? Li Gang sees several prerequisites. First, focus on data availability: enterprises need to be able to access their data at any time. Second, manage the data lifecycle: enterprises need a complete set of tools covering data generation, processing, archiving, deletion, and destruction. Third, focus on data reuse and mining.

Regarding changes in the data storage environment, Li Gang believes that enterprise data now sits in a hybrid cloud environment, distributed across private clouds and different public clouds, a complex landscape that poses a great challenge to data management. In fact, as IT moves to the cloud and virtualization, data is no longer fixed in place, and businesses care about being able to access the data at any time rather than where it is stored, which brings a whole new perspective.

Starting with an understanding of data genes

To "clean up" data, we first need a fundamental understanding of it, which is the purpose and meaning of the Data Genomics Index. Any enterprise, industry expert, consultant, end user, or technician can explore the data genome of global enterprises through datagenomicsproject.org.

According to the first Data Genomics Index report, enterprise data is growing at a rapid 39% per year, and growth varies by season: more than 68% of video files are created in the summer, possibly because employees put vacation videos on corporate servers, and because many companies run annual backup policies, backup data growth in October, November, and December rises sharply, by 756%.


The Data Genomics Index points out that traditional office files, such as presentations, spreadsheets, and documents, occupy far more space than is reasonable, creating an unnecessary cost burden for businesses, and that visual files such as videos and pictures are another burden. Taking 10 PB as an example, launching an archive project focused on outdated presentations, documents, spreadsheets, and text files can save a business approximately $2 million in storage costs per year.
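The $2 million figure can be reproduced with a simple savings model. The stale fraction and per-PB yearly costs below are assumptions chosen to land near the article's estimate; they are not taken from the report:

```python
# Rough savings model for the 10 PB archiving example above.
# stale_fraction and the per-PB yearly costs are illustrative
# assumptions, not figures from the Data Genomics Index itself.
def annual_archive_savings(total_pb, stale_fraction,
                           primary_cost_per_pb_year,
                           archive_cost_per_pb_year):
    """Yearly savings from moving stale data off primary storage to archive."""
    stale_pb = total_pb * stale_fraction
    return stale_pb * (primary_cost_per_pb_year - archive_cost_per_pb_year)

# 10 PB total, ~41% stale (matching the report's finding), $500K/PB/yr
# on primary storage vs an assumed $10K/PB/yr on archive storage:
savings = annual_archive_savings(10, 0.41, 500_000, 10_000)  # ~$2 million/yr
```

The point of the model is not the exact numbers but the shape of the lever: savings scale with the stale fraction and with the price gap between primary and archive tiers.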

Besides Veritas, another enterprise availability solution provider, Veeam Software, was the first to advocate making March 30 World Availability Day. According to a new industry survey commissioned by Veeam, 84% of CIOs admit they cannot meet the expectations of their employees, customers, and partners for data availability, and lose $16 million a year as a result.

Today, given the huge changes in IT infrastructure, vendors like Veritas, a specialist in data governance and management technology, and Veeam, a provider of enterprise availability technology, each occupy a distinctive position in the market, and they are exactly the kind of suppliers enterprises need to understand during the transformation. Of course, no single vendor can integrate all data management and business availability tools. Veritas plans to launch products resembling "data middleware", integrating and collaborating with more solution vendors to help companies gain comprehensive data control.

Overall, in the era of the big data explosion, companies need to prepare in advance to deal with the big data mess. On the foundation of data availability technology, full data lifecycle management, data utilization and mining, and a data access strategy, they should fully understand the characteristics, patterns, and nature of their data, then choose distinctive, specialized suppliers to solve specific problems, pay attention to the compatibility and universality of the different tools, and work toward eventually forming a "data middleware" layer.

Starting with the data itself and rethinking the enterprise's overall IT strategy and architecture is one of the important methodologies of the transition period. (By Ning Chuang; first published on ITValue)



This article is from the "Cloud Technology Age" blog; please retain this source: http://cloudtechtime.blog.51cto.com/10784015/1763628
