The Big Data Era Has Arrived: An Analysis of the Difficulties

Source: Internet
Author: User
Keywords: big data

November 8: By the end of 2011, the number of Internet users worldwide had reached 2.267 billion, and as of June 2012 China had 538 million Internet users. This huge user base produces an enormous amount of data every minute. According to statistics, every minute email users worldwide send 204 million emails, Google handles 2 million searches, and Facebook users share 684,000 pieces of content. Users are also no longer just consuming information online: they post microblogs, upload photos, and upload videos, producing many different types of data. The volume of user-generated data is growing explosively, and the era of big data has arrived.

User data is growing exponentially, and there is no denying that this mass of data can create great value. That value comes from analyzing the data, but for now the ability to process and analyze big data lags far behind: storing, retrieving, cleaning, and analyzing data at this scale are all difficult.

Storage and backup come first. At many Internet companies, data grows by tens or even hundreds of TB per day (1 TB = 1024 GB), and total data volume has reached the PB level (1 PB = 1024 TB), a scale that traditional databases struggle to store. Backup is also critical for enterprises: a lack of backups can have devastating consequences. In the big data era, the explosion of data lengthens backup and recovery times while storage capacity remains limited, so backup and recovery become harder and harder. Enterprises must also consider how to make storage and backup save power, space, and cost.
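To make the scale concrete, the unit arithmetic above can be sketched in a few lines of Python. This is only an illustration: the 100 TB/day growth rate is taken from the upper range mentioned in the text, while the 1 GB/s backup throughput and both function names are hypothetical assumptions, not figures from any real system.

```python
GB_PER_TB = 1024  # 1 TB = 1024 GB
TB_PER_PB = 1024  # 1 PB = 1024 TB


def days_to_reach(target_tb: float, current_tb: float, daily_growth_tb: float) -> float:
    """Days until total volume reaches target_tb at a constant daily growth."""
    return max(0.0, (target_tb - current_tb) / daily_growth_tb)


def full_backup_hours(volume_tb: float, throughput_gb_per_s: float) -> float:
    """Hours needed to copy volume_tb once at a sustained throughput."""
    seconds = volume_tb * GB_PER_TB / throughput_gb_per_s
    return seconds / 3600


# Growing by 100 TB per day, starting from an empty store, until 1 PB.
days = days_to_reach(target_tb=1 * TB_PER_PB, current_tb=0, daily_growth_tb=100)

# One full backup of 1 PB at an assumed (hypothetical) 1 GB/s.
hours = full_backup_hours(volume_tb=1 * TB_PER_PB, throughput_gb_per_s=1.0)

print(f"{days:.2f} days to reach 1 PB, {hours:.2f} hours per full backup")
```

Under these assumptions, a company reaches 1 PB in about ten days, while a single full backup of that volume takes roughly twelve days, which is one way to see why backup and recovery keep getting harder as data grows.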

Before big data can be analyzed, it must be cleaned: checking data consistency, removing duplicate values, and handling invalid and missing values. Big data also contains a huge amount of "noise," which is hard to remove with traditional data analysis software. In addition, the core data must be extracted quickly and analyzed efficiently, which requires building advanced analysis models. Only by analyzing the core data can trends and hidden information be discovered, so that big data actually delivers value. Big data mining depends on hardware and software working together, which places heavy demands on software, hardware, and talent.
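The cleaning steps listed above (consistency checks, duplicate removal, invalid and missing values) can be sketched on a tiny in-memory sample. This is a minimal illustration, not a production pipeline: the record fields (`user_id`, `age`, `city`), the valid age range, and the `clean_records` function are all hypothetical, and real big data cleaning would run on a distributed system rather than a Python list.

```python
def clean_records(records: list[dict]) -> list[dict]:
    """Drop duplicates, blank out invalid ages, and fill in missing cities."""
    seen_ids = set()
    cleaned = []
    for rec in records:
        user_id = rec.get("user_id")
        # Skip records with no key or with a key we have already kept.
        if user_id is None or user_id in seen_ids:
            continue
        # Invalid value: an age outside an assumed 0-120 range becomes missing.
        age = rec.get("age")
        if age is not None and not (0 <= age <= 120):
            age = None
        # Consistency: normalize whitespace and casing; fill missing values.
        city = (rec.get("city") or "").strip().title() or "Unknown"
        seen_ids.add(user_id)
        cleaned.append({"user_id": user_id, "age": age, "city": city})
    return cleaned


raw = [
    {"user_id": 1, "age": 34, "city": " beijing "},
    {"user_id": 1, "age": 34, "city": "Beijing"},   # duplicate of user 1
    {"user_id": 2, "age": -5, "city": "Shanghai"},  # invalid age
    {"user_id": 3, "age": 28, "city": None},        # missing city
]
cleaned = clean_records(raw)
for row in cleaned:
    print(row)
```

Each rule here is a deliberate choice that a real project would have to justify: for example, an invalid age is kept as a missing value instead of dropping the whole record, so the remaining fields stay available for analysis.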

Big data visualization is also difficult. Visualization turns the results of big data analysis into information a company can use. Only after the results have been presented visually, through text, charts, and other representations, can people who are not data analysis professionals fully understand the information the data contains, and only then does it bring value to the company. But big data involves large volumes, mixed data types, complex data models, and abstract results, all of which make visualization hard.

Big data analysis professionals are in short supply. The big data era needs more data analysts and is even creating new roles, such as data scientist, CDO (Chief Data Officer), and specialists in data visualization and data tuning, yet there are no specific professional standards for big data analysis positions. Big data analysts must work across several areas and need at least four skills: technology (software and systems), mathematics (statistics, modeling, and algorithms), business analytics (domain knowledge), and visualization (language and graphics). At present, typical business analysts or traditional data analysts have only one or two of these skills and lack the ability to develop predictive analysis models.

"Big data" has arrived, and its problems will be solved through continued exploration.

(Responsible editor: The good of the Legacy)
