Big Data Engine: Mining the Gold Mine Beneath the Iceberg

Source: Internet
Author: User
Keywords: gold, big data

With the development of the mobile Internet and the Internet of Things (IoT), data is now being produced all the time. It comes from individual users' mobile phones and wearable devices, as well as from industry applications such as sensors and surveillance cameras. Digitalization transforms the analog information of the physical world into digital information, and a large amount of data is produced and accumulated as the digital world and the real world merge. 90% of all the world's data was generated in the past two years. The 25 PB of data generated globally each day in 2013 corresponds to the total amount of information in 1,500 national libraries. Global data volume grew rapidly from 5 EB in 2003 to 2.7 ZB in 2012, and is expected to reach 40 ZB by 2020. This data is often likened to an iceberg floating at sea: its immense value lies hidden beneath the water.

As Chen Shangyi said at the conference, any data that is produced has an original purpose, which is its first value; when data accumulates quickly, it yields a second and a third value, and effective technology is needed to discover and extract them. Take online photo albums: their first value is to provide users with a storage service, but once enough photos have accumulated, they can be analyzed to find popular colors and even predict future trends. Or take wearable devices, which can monitor our bodies 24 hours a day: their first value is to record our physical condition or, to use the fashionable phrase, to "quantify the self". But if the data is analyzed over a long period, it may reveal our state of health and provide early warnings.

Big data has two important characteristics: large volume and rapid growth. According to a McKinsey report, medical data will rise sharply to 35 ZB by 2020, equivalent to 99 times the amount of data in 2009. According to Ministry of Transportation data, expressway video surveillance in a single province generates 50 TB of data per day. This data, too, is generated for its first value: medical data serves patients, and video surveillance data serves after-the-fact review. Once that first value has been exploited, the data is generally shelved, and it gradually becomes a burden on the industry. In fact, this data still holds considerable value. How to discover this hidden value has become one of the industry's problems.

Facing the iceberg of value in their own data, various industries have taken some practical steps toward mining it. In his speech, Chen Shangyi summed up some misunderstandings seen in enterprise practice. Many traditional industries are still limited to developing and using small data, and treat small data as big data without engaging with the comprehensive, complete, and systematic nature of big data. Others treat traditional data processing methods and technologies as big data technology, without the new capabilities that the big data era has brought. Traditional industries need to recognize the characteristics of big data and develop new tools and platforms that can cope with its scale, complex structure, and high-speed growth. They therefore need big data technology and capabilities in order to mine new value from their industry data.

How does Baidu tap into the value of its own data iceberg? Chen Shangyi gave a few interesting examples. As a search engine, Baidu connects people with information and is naturally a big data company. First, as a search engine, Baidu needs to collect data from the Internet, and to make information easy to retrieve it has to store large amounts of differently structured data: text, pictures, audio, and video. In the past, searching for a keyword returned a monotonous list of links, and finding related videos required a separate search. Now Baidu uses its own data mining and artificial intelligence technology to connect these different types of web data and produce what it calls "knowledge graph" results. For example, searching Baidu for "The Voice of China" now returns not only a description of the program but also its singers, songs, similar programs, and other results. A single search brings up information in many forms, which this author found refreshing.

At the same time, users' search behavior leaves information behind. Baidu runs big data analysis on it to build audience profiles and discover interests, characteristics, and other new information, which in turn helps users find the most relevant information among thousands of results. This is Baidu Sinan. It makes the ads shown more relevant to the keywords users search for, so that ads placed on Baidu are more effective. Looking to the future, Baidu also uses its artificial intelligence technology to offer Baidu Forecast, which predicts the popularity of tourist cities and attractions as well as, for the college entrance examination, majors and colleges. I saw Baidu's World Cup forecast on the website: Brazil to win. Let's wait and see.

Baidu has used technology to lift the iceberg and dig out the gold mine in its big data resources. Finally, Chen Shangyi said that the development of big data has entered a new phase of data mining. Baidu has packaged these big data technologies as the "Baidu Big Data Engine" and opened it up to industry and society, to help traditional industries use the engine as a platform, in line with the characteristics of big data, to mine new value from their industry data and upgrade.
