Is big data just a concept or a practical tool?

Source: Internet
Author: User





Starting last year, the term "big data" began to appear frequently, both in the Internet industry and in others.



In China's Internet circles, anything with the nature of a "concept" always spreads quickly. There are many reasons, and the overall atmosphere plays a part: most Internet entrepreneurs hope to change the world through forward-looking innovation, win the pursuit of capital, and finally cash out. In this process, concepts are rapidly disseminated, packaged, and turned into all kinds of labeled products. Pragmatists, meanwhile, only accept them passively, without a correct understanding or any deep exploration.



As the figure below shows, the big data concept began to spread in 2008. The figure compares the search trend for "big data" in Chinese on Baidu with the trend for "Big Data" in English on Google (Baidu's data has been weighted by page views so that it is comparable to Google's and reflects the trend):



For the term "big data", the surge in Chinese searches on Baidu is far greater than the surge in English searches on Google.



This is Silicon Valley's notorious technology maturity curve (the hype cycle), and in the domestic Internet industry it has been inherited and taken to an even greater extreme.



A joke goes: "Big data in China today is like a bunch of teenagers talking about sex. Everyone likes to talk about it, and anyone who doesn't seems abnormal, but only very few really have experience. Those who do have experience stay silent and just smile." The Internet industry is growing fast, and these kids will be adults sooner or later, but so far the vast majority of beneficiaries are just the label makers, like people selling illicit publications to teenagers.



What is big data?



So what is big data? Is big data just a concept, or a real future?



First of all, the purpose of all data is to find laws.



Materialist dialectics says: the world is material, matter is in motion, motion follows laws, and those laws can be mastered. Whether it is the earliest statistics, the data analysis and data mining that appeared after the computer, or today's big data, we are all exploring the laws of the world and trying to understand the world through them.



Before computers and the Internet existed, pioneering scientists laid the groundwork in mathematics and statistics. With the advent of computers, the ability to store and compute data increased significantly, and so did the ability to collate and analyze it. The advent and development of the Internet further enriched the means of collection and greatly increased the volume of data. The ways of finding laws through data have been enriched along with it.



In this process, data is on the one hand getting bigger and on the other hand getting "smaller". What does that mean? The evolution can be summed up simply as "coverage of the whole population by the sample" and "discovery of the value of microscopic data". The essence of data is sampling and modeling: because technical means cannot capture all the characteristics of an object, we can only approximate the whole through a part and describe the object through an abstract model. After the advent of computers and the Internet, the ability to collect data and the ability to analyze and mine it were both greatly enhanced, so the samples we can study have become larger and ever more detailed.



Suppose we want to know the quality of the apples in a cart. Previously, we could only randomly sample 100 apples and check their appearance for damage or pests. Now we take a sample of 7,000, with more than 30 data points describing each apple's characteristics and quality. In the future there will be no need to sample at all: we will have data on 100% of the apples, with more than 100 data points describing each apple's features and quality, and even data on its entire growth cycle.
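
The effect of sample coverage can be sketched in a few lines of Python. This is a minimal illustration and not part of the original article: the population size (20,000 apples), the 8% damage rate, and the function names `make_apples` and `estimate_damage_rate` are all assumptions chosen for the example. Only the sample sizes of 100 and 7,000 come from the text above.

```python
import random


def make_apples(n, damage_rate, seed=0):
    """Build a hypothetical cart: True means the apple is damaged."""
    rng = random.Random(seed)
    return [rng.random() < damage_rate for _ in range(n)]


def estimate_damage_rate(apples, sample_size, seed=1):
    """Estimate the damaged share from a random sample without replacement.

    If the sample covers the whole cart, this is simply the exact rate.
    """
    if sample_size >= len(apples):
        return sum(apples) / len(apples)
    rng = random.Random(seed)
    sample = rng.sample(apples, sample_size)
    return sum(sample) / sample_size


# Assumed population: 20,000 apples, roughly 8% damaged.
apples = make_apples(20000, 0.08)
true_rate = sum(apples) / len(apples)

for size in (100, 7000, len(apples)):
    estimate = estimate_damage_rate(apples, size)
    print(size, round(estimate, 4), "error:", round(abs(estimate - true_rate), 4))
```

With a sample of 100, the estimate can easily miss the true rate by a few percentage points. With 7,000 it is typically within a fraction of a point, and with full coverage no sampling error is left, only whatever error lies in the measurements themselves.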



But whether it is statistics, data analysis, data mining, or today's big data, our mission has not changed: by collecting, collating, and analyzing data, we look for patterns, infer the nature of things, and even predict the future.



At any stage, what we can accomplish is limited: we can only infer part of an object's nature, not all of it. But when technology develops to a certain point, it produces new techniques and methodologies that bring inference and prediction a step closer to reality. Taking that step can greatly improve productivity, and that is the value of big data.



Specific industry examples



Next, let's pick one industry as an example to illustrate: basketball (the NBA).



In the early days of the NBA, because the league was not yet highly commercialized, the statistics kept on a game were very limited. Players, coaches, and team managers alike understood players through intuition or a few basic statistics.



In 1986 the NBA began keeping full statistics. That is why news reports now love to say: "Since statistics began in 1986, this is the Nth player to post XXX in a single game ..." NBA statistics formally entered the modern era, and the successful application of database technology means you can look up historical data on www.nba.com.



From then on, a new pastime emerged. Just as we like to rank the heroes of martial-arts novels and assign them their seats, the media took up the new hobby of citing complete data in large quantities. So labels such as "scoring weapon", "defensive stopper", and "shooting master" were gradually replaced by "points per game", "rebounds plus blocks", "shooting percentage", and so on. All the fans started to like data.



But looking only at the data, it is hard to understand why the young Marbury, who averaged 20 points and 7.6 assists, was called a lone wolf. It is hard to understand why Bowen, with his unremarkable numbers and unspectacular steal totals, was a far stronger defender than Magic Johnson, a two-time steals leader. And how do you explain that Stoudemire averaged 8.8 rebounds and 1.4 blocks over his career, and Garnett likewise had 8.9 rebounds and 1.4 blocks with the Celtics, yet KG's defense and Stoudemire's are worlds apart?



The reason is that such data is too coarse to capture a player at the microscopic level, so it cannot describe a player's role or characteristics on the court.



In the 21st century, detailed microscopic data increasingly made its way into the NBA, and professional NBA data-mining companies such as Synergy QSL appeared. "SI" published a set of professional statistics on the basketball god Jordan: 80.2% of the Bulls' offense went through his hands; 83.9% of his shots were jumpers; 54.3% of his shots came from the right side of the court; 17% of his offense came from clear-out isolation plays; and in isolation he took 2.67 steps on the dribble before pulling up for a jump shot, hitting 46.3% when the defender was in position to contest.
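
The difference between aggregate and microscopic data can be sketched with a toy shot log. The records below are invented purely for illustration (players "A" and "B" are hypothetical, not real players), and the helper names `hit_rate` and `shot_mix` are assumptions of this sketch:

```python
from collections import Counter

# Hypothetical event-level shot log: (player, shot_type, made).
shots = [
    ("A", "jumper", True), ("A", "jumper", False),
    ("A", "jumper", True), ("A", "jumper", False),
    ("B", "layup", True), ("B", "layup", True),
    ("B", "jumper", False), ("B", "jumper", False),
]


def hit_rate(log, player):
    """Aggregate view: share of a player's shots that went in."""
    made = [m for p, _, m in log if p == player]
    return sum(made) / len(made)


def shot_mix(log, player):
    """Microscopic view: share of a player's shots by shot type."""
    counts = Counter(t for p, t, _ in log if p == player)
    total = sum(counts.values())
    return {shot_type: c / total for shot_type, c in counts.items()}


for player in ("A", "B"):
    print(player, hit_rate(shots, player), shot_mix(shots, player))
```

Both hypothetical players shoot 50%, so the aggregate number cannot tell them apart. The shot mix shows that A takes only jumpers while B splits shots evenly between layups and jumpers, which is the kind of distinction the per-shot breakdown above makes visible.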



At this point, data entered a new era. In this year's NBA playoffs, the US media began adding players' running distance on the court, their speed, their top speed, and so on to the dimensions of analysis. New technical means bring the discovery of the value in microscopic data. Maybe we can call that big data.



The correct view of big data



Data does not lie. But to be precise about anything, you need enough data and deep enough digging at the microscopic level, and the data is never enough. In basketball, for example, data and perception will forever be intertwined. More and more data models give results that come ever closer to our impressions, but if either data or perception came to dominate completely, talking about basketball would be no fun. And no matter how much data you have, you still need coaches to design tactics, bring out each player's strengths, and lift team morale to win the game. Data by itself will not "win".



Big data is a kind of progress, but there is absolutely no need to mythologize it, nor to demonize it. Big data is a concept, and it is simply the natural outcome of our understanding of the world having developed to its present stage. It will be more valuable to view big data rationally, put it to good use in production and research, and bring our own innovation and initiative more fully into play.



Excerpt from: Tomsinsight



(Tomsinsight is a start-up company focused on in-depth data analysis and insights into China's Internet. Its founders include a former Microsoft chief business analyst, a former Baidu big data architecture director, a former senior regional manager, and others. At present its main business is providing in-depth stock analysis to Wall Street. Its planned direction is lightweight Internet micro-consulting services, and it has now opened a new WeChat subscription account, WeChat ID: tomsinsight.)

