In the big data age, "what" is more important than "why"

Source: Internet
Author: User

Once we understand the four characteristics of big data, we can find correlations and then use big data to create value. "What is big data?" and "What does big data have to do with me?" are the questions that come to mind when many people first hear the term.

According to 30 magazine, Huberg faced an audience that wanted to learn more about the future trends of Http://www.aliyun.com/zixun/aggregation/12368.html, and he gave a very accessible speech telling businesses and individuals what big data is.

Big data identifies correlations

A new influenza virus was found worldwide in 2009, and the United States was not spared. The Centers for Disease Control and Prevention (CDC) called on front-line physicians to report any swine flu case immediately. Even so, the reports always lagged by 1-2 weeks. With that delay, the CDC could not grasp the real situation and respond appropriately.

Several Google engineers had published a paper in the famous science journal Nature. They took the 50 million search terms most commonly used in the United States and compared them with the CDC's 2003-2008 influenza transmission data, using up to 450 million different mathematical models to test whether the frequency, time, and location of these terms were statistically correlated with the spread of flu. In the end they struck gold: the software found 45 flu-related search terms that, when put into a mathematical model, produced forecasts strongly correlated with the officially released data.

Using this mathematical model, Google could track the peaks and regions of influenza in step with the epidemic instead of falling behind it.
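The idea of screening search terms by how well they track official flu counts can be sketched in a few lines. This is only an illustration, not the method from the Nature paper: the weekly numbers are made up, the names `pearson` and `rank_terms` are hypothetical, and a plain Pearson correlation stands in for the far larger model search described above.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / sqrt(var_x * var_y)


def rank_terms(term_counts, flu_cases, top_k):
    """Return the top_k terms whose weekly counts correlate best with flu cases."""
    scored = [(pearson(counts, flu_cases), term)
              for term, counts in term_counts.items()]
    scored.sort(reverse=True)
    return [term for _, term in scored[:top_k]]


# Made-up weekly data, for illustration only.
flu_cases = [10, 20, 40, 80, 40, 20]
term_counts = {
    "flu symptoms": [12, 22, 41, 79, 43, 19],
    "cough medicine": [5, 9, 22, 38, 21, 12],
    "football scores": [50, 48, 52, 49, 51, 50],
}

top_terms = rank_terms(term_counts, flu_cases, top_k=2)
print(top_terms)
```

The ranking only says which terms move together with the flu counts. It says nothing about why, which is exactly the point Huberg makes.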

Another example comes from astronomy. The Sloan Digital Sky Survey has been collecting data with a telescope in New Mexico since 2000, and within just a few weeks the amount of astronomical data it received exceeded the sum of everything collected in the previous history of astronomy. By 2010, the survey had accumulated 140TB of data. But a successor survey telescope, expected to debut in 2016, will collect that much data every 5 days.

As data volumes become astronomical, Huberg reminded the audience that exactly how big big data is does not matter much. The point is that once you scale up the amount of data, you can do things that a small amount of information cannot.

For example, drawing a picture of a horse is not too difficult, but if you draw many pictures of a horse and show them at 24 frames per second, they become an animation. The point is that quantitative change produces qualitative change: a difference in volume also changes the essence.

"What" is more important than "why"

Along with the increase in volume comes another characteristic of big data: messiness. A huge body of information is often chaotic and inconsistent in quality. That is because collecting data at huge scale only needs to get the general direction right; it does not need to attend to every inch and every point. "It's not that we've given up on precision, but that we are no longer fixated on precision," Huberg said.

Take measuring the temperature of a vineyard. If the whole vineyard had only one thermometer, it would have to be precise and never fail, which also means it would be expensive. In other words, there could be no messiness or error. Conversely, if we put 100 thermometers in the vineyard, we could use cheaper, simpler thermometers and still measure the temperature accurately.

100 thermometers give a large number of readings. A few of them may be less accurate, but together they collect a great deal of data. Compared with a single thermometer, they show the full picture of the whole vineyard. At that point, a little messiness seems trivial.
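A small simulation can make the thermometer argument concrete. It is a sketch under stated assumptions, none of which come from the talk: the true temperature, both noise levels, and the function name `rms_error` are invented, and each thermometer's error is assumed to be independent and unbiased. Under those assumptions the error of an average shrinks roughly with the square root of the number of sensors, so 100 cheap thermometers with a noise of 1.0 degree should average out to about 0.1 degree, better than one precise thermometer with a noise of 0.2 degrees.

```python
import random
from math import sqrt

TRUE_TEMP = 20.0  # assumed true vineyard temperature, illustration only


def rms_error(n_sensors, noise_sd, trials, rng):
    """Root-mean-square error of the averaged reading over many trials."""
    total_sq_error = 0.0
    for _ in range(trials):
        readings = [rng.gauss(TRUE_TEMP, noise_sd) for _ in range(n_sensors)]
        estimate = sum(readings) / n_sensors
        total_sq_error += (estimate - TRUE_TEMP) ** 2
    return sqrt(total_sq_error / trials)


rng = random.Random(0)  # fixed seed so the run is repeatable

single_precise = rms_error(n_sensors=1, noise_sd=0.2, trials=2000, rng=rng)
many_cheap = rms_error(n_sensors=100, noise_sd=1.0, trials=2000, rng=rng)

print(f"one precise thermometer: {single_precise:.3f}")
print(f"100 cheap thermometers:  {many_cheap:.3f}")
```

The caveat is the independence assumption. If all the cheap thermometers shared the same bias, for example all reading one degree too high, averaging more of them would not remove it.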

Here came the key point again. Huberg suddenly stood up and told the audience that in the big data age, data quantity is more important than data quality. A little inaccurate information will not distort the overall analysis, and weeding out the inaccurate information would be very costly, so there is no need!

Another interesting example is Wal-Mart (Walmart). From its huge transaction records it found that before a hurricane hit, it was not just flashlights that sold well but also Pop-Tarts, a small American sweet snack. So before each hurricane, stores place boxes of Pop-Tarts next to the hurricane supplies so that customers can pick up everything at once. "The strawberry flavor sells best."

Note that Walmart did not try to figure out why people especially want to eat Pop-Tarts when a hurricane is coming. It found the correlation and directly took the more profitable marketing action.

Huberg placed special emphasis on this: in the big data age, "what" is more important than "why".

Another example happened to his friend Oren Etzioni, a big data expert and professor at the University of Washington. In 2003 he was flying from Seattle to Los Angeles for his brother's wedding. Wanting to buy his ticket as early as possible, he booked it months in advance, believing that would be cheap. During the flight, out of curiosity, he asked the passenger next to him how much he had paid. He was very angry to hear that the passenger had bought his ticket only recently and had paid less than he had.

After getting off the plane, he decided to study ticket purchasing. He found that if the average ticket price was trending downward, it made sense to wait and buy later.

Over 41 days he collected more than 12,000 ticket prices from a travel website, and he built a model that saved its simulated consumers a lot of money. In this model, the consumer does not need to understand "why", only "what". All the consumer decides is whether to buy now or not.

Later, the model grew into a business. He founded the website Farecast, where consumers can make the best judgment about when to buy and when not to buy.
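The "buy or wait" decision can be illustrated with a deliberately simple rule: fit a straight line to recently observed prices and wait only if the trend is downward. This is not Etzioni's model or Farecast's algorithm, which the article does not describe. The function names `price_trend` and `advise` and the sample prices are hypothetical.

```python
def price_trend(prices):
    """Least-squares slope of price against observation index (price per day)."""
    n = len(prices)
    mean_day = (n - 1) / 2
    mean_price = sum(prices) / n
    num = sum((day - mean_day) * (price - mean_price)
              for day, price in enumerate(prices))
    den = sum((day - mean_day) ** 2 for day in range(n))
    return num / den


def advise(prices):
    """Return "wait" if recent prices are falling, otherwise "buy"."""
    if len(prices) < 2:
        return "buy"  # not enough history to estimate a trend
    return "wait" if price_trend(prices) < 0 else "buy"


# Made-up daily fares for one route, oldest first.
falling_fares = [300, 290, 285, 270]
rising_fares = [200, 210, 230, 250]

print(advise(falling_fares))
print(advise(rising_fares))
```

The rule knows nothing about why fares move, such as seat inventory or seasonal demand. It only reacts to what the prices have been doing, which mirrors the "what, not why" stance in the text.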

Big data and value

Once we know the characteristics of big data and have found the correlations, we can use them to create value.

INRIX, a data company in Seattle in the United States, specializes in real-time vehicle positioning data, which comes from hundreds of millions of vehicles. It has also launched a mobile app that offers services in exchange for drivers' information, including where they have driven, the weather, and road conditions. It then sells the information it receives to an investment fund. Before retailers make their quarterly announcements, the fund estimates their performance from the traffic around large retail stores and decides whether to buy or sell. Car traffic is money traffic. That is the value.

Britain's Rolls-Royce is a famous aircraft engine manufacturer that checks whether its engines are functioning properly by installing monitors on them. From the data collected, it found that it could tell when something was about to go wrong with an engine, and this kind of prediction greatly reduced accidents. Rolls-Royce has moved from being a company that only manufactured engines to one that also provides services and consulting. It has made its data valuable.
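The engine manufacturer example above amounts to spotting readings that drift away from normal behavior before a failure occurs. A minimal sketch of that idea, assuming a single sensor and a simple three-standard-deviation rule, follows. The article does not say how the manufacturer actually analyzes its data, and the name `flag_anomalies`, the threshold, and the vibration values are hypothetical.

```python
from statistics import mean, stdev


def flag_anomalies(baseline, new_readings, threshold=3.0):
    """Return indices of new readings far from the baseline mean.

    A reading is flagged when it lies more than `threshold` sample
    standard deviations away from the mean of the baseline readings.
    """
    center = mean(baseline)
    spread = stdev(baseline)
    return [i for i, reading in enumerate(new_readings)
            if abs(reading - center) > threshold * spread]


# Made-up vibration levels from a healthy engine, then from later flights.
baseline = [0.50, 0.52, 0.49, 0.51, 0.50, 0.48, 0.51, 0.49]
new_readings = [0.51, 0.50, 0.62, 0.49, 0.70]

suspicious = flag_anomalies(baseline, new_readings)
print(suspicious)
```

As with the other examples, the rule flags what is unusual without explaining why the engine behaves that way. An engineer would still need to inspect the flagged engine.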

Huberg said a lot about big data, but he stressed that it also has a dark side. Privacy is certainly a focus of concern, but what he finds more frightening is algorithms that predict whether you will have a heart attack or whether you will commit a crime. Sometimes, calculations and predictions based on big data are less important than free will.

At the same time, there is a worry that more and more companies will hold more and more information. What will they do with such a huge collection of information? For what purpose? It is not necessarily subject to supervision and regulation.

"Big data should be controlled by humans; humans should not be controlled by big data," was Huberg's last reminder.
