Real life in the big data age


The scientific revolution that has changed our lives stops at the frontier of natural science and has never crossed over to touch human beings.

If we study humans as we do with natural phenomena, we can predict human behavior.

Once enough data is collected, we can ask the basic question: how predictable are we? The answer is shocking.

Albert Barabasi

Humans don't want to be seen under a microscope.

In some ways, statistics on human behavior really matter. What can large amounts of data about human behavior do for us? I am a physicist, a natural scientist, and I believe that natural phenomena can be analyzed, described, quantified, predicted, and controlled. That much is uncontroversial: it is what scientists are supposed to do, and it is what drives them to do research. But what if we replace natural phenomena with humans? The sentence becomes: human beings can be analyzed, described, quantified, predicted, and controlled. That is obviously a very scary statement.

However, we do not need to panic; this is actually good news. Science has a little secret that we never discuss: the scientific revolution that has changed our lives stops at the frontier of natural science and has not touched human beings.

We are happy for scientists to predict the trajectory of electrons, but not to predict crises such as a financial crisis; we don't mind scientists studying genes, but we do mind them predicting wars and other major crises. The reason is very simple. There is a fundamental difference between humans and bacteria or other objects of study: bacteria are not irritated by being put under a microscope, and the moon does not complain when a spacecraft lands on its surface.

Prediction needs data

To predict, you need a lot of data. Those who say they can make predictions without data are either palm readers or business consultants.

We now have a lot of data to support the prediction of human behavior. Every message we send leaves clues about our relationships, hobbies, and other parts of our lives. Our bank knows our ability to pay, our tastes, our willingness to buy, and where we shop. Although we often choose not to think about it, the fact is that we have put ourselves under multiple microscopes that record these facts, and the detail in the data lets others quickly understand our lives.

The book "Bursts" is about how data is changing the study of what governs human behavior. This involves all kinds of data, one of which is private data. Although the book mentions privacy, it is not about privacy. It says that our society is becoming a huge laboratory, and that the data collected automatically reveal the patterns of human behavior.

When it comes to human behavior, let me first address one question: why should a physicist pay attention to human behavior? Physicists care about human behavior because they want to understand the complex system behind it. There are many complex systems worth studying: the brain, the economy, the cell, and computer systems. Last year we found that society is the best platform for understanding individual behavior. The data let us follow the day-to-day behavior of each individual, much as one would want to know what each neuron is doing at every moment, or what each gene does. These records cover everyone's behavior, including communication patterns, movement patterns, and much else. So if you are pragmatic and believe that all complex systems are alike, you go where there is more data and where more progress can be made. Over the past five and a half years, human society has been evolving toward a prototypical complex system, one that is easy for us to control, but this is a long process.

The so-called "burst" is a pattern of behavior that everyone follows. If you look at how people behave in real life (when they send email, when they make calls, when they browse the Web), you will find a pattern, and we have a lot of data in this area. These activities are not spread out at random; they cluster and come in bursts. You send a lot of emails in a very short period, then do nothing for a long time, and then there is another burst; the same is true of phone calls. So one of the most important discoveries about human behavior over the past decade is that it is not random but bursty. Most importantly, these behaviors follow a power-law distribution.
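The contrast between random and bursty timing can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the gaps between events are simulated rather than taken from real email or call logs, the "random" baseline is an exponential distribution, the bursty case uses a Pareto (power-law) distribution with an arbitrarily chosen exponent of 1.5, and the burstiness coefficient B = (σ − μ) / (σ + μ) is one commonly used summary statistic that the talk itself does not mention.

```python
import random
import statistics


def burstiness(gaps):
    """Burstiness coefficient B = (std - mean) / (std + mean) of the gaps.

    B is near 0 for purely random (exponential) gaps, approaches -1 for
    perfectly regular gaps, and moves toward +1 for bursty gaps.
    """
    mean = statistics.fmean(gaps)
    std = statistics.pstdev(gaps)
    return (std - mean) / (std + mean)


rng = random.Random(42)  # fixed seed so the sketch is repeatable
n_events = 20_000

# Random baseline: exponentially distributed waiting times.
random_gaps = [rng.expovariate(1.0) for _ in range(n_events)]

# Bursty behavior: power-law (Pareto) waiting times. The exponent 1.5 is an
# assumption made for illustration, not a value taken from the article.
bursty_gaps = [rng.paretovariate(1.5) for _ in range(n_events)]

b_random = burstiness(random_gaps)
b_bursty = burstiness(bursty_gaps)

print(f"random gaps: B = {b_random:.2f}")
print(f"bursty gaps: B = {b_bursty:.2f}")
```

With power-law gaps, most waits are short and a few are very long, which is exactly the "many emails in a short period, then a long silence" picture described above.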

Of course, no one thinks their own behavior is random; that was never the issue. The question is: what characterizes behavior that is not random? Burstiness is one such characteristic, and it leads us to the next point, which I mentioned earlier: if we study humans as we study natural phenomena, we can predict human behavior.

Prediction is itself a rather frightening word. What are we going to predict? What we will dream about tonight? When our next promotion will come? Whom we will run into? All of these predictions need data, and a lot of it. Our ability to predict depends on how much data we have, and the data also tell us how reliable a prediction is. So when I started thinking about this a few years ago, I decided to begin by collecting data on human trajectories: where we are now and where we go next. I had no channel for collecting data from other people, but I was curious about whether prediction was possible, so I started by collecting data on myself.

Into the big data age

In fact, trajectory data are already being collected for many people. Is there anyone who does not use a cell phone these days? Don't deceive yourself: your mobile phone operator knows where you are at every moment. It knows not only your location but also every call you make (in order to bill you), and it knows not only where you are but also where thousands of other customers are. So, compared with the data I collected about myself, operators have far more data to learn from, and with it one can compare different individuals. Of course, mobile phone operators worry about this data leaking, because they want to keep their users' trust and because releasing such information is punishable by law. But in recent years they have come to realize the value of the data and have begun to provide it to researchers and other companies. My research team also obtained a large amount of data on human trajectories and calling patterns. The records were, of course, anonymized: we did not know who the owners were or what their phone numbers were. We simply treat them as small particles moving through space, like the molecules that make up a gas.

With this data, we can finally ask: what is the predictability of human behavior? Can human behavior be predicted?

One of the first questions we asked was: how far do people move every day? One might expect a simple answer: a typical travel distance that most people cover. What you find is that most people tend to move within a relatively small range, while a few, such as those living in the suburbs, travel considerable distances. The ratio of the number of people who move within a small range to the number who move over a larger range follows a precise power-law distribution. So, if you have a lot of data, you can predict how many people travel, how many work far from home, and how many spend most of their time close to home or work from home. This was the first step in our study. It showed that when we study a large population, different people behave differently. Next, we used each person's trajectory to work out his or her entropy.
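A minimal sketch of what "follows a power-law distribution" buys you in practice: once an exponent is known, the share of people who travel farther than any given distance follows directly. Everything below is an assumption made for illustration: the distances are synthetic, the exponent 2.0 and the 1 km minimum distance are arbitrary, and the estimator is the standard maximum-likelihood formula for a continuous power law rather than whatever procedure the research team actually used.

```python
import math
import random


def estimate_exponent(distances, d_min):
    """Maximum-likelihood estimate of alpha for p(d) ~ d**(-alpha), d >= d_min."""
    tail = [d for d in distances if d >= d_min]
    return 1.0 + len(tail) / sum(math.log(d / d_min) for d in tail)


def fraction_beyond(d, alpha, d_min):
    """Share of people expected to travel farther than d (for d >= d_min)."""
    return (d / d_min) ** (-(alpha - 1.0))


rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Synthetic daily travel distances in km. TRUE_ALPHA is a hypothetical
# exponent, not a value reported in the article.
TRUE_ALPHA = 2.0
distances_km = [rng.paretovariate(TRUE_ALPHA - 1.0) for _ in range(50_000)]

alpha_hat = estimate_exponent(distances_km, d_min=1.0)
share_far = fraction_beyond(10.0, alpha_hat, d_min=1.0)

print(f"estimated exponent: {alpha_hat:.2f}")
print(f"estimated share travelling more than 10 km: {share_far:.1%}")
```

The design point is that a single fitted number summarizes the whole population: short trips dominate, but long trips are far more common than a bell-shaped distribution would allow.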

What is entropy? Entropy is a value that measures randomness. If the entropy of a system is zero, the state of the system is completely known: you know where every point is, and every position is fully determined. In principle, if a data mining algorithm fed with a person's past locations can calculate exactly where he will appear next, his predictability is 1, which means that his movements are completely regular, with no randomness at all. Such a person commutes between home and work at the same times every day.
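To make "entropy measures randomness" concrete, here is a small sketch that computes two simple entropies from a sequence of visited locations. It is a simplification under stated assumptions: the location sequences are invented, `shannon_entropy` looks only at how often each place is visited, and `conditional_entropy` looks one step back (the next place given the current one). The article does not say which entropy estimator the study used, so neither function should be read as the study's method.

```python
import math
from collections import Counter


def shannon_entropy(locations):
    """Entropy in bits of the visit frequencies, ignoring the order of visits."""
    counts = Counter(locations)
    total = len(locations)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def conditional_entropy(locations):
    """Entropy in bits of the next location given the current location."""
    pair_counts = Counter(zip(locations, locations[1:]))
    start_counts = Counter(locations[:-1])
    total = len(locations) - 1
    h = 0.0
    for (current, _next_place), c in pair_counts.items():
        # weight = P(current, next); log term = P(next | current)
        h -= (c / total) * math.log2(c / start_counts[current])
    return h


# Hypothetical location logs, one entry per observation.
homebody = ["home"] * 100          # never leaves one place
commuter = ["home", "work"] * 50   # strict home/work alternation

print(shannon_entropy(homebody))      # no uncertainty about where they are
print(shannon_entropy(commuter))      # two equally likely places
print(conditional_entropy(commuter))  # next place is fixed by the current one
```

The commuter shows why order matters: looking only at visit frequencies gives 1 bit of uncertainty, but once the current location is known the next one is certain and the conditional entropy drops to 0, which matches the article's picture of a fully predictable commuter.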

We expected large differences between people's behavior patterns: many people should be hard to predict because their lives are rich and their movements unplanned, while others, who as mentioned earlier move within a limited range, should be easier to predict. So we computed these values across users and plotted their predictability in a diagram. When we measured the predictability of a large number of mobile phone users, the first thing we noticed was that the distribution has a pronounced peak at 93%. This means that for a typical person, if we know where he has been, in principle there is a 93% chance of correctly predicting where he will be next. And everyone's predictability is above 80%.
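The article does not spell out how an entropy value is turned into a predictability percentage. One standard way to do it, sketched below as an assumption rather than as the team's documented procedure, is Fano's inequality: for a person with entropy S (in bits) who visits N distinct locations, the highest achievable prediction accuracy Π solves S = H(Π) + (1 − Π) log2(N − 1), where H is the binary entropy function. The entropy and location counts in the example are hypothetical.

```python
import math


def fano_entropy(p, n_locations):
    """Entropy (bits) implied by Fano's relation for accuracy p over n locations."""
    if p >= 1.0:
        return 0.0
    binary = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    return binary + (1 - p) * math.log2(n_locations - 1)


def max_predictability(entropy_bits, n_locations, tol=1e-10):
    """Solve fano_entropy(p) = entropy_bits for p by bisection.

    fano_entropy decreases from log2(n) at p = 1/n to 0 at p = 1, so the
    root is unique on that interval.
    """
    if n_locations < 2:
        raise ValueError("need at least two locations")
    if not 0.0 <= entropy_bits <= math.log2(n_locations):
        raise ValueError("entropy must lie between 0 and log2(n_locations)")
    lo, hi = 1.0 / n_locations, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if fano_entropy(mid, n_locations) > entropy_bits:
            lo = mid  # implied entropy still too high: accuracy can be higher
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical user: 0.8 bits of entropy spread over 50 visited locations.
example = max_predictability(0.8, 50)
print(f"upper bound on predictability: {example:.1%}")
```

The bound behaves as the text suggests: zero entropy gives a predictability of 1, and the more random a person's movements, the lower the ceiling on how well any algorithm can predict them.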

Thus, once enough data is collected, we can ask the basic question: how predictable are we? The answer is astonishing. And when we talk about predicting the future, we have to ask: "If we have enough data, is everything predictable?" This is the question we have to think about now.

(The author is an honorary professor at Northeastern University, director of the Complex Network Science Research Center, and the author of "Bursts". This article was provided by Zhan-Lu Culture.)

(Responsible editor: The good of the Legacy)
