July 9: The US magazine Wired recently published a commentary on big data. Its authors argue that big data is meaningless without field research into people's real lives.
In just a few decades, the relationship between the "tech genius" and society has been transformed: from shut-in to savior, from antisocial to society's best hope. Many people now seem to believe that the best way to understand our world is to sit in front of a computer screen and analyze the mass of information we call "big data."
To see this, we need only look at Google Flu Trends. When Google launched the service in 2008, many in Silicon Valley touted it as a sign that big data would soon make traditional methods obsolete.
But they were wrong.
Not only did Google Flu Trends fail to provide an accurate picture of the spread of influenza, it will also never fulfill the dreams of big data advocates. This is because big data is meaningless without "thick data": rich, contextual data that can only be gathered by leaving the computer behind and going out into real life. Geeks have long been ridiculed for not fitting into social life and told to "go out more." In fact, if the believers in big data want to understand the world they are helping to shape, they really do need to go out more.
Google's failure has nothing to do with algorithms
The aim of Google Flu Trends was to identify the search terms people use during flu season and then track, in real time, when those terms peak. In this way, Google would be able to warn of a new flu outbreak about two weeks earlier than the official Centers for Disease Control and Prevention.
For many, Google Flu Trends became the textbook example of big data and a demonstration of its power. In the best-selling book Big Data: A Revolution That Will Transform How We Live, Work, and Think, authors Viktor Mayer-Schönberger and Kenneth Cukier claim that Google Flu Trends is a more useful and timely indicator of flu than the government's lagging data.
However, an article published this month in the prestigious journal Science tells us that Google Flu Trends has overestimated the prevalence of influenza almost every week since August 2011.
In 2009, shortly after Google Flu Trends was launched, it completely missed the outbreak of swine flu. It turns out that many of the terms people commonly search for during flu season have nothing to do with influenza and everything to do with the season in which flu usually breaks out: winter.
Many argue that the failure of Google Flu Trends stems from the immaturity of big data. That misses the point. Of course, tweaking algorithms and improving data collection techniques will make the next generation of big data tools more effective. The real hubris of big data advocates, however, is not excessive confidence in an immature algorithm but a blind belief that sitting in front of a computer screen and crunching numbers is enough to fully understand the world.
Why we need thick data
Big data is really just a large amount of "thin data," obtained by tracking people's activities and behavior: the places we go most often, the things we search for on the Internet, how long we sleep each day, how many contacts we have, the type of music we listen to, and so on. This data is collected by the cookies in your browser, the Fitbit wristband you wear, or the GPS in your phone. Such information is undoubtedly important, but it cannot give us a complete understanding of people.
To really understand people, we need not only big data but also thick data. Thick data captures not only facts but also the context of those facts. For example, 86% of households in the United States drink more than 6 quarts of milk a week, but why do they drink milk? And how do they drink it? A piece of cloth in three colors, embroidered with stars and stripes, is thin data; the American flag flying in the wind is thick data.
Big data understands us only in a simplified way, based on what we do; thick data tries to understand us through our connection to the world around us. Only by understanding the relationship between people and their world can we understand the world as a whole, which is exactly what Google, Facebook, and other companies want to do.
Understanding our world
Think of the grand mission statements of Silicon Valley. Google's mission is to "organize the world's information and make it universally accessible and useful." Mark Zuckerberg recently told investors that, in today's world of globalization and a growing knowledge economy, Facebook is committed to a new mission: "understanding the world." "Every day, people post billions of pieces of content and links on Facebook," he said. "With their help, we are building the clearest model of everything in the world through a special algorithmic mechanism." Even some smaller companies are getting involved in understanding the world. Last year, Jawbone's vice-president, Jeremiah Robison, said that their health-tracking device, Jawbone Up, aims to "understand the science of [human] behavioural change."
These goals are big indeed. It is not surprising that businesses are eager to understand society better. After all, understanding customers' behavior and their social and cultural context is essential to running a business. Moreover, in the age of the knowledge economy, information itself has become a kind of currency that can be exchanged for clicks, page views, and advertising revenue. Or, to put it simply, for power. If companies such as Google and Facebook continue to help us improve our collective knowledge in the process, it is legitimate for them to gain more power. The problem is that when they claim that computers can organize all of our data, or provide us with a complete understanding of flu, health, or social relationships, they fundamentally underestimate what "data" and "understanding" mean.
If the big data advocates in Silicon Valley really want to "understand the world," they need not only big data but also thick data. Unfortunately, to get the latter they need to leave their computers and experience the world directly, rather than only viewing it through Google Glass (or through Facebook's virtual reality device).
The context of people's behavior
Thin data is useful if you are highly familiar with a field and are able to fill in the gaps and imagine why people behave as they do. In other words, the actions you observe are meaningful only if you can imagine and reconstruct the context in which people act. Without an understanding of that context, it is impossible to infer any causal relationship or to understand the reasons for people's behavior.
That is why, in scientific experiments, researchers make every effort to control the laboratory environment, creating an artificial setting in which every influencing factor is accounted for. But the real world is not a laboratory. The only way to be sure you understand an unfamiliar world is to be there, observing it firsthand and interpreting everything that is happening.
People's background knowledge
While big data is good at observing people's behavior, it is poor at understanding the background knowledge people have about everyday things. How do I know how much toothpaste to use each time I brush my teeth? How do I know when to move into another traffic lane? Was that a wink, or did she just have something in her eye? All of these involve internalized skills and largely unconscious background knowledge, which govern most of people's behavior. Like the things around us that we take for granted, this background knowledge is invisible unless an observer actively looks for it. Yet it has an important influence on everyone's behavior: it explains the connections between things and people, and what things mean to people.
Anthropology and the social sciences offer many ways of observing and interpreting human behavior. Researchers not only observe what people do but also examine their context and their background knowledge. These methods have one feature in common: they require researchers to plunge into the messy reality of human life.
No single tool can be a silver bullet for understanding human beings. For all the great inventions that have come out of Silicon Valley, there should be limits to what we expect from any digital technology. What Google Flu Trends really teaches us is that we should ask not only how big the data is but also how thick it is.
Sometimes, going out into real life gets better results. Sometimes we have to put down the computer.