These days, "big data", "cloud storage" and "cloud computing" are the first words you see and hear wherever Internet people gather, whether they work in technology, product, operations or business. But what is big data? What is it for? Let's take a cool-headed look.
In fact, big data is not some mysterious technology but a general capability: the ability to discover value in data.
I. Four misconceptions about big data
1. Misconception one: a large amount of data is big data
"Being big alone is not enough!" Just as I opened with this sentence, a girl pushed the door open and walked in, heard it, froze for a moment, then lowered her head and sat down.
Nowadays, whenever big data comes up, people recite technical figures: "XX GB of data processed per day, XX GB of pictures uploaded, XXX concurrent connections", "a Hadoop cluster with XXXX nodes and XX PB of total storage", and so on. But does data being big automatically take you into the realm of big data and the great harmony of all life?
However big the data is, if it is never used and just sits in a corner of the server room, it is not big data but dead weight. On this point the traditional portal sites are basically sitting on a mountain of gold without a coin to spend. Sohu, Sina and NetEase each have hundreds of millions of users a day, yet beyond simple ad display they have not produced more value by analyzing their data. I leave the penguin (Tencent) out because it has the largest base of QQ users, and not publicizing something does not mean not doing it: the startling feeling when QQ Circle was launched still unsettles me when I think back on it.
For small and medium-sized websites, do not blindly chase an advanced technical architecture. The first things to consider are business operations and promotion; only when users surge and the site takes off should you think about upgrading the technology. Here is a choice between two options: A. 1,000 users, with an architecture fully modeled on Amazon's that never goes down; B. 100,000 daily users, with the site going down three times because of high concurrency. Which would you choose?
Wouldn't you love to say, "My site went down because it had too many overly enthusiastic users"?
2. Misconception two: to understand big data you must understand the technology
"I don't know the technology. Can I still learn big data?"
Big data is more of an ability than a technology: the ability to see valuable business opportunities in endless data. Zhuge Liang understood the art of war. He knew where the ambush should be laid and where the fire should be lit; he did not need to know how Guan Yu swung his broadsword, or whether Zhang Fei killed with a thrust or a slash of his snake spear.
3. Misconception three: every company must understand big data
I admit that if the auntie who sells street-stall pancakes could develop an app and get customers' direct opinions on whether the pancake is crisp and the chili sauce is flavorful, it would help her improve the product. But would you download an 8 MB app for a 3-yuan pancake?
At the Mobile Internet Conference, Evernote's CEO, Phil Libin, made it clear that his product's business model is to charge users who are willing to pay for the product experience, rather than to play with the currently fashionable big data.
Knowing what you can do is common; knowing what you cannot do, and not doing it, is far more valuable.
4. Misconception four: the more data the better
From Edison Chen to Li Zongrui, both showed a heavy hoarding habit. Will the girls hidden in the depths of a hard drive ever really see daylight again? Whether the collection is Tokyo Hot or Caribbean, it is always the new arrivals who get the attention; who listens to the old ones?
Search for the keyword "deposit depreciation" and the results, from "a million from 50 years ago turned into 13" to "10,000 yuan saved for one year loses 19 yuan", show that money must be used to have value. Data is the same.
Only by constantly using data and mining the relationships and value behind it can the data grow like a snowball, with the relationships within it becoming ever richer and more complete.
II. The core ideas of big data
1. What is and isn't important
During the talk I found, to my dismay, that whenever a case involved men and women the audience's comprehension soared, while on the more technical, product-focused parts the faces below went blank and sleepy. (Hey! Are you here to listen to a sex talk?)
The story of Netflix using big data to produce its own series, "House of Cards", has been told countless times. Netflix is America's largest DVD and Internet video rental site, with 27 million users in the country and 33 million globally. The data it collects from streaming users is astonishingly rich: every search, every pause, every positive or negative rating, plus location data, device data and social media data. After analyzing these data, Netflix found that its audience liked the actor Kevin Spacey, liked the director David Fincher, and liked the 1990 British TV series. Putting the three together, Netflix decided to shoot "House of Cards" and used data analysis to the fullest along the way. Netflix's viewing page, for example, offers a screenshot-on-pause feature, and the company relies on that data to judge what kinds of scenes and images the audience prefers.
Of course, none of these three factors is reliable on its own: Kevin Spacey has starred in "Fred Claus", which scored only 4.2 points, and David Fincher's directorial debut, "Alien 3", is the worst film in its series. But in terms of probability, a series that combines all three success factors is the safer bet.
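As a rough illustration of the "three factors together" idea, here is a minimal Python sketch. The viewer IDs and the three preference groups are entirely hypothetical, and `shared_audience` is a made-up helper; nothing here reflects Netflix's actual data or method. It only shows how the overlap of several preference groups can be measured.

```python
# Hypothetical viewer IDs; none of these values come from Netflix.
likes_spacey = {1, 2, 3, 4, 5, 6}
likes_fincher = {2, 3, 4, 6, 8}
likes_1990_series = {3, 4, 6, 9}


def shared_audience(*groups):
    """Return the viewers who appear in every preference group."""
    return set.intersection(*groups)


# Viewers who like all three factors at once.
core = shared_audience(likes_spacey, likes_fincher, likes_1990_series)

# Share of the combined audience that likes all three.
everyone = likes_spacey | likes_fincher | likes_1990_series
share = len(core) / len(everyone)

print(sorted(core), share)
```

The larger that shared core is relative to the whole audience, the more comfortable the bet on combining the three factors.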
In addition, the recently launched "namesake" lookup application may look rather silly, but the identity data behind it can be put to all kinds of secondary uses. Every time I see an application like this, I type in the exotic name of a former colleague, which reads as "in Switzerland". So far only this application has accurately found that there is just 1 person in the whole country with that name, and that he is from Anhui. Call the Swiss embassy and have him locked up quickly!
So the core of big data is not the data itself but what lies behind the data and the use made of it. That is to say, you must not only possess someone's body but also win their heart.
2. More fault-tolerant data and more diverse sources
Is an Excel sheet holding 500 MB of user data big data? Once, while screening users for a product, such an Excel file crashed my computer three times, and I angrily said: "Damn it, this stupid big data!" Thinking back now, I wronged Mr. Big Data. Sorry you had to suffer.
Real big data should be fragments of data in all kinds of formats, arriving from different dimensions and through different channels, not limited to text, video, voice, location, pictures and so on. Only by putting data of different dimensions together can the trend it shows come closer to reality. When data of the same kind accumulates beyond a certain limit, each new sample gives us less and less useful information, just like diminishing marginal utility in economics. Put simply, if you have already dated five programmers, the sixth will not be much of a novelty; better to change direction and find a tall, rich type for a different kind of thrill.
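One way to make the diminishing-returns point concrete is the standard error of a sample mean. The sketch below assumes the data are independent samples of the same quantity with standard deviation `sigma`; that assumption, and both helper names, are illustrative additions and not part of the original talk.

```python
import math


def standard_error(sigma, n):
    """Standard error of a sample mean computed from n independent samples."""
    return sigma / math.sqrt(n)


def marginal_gain(sigma, n):
    """How much the standard error drops when one more sample is added."""
    return standard_error(sigma, n) - standard_error(sigma, n + 1)


# The uncertainty keeps shrinking, but each extra sample helps less and less.
for n in (10, 100, 1000, 10000):
    print(n, standard_error(1.0, n), marginal_gain(1.0, n))
```

Under this assumption, going from 10 to 11 samples helps far more than going from 10,000 to 10,001, which is the case for adding a new kind of data instead of more of the same.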
Diverse sources also keep you from getting stuck in a dead end. Take the "three years of natural disasters": if you only examine the relationship between the weather in those three years and the number of deaths, you may conclude that "clear weather is more likely to cause death than cloudy weather". But if you combine the related "People's Daily" reports of yields of 100,000 kg per mu with China's grain imports and exports in those years, you will reach more constructive conclusions. When a post-90s girl pressed me on this, I gave her only four figures. In 1958, China exported 2.8834 million tonnes of grain and imported 223,500 tonnes. In 1959, the first year of the hard times, China exported 4.1575 million tonnes and imported 2,000 tonnes; that year, I hear, everyone was very hungry. I won't say more, lest I get invited for tea.
Another example: for the last two days, the personalized ads that Taobao's home page pushes to me have been, apart from adult products, big discounts on "Playboy" clothing. Didn't I search for "adult products" only because I was looking for material and images? Do you have to keep pushing this at me forever? If you could obtain my primary school teachers' comments, my junior high school teachers' comments and conduct evaluations, and my high school teachers' comments and conduct evaluations, then by analyzing data from such a variety of sources you would instead be pushing me good books like "How the Steel Was Tempered" and "The Diary of Lei Feng". (Never mind the university teacher's comments; the one I thrashed at CS for too long is likely to hold a grudge.)
3. Have the body of big data, but also the heart of big data
Earlier data analysis relied on more accurate samples and deep data mining; "precision" was its byword. Samples that did not meet the specification were filtered out, and then the relationships between data fields were mined deeply, either to obtain a few incomparably precise figures for a PPT or to pinpoint one particular face in a pile of candid photos.
Big data is more about extracting some kind of trend from data analysis. The trend does not have to be very precise, but it gives decision makers the confidence to make a decision. What matters is not big data itself but the people who use it.
Even with exactly the same data source, different people may reach quite different conclusions or decisions. In the Battle of Red Cliffs in the Three Kingdoms, Pang Tong suggested: "Match up all the ships in rows of thirty or fifty, lock them bow to stern with iron chains and lay wide planks across them; then not only men but even horses can cross." Two people heard the same words. Cao Cao left his seat and thanked him: "Without your fine plan, sir, how could I defeat Eastern Wu!" Xu Shu said to Pang Tong in private: "You are bold indeed; your only fear is that the fire will not burn thoroughly enough." This shows what an incisive piece of nonsense "people-oriented" is!
Whether you are brilliant or stupid, the data is always there and never abandons you.
4. Emphasizing trends and the future
Big data should be about analyzing the past, alerting the present and looking to the future. Big data that cannot be put into practice is just hooliganism; whether the result benefits all mankind or merely helps a website raise its conversion rate by 1%, either counts as useful.
The brutes pictured above are high-speed trains developed by the United States and the Soviet Union during the Cold War: they bolted bomber jet engines straight onto the top of a train. The American M-497 reached 295.54 km/h on tracks in Ohio in 1966. Modern high-speed rail only arrived a few decades later, but without those early barbaric experiments, I doubt that independently developed high-speed rail would be as strong as it is.
Speaking of big data trends, how can we not mention Google Flu Trends? By analyzing hundreds of millions of its own search queries, Google provides near-real-time estimates of influenza activity for many countries and regions around the world. As the screenshot shows, Google's trend curve overlaps closely with the official U.S. data, but the official data cannot match Google Flu Trends for timeliness and efficiency.
If you were about to go on a business trip and found that an epidemic was breaking out at your destination, I think most people would cry and shout to get out of it.
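How closely two curves "overlap" can be checked with a correlation coefficient. The sketch below computes Pearson's r by hand; both weekly series are invented for illustration and are not Google's or any official figures, and this is not a description of how Google Flu Trends actually works.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical weekly values: a flu-related search index and an official
# influenza rate (percent of doctor visits). Both series are made up.
search_index = [12, 15, 22, 35, 50, 41, 28, 16]
official_rate = [1.1, 1.4, 2.0, 3.3, 4.8, 4.1, 2.9, 1.5]

r = pearson(search_index, official_rate)
print(r)
```

A value of r close to 1 means the two curves rise and fall together; it says nothing about which one becomes available sooner, which is where the search-based estimate has its advantage.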
In 2012, for example, a famous condom brand in the United States issued a campus sexual health report. According to the report, 25% of college students in the United States have had sexually transmitted diseases, the most common being condyloma acuminatum, and the school with the highest rate is the United States Air Force Academy in Colorado. Afterwards, out of boredom, I looked at the academy's official website and saw that 78.1% of its students are male and 21.9% are female.
From these data we can at least draw one conclusion: before a one-night stand with a returnee from overseas, check their diploma first.
III. Wild ideas for big data applications
1. Medical services
Using medical records and similar data, derive the health trends of a certain kind of person. For example, attach labels such as "8 years in IT", "6 hours of overtime a day", "keeps working when ill", "eats lunch every day", "a pack of cigarettes a day" and "a cup of coffee every day to stay awake" to a specific group of people, then push them a personalized message like the one shown above.
Anyone else working overtime?
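The label-then-push idea can be sketched as a simple set test. The label set, the user names and the `should_push` helper below are all hypothetical, chosen only to mirror the labels in the text; a real system would need far more care about privacy and medical accuracy.

```python
# Hypothetical target group: a user qualifies only if every label applies.
RISK_LABELS = {
    "8 years in IT",
    "6 hours of overtime a day",
    "keeps working when ill",
}

# Made-up users and their labels.
users = {
    "user_a": {
        "8 years in IT",
        "6 hours of overtime a day",
        "keeps working when ill",
        "a pack of cigarettes a day",
    },
    "user_b": {"8 years in IT", "a cup of coffee every day"},
}


def should_push(labels, required=RISK_LABELS):
    """True when the user carries every label of the target group."""
    return required <= labels


targets = sorted(name for name, labels in users.items() if should_push(labels))
print(targets)
```

Only users who carry the whole label set receive the message, so the push stays specific to the group rather than going to everyone who drinks coffee.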
2. Crime Alert
From a city's historical crime and police records, we can identify its high-crime areas, such as the hand-chopping zone, the digging zone, the sexual assault zone and so on, and release daily reminders pushed to the public's mobile phones, tablets and other devices. I believe the hooligans in the sexual assault zone would be overwhelmed with shame and turn themselves in.
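Finding the high-crime areas is, at its simplest, a counting exercise. The sketch below uses made-up area names and incident records, and the `hotspots` helper and its `min_count` threshold are assumptions for illustration, not a real policing method.

```python
from collections import Counter

# Hypothetical police records as (area, incident type) pairs.
records = [
    ("North Station", "theft"),
    ("Old Market", "robbery"),
    ("North Station", "theft"),
    ("Riverside", "assault"),
    ("Old Market", "theft"),
    ("North Station", "robbery"),
]


def hotspots(records, min_count):
    """Return (area, count) pairs with at least min_count incidents, busiest first."""
    counts = Counter(area for area, _ in records)
    return [(area, n) for area, n in counts.most_common() if n >= min_count]


# Areas that would trigger a daily reminder under this threshold.
alerts = hotspots(records, min_count=2)
print(alerts)
```

The threshold decides how noisy the reminders are: set it too low and every street becomes a "zone", too high and nothing is ever pushed.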
3. Blind dates for older singles
This idea would work even better with a wearable device like Google Glass. At a matchmaking event with hundreds of men and women, you put on the full-featured glasses and look around at the girls; the system automatically analyzes all kinds of data, helps you find your best-matched other half, and pairs people up automatically by match score. Of course, if infrared see-through technology were advanced enough, a paid VIP see-through feature for the top match would be a great addition.
IV. Summary
After an hour of coaxing and bluffing my way through the big data topic in plain language, I had barely got back to my desk when a trainee sent me the following message:
"Teacher, after hearing your big data talk, can I understand it this way? Big data means the system takes a loser's basic information such as age, height, weight, dressing style and shopping preferences, adds the frequency, stars, style, length and fast-forward counts of the adult films he has watched before, refers to the favorite actresses, formats, regions and other factors of losers like me, and then, when I turn on the computer, automatically recommends the adult film that suits me best?"
I read this message and could not calm down for a long time. Only one song can express my mood at this moment.
"Ah, what a painful realization!"