Big data is undoubtedly a hot topic, but it has also been questioned as hype. An Ithome commentary argues that big data is not empty talk, but that the challenge is not purely technical: the greater challenge lies at the business level, and even in management. The full text follows:
Big data is probably the fastest-growing technology since cloud computing. Over the past year cloud computing has remained a hot topic, but big data has been hotter still. The situation resembles that of a few years ago, when every vendor was talking about the cloud.
When the whole industry was talking about cloud computing, almost every company, large or small, could claim some connection to it, which inevitably raised the question of whether it was hype. As it turned out, cloud computing did not become a bubble and has produced many concrete results. Now that the entire IT industry is chasing big data, the same question inevitably arises: is this hype too?
The most direct objection is that big data simply means analyzing large amounts of data, which is nothing new: applications that process and analyze large volumes of data have long existed, and many enterprises use data warehousing to handle them. This is much like the time when cloud computing was stretched to cover webmail, which left everyone confused and thinking, "So cloud computing already existed; it is just old wine in a new bottle." Judging from how cloud computing has developed since, that was clearly a misunderstanding.
In fact, much of the misunderstanding of big data comes from its Chinese translation. The Chinese term is a poor rendering of "big data," yet a suitable one is hard to find: any translation captures only part of the meaning and is bound to cause misunderstanding about the rest.
Big data has three characteristics: Volume, Velocity, and Variety. Volume refers to the amount of data. How much data counts as "big"? There is no fixed boundary, but many enterprises already see their data grow by tens or hundreds of terabytes a day, and their total data has reached the petabyte (PB) level, an amount that traditional databases struggle to handle. Velocity is the speed at which data grows. With mobile devices and the popularity of social networks, data grows much faster than it does in traditional enterprise applications, and once data proliferates that quickly, processing and analysis have to keep pace. Variety refers to the diversity of data. When we go online today we do not just read information; we also constantly produce data by uploading photos, uploading videos, and posting on Weibo. At the same time, IT has reached into every level of daily life, and all kinds of monitors and sensors continuously generate machine data, so data types are no longer as simple as they once were.
These three characteristics describe the present, not the future. So how can the increasingly pressing problem of big data processing be solved? Internet companies such as Facebook and Twitter, which face explosive data growth, have started using new technologies such as Hadoop and NoSQL to address it.
Hadoop is a distributed processing technology. Because it is built on a distributed architecture, it can combine a large number of inexpensive servers into enormous processing power, and it can scale out horizontally, adding more servers to meet larger data processing demands.
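The idea of splitting work across many cheap machines can be illustrated with a toy simulation of the MapReduce programming model that Hadoop is known for. This is only a sketch in plain Python, not Hadoop's actual API: the function names, the round-robin partitioning, and the sample log lines are all assumptions made for illustration, and the "workers" here are simulated in a single process.

```python
from collections import defaultdict
from itertools import chain


def split_into_partitions(records, num_workers):
    """Deal records round-robin into one partition per simulated worker."""
    return [records[i::num_workers] for i in range(num_workers)]


def map_phase(partition):
    """Each worker independently emits (word, 1) pairs for its own partition."""
    return [(word.lower(), 1) for line in partition for word in line.split()]


def shuffle_phase(mapped_outputs):
    """Group every emitted value by its key, as a cluster would over the network."""
    grouped = defaultdict(list)
    for key, value in chain.from_iterable(mapped_outputs):
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Combine the values collected for each key into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(records, num_workers):
    """Count words using num_workers simulated workers (hypothetical helper)."""
    partitions = split_into_partitions(records, num_workers)
    # On a real cluster these map calls would run in parallel on separate servers.
    mapped = [map_phase(partition) for partition in partitions]
    return reduce_phase(shuffle_phase(mapped))


if __name__ == "__main__":
    # Made-up sample input, for illustration only.
    logs = ["big data is big", "data keeps growing", "big clusters scale out"]
    print(word_count(logs, num_workers=1))
    print(word_count(logs, num_workers=3))
```

The result is the same whether one worker or three handle the input, which is the property that lets capacity grow by adding servers rather than by buying a bigger one.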
With open source technologies like Hadoop, many organizations can analyze large amounts of data without buying dedicated large-scale analysis equipment. For example, Japanese pharmaceutical companies analyze Twitter users' messages for words describing symptoms such as colds and runny noses, which lets them follow epidemic trends and anticipate market fluctuations. In the past, without a viable big data analysis tool, they probably would not even have considered analyzing Twitter.
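The kind of keyword analysis described above might look roughly like the following at small scale. This is a minimal sketch, not how any pharmaceutical company actually does it: the symptom terms, the function name, and the sample messages and dates are all invented for illustration, and no Twitter API is involved. The substring matching is deliberately naive (for example, "cold" would also match "scold").

```python
from collections import Counter

# Hypothetical list of symptom terms to look for (an assumption for illustration).
SYMPTOM_TERMS = ("cold", "runny nose", "cough")


def count_symptom_mentions(messages):
    """Count, per day, the messages that mention at least one symptom term.

    messages is an iterable of (day, text) pairs; a message is counted once
    even if it mentions several terms.
    """
    mentions = Counter()
    for day, text in messages:
        lowered = text.lower()
        if any(term in lowered for term in SYMPTOM_TERMS):
            mentions[day] += 1
    return mentions


if __name__ == "__main__":
    # Made-up sample messages; the dates are arbitrary labels.
    sample = [
        ("day 1", "Caught a cold, staying home"),
        ("day 1", "Lovely weather today"),
        ("day 2", "Runny nose all day"),
        ("day 2", "This cough won't stop"),
        ("day 2", "A cold and a cough at once"),
    ]
    for day, count in sorted(count_symptom_mentions(sample).items()):
        print(day, count)
```

A rising daily count would be the signal of interest. On real message volumes, a per-day count like this is the sort of job that would be spread across a cluster rather than run on one machine.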
Traditional data analysis vendors, for their part, have been moving their analysis platforms to distributed processing architectures that can scale out horizontally, or speeding up their database technology, to cope with the three characteristics of big data. This development will also help enterprises meet future data processing challenges, and users that have already adopted data warehousing, such as banks, can make a smooth transition. After all, Hadoop is still a very new technology with a high technical barrier to entry.
So big data is not empty talk, and many technological changes are already under way. However, the challenge of big data is not purely technical: the bigger challenge lies at the business level, and even in management.
(Editor: Lu Guang)