When Cloud Computing Meets Big Data: A Collision That Sparks a Technology Revolution

Source: Internet
Author: User
Keywords: big data, cloud computing


Two days ago, someone asked on Weibo how big data and cloud computing could be explained more clearly to people who are not professionals. In fact, there are already many big data case studies, and business intelligence analysis has repeatedly stressed the value and significance of data mining; the difference today is that there is far more data than before. Big data itself is not frightening. What is frightening is that real-time analytics exposes flaws and truths for everyone to see. So when cloud computing meets big data and the data pours into companies all at once, can companies manage?



So-called big data is mainly described along three "V" dimensions: processing speed (Velocity), data format (Variety), and data quantity (Volume). Big data is therefore not a single technology but a collection of many technologies whose common purpose is to process large amounts of structured, semi-structured, or unstructured data. Only by mastering the key technologies can an enterprise analyze and process big data and build business value from it.



Consider the well-known story of a store that knew a girl was pregnant before her own father did. One day in early 2012, a father in Minnesota went angrily to a store and asked the manager why it had sent his high-school daughter an ad with baby coupons. Was the store encouraging underage girls to get pregnant?



It turned out, however, that the daughter really was pregnant and that the store had not sent the ad at random. How could the store be powerful enough to uncover the truth? The answer lies in real-time analysis of big data: the girl's keyword searches for products and her behavior on social networking sites were already rich enough to reveal that she was pregnant, and even to predict what she would need to buy next.



This shows that with real-time analysis of massive data, seemingly dull and trivial information can quickly be turned into valuable assets that create substantial business opportunities. It can help boutique apparel retailers quickly detect changes in customer preferences and make better production and sales decisions right away, which in turn sustains revenue growth. It can help fund managers analyze mood changes in tweets to improve the accuracy of their stock market forecasts and achieve fund returns well above those of their peers.



No wonder all parties are flocking to big data. In March 2012, for example, the Obama administration decided to invest as much as $200 million in research and development funding to improve the tools and technologies needed to collect, store, manage, share, and analyze data in the big data era, with the aim of using them to accelerate the pace of scientific and engineering discovery, strengthen national security, and improve education and learning.



Why Big Data Is So Remarkable



Indeed, these big data applications and the technologies behind them have captured the imagination of enterprises. Most IT managers, however, only half understand those technologies, which limits the business value they can produce. That is a pity.



How can enterprises improve their technical readiness for the analysis and application demands of big data? In its "Hype Cycle for Big Data" report, the research firm Gartner presents a Priority Matrix for Big Data that broadly forecasts the rise and fall of many technologies. It shows which technologies are positioned as "transformational" and deserve close attention, which are on a "high" development track and are worth putting to good use, and which will probably stay at a "moderate" level, with limited prospects, so that heavy investment in them calls for second thoughts. With it, enterprise IT staff can form a basic map of the landscape.



According to Gartner, the first technology expected to reach the transformational level, within the next two years, is the column-store DBMS (columnar database), while predictive analytics, social media monitoring, and web analytics are expected to show a high level of development over the same period. In view of this, columnar databases and predictive analytics should be the most urgent priorities in an enterprise's planning.



It is not hard to understand why the columnar database takes the lead. The database system is the most critical vehicle for storing, using, sharing, and analyzing data, so its read and write efficiency on big data and its near-real-time computing capability must be considered carefully. A traditional database that stores and accesses data row by row is clearly not efficient enough to carry the heavy workloads that big data generates. Without this change, the advanced analytics applications that come afterward are out of the question.
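
The difference between the two layouts can be shown with a minimal sketch. It assumes a tiny, hypothetical set of sales records (the field names and values are invented for illustration) and plain Python lists rather than any real database engine. The point is only that an aggregate over one field touches every whole record in a row layout, but just one list in a column layout.

```python
# Hypothetical sales records, used only for illustration.
rows = [
    {"order_id": 1, "customer": "A", "amount": 120.0},
    {"order_id": 2, "customer": "B", "amount": 80.0},
    {"order_id": 3, "customer": "A", "amount": 45.5},
]


def to_columns(records):
    """Pivot row-oriented records into one list per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def total_from_rows(records, field):
    # Row layout: every whole record is visited to read a single field.
    return sum(record[field] for record in records)


def total_from_columns(columns, field):
    # Column layout: only the one column that is needed gets scanned.
    return sum(columns[field])


columns = to_columns(rows)
row_total = total_from_rows(rows, "amount")
column_total = total_from_columns(columns, "amount")
```

Both functions return the same total. In a real columnar engine the benefit comes from reading far fewer bytes from storage and from compressing each column, which this in-memory sketch does not attempt to model.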



Of course, Hadoop has come into its own, along with associated key-value databases such as BigTable, HBase, and Cassandra, which are collectively referred to as "NoSQL" databases. Whether key-value, in-memory, graph, or document databases, they all differ from the traditional relational database structure and seem closer to the requirements of big data processing. So why not use NoSQL databases directly rather than a columnar database?



In fact, NoSQL can also be read as "Not Only SQL": it is designed to complement existing SQL rather than replace it. Enterprises should first examine their database I/O requirements, schema-free requirements, single-table storage requirements, and so on, and work out which challenges of big data processing SQL can solve and which it cannot, rather than chasing fashion for its own sake. Seen this way, the columnar database does have plenty of room to prove itself; at least for the data that enterprises rely on, its read efficiency is considerably better than that of NoSQL databases.



Cloud Computing and In-Memory Databases: Transformational Technologies That Deserve Attention



In the "second wave" (technologies expected to mature in 2 to 5 years), Gartner names two items as transformational: cloud computing and the in-memory database management system (In-Memory DBMS).



The technologies rated "high" in the same 2-to-5-year range are quite diverse. They include advanced fraud detection and analysis technologies, cloud-based grid computing, data scientists, in-memory analytics, in-memory data grids, open government data, predictive modeling solutions, social analytics, social content, and text analytics.



The importance of cloud technology for big data processing and analysis is beyond doubt. Start with the private cloud: whether distributed computation on big data is carried out through MPI or MapReduce, it depends on flexible scheduling of computing, storage, and network resources. Without the cloud, the only alternative would seem to be spending a huge amount of money on supercomputers.
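
For readers unfamiliar with MapReduce, the following is a minimal single-process sketch of its three stages (map, shuffle, reduce) applied to a word count. It is an illustration only: the function names are hypothetical, it uses no Hadoop API, and each input string merely stands in for a data split that a real cluster would process on a separate node.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(mapped_pairs):
    """Group emitted values by key, as a framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in mapped_pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    mapped = []
    for document in documents:  # each document stands in for one split
        mapped.extend(map_phase(document))
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["big data meets cloud", "cloud computing meets big data"])
```

Because each call to `map_phase` and `reduce_phase` depends only on its own input, a scheduler is free to spread those calls across whatever machines are available, which is why elastic resource scheduling matters so much here.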



Next, consider the public cloud. Although every industry can benefit from big data analysis, most application areas do not need to run analysis all the time; the frequency may be as low as once a quarter or once every six months. Spending the money and time to build a Hadoop environment for such occasional use is not very economical, and the return on investment is questionable.



In that case, it is a good thing if the enterprise can pay as it goes, renting the computing resources needed for big data analysis from a public cloud service provider while applying settings consistent with its internal on-premises management rules. Microsoft, for example, offers a Hadoop rental service on its Windows Azure public cloud platform, so that businesses do not have to invest in building large numbers of servers and databases. It even promotes bringing the ease of management of Windows and SQL Server to the Hadoop environment. This is a fairly typical cloud big data service.
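
The economic argument can be made concrete with a small back-of-the-envelope sketch. Every figure below is a hypothetical placeholder chosen only to show the shape of the comparison; none of them comes from the article, from Microsoft, or from any real price list, and the two helper functions are likewise invented for illustration.

```python
def on_premises_cost(build_cost, yearly_upkeep, years):
    """Total cost of owning a cluster: one-off build plus yearly upkeep."""
    return build_cost + yearly_upkeep * years


def rented_cost(hourly_rate, nodes, hours_per_run, runs_per_year, years):
    """Total cost of renting nodes only while an analysis run is in progress."""
    return hourly_rate * nodes * hours_per_run * runs_per_year * years


# Hypothetical inputs: a quarterly analysis job over a three-year horizon.
owned = on_premises_cost(build_cost=200_000, yearly_upkeep=40_000, years=3)
rented = rented_cost(
    hourly_rate=1.0, nodes=50, hours_per_run=24, runs_per_year=4, years=3
)
```

With these placeholder values, renting costs a small fraction of owning, because the cluster is paid for only during the four runs per year. An enterprise that analyzes data continuously would plug in very different numbers and might reach the opposite conclusion.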



As for the in-memory database, the idea is to place an entire relational database, or even a columnar database, in memory. The advantage of this approach is that the disk I/O bottleneck, the most criticized weakness in the past, is eliminated at a stroke, which greatly boosts performance and shortens the response time of database operations. In an age that prizes speed, a business that gets the results of its operations faster is also more likely to win.
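
As a small illustration of the idea, the sketch below uses SQLite from Python's standard library, where connecting to the special name `":memory:"` creates a database that lives entirely in RAM. SQLite is only a convenient stand-in here, not one of the in-memory DBMS products Gartner has in mind, and the table and its values are hypothetical.

```python
import sqlite3

# ":memory:" tells SQLite to keep the whole database in RAM, not on disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("north", 100.0), ("south", 250.0), ("north", 50.0)],
)

# The query runs against memory-resident data, so no disk I/O is involved.
totals = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
)
conn.close()
```

The trade-off is that the data disappears when the connection closes, so a production in-memory system has to add its own persistence and recovery mechanisms.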



So it is not hard to see that while Gartner expects the in-memory database to become transformational in 2 to 5 years, the term "In-Memory" also appears repeatedly in the high-development quadrant, in items such as in-memory analytics and in-memory data grids. This highlights how important performance is for big data processing, and how much it matters to the results of the final business application.



It is also worth mentioning that Gartner's list of "moderate" technologies for the 2-to-5-year range includes several recently popular items, such as MapReduce, NoSQL databases, and database software as a service (dbSaaS). This suggests that their momentum is cooling, which also deserves the attention of enterprises.



(Responsible editor: Schpeppen)

