Faced with business data growing rapidly to the petabyte level, people naturally have many questions. How can big data help my business? What is the relationship between the cloud and big data? Can the cloud be put into practice? Is Hadoop really a panacea for big data? What role will virtualization play in big data? With these questions in mind, let us take a closer look at big data.
According to an IDC report titled "Digital Universe," the global volume of data is estimated to reach 35.2 ZB by 2020. Faced with such a huge amount of data, the efficiency of data processing is the lifeblood of the enterprise.
The source of big data
The development from mass data to big data is a process of quantitative change leading to qualitative change. Data has been accumulating for years and doubles every year. Technologies that matured in the past were able to manage it well enough, but once the volume reached a critical point, a qualitative change occurred. Past technologies can no longer keep up with current trends, and new technologies are needed to handle the new orders of magnitude. This is how the concept of big data came into being.
Big data has three characteristics. First, large scale: as businesses grow rapidly and gain more customers, they generate more and more data, and an IDC survey indicates that the global data volume will grow four times over the next 10 years. Second, wide sources: under the current concept of big data, data comes not only from internal applications but also from outside the company. A business can include suppliers, customers and other parties as data sources. Third, many types of data: transactional structured data, semi-structured data and unstructured data.
The market today is fiercely competitive, customer needs vary widely, and market conditions change rapidly, so this data must change constantly along with the business. In the big data era, businesses have therefore begun to think about how to master and manage big data effectively: how to extract information useful to the company's development from a large amount of data, how to use it to improve operational efficiency, and how to make big data deliver real value.
Putting big data into practice with VMware
There has been a great deal of talk and discussion about big data. But how do you put it into practice, and how does it help the business? Here is a practical example. The Credit Card Department of CITIC Bank ran about 1,500 promotional activities in 2011. Launching a promotional campaign used to take two weeks; with a big data solution it takes only 2-3 days. The bank also promises gifts to customers whose card spending reaches a certain amount. Handling this used to take a few days; now, as soon as a customer reaches the required amount, the gift can be sent in real time.
At CITIC Bank's Risk Assessment Center, big data is used to evaluate each client's daily credit card transactions and creditworthiness. Adjusting a customer's credit line used to be done once a month, or even once a quarter; now it can be done every day. After CITIC Bank adopted a big data solution, it carried out 40 million customer credit adjustments. This would be absolutely impossible without a big data solution. Behind this case is VMware's big data solution.
Another real-world big data user is Google. Google produces a huge amount of data every day, and it has a complete set of in-house analysis systems and solutions to process that data and put it to further use. Many other organizations, such as government agencies and enterprises, want to do the same thing. However, they have had to rely on vendor-specific hardware, software and solutions to do so. VMware hopes to help these organizations in the cloud era, so that they are no longer limited by the physical environment and can achieve this in a more flexible, effective and low-cost way. In the future, the banking industry will be able to use big data to analyze customer credit and manage risk, and the retail industry will be able to analyze its own information with big data so that its supply chain and capital chain run more smoothly.
Big data is the sports car, cloud computing is the highway
If big data is a high-speed sports car, then cloud computing is the highway. Some people say that cloud computing and big data are twins: two different individuals that depend on and complement each other. First, the two are conceptually different: cloud computing has changed IT, while big data has changed the business. However, big data needs the cloud as its infrastructure in order to operate smoothly. Without the cloud as a highway, the super sports car of big data cannot run. When market demand for the big data sports car is high, the cloud computing highway will extend in all directions, and the two reinforce each other.
Second, the target audiences for big data and cloud computing are different. Cloud computing is a technology and product sold to CIOs. It is an advanced IT solution. Big data is sold to the CEO and to the business side, and the decision-makers for big data are business people. Because they feel the pressure of market competition directly, they must find more competitive ways to beat their rivals. For example, telecom operators can use big data to analyze the causes behind changes in their mobile subscriber base. A week ago, a leading mobile operator that had adopted a big data solution found the cause, which brought the company a return of $100 million.
VMware is the industry's leading cloud infrastructure vendor, with strong technologies, products and solutions in IaaS, PaaS and SaaS. For managing the Hadoop platform, VMware has corresponding products, such as vFabric Data Director and Serengeti. Both manage the Hadoop platform effectively, offering rapid deployment, one-click management and more.
VMware recently acquired Nicira, a cloud services company that provides online analytics. With its service, large amounts of data, whether preset data or data from other applications, can easily be uploaded, analyzed rapidly, and displayed as charts. This makes big data technology easy to apply for a large company, a small company, or a single department. VMware is dedicated to building highways for high-speed sports cars so that big data can be combined effectively with the cloud.
Virtualization enhances Hadoop's security, agility and manageability
Hadoop is sponsored and developed by the Apache Foundation and is recognized as one of the open platforms in the industry. Companies are permitted to publish their own versions of Hadoop. Distributed systems represented by Hadoop are a necessary but not sufficient part of a big data system. They are necessary because much of today's big data is machine-generated: it comes from the wide variety of sensors in the Internet of Things and from computer-generated logs, it is not produced by people, and it is too large to be placed directly into a database. Hadoop provides a completely new way to scale out easily and store the data so that arbitrary analysis can be run on it. Hadoop has built this environment successfully, so that software around Hadoop can provide a wide range of capabilities for intelligent analytics.
They are not sufficient for the following reason. To analyze data, the client puts the data into the pool and Hadoop divides it across hundreds or thousands of nodes, which is the necessary part for certain scenarios. But more applications require real-time, interactive responses. These call for other technologies, including in-memory retrieval and even technologies that respond in real time at the moment the data is generated. Only when these technologies are combined is there a complete big data processing system. This is why VMware and its partners have been working continuously on real-time response, interaction and content retrieval.
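As a rough illustration of the batch model described above, in which data is divided across many nodes and processed in parallel, the following Python sketch splits log lines into chunks, counts log levels per chunk, and merges the partial results. It is not Hadoop code and does not use any Hadoop API; the function names (`split_into_chunks`, `map_count`, `reduce_counts`, `count_levels`) and the sample log format are assumptions made purely for illustration.

```python
from collections import Counter
from functools import reduce


def split_into_chunks(records, num_nodes):
    """Distribute records round-robin across num_nodes simulated nodes."""
    chunks = [[] for _ in range(num_nodes)]
    for i, record in enumerate(records):
        chunks[i % num_nodes].append(record)
    return chunks


def map_count(chunk):
    """Map step: count the first token (assumed to be the log level) per line."""
    return Counter(line.split()[0] for line in chunk if line.strip())


def reduce_counts(partials):
    """Reduce step: merge the per-node partial counts into one total."""
    return reduce(lambda a, b: a + b, partials, Counter())


def count_levels(records, num_nodes=3):
    """Run the simulated split -> map -> reduce pipeline on a list of lines."""
    chunks = split_into_chunks(records, num_nodes)
    partials = [map_count(chunk) for chunk in chunks]
    return dict(reduce_counts(partials))


if __name__ == "__main__":
    # Hypothetical machine-generated log lines.
    logs = ["ERROR disk full", "INFO started", "ERROR timeout", "WARN slow"]
    print(count_levels(logs, num_nodes=2))
```

Because each chunk is counted independently, the map step could run on separate machines. The sketch also shows why this model suits batch analysis more than interactive queries: no answer is available until every chunk has been processed and merged.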
VMware's strategic direction is to work with the industry's leading Hadoop distributions to create an open ecosystem in which all versions of Hadoop can run on VMware's virtualization platform. Guided by this, VMware has done the following work. First, it works closely with the community: VMware developers contribute to the Apache source code base alongside community developers. Hadoop was not designed with virtual environments in mind. It was a technology for physical environments, with physical concepts such as machines and racks but no notion of virtual machines. The code VMware contributed adds the concept of the virtual machine. This concept differs from the existing ones and needs special treatment, so that the source code knows it is running in a virtual environment and can optimize accordingly. Through VMware's efforts, the open source Hadoop technology can now run in a virtual environment, and a new cluster can be created from scratch in ten minutes or less. VMware's goal is to build a spacious avenue on which the big data sports car can run quickly.
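To illustrate why a virtual machine layer needs special treatment in a topology built around racks and machines, here is a minimal Python sketch. It assumes a hypothetical path format of `/rack/host/vm`; the functions `distance`, `same_physical_host` and `choose_replica_nodes` are illustrative names and do not reproduce the code VMware contributed or any Hadoop API.

```python
def parse(path):
    """Split a path such as '/rack1/hostA/vm1' into ['rack1', 'hostA', 'vm1']."""
    return [part for part in path.split("/") if part]


def distance(path_a, path_b):
    """Hops from each node up to their closest common ancestor, summed."""
    a, b = parse(path_a), parse(path_b)
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return (len(a) - common) + (len(b) - common)


def same_physical_host(path_a, path_b):
    """Two VMs share hardware if their rack and host components match."""
    return parse(path_a)[:2] == parse(path_b)[:2]


def choose_replica_nodes(candidates, replicas):
    """Pick nodes in order, skipping VMs on an already-chosen physical host."""
    chosen = []
    for node in candidates:
        if all(not same_physical_host(node, c) for c in chosen):
            chosen.append(node)
        if len(chosen) == replicas:
            break
    return chosen


if __name__ == "__main__":
    nodes = ["/rack1/hostA/vm1", "/rack1/hostA/vm2", "/rack1/hostB/vm1"]
    print(distance(nodes[0], nodes[1]))    # two VMs on the same host
    print(choose_replica_nodes(nodes, 2))  # avoids placing both copies on hostA
```

Without the extra layer, two virtual machines on the same physical host would look like two independent machines, and copies of the same data could end up on the same hardware. Modeling the host explicitly lets placement logic avoid that, and lets it treat VMs on the same host as the closest neighbors.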
In addition, VMware virtualization brings Hadoop closer to the cloud and, in practical terms, into the cloud computing environment, making it easier to manage and more secure.
First, it makes Hadoop suitable for multi-tenant environments. When a company needs Hadoop or a big data system, it is often not just one department that needs it. Different departments may each need their own Hadoop cluster in a private cloud, and similar needs are even more common in the public cloud. Virtualization provides a very good architecture that lets multiple clusters run flexibly and simultaneously without affecting one another.
Second, it improves the security of Hadoop. Today the common approach in the industry is to run everything on a single Hadoop platform, where information is poorly protected and users can see each other's data. VMware's virtualization creates strong isolation between different clusters.
Third, it increases the scalability of Hadoop. Once a Hadoop cluster has been created in a physical environment, adding nodes, and especially removing them, is not easy. Yet the capacity a cluster needs depends on each department's requirements and fluctuates over time. With virtualization, it becomes easy to scale such nodes up and down.
The last point is higher CPU utilization. According to a general response from the Hadoop community, across 40,000 clustered nodes the average CPU utilization is only 20-30%. Virtualization increases this utilization dramatically.
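The utilization figure above lends itself to a back-of-the-envelope calculation. The Python sketch below estimates how many hosts would carry the same CPU work if average utilization were raised to a target level. The target percentage is a hypothetical input, not a figure from the text, and the calculation ignores virtualization overhead, memory, storage and I/O limits.

```python
def hosts_needed(num_nodes, avg_util_pct, target_util_pct):
    """Estimate hosts required to carry the same CPU work at a higher utilization.

    Percentages are whole numbers (e.g. 25 for 25%) so the arithmetic stays exact.
    """
    if not (0 < avg_util_pct <= 100 and 0 < target_util_pct <= 100):
        raise ValueError("utilization percentages must be in (0, 100]")
    busy_units = num_nodes * avg_util_pct
    # Ceiling division without floating point.
    return -(-busy_units // target_util_pct)


if __name__ == "__main__":
    # 40,000 nodes at 20-30% utilization, with a hypothetical 60% target.
    for avg in (20, 25, 30):
        print(avg, hosts_needed(40_000, avg, 60))
```

Under these simplifying assumptions, 40,000 nodes averaging 25% utilization represent the CPU work of about 16,667 hosts running at 60%.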
In the software-led data center, openness is the last word
VMware has had a profound impact on data center architecture. In recent years the entire data center has evolved from a hardware-led world to a software-led one. In the past, data centers were mostly compute-centric, while more and more applications today are data-driven. VMware offers a unified infrastructure intended to serve both compute-centric and data-driven applications.
Openness is the essence of VMware. The history of data management also confirms the old saying that what has long been divided must unite, and what has long been united must divide. Forty years ago, data management was an era of rival warlords. Then came many data companies offering unified solutions. Now big data is disruptive, and the all-encompassing approach can no longer meet all data needs. At this point, VMware hopes to provide fertile soil: a more flexible infrastructure that makes it very easy for customers to try a variety of new technologies, including Hadoop, with a very low barrier to entry and little effort.
Song Jiayu, president of VMware Greater China, said: "The cloud era is not one in which vendors talk only to themselves; it is a completely market-driven era. The market tells us that customers have had, have now, and will have a wide range of choices. We insist on understanding customer needs and on cooperating openly with other vendors. We often see that a vendor's past success becomes a burden. VMware is well aware of this and always maintains an open attitude and strategy. This is the secret to how we stay true to innovation and keep leading the market."
Big data with a Chinese heart
As its cloud computing and big data strategy advances in China, VMware's R&D efforts in China have also made great strides. Following the expansion of the Beijing R&D team last year, the Shanghai R&D team announced in September this year that it too would expand and increase its R&D investment. This overall development shows VMware's confidence in and recognition of the innovation capability of its Chinese R&D team, as well as the company's determination to further develop and support the cloud computing market in China.
The Chinese team has lived up to expectations, with outstanding performance in many projects. Fan Chenggong, global senior vice president of VMware, said: "We are very pleased to see that the Chinese R&D team has made outstanding achievements in developing mainstream global technology. Our Hadoop-related technology was born in China. Engineers in China first developed it independently, headquarters then approved it, and only after that was the project expanded. Half of the project's engineers are currently in China, so this leading technology is led by the Chinese R&D team."