Big data has been a hot topic for two years, from West to East and from the IT industry to government officials, yet it still has no consistent definition. At present, the industry generally accepts Gartner's description: a data set with the "3V" characteristics is big data. The first V is volume, meaning a very large quantity of data. The second is variety, meaning very complex data types and data sources. The third is velocity, meaning very high speeds of data generation, propagation, and response.
In my opinion, the "data gap" of the big data age has to be bridged by decision-makers, who need three capabilities: big data strategy, big data management, and big data ecosystem building.
Big Data Strategy Integration: Vision, View, Value
The value of big data has been proven in many industries, such as e-commerce, FMCG, and advertising, but extracting that value is not easy. In my opinion, enterprise decision-makers need to start from a "new 3V" of Vision, View, and Value when they formulate a big data strategy.
First, vision. The CEO must treat big data and cloud computing as core strategy, not merely as a matter of enterprise IT management, and must be determined to invest in both software and hardware.
Second, view. The enterprise needs its own point of view, that is, its own strategy for collecting and processing data. Take the stock market: we often face the same data but process it differently. Some people say a falling market is the time to buy, while others say it is the time to withdraw. Given the same data, or even the same software, different decision methods and different ideas produce very different results. This point of view should become the core of a company's decision-making system.
Third, value. Once the point of view is settled, the analysis of the data must be able to solve real problems in practice, because that is how the value of big data is realized. As Mr. Ma recently asked, where do bikinis sell best on Taobao? The answer is Inner Mongolia and Xinjiang, not coastal regions such as Hainan and Guangdong, as people usually assume. Big data can help people find hidden associations, but finding them does not by itself create social or commercial value. If you were a manufacturer of swimwear and sunscreen, what kind of marketing strategy would you adopt?
Big Data Management Integration: Simple, Open, Flexible
Big data strategy is important, but execution, that is, big data management, matters even more. It too can be addressed in three steps. The first is data management: how to acquire, store, and protect the data. The second is data enrichment: how to clean the data and discover correlations between different data sets. The third is data insight: gaining insight through analysis, presentation, and decision-making tools, and finally producing value through action.
Microsoft's big data management platform is designed around the full big data lifecycle. That is why we integrate open source architectures such as Hadoop into it. On the one hand, Hadoop is added as a complement for non-relational data processing; on the other hand, Hadoop is offered as a service, integrated into Microsoft's public and private cloud platforms. It is worth emphasizing that Microsoft is not simply porting Hadoop to its big data platform. This is a genuine fusion that takes into account usability, reliability, security, ease of deployment, and flexibility, and that even integrates and optimizes the tools built on Hadoop. At the same time, Microsoft will adhere to open source principles, contributing some of its research and development work on Hadoop back to the community and building a positive relationship with it.
Big Data Ecosystem Integration: Platform Vendors, Data Providers, Developers, Data Players
The future big data ecosystem will likewise follow the simplest market rules. Organizations and individuals in different roles will trade with one another through gradually maturing exchange mechanisms. Platform vendors provide the venues for data trading and data analysis, along with basic tools. Raw data providers supply data sets that can be freely traded. Developers provide applications and services built on those data sets, as well as customized analysis and presentation tools. Data players, like shareholders, look in the market for data sets or institutions worth investing in, in order to earn returns. Today people speculate in real estate, stocks, and gold; perhaps in the future they will speculate in data.
Microsoft has been experimenting with a marketplace on Windows Azure. It currently focuses on business users and already links third-party solution providers, service providers, module providers, and end business users, who can trade freely through this virtual marketplace. On this basis, we have added a data market that lets the owners of data sets publish their data. It offers many very detailed data sets, ranging from cinema seating and road conditions to national macroeconomic development data. Developers can use Microsoft's easy-to-use APIs or tools to bring this data into their own environments and build new applications.
Such a big data ecosystem is clearly healthy and sustainable. Platform vendors such as Microsoft, Amazon, Google, and VMware focus on the underlying cloud computing infrastructure and big data service platforms. For data providers such as Taobao, China Mobile, and government ministries, data that previously could only be used internally can generate more social and commercial value under this model. For application developers such as Salesforce, SAP, Ufida, and Kingdee, data integration across different application systems, traditionally very difficult and cumbersome, can now be achieved through such a marketplace, so that value can be found in the combined data. Data players, for their part, gain an emerging kind of investment platform to choose from, one that is not so easily manipulated by large organizations.
When data disclosure, data trading, and big data applications become natural habits, perhaps then we can say that the big data age has truly arrived.