With information and communication technology (ICT) now fully integrated into social life, the huge volume of data generated every day holds great value, and data is becoming a strategic asset for enterprises. Gaining new insights and methods from this mass of data, and creating new value from it, is one of the strategic directions for many industries, especially telecommunications.
Telecom operators are expected to become big data pioneers
The first impression big data gives is its size. But how big is it? A set of figures known as "one day on the internet" offers some reference.
In a single day, the content generated on the internet could fill 168 million DVDs; 294 billion emails are sent (equivalent to two years of paper mail in the United States); 2 million blog posts are published (equivalent to 770 years of text in TIME magazine); and 378,000 handsets are sold, more than the 371,000 babies born worldwide each day. And these numbers are still rising.
As of 2012, data volumes have jumped from the TB level (1 TB = 1024 GB) to the PB (1 PB = 1024 TB), EB (1 EB = 1024 PB) and ZB (1 ZB = 1024 EB) levels. Research by International Data Corporation (IDC) shows that global data volume was 0.49 ZB in 2008, 0.8 ZB in 2009, 1.2 ZB in 2010 and as high as 1.82 ZB in 2011, equivalent to more than 200 GB of data per person worldwide. By 2012, all the printed material ever produced by humanity amounted to about 200 PB, and all the words in human history to about 5 EB. According to an IBM study, 90% of all the data available to human civilization was generated in the past two years. By 2020, the world's data is expected to be 44 times its size today.
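The per-person figure and the year-on-year growth implied by the IDC numbers can be checked with a few lines of arithmetic. The sketch below uses the 1024-based unit ladder from the text; the world population of about 7 billion in 2011 is an assumption that the article does not state, and the function names are illustrative only.

```python
# Binary unit ladder from the text: ZB -> EB -> PB -> TB -> GB, each a factor of 1024.
GB_PER_ZB = 1024 ** 4

# IDC global data volumes quoted in the article, in ZB.
IDC_VOLUME_ZB = {2008: 0.49, 2009: 0.8, 2010: 1.2, 2011: 1.82}

# ASSUMPTION (not from the article): roughly 7 billion people in 2011.
WORLD_POPULATION_2011 = 7_000_000_000


def gb_per_person(volume_zb, population):
    """Convert a global volume in ZB into GB per person."""
    return volume_zb * GB_PER_ZB / population


def yearly_growth(volumes):
    """Return the fractional growth of each year over the previous listed year."""
    years = sorted(volumes)
    return {cur: volumes[cur] / volumes[prev] - 1
            for prev, cur in zip(years, years[1:])}


if __name__ == "__main__":
    per_person = gb_per_person(IDC_VOLUME_ZB[2011], WORLD_POPULATION_2011)
    print(f"2011: about {per_person:.0f} GB per person")
    for year, rate in yearly_growth(IDC_VOLUME_ZB).items():
        print(f"{year}: {rate:.0%} growth over the previous year")
```

Under that population assumption the 2011 volume works out to well over 200 GB per person, which is consistent with the figure quoted above.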
So does big data just mean big? No. What we see so far is only an iceberg beginning to surface. The gold mine of value contained in today's data has attracted wide attention. In March 2012, the Obama administration announced that it would invest 200 million dollars to launch the "Big Data Research and Development Initiative", raising big data to the level of national strategy, with far-reaching implications for future technological and economic development.
So what is big data? Broadly speaking, big data refers to vast amounts of data that exceed the processing capacity of traditional database systems: the scale and transmission speed of the data are very high, or its structure does not suit conventional database systems. More precisely, big data has four characteristics: large volume (Volume), diverse data types (Variety), real-time processing (Velocity) and the great commercial value it contains (Value).
Since big data holds such tempting value, it is not surprising that it draws many "heroes" into the contest. At present, the "heroes" with big data value mining technology include telecom operators, internet companies, financial enterprises and others. Among them, internet companies have taken the lead in the big data field thanks to their heavy investment. However, operators, which hold the most complete data in their telecommunications networks, have become the strongest competitors for cross-industry big data applications. With continued strong investment, operators are expected to become the leaders in this field.
Carrier networks hold several types of data: terminal model data, user location data, internet service data, basic user attribute data and user consumption data.
ZTE believes that big data will become a strategic direction of development for many industries, especially telecommunications.
The telecommunications industry currently faces severe homogeneous competition and has a strong desire to find a "blue ocean" for its operations. As the trend toward becoming a mere network pipe pushes operators to upgrade and transform their profit models, one way out is for operators to analyze and exploit their own huge core resource, network data, so as to raise their level of operations and find points of innovation.
In the enterprise sector, as information exchange becomes increasingly social, customers have more choices and are harder to retain. Enterprises need to study consumer groups in depth through business data and continuously improve their products and service experience in order to survive and develop. Advances in information technology also let competitors imitate products and services more quickly. On the whole, informatization intensifies market competition, and enterprises must make effective use of the potential value of their data to survive and develop sustainably.
As a long-established provider of telecommunications solutions and products, ZTE believes that the telecom industry's foothold in big data lies in processing, analyzing and mining its most valuable data (location, voice, network traffic and video) to support higher-level business operations and innovation at four levels. The first is big data collection and preprocessing (ETL, cloud storage); the second is turning data into information (statistics, retrieval, query); the third is in-depth analysis and mining (user grouping, behavioral analysis); and the fourth is prediction (products, tariffs, user trends).
In the government and enterprise sector, operational data (large enterprise call center data, retail store information, traffic information, government department data and so on) can be collected, processed, analyzed and mined, and can be combined with telecommunications industry data (such as mobile phone user location). This analysis and use of big data directly supports and improves the operations of enterprises and public institutions.
Innovative technology is the cornerstone of big data development
Big data technology is undoubtedly the cornerstone and booster for realizing the value of big data mining. Its development vividly reflects the essence of distributed computing. Apache Hadoop, an open source distributed architecture, is one such booster and has been adopted by many big data companies, including IBM, Alibaba and ZTE.
The logical flow of big data mining is shown in the figure. Different technologies are relied on in different stages and application fields.
Logic diagram of ZTE's big data value mining technology
ZTE's data mining technology can be divided into three stages and eight links. Data collection, storage and retrieval form the preprocessing stage; data processing, analysis, mining and model prediction form the mining stage; and the final stage is the output and presentation of results.
In the data preprocessing stage, ZTE uses ETL tools. These extract data from distributed, heterogeneous data sources, such as relational databases and flat data files, into a temporary middle tier, where the data is cleaned, transformed and integrated. The data is finally loaded into a data warehouse or data mart, where it serves as the basis for online analytical processing and data mining.
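The extract, stage, clean, transform, integrate and load flow described above can be sketched in a few dozen lines. This is a minimal illustration using only the Python standard library, not ZTE's ETL tooling: the `subscribers` table, the CSV usage records, the `usage_fact` warehouse table and all function names are hypothetical.

```python
import csv
import io
import sqlite3


def extract_relational(conn):
    """Extract rows from a relational source (hypothetical 'subscribers' table)."""
    rows = conn.execute("SELECT user_id, city FROM subscribers").fetchall()
    return [{"user_id": str(uid), "city": city} for uid, city in rows]


def extract_flat_file(text):
    """Extract rows from a flat CSV file (hypothetical usage records)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(subscribers, usage):
    """Clean, transform and integrate the staged records in the middle tier."""
    # Transformation: normalize city names to a single spelling style.
    city_by_user = {s["user_id"]: (s["city"] or "").strip().title()
                    for s in subscribers}
    totals = {}
    for rec in usage:
        uid = rec["user_id"].strip()
        try:
            mb = float(rec["mb_used"])
        except ValueError:
            continue  # cleaning: drop records whose usage value cannot be parsed
        if uid not in city_by_user:
            continue  # cleaning: drop usage with no matching subscriber
        totals[uid] = totals.get(uid, 0.0) + mb
    # Integration: join the two sources into one row per user.
    return [(uid, city_by_user[uid], mb) for uid, mb in sorted(totals.items())]


def load(warehouse, rows):
    """Load the integrated rows into a warehouse table."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS usage_fact "
        "(user_id TEXT, city TEXT, mb_used REAL)"
    )
    warehouse.executemany("INSERT INTO usage_fact VALUES (?, ?, ?)", rows)
    warehouse.commit()


def run_demo():
    """Run the whole pipeline on tiny in-memory sources."""
    source = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE subscribers (user_id INTEGER, city TEXT)")
    source.executemany("INSERT INTO subscribers VALUES (?, ?)",
                       [(1, " shenzhen "), (2, "NANJING")])
    flat = "user_id,mb_used\n1,120.5\n2,n/a\n1,30\n3,10\n"
    warehouse = sqlite3.connect(":memory:")
    load(warehouse, transform(extract_relational(source), extract_flat_file(flat)))
    return warehouse.execute(
        "SELECT user_id, city, mb_used FROM usage_fact").fetchall()


if __name__ == "__main__":
    print(run_demo())
```

In-memory SQLite databases stand in for both the relational source and the warehouse so that the sketch runs without any external system; a real deployment would point the same three steps at production sources.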
The data mining stage is the most critical stage of the whole process. Here ZTE uses CEP (complex event processing) and MapReduce technology. CEP abstracts data streams into event sequences, which enables upper-layer applications to grasp the operating status and take action in real time.
Last comes the output and presentation stage, whose aim is to render the various results of data mining more intelligently, simply and smoothly. ZTE uses technologies such as cloud computing, tag clouds and graphs.
Big data mining as a whole involves a wide range of technologies, and these technologies change with each passing day. As a pacesetter in the communications industry, ZTE has long worked on the research and development of big data technology and has helped promote its evolution. ZTE believes the following trends will become increasingly apparent:
In data storage and management, efficient storage is becoming the main research direction of storage technology, and relational databases and distributed data management approaches are gradually converging.
The diverse requirements of large-scale data processing and analysis mean that offline batch processing, real-time stream processing, distributed in-memory computing, graph computing and other computing frameworks coexist. A mix-and-match architecture is needed to meet application requirements, and at the same time the multiple computing frameworks are converging.
The demand for natural language understanding promotes the development of semantic web technology; cross-media data fusion services promote multidimensional, multimodal information fusion and processing; and big data visualization is becoming the best way to understand big data quickly.
Intelligent data applications in telecom have developed along the path "MSS to BSS to OSS to telecom network elements"; application scenarios are becoming richer, and data from different fields is being merged.
Big data platform technology gives an intuitive picture of what big data mining technology consists of. The big data platform sits between data integration and applications, and provides data preprocessing, processing, analysis and mining, and external interface sharing functions. Data processing includes real-time stream processing (CEP) and offline batch processing (including key Hadoop technologies such as the HDFS file system and MapReduce data processing). For data mining, ZTE has developed different components for various applications, including crowd analysis components and user behavior analysis components. These components reflect ZTE's commitment to developing practical big data applications.
Hadoop MapReduce is the mainstream technology for offline batch processing. A significant change in Hadoop's development is the introduction of YARN, which splits resource management and scheduling out of MapReduce and lays the foundation for integrating multiple computing architectures. Hadoop's open source nature has promoted the rapid adoption of big data computing technology, but the imperfections of the open source system are problems that must be solved in actual use. ZTE has done a great deal of work on high availability, performance optimization and management optimization.
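For readers unfamiliar with the MapReduce model, the map, shuffle and reduce phases can be imitated in a single Python process. The sketch below is not the Hadoop API and involves no cluster or YARN scheduling; the log format (`cell_id,user_id,bytes`) and the function names are assumptions made for illustration.

```python
from collections import defaultdict


def map_phase(record):
    """Map: turn one hypothetical log line 'cell_id,user_id,bytes' into (key, value)."""
    cell_id, _user_id, nbytes = record.split(",")
    yield cell_id, int(nbytes)


def shuffle(pairs):
    """Shuffle: group all values emitted by the mappers under their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values of one key into a single result."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce in sequence over an iterable of records."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values)
                for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    log = ["cellA,u1,100", "cellB,u2,50", "cellA,u3,25"]
    print(run_job(log))
```

In a real Hadoop job the mappers and reducers run on different nodes and the shuffle moves data across the network; here all three phases are plain function calls, which keeps the data flow visible.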
Real-time stream processing with CEP uses an event-triggered mechanism to process incoming events promptly in memory. CEP supports rules to meet flexible event handling requirements, and uses a distributed in-memory database, a message bus and other mechanisms to achieve fast real-time response.
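A rule-driven, event-triggered engine of this kind can be sketched as follows. This is a single-process illustration under stated assumptions, not ZTE's CEP product: it keeps its state in ordinary Python memory rather than a distributed in-memory database, has no message bus, and the `Event`, `ThresholdRule` and `Engine` names and the "call_drop" event kind are hypothetical.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    timestamp: float  # seconds since an arbitrary start
    kind: str         # e.g. "call_drop"
    key: str          # e.g. a cell identifier


class ThresholdRule:
    """Fire when `count` events of `kind` arrive for one key within `window` seconds."""

    def __init__(self, kind, count, window):
        self.kind = kind
        self.count = count
        self.window = window
        self._recent = defaultdict(deque)  # key -> timestamps still inside the window

    def on_event(self, event):
        if event.kind != self.kind:
            return None
        recent = self._recent[event.key]
        recent.append(event.timestamp)
        # Evict timestamps that have fallen out of the sliding window.
        while event.timestamp - recent[0] > self.window:
            recent.popleft()
        if len(recent) >= self.count:
            recent.clear()  # start counting afresh after an alert
            return f"ALERT {self.kind} x{self.count} on {event.key}"
        return None


class Engine:
    """Push each incoming event through every rule as soon as it arrives."""

    def __init__(self, rules):
        self.rules = list(rules)

    def push(self, event):
        alerts = []
        for rule in self.rules:
            alert = rule.on_event(event)
            if alert is not None:
                alerts.append(alert)
        return alerts


if __name__ == "__main__":
    engine = Engine([ThresholdRule("call_drop", count=3, window=60.0)])
    for t in (0, 20, 100, 110, 120):
        for alert in engine.push(Event(t, "call_drop", "cell-1")):
            print(t, alert)
```

Because each rule is evaluated at the moment an event is pushed, the upper-layer application receives alerts without polling, which is the event-triggered behavior described above.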
Big data applications are becoming richer and more mature
Big data services and applications are becoming personalized, social and intelligent, and the demand for human-computer interaction promotes the development and application of intelligent question answering.
Big data is already widely applied in all aspects of our lives, including communications, healthcare, energy, the economy, transportation, retail and other industries:
Seton Healthcare is the first customer to use IBM's latest Watson technology for healthcare content analysis, which lets organizations find large amounts of patient-related clinical information and analyze patient information better through big data processing.
Vestas Wind Systems relies on BigInsights software and IBM supercomputers to analyze meteorological data and find the best places to install wind turbines and entire wind farms. With big data, analysis that used to take several weeks can now be completed in less than one hour.
DoCoMo combines mobile phone location information with information on the internet to provide customers with information on nearby restaurants and, as the time of the last bus approaches, last-bus information services.
Retail companies also monitor customers' movements in the store and their interactions with goods. They combine these data with transaction records for analysis, which yields advice on what to sell, how to place goods and when to adjust prices. This approach has helped a leading retailer reduce its inventory by 17 percent while maintaining market share and increasing the proportion of its own-brand goods.
In the communications market, which is of particular interest here, operators' big data applications are mainly reflected at the following four levels.
At the market level, operators can apply big data to their own products and services: analyzing user behavior to improve product design, and analyzing user preferences to make timely and accurate service recommendations and strengthen customer care, thereby continuously improving the user experience, increasing users' information consumption and strengthening their stickiness to the operator;
At the network level, operators can use big data to analyze network traffic volume and direction and adjust resource allocation in time, and can analyze network logs to optimize the whole network and improve network quality and utilization;
At the enterprise management level, analysis of business, resource, financial and other data allows the company's management and market competition strategies to be determined quickly and accurately;
At the business innovation level, operators can process data in depth, on the premise that user privacy is not violated, to provide information services and create new value for enterprises. In this way, big data will help operators transform from network service providers into information service providers.
In short, operators are seeking to maximize the value of their own data, improve operational efficiency, reduce operating costs, improve the quality of customer care, enhance operational capability, open up new business markets and accelerate ICT integration.
As a leader in big data technology, ZTE has drawn mainly on operator data and industry data to develop a complete set of industry-leading big data applications spanning telecommunications, finance, transportation, the internet and other industries.
Take shop site selection as an example. Combining user location data with user profiles to choose shop sites overcomes the limitations of traditional manual site selection, helps commercial customers choose sites quickly, accurately and at low cost, binds value-added services closely to commercial customers, and enhances the stickiness of high-value customers.
When big data is applied to passenger flow analysis, real-time user location signaling data is collected, regional passenger density and crowd movement tracks are presented dynamically, and patterns of crowd activity are discovered and predicted, meeting the needs of municipal departments for road traffic planning and emergency safety control.
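As a rough illustration of how regional density and movement tracks might be derived from location samples, the sketch below buckets coordinates into a square grid. The sample format `(user, time, lat, lon)`, the 0.01 degree cell size and the function names are all assumptions for this example; the article does not describe ZTE's actual method, and a production system would work on anonymized signaling data.

```python
from collections import Counter


def grid_cell(lat, lon, cell_deg=0.01):
    """Map a coordinate to the index of a hypothetical square grid cell."""
    return (int(lat // cell_deg), int(lon // cell_deg))


def density(samples, start, end):
    """Count distinct users per grid cell for samples with start <= t < end."""
    seen = set()
    for user, t, lat, lon in samples:
        if start <= t < end:
            seen.add((user, grid_cell(lat, lon)))
    return Counter(cell for _user, cell in seen)


def track(samples, user):
    """Return the time-ordered sequence of cells one user passed through."""
    cells = []
    for u, _t, lat, lon in sorted(samples, key=lambda s: s[1]):
        if u != user:
            continue
        cell = grid_cell(lat, lon)
        if not cells or cells[-1] != cell:
            cells.append(cell)  # record only changes of cell
    return cells


if __name__ == "__main__":
    samples = [
        ("u1", 0, 22.5451, 114.0551),
        ("u2", 5, 22.5455, 114.0558),
        ("u1", 70, 22.5551, 114.0551),
    ]
    print(density(samples, 0, 60).most_common())
    print(track(samples, "u1"))
```

Counting distinct users rather than raw samples keeps a stationary phone that reports many times from inflating the density of its cell.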
There are also intelligent operation applications. The new big data based network planning and optimization solution is oriented toward refined operations and aims to improve the customer experience. By quantifying network planning and optimization performance indicators, assessing and forecasting scientifically, deploying resources precisely, introducing new data sources and adopting massive data processing methods, it provides strong support for network construction planning and optimization.