Fanzian: Development Status and Future Plans for China Unicom's Big Data

Zhongguancun Big Data Day 2014 was held on December 11, 2014 in Zhongguancun. Under the theme "Aggregate data assets, promote industrial innovation," the conference explored key issues including data asset management and transformation, in-depth big data technology, innovation in industry data applications, and ecosystem building. It also addressed the needs and practices of government departments, the financial sector, telecom operators, and others, and discussed how to achieve transformation and industry innovation through the management and operation of data assets.

At the afternoon Operators @ Big Data forum, Fanzian, deputy general manager of the Information Center in China Unicom's Information and E-commerce Division, gave a keynote speech. He described the progress of Unicom's big data work and its development plan for 2015.

Fanzian: Thank you for the introduction. Leaders, ladies and gentlemen, I have been invited today to give you a brief introduction to big data development at China Unicom, mainly from the group's point of view, and to our plans for 2015.

Let me look back on 2014 in terms of the concrete work we did. Before 2014 there was already big data research at China Unicom's corporate research institute, but from the perspective of the big data center we went from zero to a first breakthrough in 2014, especially at the infrastructure level. We built an initial 28-node platform early this year. Today more than 400 nodes are in production on the big data platform, and we are deploying an initial 1,200-node platform. Among enterprises in China, this should count as a fairly large platform. At the same time, we combined our original BI system, which is based on a traditional Oracle data warehouse architecture, with an MPP database and the Hadoop platform I just mentioned. The key point of this hybrid architecture is that the three platforms are integrated through tools, and different data sources are allocated to different platforms.
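The idea of allocating different data sources to the three platforms can be illustrated with a minimal sketch. The talk does not describe the actual allocation rules or tools, so the rule used here (unstructured data to Hadoop, large structured data to MPP, everything else to the Oracle data warehouse), the volume threshold, and all source names are assumptions made purely for illustration.

```python
from dataclasses import dataclass

# Platform labels for the three platform types named in the talk.
ORACLE_DW = "oracle_dw"
MPP = "mpp"
HADOOP = "hadoop"


@dataclass(frozen=True)
class DataSource:
    name: str               # hypothetical source name
    structured: bool        # True for relational, tabular data
    daily_volume_gb: float  # assumed daily volume, illustrative only


def route(source: DataSource, mpp_threshold_gb: float = 100.0) -> str:
    """Pick a target platform using assumed, illustrative rules."""
    if not source.structured:
        return HADOOP       # unstructured data goes to Hadoop
    if source.daily_volume_gb >= mpp_threshold_gb:
        return MPP          # large structured data goes to the MPP database
    return ORACLE_DW        # small structured data stays in the warehouse


sources = [
    DataSource("erp_reports", structured=True, daily_volume_gb=5),
    DataSource("billing_detail_records", structured=True, daily_volume_gb=800),
    DataSource("internet_access_logs", structured=False, daily_volume_gb=5000),
]

allocation = {s.name: route(s) for s in sources}
print(allocation)
```

In a real deployment the routing decision would more likely live in configuration maintained by the integration tools than in code, but the principle of one explicit rule per data source is the same.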

In 2014 we also did the following in data collection. Previously, our data came mainly from the BI, PIM, and ERP systems. In 2014, for the first time, we synchronized users' Internet access records collected on the network side to our data center. From the point of view of big data applications this is an important breakthrough, because we can integrate network-side data with business-side data and extract greater value from it. The effect is multiplicative, not just a simple accumulation of data.

Another point you may know is that Unicom puts strong emphasis on IT construction. As part of this IT work we are building a billing system, and in it we have collected the original usage records from all 31 provinces, covering 2G/3G/4G mobile services across the whole network as well as broadband.

Finally, the first phase of the billing system, which we call 4.0, is in use, and the migration of 3G and broadband users onto it is under way. This centralized 4G system produces new data such as detailed usage records, customer information, and product ordering relationships. What is the biggest change this brings us? Previously, data was collected and sent up to headquarters for collation. With the new system, the data flows from the top down.

Besides platform construction and data collection, we also learned a lot this year about data mining. Our work in the traditional BI field was mainly report generation and data analysis. With the new data and new collection methods I just described, we learned how to use big data mining tools.

We started with a simple project. Our 365 mentoring program was launched by the marketing department to guide new 3G users in using data traffic. At key points early in a new user's time on the network, such as the first month, we depict the user's preferences through a user portrait, so that the marketing department can use this data to determine the most suitable data plan for that user and then push the plan to the user. This was our first project, built at the beginning of this year.

Through this project we progressed from simple processing of raw data to analyzing the data content, in particular parsing the URLs in users' online records. On this basis our knowledge base now holds nearly 100 million website content records, parsed content for 5,000 mobile apps, and label definitions for 130 million mobile users.
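The step from URL records to user labels can be sketched roughly as follows. This is only an illustration under stated assumptions: the knowledge base is reduced to a small hypothetical host-to-category dictionary, the record format `(user_id, url)` is invented, and the rule that a label needs at least `min_hits` matching visits is an assumption, not something described in the talk.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical stand-in for the knowledge base: host -> content category.
SITE_CATEGORIES = {
    "news.example.com": "news",
    "video.example.com": "video",
    "shop.example.com": "shopping",
}


def label_users(records, min_hits=2):
    """Turn (user_id, url) records into {user_id: sorted list of labels}.

    A category becomes a label once the user has at least `min_hits`
    visits to hosts in that category (an assumed threshold).
    """
    hits = {}
    for user_id, url in records:
        host = urlparse(url).hostname
        category = SITE_CATEGORIES.get(host)
        if category is None:
            continue  # host not in the knowledge base, skip the record
        hits.setdefault(user_id, Counter())[category] += 1
    return {
        user: sorted(cat for cat, n in counts.items() if n >= min_hits)
        for user, counts in hits.items()
    }


records = [
    ("u1", "http://video.example.com/a"),
    ("u1", "http://video.example.com/b"),
    ("u1", "http://news.example.com/x"),
    ("u2", "http://shop.example.com/1"),
    ("u2", "http://unknown.example.org/"),
]
print(label_users(records))
```

At the scale mentioned in the talk this logic would run as a distributed job on the Hadoop platform instead of in a single process, but the lookup-and-count structure stays the same.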

Beyond processing raw data and analyzing Internet data, we also used mining tools, working closely with the business departments, to build a range of customer management models: a user identification model, a user evaluation model, terminal adaptation, card creation, card users, and others, nine types of models in all. These effectively support Unicom's sales transformation. Why do I call it a transformation? As others here may know, at the beginning of this year the company shifted its direction from adding new users toward maintaining its existing user base. Accurate and effective user maintenance has to be based on big data, and these models are the main means of achieving it.

I have just talked about the transformation of the company's business. Over the course of the year, the data collection, processing, and models I described have effectively supported the company's business transformation and innovation. Here are a few examples. One is mobile resale, which as we all know was major news in 2014. Through data collection and processing on the data platform, we can now ensure that within 30 minutes, or even 15 minutes, the data that mobile virtual operators need is pushed to their

During the World Cup this July, we also used our platform and data processing capabilities to run an effective World Cup marketing campaign. Customer maintenance, as I just said, is done through the models. Internet finance is another hot topic. If you follow the recent media, you may know that China Merchants Bank and Unicom have set up a joint venture called Zhaolian, which aims to launch an Internet finance business before the end of this year. Behind it are Unicom's data and CMB's support in risk control models, which together are used to develop an effective risk assessment model for Internet finance. We are now actively building this platform and developing the models.

One direction for big data applications is opening up and partnering with the outside world, and we are still working out how to use our data effectively in finance, advertising, and industry applications. One path we have found is to build an open data mining platform on which Unicom provides the data, storage, and computing power, and invites third-party partners to do the data mining and share the results. Our work with UnionPay Wisdom is another form of cooperation, mainly in the credit field: an online query service for verifying a user's identity and degree of trustworthiness. This joint project between China Unicom and UnionPay Wisdom will also launch before the end of the year.

Other industry applications include, for example, demographic analysis for the National Bureau of Statistics and automotive industry index reports. These are some typical internal and external big data applications.

Summarizing the work above, we find it falls into three levels. At the bottom is the infrastructure layer, where we make full use of the typical Internet architecture: distributed, x86-based, cloud computing. In the middle is the platform capability layer, which covers the new data collection methods, data processing, and mining capabilities I just described. The third is the value layer, or application layer, which develops and promotes value-oriented application services.

This three-tier architecture is fully in line with China Unicom's overall three-tier information architecture, which is divided into IaaS at the bottom, PaaS in the middle, and SaaS on top. The IaaS layer is cross-domain, the PaaS layer is organized by domain, and the top-level applications are built on these PaaS platforms. So you can see that the red part, the typical architecture of our data domain, is fully consistent with the overall architecture.

Besides big data, I am also responsible for cloud computing at China Unicom, so let me talk about how the two develop together. In building big data platforms, we follow the thinking of Internet cloud computing to achieve open capabilities, flexible support, and secure services. How can we combine the two areas effectively? First, as we promote the private cloud, especially the IaaS cloud management platform, we put all the hardware resources for big data under this cloud management platform, so that our private cloud management platform has a certain scale from the outset.

Second, how do we promote the use of our cloud platform? Many internal business units and external partners come to us for this or that data. Instead of providing simple data services, we want, in the future or even in the next few months, to combine data with the cloud platform. As I said, we can provide users not only with data but also with computing power, storage, and mining tools, all delivered together. This gradually builds a user base for Unicom's private cloud.

The final point concerns the PaaS level. I just mentioned three different PaaS clouds, but I left out the public services underneath them. PaaS services for data must be cross-domain, so we are discussing with the architects of the other domains how to settle relational databases, in-memory databases, distributed file systems, and so on into the PaaS layer as public services. In short, we hope to combine the development of cloud computing with the development of big data.

That was a review of the Unicom group's big data construction in 2014. Now let me introduce the major work we have planned for 2015. The first is platform construction. Building on the 1,200 nodes I mentioned, we hope in 2015 to further enhance the data coverage and data support capabilities of the headquarters big data platform, and to achieve, for the whole group, a single point of data collection, a single point of processing and conversion, a single point of data provision, and a single point of service support. This is the responsibility that Unicom's management has given our data center.

At the same time, the hybrid data architecture we have set up, which includes the Hadoop platform, provides efficient and flexible data support to the applications above it. How will the expanded scope and capability of data support be reflected concretely? In addition to the data already collected, we have begun collecting customer service data, which is typical unstructured data, especially customer service voice data, as well as the Internet access records of fixed broadband users, to supplement the mobile users' online records I mentioned.

We will also collect scarce network-side data, especially the data and track information in the PS data on the wireless side. At the same time, we are preparing to build a data collection and exchange layer beneath the three platforms, so that each newly collected data source can be directed to the appropriate data integration platform.

Today we have a 200-node MPP data cluster. We have run into some problems with it in data collection, so in the new year we plan to expand the MPP cluster and optimize it, especially for stability. We also need to improve the capabilities of the data management platform, especially for Unicom's provincial companies. As I said, in the IT domain Unicom advocates centralized construction, and we are considering the same in the data domain. Because the main work of the provincial branches is sales, customer service, and so on, the data has to sink down to support the front line. How will the future data system be built? We are discussing this with the provincial branches. Large provincial companies may integrate their applications logically with their existing data work, while small branches may connect directly to the headquarters big data open platform for their localized application development.

Last but not least, building an open platform for big data capabilities will facilitate our cooperation with external partners and create greater value from big data applications.

This picture may look a bit complicated. It shows the current status and architecture of our big data platform. At the bottom is the collection layer with the data sources. In the middle is the platform layer, with the MPP, DW, and Hadoop data platforms I just described. Above that is the service layer, and on top are the applications we have developed so far.

The second picture may look even more complicated. The red parts are what we plan to build in 2015. For example, in the first phase the Hadoop platform will focus on batch processing, mainly with HBase and hap. In the second phase we will gradually introduce Spark, sthin, stream processing, and data learning on Hadoop, and we will also focus on building the open platform.

Platform construction alone is not enough. We must talk about applications, because the value of big data is realized through applications. Our 2015 plan covers both internal and external applications. Internally there are four main types. The first is a customer maintenance platform, which is a natural next step. As I said, in 2014 we built various data models in the data center. The results of these models are provided to their users in a variety of ways, and those users need a platform tool to correlate and refine the data before they can arrive at an actual maintenance strategy, choose the best channel to push it through, and collect feedback on the effect. This is the kind of supporting platform we are going to build in 2015.

The second is intelligent voice analysis. I mentioned that we are collecting voice data from our 10010 customer service hotline. Once it is collected, we intend to make it a typical big data application. It is technically challenging, because it requires converting the speech to text, analyzing it, and then building applications on top of that analysis. The third is support for our centralized ERP: many of the ERP reports in GRP forecasting, especially the self-service reporting functions, are to be migrated to the hybrid big data cluster. The fourth is the company's 4G Operations Center. This new operations center, which I mentioned earlier, helps support the company's transformation and includes various analysis and monitoring tools for monitoring 4G operations nationwide.

As for external applications, the industries we have identified so far include Internet finance, automobiles, hotels, business districts, and e-commerce, and in some of these areas we have established initial cooperative relationships. In addition, we will vigorously promote the open platform for big data capabilities and give third-party partners the opportunity to demonstrate their capabilities. Finally, open data is also a hot topic. We take an active part in big data activities, work with our peers to explore the road to open data, and promote the improvement of data laws and regulations.

Here are some details on existing-customer operations and on the applications. I will not go through them one by one, and I do not know whether this material will be provided to you. For example, existing-customer operations work as follows: data is collected and processed into user portraits. The maintenance platform then takes the customer groups defined for a marketing campaign, segments them a second time, selects the push channel, and finally forms an end-to-end process for handling each user. Another example is voice analysis: unstructured voice files are loaded into the big data platform, where speech recognition and data modeling are performed, followed by analysis of the sources of customer service calls, confirmation, and so on.
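The existing-customer flow described above (portrait, second segmentation, channel selection, recording the outcome) can be sketched in a few lines. Everything concrete here is a hypothetical illustration: the dictionary fields, the `required_label` campaign key, the default `sms` channel, and the `respond` callback are assumptions and do not reflect Unicom's actual platform.

```python
def second_split(customers, campaign):
    """Narrow a campaign's customer group using portrait labels (assumed rule)."""
    return [c for c in customers if campaign["required_label"] in c["labels"]]


def pick_channel(customer):
    """Use the customer's preferred channel, defaulting to SMS (assumption)."""
    return customer.get("preferred_channel", "sms")


def run_campaign(customers, campaign, respond):
    """Push the campaign to each targeted customer and record the outcome.

    `respond` is a hypothetical callback standing in for the real-world
    response of a customer to an offer on a given channel.
    """
    outcomes = []
    for customer in second_split(customers, campaign):
        channel = pick_channel(customer)
        outcomes.append({
            "user": customer["id"],
            "channel": channel,
            "accepted": respond(customer, channel),
        })
    return outcomes


customers = [
    {"id": "u1", "labels": {"video"}, "preferred_channel": "app"},
    {"id": "u2", "labels": {"news"}},
    {"id": "u3", "labels": {"video", "news"}},
]
campaign = {"required_label": "video"}
outcomes = run_campaign(customers, campaign, lambda c, ch: ch == "app")
print(outcomes)
```

The recorded outcomes are what would feed back into the next round of modeling, which is why the sketch returns them as data and does not simply send the messages.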

We have launched pilot projects for this in 10 provinces. The main user of the intelligent voice analysis tool is the headquarters customer service center, with the provincial customer service centers as second-level users. The 10 pilot provinces are listed here. Together they account for more than 50% of Unicom's call volume nationwide. That is all for my presentation. First I looked back at the work we did on big data construction in 2014, and then I introduced some of our plans for 2015 at the application and platform levels. Thank you.

(Responsible editor: Mengyishan)
