Building a new generation of data centers with Hadoop


In recent years, more and more enterprises have focused on optimizing their business to gain a greater and more lasting competitive advantage. Doing so depends on analyzing and using the enterprise's existing data. To create more value, avoid more risk, and better optimize the business, building a new generation of data centers has been put on the agenda. The new-generation data center builds on the traditional data warehouse and the dynamic data warehouse. By data type, it can be divided into three kinds: relational data centers, non-relational data centers (based on Hadoop or enterprise content management), and hybrid data centers (big data platforms). This chapter covers the goals of building a new generation of data centers, the stages their development has gone through, and how to build hybrid and other data centers based on reference architectures, using examples from the banking and government industries.

Overview of building a new generation of data centers

The current business environment is marked by uncertainty and increasingly fierce competition, while customer loyalty continues to decline. Unless enterprises make full use of analytical capabilities, they will not be able to gain a long-term advantage. Fragmented use of analytical techniques rarely gives a company comprehensive insight or lets its business units collaborate effectively. More and more enterprises are therefore turning to the comprehensive use of analytical technology to gain a greater and more lasting competitive advantage. For example, banks want to know how to identify credit risk effectively, how to detect irregularities such as fraud and money laundering, how to cross-sell and promote more efficiently, and how to optimize customer maintenance and retention and improve performance assessment. Telecom companies want to know how to analyze market development and the competitive environment accurately, so that they can support market decisions with in-depth analysis, improve the precision of marketing campaigns, raise customer satisfaction, retain and nurture high-profit customers, and develop new business models. Insurance companies want to know which claimants are more likely to commit fraud and which customers are high-value, low-risk. All of this depends on enterprise data and on the analysis and use of existing data.

Data center goals

By summarizing the needs of different customers in many industries, we find that industries have some common goals in building a new generation of data centers, including the following points.

Share data across systems to eliminate information silos and improve data quality.

Build a single view of enterprise information, with unified management of and insight into structured, semi-structured, and unstructured data.

Provide sound mining, definition, and management of business models, and provide real-time decision support on that basis.

Provide accurate and effective management of customer characteristics, giving in-depth insight for customer segmentation, sales promotion, cross-selling, marketing, and customer maintenance and retention.

Build an enterprise-level data warehouse, master data management, enterprise content management, and big data management to provide unified data services for the enterprise.

Build a complete, unified metadata management system and formulate a comprehensive metadata management strategy, so that the enterprise has unified and efficient metadata management and exchange services.

Build a data management system that ensures data consistency and eliminates redundant, conflicting, and missing information.

Provide efficient, real-time, and accurate multidimensional analysis, statistical reports, ad hoc queries, dashboards, multimedia analysis, flow analysis, and content analysis to comprehensively support business operations analysis.

Provide concise, easy-to-use data mining and predictive analysis to support enterprise analysis.

Support collaborative work, a rule engine, and event handling so that applications can collaborate effectively on top of comprehensive analysis capabilities.

Provide sound IT security management, integrated monitoring, and enterprise asset management.

Alongside these goals, building a new generation of data centers also poses challenges: how to build a data-oriented culture, how to break down organizational barriers, how to control the schedule and risk of integration projects, and how to overcome the technical complexity of integration.

Data center development process

In the "Application agenda" era, enterprises built many business systems. To meet the demands of market competition, enterprise management, and regulation, they then began to build report query systems. Over time these report query systems became less and less able to meet enterprise needs: query performance was slow, reports were relatively fixed and could not keep up with flexible business requirements, and multidimensional analysis was not possible. Some enterprises therefore began to build BI systems with traditional data warehouse technology. They used ETL or ETCL tools to export, transform, clean, and load data; an operational data store (ODS) to hold detail data; data marts and data warehouse technology to store subject-oriented historical data; multidimensional analysis tools for front-end presentation; and either the mining engine supplied with the data warehouse tools or a separate data mining tool for predictive analysis. Compared with the earlier report query systems, traditional data warehouse technology has the following advantages.

Thorough data cleaning and transformation ensure the accuracy and consistency of ODS data.

Data warehouse technology improves the performance of BI systems.

Multidimensional analysis and presentation tools give customers comprehensive multidimensional analysis, statistical reports, and ad hoc queries.

Data mining technology helps customers carry out predictive analysis flexibly.
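The export, clean, and load flow described above can be illustrated with a minimal sketch. It is not taken from the source: the table names (`ods_transactions`, `mart_branch_sales`), the record fields, and the cleaning rules are all assumptions chosen for illustration, and an in-memory SQLite database stands in for a real ODS and data mart.

```python
import sqlite3

# Hypothetical raw records exported from a business system (illustrative only).
RAW_ROWS = [
    {"customer_id": " C001 ", "branch": "north", "amount": "120.50"},
    {"customer_id": "C002", "branch": "North", "amount": "80"},
    {"customer_id": "", "branch": "south", "amount": "15.00"},    # missing key: rejected
    {"customer_id": "C003", "branch": "SOUTH", "amount": "n/a"},  # bad amount: rejected
]


def clean(row):
    """Return a normalized (customer_id, branch, amount) tuple, or None if unusable."""
    customer_id = row["customer_id"].strip()
    if not customer_id:
        return None
    try:
        amount = float(row["amount"])
    except ValueError:
        return None
    return (customer_id, row["branch"].strip().lower(), amount)


def run_etl(rows):
    """Load cleaned detail rows into an ODS table, then build a subject-oriented mart."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ods_transactions (customer_id TEXT, branch TEXT, amount REAL)"
    )
    cleaned = [r for r in (clean(row) for row in rows) if r is not None]
    conn.executemany("INSERT INTO ods_transactions VALUES (?, ?, ?)", cleaned)
    # The mart aggregates detail data by subject (here, by branch).
    conn.execute(
        "CREATE TABLE mart_branch_sales AS "
        "SELECT branch, SUM(amount) AS total FROM ods_transactions GROUP BY branch"
    )
    mart = conn.execute(
        "SELECT branch, total FROM mart_branch_sales ORDER BY branch"
    ).fetchall()
    conn.close()
    return len(cleaned), mart


loaded_count, branch_totals = run_etl(RAW_ROWS)
```

Keeping the detail rows in the ODS table and deriving the mart from it mirrors the separation the text describes: cleaning happens once, and every downstream summary inherits the same consistent data.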

Traditional data warehouse technology has been adopted in more and more industries, where it has played an important role in improving operating efficiency, strengthening competitiveness, and reducing risk. As business has developed, however, enterprises using traditional data warehouse technology have run into the following new problems.

As competition intensifies, enterprises need to respond to market changes promptly, so the timeliness requirements on the data warehouse keep rising. A traditional data warehouse is updated periodically in batches, which makes those requirements hard to meet.

More and more front-line users need the data warehouse, but a traditional data warehouse usually serves only senior management or a small number of managers, so front-line users cannot access it. A bank, for example, may have thousands of customer managers and customer representatives who expect access to the data warehouse.

The data warehouse is increasingly expected to supply analysis capabilities where they are needed, but a traditional data warehouse does not push analysis capabilities proactively.

Enterprises therefore began to use dynamic data warehouse technology to solve these problems. Compared with traditional data warehouse technology, a dynamic data warehouse has the following advantages.

Front-line users can access the data warehouse dynamically (or in real time) to obtain the information they need.

Data is loaded dynamically. Whereas a traditional data warehouse loads data in batches, a dynamic data warehouse usually loads data continuously in near real time (mainly through incremental loading), at intervals as short as seconds, which fundamentally keeps the warehouse data current.
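Incremental loading is commonly implemented by remembering how far the previous load got and copying only newer rows. The following is a minimal sketch of that idea under stated assumptions: the source does not describe a specific mechanism, the `seq` change-sequence field and the `IncrementalLoader` class are hypothetical, and plain Python lists stand in for the source system and the warehouse.

```python
from dataclasses import dataclass, field


@dataclass
class IncrementalLoader:
    """Copies only rows newer than the last loaded change (a high-water mark)."""

    warehouse: list = field(default_factory=list)
    watermark: int = 0  # sequence number of the newest row already loaded

    def load_once(self, source_rows):
        """Load rows whose 'seq' exceeds the watermark; return how many were loaded."""
        new_rows = sorted(
            (r for r in source_rows if r["seq"] > self.watermark),
            key=lambda r: r["seq"],
        )
        for row in new_rows:
            self.warehouse.append(row)
            self.watermark = row["seq"]
        return len(new_rows)


source = [{"seq": 1, "value": "a"}, {"seq": 2, "value": "b"}]
loader = IncrementalLoader()
first = loader.load_once(source)   # both existing rows are new to the warehouse
source.append({"seq": 3, "value": "c"})
second = loader.load_once(source)  # only the row added since the last run
third = loader.load_once(source)   # nothing has changed, so nothing is loaded
```

In a real deployment `load_once` would run on a short timer or be triggered by change capture, so the gap between a source change and its arrival in the warehouse stays at the seconds level the text mentions.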

Analysis capabilities are delivered to business systems in an event-driven, proactive way. For example, when a bank's credit risk manager is approving a loan request, information about the applicant's risk rating is pushed to the manager. The dynamic data warehouse reference architecture is shown in Figure 3-1.

Figure 3-1. Dynamic Data Warehouse Reference Architecture
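The event-driven push described above, where a risk rating is delivered when a loan approval begins, can be sketched with a small publish/subscribe dispatcher. This is an illustration only and not the architecture in Figure 3-1: the `EventBus` class, the `loan_approval_started` event name, and the `RISK_RATINGS` lookup are all hypothetical.

```python
from collections import defaultdict


class EventBus:
    """Minimal in-process publish/subscribe dispatcher."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        # Call every handler registered for this event type and collect the results.
        return [handler(payload) for handler in self._handlers[event_type]]


# Hypothetical risk ratings held in the warehouse, keyed by applicant id.
RISK_RATINGS = {"A-100": "low", "A-200": "high"}


def push_risk_rating(event):
    """Look up the applicant's rating and return the message pushed to the approver."""
    rating = RISK_RATINGS.get(event["applicant_id"], "unknown")
    return {"applicant_id": event["applicant_id"], "risk_rating": rating}


bus = EventBus()
bus.subscribe("loan_approval_started", push_risk_rating)
pushed = bus.publish("loan_approval_started", {"applicant_id": "A-200"})
```

The business system only announces that an approval has started; the warehouse side decides what analysis to send back, which is the reversal of roles that distinguishes proactive push from an on-demand query.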

Figure 3-2. The hierarchical pattern of the pace

Building a new generation of data centers makes intelligent insight possible across industries. In transport, examples include real-time traffic flow optimization, bus route optimization, and route recommendations based on traffic forecasts; in banking, integrated anti-fraud, anti-money-laundering, and risk management; in retail, predicting customers' purchase intentions. Examples of in-depth analysis by industry are shown in Figure 3-3.

Figure 3-3. Conduct in-depth analysis in various industries
