Application framework of Hadoop platform in financial banking
Shijiangyan
I. Current status of financial banking
With the development of the financial banking industry and improvements in network communication infrastructure, informatization in banking has become widespread. At the same time, the rapid development of Internet technology and applications has produced many new payment methods, and the volume of business data held by the banking industry is growing sharply. Financial banking is about to enter the big data age.
Because of the inherent limitations of relational databases, this volume of data puts great pressure on the traditional relational database model. The financial banking industry currently responds in two ways: 1. Increase the machine performance and storage space of the core system to improve its capacity for processing business data; 2. Back up historical data offline to reduce the amount of data stored in the core system, which relieves pressure on the core system and improves its capacity for processing business data. This strategy has the following deficiencies: 1. Increasing machine performance and storage space directly raises the operation and maintenance costs of the core system; 2. With a large amount of data stored offline, customers cannot quickly obtain their transaction information, which lowers customer satisfaction and leads to loss of customers; 3. The bank cannot analyze its full business data and so cannot correctly judge the direction in which the banking industry is developing, which harms its competitiveness and growth.
II. The development of Hadoop technology
The Hadoop platform architecture overturns and renews the traditional architecture. It provides low-cost storage for massive data, fully supports distributed computing, and supports advanced data mining algorithm models, taking big data mining applications to a new level.
Hadoop technology is widely used in the Internet and e-commerce industries, where it provides low-cost storage of massive data together with efficient data computation and analysis. At present, Alibaba Group uses Hadoop technology for data storage and dynamic transaction analysis of Taobao commodities, which has brought it huge profits. The advantages of Hadoop technology in the big data age are clear, and more and more companies are using it to solve the big data problems they face.
III. Application framework of Hadoop technology in financial banking
Given these characteristics, Hadoop technology can be used to store the banking industry's offline data, and corresponding algorithms can be developed to mine and analyze that data, increasing the value that banks obtain from their historical data.
At present, the basic logical structure of a bank's systems consists of the peripheral system, the front-end business system, and the core business system, as shown in the following figure.
Peripheral system: Responsible for interacting directly with customers and providing business services. All systems that deliver banking services to customers can be classified as peripheral systems.
Front-end business system: A relay for business transaction data. It receives transaction data from the peripheral system, forwards it to the appropriate core system for processing according to the transaction code, and returns the processing result from the core system to the peripheral system.
Core business system: Responsible for executing all transaction business.
The basic idea for a bank adopting Hadoop platform technology is to keep the original system architecture unchanged and add a Hadoop platform system at the core system layer. The Hadoop platform system stores backups of the core system's historical data, provides a data query function to other systems, and provides data mining functions suited to the characteristics of the stored data.
After the Hadoop platform system is added, the basic logical architecture of the bank's systems is as shown in the following illustration.
Peripheral system: Unchanged.
Front-end business system: According to the business code, transfers some of the peripheral system's query services to the Hadoop platform system for processing, and then returns the processing results to the peripheral system.
Core business system: Backs up the required core data to the Hadoop platform system on a schedule, so that the Hadoop platform system can serve those query requirements.
Hadoop platform system: Based on business requirements, uses the historical data imported from the core system to process business transactions and returns the results to the peripheral system through the front-end business system. It can also return processing results to the core system's data warehouse for use by report functions.
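The routing step described above, in which the system sitting between the peripheral and core systems forwards each request according to its transaction code, can be sketched roughly as follows. This is only an illustration under assumptions: the transaction codes, the function names, and the dictionary-shaped request are all hypothetical and do not come from any real banking system.

```python
# Hypothetical transaction codes for historical-data queries; the article
# does not list real codes.
HISTORY_QUERY_CODES = {"Q1001", "Q1002"}


def core_system(request: dict) -> dict:
    """Stand-in for the core business system (real-time transactions)."""
    return {"handled_by": "core", "code": request["code"]}


def hadoop_platform(request: dict) -> dict:
    """Stand-in for the Hadoop platform system (historical queries)."""
    return {"handled_by": "hadoop", "code": request["code"]}


def route(request: dict) -> dict:
    """Pick a back end by transaction code and return its result."""
    if request["code"] in HISTORY_QUERY_CODES:
        return hadoop_platform(request)
    return core_system(request)
```

Because only the routing table changes, the peripheral system keeps sending the same requests as before, which matches the goal of leaving the original architecture untouched.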
To meet the service demands of the financial field, the Hadoop platform system is designed using the MVC pattern. The upper layer of the system obtains resources from external systems through the interface module and presents processing results through the display module. The intermediate processing layer provides different business processing modules for different business requirements, performing data processing and data mining to generate the data that is needed. The bottom layer uses the Hadoop platform for large-scale data storage, providing an HBase database and unstructured data storage.
The commonly used functional module structure is shown in the following figure.
Each functional module is specified below.
Source Data Module
Its main function is to supply the source data that the system processes. In financial banking, this source data is the core business data.
Interface Module
Its main function is to provide an appropriate data import method for each data source and data format.
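One simple way to organize such an interface module is a registry that maps each format to its own import function. The sketch below is illustrative only: the two formats (CSV and JSON lines), the field names, and the function names are assumptions, and a real deployment would write the parsed records into the platform rather than return them.

```python
import csv
import io
import json
from typing import Callable, Dict, List

Record = Dict[str, str]


def parse_csv(text: str) -> List[Record]:
    """Parse CSV text with a header row into a list of records."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def parse_json_lines(text: str) -> List[Record]:
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Registry: adding a new source format means adding one entry here.
IMPORTERS: Dict[str, Callable[[str], List[Record]]] = {
    "csv": parse_csv,
    "jsonl": parse_json_lines,
}


def import_source(fmt: str, text: str) -> List[Record]:
    """Dispatch to the importer registered for the given format."""
    if fmt not in IMPORTERS:
        raise ValueError(f"no importer registered for format {fmt!r}")
    return IMPORTERS[fmt](text)
```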
Functional Application Module
Its main function is to provide the processing modules required by business processing and system operation. The functional application module includes data mining algorithms, business processing flows, and so on.
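As an illustration of the kind of data mining algorithm such a module might host, the sketch below runs a map and reduce pass in plain Python to flag accounts that make several deposits just under a reporting threshold on the same day. It is not the author's algorithm: the rule, the threshold values, and the record layout are hypothetical, and on a real cluster the two phases would run as distributed jobs rather than in one process.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical parameters chosen for illustration only.
REPORT_THRESHOLD = 50_000  # a single deposit at or above this is reported anyway
NEAR_LOW = 45_000          # "just below the threshold" starts here
MIN_COUNT = 3              # flag when this many such deposits occur in one day

Txn = Tuple[str, str, int]  # (account, date "YYYY-MM-DD", amount)
Key = Tuple[str, str]       # (account, date)


def map_phase(txns: Iterable[Txn]) -> Iterator[Tuple[Key, int]]:
    """Emit ((account, date), 1) for each deposit just below the threshold."""
    for account, date, amount in txns:
        if NEAR_LOW <= amount < REPORT_THRESHOLD:
            yield (account, date), 1


def reduce_phase(pairs: Iterable[Tuple[Key, int]]) -> Dict[Key, int]:
    """Sum the emitted counts per (account, date) key."""
    counts: Dict[Key, int] = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)


def suspicious(txns: Iterable[Txn]) -> List[Key]:
    """Return the (account, date) keys that reach MIN_COUNT, sorted."""
    counts = reduce_phase(map_phase(txns))
    return sorted(key for key, n in counts.items() if n >= MIN_COUNT)
```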
Data Module
Its main function is to provide the HBase database, unified storage management for unstructured data, the HDFS file system, and multi-replica backup storage management for data.
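HBase keeps rows sorted by row key, so the choice of key largely decides how fast an account history query can be. The article does not describe its key design; the sketch below assumes one common approach, a key made of the account number followed by a reversed timestamp, so that the newest transactions for an account sort first and can be read with a short prefix scan. The `SortedStore` class is an in-memory stand-in written for this example and is not the HBase client API.

```python
import bisect
from typing import List

MAX_TS = 10**13 - 1  # hypothetical upper bound on millisecond timestamps


def row_key(account: str, ts_ms: int) -> str:
    """Account id plus a reversed, zero-padded timestamp (newest sorts first)."""
    if not 0 <= ts_ms <= MAX_TS:
        raise ValueError("timestamp out of range")
    return f"{account}#{MAX_TS - ts_ms:013d}"


class SortedStore:
    """In-memory stand-in for a table kept sorted by row key."""

    def __init__(self) -> None:
        self._keys: List[str] = []
        self._values: List[dict] = []

    def put(self, key: str, value: dict) -> None:
        """Insert a row at its sorted position (duplicates are not merged)."""
        i = bisect.bisect_left(self._keys, key)
        self._keys.insert(i, key)
        self._values.insert(i, value)

    def scan_prefix(self, prefix: str, limit: int) -> List[dict]:
        """Return up to `limit` rows whose key starts with `prefix`, in key order."""
        i = bisect.bisect_left(self._keys, prefix)
        rows: List[dict] = []
        while (i < len(self._keys) and len(rows) < limit
               and self._keys[i].startswith(prefix)):
            rows.append(self._values[i])
            i += 1
        return rows


def recent_history(store: SortedStore, account: str, limit: int) -> List[dict]:
    """Most recent transactions for one account, newest first."""
    return store.scan_prefix(account + "#", limit)
```

The `#` separator keeps account `A1` from matching rows of account `A10`, since the scan prefix is `A1#` rather than `A1`.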
Display Module
Its main function is to display processing results on web pages. It can also provide other ways of presenting data, according to the requirements of the original system.
Financial banking places very high requirements on data storage security, so the system must be designed with remote disaster-recovery backup storage. The Hadoop platform system software should be deployed on clusters in different data centers, using a primary/standby cluster pattern. The common physical deployment structure is shown in the following illustration.
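From the point of view of a caller, a primary/standby deployment can be pictured as a query that goes to the primary cluster first and falls back to the standby cluster only when the primary is unavailable. The sketch below is a simplified illustration: the exception type and the callables that stand for the two clusters are hypothetical, and it says nothing about how data is replicated between sites, which the article does not describe.

```python
from typing import Callable


class ClusterUnavailable(Exception):
    """Raised by a cluster stand-in when it cannot serve a query."""


def query_with_failover(
    query: str,
    primary: Callable[[str], dict],
    standby: Callable[[str], dict],
) -> dict:
    """Try the primary cluster; use the standby only if the primary is down."""
    try:
        return primary(query)
    except ClusterUnavailable:
        return standby(query)
```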
IV. Architectural advantages of Hadoop technology
Introducing the above framework into the financial banking industry offers the following advantages.
1. Take full advantage of the storage capability of Hadoop platform technology. The Hadoop platform can provide PB-level data storage, so all the business data generated by the bank can be stored in the Hadoop platform system.
2. Make full use of the Hadoop platform's ability to search massive data quickly. With trillions of records and search results returned in milliseconds, users can be given real-time access to transaction data for any period, which improves customer satisfaction, supports a customer-centric approach, and makes the bank more competitive.
3. Make full use of the data mining capability of Hadoop platform technology. Data mining algorithms can be written according to business needs, using transaction data to quickly locate transaction records involved in illegal money laundering and providing strong technical support for supervision.
4. Use the Hadoop platform system to take over some of the core system's resource-consuming transactions (for example, account history printing and query transactions), so that the core system can better handle real-time transaction business and give full play to the advantages of traditional databases, so as to ensure that the financial banking industry's IT information systems to
At present, Tian Yun Big Data has successfully applied this architecture to a bank's historical data query system, returning query results for the transaction history of all of the bank's accounts in milliseconds. Hadoop platform technology will therefore provide a strong technical guarantee for the financial banking industry as the big data age arrives.
About the author: Shijiangyan holds a master's degree in computer science and has long worked on the development of storage software and big data technology, with experience as a big data solution architect and project manager for several telecom operators and the financial industry. Currently a project manager and architect at Cloud Base Tian Yun Big Data.