1 Overview
1.1 Problem Statement
Currently, many applications store data in a database and read and process that data from it. In business processing, these applications are commonly divided into two types: data producer applications ("producers" for short) and data consumer applications ("consumers" for short). Producers are applications that write data into the database, while consumers are applications that read data from it. In the traditional design and development mode, consumers obtain database data by continuously polling the database. This design can meet application requirements, but its resource consumption is high and its efficiency is low: to query the database, an application must establish a database connection and occupy the database's SQL buffer, memory buffer, and CPU time slices, as well as network bandwidth. If the data volume is large and the application does not fetch data incrementally, polling and reading the latest result set into the application takes a long time. As producers and consumers become more and more complex, obtaining the latest data by polling becomes increasingly unsatisfactory in terms of both resource consumption and efficiency.
1.2 Solution Overview
To solve the above problems, this paper proposes a database message service based on log analysis technology. By monitoring and analyzing Oracle log files from outside the database, it captures the change events of the Oracle database management system, including DDL, DML, and other SQL operations. It converts the captured event content into messages that the message middleware recognizes, according to the format and data content customized in advance by the application, and then delivers the messages through the message middleware to the consumer applications that have subscribed to them.
Consumer applications that care about database changes can use this system to obtain change information in real time, avoiding repeated database connections and polling. This improves real-time performance and reduces database resource occupation.
The database message service based on log analysis technology is particularly suitable for scenarios that demand fast real-time response to database change events occurring at unpredictable times. It also provides a loosely coupled and efficient communication method between applications, which facilitates the development of distributed software: a message producer application can operate on the database while the consumer applications are offline, and the resulting messages are retained in the message middleware; conversely, a message consumer application can read messages from the message middleware while the producer is offline.
2 Research and Preliminary Comparison of Database Change Capture Technologies
Researchers have long worked on the problem of capturing database changes, and many technologies for capturing database change events have emerged. The main methods for capturing Oracle database change events include database polling, database triggers, data change capture based on program control logic, and Oracle log capture based on LogMiner. These technologies are introduced below and then compared with the log monitoring and analysis technology described in this paper.
2.1 Database Polling Technology
The problem statement at the beginning of this paper already introduced this way of capturing database changes and pointed out its limitations and disadvantages: it consumes too many resources and is relatively inefficient.
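For concreteness, the following is a minimal sketch of this polling approach, assuming a JDBC connection to Oracle and a hypothetical ORDERS table with a LAST_MODIFIED column; the connection details are placeholders.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Timestamp;

public class NaivePoller {
    public static void main(String[] args) throws Exception {
        Timestamp lastSeen = new Timestamp(0);
        try (Connection con = DriverManager.getConnection(
                "jdbc:oracle:thin:@//dbhost:1521/ORCL", "app", "secret")) {
            while (true) {
                // every poll costs a round trip plus SQL buffer, memory and CPU time
                // on the database server, even when nothing has changed
                try (PreparedStatement ps = con.prepareStatement(
                        "SELECT id, status, last_modified FROM orders WHERE last_modified > ?")) {
                    ps.setTimestamp(1, lastSeen);
                    try (ResultSet rs = ps.executeQuery()) {
                        while (rs.next()) {
                            lastSeen = rs.getTimestamp("last_modified");
                            System.out.println("changed row: " + rs.getLong("id"));
                        }
                    }
                }
                Thread.sleep(5_000);   // fixed polling interval, hence limited real-time behaviour
            }
        }
    }
}
```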
2.2 Database Trigger Technology
Database trigger technology [1] creates triggers for the data of interest. When the data changes, the database management system activates the corresponding trigger according to the type of database operation (Insert, Update, or Delete), so the code for processing the database change event can be placed in the trigger body. This technology is relatively simple to use and is widely adopted in current applications.
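As an illustration only, the sketch below creates one such per-table trigger through JDBC; the ORDERS table, the CHANGE_EVENTS notification table, and the connection details are hypothetical, and in the scheme described above similar Update and Delete triggers would be needed for every table of interest.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class CreateInsertTrigger {
    public static void main(String[] args) throws Exception {
        // the trigger body is where the change-handling code goes; here it simply
        // records the change in a notification table
        String ddl =
            "CREATE OR REPLACE TRIGGER trg_orders_ins " +
            "AFTER INSERT ON orders FOR EACH ROW " +
            "BEGIN " +
            "  INSERT INTO change_events(table_name, op_type, changed_at) " +
            "  VALUES ('ORDERS', 'INSERT', SYSDATE); " +
            "END;";
        try (Connection con = DriverManager.getConnection(
                 "jdbc:oracle:thin:@//dbhost:1521/ORCL", "app", "secret");
             Statement st = con.createStatement()) {
            st.execute(ddl);   // repeated per operation type and per table in this scheme
        }
    }
}
```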
This method has the advantages of fast response and high efficiency, but it shares the high resource usage of database polling technology. If there are many consumer applications interested in different data sources, Insert, Update, and Delete triggers must be created for each table; if the database contains 1,000 tables of interest to consumer applications, 3,000 triggers need to be created. Once the data producer operates on the database frequently, this heavily occupies the memory and CPU resources of the database and degrades the performance of the database server.
Because this technology relies on the trigger mechanism inside the database, change capture is carried out by internal database processes through a callback mechanism and therefore consumes database resources.
2.3 Data Change Capture Based on Program Control Logic
Data change capture based on program control logic creates a tracking table for the tables to be tracked and records their changes in it. The structure of the tracking table is relatively simple; it mainly records the name of the changed table, the operation type, and other lightweight data. The tracking program, that is, the database event capture program, polls this tracking table. For example, early versions of the well-known database synchronization product Data Mirror used this technology to capture database changes.
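The sketch below shows what such a tracking table and its incremental query might look like; the CHANGE_TRACK schema, column names, and connection details are assumptions, and the producer-side logic that fills the table is omitted.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class TrackingTableSketch {
    public static void main(String[] args) throws Exception {
        try (Connection con = DriverManager.getConnection(
                "jdbc:oracle:thin:@//dbhost:1521/ORCL", "app", "secret")) {
            // lightweight tracking table: which table changed, how, and when
            try (Statement st = con.createStatement()) {
                st.execute("CREATE TABLE change_track (" +
                           "  track_id   NUMBER PRIMARY KEY," +
                           "  table_name VARCHAR2(30)," +
                           "  op_type    VARCHAR2(10)," +
                           "  changed_at DATE)");
            }
            // the capture program polls only this small table, not the data tables
            long lastId = 0;
            try (PreparedStatement ps = con.prepareStatement(
                    "SELECT track_id, table_name, op_type FROM change_track " +
                    "WHERE track_id > ? ORDER BY track_id")) {
                ps.setLong(1, lastId);
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        lastId = rs.getLong("track_id");
                        System.out.printf("%s on %s%n",
                                rs.getString("op_type"), rs.getString("table_name"));
                    }
                }
            }
        }
    }
}
```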
This technology bypasses the internal mechanisms of the database management system to capture data changes. Its advantage is that it saves database resources: although it is still a polling technique, it consumes fewer resources than ordinary database polling because it queries not the entire data tables but only the tracking table that records the changes. Its disadvantage is that, as a polling technique, it still occupies CPU time slices; moreover, if data producers perform frequent database operations, writing to and reading from the tracking table becomes inefficient.
2.4 Oracle Log Capture Technology Based on LogMiner
LogMiner [2] is a practical analysis tool that Oracle has provided since version 8.0. It makes it easy to obtain the specific content of Oracle redo log files; in particular, it can analyze all DML statements (Insert, Update, Delete, and so on) that operate on the database, and it can also derive the SQL statements needed to roll them back. The tool is especially suitable for debugging, auditing, or rolling back a specific transaction. Many applications currently use LogMiner to obtain database change events.
Compared with the preceding technologies, LogMiner has the advantage of capturing database change information from the database's log files, so it does not occupy the resources of the database system.
The disadvantage of LogMiner is that it first requires exporting the Oracle data dictionary to a file, which makes it more suitable for applications whose data dictionary does not change; if the data dictionary changes, the LogMiner configuration process must be executed again. In addition, it exposes only PL/SQL interfaces to the application, which is inconvenient for application development.
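For reference, the following sketch drives the LogMiner steps just described from JDBC (export the data dictionary, register a log file, start the session, and read the reconstructed SQL); the file paths, log file name, and account are placeholders, and suitable LogMiner privileges are assumed.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class LogMinerSketch {
    public static void main(String[] args) throws Exception {
        try (Connection con = DriverManager.getConnection(
                 "jdbc:oracle:thin:@//dbhost:1521/ORCL", "logmnr_admin", "secret");
             Statement st = con.createStatement()) {

            // 1. export the data dictionary to a flat file
            //    (must be redone whenever the dictionary changes)
            st.execute("BEGIN DBMS_LOGMNR_D.BUILD(" +
                       "dictionary_filename => 'dict.ora', " +
                       "dictionary_location => '/oracle/logmnr'); END;");

            // 2. register the redo or archive log file to analyse
            st.execute("BEGIN DBMS_LOGMNR.ADD_LOGFILE(" +
                       "logfilename => '/oracle/oradata/ORCL/redo01.log', " +
                       "options => DBMS_LOGMNR.NEW); END;");

            // 3. start the analysis session using that dictionary
            st.execute("BEGIN DBMS_LOGMNR.START_LOGMNR(" +
                       "dictfilename => '/oracle/logmnr/dict.ora'); END;");

            // 4. the reconstructed DML and its rollback statements are exposed as a view
            try (ResultSet rs = st.executeQuery(
                    "SELECT operation, seg_name, sql_redo, sql_undo FROM V$LOGMNR_CONTENTS")) {
                while (rs.next()) {
                    System.out.println(rs.getString("operation") + " on "
                            + rs.getString("seg_name") + ": " + rs.getString("sql_redo"));
                }
            }

            st.execute("BEGIN DBMS_LOGMNR.END_LOGMNR; END;");
        }
    }
}
```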
2.5 Log Monitoring and Analysis Technology
The log monitoring and analysis technology introduced in this paper extracts the latest SQL statements and data by monitoring the Oracle redo log files and analyzing them itself, and it provides a custom access interface to the upper layer for capturing data change events.
Compared with the four methods above, the log monitoring and analysis technology is developed in-house, so the access interfaces it exposes can be controlled freely; that is, it can deliver either SQL statements or the changed data. At the same time, data changes are monitored from outside the database without occupying database resources, and because only the latest changed log content is analyzed and captured, data capture is highly efficient. Analysis of actual application results shows that, among all of these data change capture technologies, log monitoring and analysis receives the best overall evaluation.
3 Architecture Design and Function Overview
The figure below shows the architecture of the database message service based on log analysis technology. It consists of data producer applications, data consumer applications, message middleware, database message conversion, log monitoring and analysis, and the database management system.
The data producer applications, data consumer applications, and database management system are inherent parts of the application system, while the message middleware, database message conversion, and log monitoring and analysis are the key components that constitute the database message service based on log analysis technology. The following introduction therefore focuses on the message middleware and on the two core functional modules, database message conversion and log monitoring and analysis:
Database Message Service Architecture Based on Log Analysis Technology
3.1 Message Middleware
Message-oriented middleware is a mechanism for loosely coupled communication between applications. It works through either a subscribe/publish mechanism or an active pull mechanism. In the subscribe/publish mechanism, an application registers a subscription to the message queues it is interested in and then continues with its own processing; when the message-oriented middleware places a new message in a queue, it can actively push the data to the application through the Windows callback mechanism. The middleware also provides a pull mechanism, in which it simply waits for applications to read the messages.
The main purpose of introducing message-oriented middleware is to publish messages on demand, because not every application requires every message; rather, specific applications need to receive specific messages.
There are many message-oriented middleware products, including mature commercial software such as the IBM MQ series and Microsoft MSMQ, as well as domestic products from vendors such as Kingdee and Dongfang Tong. This system directly adopts the message-oriented middleware developed by the 15th Research Institute of China Electronics Technology Group Corporation.
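The in-house middleware's API is not published here; purely as an illustration of the push (subscribe/publish) and pull styles described above, the sketch below uses the standard JMS API, with JNDI names and queue names as placeholders.

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.naming.InitialContext;

public class ChangeMessageConsumer {
    public static void main(String[] args) throws Exception {
        InitialContext ctx = new InitialContext();            // JNDI names are placeholders
        ConnectionFactory factory = (ConnectionFactory) ctx.lookup("jms/ConnectionFactory");
        Queue queue = (Queue) ctx.lookup("jms/OrdersChangeQueue");

        Connection con = factory.createConnection();
        Session session = con.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageConsumer consumer = session.createConsumer(queue);

        // push (subscribe/publish) style: the middleware calls back when a message arrives
        consumer.setMessageListener(msg -> {
            try {
                System.out.println("pushed change message: " + ((TextMessage) msg).getText());
            } catch (JMSException e) {
                e.printStackTrace();
            }
        });
        con.start();

        // pull style alternative: block until a message is available
        // Message msg = consumer.receive();

        Thread.sleep(60_000);   // keep the connection alive while listening
        con.close();
    }
}
```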
3.2 Database Message Conversion
This part listens for database change events from the log analysis service and publishes them to message queues according to the subscription conditions configured in advance, so that data consumers can obtain the latest database change messages from the message queues in real time. It mainly consists of message listening, message forwarding, and message mapping management.
3.2.1 Message Listening
The main function of this module is to listen to the underlying log monitoring and analysis service. Through the callback mechanism of that service, it obtains the latest captured data in real time.
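Since the paper does not publish the service's programming interface, the sketch below only illustrates the callback-registration idea; every type and method name in it is hypothetical.

```java
import java.util.Map;

/** Callback invoked by the log monitoring and analysis service for each extracted change. */
interface ChangeEventListener {
    void onChange(ChangeEvent event);
}

/** Carrier of one extracted change (the fields are assumptions). */
record ChangeEvent(String tableName, String operation, Map<String, String> newValues) {}

/** Placeholder for the lower-layer log monitoring and analysis service. */
interface LogAnalysisService {
    void registerCallback(ChangeEventListener listener);
}

/** The message listening module registers itself and receives changes as they are captured. */
public class MessageListeningModule implements ChangeEventListener {
    public MessageListeningModule(LogAnalysisService service) {
        service.registerCallback(this);            // hand the callback to the lower layer
    }

    @Override
    public void onChange(ChangeEvent event) {
        // pass the event on to the message forwarding module (see 3.2.2)
        System.out.println("captured " + event.operation() + " on " + event.tableName());
    }
}
```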
3.2.2 Message Forwarding
According to the pre-configured message mappings, this module converts the data change information received by the message listening module into the format required by the data consumer application, encapsulates it in the message format of the message middleware, and finally sends it to the specified message queue.
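As a sketch of this conversion-and-forwarding step, the class below joins the mapped field values with the configured delimiter and sends the result as a text message. It again stands on the JMS API rather than the system's actual middleware, and the queue name, delimiter, and field values in the example are made up.

```java
import javax.jms.JMSException;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.jms.TextMessage;
import java.util.LinkedHashMap;
import java.util.Map;

public class MessageForwarder {
    private final Session session;   // JMS session stands in for the in-house middleware API

    public MessageForwarder(Session session) { this.session = session; }

    /** Assemble the delimited body defined by the mapping and send it to the mapped queue. */
    public void forward(String queueName, String delimiter, Map<String, String> fieldValues)
            throws JMSException {
        String body = String.join(delimiter, fieldValues.values());   // consumer-defined format
        Queue queue = session.createQueue(queueName);
        TextMessage msg = session.createTextMessage(body);
        MessageProducer producer = session.createProducer(queue);
        producer.send(msg);
        producer.close();
    }

    // example: an UPDATE of ORDERS mapped to queue ORDERS_CHANGES with "|" as the delimiter
    static void example(Session session) throws JMSException {
        Map<String, String> values = new LinkedHashMap<>();   // preserves configured field order
        values.put("ID", "42");
        values.put("STATUS", "SHIPPED");
        new MessageForwarder(session).forward("ORDERS_CHANGES", "|", values);
    }
}
```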
3.2.3 Message Mapping Management
The main function of this module is to provide consumer applications with a tool for configuring messages. The consumer selects the database tables and field sets of interest, determines the format of the message content it recognizes (the message forwarding service assembles the message content in this format), determines the name of the message queue to which the messages are to be sent, maintains the mapping from tables to message queues, and specifies the message subscription mode (real-time push or active pull).
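One mapping entry maintained by this tool might be represented as follows; the record and its field names are assumptions that simply mirror the attributes listed above.

```java
import java.util.List;

/** Subscription mode offered by the mapping tool. */
enum SubscriptionMode { PUSH, PULL }

/** One table-to-queue mapping entry as maintained by the configuration tool (hypothetical). */
public record MessageMapping(
        String tableName,          // source database table
        List<String> fields,       // subset of columns the consumer is interested in
        String queueName,          // destination message queue
        String delimiter,          // separator used when assembling the message body
        SubscriptionMode mode) {   // real-time push or active pull

    /** Example entry: ORDERS changes go to queue ORDERS_CHANGES as "|"-delimited text. */
    public static MessageMapping example() {
        return new MessageMapping("ORDERS", List.of("ID", "STATUS", "AMOUNT"),
                "ORDERS_CHANGES", "|", SubscriptionMode.PUSH);
    }
}
```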
3.3 Log Monitoring and Analysis
The main function of the log monitoring and analysis module is to monitor the redo logs and archive logs of the Oracle database, analyze the changes, and convert the changed log content into a data format that applications can understand. It consists of two main modules: real-time log monitoring, and log content analysis and extraction.
3.3.1 Real-Time Log Monitoring
Since Oracle redo logs and archive logs exist as ordinary Windows operating system files, changes to these two kinds of log files can be monitored through mechanisms provided by Windows. File monitoring is implemented mainly through Windows shell APIs such as SHChangeNotifyRegister and SHChangeNotifyDeregister. First, the SHChangeNotifyRegister function is called to add the monitoring process to the system message notification chain, and the SetWindowLong function is used to replace the default message handling function of the monitoring process. When a system notification message is received, the content reported by the system is obtained from the passed parameters and the log analysis logic is started. When the program exits, the SHChangeNotifyDeregister function is called to remove it from the system message notification chain.
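The system itself uses the Windows shell change-notification API described above. To illustrate the same register-for-notification idea without reproducing Win32 code, the snippet below uses java.nio's WatchService instead of the Windows API; the log directory path is a placeholder.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;

public class RedoLogWatcher {
    public static void main(String[] args) throws Exception {
        Path logDir = Paths.get("C:/oracle/oradata/ORCL");   // directory holding redo/archive logs
        try (WatchService watcher = FileSystems.getDefault().newWatchService()) {
            logDir.register(watcher, StandardWatchEventKinds.ENTRY_CREATE,
                                     StandardWatchEventKinds.ENTRY_MODIFY);
            while (true) {
                WatchKey key = watcher.take();               // blocks until the OS reports a change
                for (WatchEvent<?> event : key.pollEvents()) {
                    System.out.println("log file changed: " + event.context());
                    // hand over to the log content analysis and extraction module (see 3.3.2)
                }
                key.reset();
            }
        }
    }
}
```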
3.3.2 Log Content Analysis and Extraction
The main function of this module is to read the Oracle redo log and archive log files and obtain the data blocks that changed between the previous read and the current read; it then analyzes these data blocks, extracts the SQL statements and changed data content grouped by transaction, and returns the extracted content to the application (here, the database message conversion module) through a callback function.
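The actual redo-block parsing is specific to Oracle's internal log format and is not reproduced here. The sketch below only shows the surrounding read-new-content-and-call-back pattern, under the simplifying assumption that new log content appears as appended bytes (real redo logs are preallocated and written cyclically, so the real module tracks changed blocks rather than file growth).

```java
import java.io.RandomAccessFile;
import java.util.function.Consumer;

public class LogContentExtractor {
    private long lastOffset = 0;                     // position reached on the previous read

    /** Read the newly written part of the log file and pass it to the analysis callback. */
    public void extractNewContent(String logFilePath, Consumer<byte[]> analyzeBlocks) throws Exception {
        try (RandomAccessFile raf = new RandomAccessFile(logFilePath, "r")) {
            long length = raf.length();
            if (length <= lastOffset) {
                return;                              // nothing new since the last read
            }
            byte[] delta = new byte[(int) (length - lastOffset)];
            raf.seek(lastOffset);
            raf.readFully(delta);
            lastOffset = length;
            // the callback groups the blocks by transaction, rebuilds the SQL statements
            // and returns them to the database message conversion layer (see 3.2.1)
            analyzeBlocks.accept(delta);
        }
    }
}
```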
4 Key Technologies
From the features above, the key technologies for implementing the system consist of three parts: establishing the overall operating process from the database to the application, establishing the message mapping between database tables and message queues, and analyzing and extracting logs.
4.1 System Operation Process
The following is the workflow of the database message service based on log analysis technology:
Flowchart of the database message service based on log analysis technology
During system operation, the data producer application performs Insert/Update/Delete operations on the database through the database access interface, and the Oracle database writes these operations into its log files. After the log monitoring and analysis module detects changes in the log files, it analyzes and extracts their content; the database message conversion module converts the monitored content into the content format predefined by the application and sends it to the message queue; the message middleware then delivers the messages in the queue to the data consumer application. This completes one full cycle of capturing and forwarding a database message.
4.2 Message Mapping Technology
The message mapping relationships of the system are established with a message mapping tool. On its visual interface, the user can not only create message queues in the message middleware but also browse all Oracle data objects, select the fields of interest within those data objects, determine the message format structure, and specify the message destination (that is, determine the mapping relationship between a table and a message queue). The mapping between database tables and message queues is shown in Table 1:
Table 1 Mapping between database tables and message queues
Database table name | Message queue name | Message format definition (delimiter between field values) | Subscription method