Database Message Service Based on Oracle Log Analysis Technology


1 Overview

1.1 Problem Statement

Currently, many applications store data in a database and read and process data from it. In business processing, applications are often divided into two types: data producer applications ("producers" for short), which write data into the database, and data consumer applications ("consumers" for short), which read data from the database. In the traditional design and development mode, consumers obtain database data by continuously polling the database. This design can meet application requirements, but its resource consumption is high and its efficiency is low: to query the database, an application must establish a database connection and occupy the database's SQL buffer, memory buffer, and CPU time slices, as well as network bandwidth. If the data volume is large and the application does not fetch data incrementally, polling for the latest result set and reading it into the application takes a long time. As producers and consumers grow more complex, obtaining the latest data through polling becomes increasingly unsatisfactory in terms of both resource consumption and efficiency.
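A minimal sketch of the polling pattern described above, assuming a hypothetical changes_of_interest table with an id watermark column; the connection URL, credentials, and all table and column names are illustrative only:

```java
import java.sql.*;

// Minimal polling consumer: repeatedly re-queries the database for new rows.
// Every poll costs a round trip plus SQL parsing, buffer cache, and CPU on the server.
public class PollingConsumer {
    public static void main(String[] args) throws Exception {
        long lastSeenId = 0;  // watermark: highest id already processed
        try (Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@//dbhost:1521/ORCL", "scott", "tiger")) {
            while (true) {
                try (PreparedStatement ps = conn.prepareStatement(
                        "SELECT id, payload FROM changes_of_interest WHERE id > ? ORDER BY id")) {
                    ps.setLong(1, lastSeenId);
                    try (ResultSet rs = ps.executeQuery()) {
                        while (rs.next()) {
                            lastSeenId = rs.getLong("id");
                            process(rs.getString("payload"));
                        }
                    }
                }
                Thread.sleep(5000);  // poll interval: a latency/load trade-off
            }
        }
    }
    static void process(String payload) { System.out.println("changed: " + payload); }
}
```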

1.2 Solution Overview

To solve the above problems, this paper proposes a database message service based on log analysis technology. By monitoring and analyzing Oracle log files, it captures change events of the Oracle database from outside the database, including DDL, DML, and other SQL operations in the Oracle database management system. It converts the captured event content into messages that the message middleware recognizes, according to the format and data content customized in advance by the application, and then delivers the messages through the message middleware to the consumer applications that subscribed to them.

Consumer applications that care about database changes can use this system to obtain change information in real time, avoiding repeated database connections and polling. This improves real-time performance and reduces database resource consumption.

The database message service based on log analysis technology is particularly suitable for scenarios that demand fast response to database change events, which occur at unpredictable times. It also provides a loosely coupled and efficient communication method between applications, which facilitates the development of distributed software: a producer application can operate on the database while the consumer applications are offline, and the resulting messages are retained in the message middleware; conversely, a consumer application can read messages from the message middleware when the producer is no longer online.

2 Survey and Comparison of Database Change Capture Technologies

Scholars have long been committed to solving the problem of database change capture, and many technologies for capturing database change events have emerged. The most common methods for capturing Oracle database change events include database polling, database triggers, data change capture based on program control logic, and Oracle log capture based on LogMiner. We introduce these technologies separately below, and finally compare them with the log monitoring and analysis technology described in this article.

2.1 Database Polling Technology

This method was introduced with the problem statement at the beginning of this article, along with its limitations and disadvantages: it consumes too many resources and is relatively inefficient.

2.2 Database Trigger Technology

Database trigger technology [1] creates triggers for the data of interest. When the data changes, the database management system activates the corresponding trigger according to the operation type (Insert/Update/Delete), so processing code for the database change event can be added to the trigger body. This technology is relatively simple to use and is widely applied.
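A sketch of installing such a trigger over JDBC; the change_log table, the trigger name, and the connection details are illustrative assumptions. The article counts separate Insert, Update, and Delete triggers per table; a single combined trigger is shown here for brevity:

```java
import java.sql.*;

// Installs a row-level trigger that records each change into a change_log table.
public class InstallTrigger {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:oracle:thin:@//dbhost:1521/ORCL", "scott", "tiger");
             Statement st = conn.createStatement()) {
            st.execute("CREATE TABLE change_log (" +
                       "table_name VARCHAR2(30), op_type CHAR(1), changed_at DATE)");
            // One trigger (or three) per monitored table; with 1,000 tables this multiplies quickly
            st.execute(
                "CREATE OR REPLACE TRIGGER emp_change_trg " +
                "AFTER INSERT OR UPDATE OR DELETE ON emp " +
                "FOR EACH ROW " +
                "BEGIN " +
                "  INSERT INTO change_log VALUES ('EMP', " +
                "    CASE WHEN INSERTING THEN 'I' WHEN UPDATING THEN 'U' ELSE 'D' END, " +
                "    SYSDATE); " +
                "END;");
        }
    }
}
```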

This method responds quickly and efficiently. However, it shares the high resource usage of database polling: if there are many consumer applications interested in different data sources, Insert, Update, and Delete triggers must be created for each table. If consumer applications are interested in 1,000 tables in the database, 3,000 triggers must be created. Once data producers operate on the database frequently, this heavily consumes database memory and CPU resources and degrades the performance of the database server.

Because this technology relies on the database's internal trigger mechanism, event capture is carried out by internal database processes through a callback mechanism, and it therefore occupies database resources.

2.3 Data Change Capture Based on Program Control Logic

Data change capture based on program control logic creates a tracking table that records the changes of the tables to be tracked. The structure of the tracking table is relatively simple: it mainly records the changed table name, the operation type, and other lightweight data. The tracking program, that is, the database event capture program, polls the tracking table. For example, early versions of the well-known database synchronization product Data Mirror used this technology to capture database changes.
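A sketch of the producer side of this technique, assuming a hypothetical track_table; the producer writes its data change and a lightweight tracking row in the same transaction, and the capture program then polls only the small tracking table:

```java
import java.sql.*;

// Program-controlled change tracking: data change and tracking row are committed atomically.
public class TrackedProducer {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@//dbhost:1521/ORCL", "scott", "tiger")) {
            conn.setAutoCommit(false);
            try (PreparedStatement upd = conn.prepareStatement(
                     "UPDATE emp SET sal = sal * 1.1 WHERE empno = ?");
                 PreparedStatement trk = conn.prepareStatement(
                     "INSERT INTO track_table(table_name, op_type, changed_at) " +
                     "VALUES (?, ?, SYSDATE)")) {
                upd.setInt(1, 7369);
                upd.executeUpdate();
                trk.setString(1, "EMP");   // only lightweight metadata is recorded
                trk.setString(2, "U");
                trk.executeUpdate();
                conn.commit();             // change and tracking row become visible together
            }
        }
    }
}
```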

This technology bypasses the internal mechanisms of the database management system to achieve data capture. Its advantage is that it saves database resources: although it is also a polling technology, it consumes fewer resources than ordinary database polling, because it queries not the entire data table but only the tracking table that records the data changes. Its disadvantage is that, as a polling technology, it still occupies CPU time slices; moreover, if data producers perform frequent database operations, the writes to and reads from the tracking table become a bottleneck and reduce efficiency.

2.4 Oracle Log Capture Based on LogMiner

LogMiner [2] is a useful analysis tool that Oracle has provided since version 8.0. It can be used to conveniently obtain the specific content of Oracle redo log files; in particular, it can analyze all DML statements (Insert, Update, Delete, and so on) executed against the database, and it can also derive the SQL statements needed to roll them back. This tool is especially suitable for debugging, auditing, or rolling back a specific transaction. Many applications currently use LogMiner to obtain database change events.
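A minimal LogMiner session driven over JDBC, assuming the data dictionary has already been exported with DBMS_LOGMNR_D.BUILD; the file paths, credentials, and schema filter are illustrative:

```java
import java.sql.*;

// Registers a redo log, starts a LogMiner session, and reads reconstructed SQL.
public class LogMinerDemo {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@//dbhost:1521/ORCL", "sys as sysdba", "password");
             Statement st = conn.createStatement()) {
            // Register the redo log file to analyze
            st.execute("BEGIN DBMS_LOGMNR.ADD_LOGFILE(" +
                       "LogFileName => '/oracle/oradata/orcl/redo01.log', " +
                       "Options => DBMS_LOGMNR.NEW); END;");
            // Start the analysis session against the exported dictionary file
            st.execute("BEGIN DBMS_LOGMNR.START_LOGMNR(" +
                       "DictFileName => '/oracle/dict/dictionary.ora'); END;");
            // SQL_REDO reconstructs the original DML; SQL_UNDO would roll it back
            try (ResultSet rs = st.executeQuery(
                    "SELECT scn, operation, sql_redo FROM v$logmnr_contents " +
                    "WHERE seg_owner = 'SCOTT'")) {
                while (rs.next()) {
                    System.out.println(rs.getLong(1) + " " + rs.getString(2)
                                       + " " + rs.getString(3));
                }
            }
            st.execute("BEGIN DBMS_LOGMNR.END_LOGMNR; END;");
        }
    }
}
```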

Compared with the previous technologies, LogMiner has the advantage of capturing change information from the database's log files, so it does not occupy the resources of the database system itself.

The disadvantage of LogMiner is that it first requires exporting the Oracle data dictionary to a file, which makes it more suitable for applications whose data dictionary does not change; if the data dictionary changes, the LogMiner configuration process must be repeated. In addition, it exposes only PL/SQL interfaces to applications, which is inconvenient for application development.

2.5 Log Monitoring and Analysis Technology

The log monitoring and analysis technology introduced in this article extracts the latest SQL statements and data by monitoring Oracle's redo log files and analyzing them directly, and it provides a custom access interface through which upper layers can capture data change events.

Compared with the above four methods, the log monitoring and analysis technology is developed through our own research. It gives full control over the interfaces provided externally, that is, it can deliver either SQL statements or the changed data; data changes are monitored from outside the database, without occupying database resources; and because only the latest changed log content is analyzed and captured, data capture is highly efficient. Analysis of actual application results shows that log monitoring and analysis technology has the best overall evaluation among the data change capture technologies.

3 Architecture Design and Function Overview

Figure 1 shows the architecture of the database message service based on log analysis technology. It consists of data producer applications, data consumer applications, message middleware, database message conversion, log monitoring and analysis, and the database management system.

The data producer application, the data consumer application, and the database management system are inherent parts of the application system, while the message middleware, database message conversion, and log monitoring and analysis are the key components that constitute the database message service based on log analysis technology. The following introduction therefore focuses on the message middleware and the two core functional modules, database message conversion and log monitoring and analysis:

Figure 1 Database Message Service Architecture Based on Log Analysis Technology


3.1 Message Middleware

Message-oriented middleware is a mechanism for loosely coupled communication between applications. It works through either a subscribe/publish (push) mechanism or an active pull mechanism. In the subscribe/publish mechanism, an application registers a subscription to the message queues it is interested in and then continues its own processing; once the message middleware places a message in the queue, it actively pushes the data to the application through the Windows callback mechanism. The middleware also provides a pull mechanism, in which messages wait in the queue until the application reads them.
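The two consumption styles can be illustrated with the standard JMS API, used here only as a stand-in for the proprietary middleware this system adopts; the queue name mirrors Table 1 below:

```java
import javax.jms.*;

// Push vs. pull consumption against a message queue.
public class SubscribeDemo {
    public static void consume(ConnectionFactory factory) throws JMSException {
        Connection conn = factory.createConnection();
        Session session = conn.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Queue queue = session.createQueue("Queue_EMP");
        MessageConsumer consumer = session.createConsumer(queue);

        // Push: register a listener, then continue with other work;
        // the middleware calls back when a message arrives.
        consumer.setMessageListener(msg -> {
            try { System.out.println(((TextMessage) msg).getText()); }
            catch (JMSException e) { e.printStackTrace(); }
        });
        conn.start();

        // Pull (alternative to the listener): block until a message is available.
        // Message msg = consumer.receive();
    }
}
```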

The main purpose of introducing message-oriented middleware is to publish messages as needed, because not all applications require all messages, but specific applications need to receive specific messages.

There are many mature commercial message-oriented middleware products, such as the IBM MQ series and Microsoft MSMQ, as well as domestic products such as Kingdee and Dongfang Tong. This system directly adopts the message-oriented middleware developed by the 15th Institute of China Electronics Technology Group Corporation.

3.2 Database Message Conversion

This component listens for database change events from the log analysis service and publishes them to message queues according to subscription conditions configured in advance, so that data consumers can obtain the latest database change messages from the message queues in real time. It mainly includes message listening, message forwarding, and message mapping management.

3.2.1 Message Listening

The main function of this module is to listen to the underlying log monitoring and analysis service. By registering a callback function with that service, it obtains the latest captured data in real time.
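A sketch of what the callback contract between the log analysis service and the message listening module might look like; all type and method names here are assumptions:

```java
import java.util.Map;

// Callback contract: the log analysis service invokes onChange for every change it extracts.
public interface ChangeListener {

    // One captured database change, as delivered by the callback.
    record ChangeEvent(String tableName,                 // e.g. "SCOTT.EMP"
                       String opType,                    // INSERT / UPDATE / DELETE
                       Map<String, String> columnValues) {}

    void onChange(ChangeEvent event);
}
```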

3.2.2 Message Forwarding

According to the pre-configured mappings, this module converts the change information received by the message listening module into the format required by the data consumer application, encapsulates it in the message format of the message middleware, and finally sends it to the specified message queue.
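A sketch of this forwarding step, reusing the ChangeListener interface sketched in 3.2.1 and JMS as a stand-in for the middleware; the mapping fields mirror Table 1, and all names are assumptions:

```java
import javax.jms.*;

// Looks up the mapping for the changed table, assembles the message body with
// the configured delimiter, and sends it to the mapped queue.
public class MessageForwarder implements ChangeListener {
    // tableName -> (queueName, delimiter), as configured by mapping management
    record Mapping(String queueName, String delimiter) {}

    private final java.util.Map<String, Mapping> mappings =
            java.util.Map.of("SCOTT.EMP", new Mapping("Queue_EMP", "/"));
    private final Session session;

    public MessageForwarder(Session session) { this.session = session; }

    @Override
    public void onChange(ChangeEvent event) {
        Mapping m = mappings.get(event.tableName());
        if (m == null) return;                       // no consumer subscribed to this table
        String body = String.join(m.delimiter(), event.columnValues().values());
        try {
            MessageProducer producer = session.createProducer(
                    session.createQueue(m.queueName()));
            producer.send(session.createTextMessage(body));
        } catch (JMSException e) {
            throw new RuntimeException(e);
        }
    }
}
```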

3.2.3 Message Mapping Management

The main function of this module is to provide consumer applications with a tool for configuring messages. The consumer selects the database tables and field sets of interest, determines the format of the message content it recognizes (the message forwarding service assembles message content in this format), determines the destination message queue for the messages, maintains the mapping from tables to message queues, and specifies the subscription mode (real-time push or active pull).

3.3 Log Monitoring and Analysis

The main function of the log monitoring and analysis module is to monitor the redo logs and archive logs of the Oracle database, analyze the changes, and convert the changed log content into data formats that applications can understand. It consists of two main modules: real-time log monitoring, and log content analysis and extraction.

3.3.1 Real-Time Log Monitoring

Since Oracle redo logs and archive logs exist as Windows operating system files, changes to these two types of log files can be monitored through mechanisms provided by Windows. File monitoring is implemented mainly through the Windows APIs SHChangeNotifyRegister and SHChangeNotifyDeregister. First, SHChangeNotifyRegister is called to add the monitoring process to the system message notification chain, and SetWindowLong is used to replace the default message handling function of the monitoring window. When a system notification message is received, the notified content is obtained from the passed parameters and the log analysis logic is started. When the program exits, SHChangeNotifyDeregister is called to remove the process from the system message notification chain.
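The article monitors files through the Windows shell notification APIs; the following is only a portable analogue of the same idea using java.nio's WatchService, with an illustrative redo log directory:

```java
import java.nio.file.*;

// Blocks until a file in the log directory changes, then hands off to analysis.
public class RedoLogWatcher {
    public static void main(String[] args) throws Exception {
        Path logDir = Paths.get("C:/oracle/oradata/orcl");
        WatchService watcher = FileSystems.getDefault().newWatchService();
        logDir.register(watcher, StandardWatchEventKinds.ENTRY_MODIFY);
        while (true) {
            WatchKey key = watcher.take();          // block until a file changes
            for (WatchEvent<?> ev : key.pollEvents()) {
                Path changed = (Path) ev.context();
                if (changed.toString().endsWith(".log")) {
                    System.out.println("redo log changed: " + changed);
                    // hand off to the log analysis logic here
                }
            }
            key.reset();                            // re-arm the key
        }
    }
}
```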

3.3.2 Log Content Analysis and Extraction

The main function of this module is to read the Oracle redo log and archive log files and obtain the data blocks that changed between the last read and the current read. It then analyzes the acquired data blocks, extracts the SQL statements and changed data organized by transaction, and returns the extracted content to the application (here, the database message conversion module) through a callback function.
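A sketch of the incremental read, reusing the ChangeListener callback from 3.2.1: only the 512-byte blocks written since the previous pass are read and handed to the analyzer. The offset bookkeeping and the analyzer hook are illustrative assumptions (in particular, a real online redo log is written cyclically, which this sketch ignores):

```java
import java.io.RandomAccessFile;

// Reads only the blocks appended since the previous pass and hands each to the analyzer.
public class IncrementalLogReader {
    private static final int BLOCK_SIZE = 512;
    private long lastOffset = 0;    // file position reached on the previous read

    public void readNewBlocks(String logFile, ChangeListener listener) throws Exception {
        try (RandomAccessFile raf = new RandomAccessFile(logFile, "r")) {
            raf.seek(lastOffset);
            byte[] block = new byte[BLOCK_SIZE];
            while (raf.read(block) == BLOCK_SIZE) {
                analyzeBlock(block, listener);   // hand-off point to the analysis logic
                lastOffset += BLOCK_SIZE;
            }
        }
    }

    private void analyzeBlock(byte[] block, ChangeListener listener) {
        /* parse redo records by transaction; call listener.onChange(...) per change */
    }
}
```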

4 Key Technologies

From the above features, we can see that the key technologies for implementing the system include three parts: establishing the overall operation flow from the database to the application, establishing the message mapping between database tables and message queues, and analyzing and extracting logs.

4.1 System Operation Process

The following is the workflow of the database message service based on log analysis technology:




Figure 2 flowchart of database Message Service Based on Log Analysis Technology

During system operation, the data producer application performs Insert/Update/Delete operations on the database through the database access interface, and the Oracle database writes the operations to its log files. After the log monitoring and analysis module detects the changes in the log files, it analyzes the files and extracts their new content; the database message conversion module converts the extracted content into the content format predefined by the application and sends it to the message queue; the message middleware then delivers the messages in the queue to the data consumer application. This completes one full cycle of capturing and forwarding a database message.

4.2 Message Mapping Technology

The message mapping relationships of the system are established through the message mapping tool. On its visual interface, you can not only create message queues in the message middleware, but also browse all Oracle data objects, select the fields of interest for each data object, determine the message format structure, and specify the message destination (that is, the mapping relationship between a table and a message queue). An example mapping between a database table and a message queue is shown in Table 1:

Table 1

Database Table Name | Message Queue Name | Message Format Definition (field-value delimiter) | Subscription Method
SCOTT.EMP           | Queue_EMP          | /                                                 | Real-time push

This mapping shows that the consumer subscribes to changes of the table SCOTT.EMP: change information is sent to the queue Queue_EMP, the fields of the message content are separated by "/", and the subscription mode is real-time push.

4.3 Log Analysis Technology

4.3.1 Introduction to Database Redo Log Files

A log file is a collection of data change records. Logs fully record all database activity, including the history of data operations, DML (Insert, Update, Delete) and DDL (Create, Alter, Drop), as well as the history of transactions: Begin, Commit, and Rollback. Data files are written only after a transaction is committed, according to the modification information recorded in the log, which realizes transaction atomicity. All Oracle data operations must be logged; therefore, data changes can be obtained from outside Oracle by analyzing and recovering the redo logs.

Oracle redo logs are classified into two types: online redo logs and archived redo logs. Every Oracle database requires online redo logs; they support normal database operation. Whenever the database changes, the change records are written to the online log. The online log consists of two or more files that record current database changes cyclically: when one log file is full, the log management process switches to the next, and after all log files have been written, the first file is overwritten to start a new cycle.

Archived redo logs record the change history, while online logs record the latest changes. By using both types of log files together, an Oracle database can be restored to any historical state. A log sequence number (LSN) uniquely identifies each online and archived log.

4.4 Oracle Redo Log File Structure Analysis

Since the redo log [3] is an undocumented Oracle technology, there is no public description of the redo log structure, and apart from the LogMiner package that ships with the Oracle database, no third-party software can access redo logs [4]. Given the defects of LogMiner described above, we carried out long-term analysis of the Oracle redo log mechanism, Oracle log dumps, and Oracle redo log files, and summarized the structure of Oracle redo log files.

4.4.1 Preliminary Structure of Redo Log Files

Table 2 shows the preliminary structure of the physical and logical blocks of an Oracle redo log file:

Table 2

Physical block no. | 0-1         | 2                  | 3                   | 4                   | 5                   | ... | M
Logical block no.  | 0           | 1                  | 2                   | 3                   | 4                   | ... | N-2
Storage content    | File header | Redo record header | Redo record block 1 | Redo record block 2 | Redo record block 3 | ... | Redo record block N

From Table 2, we can relate Oracle's logical blocks to the physical file blocks as follows:

1) Physical block: the redo log file is physically divided into 512-byte blocks. An SQL operation with little content can be recorded within a single block, but for SQL operations involving varchar(4000), BLOB, or CLOB types, 512 bytes is not enough, so blocks must be extended; this leads to the concept of logical blocks.

2) Logical block: Oracle manages the log in logical blocks. Each logical block is 512 bytes or an integer multiple of 512 bytes.

3) File header: the first and second physical blocks of the log file form the file header.

4) File body: the rest of the file, excluding the first two physical blocks.

5) Logical block header: starting from the second block, each logical block has a 20-byte block header (the redo log header). The block header consists of the redo log sequence number (sequence), the block number (block number), the block update time (write time), the offset of the first redo record, and a checksum. The block header gives the redo block address of the redo records; a parsing sketch follows this list.

6) Logical block content: records the transaction information and physical operation information of the Oracle database.
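A sketch of parsing the 20-byte logical block header described in item 5. Oracle does not document the redo format, so the field order, field widths, and byte order below are assumptions for illustration only:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Decodes the five header fields named in item 5 from the start of a 512-byte block.
public class BlockHeader {
    long sequence;      // redo log sequence number
    long blockNumber;   // logical block number
    long writeTime;     // block update time
    long recordOffset;  // offset of the first redo record in this block
    long checksum;      // block checksum

    static BlockHeader parse(byte[] block) {
        ByteBuffer buf = ByteBuffer.wrap(block, 0, 20)
                                   .order(ByteOrder.LITTLE_ENDIAN); // byte order assumed
        BlockHeader h = new BlockHeader();
        h.sequence     = Integer.toUnsignedLong(buf.getInt());
        h.blockNumber  = Integer.toUnsignedLong(buf.getInt());
        h.writeTime    = Integer.toUnsignedLong(buf.getInt());
        h.recordOffset = Integer.toUnsignedLong(buf.getInt());
        h.checksum     = Integer.toUnsignedLong(buf.getInt());
        return h;
    }
}
```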
