Design of Embedded Memory Database Engine

Last Update:2018-06-01 Source: Internet

Author: User

1 Development Status of Embedded Memory Database Technology

Database theory and technology have developed rapidly and are now used almost everywhere in today's information society. Databases built on the three classic data models (hierarchical, network, and relational) have been very successful in traditional business and management transaction processing. They are poorly suited, however, to modern engineering and time-critical applications, and this gap led to the emergence of embedded real-time databases. In a real-time application, the operation logic of a transaction (operation types, ordering, and so on), its data set, and the structure, behavior, and timing correlations of that data are all analyzed before the transaction runs. In a disk-based database, though, data I/O is the main reason transaction execution time is variable and hard to predict. A real-time database therefore needs to use a large main memory as its primary storage medium, so that a transaction performs no I/O while it is active, its execution time can be predicted more accurately, and the timing constraints of real-time transactions can be met. Two problems then have to be solved: placing data appropriately, and exchanging data between memory and external storage in a timely way. As memory technology has advanced, memory database technology has matured and is now also widely used in non-real-time systems.

A memory database (main memory database, MMDB) keeps the master copy (the "working version") of the database in memory, which greatly improves system performance. However, because all operations are applied directly to the in-memory master copy, the database is vulnerable to corruption caused by operating system and application software errors. In addition, because transactions themselves require no I/O, the I/O performed for recovery purposes (logging and backups, for example) becomes a prominent cost in a memory database system. The recovery mechanism therefore has a major impact on the system, and recovery is both more complex and more critical than in a traditional disk database. Data recovery is the key to the reliability and practicality of memory databases, and recovery techniques have become one of the most active topics in MMDB research.

2 Memory Database Definition

The definition of a memory database should not depend on the size of memory, the amount of I/O needed to access data, when data enters memory, or how it is kept there. It should state only that the database resides in memory (rather than on disk) and that data access by transactions (as opposed to the system) involves memory only. The essential feature is that the "master copy" or "working version" is memory resident: an active transaction deals only with the in-memory copy of the database. This clearly requires a large amount of memory, but it does not require the entire database to be in memory at all times, so a memory database system still has to handle I/O. A traditional disk database cannot be regarded as an MMDB even if its buffer is large enough to hold all the data, because it is designed around the characteristics of disks and the assumption that the database resides on disk. Its index structures are still designed for disk access, for example, and data access still has to go through the buffer manager. Organizing and managing a memory database requires data structures and algorithms suited to memory, along with new policies and mechanisms for data organization and placement, database access, data exchange between memory and external storage, query processing and optimization, concurrency control, and database recovery.

The memory database is a relatively new research area. Drawing on various sources, the following definition is given:

Definition: Let DB be a database and let DBM(t) be the set of DB's data that is in memory at time t, so that DBM(t) ⊆ DB. Let TS be the set of all transactions and AT(t) the set of transactions active at time t, so that AT(t) ⊆ TS. For a transaction T, let DT(t) be the set of data that T operates on at time t, so that DT(t) ⊆ DB. If, at every time t,

    for all T ∈ AT(t): DT(t) ⊆ DBM(t)

then DB is called a memory database, abbreviated MMDB.

According to this definition, the "working version" (or the entire database) of MMDB is resident in the memory, and no data I/O exists between internal and external memory during the execution of any transaction. Obviously, it requires a certain amount of memory capacity, but does not require the entire database to be resident in the memory.
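The condition in the definition can be checked mechanically for a single point in time. The following is a minimal sketch, assuming data items are represented as strings and that the function and variable names are purely illustrative (none of them come from the original design):

```python
from typing import Dict, Set


def satisfies_mmdb_condition(resident: Set[str],
                             active: Dict[str, Set[str]]) -> bool:
    """Check the definition at one instant.

    resident: data items currently in memory (the memory-resident set).
    active:   maps each active transaction to the data items it operates on.
    Returns True if every active transaction touches only resident data.
    """
    return all(data <= resident for data in active.values())


# Hypothetical snapshot: item "d" exists in the database but is not in memory.
resident_now = {"a", "b", "c"}
active_now = {"T1": {"a", "b"}, "T2": {"c"}}
holds = satisfies_mmdb_condition(resident_now, active_now)

# A transaction that needs "d" would force I/O, so the condition fails.
active_now["T3"] = {"d"}
violated = not satisfies_mmdb_condition(resident_now, active_now)
```

Note that the check passes even though "d" is not resident, which matches the remark that the whole database need not be in memory.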

3 Memory Database Features

3.1 Data Storage Organization and Management

Logically, an MMDB consists of two parts: the memory version and the external storage version. Main memory is volatile. It holds the "working version" of the MMDB and is logically divided into several partitions.

Each partition stores the data of one relation and is physically made up of several linked blocks. A block is a contiguous, fixed-length area. It is the unit of I/O between memory and external storage, and also the unit of memory allocation, memory reclamation, and MMDB recovery. Indexes are stored separately from data records.

NV-RAM (non-volatile RAM) offers memory-speed reads and writes and, when maintained by a backup battery, does not lose its contents. It is expensive and inconvenient to plug in and remove, but it is widely used in embedded systems, where it can be paired with Flash-RAM and act as the Flash-RAM write buffer. NV-RAM is an extension of main memory and can be implemented with a UPS, a solid-state disk, or a disk cache. The data in main memory and the data in NV-RAM are together referred to as the "memory version" of the MMDB.

Disk storage holds the database data that is not kept in memory and also serves as the backup used for database recovery. This is called the "external storage version". To simplify data exchange between memory and external storage, disk data can be logically divided into fixed-length blocks of the same size as memory blocks, with indexes of the same type built over them.
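The partition and block layout described above can be sketched as follows. This is an illustrative model only: the block size, the class names, and the dictionary standing in for disk are all assumptions, not details from the original design.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

BLOCK_SIZE = 4096  # assumed fixed block length, shared by memory and disk


@dataclass
class Block:
    """A contiguous fixed-length area: the unit of I/O and allocation."""
    block_id: int
    data: bytearray = field(default_factory=lambda: bytearray(BLOCK_SIZE))
    dirty: bool = False


@dataclass
class Partition:
    """Holds the data of one relation as a list of blocks."""
    relation: str
    blocks: List[Block] = field(default_factory=list)

    def allocate_block(self) -> Block:
        block = Block(block_id=len(self.blocks))
        self.blocks.append(block)
        return block


class ExternalStore:
    """Stand-in for the external storage version, using same-size blocks."""

    def __init__(self) -> None:
        self._blocks: Dict[Tuple[str, int], bytes] = {}

    def write_block(self, relation: str, block: Block) -> None:
        # Copy the whole block out; partial writes are never performed.
        self._blocks[(relation, block.block_id)] = bytes(block.data)
        block.dirty = False

    def read_block(self, relation: str, block_id: int) -> Block:
        return Block(block_id, bytearray(self._blocks[(relation, block_id)]))


# Hypothetical usage: one relation, one block, written out and read back.
part = Partition("subscriber")
blk = part.allocate_block()
blk.data[0:5] = b"hello"
blk.dirty = True
store = ExternalStore()
store.write_block(part.relation, blk)
restored = store.read_block("subscriber", 0)
```

Because memory blocks and disk blocks have the same length, a block can be copied in either direction without any repacking.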

3.2 Transaction Processing

When a memory database is used in a non-real-time system, transaction processing is similar to that of a traditional database, apart from logging and system recovery. When it is used in a real-time system, however, timing constraints mean that the transaction processing of traditional databases no longer applies. The traditional ACID transaction concept and model are not suitable for real-time transactions, which have distinctive features such as timeliness and interdependence. As with traditional transactions, the correctness of real-time transactions covers two aspects, correctness of the database state and correctness of transaction execution, but the meaning and content of each are quite different. For real-time transactions, database state correctness includes internal consistency and temporal consistency, and transaction execution correctness includes correctness of results, behavior, structure, and timing. Transactions must also be scheduled with suitable real-time scheduling algorithms. The memory database engine discussed in this article targets non-real-time systems.

3.3 System Recovery

A database backup is a copy of the database together with some control information, which can be used for restoration whenever a fault occurs. Backups minimize data loss by allowing a failed database to be rebuilt from the backup copy through the restore process. Several kinds of failure can require a database to be recovered, including statement failure, user error, process failure, database instance failure, and media failure, although not all of them require manual intervention. In a memory database system, all operations are applied directly to the in-memory master copy, so the database is vulnerable to corruption caused by operating system and application software errors. As a result, MMDB recovery techniques such as backup, checkpointing, and restart are more complex than in a conventional disk-resident database (DRDB). A great deal of research has gone into architecture, transaction commit, the log system, backup, and checkpoint algorithms.

Log management is a vital part of the recovery mechanism. Because memory is volatile, logs should be kept on separate, safe media such as disk or non-volatile memory. The I/O performed on the log affects MMDB performance to some extent and can become the bottleneck that limits transaction throughput. Various solutions to the log bottleneck have been studied, such as building non-volatile memory to hold part of the log, using the "group commit" technique, and using shadow memory. For checkpoints, the usual approach to improving efficiency is to let checkpointing run concurrently with transaction processing.

To improve speed and overall performance, MMDB systems often use additional hardware such as non-volatile memory, dedicated log processors, and checkpoint processors to support efficient, fast data recovery.
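The idea behind group commit is to pay for one synchronous log write on behalf of several transactions instead of one write per transaction. The sketch below is a simplified, single-threaded illustration under stated assumptions: the class name, the group size, and the newline-delimited record format are all hypothetical, and a real engine would also delay acknowledging each commit until its group has been flushed.

```python
import os
import tempfile
from typing import List


class GroupCommitLog:
    """Buffers commit records and forces them to disk one group at a time."""

    def __init__(self, path: str, group_size: int = 4) -> None:
        self._file = open(path, "ab")
        self._group_size = group_size
        self._pending: List[bytes] = []
        self.flush_count = 0  # number of synchronous writes actually issued

    def commit(self, record: bytes) -> None:
        self._pending.append(record)
        if len(self._pending) >= self._group_size:
            self.flush()

    def flush(self) -> None:
        if not self._pending:
            return
        # One write and one fsync cover every record in the group.
        self._file.write(b"".join(r + b"\n" for r in self._pending))
        self._file.flush()
        os.fsync(self._file.fileno())
        self._pending.clear()
        self.flush_count += 1

    def close(self) -> None:
        self.flush()
        self._file.close()


# Hypothetical usage: ten commits with a group size of four.
log_path = os.path.join(tempfile.mkdtemp(), "redo.log")
log = GroupCommitLog(log_path, group_size=4)
for i in range(10):
    log.commit(f"txn-{i}".encode())
log.close()
```

With a group size of four, the ten commits above cost three synchronous writes (after the fourth and eighth commits, and once more on close) rather than ten.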

4.1 Overall Architecture of the Database Subsystem on the 3G Platform

4.1.1 Position of the Database System's Function Modules in the 3G Platform

Apart from the OS and BSP, the 3G software platform consists of six software subsystems.

The database subsystem manages the platform's physical resources and the signaling and protocol configuration information used by the platform, and provides database access interfaces to the other subsystems. It runs on top of the operating system.

The bearer subsystem provides ATM, IP, and other bearer services to the business processing subsystem, the signaling subsystem, and the OAM and network management subsystem. It runs on top of the operating system and the database subsystem.

The signaling subsystem implements narrowband No. 7 signaling, broadband No. 7 signaling, IP signaling, and gateway control signaling, and provides services to the business processing subsystem. It runs on top of the operating system, the database subsystem, and the bearer subsystem.

The system control subsystem handles monitoring, startup, and software download for the entire system. It runs on top of the operating system and the database subsystem.

The OAM and network management subsystem provides a unified interface between the platform and the network management back end. It is responsible for configuring and managing the platform's protocols and signaling and for supplying the necessary statistics. It runs on top of the operating system, the bearer subsystem, and the database subsystem.

The business processing subsystem implements the services offered by the system. It runs on top of the operating system, the bearer subsystem, the signaling subsystem, and the database subsystem.
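The layering described above can be written down as a dependency map. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later) to derive one valid bottom-up ordering. The subsystem identifiers are illustrative, and treating "runs on top of" as a startup dependency is an assumption made for the example, not something the article states.

```python
from graphlib import TopologicalSorter

# Each subsystem maps to the layers it runs on top of, as listed in the text.
DEPENDS_ON = {
    "database": {"os"},
    "bearer": {"os", "database"},
    "signaling": {"os", "database", "bearer"},
    "system_control": {"os", "database"},
    "oam": {"os", "database", "bearer"},
    "business": {"os", "database", "bearer", "signaling"},
}

# static_order() yields every node after all of its dependencies.
order = list(TopologicalSorter(DEPENDS_ON).static_order())
```

Every other subsystem depends on the database subsystem, so it always comes immediately after the operating system in any valid ordering.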

The database subsystem is distributed across every processor of the 3G platform, so structurally it is a distributed database system. Its main role is to provide data support for the platform's other subsystems and to offer a database model that can easily be extended for future services.

4.1.2 Division of Internal Modules of the Database Subsystem

The database subsystem is divided into five functional modules and supports data sources such as RAM, flash, hard disk in a dedicated format, and commercial databases. The five modules are the input/output module, the object management module, the business processing module, the maintenance management module, and the platform system toolkit.

The input/output module (I/O module) handles everything tied to the platform's hardware media: data loading, dumping, and storage space management.

The object management module (DBCORE) is the core module of the entire platform. It implements the central function of organizing the core objects, including in-memory data, tables, indexes, and locks, and therefore has a decisive impact on the performance of the whole system. It also provides simple internal concurrency control.

The business processing module is the interface layer through which users of the database platform access the memory database. It supports access through an API and through SQL-like statements, as well as distributed data access.

The maintenance management module monitors and manages the entire platform system, providing probes, alarms, logs, real-time service tracing, remote network access, and other functions.

The platform system toolkit provides complete tool support for each module, as follows:

Media layer tools: retrieve, modify, and generate data storage files on the various media.

Object management module tools: generate the main program framework according to the properties of each object. If there are no special requirements, the generated code can be used directly for object management.

I/O module tools: generate framework code for data loading, dumping, and in-memory data storage space management, based on the distribution model of the memory database.

Business processing subsystem tools: generate regular API code.

Maintenance management module tools: retrieve and manage interface traces, logs, and other output files.

Others: design of the various objects (tables, indexes, and so on) and generation of the related documentation.

4.2 System Object Design Principles and Processing Flow

The subsystem is designed using an object-oriented approach, so the description of its processing concentrates on the subsystem's various objects. The synchronization and monitoring processes have designs of their own, and for those the process itself is the main thread of the description.

The subsystem's data falls into six categories: data tables, indexes, queues, synchronization instances, monitoring instances, and one-way resource queues. Data tables are the core of the database organization. Indexes are created to make data easy to locate, synchronization instances are created to support resource management, and monitoring instances are created for resource monitoring. One-way resource queues serve as auxiliary memory management for data tables and synchronization instances.

Every category except the one-way resource queue is defined as a data object class of the system, and the system defines and manages all data of the same object class in a uniform way. A specific data object is called a data instance of its data object class. The system assigns every table, index, queue, synchronization, and monitoring data instance a unique 32-bit integer, called the data instance handle, and data instances are accessed through their handles. One advantage of this approach is that data management remains stable as the number of data instances grows, which benefits overall system stability. Another is that it makes it easy to add new data object classes: because each class is managed separately, the classes are independent of one another, and adding a class does not compromise the safety of existing data.
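A handle scheme of this kind can be sketched as below. The article specifies only that handles are unique 32-bit integers and that object classes are managed separately. The bit layout used here (object class in the top 8 bits, a per-class serial number in the low 24 bits) and all class and method names are assumptions made for illustration.

```python
from enum import IntEnum
from typing import Any, Dict


class ObjectClass(IntEnum):
    """The five handle-bearing data object classes named in the text."""
    TABLE = 1
    INDEX = 2
    QUEUE = 3
    SYNC = 4
    MONITOR = 5


CLASS_SHIFT = 24                      # assumed layout: class id in top 8 bits
SERIAL_MASK = (1 << CLASS_SHIFT) - 1  # low 24 bits hold a per-class serial


class HandleRegistry:
    """Keeps a separate instance table per object class."""

    def __init__(self) -> None:
        self._instances: Dict[ObjectClass, Dict[int, Any]] = {
            c: {} for c in ObjectClass
        }
        self._next_serial: Dict[ObjectClass, int] = {c: 1 for c in ObjectClass}

    def register(self, cls: ObjectClass, instance: Any) -> int:
        serial = self._next_serial[cls]
        if serial > SERIAL_MASK:
            raise OverflowError(f"no handles left for class {cls.name}")
        self._next_serial[cls] = serial + 1
        handle = (int(cls) << CLASS_SHIFT) | serial
        self._instances[cls][handle] = instance
        return handle

    def lookup(self, handle: int) -> Any:
        # The class is recovered from the handle, so only that class's
        # table is consulted.
        cls = ObjectClass(handle >> CLASS_SHIFT)
        return self._instances[cls][handle]


# Hypothetical usage.
registry = HandleRegistry()
table_handle = registry.register(ObjectClass.TABLE, {"name": "subscriber"})
index_handle = registry.register(ObjectClass.INDEX, {"on": "subscriber.id"})
found = registry.lookup(table_handle)
```

Keeping one instance table per class means that adding a new member to `ObjectClass` does not touch the tables of the existing classes, which mirrors the independence argument made above.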

One-way resource queues are used inside table objects and synchronization objects. They are not assigned handles; instead, the owning table or synchronization object operates on them through a saved pointer to the one-way resource queue object.
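The article does not say how a one-way resource queue is implemented. One plausible reading is a singly linked free list of fixed slots, which is a common way to manage record storage inside a table. The sketch below follows that assumption. The class names and the `Table` wrapper are hypothetical, and the table holds a direct reference to its queue rather than a handle, as described above.

```python
from typing import List, Optional


class OneWayResourceQueue:
    """Fixed pool of slots chained into a singly linked free list."""

    def __init__(self, capacity: int) -> None:
        # next_free[i] is the index of the free slot after i, or -1 at the end.
        self._next_free: List[int] = [i + 1 for i in range(capacity)]
        if capacity:
            self._next_free[-1] = -1
        self._head = 0 if capacity else -1

    def acquire(self) -> Optional[int]:
        """Take the first free slot, or return None if the pool is exhausted."""
        if self._head == -1:
            return None
        slot = self._head
        self._head = self._next_free[slot]
        self._next_free[slot] = -1
        return slot

    def release(self, slot: int) -> None:
        """Push a slot back onto the front of the free list."""
        self._next_free[slot] = self._head
        self._head = slot


class Table:
    """Owns its queue through a direct reference, with no handle involved."""

    def __init__(self, max_rows: int) -> None:
        self.free_rows = OneWayResourceQueue(max_rows)


# Hypothetical usage: a two-row table.
table = Table(max_rows=2)
first = table.free_rows.acquire()
second = table.free_rows.acquire()
exhausted = table.free_rows.acquire()
table.free_rows.release(first)
reused = table.free_rows.acquire()
```

Both `acquire` and `release` touch only the head of the list, so each runs in constant time regardless of pool size.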

5 Summary

This article first reviews the current state of embedded memory database technology, then briefly describes the characteristics of memory databases, and finally proposes an embedded memory database engine suitable for 3G platforms. As computer technology advances and the demand for faster information processing grows, both large-capacity memory databases and streamlined embedded memory databases will find more and more applications. At present there is no good general-purpose embedded memory database product, so development in this area has considerable market potential.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
