Abstract: Compared with common databases, embedded databases have the characteristics of small size, complete functions, portability, and robustness. This paper analyzes and compares typical embedded databases SQLite and Berkeley dB. First, the SQLite and Berkeley dB are analyzed in detail from the perspectives of architecture, inter-subsystem call relationship, and task execution process, then, the similarities and differences between SQLite and Berkeley dB are discussed in terms of data types, storage methods, modes, database engines, error processing, and encryption functions, finally, an SQLite application instance based on ARM-Linux is listed.
Keywords: SQLite, Berkeley dB, SQL, virtual database engine (vdbe)
Introduction
With the continuous integration and penetration of computer technology and other disciplines, the scope of database application is more in-depth and specific. Database technologies that are only applicable to PCs with large sizes and long latencies cannot meet the needs of highly targeted embedded system development. Solite and Berkeley dB are two types of embedded databases that are widely used and stable in technology. However, the research on embedded databases in China started late and has not attracted more attention. More people are familiar with relational databases based on the C/S or B/S structure to implement data storage, retrieval, and other functions. However, in an embedded system, due to limited hardware and software resources, it is impossible to install a large database server, and users' requirements may be achieved by a simple disk file-based database system, this only utilizes the basic features of those databases. At this time, the study of embedded databases is particularly important.
1. Embedded Database
Embedded databases are usually integrated with embedded operating systems and specific applications. without the need to run the database engine independently, the program can directly call the corresponding API to access data. The development environment of the embedded system determines the characteristics of its database:
1. Proper volume
Due to the characteristics of the embedded system, there is a strong space limit on data storage and program operation. Therefore, the embedded database should first ensure the appropriate volume. In addition, it consumes as few Rom, ram, and CPU resources as possible.
2. Complete Functions
In the development of embedded systems, user requirements determine the need for a database of moderate size and complete functions to implement data
This allows developers to adopt a database technology that provides complete development documentation and is easy to develop. In addition,
National 863 Project 2002aa714023, Postgraduate excellent courses funded 05531451
In embedded devices, database management is transparent to users, which requires the database to automatically complete initialization, log management, data compression, backup, data recovery, and other functions; moreover, embedded devices often have unpredictable hard resets, which requires high robustness of the database.
3. Portability
There are many types of embedded systems. Therefore, embedded databases should be portable to different software and hardware platforms.
4. Open-source code
Open-source code not only reduces development costs during product development, but also provides the most thorough solution for the future maintenance and stable operation.
2 SQLite
SQLite is a powerful embedded relational database management system developed by D. Richard sipe using a small C library. Although the functions are slightly inferior to those of Berkeley dB, it is easy to learn and fast. It also provides a wide range of database interfaces and supports most of sql92: supports multiple tables and indexes, transactions, views, triggers, and a series of user interfaces and drivers.
The architecture of SQLite can be roughly divided into eight major subsystems, as shown in 1. Database operations are performed in this order one by one. The top layer is the tag processor (tokenize) and analyzer (parser ). SQLite has its own highly optimized code generator to generate code quickly and efficiently. The bottom is the optimized B-tree, which helps to minimize the disk search when running on an adjustable page buffer. The next step is the high-speed page cache, which serves on the OS abstraction layer. This architecture makes database portability possible.
The core of this architecture is the virtual database engine (VDBE ). VDBE completes all tasks related to data operations and is an intermediate unit for information exchange between customers and storage. From all angles, it is the core of SQLite. After the SQL statement is analyzed, VDBE starts to work. The Code Generator translates the analysis tree into a pocket program, which is then combined into a series of commands represented by VDBE virtual machine language. In this case, VDBE executes each command to complete the query requirements specified by the SQL statement.
SQLite has the following features: supports ACID transactions and Zero Configuration-no installation or management of configurations, a complete database stored in a single disk file, and database files can be freely shared among machines in different bytes, and database sizes are supported up to 2 TB, small enough, and about 30 thousand lines of C code in all source code, KB, faster than most popular databases, providing support for Transaction functions and concurrent processing. Application Transaction not only guarantees data integrity, but also increases the running speed, because multiple statements can be submitted to the database at a faster, independent, and independent speed than one-by-one statement.
3 Berkeley DB
Berkeley DB is a lightweight embedded database developed by sleepycat software. It is not only applicable to embedded systems, but also can be directly connected to the application and run in the same address space as the application. Traditional databases work as independent servers, while Berkeley DB is a software development library. Developers embed it into applications, and applications are itself a server, instead, we only use embedded database development to implement customized database logic, avoiding the overhead of communication with application server processes. Therefore, Berkeley DB has a high operating efficiency and is suitable for embedded systems with limited resources.
In general, the Berkeley DB database system can be roughly divided into five subsystems, as shown in figure 2.
1. Access Methods)
This subsystem provides basic support for creating and accessing database files. Without transaction management, the modules in this subsystem can be used independently to provide fast and efficient data access services for applications.
2. Memory Pool Management Subsystem (Memory Pool)
This subsystem is the common shared memory buffer used by Berkeley dB, which can be used independently by applications.
3. Transaction subsystem (Transaction)
This subsystem provides the transaction management function for berkekey dB to ensure the principles, consistency, and isolation of operations. The transaction subsystem is suitable for modifying data that requires transaction guarantee.
4. Locking)
This subsystem provides concurrent management between processes and within processes, and provides the system with shared control for multi-user reading and single-user modification of the same object. This subsystem can be used independently by applications.
5. Log subsystem (logging)
This subsystem adopts the log writing policy. It supports data recovery by the transaction subsystem to ensure data consistency.
4. Similarities and Differences between SQLite and Berkeley DB
Through some of the above introduction, we may have some knowledge about SQLite and Berkeley dB. From the current trend, these two embedded databases have a strong vitality, better application areas and development space. I have read a lot of data and compared their similarities and differences from various perspectives, as shown in table 1.
Table 1 similarities and differences between SQLite and Berkeley DB
Features
SQLite
Berkeley DB
Is it a relational database?
SQL supported or not
Development language
Data Type
Storage Method
Storage Mode
Database Engine
Applicable System
Error Handling
Encryption
Free or not
Difficulty level
Yes
Yes
C Language
None
Convert to ASCII code
Btree
Virtual
From ARM/Linux to Linux/Solaris hardware platforms
Relatively small
Weak
All
Ease of use
No
No
C. Java language
None
Original storage
Btree, Hash, Queue, and Recno
None
UNIX/POSIX systems, Win32 and embedded systems, such as WinCE and VxWorks
More detailed
Strong
Part
Hard
From this table, we can intuitively see that SQLite and Berkeley DB differ greatly in database types, development languages, storage methods, and modes. I will discuss some important aspects in detail below:
1. Database Type
SQLite is based on the relational database model and supports the vast majority of standard SQL92 statements. To a large extent, it implements the ANSI SQL92 standard. In particular, it supports views, triggers, transactions, and nested SQL statements. It uses the SQL Compiler (SQL Complier) to operate the database in the SQL language and stores the database in a single file. It is very convenient to use statements similar to relational databases. This makes learning SQLite easier for those who have previous PC database experience.
In addition, SQLite also has the concept of API and is extremely easy to use. It only requires three functions to execute SQL statements and obtain data. It can also be expanded to allow programmers to customize functions and then aggregate them in the form of callback. C language API is the basis of the script interface, such as the published (Tcl interface ). The open source code group has expanded many customer interfaces, adapters, drivers, and so on, which makes it possible for other languages to use SQLite.
Berkeley DB is not a relational database and cannot be operated on a database using standard SQL statements. Special APIs must be called to perform operations on it. These APIs provide query, insert, and delete functions. Using the functions provided by Berkeley DB for database access and management is not complicated. In most cases, you only need to call according to the unified interface standards to complete the most basic operations.
2. storage methods and modes
SQLite only provides the Btree data storage mode. For binary data, SQLite cannot be saved directly. However, binary data can be converted to ASCII encoding before being saved. Base64 encoding is the most common method to convert binary data into ASCII encoding. In the c language code encode. c of SQLite, Base64 encoding is provided.
Berkeley DB directly stores any stored data in the data file as is, whether it is binary data or ASCII or Unicode encoded text. Berkeley DB provides four data storage modes: Btree, Hash, Queue, and Recno. When you open a database, you must specify a storage mode.
The specific definitions, advantages and disadvantages, and applicability of the above various storage modes are described here due to limited space. If necessary, refer to relevant materials.
3. Data Type
The biggest feature of SQLite is that its data type is typelessness ). This means that data of any type can be saved to any column of any table to be saved, regardless of the Data Type declared by this column. Although the data type of each field needs to be declared when the table structure is generated, SQLite does not perform any checks. Developers rely on their own programs to control the types of input and read data. An exception is that when the primary key is an integer value, an exception occurs if a non-integer value is to be inserted.
Although SQLite allows data types to be ignored, we recommend that you specify the data type in the Create Table statement, because the data type facilitates program readability. In addition, although data types are not differentiated during data insertion or reading, different data types are different during comparison.
In Berkeley DB, keys and data are the basis for database management. key/data pairs composed of the two constitute a basic structural unit in the database. By using this method, you only need to provide keywords to access the corresponding data when using API functions to access the database. Keywords and data are expressed by a simple structure named DBT in Berkeley DB. Its function is to store the corresponding memory address and its length.
5 Applications
SQLite embedded database provides a source code release method. to port data on a large number of hardware platforms, you can perform cross-compilation on the source code based on different platforms. Compilation involves the following steps:
1. To begin.
2. Run the "echo $ PATH" command to check whether the PATH contains the cross-compilation tool arm-linux-gcc.
3. To run sqlite properly in ARM-Linux. h make some modifications to ensure that the btree (B tree) has the correct variable size, such as "ptr" and "char ". Linux with different architectures, such as X86 and ARM, may be slightly different. For ARM-Linux, you can find the following parts:
# Ifndef INTPTR_TYPE
# If SQLITE_PTR_SZ = 4
# Define INTPTR_TYPE int
# Else
# Define INTPTR_TYPE long
# Endif
Add a sentence before the above Code
# Define SQLITE_PTR_SZ 4
In this case, "typedef INTPTR_TYPE ptr;" is the defined "int" type instead of "long ".
4. Use configure for some configuration. Modify the configure in the sqlite directory so that configure does not check the cross-compilation environment. Due to limited space, we will not detail it in detail.
5. Modify the Makefile file. Change the code line BCC = arm-linux-gcc-g-O2 to BCC = gcc-g-O2. In addition, sqlite is usually run on the hardware board of ARM-Linux in the form of a static link. Therefore, modify the Makefile, find the code segment marked as sqlite:, and set libsqlite. change la. libs/libsqlite. a. After making the above changes, use make to generate sqlite, libsqlite. a, and libsqlite. so. To reduce the execution file size, you can use strip to process the file and remove the debugging information.
6. Run sqlite on the ARM board. There are many methods to copy sqlite to the ARM board. You need to select one based on the actual situation. Such as ftp, cm-dftp, and wget. Download sqlite to the/tmp directory of the ARM board because the directory is writable. Modify permissions and run:
Chmod + wx sqlite
./Sqlite test. sqlite
Will appear
Sqlite>
If everything works properly, sqlite is now running in ARM-Linux, and then you can proceed with the next step of application development.
6 conclusion
Embedded databases SQLite and Berkeley DB have similarities and differences in terms of volume, function, running speed and difficulty. However, they all have the ability to fully adapt to the hardware and can well adapt to the needs of embedded systems. From the perspective of the author, although SQLite is not as powerful as Berkeley DB, its design philosophy is small, fast, and minimal management. This makes SQLite find an ideal balance between size and function, and completely open source code makes it an ideal "Embedded Database ". Of course, you can select an application based on the actual situation in a specific embedded application.