Database Management System Overview
The diagram describes the complete structure of a DBMS. A single-line box represents a system component, a double-line box represents an in-memory data structure, a solid line represents control and data flow, and a dashed line represents data flow only. Because the diagram is complex, we consider its parts in several stages.
First, there are two command sources at the top that send commands to the DBMS:
1. Ordinary users and application programs, which issue commands to query or modify data.
2. The database administrator (DBA), a person or group of people responsible for the structure, or schema, of the database.
Data Definition Language Commands
Commands of the second kind are relatively simple to handle, and their path begins at the upper right of the diagram. For example, the database administrator (DBA) of a university enrollment database might decide that the database should have a table, or relation, whose columns are a student, a course that the student has taken, and the student's grade in that course. The DBA might also decide that the only valid grades are A, B, C, D, and F. This structure and constraint information is part of the database schema. The diagram shows it being entered by the DBA. Because these commands can affect the database profoundly, the DBA must have special permissions to execute schema-modification commands. Schema-modifying data definition language (DDL) commands are parsed by the DDL processor and passed to the execution engine, which then goes through the index/file/record manager to modify the metadata, that is, the schema information for the database.
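As a rough illustration of the enrollment example, the sketch below uses Python's built-in sqlite3 module to issue a DDL command that defines the relation and restricts grades to A, B, C, D, and F. The table name, column names, and the helper record_grade are assumptions made for illustration, and SQLite stands in for the DBMS in the diagram; it has no DBA permission model.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")

# DDL: define the relation and restrict grades to the valid set.
conn.execute(
    """
    CREATE TABLE Enrollment (
        student TEXT NOT NULL,
        course  TEXT NOT NULL,
        grade   TEXT CHECK (grade IN ('A', 'B', 'C', 'D', 'F'))
    )
    """
)

def record_grade(student, course, grade):
    """Insert one row; return False if the schema constraint rejects it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(
                "INSERT INTO Enrollment VALUES (?, ?, ?)",
                (student, course, grade),
            )
        return True
    except sqlite3.IntegrityError:
        return False

accepted = record_grade("Sally", "CS101", "B")   # valid grade
rejected = record_grade("Jim", "CS101", "E")     # violates the CHECK constraint

# The schema is itself stored as metadata that can be queried.
schema_sql = conn.execute(
    "SELECT sql FROM sqlite_master WHERE name = 'Enrollment'"
).fetchone()[0]
```

The last query reads the table definition back from SQLite's catalog, which plays the role of the metadata described above.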
Query Processing Overview
Most interactions with the DBMS follow the path along the left side of the diagram. Users or applications initiate operations in a data manipulation language (DML). These operations do not affect the database schema, but they can affect the contents of the database, as modification operations do. DML statements are handled by two separate subsystems, described as follows:
Query processing
The query is parsed and optimized by the query compiler. The result of compilation is a query plan, a sequence of operations that the DBMS executes to obtain the answer to the query, and this plan is passed to the execution engine. The execution engine issues a series of requests to the resource manager for small pieces of data, typically records or tuples of a relation. The resource manager knows about the data files, their format and record sizes, and the index files. This information helps it find the appropriate data elements in the data files quickly.
The data requests are also passed to the buffer manager. The task of the buffer manager is to bring data from secondary storage (usually disk, where data is kept persistently) into main-memory buffers. In general, a page or "disk block" is the unit of transfer between a buffer and the disk.
To get data from the disk, the buffer manager communicates with the storage manager. The storage manager may issue operating-system commands, but more typically the DBMS sends commands directly to the disk controller.
Transaction processing
Queries and other DML operations are grouped into transactions. Transactions are units that must be executed atomically and in isolation from one another. Any single query or modification operation can be a transaction by itself. In addition, the execution of a transaction must be durable, meaning that the effect of any completed transaction must be preserved even if the system fails immediately after the transaction completes. The transaction processor is divided into two main parts:
1. The concurrency-control manager, or scheduler, which ensures the atomicity and isolation of transactions.
2. The logging and recovery manager, which is responsible for the durability of transactions.
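To make the role of the logging and recovery manager more concrete, here is a minimal sketch of recovery from a log. It assumes a redo-only log held in a Python list, with hypothetical record shapes ("update", txn, key, value) and ("commit", txn). Real systems use more elaborate policies, so this is only one possible illustration.

```python
def recover(log):
    """Rebuild the database state from a redo log after a crash.

    Only updates made by transactions that have a commit record are
    replayed, so completed work survives and unfinished work is ignored.
    """
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    db = {}
    for rec in log:
        if rec[0] == "update" and rec[1] in committed:
            _, _txn, key, value = rec
            db[key] = value
    return db

# Hypothetical log contents at the moment of a crash.
log = [
    ("update", "T1", "x", 10),
    ("update", "T2", "y", 99),
    ("commit", "T1"),
    ("update", "T2", "x", 7),
    # crash happens here: T2 never committed
]

state = recover(log)
```

In this sketch the change made by T1 is restored, while both changes made by T2 are dropped because T2 has no commit record.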
Storage and Buffer Manager
Database data normally resides in secondary storage, which in a computer system generally means a disk. However, operations on the data can be performed only in main memory. The primary task of the storage manager is to control the placement of data on the disk and its movement between the disk and main memory.
In a simple database system, the storage manager can be nothing more than the operating system's file system. However, for efficiency, the DBMS often controls storage on the disk directly, at least in some environments. The storage manager keeps track of the location of files on the disk and, on request from the buffer manager, obtains the disk block or blocks containing the requested file.
The buffer manager is responsible for partitioning the available main memory into buffers, which are page-sized regions into which disk blocks can be transferred. As a result, all DBMS components that need information from the disk interact with the buffers and the buffer manager, either directly or through the execution engine. The kinds of information that the various components may need include:
1. Data: The contents of the database itself.
2. Metadata: The database schema, which describes the structure of the database and its constraints.
3. Log records: Information about recent changes to the database, which supports the durability of the database.
4. Statistics: Information collected and stored by the DBMS about properties of the data, such as the sizes of, and the values in, the various relations and other components of the database.
5. Indexes: Data structures that support efficient access to the data in the database.
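The buffer manager's job can be sketched as a small pool of buffers in front of a block-addressed disk. The example below is an assumption-laden illustration: the class name BufferManager, its methods, the dictionary standing in for the disk, and the least-recently-used replacement policy are all choices made here, not details given in the text.

```python
from collections import OrderedDict

class BufferManager:
    """Keeps a fixed number of buffers over a block-addressed 'disk'."""

    def __init__(self, disk, capacity):
        self.disk = disk              # dict: block id -> bytes (stand-in for the disk)
        self.capacity = capacity      # number of buffers in main memory
        self.buffers = OrderedDict()  # block id -> bytearray, least recently used first
        self.dirty = set()            # blocks modified since they were read
        self.disk_reads = 0

    def get_block(self, block_id):
        """Return the buffer holding block_id, reading it from disk on a miss."""
        if block_id in self.buffers:
            self.buffers.move_to_end(block_id)   # mark as recently used
            return self.buffers[block_id]
        if len(self.buffers) >= self.capacity:
            self._evict()
        self.disk_reads += 1
        self.buffers[block_id] = bytearray(self.disk[block_id])
        return self.buffers[block_id]

    def mark_dirty(self, block_id):
        """Record that the buffered copy differs from the disk copy."""
        self.dirty.add(block_id)

    def _evict(self):
        """Drop the least recently used buffer, writing it back if modified."""
        victim, data = self.buffers.popitem(last=False)
        if victim in self.dirty:
            self.disk[victim] = bytes(data)
            self.dirty.discard(victim)

disk = {0: b"data", 1: b"meta", 2: b"logs"}
pool = BufferManager(disk, capacity=2)

first = pool.get_block(0)   # miss: read from disk
pool.get_block(1)           # miss: read from disk
pool.get_block(0)           # hit: no disk read
pool.get_block(2)           # miss: evicts block 1, the least recently used
first[0:4] = b"DATA"        # modify block 0 in its buffer
pool.mark_dirty(0)
pool.get_block(1)           # miss: evicts block 0 and writes it back to disk
```

Components that need data, metadata, log records, statistics, or indexes would all call something like get_block rather than touching the disk themselves.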
Transaction processing
A transaction typically groups one or more database operations. A transaction must be executed atomically and in isolation from other transactions. In addition, the DBMS guarantees the durability of the transaction: the work of a completed transaction is never lost. The transaction manager accepts transaction commands from an application, which tell it when a transaction begins, when it ends, and what the application expects of it (for example, some applications may not need atomicity). The transaction processor performs the following tasks:
1. Logging: To ensure durability, every change to the database is logged separately on disk. The log manager follows a policy designed so that, whenever a system failure or "crash" occurs, the recovery manager can examine the log of changes and restore the database to a consistent state. The log manager first writes the log to buffers and then negotiates with the buffer manager to make sure that the buffers are written to disk (where data can survive a crash) at appropriate times.
2. Concurrency control: Transactions must appear to execute in isolation. But in most systems, many transactions execute at the same time. Therefore, the scheduler (concurrency-control manager) must ensure that the individual actions of multiple transactions are executed in an order whose net effect is the same as if the system had executed the transactions one at a time. A typical scheduler works by maintaining locks on certain pieces of the database. Locks prevent two transactions from accessing the same piece of data in ways that interact badly. As the diagram shows, locks are usually stored in a lock table in main memory, and the scheduler affects queries and other database operations by preventing the execution engine from accessing locked parts of the database.
3. Deadlock resolution: As transactions compete for resources through the locks that the scheduler grants, the system may reach a state in which none of the transactions can proceed, because each needs a resource held by another transaction. At this point, the transaction manager has the responsibility to intervene and cancel ("roll back" or "abort") one or more transactions so that the others can continue.
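The lock table and deadlock resolution described in the tasks above can be sketched together. This toy scheduler is an illustration under stated assumptions: it supports only exclusive locks, each blocked transaction waits for exactly one other transaction, and deadlocks are found by following a hypothetical "waits-for" map until a cycle appears. The class and method names are invented for the example.

```python
class Scheduler:
    """Toy lock table with exclusive locks and waits-for deadlock detection."""

    def __init__(self):
        self.lock_table = {}   # data item -> transaction holding the lock
        self.waits_for = {}    # blocked transaction -> transaction it waits for

    def request(self, txn, item):
        """Return True if txn gets the lock, False if it must wait."""
        holder = self.lock_table.get(item)
        if holder is None or holder == txn:
            self.lock_table[item] = txn
            self.waits_for.pop(txn, None)
            return True
        self.waits_for[txn] = holder
        return False

    def find_deadlock(self):
        """Return the transactions forming a waits-for cycle, or None."""
        for start in self.waits_for:
            path = [start]
            current = self.waits_for.get(start)
            while current is not None and current not in path:
                path.append(current)
                current = self.waits_for.get(current)
            if current is not None:          # walked back into the path: a cycle
                return path[path.index(current):]
        return None

    def abort(self, txn):
        """Roll back txn: release its locks and clear related wait edges."""
        self.lock_table = {i: t for i, t in self.lock_table.items() if t != txn}
        self.waits_for = {
            w: h for w, h in self.waits_for.items() if w != txn and h != txn
        }

s = Scheduler()
s.request("T1", "A")            # T1 locks A
s.request("T2", "B")            # T2 locks B
blocked_1 = not s.request("T1", "B")   # T1 must wait for T2
blocked_2 = not s.request("T2", "A")   # T2 must wait for T1: deadlock
cycle = s.find_deadlock()
s.abort("T2")                   # cancel one transaction to break the cycle
retry_ok = s.request("T1", "B") # T1 can now proceed
```

Choosing which transaction to abort is a policy decision; this sketch simply aborts T2 by hand.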
ACID Properties of Transactions
A properly implemented transaction is commonly said to meet the following "ACID" properties:
A (atomicity): The operations of a transaction are either all executed or none are executed.
I (isolation): Each transaction must appear to execute as if no other transaction were executing concurrently.
D (durability): Once a transaction has completed, its effect on the database is never lost.
C (consistency): Databases have consistency constraints, or expectations about the relationships among data elements (for example, an account balance cannot be negative after a transaction finishes). Transactions are expected to preserve the consistency of the database.
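The atomicity and consistency properties can be illustrated with the account-balance example, again using Python's sqlite3 module as a stand-in DBMS. The table, the account names, and the transfer helper are assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Consistency constraint: a balance may never be negative.
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(src, dst, amount):
    """Move money between accounts as one transaction; return True on commit."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances():
    """Return the current balances as a dict of name -> balance."""
    return dict(conn.execute("SELECT name, balance FROM account"))

ok = transfer("alice", "bob", 30)       # both updates are applied
failed = transfer("alice", "bob", 500)  # debit would go negative: all rolled back
```

In the failed transfer the credit to bob is executed first and then undone by the rollback, so no partial effect of the transaction remains.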
Query processor
The part of the DBMS that most affects the performance the user sees is the query processor. The query processor is represented by two components:
1. Query compiler: It translates the query into an internal form called a query plan. A query plan is a sequence of operations to be performed on the data. Typically, the operations in a query plan are implementations of "relational algebra" operations. The query compiler consists mainly of the following three modules:
a) Query parser: The query parser builds a tree structure from the textual form of the query.
b) Query preprocessor: The query preprocessor performs semantic checks on the query (for example, making sure that the relations mentioned in the query actually exist) and transforms the parse tree into a tree of algebraic operators that represents the initial query plan.
c) Query optimizer: The query optimizer transforms the initial query plan into the best available sequence of operations on the actual data.
The query compiler uses metadata and statistics about the data to decide which sequence of operations is likely to be the fastest. For example, an index is a specialized data structure that facilitates access to data given values for one or more components of that data. If a suitable index exists, a plan that uses it can be much faster than other plans.
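One way to observe the effect of an index on the chosen plan is to ask a real optimizer. The sketch below uses SQLite's EXPLAIN QUERY PLAN statement through Python's sqlite3 module. The table, the index name idx_enrollment_student, and the helper plan are assumptions for illustration, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Enrollment (student TEXT, course TEXT, grade TEXT)")
conn.executemany(
    "INSERT INTO Enrollment VALUES (?, ?, ?)",
    [(f"s{i}", "CS101", "A") for i in range(1000)],
)
conn.commit()

def plan(sql):
    """Return the optimizer's plan description as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the human-readable plan step.
    return " | ".join(row[-1] for row in rows)

query = "SELECT grade FROM Enrollment WHERE student = 's42'"

plan_without_index = plan(query)   # expected to be a full table scan
conn.execute("CREATE INDEX idx_enrollment_student ON Enrollment (student)")
plan_with_index = plan(query)      # expected to search via the new index

print(plan_without_index)
print(plan_with_index)
```

Before the index exists the optimizer has no choice but to scan the table; afterwards it can look up the matching rows through the index.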
2. Execution engine: The execution engine is responsible for executing each step of the chosen query plan. It interacts with most of the other components of the DBMS, either directly or through the buffers. To manipulate data, it must bring the data from the database into buffers. It must interact with the scheduler to avoid accessing data that is locked, and with the log manager to make sure that all changes to the database are properly logged.
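A query plan built from relational algebra operations can be sketched as a chain of Python generators, where each operator pulls tuples from its child. This is a simplified illustration, not how any particular DBMS implements its engine: the operator functions scan, select, and project and the sample data are assumptions, and the interactions with the buffers, scheduler, and log manager are left out.

```python
def scan(relation):
    """Leaf operator: yield every tuple of a stored relation."""
    yield from relation

def select(predicate, child):
    """Selection: keep only the tuples that satisfy the predicate."""
    return (row for row in child if predicate(row))

def project(columns, child):
    """Projection: keep only the named columns of each tuple."""
    return ({c: row[c] for c in columns} for row in child)

# Hypothetical stored relation.
enrollment = [
    {"student": "Sally", "course": "CS101", "grade": "A"},
    {"student": "Jim", "course": "CS101", "grade": "C"},
    {"student": "Sally", "course": "EE200", "grade": "B"},
]

# Plan for: SELECT course, grade FROM enrollment WHERE student = 'Sally'
plan = project(
    ["course", "grade"],
    select(lambda row: row["student"] == "Sally", scan(enrollment)),
)

# Executing the plan means pulling tuples from the root operator.
result = list(plan)
```

Because each operator is lazy, tuples flow through the plan one at a time instead of being materialized between steps.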