The original: Everyone is DBA (I) SQL Server Architecture
When you understand a SQL Server database, you can first observe it from the architecture of your database. The architecture of SQL Server contains 4 main components:
- Protocol layer (protocols)
- Relational engines (relational engine), also known as Query Processor
- Storage engines (Storage engine)
- Sqlos
Protocol layer (protocols)
When an application communicates with a SQL Server database, it first needs to select the protocol that establishes the communication connection through the SNI (SQL Server network Interface) networking interface. You can use the following protocols:
- TCP/IP: The most widely used protocol;
- Named Pipes: Service only for local area network (LAN);
- Shared Memory: Only supported on the same machine;
- via(Virtual Interface Adapter): Only high-performance via hardware is supported; (the protocol is deprecated)
SQL Server can be configured so that it can support multiple protocols at the same time. Various protocols have different performance in different environments and need to choose the right protocol according to the performance requirements. If the client does not specify which protocol to use, it can be configured to try each protocol individually.
Once the connection is established, the application can communicate directly with the database. When the application prepares to query the database using the T-SQL statement "SELECT * from TableA", the query request is first translated into the TDS protocol package (Tds:tabular data stream is the table-formatted dataflow protocol) on the application side, and then through the connected communication The protocol channel is sent to the database end.
The SQL Server protocol layer receives the request and translates the request into a form that the relational engine (relational engine) can handle.
Relational engines (relational engine)
The relational engine, also known as the query Processor, consists of 3 parts: relational.
- Command Parser
- Query Optimizer (Optimizer)
- Query Executor (Executor)
The protocol layer parses the received TDS message back into the T-SQL statement, which is first passed to the command Parser.
command Parser checks the correctness of the T-SQL syntax and transforms the T-SQL statement into an internal format that can be manipulated, that is, the query tree.
- The query tree is the internal representation of the Structured Query Language SQL (Structured query Language).
- Data manipulation Language DML (manipulation Language) is a subset of the SQL language, including INSERT, UPDATE, DELETE three core directives.
- Data definition Language DDL (Language) manages tables and index structures, including commands such as CREATE, DROP, ALTER, TRUNCATE, and so on.
- Data Control Language is responsible for authorizing users to access and process data, including Grant, REVOKE, and so on.
- T-SQL, or Transact-SQL, extends the functions of the procedural programming language, such as Process control, on the basis of SQL.
- SQLCLR (SQL Server Common Language Runtime) uses. NET assemblies to extend functionality.
Query Optimizer , which obtains the query tree from the command parser, determines whether the query tree can be optimized, and then, in many possible ways, determines the best way to optimize the query tree.
- Statements that cannot be optimized, including control flow and DDL, are compiled into an internal form.
- An optimized statement, such as DML, will be marked for optimization.
The optimization step begins with a canonical query (Normalize query), which decomposes a single query into multiple fine-grained queries and optimizes fine-grained queries, which means that it will determine the plan for executing the query, so the result of the query optimizer is an execution plan (execution Plan).
Query optimization is cost-based (cost-based) considerations, which means choosing the most cost-effective plan. The query optimizer needs to select the least expensive plan based on the performance metrics of the internal records. These internal performance metrics include Memory requirements, CPU utilization, and number of I/O operations. At the same time, query optimization uses heuristic algorithms (pruning heuristics) to ensure that the time-consuming evaluation of optimizations and queries is not longer than the direct execution of non-optimized queries.
After the normalization and optimization of the query is completed, the results of these processes are compiled into the execution plan (execution plan) data structure. The execution plan includes information such as which table to query, which index to use, what security to check, and what conditions are the values.
Query Executor runs the execution plan produced by the query Optimizer, acts as a scheduler for all commands in the execution plan, and tracks the process of each command execution. Most commands need to interact with the storage engine (Storage engines) to retrieve or modify data, and so on.
Storage engines (Storage engine)
The SQL Server storage engine contains components that are responsible for accessing and managing data, mainly including:
- Accessing methods (Access Methods)
- Lock Manager
- Transaction service (Transaction services)
- Utility tools (controlling Utilities)
Access Methods contains specific actions for creating, updating, and querying data, and some types of access methods are listed below:
- Row and Index Operations: The data structure, which is the data row and B-tree index, that is responsible for manipulating and maintaining the disk.
- Page Assignment Action (page Allocation Operations): Each database is a collection of 8KB disk pages that are distributed across multiple physical files. SQL Server uses 13 disk page structures, including data pages, index pages, and so on.
- Version operation (Versioning Operations): A version that maintains line changes to support snapshot isolation (Snapshot isolation) functionality, and so on.
The access method does not retrieve the page directly, it sends a request to the buffer manager, the Buffer Manager scans the page in its managed cache, or reads the page from the disk into the cache. When the scan starts, the row or index in the page is validated using the lookahead (Look-ahead mechanism) mechanism.
lock Manager is used to control the locking of tables, pages, rows, and system data, to resolve conflicts in a multiuser environment, to manage the compatibility of different types of locks, to resolve deadlock problems, and to elevate the functionality of the lock (escalate Locks) as needed.
The transaction service (Transaction services) is used to provide ACID property support for transactions. The ACID properties include:
- Atomicity (atomicity)
- Consistency (consistency)
- Isolation (Isolation)
- Persistence (Durability)
The pre-write log (Write-ahead Logging) feature ensures that log records are always written on disk before a truly changing data page is written to disk, making it possible to rollback a task. The write transaction log is synchronous, that is, SQL Server must wait for it to complete. However, write data pages can be asynchronous, so you can organize data pages in the cache that need to be written for bulk writes to improve write performance.
SQL Server supports two concurrency models to guarantee the ACID properties of a transaction:
- Pessimistic concurrency (pessimistic Concurrency) assumes that conflicts always occur and that data is locked to ensure correctness and concurrency.
- Optimistic concurrency (optimistic Concurrency) assumes that there will be no conflict, and that processing occurs when a conflict is encountered.
In the optimistic concurrency model, data is not locked when the user reads the data. When you perform an update, the system checks to see if the data has changed after another user has read the data. If another user changes the data, an error occurs, and the user who receives the error message rolls back the transaction. This model is primarily used in environments where data contention is low, and when the cost of locking data exceeds the cost of rolling back a transaction.
SQL Server provides 5 isolation levels (isolation level) that can support different concurrency models when dealing with multiuser concurrency.
- Read UNCOMMITTED: Only pessimistic concurrency is supported;
- Repeatable Read: Only pessimistic concurrency is supported;
- Serializable: Only pessimistic concurrency is supported;
- Snapshot: Support optimistic concurrency;
- Read Committed: The default isolation level, which supports both pessimistic concurrency and optimistic concurrency depending on the configuration.
utilities (controlling Utilities) contain tools for controlling the storage engine, such as bulk load (bulk-load), DBCC commands, full-text index management (full-text index Management), Backup and restore commands, and so on.
Sqlos
Sqlos is a separate application layer, located at the lowest level of the SQL Server engine. The main features of Sqlos include:
- scheduling (scheduling)
- Memory Management (Management)
- Synchronization (synchronization): Provides locking mechanisms such as Spinlock, mutexes, ReaderWriterLock, etc.
- Memory Broker: Provides distribution rather than memory Allocation.
- Error handling (Exception handling)
- Deadlock detection (Deadlock Detection)
- Extended event (Extended events)
- asynchronous I/O (asynchronous IO)
Database Architecture Comparison
In fact, if you compare the overall architecture, the architecture of the various common relational databases is similar. This also allows us to understand a database, a general guess and a quick understanding of another database.
Here is the schema diagram for the Oracle database:
The following is a diagram of the MySQL database structure:
MySQL database in the storage engine part of the implementation of pluggable design (pluggable Storage Engines), you can choose different types of storage engine implementation according to the requirements.
Feature |
InnoDB |
MyISAM |
Memory |
Storage Limits |
64TB |
256TB |
Ram |
Transactions |
Yes |
No |
No |
Locking Granularity |
Row |
Table |
Table |
B-tree Indexes |
Yes |
Yes |
Yes |
Compressed Data |
Yes |
Yes |
No |
Encrypted Data |
Yes |
Yes |
Yes |
Full-text Search Indexes |
Yes |
Yes |
No |
Foreign Key Support |
Yes |
No |
No |
Even in the same DB instance, each data table can specify which storage engine to use.
CREATE TABLE INT CHAR (INDEX (a)) ENGINE=InnoDB;
Everyone is a DBA the index of the article series:
Serial number |
Name |
1 |
Everyone is a DBA (I) SQL Server Architecture |
2 |
Everyone is DBA (II) SQL Server Metadata |
3 |
Everyone is a DBA (III) SQL Server Scheduler |
4 |
Everyone is a DBA (IV) SQL Server memory Management |
5 |
Everyone is a DBA (V) SQL Server database files |
6 |
Everyone is a DBA (VI) SQL Server transaction log |
7 |
Everyone is DBA (VII) B-Tree and + + Tree |
8 |
Everyone is a DBA (VIII) SQL Server page storage structure |
9 |
Everyone is DBA (IX) Server information collection script compilation |
10 |
Everyone is a DBA (X) Resource Information Collection script compilation |
11 |
Everyone is DBA (XI) I/O information collection script compilation |
12 |
Everyone is DBA (XII) Query information collection script compilation |
13 |
Everyone is a DBA (XIII) index information collection script compilation |
14 |
Everyone is a DBA (XIV) Stored Procedure information collection script compilation |
15 |
Everyone is DBA (XV) lock information Collection script compilation |
This series of articles "Everyone is a DBA" by Dennis Gao published from the blog Park Personal technology blog, without the author's consent to prohibit any form of reproduction, any automatic or artificial reptile reproduction or plagiarism behavior are bullying.
Everyone is a DBA (I) SQL Server Architecture