Original: 01. How SQL Server reads and writes data
I. Data reading and writing process brief
SQL Server, as a relational database, also maintains the acid nature of the transaction, and the database read-write conflict is controlled by the transaction isolation level. Transactions are present, whether or not the transaction is turned on. The flowchart is as follows:
Data read/write flowchart
0. Transaction start
(1) All DML statements are necessarily transaction-based, and if the transaction is not explicitly opened, the Begin TRAN,SQL server writes each statement as a transaction and commits the transaction automatically.
This means that SQL SERVER does not turn on implicit transactions by default, which is the opposite of Oracle, where Oracle turns on implicit transactions by default, and each DML statement or block of statements is committed manually.
In SQL Server to change this default behavior, you can do the following settings in the session, if you do not open an implicit transaction, SQL Server will automatically submit the current DML statement, and open, you need to manually commit to commit.
-- To turn on an implicit transaction SET on -- Insert a record CREATE TABLE INT )INSERTintoVALUES(1)-- view open transactions DBCC Opentran ()
(2) If a transaction is manually turned on (BEGIN TRAN), the transaction (commit) needs to be committed manually, just as the implicit transaction (SET implicit_transactions on) is turned on.
1. Initiating DML
(1) DML usually refers to: INSERT, DELETE, UPDATE;
(2) DDL statements are eventually converted to DML on system tables, and DDL statements can also be rolled back in SQL Server, such as: Create/alter/drop/truncate, not available in Oracle, plus SQL DCL statement in server: Deny,revoke, can also be rolled back;
2. Is the data in memory
(1) Using the hash algorithm in memory to find data, if the data is recorded as logical read;
(2) If the data page is not in memory, you need to read the data from the data file on the disk into memory, that is, physical reading, physical reading will also be counted as logical read, that is, no matter in memory there is no data, logical reading is certain.
3. Modify the data
(1) Modify the data page in the data buffer of SQL Server memory, when the data page is called Dirty page (DIRTY page);
(2) record redo log in the log buffer of SQL Server memory, which is called dirty log;
4. End of transaction
(1) Commit, at this time the dirty log of the current transaction is flushed to the database log file, and the transaction end tag (commit), dirty pages may not be flushed to the data file;
The transaction log structure is as follows (can be viewed through a similar tool like Log Explorer):
BEGIN TRAN
Dml
COMMIT TRAN
(2) rollback (ROLLBACK), at this time read redo log to get reverse DML operation, reverse modify Dirty page, the forward dml+ reverse DML will be recorded in the database log file, and the transaction end tag (ROLLBACK), the same, dirty pages may not be flushed to the data file;
The transaction log structure is as follows:
BEGIN TRAN
Dml
Reverse DML
ROLLBACK TRAN
It is not hard to find that SQL server logs are a bottleneck (bottleneck), because while writing introduces a read, which introduces competition, and Oracle uses undo segment to avoid this problem well, REDO log is always written in serial.
5. Refresh the data page
(1) The SQL Server database adheres to the pre-write log (Wal:write-ahead Logging) principle, because the relational database is transaction-based, and the log is the guarantee of the acid property of the transaction and the assurance of data recovery;
(2) Checkpoint (CHECKPOINT), the checkpoint periodically refreshes the dirty page into the data file, and finally the log file is marked with a checkpoint (CHECKPOINT), so that the data modified in the transaction above is formally written to the data file on disk.
Two. Data read and write process in depth
Imagine:
(1) must the log be written to the log file after commit? If there is a very long and large transaction, then when the log is committed, the log is written to disk from the buffer, would it not wait long?
(2) data is not necessarily after the log is submitted, checkpoint occurred, only to write to the data file? If the log has not been submitted, then the data buffer is not very crowded?
Considering these 2 points, SQL Server will also log writer/lazy Writer variable time to refresh the log/data to disk, as for the consistency of logs and data, in the startup or database restore, SQL Server will check, that is, we often say roll forward (REDO) and rollback (UNDO).
Data read/write architecture diagram
0. SQL SERVER MEMORY
(1) SQL Server occupies a portion of server memory, not the memory used by SQL Server for use by the operating system and other applications on the server;
(2) SQL Server memory objects can be divided into two major categories, the graph only the data in the buffer pool and the log cache;
1. End of transaction
(1) The end of the transaction is the premise that the log cache is successfully written to the log file, that is, when the client receives a message that the Commit/rollback statement runs successfully, the log has been successfully written to the log file (data is not yet written to the data file);
(2) However, the log cache does not have to wait until the end of the transaction to flush to the log file;
2. LOG WRITER
(1) When a long transaction is encountered, it is not necessary to wait for the end of the transaction, log writer periodically flushes the dirty log to the log file to ensure that the user responds quickly when a commit is sent to end the transaction;
(2) Microsoft did not publish SQL SERVER except commit, log writer flushes dirty log to log file period, here can refer to Oracle's: every 3 seconds, or log buffer 1/3 full, or already contain 1M dirty log;
3. LAZY WRITER
(1) LAZY Writer periodically scans the cache (default 1s), maintains a free page list, and releases the pages that have been flushed to disk according to the LRU algorithm;
(2) In the case of a dirty page, lazy writer flushes the dirty page to disk (at which point the transaction may not have been committed) to eventually release the memory page and join the free page list;
4. CHECKPOINT
(1) checkpoint, like lazy writer, refreshes the dirty pages into the data file (only the committed transaction data is refreshed), but does not maintain a free-memory page list;
(2) The sp_configure ' RECOVERY INTERVAL ' option can be set to change the frequency of the checkpoint occurrence, which defaults to 1 minutes.
Summary : It can be found that data and logs are written to the data/log files and are not synchronized. It is possible to write/commit the log, the data is not written to the disk, the data may be written, the transaction is not committed;
(1) SQL Server rolls Forward (REDO) when the database is started/restored for full transaction log and data not written to disk;
(2) For transactions that have data written to the data file, the log is not fully committed, and SQL Server does rollback (UNDO) when the database is started/restored.