The database usually implements the transaction with the help of log, the common have undo log, redo Log,undo/redo log all can guarantee the transaction characteristic, Undolog implements the transaction atomicity, Redolog realizes the transaction persistence.
To minimize performance problems with IO bottlenecks when data is written, MySQL uses a caching mechanism: When query modifies data in the database, InnoDB first reads the data from the disk into memory, modifies the in-memory copy of the data, and persist the modified behavior to the transaction log on disk (write redo log buffer first, and then batch write periodically), instead of logging the modified data directly to the hard disk each time, and so on, after the transaction log is persisted, the dirty data in memory can be slowly brushed back to the disk, called Write-ahead Logging. The transaction log uses append writes, and sequential IO provides a better performance advantage.
In order to avoid data loss problems caused by power-down or system failure during dirty data brush back to disk, InnoDB uses transaction log (redo log) to resolve the problem.
First, a simple understanding of several concepts
The files stored in database data are called data file;
Log files are called log file;
Database data is cached, if there is no cache, write every time or read the disk, then the performance is too low. The cache of database data is called data buffer, and the log (redo) cache is called log buffer.
Memory Buffer Pool
Buffer pool If MySQL does not use memory buffers, every time you read and write data, you need to access the disk, which will definitely increase the I/O request, resulting in inefficiency. So when the InnoDB engine reads and writes data, the corresponding data and indexes are loaded into the buffer pool in memory, which improves the speed of data reading and writing.
Buffer pool: The largest chunk of memory, the cache used to hold various data includes index pages, data pages, undo pages, insert buffers, adaptive hash indexes, lock information for INNODB storage, data dictionary information, and so on. How you work always reads the database file by page (16k per page) to the buffer pool, and then retains the cached data in the buffer pool by using the least recently used (LRU) algorithm. If the database file needs to be modified, always first modify the pages in the cache pool (the Dirty pages dirty page after the modification), and then flush the dirty pages of the buffer pool to the file at a certain frequency.
Table Space
The tablespace can be seen as the highest level of the logical structure of the INNODB storage engine. Tablespace file: InnoDB The default tablespace file is ibdata1.
Segment: The table space consists of individual segments, the common segments are data segments, index segments, rollback segments (undo log segments), and so on.
Zone: consists of 64 contiguous pages, each with a size of 16kb, which is 1MB per area.
Page: 16KB per page, and cannot be changed. Common page types are: Data pages, undo pages, System pages, transactional data pages, insert buffer bitmap pages, insert buffered free list pages, uncompressed binary large object pages, compressed binary large object pages.
Redo log and undo log
In order to satisfy the persistence of the transaction and prevent the buffer pool data from being lost, InnoDB introduced the redo log. To satisfy the atomicity of the transaction, InnoDB introduced the undo log.
Second, undo Log
Undo log is to implement the atomicity of the transaction. Undo log is also used to implement multi-version concurrency control (abbreviation: MVCC).
Internal mechanism of the delete/update operation
When a transaction is committed, InnoDB does not immediately delete the undo log because it may also use undo log, such as when the isolation level is repeatable read, when the transaction reads the most recent commit row version when the transaction is turned on, as long as the transaction does not end, the row version cannot be deleted. That is, undo log cannot be deleted.
However, when the transaction commits, the undo log corresponding to the transaction is placed in the delete list and will be deleted by purge in the future. And when committing a transaction, it also determines whether the page that is allocated by the Undo log can be reused, and if it can be reused, it is assigned to the subsequent transaction, avoiding the waste of storage space and performance by allocating separate undo log pages for each individual transaction.
The results of the delete and update operations are recorded via the Undo log: (insert operation does not need to be analyzed, it is inserting rows)
- The delete operation is not actually deleted directly, but instead the delete object is marked for deletion by the delete flag, and the final delete operation is done by the purge thread.
- There are two cases of update: whether the update column is a primary key column.
- If it is not a primary key column, the direct reverse record in undo log is how the update was recorded. That is, update is done directly.
- In the case of a primary key column, update two executes: deletes the row before inserting a row of target rows.
Atomicity of the ① transaction
All of the operations of a transaction are either completed or not, and cannot be done in half. If an error occurs during execution, return to the state at the start of the transaction, and all operations are rolled back.
② principle
The principle of Undo log is simple, in order to satisfy the atomicity of a transaction, before any data is manipulated, the data is first backed up to a place where the backup of the storage data is called Undo Log. The data is then modified. If an error occurs or the user executes the ROLLBACK statement, the system can use the backup in the Undo log to restore the data to the state before the transaction began.
Suppose there is a, b two data, the values are 1, 2, respectively. Perform a +2 transaction operation.
A. Start of the transaction.
B. Record a=1 to undo log.
C. Modify the A=3.
D. Record b=2 to undo log.
E. Modify the b=4.
F. Write the undo log to disk.
G. Write data to disk.
H. Transaction submission
There is an implied precondition: ' The data is read into memory first, then the data in memory is modified, and the data is then written back to disk '.
Both atomicity and persistence can be guaranteed at the same time because of the following characteristics:
A. Record the undo log before updating the data.
B. To ensure persistence, the data must be written to disk before the transaction commits. As long as the transaction is successfully committed, the data must be persisted.
C. Undo log must be persisted to disk before data. If the system crashes between g,h, undo log is complete and can be used to roll back the transaction.
D. If the system crashes between a-f, the data is not persisted to disk. So the data on the disk remains in the state before the transaction begins.
Disadvantage: Writing data and undo log to disk before each transaction commits will result in a large amount of disk IO and therefore performance is low.
Third, Redo Log
Redo log is to save the executed SQL statement to a specified log file, and when MySQL performs the data recovery, the SQL operation of the redo log record is re-executed. The introduction of buffer pool causes the updated data not to be persisted to the disk in real time, and when the system crashes, although the data in the buffer pool is lost and the data is not persisted, the system can restore all data to the latest state based on the contents of the Redo log. Redo log exists as a standalone file on disk. By default, there are two files, named Ib_logfile0 and Ib_logfile1, respectively.
The parameter innodb_log_file_size specifies the size of the redo log; innodb_log_file_in_group Specifies the number of redo logs, which defaults to 2; INNODB_LOG_GROUP_HOME_DIR specifies the path where the redo log is located.
Innodb_additional_mem_pool_size = 100minnodb_buffer_pool_size = 128minnodb_data_home_dir =/home/mysql/local/mysql/ Varinnodb_data_file_path = Ibdata1:1g:autoextendinnodb_file_io_threads = 4innodb_thread_concurrency = 16innodb_flush _log_at_trx_commit = 1innodb_log_buffer_size = 8minnodb_log_file_size = 128minnodb_log_file_in_group = 2innodb_log_ Group_home_dir =/home/mysql/local/mysql/var
To satisfy the atomicity of a transaction, before any data is manipulated, the data is first backed up to the undo Log, and then the data is modified. If an error occurs or the user executes the ROLLBACK statement, the system can use the backup in the Undo log to restore the data to the state before the transaction began. Unlike redo log, a separate undo log file does not exist on the disk, and it is stored in a special segment (segment) inside the database, which is called the Undo segment (undo segment), and the Undo field is in the shared table space.
InnoDB implements three hidden fields for each row of records:
- 6-byte transaction ID (DB_TRX_ID)
- 7-byte rollback pointer (DB_ROLL_PTR)
- The Hidden ID
Log contents of redo Log
The Undo log and the redo log itself are separate. InnoDB's undo log is recorded in the data file (IBD), and InnoDB the contents of the undo log as data, so the operation of the undo log itself (such as inserting an undo record into the undo log, etc.) will log redo log. Undo log does not have to be persisted to disk immediately. Even if it is lost, it can be restored by redo log. So when you insert a record:
- Insert an undo log record into the undo log.
- Insert a redo log record that inserts an undo log record into the redo log.
- Insert data.
- Insert an "Insert" redo log record into the redo log.
IO Performance of redo log
In order to ensure Redo log can have better IO performance, InnoDB's Redo log design has the following features:
- Try to keep the redo log stored in a contiguous space. Therefore, the space of the log file is fully allocated when the system is first started. Records the redo log in sequential append mode.
- Bulk write logs. Instead of writing directly to the file, the log writes redo log buffer and writes the data in buffer to disk every second
- Concurrent transaction sharing redo log storage space, their redo log in the order of execution of the statement, sequentially alternating records together, to reduce the space occupied by the log.
- Only sequential append operations are performed on Redo log, and when a transaction needs to be rolled back, its Redo log record is not removed from the Redo log.
① principle
In contrast to undo log, Redo log is a backup of the new data. You do not need to persist the data until the transaction commits, as long as the redo log is persisted. When the system crashes, the redo log is persisted, although the data is not persisted. The system can restore all data to the latest state based on the contents of the Redo log.
②undo + Redo Transaction Simplification process
A. Start of the transaction.
B. Record a=1 to undo log.
C. Modify the A=3.
D. Record a=3 to redo log.
E. Record b=2 to undo log.
F. Modify B=4.
G. Record b=4 to redo log.
H. Write redo log to disk.
I. Transaction Submission
Iv. Summary
The A-G process is performed in memory, the corresponding operation is recorded in redo log buffer (b&e), redo log buffer (e&g), and the result of the transaction execution (not committed at this time) also exists in DB buffer (C&F). When buffer is full, write to disk, if buffer stores the number of transactions is 1, it means that the log is immediately flushed to disk, the data consistency is very good assurance. If there is more than one transaction, the redo log is synchronized to the disk with a status bit to record the commit, then the actual commit transaction, and the data in DB buffer is synchronized to the disk of DB. To ensure that the contents of the DB buffer are written to the disk database file, the contents of the log buffer should be written to the disk log file. This way, you can reduce disk IO and increase throughput. However, this approach applies to scenarios where conformance requirements are not high. Because if a system failure occurs such as a power outage, the completed transaction in log buffer, DB buffer is not synchronized to the disk and is lost. Like a bank that requires a higher level of consistency, it is important to ensure that every transaction is logged to disk, and if the server is down, go to redo log and redo the committed transaction.
V. The role of Redo & Undo Log
- Data persistence
A linked list, called Flush_list, is maintained in the buffer pool in order of the dirty page modification. The data is brushed to persistent storage according to the order of the pages in Flush_list. Arranged in the order in which the pages were first modified. Under normal circumstances, when does the dirty page flush to disk?
- When the redo space fills up, a portion of the dirty page is flushed to disk, and some redo log is released.
- When you need to assign a page to buffer pool, but it is full, you must flush dirty pages to disk. In general, you can control this situation by starting the parameter innodb_max_dirty_pages_pct, and when the dirty page in buffer pool reaches this scale, flush the dirty page to disk.
- When the system is idle, flush is detected.
- Data recovery
With the accumulation of time, Redo log will become very large. If you start the recovery from the first record every time, the recovery process is slow and cannot be tolerated. In order to reduce the time of recovery, the checkpoint mechanism is introduced. Assume that at some point in time, all dirty pages are flushed to disk. All redo logs before this point in time do not need to be re-made. The system logs the end of the redo log at this point in time as checkpoint. At the time of recovery, start with this checkpoint position. The log before the checkpoint point is no longer needed and can be deleted.
Vi. Recovery (Recovery)
Recovery strategy
The redo log is also logged for uncommitted transactions and rollback transactions, so these transactions are handled in a special way when recovering. There are 2 different recovery strategies:
A. When recovering, only the transactions that have already been committed are re-made. (returned to the client is a recoverable persistence that has committed certain data)
B. When recovering, redo all transactions include uncommitted transactions and rolled back transactions. The uncommitted transactions are then rolled back through the undo log. For example, in the process of b-e down, then the recovery based on the undo log to re-simulate the scene (but if the log buffer space is very large, log is not synchronized to the disk this process is no way to do, and because the transaction is not committed, the value returned to the client is not committed successfully, So it doesn't matter.)
InnoDB transaction log (redo log and undo log)