InnoDB Storage Engine Overview

Last Update:2018-06-14 Source: Internet

Author: User

Tags garbage collection mysql version rollback



The InnoDB storage engine is known for balancing high reliability with high performance. In MySQL 8.0, InnoDB is the default storage engine. (Historically, the default storage engine changed from MyISAM to InnoDB in MySQL 5.5.5.) You can still choose a different storage engine for a table by adding ENGINE=xxx to the CREATE TABLE statement, as shown in Table 5.1.
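
As a minimal sketch of the ENGINE clause described above, the statements below create one table per engine and then check the result. The table and column names (orders_demo, session_cache_demo, and so on) are hypothetical and chosen only for illustration.

```sql
-- Hypothetical table. InnoDB is already the default in MySQL 8.0,
-- so the ENGINE clause is optional here and shown only for clarity.
CREATE TABLE orders_demo (
    id   INT UNSIGNED NOT NULL AUTO_INCREMENT,
    note VARCHAR(100),
    PRIMARY KEY (id)
) ENGINE = InnoDB;

-- The same pattern selects a different engine for another table.
CREATE TABLE session_cache_demo (
    id      INT UNSIGNED NOT NULL,
    payload VARCHAR(255),
    PRIMARY KEY (id)
) ENGINE = MEMORY;

-- Confirm which engine each table uses in the current schema.
SELECT TABLE_NAME, ENGINE
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME IN ('orders_demo', 'session_cache_demo');
```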

Key features of InnoDB

- DML operations follow the ACID model, with transactions that support commit and rollback, and crash recovery that protects user data after a MySQL instance failure.
- Row-level locking and Oracle-style consistent reads improve multi-user concurrency and performance.
- Each InnoDB table has a primary key index called the clustered index, which organizes the table data to optimize queries based on the primary key. Lookups by primary key minimize I/O when reading data and so improve performance.
- To maintain data integrity, InnoDB supports FOREIGN KEY constraints. With foreign keys, inserts, updates, and deletes are checked to ensure they do not produce inconsistencies across related tables.
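
The rollback and foreign key behavior listed above can be tried with a small sketch like the following. The tables parent_demo and child_demo are hypothetical, and the statements assume a scratch schema where they do not already exist.

```sql
CREATE TABLE parent_demo (
    id INT NOT NULL,
    PRIMARY KEY (id)
) ENGINE = InnoDB;

CREATE TABLE child_demo (
    id        INT NOT NULL,
    parent_id INT NOT NULL,
    PRIMARY KEY (id),
    FOREIGN KEY (parent_id) REFERENCES parent_demo (id)
) ENGINE = InnoDB;

INSERT INTO parent_demo (id) VALUES (1);

START TRANSACTION;
-- Accepted: parent row 1 exists.
INSERT INTO child_demo (id, parent_id) VALUES (10, 1);
-- The insert above is undone; child_demo is empty again.
ROLLBACK;

-- Rejected with a foreign key constraint error: parent row 2 does not exist.
INSERT INTO child_demo (id, parent_id) VALUES (11, 2);
```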

Why use the InnoDB storage engine

The benefits of using InnoDB tables are as follows:

- If the server goes down because of a software or hardware problem, whatever the database was doing at the time, you only need to restart it; no special action is required. InnoDB crash recovery automatically finalizes changes that were committed before the crash and undoes changes that were in progress but not yet committed.
- InnoDB maintains its own buffer pool, which caches table and index data in main memory as it is accessed. Frequently used data is read directly from memory, which speeds up processing. On a dedicated database server, up to 80% of physical memory is often assigned to the buffer pool.
- If you split related data across different tables, you can use foreign keys to enforce referential integrity. When data is updated or deleted, the related data in other tables can be updated or deleted automatically.
- If data becomes corrupted on disk or in memory, a checksum mechanism alerts you to the bad data.
- When you design a database with an appropriate primary key for each table, operations involving those columns are optimized automatically. Referencing primary key columns in WHERE, ORDER BY, and GROUP BY clauses and in join operations is very fast.
- Inserts, updates, and deletes are optimized by an automatic mechanism called change buffering. InnoDB also allows concurrent read and write access to the same table.
- Performance gains are not limited to long-running queries. When the same rows of a table are accessed over and over, the adaptive hash index takes over to make those lookups even faster.
- You can compress tables and their associated indexes.
- You can create and drop indexes with less impact on performance.
- Truncating a table that uses a file-per-table tablespace is very fast.
- The storage layout of table data is more efficient for BLOB and long text fields.
- You can monitor the internal workings of the storage engine by querying INFORMATION_SCHEMA tables.
- You can monitor the performance details of the storage engine by querying Performance Schema tables.
- You can freely mix InnoDB tables with tables from other storage engines, even within the same statement. For example, a join can combine data from InnoDB and MEMORY tables in a single query.
- InnoDB is designed to make full use of the CPU when processing large data volumes.
- InnoDB tables can handle large quantities of data, even when the file system limits file size to 2GB.
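
Several of the mechanisms listed above are exposed as server variables. The queries below are a small sketch for inspecting them; the variable names are standard in MySQL 8.0, but defaults and availability can differ between versions, so treat the output as something to check rather than assume.

```sql
-- Current buffer pool size, converted from bytes to MiB.
SELECT @@innodb_buffer_pool_size / 1024 / 1024 AS buffer_pool_mib;

-- Settings behind change buffering, the adaptive hash index, and page checksums.
SHOW VARIABLES LIKE 'innodb_change_buffering';
SHOW VARIABLES LIKE 'innodb_adaptive_hash_index';
SHOW VARIABLES LIKE 'innodb_checksum_algorithm';
```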

Recommendations for using the InnoDB storage engine

The following practices are recommended when using InnoDB tables:

- Define a primary key: give every table a primary key, using the most frequently queried column or columns, or an auto-increment ID if there is no obvious candidate.
- Tune joins: for better join performance, define foreign keys on the join columns and declare those columns with the same data type in each table. Adding foreign keys ensures that the referenced columns are indexed, which can improve performance. Foreign keys also protect data integrity by preventing inserts into a child table when there is no matching row in the parent table.
- Turn off autocommit: committing automatically after every insert, update, or delete hurts database performance. (In practice, autocommit is rarely turned off in production environments, and programmers seldom control transactions explicitly with START TRANSACTION.)
- Group statements into transactions: wrap related insert, update, and delete operations between START TRANSACTION and COMMIT statements to avoid frequent commits and improve database performance.
- Do not use LOCK TABLES statements: InnoDB supports concurrent multi-user access through row-level locking. Use SELECT ... FOR UPDATE to obtain exclusive locks on only the rows you intend to modify.
- Enable the innodb_file_per_table parameter: with innodb_file_per_table = 1, the data and indexes of each table are stored in a separate *.ibd file instead of all together in the system tablespace. Spreading the data across separate files in this way can improve the performance of applications that use InnoDB.
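
The recommendations above might look like this in practice. This is only a sketch: account_demo and its columns are hypothetical names, and the amounts are arbitrary.

```sql
-- Every table gets a primary key; here an auto-increment ID.
CREATE TABLE account_demo (
    id          INT UNSIGNED NOT NULL AUTO_INCREMENT,
    holder_name VARCHAR(50) NOT NULL,
    balance     DECIMAL(10, 2) NOT NULL,
    PRIMARY KEY (id)
) ENGINE = InnoDB;

-- Group related changes into one transaction instead of committing each statement.
START TRANSACTION;
INSERT INTO account_demo (holder_name, balance)
VALUES ('alice', 100.00), ('bob', 50.00);
UPDATE account_demo SET balance = balance - 10.00 WHERE holder_name = 'alice';
UPDATE account_demo SET balance = balance + 10.00 WHERE holder_name = 'bob';
COMMIT;

-- Lock only the row to be changed rather than the whole table.
START TRANSACTION;
SELECT balance FROM account_demo WHERE id = 1 FOR UPDATE;
UPDATE account_demo SET balance = balance + 5.00 WHERE id = 1;
COMMIT;

-- Check whether each table is stored in its own .ibd file.
SHOW VARIABLES LIKE 'innodb_file_per_table';
```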

Verify that InnoDB is the default storage engine

You can list the storage engines that MySQL supports, and see which one is the default, with the SHOW ENGINES statement:

SHOW ENGINES;

Another method:

SELECT * FROM INFORMATION_SCHEMA.ENGINES;
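
To see only the default engine rather than the full list, a narrower query can help. The sketch below relies on the SUPPORT column of INFORMATION_SCHEMA.ENGINES, which reports DEFAULT for the default engine, and on the default_storage_engine system variable.

```sql
-- Only the row for the default storage engine.
SELECT ENGINE, SUPPORT
FROM INFORMATION_SCHEMA.ENGINES
WHERE SUPPORT = 'DEFAULT';

-- The same information from the system variable.
SELECT @@default_storage_engine;
```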

InnoDB and the ACID Model

The ACID model is a set of database design principles that emphasize the reliability that matters for business data and mission-critical applications. Among MySQL's components, the InnoDB storage engine was designed to adhere closely to the ACID model, so that data is not lost because of exceptional conditions such as software crashes or hardware malfunctions. When you rely on ACID-compliant features, you do not need to reinvent the wheel for consistency checking and crash recovery.
If you have additional software safeguards or highly reliable hardware, or your application can tolerate a small amount of data loss or inconsistency, you can adjust MySQL settings to trade some ACID reliability for greater performance. The ACID properties are shown in Fig. 5.1.

Fig. 5.1 ACID Properties

Atomicity
The atomicity aspect of ACID mainly involves InnoDB transactions. Related features include:
- The autocommit setting
- The COMMIT statement
- The ROLLBACK statement
- Operational data from the INFORMATION_SCHEMA tables
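
A small sketch of atomicity with autocommit turned off: two updates are rolled back as a unit. The table atomic_demo is hypothetical, and the explicit COMMIT after the setup insert is there so the example behaves the same whatever the session's initial autocommit setting.

```sql
CREATE TABLE atomic_demo (
    id  INT NOT NULL PRIMARY KEY,
    val INT NOT NULL
) ENGINE = InnoDB;

INSERT INTO atomic_demo (id, val) VALUES (1, 100), (2, 0);
COMMIT;

-- With autocommit off, statements stay in one open transaction until COMMIT or ROLLBACK.
SET autocommit = 0;
UPDATE atomic_demo SET val = val - 40 WHERE id = 1;
UPDATE atomic_demo SET val = val + 40 WHERE id = 2;
ROLLBACK;  -- both updates are undone together

-- The rows are back to (1, 100) and (2, 0).
SELECT id, val FROM atomic_demo;

SET autocommit = 1;
```
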
Consistency
The consistency aspect of ACID mainly involves InnoDB's internal processing that protects data from crashes, so that data can be recovered if the database instance crashes. Related features include:
- The InnoDB doublewrite buffer
- InnoDB crash recovery
Isolation
The isolation aspect of ACID mainly involves InnoDB transactions, in particular the isolation level that applies to each transaction. Related features include:
- The autocommit setting
- The SET TRANSACTION ISOLATION LEVEL statement
- The low-level details of InnoDB locking, such as row-level locks. During performance tuning,
  these details can be viewed through the INFORMATION_SCHEMA tables.
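
The isolation-related features above can be explored with statements like the following. This is a sketch for MySQL 8.0, where the session variable is named transaction_isolation; reading INFORMATION_SCHEMA.INNODB_TRX typically requires the PROCESS privilege.

```sql
-- Current isolation level for this session (REPEATABLE-READ is the InnoDB default).
SELECT @@transaction_isolation;

-- Change the level for subsequent transactions in this session, then restore it.
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
SELECT @@transaction_isolation;
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;

-- Active InnoDB transactions, with their isolation level and locked row counts.
SELECT trx_id, trx_state, trx_isolation_level, trx_rows_locked
FROM INFORMATION_SCHEMA.INNODB_TRX;
```
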
Durability
The durability aspect of ACID involves MySQL software features interacting with your particular hardware configuration. Much depends on the capabilities of the hardware itself, such as CPU, network, and storage. Hardware vendors can provide guidelines at purchase time; that topic is beyond the scope of this book. The relevant MySQL features and related factors are as follows:
- The InnoDB doublewrite buffer, turned on or off with the innodb_doublewrite configuration option
- The innodb_flush_log_at_trx_commit configuration option
- The sync_binlog configuration option
- The innodb_file_per_table configuration option
- Write buffers in storage devices, such as SSDs and RAID arrays
- Battery-backed cache in storage devices
- The operating system that runs MySQL, in particular its support for the fsync() system call
- An uninterruptible power supply (UPS) protecting all servers and storage devices that run MySQL and store its data
- A backup strategy, such as scripted backups and offsite backup storage, to keep data available
- For distributed applications, the network connection between data centers, so that features such as MySQL Replication run smoothly
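
The configuration options in the list above can be read from a running server as shown below. The commented-out SET statement is only an illustration of trading durability for speed, not a recommendation; it needs administrative privileges and should be weighed against the data-loss risk discussed earlier.

```sql
-- Current durability-related settings.
SELECT @@innodb_doublewrite            AS doublewrite,
       @@innodb_flush_log_at_trx_commit AS flush_log_at_trx_commit,
       @@sync_binlog                    AS sync_binlog,
       @@innodb_file_per_table          AS file_per_table;

-- Illustration only: flush the redo log to disk about once per second instead of
-- at every commit. Faster, but recent transactions can be lost in a crash.
-- SET GLOBAL innodb_flush_log_at_trx_commit = 2;
```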

InnoDB Multi-Versioning

InnoDB multi-versioning and its mechanism
InnoDB is a multi-versioned storage engine. It keeps information about old versions of changed rows to support transactional features such as concurrency and rollback. This information is stored in the tablespace in a data structure called a rollback segment (after an analogous data structure in Oracle).
When a transaction needs to be rolled back, InnoDB uses the information in the rollback segment to perform the undo operations. It also uses this information to build earlier versions of a row for a consistent read.
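
A consistent read can be observed with two client sessions, as in the sketch below. It assumes the default REPEATABLE READ isolation level, autocommit enabled in session B, and a hypothetical table mvcc_demo; the comments mark which session runs each statement.

```sql
-- Setup (run once, in either session).
CREATE TABLE mvcc_demo (
    id  INT NOT NULL PRIMARY KEY,
    val VARCHAR(20) NOT NULL
) ENGINE = InnoDB;
INSERT INTO mvcc_demo (id, val) VALUES (1, 'old');

-- Session A: open a transaction and take its snapshot.
START TRANSACTION WITH CONSISTENT SNAPSHOT;
SELECT val FROM mvcc_demo WHERE id = 1;   -- returns 'old'

-- Session B: change the row and commit (autocommit).
UPDATE mvcc_demo SET val = 'new' WHERE id = 1;

-- Session A: same transaction, same snapshot.
SELECT val FROM mvcc_demo WHERE id = 1;   -- still 'old', rebuilt from undo information
COMMIT;

-- Session A: a new transaction sees the committed change.
SELECT val FROM mvcc_demo WHERE id = 1;   -- returns 'new'
```
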
Internal mechanism
Internally, InnoDB adds three fields to each row stored in the database:
- A 6-byte DB_TRX_ID field holds the identifier of the last transaction that inserted or updated the row. A deletion is treated internally as an update in which a special bit in the row is set to mark it as deleted.
- A 7-byte DB_ROLL_PTR field, called the roll pointer, points to an undo log record written to the rollback segment. If the row was updated, the undo log record contains the information needed to rebuild the content of the row before the update.
- A 6-byte DB_ROW_ID field contains a row ID that increases monotonically as new rows are inserted. If InnoDB generates a clustered index automatically, the index contains row ID values. Otherwise, the DB_ROW_ID column does not appear in any index.

Undo logs in the rollback segment are divided into insert undo logs and update undo logs. Insert undo logs are needed only for transaction rollback and can be discarded as soon as the transaction commits. Update undo logs are also used in consistent reads: InnoDB builds earlier versions of a row from them for any transaction that has been assigned a snapshot. They can be discarded only when no remaining transaction has a snapshot that might need them.
Commit your transactions regularly, including transactions that issue only consistent reads. Otherwise, InnoDB cannot discard data from the update undo logs, and the rollback segment may grow very large and fill its tablespace.
The physical size of an undo log record in the rollback segment is typically smaller than the corresponding inserted or updated row. You can use this information to estimate the space needed for the rollback segment.
In the InnoDB multi-versioning scheme, a row is not physically removed immediately when you delete it with a DELETE statement. InnoDB physically removes the row and its index records only when it discards the update undo log record written for the deletion. This removal operation is called a purge. It is quite fast, usually taking about the same time as the SQL statement that did the deletion.
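
One way to watch how much undo history is waiting to be purged is the history list length. The sketch below assumes the trx_rseg_history_len counter in INFORMATION_SCHEMA.INNODB_METRICS is enabled, as it normally is by default; the same figure appears as "History list length" in the InnoDB status report.

```sql
-- Number of undo log entries in the history list that purge has not yet processed.
SELECT NAME, `COUNT`
FROM INFORMATION_SCHEMA.INNODB_METRICS
WHERE NAME = 'trx_rseg_history_len';

-- The TRANSACTIONS section of this report also shows "History list length".
SHOW ENGINE INNODB STATUS;
```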

Knowledge Point Note: Purge
Purge is a type of garbage collection. It is performed by one or more background threads (the number is controlled by innodb_purge_threads) that run on a periodic schedule.
Purge parses and processes undo log pages from the history list in order to remove clustered and secondary index records that were marked for deletion (by earlier DELETE statements) and are no longer required for MVCC or rollback. Purge frees undo log pages from the history list after processing them.
If you insert and delete rows in small batches at about the same rate, the purge thread can start to lag behind. The table then keeps growing because of all the "dead" rows, making everything disk-bound and very slow. In that case, you can throttle new row operations and give the purge thread more resources by tuning the innodb_max_purge_lag system variable.
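
The purge settings mentioned in this note can be inspected and adjusted as sketched below. The value 1000000 is purely illustrative, not a recommendation; innodb_max_purge_lag can be changed at runtime with sufficient privileges, whereas innodb_purge_threads is set at server startup.

```sql
-- Number of background purge threads (startup option).
SHOW VARIABLES LIKE 'innodb_purge_threads';

-- Purge lag threshold and the related maximum delay; 0 means no throttling.
SHOW VARIABLES LIKE 'innodb_max_purge_lag%';

-- Illustration only: start delaying INSERT, UPDATE, and DELETE operations
-- once the history list grows beyond this many entries.
SET GLOBAL innodb_max_purge_lag = 1000000;
```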


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
