MySQL has multiple storage engines, MyISAM and InnoDB are two of them commonly used. Here are some basic concepts (not in-depth introduction) about these two engines.
The MyISAM storage engine, based on the traditional ISAM type, supports full-text search, but is not transaction-safe, and does not support foreign keys. Each MyISAM table is stored in three files: the frm file stores the table definition; the data file is MYD (MYData); the index file is MYI (MYIndex).
InnoDB is a transactional engine that supports rollback, crash recovery capabilities, multi-version concurrency control, ACID transactions, and row-level locking (InnoDB table row locks are not absolute, if
MySQL cannot determine the scope to scan when executing a SQL statement , InnoDB table will also lock the whole table, such as the SQL statement during like operation), and provide a lock-free read method consistent with the Oracle type. InnoDB stores its tables and indexes in a table space, which can contain several files.
Core difference
MyISAM is non-transactionally safe, while InnoDB is transactionally safe.
The granularity of MyISAM locks is at the table level, while InnoDB supports row-level locking.
MyISAM supports full-text indexing, while InnoDB does not support full-text indexing.
MyISAM is relatively simple, so it is better than InnoDB in efficiency, and small applications can consider using MyISAM.
The MyISAM table is saved as a file, and the use of MyISAM storage in cross-platform data transfer will save a lot of trouble.
InnoDB tables are more secure than MyISAM tables, and you can switch non-transactional tables to transactional tables (alter table tablename type = innodb) while ensuring that data is not lost.
Application scenario
MyISAM manages non-transactional tables. It provides high-speed storage and retrieval, as well as full-text search capabilities. If the application needs to perform a large number of SELECT queries, then MyISAM is a better choice.
InnoDB is used for transaction processing applications and has many features, including ACID transaction support. If the application needs to perform a large number of INSERT or UPDATE operations, you should use InnoDB, which can improve the performance of multi-user concurrent operations.
Mysql storage engine and index
The database must have indexes. If there is no index, the search process becomes a sequential search. The time complexity of O (n) is almost unbearable. It is very easy to imagine how a table composed of only a single key can be indexed using a B + tree, as long as the key is stored in the node of the tree. When a database record contains multiple fields, a B + tree can only store the primary key. If a non-primary key field is retrieved, the primary key index becomes useless and becomes sequential search. At this time, a second set of indexes should be created on the second column to be retrieved. This index is organized by an independent B + tree. There are two common methods to solve the problem of multiple B + trees accessing the same set of table data. One is called clustered index, and the other is called non-clustered index. Although both names are called indexes, this is not a separate index type, but a data storage method. For clustered index storage, row data and the primary key B + tree are stored together. The secondary key B + tree only stores the secondary key and the primary key. The primary key and non-primary key B + tree are almost two types of trees. For non-clustered index storage, the primary key B + tree stores a pointer to the real data row at the leaf node, not the primary key.
InnoDB uses a clustered index to organize the primary key into a B + tree, and the row data is stored on the leaf node. If the primary key is searched using the condition of "where id = 14", the retrieval algorithm according to the B + tree You can find the corresponding leaf node, and then get the row data. If you perform a conditional search on the Name column, you need two steps: The first step is to retrieve the Name in the auxiliary index B + tree and reach its leaf node to obtain the corresponding primary key. The second step uses the primary key to perform another B + tree search operation on the main index B + tree species, and finally reaches the leaf node to obtain the entire row of data.
MyISM uses a non-clustered index. The two B + trees of the non-clustered index look the same. The structure of the nodes is exactly the same except that the stored content is different. The nodes of the primary key index B + tree store the primary key, and the secondary key index B + tree store Auxiliary key. The table data is stored in a separate place. The leaf nodes of the two B + trees use an address to point to the real table data. For table data, there is no difference between the two keys. Since the index tree is independent, it is not necessary to access the index tree of the primary key through secondary key retrieval.
In order to illustrate the difference between these two indexes more vividly, we imagine a table storing 4 rows of data as shown below. Id is used as the main index, and Name is used as the auxiliary index. The diagram clearly shows the difference between clustered and non-clustered indexes.
We focus on clustered indexes. It seems that the efficiency of clustered indexes is significantly lower than that of non-clustered indexes, because each time you use secondary index retrieval, you have to go through two B + tree searches. Isn't this an unnecessary thing? What are the advantages of clustered indexes?
1. Since row data and leaf nodes are stored together, so that the primary key and row data are loaded into memory together, the row data can be returned immediately if the leaf node is found. If the data is organized according to the primary key Id, the data is faster.
2. The advantage of using the primary key as a "pointer" instead of using the address value as a pointer for the secondary index is to reduce the maintenance of the secondary index when row movement or data page splitting occurs. The more space, the benefit is that InnoDB does not need to update this "pointer" in the auxiliary index when moving rows. That is to say, the position of the row (located by the 16K Page in the implementation, which will be discussed later) will change as the data in the database is modified (the B + tree node split in the front and the Page split), using a clustered index Ensure that no matter how the nodes of the primary key B + tree change, the auxiliary index tree is not affected.
So in the case of millions of data and larger data, the index performance of mysql innoDB is even better!