Primary Key Index
The primary key index, usually just called the primary key (PRIMARY KEY), consists of one or more columns and uniquely identifies a record in a table. A table may have no primary key, but it can have at most one, and primary key values cannot contain NULL.
In MySQL, we usually follow several principles when designing the primary key of an InnoDB table:
- Use an auto-increment column with no business meaning as the primary key;
- Never update primary key values; rows are only inserted or deleted;
- Do not choose types whose values update dynamically, such as a current timestamp.
There are several advantages to doing this:
- When new data is written, the primary key values increase sequentially, so InnoDB page splits become less likely; see the earlier post "[MySQL FAQ] Series-why InnoDB table should be recommended to use the self-added key";
- When business data changes, the primary key value is not modified, so the row is less likely to move to a different physical location and InnoDB pages are less likely to become fragmented.
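The page-split argument above can be illustrated with a small sketch. This is a toy model, not InnoDB's actual B+tree code: the page capacity, the `count_page_splits` function, and the rule that an insert past the end of the last full page opens a new page instead of splitting are all simplifying assumptions made for illustration.

```python
import bisect
import random

PAGE_CAPACITY = 16  # hypothetical rows per page in this toy model


def count_page_splits(keys, capacity=PAGE_CAPACITY):
    """Insert unique keys into a toy list of sorted pages; count mid-page splits."""
    pages = []   # each page is a sorted list of keys
    lows = []    # lows[i] is the smallest key stored in pages[i]
    splits = 0
    for key in keys:
        if not pages:
            pages.append([key])
            lows.append(key)
            continue
        # Locate the page whose key range should hold the new key.
        idx = max(bisect.bisect_right(lows, key) - 1, 0)
        page = pages[idx]
        if len(page) < capacity:
            bisect.insort(page, key)
            lows[idx] = page[0]
            continue
        if idx == len(pages) - 1 and key > page[-1]:
            # Assumption: appending past the last full page opens a fresh
            # page, so no existing rows have to move.
            pages.append([key])
            lows.append(key)
            continue
        # The full page must absorb a key in its middle: split it in half.
        bisect.insort(page, key)
        mid = len(page) // 2
        pages[idx] = page[:mid]
        lows[idx] = pages[idx][0]
        pages.insert(idx + 1, page[mid:])
        lows.insert(idx + 1, page[mid])
        splits += 1
    return splits


sequential_keys = list(range(1, 2001))      # auto-increment style keys
random_keys = sequential_keys[:]
random.Random(0).shuffle(random_keys)       # keys arriving in scattered order

print("splits with sequential keys:", count_page_splits(sequential_keys))
print("splits with random keys:", count_page_splits(random_keys))
```

In this model sequential keys never trigger a split, because every insert lands at the tail, while the shuffled keys repeatedly force full pages to split. The absolute numbers mean nothing for a real server; only the contrast is the point.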
Because a MyISAM table is a heap-organized table, its primary key design does not need to be so particular.
A secondary index (SECONDARY KEY) is what we usually just call an index. Secondary indexes can be further divided into unique and non-unique indexes.
A unique index should really be called a uniqueness constraint: its role is to prevent duplicate values in one or more columns, so it is an index that enforces a constraint.
In the MyISAM engine, a unique index does not differ in any essential way from the primary key index, except that its key values may be NULL. In other words, in MyISAM a unique index that does not allow NULL values is essentially the same as the primary key index.
In the InnoDB engine, the primary key index and secondary indexes differ greatly. The primary key index is chosen as the clustered index, while a unique index is stored essentially the same way as an ordinary secondary index, apart from the uniqueness constraint.
In terms of query performance, in a MyISAM table the primary key index and a unique index that does not allow NULL perform the same, whereas in an InnoDB table a query through a unique index needs one extra step to go from the secondary index to the primary key index. Lookups through an ordinary (non-unique) index in an InnoDB table are more expensive still, because after each result is retrieved, at least one more entry must be read to confirm whether more matching results exist; primary key and unique index lookups do not need this.
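The extra steps described above can be counted in a minimal sketch. It models an InnoDB-style table with plain Python containers; the sample rows, the `find_by_*` function names, and the idea of counting "steps" are all assumptions for illustration and do not measure real query cost.

```python
import bisect

# Hypothetical sample rows.
ROWS = [
    {"id": 1, "email": "ann@example.com", "city": "Lyon"},
    {"id": 2, "email": "bob@example.com", "city": "Oslo"},
    {"id": 3, "email": "cat@example.com", "city": "Lyon"},
    {"id": 4, "email": "dan@example.com", "city": "Rome"},
]

# Clustered index: primary key -> full row.
clustered = {row["id"]: row for row in ROWS}
# Unique secondary index: key -> primary key.
unique_email = {row["email"]: row["id"] for row in ROWS}
# Ordinary secondary index: sorted (key, primary key) entries.
city_index = sorted((row["city"], row["id"]) for row in ROWS)


def find_by_pk(pk):
    """One probe of the clustered index returns the row directly."""
    row = clustered.get(pk)
    return ([row] if row else []), 1


def find_by_unique(email):
    """Probe the unique index, then go to the clustered index once."""
    steps = 1
    pk = unique_email.get(email)
    if pk is None:
        return [], steps
    steps += 1  # secondary index -> primary key index
    return [clustered[pk]], steps


def find_by_city(city):
    """Scan the ordinary index until an entry no longer matches."""
    rows, steps = [], 0
    pos = bisect.bisect_left(city_index, (city,))
    while pos < len(city_index):
        steps += 1  # read one secondary-index entry
        key, pk = city_index[pos]
        if key != city:
            break   # this extra read confirms there are no more matches
        rows.append(clustered[pk])
        steps += 1  # secondary index -> primary key index
        pos += 1
    return rows, steps


print(find_by_pk(2)[1])                     # steps via primary key
print(find_by_unique("bob@example.com")[1])  # steps via unique index
print(find_by_city("Oslo")[1])               # steps via ordinary index
```

For the single row with id 2, this model takes 1 step through the primary key, 2 through the unique index, and 3 through the ordinary index, which matches the ordering of costs described in the text.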
In testing with 1 million rows and random lookups on an integer column, the MyISAM primary key and unique index performed about the same, while the ordinary index was more than 30% slower. With the InnoDB table, the unique index was about 9% slower than the primary key index, and the ordinary index was about 50% slower than the primary key index.
Clustered Index
In MySQL, an InnoDB table is a (clustered) index-organized table, and a MyISAM table is a heap-organized table.
A clustered index is also referred to as a cluster index.
Of course, the concept of a clustered index is not unique to MySQL; other database systems have it as well.
In short, a clustered index is a way of organizing an index in which the logical order of the key values determines the physical storage order of the table's rows. A nonclustered index is an ordinary index: it is simply built on the indexed columns and does not affect the physical storage order of the table.
Let's take a look at the differences between the two storage types:
Simply put, in an IOT (index-organized table) the physical storage order of the data matches the order of the primary key index, so if newly written data is scattered across the key range, the data blocks written also tend to be scattered rather than sequential. A HOT (heap-organized table) stores data in the order in which it is written.
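A minimal sketch of the two storage orders, assuming each table is just a Python list of (primary key, payload) tuples; the `heap_store`, `iot_store`, `range_positions`, and `is_contiguous` helpers and the sample rows are hypothetical names used only for illustration.

```python
def heap_store(rows):
    """Heap-organized: physical order is simply the write order."""
    return list(rows)


def iot_store(rows):
    """Index-organized: physical order follows the primary key."""
    return sorted(rows, key=lambda row: row[0])


def range_positions(storage, low, high):
    """Physical positions of the rows whose primary key is in [low, high]."""
    return [i for i, (pk, _) in enumerate(storage) if low <= pk <= high]


def is_contiguous(positions):
    """True when the positions form one unbroken run."""
    return all(b - a == 1 for a, b in zip(positions, positions[1:]))


# Rows arrive with scattered primary keys.
arrivals = [(7, "a"), (42, "e"), (19, "c"), (30, "d"), (11, "b")]
heap = heap_store(arrivals)
iot = iot_store(arrivals)

print([pk for pk, _ in heap])  # keys in write order
print([pk for pk, _ in iot])   # keys in primary key order
print(is_contiguous(range_positions(heap, 7, 19)))
print(is_contiguous(range_positions(iot, 7, 19)))
```

In the index-organized list the rows for keys 7 to 19 sit next to each other, so a range scan reads one contiguous run; in the heap list the same rows are interleaved with others.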
Advantages of an IOT table compared with a HOT table:
- Range queries are more efficient;
- When data is updated frequently (but the clustered index key itself is not), fragmentation is less likely;
- It is especially suited to workloads where a small amount of hot data is read and written frequently;
- Access to data by primary key is fast.
The shortcoming of an IOT table is:
- If data changes are mostly scattered, efficiency will be worse than with a HOT table.
The shortcomings of a HOT table are:
- Going from an index back to the table to read rows is expensive;
- Most reads are random and cannot be guaranteed to be sequential, so the overhead is high.
Only one clustered index can be created per InnoDB table, and the clustered index may consist of one or more columns.
As mentioned above, an InnoDB table is a clustered index-organized table, and its clustered index is chosen as follows:
- First, an explicitly defined primary key is chosen as the clustered index;
- If there is none, the first unique index that does not allow NULL is chosen;
- If there is still none, InnoDB's built-in ROWID is used as the clustered index.
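The selection rule above can be written down as a short function. This is only a sketch of the three steps as stated in the text, not InnoDB's source code; the `Index` and `Table` classes, the `choose_clustered_index` function, and the `"<built-in ROWID>"` label are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Index:
    name: str
    columns: tuple
    unique: bool = False
    primary: bool = False


@dataclass
class Table:
    nullable: dict  # column name -> True if the column allows NULL
    indexes: list   # Index objects in definition order


def choose_clustered_index(table):
    """Apply the three-step rule and return the chosen index name."""
    # 1. An explicitly defined primary key wins.
    for index in table.indexes:
        if index.primary:
            return index.name
    # 2. Otherwise, the first unique index whose columns are all NOT NULL.
    for index in table.indexes:
        if index.unique and not any(table.nullable[c] for c in index.columns):
            return index.name
    # 3. Otherwise, fall back to the engine's built-in row ID.
    return "<built-in ROWID>"


table = Table(
    nullable={"id": False, "email": True, "code": False},
    indexes=[
        Index("uk_email", ("email",), unique=True),  # allows NULL, skipped
        Index("uk_code", ("code",), unique=True),    # first NOT NULL unique index
    ],
)
print(choose_clustered_index(table))
```

Here the table has no primary key and `uk_email` allows NULL, so the function falls through to `uk_code`. Adding an index with `primary=True` would make that index the result regardless of its position in the list.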
As you can see, in a leaf node of this index structure, the node key is the primary key value, and the node value stores the remaining column data along with additional information such as the ROWID, rollback pointer, and transaction ID (trx id).
Combining this diagram with the description above, we can see that in an InnoDB table the clustered index is equivalent to the whole table, and the whole table is the clustered index. A primary key is necessarily the clustered index, but the clustered index is not necessarily a primary key.
MyISAM tables are heap-organized, so they have no concept of a clustered index.
The comparison between the two
The following is a simple comparison table