A learning guide to the concepts of primary key index and clustered index in MySQL

Last Update:2017-01-19 Source: Internet

Author: User

Primary Key Index

The primary key index, usually just called the primary key (PRIMARY KEY in the original terminology), consists of one or more columns and uniquely identifies a record in a table. A table may have no primary key, but it can have at most one, and primary key values cannot contain NULL.

In MySQL, we usually follow several principles when designing the primary key of an InnoDB table:

Use an auto-increment column with no business meaning as the primary key;
Never update primary key values; rows are only inserted or deleted;
Do not choose types that are updated dynamically, such as the current timestamp.
There are several advantages to doing this:

When new data is added, the primary key value increases sequentially, so InnoDB page splits become less likely; see the earlier post "[MySQL FAQ] Series-why InnoDB table should be recommended to use the self-added key";
When business data changes, the primary key value is not modified, so rows are less likely to change physical storage location and InnoDB pages are less likely to fragment.

A MyISAM table is a heap-organized table, so the design of its primary key type does not need to be this particular.
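The page-split argument above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not InnoDB's actual algorithm: pages hold a hypothetical `PAGE_CAPACITY` of 4 keys, a full page that receives a key in the middle is split in half, and a key that lands past the end of the last page simply starts a new page. The function name `count_page_splits` is made up for illustration.

```python
import bisect
import random

PAGE_CAPACITY = 4  # hypothetical; real InnoDB pages hold far more rows


def count_page_splits(keys, capacity=PAGE_CAPACITY):
    """Insert keys into a toy list of sorted pages and count page splits."""
    pages = [[]]  # pages are kept in key order; each page is a sorted list
    splits = 0
    for key in keys:
        # Locate the last page whose first key is <= key (or the first page).
        firsts = [page[0] for page in pages if page]
        idx = max(bisect.bisect_right(firsts, key) - 1, 0)
        page = pages[idx]
        if len(page) < capacity:
            bisect.insort(page, key)
            continue
        if idx == len(pages) - 1 and key > page[-1]:
            # Appending past the end of the table: open a fresh page, no split.
            pages.append([key])
            continue
        # The page is full and the key falls inside it: split the page in half.
        mid = capacity // 2
        left, right = page[:mid], page[mid:]
        bisect.insort(left if key < right[0] else right, key)
        pages[idx:idx + 1] = [left, right]
        splits += 1
    return splits


sequential_keys = list(range(1, 1001))      # like an auto-increment column
scattered_keys = sequential_keys[:]
random.Random(0).shuffle(scattered_keys)    # like a random business key

print("splits with sequential keys:", count_page_splits(sequential_keys))
print("splits with scattered keys:", count_page_splits(scattered_keys))
```

In this model sequential keys never trigger a split because every insert appends at the end, while scattered keys keep landing in pages that are already full.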

A secondary index, which is what we usually mean when we simply say "index", is SECONDARY KEY in the original terminology. Secondary indexes can be further divided into unique and non-unique indexes.

A unique index should really be called a uniqueness constraint: its role is to prevent duplicate values in one or more columns, so it is an index that also acts as a constraint.

In the MyISAM engine, a unique index is not intrinsically different from the primary key index, except that its key values may be NULL. In other words, in MyISAM a unique index that does not allow NULL values is essentially the same as the primary key index.
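To make the NULL distinction concrete, here is a minimal sketch of a uniqueness check. It assumes the usual MySQL behaviour that a unique index on a nullable column accepts several NULL values, because NULL is never considered equal to another NULL. The class `UniqueIndex` is a hypothetical stand-in, not a MySQL API.

```python
class UniqueIndex:
    """Toy uniqueness constraint: rejects duplicate values, lets NULLs through."""

    def __init__(self):
        self._seen = set()

    def insert(self, value):
        if value is None:
            # NULL never compares equal to another NULL, so it cannot collide.
            return
        if value in self._seen:
            raise ValueError(f"Duplicate entry {value!r}")
        self._seen.add(value)


index = UniqueIndex()
index.insert(None)
index.insert(None)          # accepted: a second NULL is not a duplicate
index.insert("a@example.com")
try:
    index.insert("a@example.com")
except ValueError as err:
    print("rejected:", err)
```

A primary key, by contrast, would refuse the `None` inserts outright, which is the only behavioural difference the paragraph above describes for MyISAM.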

In the InnoDB engine, the difference between the primary key index and a secondary index is large. The primary key index is chosen as the clustered index, whereas a unique index and a normal secondary index do not differ in storage in any essential way; only the uniqueness constraint sets them apart.

In terms of query performance, in a MyISAM table the primary key index and a unique index that does not allow NULL perform equivalently. In an InnoDB table, a query through a unique index needs one extra step, from the secondary index to the primary key index. A lookup through a normal (non-unique) index in InnoDB costs more still, because after each result is retrieved, at least one more read is needed to confirm whether further rows qualify; primary key and unique index lookups do not need this.
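The three InnoDB lookup paths described above can be sketched with plain Python containers. This is a toy model with made-up table contents and function names: the clustered index is a dict from primary key to row, the unique index maps a value to a primary key, and the non-unique index is a sorted list of `(value, primary key)` pairs that has to be scanned until a non-matching entry is read.

```python
import bisect

# Clustered index: primary key -> full row (hypothetical sample data).
clustered = {
    1: {"id": 1, "email": "a@example.com", "city": "Hangzhou"},
    2: {"id": 2, "email": "b@example.com", "city": "Beijing"},
    3: {"id": 3, "email": "c@example.com", "city": "Hangzhou"},
}

# Secondary indexes store the primary key, not the row itself.
unique_email = {row["email"]: pk for pk, row in clustered.items()}
city_index = sorted((row["city"], pk) for pk, row in clustered.items())


def find_by_pk(pk):
    """One step: the clustered index leaf already holds the row."""
    return clustered[pk]


def find_by_unique_email(email):
    """Two steps: secondary index -> primary key -> clustered index."""
    pk = unique_email[email]
    return clustered[pk]


def find_by_city(city):
    """Scan the non-unique index; return the rows and the entries read."""
    rows, entries_read = [], 0
    start = bisect.bisect_left(city_index, (city,))
    for value, pk in city_index[start:]:
        entries_read += 1
        if value != city:
            break  # the extra read that confirms there are no more matches
        rows.append(clustered[pk])
    return rows, entries_read


rows, entries_read = find_by_city("Beijing")
print(len(rows), "matching row,", entries_read, "index entries read")
```

For `"Beijing"` the scan reads the single matching entry and then one more entry to learn that the matches have ended, which is the overhead a unique index avoids.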

In testing, with random retrieval (integer type) over 1 million rows, a MyISAM table showed essentially the same efficiency for the primary key and a unique index, while a normal index was about 30% slower. With an InnoDB table, the unique index was about 9% slower than the primary key index, and the normal index was about 50% slower than the primary key index.

Clustered Index
In MySQL, an InnoDB table is a (clustered) index-organized table (IOT), and a MyISAM table is a heap-organized table (HOT).

The clustered index is also referred to as the cluster index.

Of course, the concept of a clustered index is not specific to MySQL; other database systems have it too.

In short, a clustered index is a way of organizing an index in which the logical order of the key values determines the physical storage order of the table's rows. A nonclustered index is an ordinary index: it is simply built on the data columns and does not affect the physical storage order of the table.

Let's take a look at the differences between the two storage types:
Simply put, in an index-organized table (IOT) the physical storage order of the data matches the order of the primary key index, so if newly inserted keys are scattered, the data blocks they are written to will be scattered as well rather than sequential. In a heap-organized table (HOT), data is stored in the order in which it is written.
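A small sketch may make the two storage orders easier to picture. It models each table as a Python list whose order stands in for physical storage order; the function names are hypothetical, and real storage engines work with pages rather than single lists.

```python
import bisect


def heap_organized_order(inserted_keys):
    """Heap-organized table: rows stay in the order they were written."""
    return list(inserted_keys)


def index_organized_order(inserted_keys):
    """Index-organized table: each row is placed according to its key."""
    rows = []
    for key in inserted_keys:
        bisect.insort(rows, key)
    return rows


def range_scan(index_organized_rows, low, high):
    """Rows with low <= key <= high form one contiguous slice."""
    start = bisect.bisect_left(index_organized_rows, low)
    stop = bisect.bisect_right(index_organized_rows, high)
    return index_organized_rows[start:stop]


writes = [30, 10, 50, 20, 40]
print(heap_organized_order(writes))     # [30, 10, 50, 20, 40]
print(index_organized_order(writes))    # [10, 20, 30, 40, 50]
print(range_scan(index_organized_order(writes), 20, 40))  # [20, 30, 40]
```

Because the index-organized rows are already in key order, a range of keys is a single contiguous slice, whereas the same range in the heap-organized list is spread across positions.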
The advantages of an index-organized table (IOT) compared with a heap-organized table (HOT):

Range queries are more efficient;
When data is updated frequently (but the clustered index key itself is not), fragmentation is less likely;
It is especially suitable for workloads where a small amount of hot data is read and written frequently;
Access to data by primary key is fast and direct;
The deficiencies of an IOT are:

If data changes are mainly scattered (non-sequential), efficiency will be worse than with a HOT;

The shortcomings of a HOT are:

Going back from an index entry to the table row is expensive;
Most reads are random and cannot be guaranteed to be sequential, so the overhead is large;

Only one clustered index can be created per InnoDB table, and the clustered index may consist of one or more columns.

As mentioned above, an InnoDB table is a clustered index-organized table, and its clustered index is chosen by the following rules:

First, an explicitly defined primary key is chosen as the clustered index;
If there is none, the first unique index that does not allow NULL is chosen;
If there is still none, InnoDB uses its built-in ROWID as the clustered index;
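The selection rule above can be written out as a short function. This is a sketch of the rule as stated in this article, not InnoDB source code: the `Index` dataclass, its fields, and the `HIDDEN_ROWID` placeholder string are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List

HIDDEN_ROWID = "<built-in ROWID>"  # placeholder label, not a real index name


@dataclass
class Index:
    name: str
    primary: bool = False
    unique: bool = False
    nullable: bool = False  # True if any indexed column allows NULL


def choose_clustered_index(indexes: List[Index]) -> str:
    """Apply the three rules in order; indexes are listed in definition order."""
    # Rule 1: an explicitly defined primary key wins.
    for index in indexes:
        if index.primary:
            return index.name
    # Rule 2: otherwise the first unique index that does not allow NULL.
    for index in indexes:
        if index.unique and not index.nullable:
            return index.name
    # Rule 3: otherwise fall back to the engine's built-in ROWID.
    return HIDDEN_ROWID


table_indexes = [
    Index("uk_nickname", unique=True, nullable=True),
    Index("uk_email", unique=True),
    Index("idx_city"),
]
print(choose_clustered_index(table_indexes))  # uk_email
```

Here `uk_nickname` is skipped because it allows NULL, so the first qualifying unique index, `uk_email`, is chosen.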

As you can see, in the leaf nodes of this index structure, the node key is the primary key value, and the node value stores the remaining column data along with additional information such as the ROWID, the rollback pointer, and the transaction ID.

Combined with this diagram and the description above, we can see that in an InnoDB table the clustered index is equivalent to the entire table, and the entire table is the clustered index. A primary key is necessarily the clustered index, but the clustered index is not necessarily a primary key.

A MyISAM table is a heap-organized table, which has no concept of a clustered index.

The comparison between the two

The following is a simple comparison table

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
