Clustered index
A clustered index is not a separate index type, but rather a way of storing data (not a structure, but a storage structure), and the details depend on how it is implemented, and the clustered index actually holds the Btree index and data rows in the same structure.
InnoDB will aggregate the data through the primary key, and if no primary key is defined, InnoDB chooses the first non-null unique index instead, and if there is no non-null unique index, INNODB implicitly defines a 6-byte rowID primary key to be used as the clustered index
The leaf page contains all the data for the row, but the node page contains only the index column (or the node page that can say that the non-leaf node contains index values, because the node pages contain values that are extracted from the index column)
Because clustered indexes and table data are put together, there can be only one clustered index in a table
Data distribution comparison of InnoDB and MyISAM physical storage:
MyISAM
is stored on disk in the order in which the data is inserted, the primary key index and the level two index in MyISAM are structurally not different, and the primary key index is a unique non-empty index named primary.
InnoDB
Because InnoDB supports clustered indexes, it stores the same data in very different ways, InnoDB the clustered index contains the entire table's data, not just the index, because in InnoDB, the clustered index is the table, So, unlike MyISAM, you need separate row storage. Each leaf node of a clustered index contains primary key values, transaction IDs, rollback pointers for transactions and MVCC, and values for all remaining columns, and if the primary key is a column prefix index, INNODB also contains the full primary key column and the remaining column values.
Another thing that differs from MyISAM is that the InnoDB level two index and the clustered index are very different, and the leaf node of the InnoDB two index is not the row pointer, but the primary key value, and as a pointer to the row, Such a strategy reduces the maintenance of the two-level index when there is a row movement or splitting of the data page, and using a primary key value as a pointer makes the two-level index take up more space, in exchange for the benefit that the InnoDB does not need to update the pointer in the two-level index when moving the row.
Benefits of Clustered Indexes:
Faster data access, clustered indexes keep indexes and data in the same btree, so getting data from a clustered index is usually faster than finding it in a nonclustered index (data is found directly through the index)
Disadvantages of Clustered indexes:
Clustered data maximizes the performance of IO-intensive applications, but if the data is all in memory, the order of access is less important, and the clustered index has no advantage
Insert data operation cost is relatively large
Secondary indexes may be larger than expected because leaf nodes in a Level two index contain the primary key columns of the referencing row
Secondary index access requires two index lookups instead of one
Because a two-level index leaves a node that is not a pointer to the physical location of the row, it is the primary key value of the row. This means finding rows through a two-level index, where the storage engine needs to find a two-index leaf node to get the corresponding primary key value
Then, based on this primary key value to find the corresponding row in the clustered index, here do the repetitive work, two times Btree lookup instead of once, for InnoDB, adaptive Hash index can reduce such duplication of work.
Use of MySQL clustered index