High Performance MySQL Notes, Chapter 5: Indexing for High Performance, 005: Clustered Indexes

Source: Internet
Author: User

I. Introduction to clustered indexes

1. What is a clustered index?

InnoDB's clustered indexes actually store a B-tree index and the rows together in the same structure.

2. Why can a table have only one clustered index?

When a table has a clustered index, its rows are actually stored in the index's leaf pages. The term "clustered" refers to the fact that rows with adjacent key values are stored close to each other. You can have only one clustered index per table, because you can't store the rows in two places at once. (However, covering indexes let you emulate multiple clustered indexes; more on this later.)

3. Advantages of clustered indexes

You can keep related data close together. For example, when implementing a mailbox, you can cluster by user_id, so you can retrieve all of a single user's messages by fetching only a few pages from disk. If you didn't use clustering, each message might require its own disk I/O.
Data access is fast. A clustered index holds both the index and the data together in one B-tree, so retrieving rows from a clustered index is normally faster than a comparable lookup in a nonclustered index.
Queries that use covering indexes can use the primary key values contained at the leaf node.
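
The mailbox example can be illustrated with a small model. This is only a sketch under stated assumptions: a "page" is taken to be a fixed run of 50 rows, and the user and message counts are invented for illustration. It counts how many pages hold at least one message for a single user when rows are stored in arrival order versus in (user_id, message_id) order.

```python
import random

ROWS_PER_PAGE = 50  # assumed page capacity, chosen only for illustration


def pages_touched(row_order, user_id):
    """Count the distinct pages holding at least one message for user_id."""
    pages = set()
    for position, (uid, _msg_id) in enumerate(row_order):
        if uid == user_id:
            pages.add(position // ROWS_PER_PAGE)
    return len(pages)


random.seed(1)
# 100 hypothetical users with 40 messages each, arriving in mixed order.
messages = [(uid, msg) for msg in range(40) for uid in range(100)]
random.shuffle(messages)

arrival_order = messages            # rows stored in the order they arrived
clustered_order = sorted(messages)  # rows clustered by (user_id, message_id)

unclustered_pages = pages_touched(arrival_order, user_id=7)
clustered_pages = pages_touched(clustered_order, user_id=7)
print(unclustered_pages, clustered_pages)
```

In the clustered layout the 40 messages of one user are contiguous, so they span at most two pages in this model; in the arrival-order layout they are spread over many more.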

4. Disadvantages of clustered indexes

Clustering gives the largest improvement for I/O-bound workloads. If the data fits in memory, the order in which it's accessed doesn't really matter, so clustering doesn't give much benefit.
Insert speeds depend heavily on insertion order. Inserting rows in primary key order is the fastest way to load data into an InnoDB table. It might be a good idea to reorganize the table with OPTIMIZE TABLE after loading a lot of data if you didn't load the rows in primary key order.
Updating the clustered index columns is expensive, because it forces InnoDB to move each updated row to a new location.
Tables built upon clustered indexes are subject to page splits when new rows are inserted, or when a row's primary key is updated such that the row must be moved. A page split happens when a row's key value dictates that the row must be placed into a page that is full of data. The storage engine must split the page into two to accommodate the row. Page splits can cause a table to use more space on disk.
Clustered tables can be slower for full table scans, especially if rows are less densely packed or stored nonsequentially because of page splits.
Secondary (nonclustered) indexes can be larger than you might expect, because their leaf nodes contain the primary key columns of the referenced rows.

Secondary index accesses require two index lookups instead of one.

II. The difference between a clustered index (with InnoDB) and a nonclustered index (with MyISAM)

Table structure

CREATE TABLE layout_test (
    col1 int NOT NULL,
    col2 int NOT NULL,
    PRIMARY KEY (col1),
    KEY (col2)
);

1. Structure of MyISAM

In fact, in MyISAM, there is no structural difference between a primary key and any other index. A primary key is simply a unique, nonnullable index named PRIMARY.
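
A toy model may make the MyISAM layout easier to picture. It is a simplification, not MyISAM's actual on-disk format: it assumes rows sit in a data file in insertion order and that every index, primary or not, is a sorted list of (key, row position) pairs. The row values and the `lookup` helper are hypothetical.

```python
import bisect

# Rows of layout_test in the order they were inserted: (col1, col2).
data_file = [(99, 8), (12, 56), (3000, 62), (4700, 13)]

# Both indexes have the same shape: sorted (key, row position) pairs.
primary_index = sorted((col1, pos) for pos, (col1, _) in enumerate(data_file))
col2_index = sorted((col2, pos) for pos, (_, col2) in enumerate(data_file))


def lookup(index, key):
    """Return the first row whose indexed column equals key, or None."""
    # (key, -1) sorts before every real (key, position) pair.
    i = bisect.bisect_left(index, (key, -1))
    if i < len(index) and index[i][0] == key:
        return data_file[index[i][1]]  # follow the row pointer
    return None


print(lookup(primary_index, 3000))
print(lookup(col2_index, 13))
```

The same `lookup` function serves both indexes, which is the point: in this model the primary key differs from the index on col2 only in being unique and nonnullable.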

2. Structure of InnoDB

At first glance, that might not look very different from Figure 5-5. But look again, and notice that this illustration shows the whole table, not just the index. Because the clustered index "is" the table in InnoDB, there's no separate row storage as there is for MyISAM.

Each leaf node in the clustered index contains the primary key value, the transaction ID and rollback pointer that InnoDB uses for transactional and MVCC purposes, and the rest of the columns (in this case, col2). If the primary key is on a column prefix, InnoDB includes the full column value with the rest of the columns.

Also in contrast to MyISAM, secondary indexes are very different from clustered indexes in InnoDB. Instead of storing "row pointers," InnoDB's secondary index leaf nodes contain the primary key values, which serve as the "pointers" to the rows. This strategy reduces the work needed to maintain secondary indexes when rows move or when there's a data page split. Using the row's primary key values as the pointer makes the index larger, but it means InnoDB can move a row without updating pointers to it.
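
The two-step lookup through a secondary index can be sketched in the same spirit. This is an illustrative model only: a dict stands in for the clustered B-tree, the row values are invented, and `find_by_col2` is a hypothetical helper. The secondary index stores (col2, primary key) pairs, so finding a row by col2 takes one search in the secondary index and a second one in the clustered index.

```python
import bisect

# Clustered index: primary key (col1) -> full row. The rows live here.
clustered = {
    3: {"col1": 3, "col2": 93},
    18: {"col1": 18, "col2": 14},
    99: {"col1": 99, "col2": 8},
    4700: {"col1": 4700, "col2": 13},
}

# Secondary index on col2: sorted (col2, primary key) pairs, no row pointers.
secondary = sorted((row["col2"], pk) for pk, row in clustered.items())


def find_by_col2(value):
    """Return all rows with col2 == value using two index lookups."""
    rows = []
    # Lookup 1: search the secondary index for matching primary keys.
    i = bisect.bisect_left(secondary, (value,))
    while i < len(secondary) and secondary[i][0] == value:
        pk = secondary[i][1]
        # Lookup 2: use the primary key to fetch the row from the clustered index.
        rows.append(clustered[pk])
        i += 1
    return rows


result = find_by_col2(14)
print(result)
```

Because the secondary index refers to rows only by primary key, nothing in `secondary` would need to change if a row were physically relocated inside the clustered structure.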

III. How sequential versus non-sequential primary keys affect a clustered index

1.

Notice that not only does it take longer to insert the rows with the UUID primary key, but the resulting indexes are quite a bit bigger. Some of that is due to the larger primary key, but some of it is undoubtedly due to page splits and resultant fragmentation as well.

2. Why does the order of primary key values make a difference?

Inserting sequential primary keys

Inserting non-sequential primary keys

Disadvantages of inserting non-sequential primary keys:

The destination page might have been flushed to disk and removed from the caches, or might not have ever been placed into the caches, in which case InnoDB will have to find it and read it from the disk before it can insert the new row. This causes a lot of random I/O.
When insertions are done out of order, InnoDB has to split pages frequently to make room for new rows. This requires moving around a lot of data, and modifying at least three pages instead of one.
Pages become sparsely and irregularly filled because of splitting, so the final data is fragmented.
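
The effect of insertion order on page splits can be reproduced with a small simulation. This is a rough sketch, not InnoDB's actual algorithm: the page capacity of 16 keys, the 50/50 split policy, and the rule that an append past the end of the table simply starts a new page are all simplifying assumptions, and `load` is a hypothetical helper.

```python
import bisect
import random

PAGE_CAPACITY = 16  # keys per page; an assumed value for illustration


def load(keys):
    """Insert keys into ordered pages; return (page_count, split_count)."""
    pages = [[]]  # each page is a sorted list; pages are kept in key order
    splits = 0
    for key in keys:
        # Pick the last page whose first key is <= key (linear scan, fine for a sketch).
        first_keys = [page[0] for page in pages[1:]]
        idx = bisect.bisect_right(first_keys, key)
        page = pages[idx]
        if len(page) < PAGE_CAPACITY:
            bisect.insort(page, key)
        elif idx == len(pages) - 1 and key > page[-1]:
            # Appending past the end of the table: start a new page, move nothing.
            pages.append([key])
        else:
            # Insert into a full page: move its upper half to a new page.
            mid = PAGE_CAPACITY // 2
            upper = page[mid:]
            del page[mid:]
            pages.insert(idx + 1, upper)
            bisect.insort(page if key < upper[0] else upper, key)
            splits += 1
    return len(pages), splits


sequential_keys = list(range(1600))
random_keys = sequential_keys[:]
random.seed(42)
random.shuffle(random_keys)

seq_pages, seq_splits = load(sequential_keys)
rnd_pages, rnd_splits = load(random_keys)
print("sequential:", seq_pages, "pages,", seq_splits, "splits")
print("random:    ", rnd_pages, "pages,", rnd_splits, "splits")
```

In this model, sequential keys fill every page completely and never trigger a split, while shuffled keys cause repeated splits and leave pages partly empty, so the same 1600 keys occupy more pages.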
