[mysql-notes] creating high-performance indexes

Source: Internet
Author: User

An index, also called a "key" in MySQL, is a data structure that the storage engine uses to find records quickly. It works like the index of a book, although the implementation differs.

I. Index classification

B-tree Index:

  Advantages:

    • In MyISAM, an index refers to the indexed rows by their physical storage location; in InnoDB, it refers to them by their primary key values.
    • A B-tree index speeds up data access because the storage engine no longer needs a full table scan to find the data; instead it starts the search at the root node of the index.
    • A B-tree stores the indexed columns in sorted order, so it is well suited to range lookups.
    • In general, if a B-tree can find a value in a particular way, it can also sort rows in that same way.

  Limitations:

    • The index cannot be used if the lookup does not start from the leftmost column of the index.
    • You cannot skip columns in the index.
    • If the query has a range condition on a column, none of the columns to its right can use the index to optimize lookups.
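The leftmost-prefix rule can be illustrated with a small Python sketch. It is only a model, not how MySQL stores a B-tree: the composite index is assumed to be a sorted list of tuples, and the column names and sample rows are hypothetical.

```python
from bisect import bisect_left

# Hypothetical composite index on (last_name, first_name, birth_year),
# modeled as a list of key tuples kept in sorted order.
index = sorted([
    ("Allen", "Cuba", 1960),
    ("Allen", "Kim", 1930),
    ("Barrymore", "Julia", 2000),
    ("Basinger", "Viven", 1976),
    ("Basinger", "Vivien", 1979),
])

def prefix_lookup(index, prefix):
    """Find entries whose leading columns equal `prefix` (a tuple)."""
    start = bisect_left(index, prefix)  # binary search to the first candidate
    matches = []
    for entry in index[start:]:
        if entry[:len(prefix)] != prefix:
            break  # sorted order guarantees no later entry can match
        matches.append(entry)
    return matches

def scan_lookup(index, column, value):
    """No leftmost prefix: the sort order is useless, so check every entry."""
    return [entry for entry in index if entry[column] == value]

by_last_name = prefix_lookup(index, ("Allen",))   # can use the sort order
by_first_name = scan_lookup(index, 1, "Kim")      # must scan everything
```

Searching by `last_name` jumps straight to the matching range, while searching by `first_name` alone has to visit every entry, which is the behavior the first limitation describes.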

Hash index:

A hash index is built on a hash table and is useful only for exact lookups that use every column in the index.

Advantages:

    • Very fast

Limitations:

    • A hash index contains only hash codes and row pointers, not the column values, so the values in the index cannot be used to avoid reading the rows.
    • Hash indexes are not stored in index-value order, so they cannot be used for sorting.
    • A hash index does not support lookups on only part of the indexed columns.
    • Hash indexes support only equality comparisons (=, IN(), <=>); they cannot serve range queries.
    • When there are many hash collisions, some index maintenance operations can be expensive.
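These limitations follow from the structure itself. The sketch below models a hash index with a Python dictionary; the table, the `fname` column, and the helper name are assumptions made for illustration, not MySQL internals.

```python
from collections import defaultdict

# Hypothetical table rows; the list position plays the role of a row pointer.
rows = [
    {"id": 1, "fname": "Arjen", "lname": "Lentz"},
    {"id": 2, "fname": "Baron", "lname": "Schwartz"},
    {"id": 3, "fname": "Peter", "lname": "Zaitsev"},
    {"id": 4, "fname": "Vadim", "lname": "Tkachenko"},
]

# Hash index on fname: hash code -> list of row positions.
# Only the hash code and the pointer are stored, never the value itself.
hash_index = defaultdict(list)
for position, row in enumerate(rows):
    hash_index[hash(row["fname"])].append(position)

def lookup_equal(value):
    """Equality lookup: hash the value, follow the pointers, re-check the rows."""
    candidates = hash_index.get(hash(value), [])
    # Colliding values share a slot, so the row must be read to confirm the match.
    return [rows[p] for p in candidates if rows[p]["fname"] == value]

exact = lookup_equal("Peter")    # finds the row
partial = lookup_equal("Pet")    # a prefix hashes differently, so nothing is found
```

Because the slots are addressed by hash code, there is no ordering to exploit for sorting or range conditions, and every hit still requires reading the row.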

II. Advantages of indexes

The most common index type is the B-tree, which stores data in sorted order, so it can be used for ORDER BY and GROUP BY operations. Because the data is sorted, a B-tree also stores related column values close together. And because the index holds the actual column values, some queries can be answered from the index alone.

1. The index greatly reduces the amount of data that the server needs to scan

2. Indexes help the server avoid sorting and temporary tables.

3. The index can turn random I/O into sequential I/O.

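III. High-performance indexing strategies

1) Stand-alone columns

A stand-alone column is an indexed column that appears in the query by itself: it is not part of an expression and is not passed as an argument to a function. If the column is wrapped in an expression, MySQL generally cannot use the index on it.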
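The effect can be observed with a query planner. The sketch below uses Python's built-in `sqlite3` module purely as a convenient stand-in, since it needs no server; SQLite is not MySQL, and the `actor` table and index name are hypothetical. On MySQL the same comparison would be made with `EXPLAIN`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE actor (actor_id INTEGER, name TEXT);
    CREATE INDEX idx_actor_id ON actor (actor_id);
""")

def plan(sql):
    """Return the human-readable steps of SQLite's query plan for `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description

# Column on its own: the planner can search the index.
isolated = plan("SELECT name FROM actor WHERE actor_id = 4")

# Same condition hidden inside an expression: the planner scans instead.
wrapped = plan("SELECT name FROM actor WHERE actor_id + 1 = 5")
```

The two queries ask for the same row, but only the first leaves the column isolated, so only the first is planned as an index search.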

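2) Prefix indexes and index selectivity

Index selectivity is the ratio of distinct indexed values to the total number of rows; the higher it is, the more rows an index lookup filters out. The trick with a prefix index is to choose a prefix long enough to keep selectivity high, but not so long that it wastes space.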
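One way to reason about prefix length is to compute the selectivity for each candidate length and compare it with the selectivity of the full column. The Python sketch below does this on a made-up list of city names; the 90% threshold and the function names are assumptions for illustration, not a rule from the source.

```python
def selectivity(values, prefix_len=None):
    """Distinct values (or distinct prefixes) divided by the number of rows."""
    keys = values if prefix_len is None else [v[:prefix_len] for v in values]
    return len(set(keys)) / len(values)

def shortest_prefix(values, target_ratio=0.9):
    """Shortest prefix whose selectivity reaches target_ratio of the full column's."""
    full = selectivity(values)
    longest = max(len(v) for v in values)
    for n in range(1, longest + 1):
        if selectivity(values, n) >= target_ratio * full:
            return n
    return longest

# Hypothetical column contents.
cities = ["London", "Londrina", "Lonsdale", "Lima",
          "Lisbon", "Lincoln", "Lagos", "Lahore"]

for n in range(1, 6):
    print(n, selectivity(cities, n))  # selectivity rises as the prefix grows

print(shortest_prefix(cities))
```

In practice the same ratios would be computed on the real table with `COUNT(DISTINCT LEFT(col, n)) / COUNT(*)`, and the prefix would be picked where the gains level off.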

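3) Multi-column indexes

Blindly following the advice to "index the columns in the WHERE condition", and creating a separate index on each of them, is a mistake.

When a query filters on two columns that each have their own single-column index, old versions of MySQL could fall back to a full table scan. Newer versions (MySQL 5.0 and later) can use several single-column indexes at once and merge the results, a strategy called index merge.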

There are three variants of this algorithm:

    • Union, for OR conditions
    • Intersection, for AND conditions
    • Unions of intersections, for combinations of the two

When index merge shows up, it is often a sign that the table is poorly indexed:

    • When the server intersects multiple indexes, it usually means that you need one multi-column index containing all the relevant columns, rather than several independent single-column indexes.
    • When the server unions multiple indexes, the algorithm's buffering, sorting, and merging operations usually consume a lot of CPU and memory.
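The union and intersection variants can be modeled with Python sets. This is a conceptual sketch only: each index is assumed to map a column value to the set of matching row ids, and the table and index names are hypothetical.

```python
# Hypothetical single-column indexes on a film_actor-like table.
idx_actor_id = {1: {10, 11, 12}, 2: {13, 14}}
idx_film_id = {1: {10, 13}, 2: {11, 14}, 3: {12}}

def index_merge_union(actor_id, film_id):
    """WHERE actor_id = ? OR film_id = ?  ->  union of both index results."""
    return idx_actor_id.get(actor_id, set()) | idx_film_id.get(film_id, set())

def index_merge_intersection(actor_id, film_id):
    """WHERE actor_id = ? AND film_id = ?  ->  intersection of both results."""
    return idx_actor_id.get(actor_id, set()) & idx_film_id.get(film_id, set())

# A multi-column index on (actor_id, film_id) answers the AND case with a
# single lookup, without building and intersecting two intermediate sets.
idx_actor_film = {
    (1, 1): {10}, (1, 2): {11}, (1, 3): {12},
    (2, 1): {13}, (2, 2): {14},
}

either = index_merge_union(1, 1)
both = index_merge_intersection(1, 1)
direct = idx_actor_film.get((1, 1), set())
```

Here `both` and `direct` hold the same row, but the merge had to read every entry for `actor_id = 1` and every entry for `film_id = 1` first, which is why an intersection usually points to a missing multi-column index.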

4) Choosing a good column order

The order of the columns in a multi-column index is critical.

There is a rule of thumb: when sorting and grouping do not need to be considered, it is usually best to put the most selective columns first.
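As a rough illustration of the rule of thumb, the Python sketch below ranks candidate columns by selectivity. The `payments` rows and column names are invented for the example, and real decisions should also weigh the actual query mix and value distribution.

```python
# Hypothetical rows from a payment-like table.
payments = [
    {"staff_id": 1, "customer_id": 101},
    {"staff_id": 2, "customer_id": 102},
    {"staff_id": 1, "customer_id": 103},
    {"staff_id": 2, "customer_id": 101},
    {"staff_id": 1, "customer_id": 104},
    {"staff_id": 2, "customer_id": 105},
]

def column_selectivity(rows, column):
    """Distinct values in the column divided by the number of rows."""
    return len({row[column] for row in rows}) / len(rows)

def suggested_order(rows, columns):
    """Order columns from most to least selective."""
    return sorted(columns, key=lambda c: column_selectivity(rows, c), reverse=True)

order = suggested_order(payments, ["staff_id", "customer_id"])
```

With this data `customer_id` has five distinct values in six rows and `staff_id` only two, so the sketch suggests an index on `(customer_id, staff_id)`.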

5) Clustered indexes

A clustered index is not a separate index type but a way of storing data. InnoDB's clustered index actually stores the B-tree index and the data rows in the same structure. When a table has a clustered index, its rows are stored in the leaf pages of the index. "Clustered" means that rows with adjacent key values are stored close together.

Advantages:

    • Related data can be kept together, which reduces disk I/O.
    • Data access is faster, because the index and the data are held in the same B-tree.
    • Queries that use covering indexes can use the primary key values stored in the leaf nodes.

Disadvantages:

    • Clustering gives the largest improvement for I/O-bound workloads; if the data fits in memory, the clustered index offers little advantage.
    • Insert speed depends heavily on insertion order.
    • Updating the clustered index columns is expensive.
    • A page split can occur when a new row is inserted, or when a primary key update requires a row to be moved.
    • Full table scans can become slower, especially when rows are sparsely packed or stored non-sequentially because of page splits.
    • Secondary indexes may be larger than expected, because their leaf nodes contain the primary key values of the rows they reference.
    • Access through a secondary index requires two index lookups instead of one.
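The double lookup in the last point can be pictured with two dictionaries. This is a simplified model under stated assumptions: a hypothetical `users` table whose clustered index maps the primary key to the full row, and a secondary index on `email` whose entries hold the primary key rather than a row pointer.

```python
# Clustered index: primary key -> full row (rows live in the leaf pages).
clustered = {
    1: {"id": 1, "email": "ann@example.com", "name": "Ann"},
    2: {"id": 2, "email": "bob@example.com", "name": "Bob"},
    3: {"id": 3, "email": "cat@example.com", "name": "Cat"},
}

# Secondary index on email: email -> primary key value.
secondary_email = {row["email"]: pk for pk, row in clustered.items()}

def find_by_email(email):
    """Fetch the full row: two lookups, secondary index then clustered index."""
    pk = secondary_email.get(email)   # lookup 1: secondary index
    if pk is None:
        return None
    return clustered[pk]              # lookup 2: clustered index

def find_id_by_email(email):
    """Only the primary key is needed, so the secondary index alone suffices."""
    return secondary_email.get(email)

row = find_by_email("bob@example.com")
only_id = find_id_by_email("bob@example.com")
```

The second function shows the flip side: because the secondary index already carries the primary key, a query that needs only that value never has to touch the clustered index.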

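6) Covering indexes

An index that contains (covers) the values of all the fields a query needs is called a covering index.

A covering index must store the values of the indexed columns. Hash, spatial, and full-text indexes do not store these values, so only B-tree indexes can be used as covering indexes.

MySQL cannot perform the LIKE operation in the index. This is a limitation of the low-level storage engine API, which (in MySQL 5.5 and earlier) allows only simple comparisons in index operations. For a pattern with a leading wildcard, the server therefore has to fetch each row and match against the row's values rather than the index's values.

To benefit from a covering index even when the query needs columns that the index does not cover, we can use a technique called a deferred join: a subquery uses the covering index to find the primary keys of the matching rows, and the outer query then fetches the full rows by primary key.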
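A deferred join might look like the following sketch. It runs against Python's built-in `sqlite3` module only so that it is self-contained; SQLite is a stand-in for MySQL here, and the `products` table, its columns, the index name, and the sample rows are all hypothetical. The sketch shows the shape of the query and that it returns the same rows as the direct form, not any performance difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (
        prod_id     INTEGER PRIMARY KEY,
        actor       TEXT,
        title       TEXT,
        description TEXT
    );
    CREATE INDEX idx_actor_title ON products (actor, title);
""")
conn.executemany(
    "INSERT INTO products (actor, title, description) VALUES (?, ?, ?)",
    [
        ("SEAN CARREY", "APOLLO TEEN", "long description 1"),
        ("SEAN CARREY", "ACE GOLDFINGER", "long description 2"),
        ("PENELOPE GUINESS", "APOLLO TEEN", "long description 3"),
    ],
)
params = ("SEAN CARREY", "%APOLLO%")

# Direct form: filters and fetches full rows in one step.
direct = conn.execute(
    "SELECT * FROM products WHERE actor = ? AND title LIKE ? ORDER BY prod_id",
    params,
).fetchall()

# Deferred join: the subquery touches only indexed columns plus the primary
# key; the outer query then fetches full rows for the keys that survived.
deferred = conn.execute(
    """
    SELECT p.*
    FROM products AS p
    JOIN (
        SELECT prod_id FROM products WHERE actor = ? AND title LIKE ?
    ) AS t ON t.prod_id = p.prod_id
    ORDER BY p.prod_id
    """,
    params,
).fetchall()
```

Whether the rewrite pays off depends on how many rows the subquery filters out; if most rows match, the extra join is pure overhead.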

7) Using index scans for sorting

Reference: High Performance MySQL, Chapter 5
