MySQL database optimization study 3: index optimization (2)

Source: Internet
Author: User

High-performance indexing strategies
Creating the right indexes and using them properly is key to query performance. We have already introduced the various index types and their strengths and weaknesses. This part looks at how to put indexes to work.

There are many ways to choose and use indexes effectively, because there are many special-case optimizations and specialized behaviors.

Isolated column
MySQL generally cannot use an index on a column unless the column is isolated in the query. Isolating a column means that it is not part of an expression and is not inside a function.
For example:
SQL code
SELECT actor_id FROM sakila.actor WHERE actor_id + 1 = 5;
SELECT ... WHERE TO_DAYS(CURRENT_DATE) - TO_DAYS(date_col) <= 10;

You can rewrite these queries so that the indexed column stands alone on one side of the comparison:
SQL code
SELECT actor_id FROM sakila.actor WHERE actor_id = 4;
SELECT ... WHERE date_col >= DATE_SUB(CURRENT_DATE, INTERVAL 10 DAY);

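You can go one step further and replace CURRENT_DATE in the second statement with a literal date value, so that the query can hit the query cache (the literal is shown here as the placeholder 'YYYY-MM-DD'):
SQL code
SELECT ... WHERE date_col >= DATE_SUB('YYYY-MM-DD', INTERVAL 10 DAY);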
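To check whether an isolated column really changes the plan, you can compare the two forms with EXPLAIN. This is a minimal sketch, assuming the standard sakila sample database is installed and that actor_id is the primary key of sakila.actor:

```sql
-- Column wrapped in an expression: the optimizer cannot do a key lookup on actor_id.
EXPLAIN SELECT actor_id FROM sakila.actor WHERE actor_id + 1 = 5;

-- Column isolated on one side of the comparison: a primary key lookup is possible.
EXPLAIN SELECT actor_id FROM sakila.actor WHERE actor_id = 4;
```

In the first plan the type column will typically show a scan (index or ALL), while the second should show a single-row lookup such as const. The exact output depends on the MySQL version.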


Prefix index and index selectivity
Sometimes you need to index a very long string column, which makes the index large and slow. One strategy is to create a hash index, which we discussed earlier.
Another strategy is to index only the first few characters of the column instead of the whole value. This saves space but can make the index less selective. Index selectivity is the ratio of the number of distinct indexed values to the total number of rows in the table. A highly selective index is good because it lets MySQL filter out more rows when it looks for matches.
A prefix index can still perform well if the prefix is selective enough. For BLOB or TEXT columns, or very long VARCHAR columns, you must define a prefix index, because MySQL does not allow indexing their full length.

The trick is to choose a prefix that is long enough to give good selectivity but short enough to save space. For example, to index the first seven characters of the city column in a demo table:
SQL code
ALTER TABLE sakila.city_demo ADD KEY (city(7));
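One way to choose the prefix length is to compute the selectivity of several candidate prefixes and compare them with the selectivity of the full column. The query below is a sketch that assumes the sakila.city_demo table used above; the candidate lengths 3, 5, and 7 are only illustrative:

```sql
-- Selectivity = distinct values / total rows.
-- Pick the shortest prefix whose selectivity is close to sel_full.
SELECT
    COUNT(DISTINCT city) / COUNT(*)          AS sel_full,
    COUNT(DISTINCT LEFT(city, 3)) / COUNT(*) AS sel_3,
    COUNT(DISTINCT LEFT(city, 5)) / COUNT(*) AS sel_5,
    COUNT(DISTINCT LEFT(city, 7)) / COUNT(*) AS sel_7
FROM sakila.city_demo;
```

Average selectivity can hide a skewed distribution, so it is also worth checking how many rows share the most common prefixes before settling on a length.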


Clustered index:
A clustered index is not a separate index type but a way of storing data. The details depend on the implementation. InnoDB's clustered index stores a B-Tree index and the rows together in the same structure. When a table has a clustered index, its rows are actually stored in the leaf nodes of the index. "Clustered" means that rows with adjacent key values are stored close to each other. A table can have only one clustered index, because you cannot store a row in two places at the same time.
(However, covering indexes let you simulate multiple clustered indexes.)
Because storage engines are responsible for implementing indexes, not all storage engines support clustered indexes. Currently, only solidDB and InnoDB support them. We discuss only InnoDB here, but the principles apply to any storage engine with clustered indexes.

Some database servers let you choose which index to cluster on, but MySQL's storage engines currently do not. InnoDB clusters the data by the primary key. If you do not define a primary key, InnoDB will try to use a unique index on non-null columns instead. If there is no such index, InnoDB defines a hidden primary key and clusters on that. InnoDB clusters records together only within a page, so pages with adjacent key values may still be far apart on disk.

A clustered index can help performance, but it can also cause serious performance problems. Think carefully about clustering, especially when you change a table's storage engine from InnoDB to another engine.

Clustered index advantages:
1. Related data is kept close together. For example, if you implement a mail system, you can cluster by user_id, so that you can retrieve all of a single user's messages by reading only a few disk pages. Without clustering, each message may require its own disk I/O.
2. Data access is fast. A clustered index holds both the index and the data in one B-Tree, so retrieving rows through it is normally faster than a comparable lookup through a nonclustered index.
3. Queries that use covering indexes can use the primary key values stored in the leaf nodes, so they do not need to look up the row by its key.
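The mail system example from the list above could be sketched as follows. The table and column names are hypothetical; the point is only that the primary key starts with user_id, so InnoDB stores each user's messages next to each other:

```sql
-- Hypothetical mailbox table: rows are clustered by (user_id, message_id).
CREATE TABLE message (
    user_id    INT UNSIGNED    NOT NULL,
    message_id BIGINT UNSIGNED NOT NULL,
    subject    VARCHAR(255)    NOT NULL,
    body       TEXT            NOT NULL,
    PRIMARY KEY (user_id, message_id)
) ENGINE=InnoDB;

-- All messages of one user are adjacent in primary key order,
-- so this range read touches relatively few pages.
SELECT message_id, subject
FROM message
WHERE user_id = 42
ORDER BY message_id;
```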

Clustered index disadvantages:
1. Clustering gives the largest improvement for I/O-bound workloads. If the data fits in memory, the order in which it is accessed matters much less, so clustering brings little benefit.
2. Insert speed depends heavily on insertion order. Inserting rows in primary key order is the fastest way to load data into an InnoDB table. If you did not load the data in primary key order, it is best to reorganize the table with OPTIMIZE TABLE after loading.
3. Updating clustered index columns is expensive, because it forces InnoDB to move each updated row to a new location.
4. When a new row is inserted into a table with a clustered index and it is not in primary key order, a page split may occur. A page split happens when a row's key dictates that the row must be placed in a page that is already full. Page splits cause the table to use more space on disk.
5. Secondary indexes can be larger than you might expect, because their leaf nodes contain the primary key columns of the rows they reference.
6. Secondary index access requires two index lookups instead of one: one in the secondary index to find the primary key value, and one in the clustered index to find the row.
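A common way to limit the insertion-order and secondary-index-size problems listed above is to cluster on a short, monotonically increasing key. The sketch below uses a hypothetical table; the names are illustrative only:

```sql
-- Hypothetical table: an AUTO_INCREMENT surrogate key keeps inserts in
-- primary key order, and its small size keeps secondary indexes compact,
-- because every secondary index entry also stores the primary key value.
CREATE TABLE user_session (
    id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    token      CHAR(36)        NOT NULL,
    created_at DATETIME        NOT NULL,
    PRIMARY KEY (id),
    UNIQUE KEY uq_token (token)
) ENGINE=InnoDB;

-- After a bulk load that was not in primary key order, rebuild the table
-- to reorganize its pages.
OPTIMIZE TABLE user_session;
```

For InnoDB, OPTIMIZE TABLE is carried out as a table rebuild, so on a large table it can take a long time and should be scheduled accordingly.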

Covering index
Indexes are a way to find rows efficiently, but MySQL can also use an index to retrieve a column's data directly, so it does not have to read the row at all. The leaf nodes of an index contain the values they index, so there is no reason to read the row when the index already provides the data you want. An index that contains all the data needed to satisfy a query is called a covering index.

Covering indexes are a very powerful tool and can improve performance dramatically. Advantages:
1. Index entries are usually much smaller than full rows, so MySQL accesses much less data if it reads only the index. This is good for caching, because the index is much smaller than the data and fits in memory better. It is especially true for MyISAM, which can compress its indexes to make them even smaller.
2. Indexes are sorted by their index values, so range accesses need less I/O than fetching each row from a random disk location. For some storage engines, such as MyISAM, you can even OPTIMIZE the table to get fully sorted indexes, which lets simple range queries use completely sequential index access.
3. Most storage engines cache indexes better than data (Falcon is an exception). Some storage engines, such as MyISAM, cache only the index in MySQL's own memory. Because the operating system caches the data for MyISAM, accessing it usually requires a system call. This can have a large performance cost, especially for cached workloads, where the system call is the most expensive part of data access.
4. Covering indexes are especially helpful for InnoDB tables because of InnoDB's clustered index. The leaf nodes of an InnoDB secondary index store the row's primary key values. Therefore, if the secondary index covers the query, the second lookup in the primary key can be avoided.

Not every kind of index can act as a covering index. A covering index must store the values of the columns it indexes. Hash, spatial, and full-text indexes do not store these values, so only B-Tree indexes can cover queries. Support also differs between storage engines (for example, the Memory and Falcon engines do not support covering indexes yet).

When a query is covered by an index, the Extra column of the EXPLAIN output shows "Using index".
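As a rough illustration of a covering index, the sketch below assumes the standard sakila sample database, in which the inventory table has a two-column index on (store_id, film_id):

```sql
-- Both selected columns are in the (store_id, film_id) index,
-- so the index alone can answer the query.
EXPLAIN SELECT store_id, film_id FROM sakila.inventory;

-- last_update is not part of that index, so the rows themselves must be read.
EXPLAIN SELECT store_id, film_id, last_update FROM sakila.inventory;
```

The first plan should report "Using index" in the Extra column, while the second should not. The exact plans can vary with the MySQL version and table statistics.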


Sort by index scan
MySQL can produce ordered results in two ways: 1. with a filesort; 2. by scanning an index in order.
You can tell that MySQL plans to scan an index in order when the type column of the EXPLAIN output shows "index".

Scanning the index itself is fast, because it only requires moving from one index entry to the next. However, if MySQL is not using the index to cover the query, it has to look up each row it finds in the index. This is basically random I/O, so reading data in index order is usually much slower than a sequential table scan.

MySQL can use the same index for both sorting and finding rows. An index can be used to sort the result only when the index order is exactly the same as the ORDER BY clause and all sorted columns are sorted in the same direction (all ascending or all descending). If the query joins multiple tables, it works only when every column in the ORDER BY clause refers to the first table. The ORDER BY clause also has to form a leftmost prefix of the index. In all other cases, MySQL uses a filesort.
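The rules above can be tried out with EXPLAIN. This sketch assumes the standard sakila sample database, where the rental table has an index on (rental_date, inventory_id, customer_id):

```sql
-- The WHERE clause fixes the first index column with a constant, and
-- ORDER BY continues with the next index columns in the same direction.
EXPLAIN SELECT rental_id, staff_id
FROM sakila.rental
WHERE rental_date = '2005-05-25'
ORDER BY inventory_id, customer_id;

-- Mixed sort directions do not match the order of the index.
EXPLAIN SELECT rental_id, staff_id
FROM sakila.rental
WHERE rental_date = '2005-05-25'
ORDER BY inventory_id DESC, customer_id ASC;
```

The first plan should not need "Using filesort" in the Extra column, while the second typically does. The actual plans depend on the MySQL version and on how the index is defined.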

Compressed (prefix-compressed) index
MyISAM uses prefix compression to reduce index size, so that more of the index fits in memory, which can greatly improve performance in some cases. It compresses string values by default, and you can tell it to compress integer values as well.

MyISAM compresses each index block. It stores the block's first value in full, then stores each later value as the number of bytes it shares with the preceding value plus the suffix that differs. For example, if the first value is "perform" and the second is "performance", the second value is stored as "7,ance". MyISAM can also prefix-compress the pointers of adjacent rows.

Compressed blocks use less space, but they make some operations slower. Because each value's compressed prefix depends on the value before it, MyISAM cannot use binary search to find a value within an index block and must scan the block from the beginning.
Sequential forward scans perform well, but reverse scans, such as ORDER BY ... DESC, do not work as well.
Any operation that needs a value in the middle of a block has to scan the block sequentially, and on average it scans half the block. We ran a performance test and found that lookups on compressed indexes were several times slower because of the scans needed for random lookups, and reverse scans were even worse. This is a trade-off between CPU and I/O: a compressed index may take only about 1/10 of the original disk space.

You can use the PACK_KEYS option when creating a table to control whether the index of a table is compressed.
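A minimal sketch of the PACK_KEYS table option follows. The table name and columns are hypothetical, and the option only has an effect on MyISAM tables:

```sql
-- Hypothetical MyISAM table with index compression forced on.
-- PACK_KEYS=1 also packs numeric keys; PACK_KEYS=0 disables packing;
-- PACK_KEYS=DEFAULT packs only long CHAR, VARCHAR, BINARY, or VARBINARY columns.
CREATE TABLE word_list (
    word VARCHAR(64) NOT NULL,
    KEY idx_word (word)
) ENGINE=MyISAM PACK_KEYS=1;

-- The option can be changed later, which rebuilds the table.
ALTER TABLE word_list PACK_KEYS=0;
```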

Redundant and duplicate indexes
MySQL allows you to create multiple indexes on the same column. It has to maintain each duplicate index separately, and the query optimizer has to consider each of them when it optimizes queries. This can cause serious performance problems.

You may create duplicate indexes without realizing it, for example:
SQL code
CREATE TABLE test (
    id INT NOT NULL PRIMARY KEY,
    UNIQUE (id),
    INDEX (id)
);

MySQL implements UNIQUE and PRIMARY KEY constraints with indexes, so this statement actually creates three indexes on the same column, id.

Redundant indexes are a little different from duplicate indexes. If you create a composite index on (A, B) and another index on (A), the index on (A) is redundant because it is a prefix of the first index.
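One way to clean up duplicates is to list a table's indexes and drop the ones that repeat the primary key. The sketch below uses a hypothetical table with explicitly named indexes, so that the DROP INDEX clauses do not rely on MySQL's automatically generated index names:

```sql
-- Hypothetical table with two indexes that duplicate the primary key.
CREATE TABLE test_dup (
    id INT NOT NULL,
    PRIMARY KEY (id),
    UNIQUE KEY uq_id (id),
    KEY idx_id (id)
);

-- List every index on the table together with its columns.
SHOW INDEX FROM test_dup;

-- The primary key already enforces uniqueness and provides the index,
-- so the other two indexes can be dropped.
ALTER TABLE test_dup
    DROP INDEX uq_id,
    DROP INDEX idx_id;
```

Newer MySQL versions may emit a warning when a duplicate index is created, but they still create it.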

Index and lock
Indexes play an important role in InnoDB because they let queries lock fewer rows. This is an important consideration because, in MySQL 5.0, InnoDB does not release a row lock until the transaction ends.
If a query never touches rows it does not need, it locks fewer rows, which is better for performance for two reasons:
1. Although InnoDB's row locks are very efficient and use very little memory, row locking still carries some overhead.
2. Locking more rows than necessary increases lock contention and reduces concurrency.
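The effect can be observed with two client sessions. This is only a sketch, assuming the sakila sample database; whether the second session actually blocks depends on the MySQL version, the isolation level, and the plan the optimizer chooses:

```sql
-- Session 1: the query returns actor_id 2 through 4, but if InnoDB reads
-- the index range starting at actor_id 1, it may lock that row as well.
SET autocommit = 0;
BEGIN;
SELECT actor_id FROM sakila.actor
WHERE actor_id < 5 AND actor_id <> 1
FOR UPDATE;

-- Session 2: may wait until session 1 ends its transaction, even though
-- session 1 never returned the row with actor_id = 1.
SELECT actor_id FROM sakila.actor WHERE actor_id = 1 FOR UPDATE;

-- Session 1: ending the transaction releases its row locks.
COMMIT;
```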

Where possible, extend an existing index instead of adding a new one, because maintaining one multicolumn index is usually cheaper than maintaining several single-column indexes. If you do not yet know the distribution of your queries, try to index the most selective columns.
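Extending an index instead of adding one might look like the sketch below. The orders table and its index names are hypothetical:

```sql
-- Hypothetical table that starts with a single-column index.
CREATE TABLE orders (
    order_id    INT UNSIGNED NOT NULL AUTO_INCREMENT,
    customer_id INT UNSIGNED NOT NULL,
    order_date  DATE         NOT NULL,
    PRIMARY KEY (order_id),
    KEY idx_customer (customer_id)
) ENGINE=InnoDB;

-- Instead of adding a separate index on order_date, replace idx_customer
-- with a wider index. Queries that filter on customer_id alone can still
-- use it, because customer_id is its leftmost prefix.
ALTER TABLE orders
    DROP INDEX idx_customer,
    ADD INDEX idx_customer_date (customer_id, order_date);
```

The wider index makes each entry larger, so this trade-off is worth checking against the queries that actually run on the table.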

Supporting multiple filtering conditions
Indexes created on highly selective columns are usually more efficient.
