MySQL's index optimizer

Source: Internet
Author: User

Many database systems perform poorly because they have never been tuned as a whole and are running a large number of inefficient SQL statements. The main cause of those slow statements is the lack of efficient indexes. A statement without a usable index not only runs slowly itself, it also generates a large amount of disk I/O, which drags down the performance of the entire system.

The first step in fixing such a system is to optimize the SQL statements that have no index or an inadequate one. The commands and examples that follow are for SQL Server.
1. Key to index creation
The key to optimizing a SQL statement is to minimize its logical reads. Logical reads are the total number of data pages the statement has to access while it executes; each page is 8 KB. The fewer the logical reads, the less memory and CPU time the statement needs and the faster it runs. The greatest benefit of an index is that it can drastically reduce the logical reads of a statement and therefore its execution time, so the key to creating an index is that it significantly reduces logical reads. Whether an index is good depends mainly on how many logical reads it saves. Run SET STATISTICS IO ON to get the logical reads for a SQL statement:
SET STATISTICS IO ON
SELECT au_id, au_lname, au_fname
FROM pubs.dbo.authors WHERE au_lname = 'Green'
SET STATISTICS IO OFF
If logical reads are high but the number of rows returned is small, that is, the gap between the two is large, the statement usually needs to be optimized. Logical reads include both the pages the statement reads from the in-memory data buffer and the pages read from physical disk. Physical reads, by contrast, count only the data pages that were not in the buffer and had to be read from disk. Read-ahead reads are pages that SQL Server prefetches to improve performance, and read-ahead may bring in more data than the statement needs. When optimizing, focus on logical reads. Note that if physical reads or read-ahead reads are high, part of the statement's execution time (duration) is usually being spent waiting on physical disk I/O.
2. Single-field index, composite index and covering index

A single-field index is an index on just one field, whereas a composite index is an index on multiple fields.
(1) Index the fields that appear in the WHERE clause
SET STATISTICS PROFILE ON
SET STATISTICS IO ON
GO
SELECT .... FROM TB WHERE ...
GO
SET STATISTICS PROFILE OFF
SET STATISTICS IO OFF
The SET STATISTICS PROFILE command outputs the execution plan of the statement.
You might ask why not use SET SHOWPLAN_ALL. That works too. The difference is that SET STATISTICS PROFILE outputs the plan that was actually used when the statement ran, whereas SET SHOWPLAN_ALL outputs the estimated plan, and the statements that follow SET SHOWPLAN_ALL ON are not actually executed. Suppose the plan for the statement above shows a Table Scan, meaning that the entire table is read. A full table scan usually performs poorly and should be avoided wherever possible. If that SELECT is a critical statement that the system runs frequently, create an appropriate index for it. One rule of thumb is to index the fields that often appear in WHERE conditions. With such an index in place, the Table Scan becomes an Index Seek and performance improves considerably. Avoiding table scans and index scans is one of the most common techniques for optimizing SQL statements, because an Index Seek usually needs far fewer logical reads than either.
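As a minimal sketch of this rule, the following indexes a field that is used in a WHERE condition and then checks the plan. It assumes the pubs sample database used earlier; the index name IX_authors_city is hypothetical. On a table as small as authors the optimizer may still prefer a scan, so compare the logical reads and the plan before and after creating the index rather than expecting a particular result.

```sql
USE pubs
GO
-- Hypothetical index name; city is the field filtered on in the query below.
CREATE NONCLUSTERED INDEX IX_authors_city ON dbo.authors (city)
GO
SET STATISTICS PROFILE ON
SET STATISTICS IO ON
GO
SELECT au_id, city FROM dbo.authors WHERE city = 'Oakland'
GO
SET STATISTICS PROFILE OFF
SET STATISTICS IO OFF
GO
```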

(2) Composite index
If the WHERE clause contains several fields, consider creating a composite index. The order of the fields in a composite index matters: the more selective (closer to unique) a field is, the earlier it should appear. In addition, whether the index is composite or single-field, avoid fields with very low selectivity. For example, an index on a field that holds only the values 0 and 1 is of little use.
Whichever kind of index you create, we recommend using SET STATISTICS PROFILE to verify that it is actually used, and used fully. The index that yields fewer logical reads is the better one.
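The field-order advice can be sketched as follows, again assuming the pubs sample database and a hypothetical index name. The order (city, state) assumes that city has more distinct values than state; the first query is one way to check that assumption against the real data before creating the index.

```sql
USE pubs
GO
-- Compare how many distinct values each candidate field has.
SELECT COUNT(DISTINCT city)  AS distinct_cities,
       COUNT(DISTINCT state) AS distinct_states
FROM dbo.authors
GO
-- Put the more selective field first (assumed here to be city).
CREATE NONCLUSTERED INDEX IX_authors_city_state ON dbo.authors (city, state)
GO
SELECT au_id FROM dbo.authors WHERE city = 'Oakland' AND state = 'CA'
GO
```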
(3) Covering index
A covering index lets a statement get all the data it needs from the index alone, without accessing the table. Because the leaf nodes of a clustered index are the data rows themselves, the question of covering does not arise there, so covering indexes are mainly a matter for nonclustered indexes. When a nonclustered index does not cover the statement, the execution plan shows a Bookmark Lookup in addition to the Index Seek. The Bookmark Lookup means that after reading the index, the statement still has to look up the row in the table to get the remaining data. Fetching one row therefore costs at least two I/Os: one against the index and one against the base table. If the statement returns many rows, the bookmark lookups become expensive. A covering index avoids them, reduces the number of I/Os, and speeds up the statement. To cover a statement, the index must contain every field that appears in its SELECT and WHERE clauses, with the WHERE fields first and the SELECT fields after them. Once the statement is covered, logical reads drop sharply and the Bookmark Lookup disappears from the plan. Creating a covering index is therefore a very useful technique for reducing logical reads and improving statement performance.
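A sketch of a covering index that follows this field order, assuming the sales table of the pubs sample database; the index name is hypothetical. On SQL Server 2005 and later, fields that appear only in the SELECT list can instead be placed in an INCLUDE clause; the key-field form is shown here because it also works on older versions.

```sql
USE pubs
GO
-- WHERE field (ord_date) first, then the SELECT field (qty).
CREATE NONCLUSTERED INDEX IX_sales_ord_date_qty ON dbo.sales (ord_date, qty)
GO
-- Both referenced fields are in the index, so the base table need not be read.
SELECT ord_date, qty
FROM dbo.sales
WHERE ord_date >= '19940101' AND ord_date < '19950101'
GO
```

Running this query with SET STATISTICS PROFILE ON before and after creating the index is a way to see whether the lookup step is gone from the plan.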
In practice, the principles of index design are more complicated. Sometimes you cannot include every field of the WHERE clause in the index. When deciding whether an index should contain a field, consider the role the field plays in the statement. For example, if a field is often used in the WHERE clause for exact matches that return very few rows, it is definitely worth indexing. Fields such as primary keys and foreign keys, and fields that often appear in GROUP BY or ORDER BY clauses, are also worth indexing.
Question 1: Is it worthwhile to create a clustered index on an identity field?
The answer depends on how the identity field is used in your statements. If you often search on the field and get back very few rows, it is worth indexing. Conversely, if the identity field is rarely used in statements at all, do not build an index on it.
Question 2: How many indexes should a table have?
If more than 80% of the statements against a table are reads, the table can carry more indexes, but still not too many. In particular, do not build many indexes on tables that are updated frequently. Few tables have more than 5 indexes. Too many indexes consume more disk space and increase SQL Server's cost of maintaining them.
Question 3: Why does the execution plan not use an index that you think SQL Server should use?

There are many possible reasons. One is that the statement returns more than 20% of the table's data, so SQL Server judges a scan to be more efficient than a seek. Another is that the statistics on a field are out of date and no longer reflect the distribution of the data.
You can refresh them with the command UPDATE STATISTICS tablename WITH FULLSCAN. Only accurate, up-to-date statistics allow SQL Server to produce a good execution plan; stale statistics often lead it to generate a poorly optimized or even foolish plan. So if a table is updated frequently and the statements that use it seem slow, try running UPDATE STATISTICS with FULLSCAN on it.
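A small sketch of checking and refreshing statistics, assuming the pubs sample database used earlier. The first query relies on the sys.stats catalog view, which assumes SQL Server 2005 or later; the UPDATE STATISTICS statement itself is the command named above.

```sql
USE pubs
GO
-- When were the statistics on dbo.authors last updated?
SELECT name AS stats_name,
       STATS_DATE(object_id, stats_id) AS last_updated
FROM sys.stats
WHERE object_id = OBJECT_ID('dbo.authors')
GO
-- Rebuild all statistics on the table by reading every row.
UPDATE STATISTICS dbo.authors WITH FULLSCAN
GO
```

FULLSCAN reads the whole table, so on large tables it is usually run at a quiet time.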
Question 4: What is a clustered index for, and when should you use a nonclustered index?
SQL Server has two types of index, clustered and nonclustered. The main difference is that the leaf nodes of a clustered index are the data rows themselves, while the leaf nodes of a nonclustered index hold bookmarks that point to the data (either the row identifier or the clustered index key).

A table can have only one clustered index but multiple nonclustered indexes. A clustered index never needs a Bookmark Lookup. When should you use a clustered index, and when a nonclustered one? It depends on how the application accesses the data. My advice is to put the clustered index on the key fields. In general, a table should have a clustered index.
On when to use a clustered index, the SQL Server 2000 online manual has the following description:
Before you create a clustered index, you should understand how your data is accessed. Consider using a clustered index for:

Columns that contain a large number of distinct values.

Queries that return a range of values using operators such as BETWEEN, >, >=, <, and <=.

Columns that are accessed sequentially.

Queries that return large result sets.

Columns that are frequently accessed by queries involving joins or GROUP BY clauses; these are typically foreign key columns. An index on the columns specified in the ORDER BY or GROUP BY clause means SQL Server does not have to sort the data, because the rows are already sorted. This improves query performance.

OLTP-type applications that require very fast single-row lookups, typically by primary key. Create the clustered index on the primary key.

Clustered indexes are not suitable for:

Columns that change frequently. A change causes the entire row to move, because SQL Server must keep the data values of the rows in physical order. This matters especially in high-volume transaction processing systems, where data tends to be volatile.

Wide keys. The key values of the clustered index are used by all nonclustered indexes as lookup keys, so they are stored in every nonclustered index leaf entry.
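The guidance above can be illustrated with a minimal sketch. The table, column, and index names are all hypothetical: a narrow identity column that is never updated serves as the clustered primary key, and a nonclustered index supports lookups on another column.

```sql
-- Hypothetical table: narrow, unchanging identity column as the clustered primary key.
CREATE TABLE dbo.order_log (
    order_id    INT IDENTITY(1,1) NOT NULL,
    customer_id INT NOT NULL,
    created_at  DATETIME NOT NULL,
    CONSTRAINT PK_order_log PRIMARY KEY CLUSTERED (order_id)
)
GO
-- Nonclustered index for lookups by customer; its leaf entries carry
-- order_id (the clustered key) as the lookup key.
CREATE NONCLUSTERED INDEX IX_order_log_customer_id ON dbo.order_log (customer_id)
GO
```

Because the clustered key is a single INT, each nonclustered index entry stays small, which is the concern behind the wide-key warning.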
