High-performance indexing strategy (Part 1)

Last Update:2015-11-20 Source: Internet

Author: User

There are many ways to choose and use indexes efficiently. Some are optimizations for particular cases, and others are optimizations for particular behaviors. Knowing which index to use, and how to evaluate the performance impact of different choices, takes continual practice. The following sections show how to use indexes efficiently.

Stand-alone columns

We often see queries that defeat indexes or prevent MySQL from using an index that already exists. If the indexed column in a query is not isolated, MySQL generally cannot use the index on it. A "stand-alone column" means that the indexed column is neither part of an expression nor an argument to a function.

For example, the following query cannot use the index on actor_id:

SELECT actor_id FROM actor WHERE actor_id + 1 = 5;

A person can easily see that the expression in the WHERE clause is equivalent to actor_id = 4, but MySQL cannot solve the equation for you. That is left entirely to the user. Get in the habit of simplifying WHERE conditions so that the indexed column always stands alone on one side of the comparison operator.

The following is another common mistake:

SELECT ... WHERE TO_DAYS(CURRENT_DATE) - TO_DAYS(date_col) <= 10;
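Both queries above can usually be rewritten so that the indexed column stands alone on one side of the comparison. The following is a minimal sketch, not a definitive rewrite. The table name some_table is hypothetical, because the original query elides it, and date_col is assumed to be an indexed DATE or DATETIME column.

```sql
-- Instead of: WHERE actor_id + 1 = 5
-- Solve the equation by hand so actor_id stands alone.
SELECT actor_id FROM actor WHERE actor_id = 4;

-- Instead of: WHERE TO_DAYS(CURRENT_DATE) - TO_DAYS(date_col) <= 10
-- Move the date arithmetic to the constant side of the comparison.
-- some_table is a hypothetical name; date_col is assumed to be indexed.
SELECT date_col
FROM some_table
WHERE date_col >= DATE_SUB(CURRENT_DATE, INTERVAL 10 DAY);
```

In both rewrites the right-hand side is a constant that MySQL can evaluate once, so an index on the column on the left-hand side becomes usable for a lookup or range scan.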

Prefix indexes and index selectivity

Sometimes you need to index very long character columns, which makes the index large and slow. One strategy is the simulated hash index mentioned earlier. But sometimes that is not enough. What else can you do?

You can usually index just the first few characters of the string, which saves a great deal of index space and makes the index more efficient. However, this also reduces the selectivity of the index. Index selectivity is the ratio of the number of distinct indexed values (the cardinality) to the total number of rows in the table (#T), and it ranges from 1/#T to 1. The higher the selectivity, the more efficient the query, because a highly selective index lets MySQL filter out more rows when it looks for matches. A unique index has a selectivity of 1, which is the best possible selectivity and gives the best performance.

Typically, a prefix of the column is selective enough to give good query performance. For BLOB, TEXT, or very long VARCHAR columns, you must use a prefix index, because MySQL does not allow you to index the full length of these columns.

The trick is to choose a prefix that is long enough to give good selectivity but short enough to save space. The prefix should be long enough that the selectivity of the prefix index is close to that of an index on the whole column. In other words, the cardinality of the prefix should be close to the cardinality of the full column.

To determine a good prefix length, find the most frequent values in the column and compare that list with a list of the most frequent prefixes.
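One way to make this comparison concrete is to compute the selectivity of several candidate prefix lengths in a single query and compare each one with the selectivity of the full column. The sketch below assumes a hypothetical table city_demo with a VARCHAR column city. Both names, the index name, and the prefix length of 7 used at the end are illustrative assumptions, not part of any real schema.

```sql
-- Hypothetical table: city_demo(city VARCHAR(50) NOT NULL).
-- Selectivity = number of distinct values / total number of rows.
SELECT
    COUNT(DISTINCT city) / COUNT(*)          AS sel_full,
    COUNT(DISTINCT LEFT(city, 3)) / COUNT(*) AS sel_3,
    COUNT(DISTINCT LEFT(city, 5)) / COUNT(*) AS sel_5,
    COUNT(DISTINCT LEFT(city, 7)) / COUNT(*) AS sel_7
FROM city_demo;

-- Average selectivity can hide skew, so also list the most frequent
-- prefixes and compare them with the most frequent full values.
SELECT COUNT(*) AS cnt, LEFT(city, 7) AS pref
FROM city_demo
GROUP BY pref
ORDER BY cnt DESC
LIMIT 10;

-- Once a length looks good enough, create the prefix index.
ALTER TABLE city_demo ADD KEY idx_city_prefix (city(7));
```

If sel_7 is already close to sel_full, a longer prefix gains little selectivity while still costing index space.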

Prefix indexes are an effective way to make indexes smaller and faster, but they have drawbacks: MySQL cannot use a prefix index for ORDER BY or GROUP BY operations, nor can it use a prefix index as a covering index.

A common scenario is to use a prefix index on very long hexadecimal unique IDs. More efficient techniques for storing this kind of ID were discussed earlier, but what if you are using a packaged solution and cannot modify the storage structure? In that case, a prefix index with a length of about 8 usually improves performance significantly, and the change is completely transparent to the application layer.

Sometimes a suffix index is also useful (for example, to find all e-mail addresses that belong to a certain domain). MySQL does not support reversed indexes natively, but you can store the reversed string and build a prefix index on it. You can maintain this column with triggers.
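The reversed-string approach might look like the following sketch. The table users, the columns email and email_rev, the index and trigger names, and the prefix length of 20 are all hypothetical names and values chosen for illustration.

```sql
-- Hypothetical table: users(email VARCHAR(255) NOT NULL).
-- Store the reversed address and index a prefix of it.
ALTER TABLE users
    ADD COLUMN email_rev VARCHAR(255),
    ADD KEY idx_email_rev (email_rev(20));

-- Backfill rows that existed before the column was added.
UPDATE users SET email_rev = REVERSE(email);

-- Keep the reversed copy in sync on every insert and update.
CREATE TRIGGER users_email_rev_bi BEFORE INSERT ON users
    FOR EACH ROW SET NEW.email_rev = REVERSE(NEW.email);

CREATE TRIGGER users_email_rev_bu BEFORE UPDATE ON users
    FOR EACH ROW SET NEW.email_rev = REVERSE(NEW.email);

-- A suffix match on email becomes a prefix match on email_rev,
-- which can use idx_email_rev.
SELECT email
FROM users
WHERE email_rev LIKE CONCAT(REVERSE('@example.com'), '%');
```

Note that any % or _ characters in the search string act as LIKE wildcards, so they would need escaping if the domain could contain them.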

Multi-column Index

Multicolumn indexes are often poorly understood. Common mistakes are to create a separate index on each column, or to create a multicolumn index with the columns in the wrong order.

We'll discuss the order of indexed columns separately in a later section. Let's look at the first problem, a separate index on each column, which is easy to spot in the output of SHOW CREATE TABLE:

CREATE TABLE t (
    c1 INT,
    c2 INT,
    c3 INT,
    KEY (c1),
    KEY (c2),
    KEY (c3)
);

This indexing strategy usually results from vague but authoritative-sounding advice such as "index the columns in the WHERE condition". In fact, this advice is very wrong. At best it produces "one-star" indexes, whose performance may be several orders of magnitude worse than that of a truly optimal index. Sometimes, if you cannot design a "three-star" index, it is better to ignore the WHERE clause and focus on optimizing the order of the indexed columns, or to create a covering index.

Creating independent single-column indexes on multiple columns does not improve MySQL query performance in most cases. MySQL 5.0 and later introduced a strategy called "index merge" that can, to a limited extent, use multiple single-column indexes on a table to locate the desired rows. Earlier versions of MySQL could use only one of the single-column indexes, and when no single index is good enough on its own, that is not much help. For example, the film_actor table has single-column indexes on film_id and on actor_id, but neither index is a good choice for both WHERE conditions in this query:

SELECT film_id, actor_id FROM film_actor WHERE actor_id = 1 OR film_id = 1;

In older MySQL versions, MySQL would use a full table scan for this query unless you rewrote it as a UNION of two queries:

SELECT film_id, actor_id FROM film_actor WHERE actor_id = 1
UNION ALL
SELECT film_id, actor_id FROM film_actor WHERE film_id = 1 AND actor_id <> 1;

In MySQL 5.0 and later, however, the query can scan both single-column indexes at the same time and merge the results. This algorithm has three variants: union for OR conditions, intersection for AND conditions, and unions of intersections for combinations of the two. The following query uses a union of two index scans, as you can see in the Extra column of the EXPLAIN output:

EXPLAIN SELECT film_id, actor_id FROM film_actor WHERE actor_id = 1 OR film_id = 1\G

MySQL can apply this technique to complex queries, so you may also see nested operations in the Extra column for some statements.

The index merge strategy sometimes works well, but more often it is a sign that the table is poorly indexed:

When the server intersects multiple indexes (usually because of multiple AND conditions), it usually means you need one multicolumn index that contains all the relevant columns, not several separate single-column indexes.

When the server unions multiple indexes (usually because of multiple OR conditions), buffering, sorting, and merging the results usually consumes a lot of CPU and memory. This is especially true when some of the indexes are not very selective, so that the scans return a large number of rows to the merge operation.

More importantly, the optimizer does not include these costs in the "query cost"; it only accounts for random page reads. As a result the query cost is underestimated, and the chosen plan may actually run slower than a plain full table scan. This not only consumes more CPU and memory but can also hurt concurrent queries, an effect that is easy to overlook when you test a single query in isolation. Generally speaking, it is often better to rewrite such a query as a UNION, just as was necessary in MySQL 4.1 and earlier.
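For the intersection case described above, the usual remedy is a single multicolumn index in place of separate single-column indexes. The following is a minimal sketch using the three-column example table t shown earlier. The index name idx_c1_c2 and the literal values are assumptions for illustration, and the \G terminator works only in the mysql command-line client.

```sql
-- A query with AND conditions, which with only KEY (c1) and KEY (c2)
-- might be resolved by an index-merge intersection.
SELECT c1, c2 FROM t WHERE c1 = 1 AND c2 = 2;

-- One multicolumn index that contains both filtered columns.
-- The index name is hypothetical.
ALTER TABLE t ADD KEY idx_c1_c2 (c1, c2);

-- Inspect the plan again to see which index the optimizer now chooses.
EXPLAIN SELECT c1, c2 FROM t WHERE c1 = 1 AND c2 = 2\G
```

Because this example query selects only c1 and c2, the same index also contains every column the query needs.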

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
