Prefix index, a solution for optimizing index size

Last Update:2015-03-03 Source: Internet

Author: User

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

When I read an article about database indexing today, the article mentions the prefix index, which is really a new word for people who have been working on database applications for so many years. So I intend to study it.

The prefix index seems to be a concept in MySQL that is not presented in SQL Server and Oracle. So I installed a MySQL to do the experiment and figure out the prefix index.

The prefix index, in other words, is to index the first few characters of the text (specifically, when several characters are indexed) so that the index is smaller, so the query is faster. A bit equivalent to using the left function for fields in Oracle, the function index is established, except that the MySQL prefix index is internally automatically matched at query time and does not need to use the left function.

Other articles mention:

The MySQL prefix index can effectively reduce the size of the index file and increase the speed of the index. But the prefix index also has its drawbacks: MySQL cannot use prefix indexes in ORDER by or GROUP by, nor can they be used as an overlay index (covering index).

The syntax for establishing a prefix index is:

ALTERTABLEtable_nameADDKEY(column_name(Prefix_length));

The key parameter here is prefix_length, which needs to get the appropriate index selectivity (indexselectivity) based on the contents of the actual table. Index selectivity is the ratio of the number of repetitions to the total number.

Select1.0*Count(distinctCOLUMN_NAME)/Count(*)
fromtable_name

For example, we now have an employee table with a FirstName field, which is varchar (50), and we query the index selectivity for that field:

Select1.0*Count(distinctFirstName)/Count(*)
fromEmployee

Get the result 0.7500, and then we want to build a prefix index on FirstName, and we want the selectivity of the prefix index to be as close as possible to the selectivity of indexing the entire field. Let's take a look at 3 characters, how to:

Select1.0*Count(distinct Left(FirstName,3))/Count(*)
fromEmployee

The result is 0.58784, as if the gap is a bit big, let's try another 4 characters:

Select1.0*Count(distinct Left(FirstName,4))/Count(*)
fromEmployee

Get 0.68919, has been promoted a lot, and then try 5 characters, the result is 0.72297, this result and 0.75 is very close, so we think the prefix length 5 is a suitable value. So we can build a prefix index for FirstName:

AlterTableTest. EmployeeAddKey(FirstName (5))

The query statement does not need to change after the prefix index is established, and if we want to query all the employee FirstName for Devin, then SQL still writes:

Select*
fromEmployee E
whereE.firstname='Devin';

Here's a summary of what happens when you use the prefix index:

String columns (Varchar,char,text, etc.) that require a full field match or a previous match. That is, ' xxx ' or like ' xxx% '
The string itself may be longer, and the first few characters start to be different. For example, we have a Chinese name prefix index does not make any sense, because Chinese names are very short, and the recipient address using prefix index is not very practical, because on the one hand, the address is generally the beginning of xx province, that is, the first few characters are similar, and the pickup address is generally like '% xxx% ', will not use a previous match. On the contrary, the names of foreigners can be indexed by prefix because their characters are longer and the first few characters are more selective. The same email is also a field that can be indexed using a prefix.
The index selectivity of the first half of the characters is already close to the index selectivity of the full field. If the length of the whole field is 20, the index selectivity is 0.9, and we prefix the first 10 characters with a selectivity of only 0.5, then we need to continue to increase the length of the prefix character, but this time the advantage of the prefix index is not obvious, there is not much need to build a prefix index.

Prefix index, a solution for optimizing index size

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More

Prefix index, a solution for optimizing index size

Contact Us

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support