MySQL MyISAM Index Type Analysis and Optimization

Last Update:2018-12-06 Source: Internet

Author: User


MyISAM supports four index types: primary (PRIMARY KEY), unique (UNIQUE), common (INDEX), and full-text (FULLTEXT). The discussions of clustered and non-clustered indexes that you usually find online apply to the InnoDB storage engine. To find out which engine is the default, run show variables like '%storage_engine%'. Here we discuss only the MyISAM engine.

What is an index?

Indexes are used to quickly find records with specific values. MyISAM saves its primary, unique, and common indexes as B-trees. If no index exists, MySQL must scan the table from the first record onward until the required records are found. The more records in the table, the higher the cost of the operation. If an appropriate index has been created on the column used as the search condition, MySQL can quickly obtain the location of the target record through the index without scanning the whole table.

Basic storage structure of MyISAM indexes:

MyISAM indexes share the same basic storage structure whether they are primary keys or common indexes. That structure is a balanced tree (B-tree for short): all key values and row "pointer" information are stored in the leaf nodes of the B-tree. This basic data structure is essentially the same as in other MySQL storage engines such as InnoDB. Unlike InnoDB, however, MyISAM does not store data differently for the primary key and for secondary indexes. In MyISAM, the main difference between the primary key and a common index is that primary key values must be unique and non-null. The only other difference, which holds between any two indexes, is that the key values in each index tree are ordered differently.

Rows in a MyISAM table are stored in one of two formats, fixed length or dynamic length, so there are two ways to locate a row in the MyISAM data file. For fixed-length tables, a row is located directly by its row number. For dynamic-length tables, a row is located through other information that identifies its position in the file. Below we refer to both kinds of locator collectively as the row ID.
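To make the fixed-length case concrete, here is a minimal Python sketch of locating a row by row number. It is only a model: the record layout (a 2-byte peopleid followed by a 50-byte name) and the helper names are assumptions made for illustration, not MyISAM's actual on-disk format.

```python
import struct

# Hypothetical fixed-length row: 2-byte unsigned peopleid + 50-byte name.
ROW = struct.Struct("<H50s")


def build_data_file(rows):
    """Pack (peopleid, name) pairs into one bytes object, one fixed-size row each."""
    return b"".join(ROW.pack(pid, name.encode().ljust(50)) for pid, name in rows)


def fetch_by_row_number(data, row_number):
    """Every row has the same size, so the row number alone gives the byte offset."""
    offset = row_number * ROW.size
    pid, raw_name = ROW.unpack_from(data, offset)
    return pid, raw_name.decode().rstrip()


data = build_data_file([(1, "Ann"), (2, "Mike"), (3, "Zoe")])
print(fetch_by_row_number(data, 1))
```

With dynamic-length rows this multiplication no longer works, which is why a different kind of locator is needed there.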

The following figure shows the basic storage method of MyISAM indexes:

Next let's take a look at how the index achieves quick search:

Suppose we have created a table named "people":

CREATE TABLE people ( peopleid SMALLINT NOT NULL, name CHAR(50) NOT NULL );

Then, we insert 1000 different name values into the people table in random order. The following shows a small part of the data file of the people table:

We can see that there is no clear order for the name column in the data file. If we create an index for the name column, MySQL will sort the name column in the index:

For each entry in the index, MySQL internally stores a "pointer" to the location of the actual record in the data file. Therefore, if we want to find the peopleid of the record whose name is "Mike" (the SQL command is "select peopleid from people where name = 'Mike';"), MySQL can search for the value "Mike" in the name index, go directly to the corresponding row in the data file, and return the peopleid of that row (999). In this process, MySQL needs to process only one row to return the result. If there were no index on the name column, MySQL would scan all the records in the data file, that is, 1000 records! Obviously, the fewer records MySQL has to process, the faster it can complete the task.
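The lookup described above can be modeled in a few lines of Python. This is a sketch under simplifying assumptions: a sorted list searched with binary search stands in for the B-tree, a list position stands in for the row "pointer", and the sample rows and function names are invented for illustration.

```python
import bisect

# Rows in insertion order, as in an unsorted data file: (peopleid, name).
DATA_FILE = [(3, "Zoe"), (999, "Mike"), (7, "Ann"), (42, "Bob")]


def build_index(rows, column):
    """Sorted (key, row position) pairs; the position plays the role of the pointer."""
    return sorted((row[column], pos) for pos, row in enumerate(rows))


def lookup(index, rows, value):
    """Binary-search the index, then follow the pointers to the matching rows."""
    i = bisect.bisect_left(index, (value, -1))
    matches = []
    while i < len(index) and index[i][0] == value:
        matches.append(rows[index[i][1]])
        i += 1
    return matches


def full_scan(rows, column, value):
    """Without an index, every row has to be examined."""
    return [row for row in rows if row[column] == value]


name_index = build_index(DATA_FILE, 1)
print(lookup(name_index, DATA_FILE, "Mike"))
```

Both functions return the same rows; the difference is that lookup touches only the index entries around "Mike", while full_scan reads every row.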

Index type

Primary Key (primary)

A primary key is a unique index, but it must be declared as "PRIMARY KEY". If you have used AUTO_INCREMENT columns, you are probably already familiar with the concept of a primary key. The primary key is generally specified when the table is created, for example, "create table tablename ([...], primary key (column list));". However, we can also add a primary key by altering the table, for example, "alter table tablename add primary key (column list);". Each table can have only one primary key.

Unique index (unique)
This index is basically the same as a common index, with one difference: each value of the indexed column can appear only once, that is, values must be unique. You can create a unique index in the following ways:

Create an index: create unique index <index name> on tablename (column list);
Alter a table: alter table tablename add unique [index name] (column list);
Specify the index when creating the table: create table tablename ([...], unique [index name] (column list));

Common index (index)
This is the most basic index type, and it has no restrictions such as uniqueness. You can create a common index in the following ways:

Create an index: create index <index name> on tablename (column list);
Alter a table: alter table tablename add index [index name] (column list);
Specify the index when creating the table: create table tablename ([...], index [index name] (column list));

Full text index (Fulltext)
MySQL has supported full-text indexing and full-text search since version 3.23.23. In MySQL, the full-text index type is FULLTEXT. Full-text indexes can be created on CHAR, VARCHAR, or TEXT columns, using the create table, alter table, or create index command. For large datasets, loading the data into a table that has no full-text index and then creating the index with alter table (or create index) is faster than loading the data into a table that already has a full-text index. Full-text indexing is not covered in the following discussion. For more information, see the MySQL documentation.

Single Column index and multi-column Index

An index can be a single-column index or multiple-column index. The following is an example to illustrate the differences between the two indexes. Suppose there is a people table:

CREATE TABLE people ( peopleid SMALLINT NOT NULL AUTO_INCREMENT, firstname CHAR(50) NOT NULL, lastname CHAR(50) NOT NULL, age SMALLINT NOT NULL, townid SMALLINT NOT NULL, PRIMARY KEY (peopleid) );

The following figure shows the data we inserted into the people table:

In this data segment, there are four people named "Mike" (two Sullivans and two McConnells), two 17-year-olds, and one person with an unrelated name, Joe Smith.

This table is mainly used to return the peopleid that corresponds to a specified first name, last name, and age. For example, we may need to find the peopleid of the 17-year-old user named Mike Sullivan (the SQL command is select peopleid from people where firstname = 'Mike' and lastname = 'Sullivan' and age = 17;). Because we don't want MySQL to scan the entire table every time it executes the query, we need to consider using indexes here.

First, we can consider creating an index on a single column, such as the firstname, lastname, or age column. If we create an index on the firstname column (alter table people add index firstname (firstname);), MySQL uses this index to quickly limit the search to the records whose firstname is 'Mike' and then applies the other conditions to this intermediate result set: it first excludes the records whose lastname is not "Sullivan" and then excludes the records whose age is not 17. Once all the search conditions have been applied, MySQL returns the final result.

With an index on the firstname column, MySQL is much more efficient than with a full table scan. However, the number of records MySQL has to scan still far exceeds what is actually needed. We could drop the index on firstname and create one on lastname or age instead, but whichever single column we index, the search efficiency remains about the same.

To improve search efficiency, we need to consider a multi-column index. If you create a multi-column index on the columns firstname, lastname, and age, MySQL can find the correct result in a single index lookup! The following SQL command creates this multi-column index:

ALTER TABLE people ADD INDEX fname_lname_age (firstname,lastname,age);

Because the index file is saved in B-tree format, MySQL can go straight to the appropriate firstname, then to the appropriate lastname, and finally to the appropriate age. MySQL finds the target record without scanning any other record in the data file!

So, if you create a separate single-column index on each of the columns firstname, lastname, and age, is the effect the same as creating one multi-column index on firstname, lastname, and age? The answer is no. The two are completely different. In the MySQL versions this article covers, a query can use only one index per table. If you have three single-column indexes, MySQL tries to select the most restrictive one. However, even the most restrictive single-column index narrows the search far less than the multi-column index on firstname, lastname, and age.
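The difference in restricting power can be illustrated with a small Python model. The rows below are made up to match the description of the sample data (four Mikes, two 17-year-olds, one Joe Smith), and index_range is a hypothetical helper that treats an index as a sorted list of key tuples; it is not how MySQL is implemented.

```python
import bisect

# Made-up rows: (peopleid, firstname, lastname, age).
PEOPLE = [
    (1, "Mike", "Sullivan", 17),
    (2, "Mike", "Sullivan", 34),
    (3, "Mike", "McConnell", 27),
    (4, "Mike", "McConnell", 41),
    (5, "Joe", "Smith", 17),
]
FIRSTNAME, LASTNAME, AGE = 1, 2, 3


def index_range(rows, columns, values):
    """Row positions whose indexed columns equal `values`, found by binary search."""
    index = sorted((tuple(row[c] for c in columns), pos) for pos, row in enumerate(rows))
    key = tuple(values)
    lo = bisect.bisect_left(index, (key, -1))
    hi = bisect.bisect_right(index, (key, len(rows)))
    return [pos for _, pos in index[lo:hi]]


# A single-column index on firstname leaves four candidate rows to check.
by_firstname = index_range(PEOPLE, [FIRSTNAME], ["Mike"])
# An index on (firstname, lastname, age) leads straight to the one matching row.
by_all_three = index_range(PEOPLE, [FIRSTNAME, LASTNAME, AGE], ["Mike", "Sullivan", 17])
print(len(by_firstname), len(by_all_three))
```

In this model, the single-column index still leaves rows that must be read and filtered on lastname and age, while the three-column key identifies the target row directly.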

Leftmost prefix

A multi-column index has another advantage, which comes from the concept of the leftmost prefix. Continue with the previous example. We now have a multi-column index on the firstname, lastname, and age columns, which we call fname_lname_age. MySQL uses the fname_lname_age index when the search condition is one of the following column combinations:

firstname, lastname, age
firstname, lastname
firstname

Put another way, this is equivalent to having indexes on the column combinations (firstname, lastname, age), (firstname, lastname), and (firstname). All of the following queries can use the fname_lname_age index:

SELECT peopleid FROM people WHERE firstname='Mike' AND lastname='Sullivan' AND age='17';
SELECT peopleid FROM people WHERE firstname='Mike' AND lastname='Sullivan';
SELECT peopleid FROM people WHERE firstname='Mike';

The following queries cannot use the index at all:

SELECT peopleid FROM people WHERE lastname='Sullivan';
SELECT peopleid FROM people WHERE age='17';
SELECT peopleid FROM people WHERE lastname='Sullivan' AND age='17';
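The leftmost-prefix rule can be written down as a short Python function. This is a simplified sketch that considers only equality conditions; usable_prefix is a hypothetical name, and the real optimizer weighs more factors than this.

```python
FNAME_LNAME_AGE = ["firstname", "lastname", "age"]


def usable_prefix(index_columns, equality_columns):
    """Return the leading index columns that the WHERE conditions can use.

    Matching stops at the first index column that has no condition,
    because later columns are only ordered within the earlier ones.
    """
    used = []
    for column in index_columns:
        if column not in equality_columns:
            break
        used.append(column)
    return used


print(usable_prefix(FNAME_LNAME_AGE, {"firstname", "lastname"}))
print(usable_prefix(FNAME_LNAME_AGE, {"lastname", "age"}))
```

An empty result means the index cannot be used for the lookup. A condition on firstname and age yields only firstname: the index narrows the search by first name, and age is then checked row by row.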

Selecting index columns

In performance optimization, choosing which columns to index is one of the most important steps. Two kinds of columns are candidates for an index: columns that appear in the WHERE clause and columns that appear in the join clause. Consider the following query:

SELECT age                    # no index used
FROM people
WHERE firstname='Mike'        # consider using an index
AND lastname='Sullivan'       # consider using an index

This query is slightly different from the previous ones, but it is still a simple query. Because age is referenced only in the SELECT list, MySQL does not use it to restrict which rows are selected. Therefore, this query does not need an index on the age column. The following is a more complex example:

SELECT people.age,            ## no index used
town.name                     ## no index used
FROM people LEFT JOIN town
ON people.townid=town.townid  ## consider using an index
WHERE firstname='Mike'        ## consider using an index
AND lastname='Sullivan'       ## consider using an index

As in the preceding example, firstname and lastname appear in the WHERE clause, so these two columns are still candidates for an index. In addition, because the townid column of the town table appears in the join clause, we need to consider creating an index on that column as well.

So can we simply conclude that every column in the WHERE clause and the join clause should be indexed? Almost, but not quite. We must also consider the type of operator used to compare the column. MySQL uses indexes only for the following operators: <, <=, =, >, >=, between, in, and sometimes like. An index can be used for a like operation only when the pattern does not start with a wildcard (% or _). For example, the query "select peopleid from people where firstname like 'Mich%';" uses an index, but the query "select peopleid from people where firstname like '%ike';" does not.
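Why a leading wildcard defeats the index can be seen in a small Python model. A sorted list stands in for the index; the function names are hypothetical, the sketch ignores escape characters, and it compares case-sensitively, which may differ from how a given MySQL collation behaves.

```python
import bisect


def literal_prefix(pattern):
    """The part of a LIKE pattern before its first wildcard (% or _)."""
    prefix = ""
    for ch in pattern:
        if ch in "%_":
            break
        prefix += ch
    return prefix


def like_candidates(sorted_names, pattern):
    """Names sharing the pattern's literal prefix, found by binary search.

    Returns None when the pattern starts with a wildcard: there is no prefix
    to search for, so every entry would have to be examined.
    """
    prefix = literal_prefix(pattern)
    if not prefix:
        return None
    lo = bisect.bisect_left(sorted_names, prefix)
    hi = bisect.bisect_left(sorted_names, prefix + chr(0x10FFFF))
    return sorted_names[lo:hi]


NAMES = sorted(["Joe", "Michael", "Michelle", "Mike", "Mitch"])
print(like_candidates(NAMES, "Mich%"))
print(like_candidates(NAMES, "%ike"))
```

A pattern such as 'Mich%' maps to one contiguous range of the sorted keys. A pattern such as '%ike' could match keys anywhere in the order, so sorting does not help.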

Index Efficiency Analysis

We now know something about how to choose index columns, but we cannot yet tell which choice is the most effective. MySQL provides a built-in SQL command to help with this task: the explain command. Its general syntax is: explain <SQL statement>. You can find more information about this command in the MySQL documentation. The following is an example:

explain select peopleid from people where firstname = 'Mike'
and lastname = 'Sullivan' and age = '17';

This command returns the following analysis results:

table	type	possible_keys	key	key_len	ref	rows	Extra
people	ref	fname_lname_age	fname_lname_age	102	const,const,const	1	Where used

Next, let's take a look at the meaning of the explain analysis result.

Table: the name of the table.

Type: the join type. The MySQL documentation describes the ref join type as follows:

"All rows with matching index values are read from this table for each combination of rows from the previous tables. ref is used if the join uses only a leftmost prefix of the key, or if the key is not UNIQUE or PRIMARY KEY (in other words, if the join cannot select a single row based on the key value). If the key that is used matches only a few rows, this is a good join type." In this example, since the index is not unique, ref is the best join type we can get. If explain shows a join type of "ALL" and you do not intend to select most of the records in the table, the query will be very inefficient because MySQL has to scan the entire table. You can add more indexes to solve this problem. For more information, see the MySQL manual.

Possible_keys: the names of the indexes that MySQL could use. The index name is the name specified when the index was created. If no name was given, the name of the first column in the index is used by default (in this example, it would be "firstname"). The meaning of a default index name is often not obvious.

Key: it displays the name of the index actually used by MySQL. If it is null, MySQL does not use an index.

Key_len: the length of the part used in the index, in bytes. In this example, key_len is 102, of which firstname occupies 50 bytes, lastname occupies 50 bytes, and age occupies 2 bytes. If MySQL only uses the firstname part of the index, key_len will be 50.
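The key_len arithmetic can be checked with a few lines of Python. The byte sizes are the ones quoted above; the sketch assumes NOT NULL columns and a single-byte character set, and key_len here is just an illustrative helper, not a MySQL function.

```python
# Bytes per indexed column in this example: CHAR(50), CHAR(50), SMALLINT.
COLUMN_BYTES = {"firstname": 50, "lastname": 50, "age": 2}


def key_len(used_columns):
    """Total length in bytes of the index parts that a query uses."""
    return sum(COLUMN_BYTES[column] for column in used_columns)


print(key_len(["firstname", "lastname", "age"]))
print(key_len(["firstname"]))
```

Comparing the reported key_len with these sums shows how many leading columns of the index a query actually uses.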

Ref: it displays the column name (or the word "const"). MySQL selects rows based on these columns. In this example, MySQL selects rows based on three constants.

Rows: the number of records that MySQL deems necessary to scan before finding the correct results. Obviously, the ideal number here is 1.

Extra: there may be many different options, most of which will have a negative impact on the query. In this example, MySQL only reminds us that it will use the WHERE clause to limit the search result set.

Index disadvantages

So far, we have discussed the advantages of indexes. Indexes also have disadvantages. First, an index occupies disk space. Usually this is not a serious problem. However, if you create an index for every possible combination of columns, the index file will grow much faster than the data file. If you have a large table, the index file may reach the maximum file size allowed by the operating system. Second, indexes slow down operations that write data, such as delete, update, and insert. This is because MySQL must write the changes not only to the data file but also to the index file.

Some notes on using MyISAM indexes:

1. Indexes reduce the cost of queries (select) but increase the cost of update, insert, and delete. Weigh the two according to your own application.
2. If the values in a column are never repeated, use a unique index.
3. If a column has few duplicate values and is queried frequently, use a common index.
4. For a character column, you can index only the first few bytes of the column, which saves space and improves query speed. (By default, the full column length appears to be indexed.)
5. If the WHERE clause combines several conditions with and, use a composite index, and make the column order of the index match the order of the and conditions as closely as possible. (The index still works if the order does not match, but the engine may spend time adjusting.)

Conclusion

In large databases, indexing is a key factor in improving speed. No matter how simple the table structure is, a scan of a 500,000-row table is never fast. If you have such a large table on your website, take the time to analyze which indexes can be used and check whether the queries can be rewritten to optimize the application. For more information, see the MySQL manual. Note that this document assumes that you are using MySQL 3.23; some of the queries cannot be executed on MySQL 3.22.

Source: http://blog.diyiye.com /? Post = 49

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
