What is an index?
Indexes are used to quickly find records with specific values. Most MySQL indexes are stored as B-trees. If no index exists, MySQL must scan the table from the first record onward until the required records are found; the more records in the table, the higher the cost. If an index exists on the column used in the search condition, MySQL can quickly locate the target record without scanning the whole table. If the table has 1000 records, an index lookup is at least 100 times faster than a sequential scan.
Suppose we have created a table named "people".
CREATE TABLE people (peopleid SMALLINT NOT NULL, name CHAR(50) NOT NULL);
Then, we randomly insert 1000 different name values into the people table. The following shows a small part of the data file of the people table:
We can see that there is no clear order for the name column in the data file. If we create an index for the name column, MySQL will sort the name column in the index:
For each entry in the index, MySQL internally stores a "pointer" to the location of the actual record in the data file. Therefore, if we want to find the peopleid of the record whose name is "Mike" (the SQL command is "SELECT peopleid FROM people WHERE name = 'Mike';"), MySQL can look up the value "Mike" in the name index, go directly to the corresponding row in the data file, and return the peopleid (999) of that row. In this process, MySQL needs to process only one row to return the result. If there were no index on the name column, MySQL would have to scan all the records in the data file, that is, 1000 records! Obviously, the fewer records MySQL needs to process, the faster it can complete the task.
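The difference between the two lookups can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the five rows are made up, and a sorted list searched with bisect stands in for the B-tree that MySQL actually maintains.

```python
import bisect

# Hypothetical data file: rows in insertion order, as (peopleid, name).
data_file = [(1, "Sara"), (2, "Zoe"), (3, "Mike"), (4, "Adam"), (5, "Lena")]

# The "index": names in sorted order, each paired with the position
# (the "pointer") of its row in data_file. A sorted list is used here
# in place of a real B-tree.
name_index = sorted((name, pos) for pos, (_, name) in enumerate(data_file))


def scan_lookup(name):
    """Sequential scan: examine rows from the first one until a match is found.

    Returns (peopleid, number of data rows examined).
    """
    examined = 0
    for peopleid, row_name in data_file:
        examined += 1
        if row_name == name:
            return peopleid, examined
    return None, examined


def index_lookup(name):
    """Binary-search the sorted index, then follow the pointer to one row.

    Returns (peopleid, number of data rows read).
    """
    i = bisect.bisect_left(name_index, (name, -1))
    if i < len(name_index) and name_index[i][0] == name:
        pos = name_index[i][1]
        return data_file[pos][0], 1  # only one data row is read
    return None, 0


print(scan_lookup("Mike"))   # data rows are examined one by one
print(index_lookup("Mike"))  # a single data row is read
```

With only five rows the saving is small, but the scan cost grows with the table while the index lookup still reads a single data row.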
Index type
MySQL provides multiple index types:
Common Index
This is the most basic index type, and it has no limitations such as uniqueness. Common indexes can be created in the following ways:
Create an index, for example: CREATE INDEX <index name> ON tablename (column list);
Modify a table, for example: ALTER TABLE tablename ADD INDEX [index name] (column list);
Specify an index when creating a table, for example: CREATE TABLE tablename ([...], INDEX [index name] (column list));
Unique Index
This index is basically the same as the common index described above, with one difference: every value in the indexed column can appear only once, that is, the values must be unique. You can create a unique index in the following ways:
Create an index, for example: CREATE UNIQUE INDEX <index name> ON tablename (column list);
Modify a table, for example: ALTER TABLE tablename ADD UNIQUE [index name] (column list);
Specify an index when creating a table, for example: CREATE TABLE tablename ([...], UNIQUE [index name] (column list));
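The effect of the uniqueness constraint can be tried out without a MySQL server. The sketch below uses Python's built-in sqlite3 module, so it runs against SQLite rather than MySQL; the index name uniq_name and the sample rows are assumptions made for illustration only.

```python
import sqlite3

# In-memory SQLite database (a stand-in for MySQL in this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (peopleid INTEGER NOT NULL, name CHAR(50) NOT NULL)"
)
# uniq_name is a hypothetical index name.
conn.execute("CREATE UNIQUE INDEX uniq_name ON people (name)")
conn.execute("INSERT INTO people VALUES (1, 'Mike')")


def try_insert(peopleid, name):
    """Insert a row; return False if the unique index rejects it."""
    try:
        conn.execute("INSERT INTO people VALUES (?, ?)", (peopleid, name))
        return True
    except sqlite3.IntegrityError:
        return False


print(try_insert(2, "Mike"))  # duplicate name, rejected by the unique index
print(try_insert(2, "Joe"))   # new name, accepted
```

A common (non-unique) index on the same column would accept both inserts.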
Primary Key
A primary key is a unique index, but it must be declared with the keywords "primary key". If you have used auto_increment columns, you may already be familiar with concepts such as primary keys. The primary key is generally specified when the table is created, for example, "CREATE TABLE tablename ([...], PRIMARY KEY (column list));". However, we can also add a primary key by modifying the table, for example, "ALTER TABLE tablename ADD PRIMARY KEY (column list);". Each table can have only one primary key.
Full-text index
MySQL has supported full-text indexing and full-text search since version 3.23.23. In MySQL, the full-text index type is FULLTEXT. Full-text indexes can be created on VARCHAR or TEXT columns, using the CREATE TABLE, ALTER TABLE, or CREATE INDEX command. For large datasets, loading the data into a table without a full-text index and then creating the index with ALTER TABLE (or CREATE INDEX) is faster than inserting the records into an empty table that already has a full-text index. Full-text indexing is not covered in the following discussion. For more information, see the MySQL documentation.
Single Column index and multi-column Index
An index can be a single-column index or multiple-column index. The following is an example to illustrate the differences between the two indexes. Suppose there is a people table:
CREATE TABLE people (peopleid SMALLINT NOT NULL AUTO_INCREMENT, firstname CHAR(50) NOT NULL, lastname CHAR(50) NOT NULL, age SMALLINT NOT NULL, townid SMALLINT NOT NULL, PRIMARY KEY (peopleid));
The following figure shows the data we inserted into the people table:
In this data segment, there are four people named Mike (two Sullivans and two McConnells), two 17-year-olds, and one person named Joe Smith.
This table is mainly used to return the peopleid for a given first name, last name, and age. For example, we may need to find the peopleid of a 17-year-old user named Mike Sullivan (the SQL command is SELECT peopleid FROM people WHERE firstname = 'Mike' AND lastname = 'Sullivan' AND age = 17;). Because we don't want MySQL to scan the entire table every time it executes the query, we need to consider using indexes.
First, we can consider creating an index on a single column, such as firstname, lastname, or age. If we create an index on the firstname column (ALTER TABLE people ADD INDEX firstname (firstname);), MySQL uses this index to quickly narrow the search to the records with firstname = 'Mike', and then applies the other conditions to this intermediate result set: it first excludes the records whose lastname is not 'Sullivan', and then excludes those whose age is not 17. Once every record has been checked against all the search conditions, MySQL returns the final result.
With the index on the firstname column, MySQL is much more efficient than with a full table scan. However, the number of records MySQL must examine still far exceeds what is actually needed. Although we could drop the index on firstname and create one on lastname or age instead, the search efficiency appears to be similar whichever column is indexed.
To improve search efficiency, we need to consider a multi-column index. If we create a multi-column index on the firstname, lastname, and age columns, MySQL can find the correct result in a single lookup! The following SQL command creates this multi-column index.
ALTER TABLE people ADD INDEX fname_lname_age (firstname, lastname, age);
Because the index is stored as a B-tree, MySQL can go straight to the matching firstname, then to the matching lastname, and finally to the matching age. MySQL finds the target record without scanning any record in the data file!
So, if we create separate single-column indexes on firstname, lastname, and age, is the effect the same as one multi-column index on (firstname, lastname, age)? The answer is no; the two are completely different. When executing a query, MySQL can use only one index per table (this holds for the MySQL 3.23 release this article assumes). If there are three single-column indexes, MySQL tries to choose the most restrictive one. However, even the most restrictive single-column index narrows the search far less than the multi-column index on firstname, lastname, and age.
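The gap between a single-column index and the multi-column index can be made concrete with a small Python sketch. It rests on assumptions: the five rows follow the description of the sample data above, but the ages other than 17 are invented, and sorted lists of tuples stand in for the B-tree indexes.

```python
import bisect

# Hypothetical rows: (peopleid, firstname, lastname, age).
people = [
    (1, "Mike", "Sullivan", 17),
    (2, "Mike", "Sullivan", 35),
    (3, "Mike", "McConnell", 17),
    (4, "Mike", "McConnell", 52),
    (5, "Joe", "Smith", 40),
]

# Single-column index on firstname: (firstname, row position).
single = sorted((r[1], pos) for pos, r in enumerate(people))
# Multi-column index: (firstname, lastname, age, row position).
multi = sorted((r[1], r[2], r[3], pos) for pos, r in enumerate(people))


def candidates(index, key):
    """Return the row positions whose index entry starts with the tuple key."""
    start = bisect.bisect_left(index, key)
    found = []
    for entry in index[start:]:
        if entry[:len(key)] != key:
            break  # entries are sorted, so no later entry can match
        found.append(entry[-1])
    return found


# The firstname index leaves several rows that must still be filtered
# on lastname and age.
print(len(candidates(single, ("Mike",))))
# The multi-column index leads directly to the matching row.
print(len(candidates(multi, ("Mike", "Sullivan", 17))))
```

In this sample, the single-column index hands back every row with firstname 'Mike', while the three-column key narrows the search to the one matching row.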
Leftmost prefix
Multi-column indexes have another advantage, which comes from the concept of the leftmost prefix. Continuing the previous example, we now have a multi-column index on the firstname, lastname, and age columns, named fname_lname_age. MySQL uses the fname_lname_age index when the search condition involves one of the following column combinations:
firstname, lastname, age
firstname, lastname
firstname
In other words, it is as if we had created indexes on the column combinations (firstname, lastname, age), (firstname, lastname), and (firstname). All of the following queries can use the fname_lname_age index:
SELECT peopleid FROM people WHERE firstname = 'Mike' AND lastname = 'Sullivan' AND age = '17';
SELECT peopleid FROM people WHERE firstname = 'Mike' AND lastname = 'Sullivan';
SELECT peopleid FROM people WHERE firstname = 'Mike';
The following queries cannot use the index at all:
SELECT peopleid FROM people WHERE lastname = 'Sullivan';
SELECT peopleid FROM people WHERE age = '17';
SELECT peopleid FROM people WHERE lastname = 'Sullivan' AND age = '17';
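The leftmost-prefix rule can be written as a small helper. This is a simplified sketch that considers only equality conditions and ignores everything else the MySQL optimizer takes into account; the function name usable_prefix is made up for this example.

```python
# Column order of the fname_lname_age index.
INDEX_COLUMNS = ["firstname", "lastname", "age"]


def usable_prefix(index_columns, condition_columns):
    """Return the leading index columns that the conditions can use.

    Walks the index columns from left to right and stops at the first
    column that has no condition; columns after that gap are not usable.
    """
    used = []
    for column in index_columns:
        if column not in condition_columns:
            break
        used.append(column)
    return used


print(usable_prefix(INDEX_COLUMNS, {"firstname", "lastname", "age"}))
print(usable_prefix(INDEX_COLUMNS, {"firstname", "lastname"}))
print(usable_prefix(INDEX_COLUMNS, {"lastname", "age"}))  # empty: no index use
```

An empty result corresponds to the queries that cannot use the index at all. A condition on firstname and age alone yields only the firstname part, because the missing lastname breaks the prefix.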
Selecting index columns
In performance optimization, choosing which columns to index is one of the most important steps. Two kinds of columns are candidates for an index: columns that appear in the WHERE clause and columns that appear in the join clause. Consider the following query:
SELECT age                     # no index needed
FROM people
WHERE firstname = 'Mike'       # consider using an index
  AND lastname = 'Sullivan'    # consider using an index
This query is slightly different from the previous ones, but it is still a simple query. Because age is referenced only in the SELECT list, MySQL does not use it to restrict which rows are selected. Therefore, this query does not need an index on the age column. The following is a more complex example:
SELECT people.age,             # no index needed
       town.name               # no index needed
FROM people
LEFT JOIN town ON people.townid = town.townid   # consider using an index
WHERE firstname = 'Mike'       # consider using an index
  AND lastname = 'Sullivan'    # consider using an index
As in the preceding example, because firstname and lastname appear in the WHERE clause, both columns are still candidates for an index. In addition, because the townid column of the town table appears in the join clause, we need to consider creating an index on that column as well.
So can we simply assume that every column in the WHERE clause and the join clause should be indexed? Almost, but not quite. We must also consider the type of operator used to compare the column. MySQL uses indexes only for the following operators: <, <=, =, >, >=, BETWEEN, IN, and sometimes LIKE. LIKE can use an index only when its pattern does not start with a wildcard (% or _). For example, the query "SELECT peopleid FROM people WHERE firstname LIKE 'Mich%';" uses an index, but the query "SELECT peopleid FROM people WHERE firstname LIKE '%ike';" does not.
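Why a leading wildcard defeats the index can be seen in a short Python sketch. It assumes a sorted list as a stand-in for the index and an invented set of first names; unlike typical MySQL comparisons, the Python string comparison here is case-sensitive.

```python
import bisect

# Hypothetical sorted index on firstname.
firstnames = sorted(["Anna", "Michael", "Michelle", "Mick", "Mike", "Zoe"])


def like_prefix(index, prefix):
    """Emulate LIKE 'prefix%' as a range lookup on the sorted index.

    All matches lie between the prefix itself and the next string after
    it (last character incremented). Assumes a non-empty prefix.
    """
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    lo = bisect.bisect_left(index, prefix)
    hi = bisect.bisect_left(index, upper)
    return index[lo:hi]


def like_suffix(index, suffix):
    """Emulate LIKE '%suffix': matches can sit anywhere in the sort
    order, so every entry has to be checked."""
    return [value for value in index if value.endswith(suffix)]


print(like_prefix(firstnames, "Mich"))  # one contiguous slice of the index
print(like_suffix(firstnames, "ike"))   # full pass over all entries
```

The prefix pattern maps to one contiguous range of the sorted index, which is what makes the index usable; the suffix pattern gives the sort order nothing to work with.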
Index Efficiency Analysis
Now we know something about how to choose index columns, but we cannot yet tell which index is the most effective. MySQL provides a built-in SQL command to help with this task: EXPLAIN. Its general syntax is: EXPLAIN <SQL command>. You can find more information on this command in the MySQL documentation. The following is an example.
EXPLAIN SELECT peopleid FROM people WHERE firstname = 'Mike' AND lastname = 'Sullivan' AND age = '17';
This command returns the following analysis results:
Table | Type | Possible_keys | Key | Key_len | Ref | Rows | Extra
people | ref | fname_lname_age | fname_lname_age | 102 | const,const,const | 1 | Where used
Next, let's take a look at the meaning of the explain analysis result.
Table: the name of the table.
Type: the join type. The MySQL documentation describes the ref join type as follows:
"For each combination of rows from the previous tables, MySQL reads all rows with matching index values from this table. MySQL uses the ref join type if the join uses only a leftmost prefix of the key, or if the key is not a unique or primary key (in other words, if the join cannot select a single row based on the key value). If the key used for the join matches only a few rows, ref is a good join type."
In this example, since the index is not unique, ref is the best join type we can get.
If EXPLAIN shows that the join type is "ALL" and you do not intend to select most of the records in the table, the query will be very inefficient because MySQL must scan the entire table. Adding more indexes can solve this problem. For more information, see the MySQL manual.
Possible_keys:
The names of the indexes that MySQL could use. An index name is the name given when the index was created. If no name was given, the name of the first column in the index is used by default (in this example, it would be "firstname"). The meaning of a default index name is often not obvious.
Key:
The name of the index that MySQL actually uses. If it is NULL, MySQL does not use an index.
Key_len:
The length of the part used in the index, in bytes. In this example, key_len is 102, of which firstname occupies 50 bytes, lastname occupies 50 bytes, and age occupies 2 bytes. If MySQL only uses the firstname part of the index, key_len will be 50.
Ref:
The columns (or the word "const") that MySQL compares with the index to select rows. In this example, MySQL selects rows based on three constants.
Rows:
The number of records that MySQL estimates it must examine to find the correct result. Obviously, the ideal number here is 1.
Extra:
Many different values can appear here, most of which indicate a negative impact on the query. In this example, MySQL only reminds us that it uses the WHERE clause to restrict the result set.
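The key_len value described above is simple arithmetic over the column widths, which can be checked with a few lines of Python. The byte widths are assumptions that match the example: CHAR(50) in a single-byte character set takes 50 bytes and SMALLINT takes 2 bytes, and all columns are NOT NULL. Other column definitions would give different totals.

```python
# Assumed storage width in bytes of each indexed column:
# CHAR(50) with a single-byte character set = 50, SMALLINT = 2.
COLUMN_BYTES = {"firstname": 50, "lastname": 50, "age": 2}


def key_len(used_columns):
    """Sum the widths of the index columns that a query actually uses."""
    return sum(COLUMN_BYTES[column] for column in used_columns)


print(key_len(["firstname", "lastname", "age"]))  # all three index parts
print(key_len(["firstname"]))                     # leftmost part only
```

Comparing key_len with these totals is a quick way to see how many columns of a multi-column index a query really uses.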
Index disadvantages
So far, we have discussed the advantages of indexes. In fact, indexes also have disadvantages.
First, the index occupies disk space. Generally, this problem is not very prominent. However, if you create an index for each possible combination of columns, the size of the index file will grow much faster than that of the data file. If you have a large table, the size of the index file may reach the maximum file size allowed by the operating system.
Second, for operations that require data writing, such as delete, update, and insert operations, indexes will reduce their speed. This is because MySQL not only writes the changed data to the data file, but also writes the changes to the index file.
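The write overhead can be illustrated with a toy table in Python. This is a rough model under stated assumptions: the Table class is hypothetical, each index is a sorted list, and every data or index update is counted as one write, which ignores how MySQL really organizes its files.

```python
import bisect


class Table:
    """Toy table that keeps one sorted index per indexed column."""

    def __init__(self, indexed_columns):
        self.rows = []
        self.indexes = {column: [] for column in indexed_columns}
        self.writes = 0  # counts data writes plus index writes

    def insert(self, row):
        # One write for the data file.
        self.rows.append(row)
        self.writes += 1
        position = len(self.rows) - 1
        # One additional write for every index that must stay sorted.
        for column, index in self.indexes.items():
            bisect.insort(index, (row[column], position))
            self.writes += 1


row = {"firstname": "Mike", "lastname": "Sullivan", "age": 17}

plain = Table([])
plain.insert(row)

indexed = Table(["firstname", "lastname", "age"])
indexed.insert(row)

print(plain.writes, indexed.writes)
```

Each extra index adds one more structure to update on every insert, which is why indexing every possible column combination slows down writes.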
[Conclusion] Indexing is a key factor in the speed of large databases. No matter how simple the table structure is, a scan of a 500000-row table is never fast. If you have such a large table on your website, take the time to analyze which indexes would help, and check whether queries can be rewritten to optimize the application. For more information, see the MySQL manual. Note that this document assumes MySQL 3.23; some of the queries cannot be executed on MySQL 3.22.