Mysql index type

Last Update:2017-06-01 Source: Internet

Author: User

Tags mysql index

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

Mysql index type
In MySQL, there are four types of indexes: B-Tree index, Hash index, fulltext indexes and RTree indexes 1. B-Tree indexes are the most frequently used index types in MySQL databases, all other storage engines except the Archive storage engine support B-Tree indexes. Not only in MySQL, but in many other database management systems, the B-Tree index is also the most important index type, this is mainly because the storage structure of B-Tree indexes has excellent performance in database data retrieval. Generally, the physical files of the B-Tree index in MySQL are mostly stored in the Balance Tree structure, that is, all the actual data is stored in the Leaf Node of the Tree, in addition, the shortest path length to any Leaf Node is exactly the same, so we all call it a B-Tree index. Of course, there may be various databases (or various storage engines of MySQL) when you store your own B-Tree indexes, the storage structure is slightly modified. For example, the actual storage structure used by the B-Tree index of the Innodb Storage engine is actually B + Tree, that is, a small transformation has been made on the basis of the B-Tree data structure, in addition to the index key information stored on each Leaf Node, the pointer information pointing to the next Leaf Node adjacent to the Leaf Node is also stored, this is mainly to accelerate the efficiency of retrieving multiple adjacent Leaf nodes. There are two different types of indexes in the Innodb Storage engine. One is Cluster-based primary key index (PrimaryKey ), the other is a common B-Tree Index that is basically the same as that of other storage engines (such as the MyISAM storage engine). This Index is called the Secondary Index in the Innodb Storage engine. The following figure compares the two indexes. In the diagram, the left side is the Primary Key stored in the Clustered format, and the right side is the normal B-Tree index. Both indexes are identical in terms of Root Node and Branch Nodes. Leaf Nodes is different. In Primary Key, Leaf Nodes stores the actual data of the table, not only the data of the Primary Key field, but also the data of other fields. The entire data is ordered by the Primary Key value. Secondary Index is not much different from other common B-Tree indexes, except that the information about the Index key is stored in Leaf Nodes, and the Innodb primary key value is also stored. Therefore, in Innodb, It is very efficient to access data through the primary key, and if the data is accessed through the Secondary Index, Innodb first uses the relevant information of the Secondary Index, after the Leaf Node is retrieved through the corresponding index key, the corresponding data row needs to be obtained through the primary key value stored in the Leaf Node and then through the primary key index. The primary key index and non-primary key index of the MyISAM storage engine differ little, but the index key of the primary key index is a unique and non-empty key. In addition, the storage structure of MyISAM storage engine indexes is basically the same as that of Innodb's Secondary Index. The main difference is that MyISAM storage engine stores Index key information on Leaf Nodes, store the information of the corresponding data Row (such as Row Number) that can be directly located in the MyISAM data file, but does not store the key value information of the primary key. 2. Not many Hash indexes are used in MySQL. Currently, they are mainly used by the Memory storage engine, and the Memory storage engine uses the Hash index as the default index type. The so-called Hash index actually uses a certain Hash algorithm to Hash the key values to be indexed, and then stores the obtained Hash values in a Hash table. Then, each time the search is required, the search conditions are Hash calculated based on the same algorithm, and then compared with the Hash value in the Hash table to obtain the corresponding information. In the Memory storage engine, MySQL also supports non-unique Hash indexes. Many may be surprised. If it is a non-unique Hash index, what should we do with the same value? In the Hash index of the Memory storage engine, if a non-unique value is encountered, the storage engine will link them to the same hash key value and exist in the form of a linked list, then, when obtaining the actual key value, filter out the key that does not match. Due to the particularity of the Hash index structure, the retrieval efficiency is very high, and the index retrieval can be located at one time, the Hash index is much more efficient than the B-Tree index because it does not require multiple I/O accesses from the root node to the branch node to the page node. Many may have doubts. Since Hash indexes are much more efficient than B-Tree indexes, why do we need to use B-Tree indexes instead of Hash indexes? Everything has two sides. The same is true for Hash indexes. Although the efficiency of Hash index retrieval is very high, the Hash index itself also brings many restrictions and drawbacks due to its particularity, mainly include the following: 1 ). hash indexes can only meet "=", "IN", and "<=>" queries, but cannot use range queries. Because Hash indexes compare Hash values after Hash operations, therefore, Hash indexes can only be used for equivalent filtering, but not range-based filtering, because the size relationship of Hash values processed by the corresponding Hash algorithm is, there is no guarantee that it is the same as before the Hash operation. 2 ). hash indexes cannot be used to avoid data sorting. Because Hash indexes store Hash values after Hash calculation, in addition, the relationship between the Hash value size and the Hash value is not necessarily the same as that before the Hash operation, so the database cannot use the index data to avoid any and sort operations; 3 ). hash indexes cannot be queried using some index keys. For composite indexes, when calculating Hash values, Hash indexes combine index bonding and then calculate Hash values together instead of separately calculating Hash values, therefore, when we query by one or more index keys before the combined index, the Hash index cannot be used. 4 ). the Hash index cannot avoid table scanning at any time. As we know before, the Hash Index uses the index key after the Hash operation, store the Hash value of the Hash operation result and the corresponding row pointer information in a Hash table, and the same Hash value may exist for different index keys, so even if we only retrieve the number of records that satisfy a Hash key value You cannot directly complete the query from the Hash index, or you need to access the actual data in the table for corresponding comparison to get the corresponding results. 5 ). when the Hash index encounters a large number of equal Hash values, the performance may not necessarily be higher than the B-Tree index. For the index keys with low selectivity, if we create a Hash index, A large number of records of pointer information will be stored and connected to the same Hash value. In this way, it will be very troublesome to locate a record, and may waste a lot of table data access, resulting in the overall performance. 3. Full-text index, also known as Full-text index, is currently only supported by the MyISAM storage engine in MySQL, in addition, not all data types support full-text indexing. Currently, Full-TEXT indexes can be created for columns with only CHAR, VARCHAR, and text data types. In general, Fulltext indexes are mainly used to replace inefficient LIKE '% *** %' operations. In fact, the Full-text index can not only replace the traditional Full-fuzzy LIKE operation, but also fully fuzzy match multiple fields at a time through the Full-text index combined by multiple fields. Full-text indexes differ greatly from ordinary B-Tree indexes. Although they store index data in the form of B-Tree, however, instead of completely matching the field content, it uses a specific algorithm to separate the field data and then index it. In general, the MySQL System will be separated by four bytes. In the entire Full-text index, the storage content is divided into two parts, one is the index string data set before the separation, and the other is the index information of the words (or phrases) after the separation. Therefore, in the Full-text index, the actual details in the B-Tree index are not the raw data in our table, but the index data after word segmentation. The node information of the B-Tree index stores the word information after each separator and the position information of the string information that points to the delimiter containing the word in the index dataset. Full-text indexes not only support fuzzy search, but also enable natural language-based matching. Of course, the accuracy of this matching reading requires readers and friends to verify it on their own. Full-text matches various rules for natural language based on specific syntax information, and finally provides non-negative matching values. In addition, we need to pay attention to the fact that MySQL's Full-text index is not very well supported in Chinese. It must be completed by using third-party patches or plug-ins. In addition, the creation of Full-text consumes a large amount of resources. Therefore, we recommend that you evaluate it before applying it to the actual production environment. 4. The R-Tree Index R-Tree index may be an index type that we rarely see in other databases. It is mainly used to solve the spatial data retrieval problem. In MySQL, GEOMETRY is a data type used to store spatial information and is based on OpenGIS specifications. In versions earlier than MySQL5.0.16, only the MyISAM storage engine supports this data type. However, starting from MySQL5.0.16, The BDB, Innodb, NDBCluster, and Archive storage engines also support this data type. Of course, although multiple storage engines are beginning to support the GEOMETRY data type, the MyISAM storage engine only supports the R-Tree index. In MySQL, the R-Tree with secondary split is used to index spatial data information, and then the index is created through the geometric object (MRB) information. Although only the MyISAM storage engine supports spatial indexes (R-Tree indexes), if we use exact equivalence matching, the B-Tree Index created on the spatial data can also optimize the search results. The main advantage of the spatial index is that when we use range search, we can use the R-Tree index, at this time, the B-Tree index will be powerless.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More