【Abstract】 This is an excerpt from Chapter 6 of MySQL Performance Tuning and Architecture Design. [Topic] Hash index and B-Tree index [Content] 1. Because of its structure, a Hash index has very high retrieval efficiency: a lookup can locate its target in a single step, unlike a B-Tree index, which must be traversed from the root node through the branch nodes.
【Abstract】
This is an excerpt from Chapter 6 of MySQL Performance Tuning and Architecture Design.
[Topic]
Hash index and B-Tree index
[Content]
1. Hash Index
Because of its structure, a Hash index has very high retrieval efficiency: a lookup can locate its target in a single step. A B-Tree index, by contrast, must be traversed from the root node through the branch nodes before a leaf node is reached. For this reason, a Hash index lookup is much more efficient than a B-Tree index lookup.
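The difference in access path can be illustrated with a minimal Python sketch. It assumes a dict as a stand-in for a Hash index and a binary search over sorted keys as a stand-in for a root-to-leaf descent. The names hash_index and tree_lookup_steps are hypothetical, and a real B-Tree has a much larger fan-out, so it needs fewer levels than the step count here suggests.

```python
# Toy comparison of a hash lookup and an ordered-structure descent.
# hash_index and tree_lookup_steps are illustrative names, not MySQL APIs.

keys = list(range(0, 1000, 2))                  # 500 sorted index keys
hash_index = {k: f"row-{k}" for k in keys}      # index key -> row pointer


def tree_lookup_steps(sorted_keys, target):
    """Binary search standing in for a tree descent; returns (found, steps)."""
    lo, hi, steps = 0, len(sorted_keys) - 1, 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_keys[mid] == target:
            return True, steps
        if sorted_keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return False, steps


row = hash_index.get(0)                     # one hash computation, one probe
found, steps = tree_lookup_steps(keys, 0)   # several comparisons on the way down
```

The hash lookup cost does not grow with the number of keys in this sketch, while the number of steps in the ordered search grows logarithmically.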
A natural question follows: if Hash indexes are so much more efficient than B-Tree indexes, why use B-Tree indexes at all? Everything has two sides, and Hash indexes are no exception. Although they are highly efficient, their structure also brings a number of restrictions and drawbacks.
(1) Hash indexes support only "=", "IN" and "<=>" comparisons; they do not support range queries.
Because a Hash index compares the Hash values produced by the Hash function, it can be used only for equality filtering and not for range-based filtering: the ordering of the Hash values is not guaranteed to match the ordering of the key values before hashing.
(2) Hash indexes cannot be used to avoid sorting data.
A Hash index stores the Hash values produced by the Hash function, and the ordering of those values is not necessarily the same as the ordering of the key values before hashing. The database therefore cannot use the index data to avoid any sort operation.
(3) Hash indexes cannot be used with only part of the index key.
For a composite index, the Hash value is computed over the combined index key columns rather than over each column separately. A query that supplies only the first column, or the first few columns, of a composite index therefore cannot use the Hash index.
(4) A Hash index can never avoid accessing the table.
As described above, a Hash index stores the Hash value of each index key, together with the corresponding row pointer, in a Hash table. Because different index keys can have the same Hash value, a query cannot be completed from the Hash index alone, even after the entries matching a given Hash value have been found. The actual data in the table must still be read and compared to obtain the result.
(5) When many index keys have equal Hash values, a Hash index does not necessarily outperform a B-Tree index.
For an index key with low selectivity, a Hash index stores a large number of row pointers under the same Hash value. Locating a particular record then becomes expensive, because many table rows must be accessed and discarded, and overall performance suffers.
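The restrictions above can be made concrete with a small Python sketch. It is a teaching model under stated assumptions, not MySQL's implementation: the class ToyHashIndex, its bucket count, the sample rows, and the use of zlib.crc32 as the Hash function are all hypothetical choices made for illustration.

```python
import zlib
from collections import defaultdict


class ToyHashIndex:
    """Illustrative hash index: bucket number -> list of row positions.

    A teaching sketch only; it does not mirror MySQL's internal layout.
    """

    def __init__(self, rows, columns, buckets=4):
        self.rows = rows            # stands in for the table
        self.columns = columns      # index key columns, in order
        self.buckets = buckets
        self.table = defaultdict(list)
        for pos, row in enumerate(rows):
            self.table[self._hash(self._key(row))].append(pos)

    def _key(self, row):
        return tuple(row[c] for c in self.columns)

    def _hash(self, key):
        # The whole composite key is hashed at once (restriction 3).
        return zlib.crc32(repr(key).encode()) % self.buckets

    def lookup(self, *values):
        """Equality lookup on the full key; returns the matching rows."""
        if len(values) != len(self.columns):
            raise ValueError("every index key column must be supplied")
        key = tuple(values)
        candidates = self.table.get(self._hash(key), [])
        # Different keys can share a bucket, so each candidate row is
        # re-checked against the table data (restriction 4).
        return [self.rows[p] for p in candidates
                if self._key(self.rows[p]) == key]


rows = [{"city": f"city{i % 5}", "user": f"user{i}"} for i in range(20)]
index = ToyHashIndex(rows, ("city", "user"))

hit = index.lookup("city3", "user3")        # full key: the index is usable

try:
    index.lookup("city3")                   # leading column only: unusable
    partial_key_ok = True
except ValueError:
    partial_key_ok = False

# Restrictions 1 and 2: hash values do not follow key order.
hashes = [zlib.crc32(str(k).encode()) for k in range(100)]
order_preserved = hashes == sorted(hashes)
```

With 20 rows spread over 4 buckets, every lookup has to filter several candidates against the table. If all rows shared the same key value, a single bucket would hold every row pointer, which is the low-selectivity case in restriction 5.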
2. B-Tree Index
B-Tree indexes are the most frequently used index type in MySQL databases. All storage engines except the Archive storage engine support B-Tree indexes. The B-Tree index is the most important index type not only in MySQL but in many other database management systems as well, mainly because its storage structure gives excellent data retrieval performance in a database.
In general, the physical files of B-Tree indexes in MySQL are mostly stored in a Balanced Tree structure: all the actual data is stored in the Leaf Nodes of the tree, and the shortest path to every Leaf Node has exactly the same length, which is why they are called B-Tree indexes. Different databases (and different MySQL storage engines) may modify the storage structure slightly when storing their own B-Tree indexes. For example, the storage structure that the InnoDB storage engine actually uses for its B-Tree indexes is a B+Tree, a small variation on the B-Tree data structure.
In a B+Tree, each Leaf Node stores, in addition to the index key information, a pointer to the next adjacent Leaf Node. The main purpose is to speed up retrieval across multiple adjacent Leaf Nodes.
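A minimal Python sketch shows why the pointer between adjacent Leaf Nodes helps range retrieval. It assumes a simplified model in which only the leaf level exists and the root-to-leaf descent is replaced by a binary search over the leaves; Leaf, build_leaves and range_scan are hypothetical names, not part of any storage engine.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Leaf:
    keys: List[int]                  # index keys held by this leaf, sorted
    next: Optional["Leaf"] = None    # pointer to the adjacent leaf


def build_leaves(sorted_keys, per_leaf=3):
    """Split sorted keys into leaves and chain each leaf to its successor."""
    leaves = [Leaf(sorted_keys[i:i + per_leaf])
              for i in range(0, len(sorted_keys), per_leaf)]
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return leaves


def range_scan(leaves, low, high):
    """Return keys in [low, high]: find the first leaf once, then follow next."""
    # Stand-in for the tree descent: first leaf whose last key is >= low.
    last_keys = [leaf.keys[-1] for leaf in leaves]
    start = bisect_left(last_keys, low)
    if start == len(leaves):
        return []
    result = []
    leaf = leaves[start]
    while leaf is not None:
        for key in leaf.keys:
            if key > high:
                return result        # past the range: stop scanning
            if key >= low:
                result.append(key)
        leaf = leaf.next             # no need to go back up the tree
    return result


leaves = build_leaves(list(range(10, 100, 10)))   # keys 10, 20, ..., 90
in_range = range_scan(leaves, 25, 70)
```

Only the first leaf is located by searching; every later leaf in the range is reached through the next pointer.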
The InnoDB storage engine has two different forms of index. One is the Primary Key index, stored in Clustered form. The other is an ordinary B-Tree index that is stored in basically the same way as in other storage engines (such as MyISAM); InnoDB calls this index a Secondary Index. The following figure compares the storage formats of these two kinds of index.
In the diagram, the left side is the Primary Key stored in Clustered form, and the right side is an ordinary B-Tree index. The Root Nodes and Branch Nodes of the two are the same; the Leaf Nodes differ. In the Primary Key index, the Leaf Nodes store the actual table data: not only the primary key columns but also all the other columns, arranged in primary key order. The Secondary Index differs little from other ordinary B-Tree indexes, except that its Leaf Nodes store the InnoDB primary key value along with the index key information.
Therefore, in InnoDB, accessing data through the primary key is very efficient. When data is accessed through a Secondary Index, InnoDB first searches the Secondary Index by the index key to reach the Leaf Node, then takes the primary key value stored in that Leaf Node and uses it to fetch the data row through the primary key index.
In the MyISAM storage engine, the primary key index and the non-primary-key indexes differ very little; the only distinction is that the key of the primary key index must be unique and non-null. The storage structure of MyISAM indexes is basically the same as that of the InnoDB Secondary Index. The main difference is that, besides the index key information, a MyISAM Leaf Node stores information that locates the corresponding data row directly in the MyISAM data file (such as the Row Number), and does not store the primary key value.
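The two lookup paths can be sketched in Python as follows. This is a simplified model under stated assumptions: plain dicts and a list stand in for the on-disk B-Trees and the data file, and every table, value and function name here is made up for illustration.

```python
# Plain dicts and lists stand in for the on-disk B-Trees and data file.
# All table contents and function names here are made up for illustration.

# InnoDB style: the clustered index leaf holds the whole row.
clustered_index = {
    1: {"id": 1, "email": "ann@example.com", "name": "Ann"},
    2: {"id": 2, "email": "bob@example.com", "name": "Bob"},
}
# Secondary index leaf: index key -> primary key value.
innodb_email_index = {"ann@example.com": 1, "bob@example.com": 2}


def innodb_find_by_email(email):
    pk = innodb_email_index.get(email)   # step 1: search the secondary index
    if pk is None:
        return None
    return clustered_index[pk]           # step 2: search the primary key index


# MyISAM style: rows live in a separate data file; every index,
# primary or not, maps its key to a row position in that file.
myisam_data_file = [
    {"id": 1, "email": "ann@example.com", "name": "Ann"},
    {"id": 2, "email": "bob@example.com", "name": "Bob"},
]
myisam_primary_index = {1: 0, 2: 1}
myisam_email_index = {"ann@example.com": 0, "bob@example.com": 1}


def myisam_find_by_email(email):
    pos = myisam_email_index.get(email)  # one index search
    return None if pos is None else myisam_data_file[pos]
```

In this model the InnoDB secondary lookup performs two index searches, while a lookup by primary key, or any MyISAM lookup, performs one index search followed by a direct row access.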