Hash Index and B-Tree Index

Last Update:2014-09-22 Source: Internet

Author: User


Hash Index

Because of the way a hash index is structured, retrieval is very efficient: the index can locate an entry in a single step, whereas a B-tree index has to walk from the root node through the branch nodes down to a leaf. For equality lookups, a hash index is therefore usually faster than a B-tree index.
This raises an obvious question: if hash indexes are so much more efficient, why use B-tree indexes at all? Everything has two sides, and hash indexes are no exception. The same structure that makes them fast also imposes a number of restrictions and drawbacks.

(1) Hash indexes support only the equality comparisons "=", "IN" and "<=>"; they do not support range queries. The index compares the hash values produced by the hash function, and the ordering of those hash values is not guaranteed to match the ordering of the original key values. The index can therefore be used only for equality filtering, not for range-based filtering.

(2) Hash indexes cannot be used to avoid sorting. The index stores hash values, and the order of the hash values need not match the order of the keys before hashing, so the database cannot use the index to skip any sort operation.

(3) Hash indexes cannot be used with only part of the index key. For a composite index, the hash value is computed over all the key columns combined, not over each column separately. A query that supplies only the first column, or any other subset of the columns, cannot use the hash index.

(4) Hash indexes cannot always avoid reading the table. The hash index stores, in a hash table, the hash value of each index key together with a pointer to the corresponding row. Because different index keys can produce the same hash value, finding the entries that match a hash value is not enough to answer the query from the index alone. The database still has to read the actual rows from the table and compare them to obtain the result.

(5) When many keys share the same hash value, a hash index is not necessarily faster than a B-tree index. If a hash index is created on a low-selectivity key, a large number of row pointers are stored under the same hash value. Locating one particular record then requires many table accesses, and overall performance suffers.
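
Restrictions (1) to (3) can be illustrated with a small Python sketch. It is not how MySQL implements its indexes: a `dict` stands in for the hash index, a sorted list stands in for the ordered leaf level of a B-tree, and the table, its columns and all function names are hypothetical.

```python
import bisect

# Hypothetical toy table: row id -> (last_name, first_name, age)
rows = {
    0: ("Li", "Wei", 31),
    1: ("Wang", "Fang", 25),
    2: ("Li", "Na", 42),
    3: ("Zhang", "Min", 25),
}

# Hash-style index on age: a dict is a hash table, key -> list of row ids.
hash_idx = {}
for rid, row in rows.items():
    hash_idx.setdefault(row[2], []).append(rid)

# Ordered index on age: sorted (key, row id) pairs, standing in for B-tree leaves.
ordered_idx = sorted((row[2], rid) for rid, row in rows.items())


def hash_eq(age):
    """Equality lookup: one probe into the hash table."""
    return sorted(hash_idx.get(age, []))


def ordered_range(lo, hi):
    """Row ids with lo <= age <= hi, located by binary search on the sorted keys."""
    start = bisect.bisect_left(ordered_idx, (lo, -1))
    end = bisect.bisect_right(ordered_idx, (hi, float("inf")))
    return [rid for _, rid in ordered_idx[start:end]]  # already in key order


def hash_range(lo, hi):
    """The hash table has no key order, so a range must test every key in it."""
    return sorted(
        rid for age, rids in hash_idx.items() if lo <= age <= hi for rid in rids
    )


# Composite hash index on (last_name, first_name): the whole tuple is hashed,
# so a probe with only the leading column finds nothing.
composite_idx = {(r[0], r[1]): rid for rid, r in rows.items()}

print(hash_eq(25))                      # equality works on the hash index
print(ordered_range(25, 31))            # range works on the ordered index
print(("Li", "Wei") in composite_idx)   # full key: usable
print("Li" in composite_idx)            # partial key: not usable
```

The ordered index answers the range query by touching only the matching slice and returns the rows already sorted by key, while `hash_range` has to visit every key and sort afterwards.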

B-Tree Index

B-tree indexes are the most frequently used index type in MySQL. All storage engines except the Archive storage engine support B-tree indexes. This is true beyond MySQL as well: in many other database management systems the B-tree index is also the most important index type, mainly because its storage structure performs very well for data retrieval.
Generally, the physical files of B-tree indexes in MySQL are stored as a balanced tree: all the actual data is stored in the leaf nodes, and the path from the root to every leaf node has exactly the same length, which is why it is called a B-tree index. Individual databases (and individual MySQL storage engines) may modify the structure slightly when storing their own B-tree indexes. For example, the B-tree index of the InnoDB storage engine is actually a B+ tree, a small variation on the B-tree data structure: in addition to the index key information, each leaf node also stores a pointer to the next adjacent leaf node, mainly to speed up retrieval of multiple adjacent leaf nodes.
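
The benefit of the leaf-to-leaf pointer can be sketched in Python as follows. This is a simplified illustration, not InnoDB's page format: the `Leaf` class, `build_leaves`, `range_scan` and the one-level list of separator keys are all hypothetical, and keys are assumed to be unique.

```python
import bisect


class Leaf:
    def __init__(self, keys):
        self.keys = keys   # sorted index keys held by this leaf
        self.next = None   # pointer to the adjacent leaf on the right


def build_leaves(sorted_keys, fanout=3):
    """Pack sorted keys into leaves and chain each leaf to its right neighbor."""
    leaves = [
        Leaf(sorted_keys[i:i + fanout])
        for i in range(0, len(sorted_keys), fanout)
    ]
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return leaves


def range_scan(leaves, lo, hi):
    """Return keys with lo <= key <= hi: one descent, then follow next pointers."""
    if not leaves:
        return []
    # Stand-in for the root/branch levels: the first key of each leaf.
    separators = [leaf.keys[0] for leaf in leaves]
    pos = max(bisect.bisect_right(separators, lo) - 1, 0)
    leaf = leaves[pos]
    out = []
    while leaf is not None:
        for key in leaf.keys:
            if key > hi:
                return out       # past the range: stop without visiting more leaves
            if key >= lo:
                out.append(key)
        leaf = leaf.next         # move sideways instead of descending from the root again
    return out


leaves = build_leaves([1, 3, 5, 7, 9, 11, 13, 15])
print(range_scan(leaves, 4, 12))
```

The scan descends through the upper levels once to find the first relevant leaf and then walks sideways through the chain, which is what makes retrieving adjacent leaf nodes cheap.
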
The InnoDB storage engine has two different types of index. One is the primary key index, which is stored in clustered form. The other is an ordinary B-tree index whose storage structure is basically the same as in other storage engines (such as MyISAM); InnoDB calls this a secondary index. The following figure compares the two indexes.

In the diagram, the left side is the primary key index stored in clustered form, and the right side is an ordinary B-tree index. The root nodes and branch nodes of the two are structured identically; the leaf nodes differ. In the primary key index, the leaf nodes store the actual data of the table: not only the primary key columns but all the other columns as well, arranged in primary key order. A secondary index differs little from an ordinary B-tree index, except that its leaf nodes store the index key information together with the InnoDB primary key value.

Therefore, in InnoDB, accessing data through the primary key is very efficient. When data is accessed through a secondary index, InnoDB first searches the secondary index with the index key to reach the leaf node, and then uses the primary key value stored in that leaf node to fetch the data row through the primary key index. In the MyISAM storage engine, the primary key index and non-primary-key indexes differ very little; the only difference is that the key of the primary key index must be unique and non-null. The storage structure of MyISAM indexes is basically the same as that of InnoDB secondary indexes. The main difference is that, alongside the index key information, a MyISAM leaf node stores information that locates the data row directly in the MyISAM data file (such as a row number), and does not store the primary key value.
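
The two lookup paths described above can be sketched in Python. Plain dicts and a list stand in for the B-trees and the data file, and the table, its columns and the function names are hypothetical; the sketch only shows what each leaf entry points to, not the engines' real storage.

```python
# Hypothetical table with columns: id (primary key), email, name.

# InnoDB style: the clustered primary key index holds the full rows.
clustered = {
    1: {"id": 1, "email": "a@example.com", "name": "Ann"},
    2: {"id": 2, "email": "b@example.com", "name": "Bo"},
}
# InnoDB style secondary index on email: each entry stores the primary key value.
innodb_secondary = {"a@example.com": 1, "b@example.com": 2}


def innodb_lookup_by_email(email):
    pk = innodb_secondary.get(email)   # step 1: search the secondary index
    if pk is None:
        return None
    return clustered[pk]               # step 2: search the primary key index


# MyISAM style: rows live in a separate data file, and every index,
# primary or not, stores the position of the row in that file.
myisam_data_file = [
    {"id": 1, "email": "a@example.com", "name": "Ann"},
    {"id": 2, "email": "b@example.com", "name": "Bo"},
]
myisam_pk_index = {1: 0, 2: 1}
myisam_email_index = {"a@example.com": 0, "b@example.com": 1}


def myisam_lookup_by_email(email):
    pos = myisam_email_index.get(email)  # single index search yields a row position
    return None if pos is None else myisam_data_file[pos]


print(innodb_lookup_by_email("b@example.com")["name"])
print(myisam_lookup_by_email("b@example.com")["name"])
```

In the InnoDB-style path a secondary index lookup costs two index searches, while in the MyISAM-style path the primary key index and the email index have the same shape and both point straight into the data file.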


Reference: http://blog.sina.com.cn/s/blog_6776884e0100pko1.html


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
