Database (1): the difference between a hash index and a Btree index

Source: Internet
Author: User

An index is a data structure that helps MySQL retrieve data efficiently. The most common index types are the Btree index and the hash index.

Different storage engines support different index types: in InnoDB and MyISAM the default index is the Btree index, while in the Memory engine the default index is the hash index.

Hash index

With a hash index, when we add an index to a column of a table, the values of that column are run through a hash algorithm, and the resulting hash values are stored in order in a hash array. A hash index can therefore locate an entry in a single step, which is very efficient, whereas a Btree index may need several disk I/O operations. Even so, InnoDB and MyISAM did not adopt it, because it has many drawbacks:

1. Because a hash index compares the values produced by the hash calculation, it supports only equality comparisons and cannot be used for range queries; a range query has to scan the full table every time.

2. The hash values are arranged in order, but the real data they map to is not necessarily in that order in the hash table, so a hash index cannot be used to speed up any sort operation.

3. A partial index key cannot be used for searching, because a composite index computes a single hash value over all of its columns together.

4. When many rows produce the same hash value and the amount of data is very large, its retrieval efficiency is no higher than that of a Btree index.
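The first drawback can be sketched in a few lines of Python. This is only an illustration, not how MySQL implements either index: a dict stands in for the hash table, a sorted list stands in for the leaf level of a B+ tree, and the sample rows and function names are made up for the example.

```python
import bisect

# Hypothetical sample data: (id, age) rows.
rows = [(1, 30), (2, 25), (3, 41), (4, 25), (5, 37)]

# Hash-style index: age -> list of row ids. A dict stands in for the hash table.
hash_index = {}
for row_id, age in rows:
    hash_index.setdefault(age, []).append(row_id)

# Ordered index: (age, id) pairs kept sorted, standing in for B+ tree leaves.
ordered_index = sorted((age, row_id) for row_id, age in rows)


def hash_equal(age):
    """Equality lookup: a single probe into the hash table."""
    return hash_index.get(age, [])


def hash_range(low, high):
    """Range lookup: the hash table has no usable order, so every key is checked."""
    return sorted(
        rid
        for age, ids in hash_index.items()
        if low <= age <= high
        for rid in ids
    )


def ordered_range(low, high):
    """Range lookup: binary search for the start, then read forward until past `high`."""
    start = bisect.bisect_left(ordered_index, (low,))
    result = []
    for age, rid in ordered_index[start:]:
        if age > high:
            break
        result.append(rid)
    return sorted(result)
```

Both range functions return the same ids, but `hash_range` has to visit every key, while `ordered_range` jumps to the first candidate and stops as soon as it passes the upper bound.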

Btree index

The Btree index uses a B+ tree as its storage structure.

However, the storage structure of the Btree index differs greatly between InnoDB and MyISAM.

In MyISAM, if we build a Btree index on a column of a table, the leaf nodes of the tree store the address of the corresponding data record rather than the record itself.

So we often say that the data files and index files in MyISAM are separate.

Therefore, the index of MyISAM is also known as a nonclustered index, and the index of InnoDB is called a clustered index.

A secondary index in MyISAM has the same structure as the primary index; the only difference is that values in the primary index cannot be duplicated, while values in a secondary index can.


So when we search using a MyISAM Btree index, if the key exists, we read the address stored in the data field of the leaf node and then fetch the data record from the table at that address.

InnoDB is very different: its leaf nodes store not the address of the record, but the data itself.

The leaf node does not hold an address; it holds the corresponding data row directly. This is what we usually mean when we say that in InnoDB the index file is the data file.

The secondary index structure for InnoDB is also quite different from the primary index.


The leaf nodes of an InnoDB secondary index store the primary key value. So when we use a secondary index, we first retrieve the primary key, and then use the primary key to search the primary index and locate the data in the table. This explains why an overly long field should not be used as the primary key in InnoDB: every secondary index contains the primary key, so a long primary key easily makes the secondary indexes huge.
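The two-step lookup can be pictured with a small Python sketch. It is a toy model under loud assumptions: plain dicts stand in for the two B+ trees, and the table contents and the name `find_by_name` are invented for illustration.

```python
# Clustered (primary) index: primary key -> full row.
# In InnoDB the leaf of the primary index holds the row itself.
clustered_index = {
    1: {"id": 1, "name": "alice", "age": 30},
    2: {"id": 2, "name": "bob", "age": 25},
    3: {"id": 3, "name": "carol", "age": 41},
}

# Secondary index on name: key -> primary key value (not a row address).
name_index = {row["name"]: pk for pk, row in clustered_index.items()}


def find_by_name(name):
    """Secondary index gives the primary key; the primary index gives the row."""
    pk = name_index.get(name)
    if pk is None:
        return None
    return clustered_index[pk]
```

Every entry in `name_index` carries a copy of the primary key, which is why the size of the primary key is multiplied across all secondary indexes.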

It also follows that in InnoDB we should use an auto-increment primary key whenever possible, so that each new row only needs to be appended at the end. Inserting with a non-monotonic primary key forces splits and adjustments to maintain the B+ tree properties, which is very inefficient.
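A rough way to see the difference is to count how much existing data has to move on each insert. The sketch below is a deliberate simplification: a single flat sorted list stands in for the ordered leaf level, so it does not model real page splits, and the function name and key sets are hypothetical.

```python
import bisect
import random


def entries_moved(keys):
    """Insert keys one by one into a sorted list and count how many existing
    entries sit to the right of each insertion point (and so have to shift)."""
    leaf = []
    moved = 0
    for key in keys:
        pos = bisect.bisect_left(leaf, key)
        moved += len(leaf) - pos
        leaf.insert(pos, key)
    return moved


sequential_keys = list(range(1, 1001))       # auto-increment style keys
shuffled_keys = sequential_keys[:]
random.Random(0).shuffle(shuffled_keys)      # fixed seed: non-monotonic keys

sequential_cost = entries_moved(sequential_keys)
shuffled_cost = entries_moved(shuffled_keys)
```

With increasing keys every insert lands at the end, so `sequential_cost` is zero; with shuffled keys most inserts land in the middle of existing data, which is the situation that leads to splits in a real B+ tree.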

The leftmost matching principle in Btree indexes:

A Btree index builds its search tree by comparing columns from left to right. For example, if the index is (name, age, sex), the name field is compared first, and only when the name values are equal are the next two fields (age, then sex) compared.

So when a query supplies only the last two fields (age, sex), the index cannot be used: the search tree is built on the first field, so you must have the name field to know where to look for the next field.

When a query supplies (name, sex), the index is first used to narrow the search by name, but the second field (age) is missing, so the matching names are found through the index and the sex condition is then checked against those entries.
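The three cases above can be sketched with a sorted list of tuples, which orders entries the same way a composite (name, age, sex) index does. This is an illustrative model only; the sample rows and function names are assumptions, not MySQL internals.

```python
import bisect

# Hypothetical (name, age, sex) entries, sorted like a composite index.
index = sorted([
    ("amy", 22, "F"),
    ("amy", 30, "F"),
    ("bob", 22, "M"),
    ("bob", 35, "M"),
    ("cat", 22, "F"),
])


def prefix_range(prefix):
    """Binary search for the first entry with this leading prefix,
    then read forward while the prefix still matches."""
    lo = bisect.bisect_left(index, prefix)
    hi = lo
    while hi < len(index) and index[hi][:len(prefix)] == prefix:
        hi += 1
    return index[lo:hi]


def by_name_and_sex(name, sex):
    """(name, sex): narrow by name through the index, then filter on sex."""
    return [entry for entry in prefix_range((name,)) if entry[2] == sex]


def by_age_and_sex(age, sex):
    """(age, sex): no leading name, so every entry has to be examined."""
    return [entry for entry in index if entry[1] == age and entry[2] == sex]
```

Entries with the same age are scattered across different names, so `by_age_and_sex` has no contiguous slice to jump to, while any query that starts with name does.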

Rules for indexing:

1. Use the leftmost prefix: MySQL keeps matching index columns to the right until it meets a range operation (>, <, like, between), where the matching stops. For example, with a=1 and b=2 and c>3 and d=6, if the index is (a,b,c,d), the d column of the index is not used at all; if the index is changed to (a,b,d,c), all four columns can be used.

2. Do not over-index: whenever table content is modified, the indexes must be updated or even rebuilt, so the more indexes there are, the more time this takes.

3. Try to extend an existing index rather than create a new one.

4. The most suitable columns to index are those that appear in the WHERE clause or are specified in a join clause.

5. Columns with few distinct values (such as gender) do not need to be indexed.
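Rule 1 can be checked with a small experiment. The sketch below assumes a made-up table of integer columns a, b, c, d and uses sorted tuples in place of real composite indexes; it compares how many index entries have to be read for the query a=1 and b=2 and c>3 and d=6 under each column order.

```python
import bisect
from itertools import product

# Hypothetical table: every combination of a, b in 1..3 and c, d in 1..8.
rows = [
    {"a": a, "b": b, "c": c, "d": d}
    for a, b, c, d in product(range(1, 4), range(1, 4), range(1, 9), range(1, 9))
]


def build_index(columns):
    """Sorted key tuples, standing in for the leaf level of a composite index."""
    return sorted(tuple(row[col] for col in columns) for row in rows)


def scan(index, lower, upper):
    """Contiguous slice with lower <= key < upper, located by binary search."""
    start = bisect.bisect_left(index, lower)
    end = bisect.bisect_left(index, upper)
    return index[start:end]


# Query: a = 1 AND b = 2 AND c > 3 AND d = 6 (c > 3 means c >= 4 for integers).

# Index (a, b, c, d): equality on a and b, then the range on c ends the match,
# so d = 6 has to be checked on every entry in the slice.
abcd = build_index(("a", "b", "c", "d"))
scanned_abcd = scan(abcd, (1, 2, 4), (1, 3))
matches_abcd = [key for key in scanned_abcd if key[3] == 6]

# Index (a, b, d, c): equality on a, b and d, with the range on c last,
# so the slice contains only matching entries.
abdc = build_index(("a", "b", "d", "c"))
scanned_abdc = scan(abdc, (1, 2, 6, 4), (1, 2, 7))
```

On this sample data the (a,b,c,d) order reads 40 entries to find 5 matches, while the (a,b,d,c) order reads exactly the 5 matching entries.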

Reprinted from: http://blog.csdn.net/u014307117/article/details/47325091
