MySQL in the btree and Hash index more detailed

Source: Internet
Author: User
Tags character set hash mysql in

MySQL's most common index structure is btree (O (n)), but there are always situations where we want to use other types of indexes for better performance. Hash is one of the options, for example, when we retrieve the user ID by user name, they are always one-to-one relationships, the operator used is only =, if the use of hash as the index data structure, time complexity can be reduced to O (1). Unfortunately, in the current version of MySQL (5.6), the hash only supports the memory and NDB two kinds of engines, while our most common InnoDB and MyISAM do not support hash type indexes.

Anyway, to understand the difference between the two indexes, the following translation from the MySQL official website document on the two explanations.

B-tree Index Features

The B-tree index can be used on comparison operators like =,>,>=,<,<= and between. It can also be used with the LIKE operator as long as its query condition is a constant that does not begin with a wildcard character. You can use indexes like the following statements:

The code is as follows Copy Code
SELECT * from Tbl_name WHERE key_col like ' patrick% ';
SELECT * from Tbl_name WHERE key_col like ' pat%_ck% ';

The following two scenarios do not use indexes:

The code is as follows Copy Code
SELECT * from Tbl_name WHERE key_col like '%patrick% ';
SELECT * from Tbl_name WHERE key_col like Other_col;

The first is because it starts with a wildcard character and the second is because the constants are not used.

If you use ... Like '%string% ' and string more than three characters, MySQL uses the turbo Boyer-moore algorithm algorithm to initialize the query expression, and then use this expression to make the query faster.

One such query, Col_name is NULL, can be indexed with col_name.

Any index that does not overwrite all where and level conditions will not be used. In other words, to use an index, the first column in the index needs to appear in each and group.

The following where condition uses the index:

The code is as follows Copy Code

... WHERE index_part1=1 and index_part2=2 and other_column=3
/* index = 1 OR index = 2 */
... WHERE index=1 OR a=10 and index=2
/* Optimized to "index_part1= ' Hello '" * *
... WHERE index_part1= ' Hello ' and index_part3=5
/* can use INDEX1 Index but will not use INDEX2 and index3 * *
... WHERE index1=1 and index2=2 or index1=3 and index3=3;

The following where condition does not use an index:

The code is as follows Copy Code

/* Index_part1 not used to * *
... WHERE Index_part2=1 and index_part3=2

/* Index index does not appear in each WHERE clause * *
... WHERE index=1 OR a=10

/* No index covering all columns * *
... WHERE index_part1=1 OR index_part2=10

Sometimes MySQL does not use an index, even if it is available. For example, when the MySQL estimate uses the index to read most of the row data. (In this case, a full table scan can be faster than using an index because it requires less retrieval). However, if the statement uses limit to qualify the number of rows returned, MySQL uses the index. Because it is more efficient to use indexes when there are fewer rows of results.

Hash Index Features
The index of the hash type has some characteristics that differ from those described above:

They can only be used for peer-to-peer comparisons, such as = and <=> operators (but much faster). They cannot be used for range query conditions such as <. If the system only needs to use a storage structure such as "key Pair", use the hash type index as much as possible.

The optimizer cannot use a hash index to accelerate the order by operator. (Such indexes cannot be used to search for values in the next order)

MySQL cannot determine the number of data between two values (this requires using the range query operator to determine which index to use). If you convert a MyISAM table to a memory table that relies on a hash index, some statements (performance) may be affected.

Only the full key can be used to search a row of data. (If you use the B-tree index, any fragment of a key can be used to find it.) I think it might mean that the wildcard like operator doesn't work.


Hash index is efficient, but the hash index itself has brought many limitations and drawbacks because of its particularity, mainly have the following.

(1) Hash index can only satisfy "=", "in" and "<=>" query, can not use range query.

Because the hash index is compared to the hash after the hash value, so it can only be used for equivalent filtering, can not be used for filtering based on the scope, because after the corresponding hash algorithm to deal with the size of the hash value of the relationship, and can not guarantee the same as the hash operation before.

(2) Hash index cannot be used to avoid sorting operations of data.

Because the hash index is stored in the hash after the hash value, and the size of the hash value is not necessarily the same as the hash operation before the key value is exactly the same, so the database can not use the index data to avoid any sort operation;

(3) Hash index can not use the partial index key query.

For the combined index, the hash index calculates the hash value when the combination index key merges and then calculates the hash value together, instead of calculating the hash value separately, the hash index can not be used when the query is made by the first or several index keys of the combined index.

(4) Hash index can not avoid table scan at any time.

As already known before, hash index is the key through the hash operation, the hash of the results and the corresponding row pointer information stored in a hash table, because of the same hash value of different key, so even if the data to meet a hash key value of the number of records, can not To complete the query directly from the Hash index, or to compare it by accessing the actual data in the table, and get the corresponding results.

(5) When a hash index encounters a large number of hash values, performance is not necessarily higher than the B-tree index.

For index keys with a low selectivity, if a hash index is created, there will be a large number of record pointer information that is associated with the same Hash value. This will be very cumbersome to locate a record, will waste a lot of table data access, resulting in poor overall performance.

Postscript

By the way, there are some problems encountered in using MySQL:

Sometimes using a script to migrate data will encounter garbled problem, even if the table character set into UTF8 also useless, this time before executing SQL add a set names UTF8 can.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.