Database Index Design

Last Update: 2017-04-29 Source: Internet

Author: User

Types of index:
Normal: an ordinary index with no uniqueness constraint.

Unique: an index that does not allow duplicate values. If a field is guaranteed never to repeat, such as a social security number, its index can be set to unique.

Full text: an index used for full-text search. Full-text indexes work best when searching very long articles. For relatively short text, such as one or two lines, a normal index is sufficient.

In summary, the index type is determined by the nature of the indexed field's content; the normal index is usually the most common choice.

1. Normal index: add an index
ALTER TABLE `table_name` ADD INDEX index_name (`column`)
For example, with table_name set to user and column set to name, this adds an index to the name field of the user table.

2. Primary key index: add a primary key
ALTER TABLE `table_name` ADD PRIMARY KEY (`column`)

3. Unique index: add a unique index
ALTER TABLE `table_name` ADD UNIQUE (`column`)

4. Full-text index: add a full-text index
ALTER TABLE `table_name` ADD FULLTEXT (`column`)

5. Multi-column index: add an index on several columns
ALTER TABLE `table_name` ADD INDEX index_name (`column1`, `column2`, `column3`)
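The statements above are MySQL syntax. As a minimal sketch that can be run without a MySQL server, the following uses Python's built-in sqlite3 module as a stand-in. SQLite creates indexes with CREATE INDEX rather than ALTER TABLE ... ADD INDEX, and the table, column, and index names here are hypothetical, chosen only for illustration. It shows the practical difference between a normal index and a unique index.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table with a normal, a unique, and a multi-column index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users ("
        "id INTEGER PRIMARY KEY, name TEXT, id_card TEXT, city TEXT, age INTEGER)"
    )
    # Normal index: duplicate values are allowed.
    conn.execute("CREATE INDEX idx_users_name ON users (name)")
    # Unique index: duplicate values are rejected.
    conn.execute("CREATE UNIQUE INDEX uq_users_id_card ON users (id_card)")
    # Multi-column (composite) index.
    conn.execute("CREATE INDEX idx_users_city_age ON users (city, age)")
    return conn


def try_insert(conn, name, id_card):
    """Return True if the row was inserted, False if the unique index rejected it."""
    try:
        conn.execute(
            "INSERT INTO users (name, id_card) VALUES (?, ?)", (name, id_card)
        )
        return True
    except sqlite3.IntegrityError:
        return False


conn = build_demo_db()
print(try_insert(conn, "alice", "A-001"))  # first row
print(try_insert(conn, "alice", "A-002"))  # repeated name, normal index
print(try_insert(conn, "bob", "A-001"))    # repeated id_card, unique index
```

Only the third insert is refused, because it repeats a value in the column covered by the unique index.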

Principles of Index Establishment
On top of a sound database design, carefully considered indexing is the foundation of a high-performance database system. Adding indexes without proper analysis can degrade the overall performance of the system: an index speeds up data access, but it also increases the processing time of INSERT, UPDATE, and DELETE operations.
Whether to add indexes to a table, and on which fields, must be decided before the index is created. A good approach is to analyze the application's business processes and data usage, and to index the fields that are often used as query criteria or for ordering. Given how the optimizer processes SQL statements, we can follow these general principles when creating an index:

(1) Create an index on fields that often appear after the keywords ORDER BY, GROUP BY, and DISTINCT.
Indexing these fields is an effective way to avoid sort operations. For a composite index, the field order of the index must match the order of the fields following these keywords; otherwise the index will not be used.
(2) Create an index on the result-set fields of set operations such as UNION, for the same reason as above.
(3) Create an index on fields that are frequently used as query selection criteria.
(4) Create an index on attributes that are often used to join tables.
(5) Consider using covering indexes. For tables whose data is rarely updated, if users frequently query only a few of the fields, consider indexing those fields so that a table scan becomes an index scan.
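Principle (5) can be observed with a query plan. The sketch below again uses Python's sqlite3 module as a stand-in for MySQL; the table and index names are hypothetical, and the plan wording is SQLite's own, so other databases report this differently. When every selected field is part of the index, SQLite reports a covering index and does not need to read the table rows.

```python
import sqlite3


def query_plan(conn, sql):
    """Return the plan text SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each plan row holds the human-readable detail.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, bio TEXT)"
)
conn.execute("CREATE INDEX idx_users_name_age ON users (name, age)")

# Both selected fields are in the index: the index alone answers the query.
covered = query_plan(conn, "SELECT name, age FROM users WHERE name = 'alice'")
# bio is not in the index: the index finds the rows, then the table is read.
not_covered = query_plan(conn, "SELECT name, bio FROM users WHERE name = 'alice'")

print(covered)
print(not_covered)
```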

In addition to the above principles, we should also pay attention to the following limitations when creating an index:

(1) Limit the number of indexes on a table.
For a table with a large number of update operations, the number of indexes should typically be no more than 3, and at most 5. Although indexes improve access speed, too many indexes slow down data updates.
(2) Do not create an index on a field that has a large number of identical values.
Selecting on such a field (for example, gender) returns a large number of matching records, so the optimizer will not use the index as the access path.
(3) Avoid indexing fields whose values grow in one direction (for example, date-type fields), and in a composite index avoid placing such a field first.
Because the value of the field always grows in one direction, new records are always placed in the last leaf page of the index. This causes constant contention for that leaf page, allocation of new leaf pages, and splits of intermediate branch pages. In addition, if the index is a clustered index, the table data is stored in index order, so all inserts concentrate on the last data page and create an insert "hot spot".
(4) In a composite index, order the fields by how frequently they appear in query conditions.
In a composite index, records are first sorted by the first field. Records with the same value in the first field are then sorted by the value of the second field, and so on. The index can therefore be used only when the first field of the composite index appears in the query condition. For this reason, place the most frequently used fields at the front of the composite index, so that the system can use the index as often as possible.
(5) Delete indexes that are no longer used or are rarely used.
Some existing indexes may no longer be needed after the data in the table has been heavily updated or the way the data is used has changed. The database administrator should periodically identify these indexes and remove them, reducing the impact of indexes on update operations.
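Limitation (2) is often discussed in terms of selectivity: the ratio of distinct values to total rows in a column. The following is a minimal sketch, with made-up sample data and a hypothetical helper name, of how that ratio separates a poor index candidate such as gender from a good one. It does not state any threshold that a particular optimizer uses.

```python
def selectivity(values):
    """Ratio of distinct values to rows: 1.0 means unique, near 0 means many repeats."""
    values = list(values)
    if not values:
        return 0.0
    return len(set(values)) / len(values)


# Hypothetical sample columns with 1000 rows each.
genders = ["F", "M"] * 500                              # 2 distinct values
emails = [f"user{i}@example.com" for i in range(1000)]  # all distinct

print(selectivity(genders))
print(selectivity(emails))
```

A lookup on the gender column matches about half of the table, so reading through an index saves little compared with scanning the table directly.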

Index Establishment Principle II:
Principles for indexing:
1) Columns that define the primary key must be indexed.
2) Columns that define a foreign key must be indexed.
3) Columns that are queried frequently are best indexed.
4) Index columns that need fast or frequent queries within a specified range.
5) Index columns that are frequently used in the WHERE clause.
6) Index fields that often appear after the keywords ORDER BY, GROUP BY, and DISTINCT. For a composite index, the field order of the index must match the order of the fields following these keywords; otherwise the index will not be used.
7) Do not index columns that are rarely involved in queries or that contain many duplicate values.
8) Do not index columns whose data types are defined as text, image, or bit.
9) Avoid indexing columns that are frequently updated.
10) Limit the number of indexes on a table. For a table with a large number of update operations, the number of indexes should typically be no more than 3, and at most 5. Although indexes improve access speed, too many indexes slow down data updates.
11) In a composite index, order the fields by how frequently they appear in query criteria. In a composite index, records are first sorted by the first field. Records with the same value in the first field are then sorted by the value of the second field, and so on. The index can therefore be used only when the first field of the composite index appears in the query criteria, so placing the most frequently used fields at the front lets the system use the index as often as possible.
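The sorting behavior described for composite indexes can be sketched with a sorted list of tuples. This is an illustration only, not how any database stores its index; the rows and function names are hypothetical. With the first field known, a binary search finds one contiguous slice. With only the second field known, matching entries are scattered and every entry has to be checked.

```python
from bisect import bisect_left, bisect_right

# Rows of a hypothetical table: (city, age, row_id).
rows = [
    ("Hangzhou", 31, 1),
    ("Beijing", 25, 2),
    ("Hangzhou", 25, 3),
    ("Shanghai", 40, 4),
    ("Beijing", 31, 5),
]

# A composite index on (city, age): entries sorted by city, then by age.
index = sorted(rows)
city_keys = [entry[0] for entry in index]


def find_by_city(city):
    """First field known: binary search yields one contiguous slice of the index."""
    lo = bisect_left(city_keys, city)
    hi = bisect_right(city_keys, city)
    return index[lo:hi]


def find_by_age(age):
    """First field unknown: ages are ordered only within each city, so scan everything."""
    return [entry for entry in index if entry[1] == age]


print(find_by_city("Hangzhou"))
print(find_by_age(25))
print([entry[1] for entry in index])  # ages in index order are not globally sorted
```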

Combining multiple indexes
A single index scan can only use query clauses that apply operators of the index's operator class to the indexed fields and that are joined with AND. If there is an index on (a, b), then a condition like WHERE a = 5 AND b = 6 can use the index, but a condition like WHERE a = 5 OR b = 6 cannot use the index directly.
A query such as WHERE x = 42 OR x = … OR x = … OR x = 99 can be decomposed into four independent scans of an index on x, each scan using one condition, and the results of these scans are then ORed together to produce the final result. As another example, if we have independent indexes on x and y, a query like WHERE x = 5 AND y = 6 can be decomposed into clauses that each use the matching index, and the results are then ANDed together to produce the final result.
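The combination step can be sketched with sets of row identifiers: each single-index scan yields a set, OR conditions become a union, and AND conditions become an intersection. This is a minimal illustration with hypothetical data and helper names; real systems use more compact structures such as bitmaps for the same idea.

```python
from collections import defaultdict

# Hypothetical table: row_id -> row.
rows = {
    1: {"x": 42, "y": 6},
    2: {"x": 5, "y": 6},
    3: {"x": 99, "y": 1},
    4: {"x": 5, "y": 7},
    5: {"x": 42, "y": 1},
}


def build_index(column):
    """Build a single-column index mapping each value to the set of matching row ids."""
    index = defaultdict(set)
    for row_id, row in rows.items():
        index[row[column]].add(row_id)
    return index


def scan(index, value):
    """One index scan for one equality condition."""
    return set(index.get(value, set()))


index_x = build_index("x")
index_y = build_index("y")

# WHERE x = 42 OR x = 99: one scan per condition, results ORed (set union).
or_result = scan(index_x, 42) | scan(index_x, 99)

# WHERE x = 5 AND y = 6: one scan per index, results ANDed (set intersection).
and_result = scan(index_x, 5) & scan(index_y, 6)

print(sorted(or_result))
print(sorted(and_result))
```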

In all but the simplest applications, several combinations of indexes may be useful, and the database developer must make trade-offs to decide which indexes to provide. Sometimes a multi-field index is best, and sometimes it is better to create separate indexes and rely on index combination. For example, if your queries sometimes involve only field x, sometimes only field y, and sometimes both fields, you might create two separate indexes on x and y and rely on index combination to handle the queries that use both fields. You could instead create a multi-field index on (x, y). It is usually more efficient than index combination for queries that use both fields, but it is almost useless for queries that involve only y, so it cannot be the only index. A multi-field index plus a separate index on y may serve well. The multi-field index can also be used for queries that involve only x, but it is larger and therefore slower than an index on x alone. The last option is to create all three indexes, but this is reasonable only if the table is queried far more often than it is updated and all three kinds of query are common. If one kind of query is much less common than the others, you might prefer to create only the two indexes that best match the common queries.

Differences in Indexing Methods:

1. Hash Index

Because of its special structure, a hash index has very high retrieval efficiency: a lookup locates the entry in one step. A B-tree index, by contrast, has to travel from the root node through the branch nodes to the leaf page, which takes several I/O accesses. For this kind of lookup, a hash index is therefore much more efficient than a B-tree index.
Many people may wonder: if a hash index is so much more efficient than a B-tree, why not use hash indexes everywhere instead of B-tree indexes? Everything has two sides, and the hash index is no exception. Although it is efficient, its special structure also brings a number of limitations and drawbacks, mainly the following.

(1) A hash index can only serve "=", "IN", and "<=>" queries; it cannot be used for range queries.

A hash index compares the hash values produced by the hash operation, so it can only be used for equality filtering. It cannot be used for range filtering, because the ordering of the hash values produced by the hash algorithm is not guaranteed to match the ordering of the original key values.

(2) A hash index cannot be used to avoid sorting data.

A hash index stores hash values, and the ordering of the hash values is not necessarily the same as the ordering of the key values before hashing, so the database cannot use the index data to avoid any sort operation.

(3) A hash index cannot be used with a partial index key.

For a composite index, the hash index merges the composite index keys and computes one hash value for the combination, rather than computing a hash value for each key separately. A query that uses only the first key, or the first few keys, of the composite index therefore cannot use the hash index.

(4) A hash index can never avoid accessing the table.

As described above, a hash index stores the hash value of each index key, together with the corresponding row pointer, in a hash table. Because different index keys can produce the same hash value, even a query that only needs the number of records matching a given key cannot be completed from the hash index alone. The actual data in the table must still be accessed and compared to obtain the result.

(5) When a hash index encounters a large number of equal hash values, its performance is not necessarily better than a B-tree index.

For low-selectivity index keys, a hash index will have a large number of record pointers associated with the same hash value. Locating a particular record then becomes cumbersome and wastes many table data accesses, resulting in poor overall performance.
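Limitations (1) and (2) can be sketched by comparing a hash table with a sorted list. Python's dict is itself a hash table, so it stands in for the hash index here, and a sorted list stands in for the ordered leaves of a B-tree. The data and function names are hypothetical. Equality lookups are direct in the hash structure, but a range query has no order to exploit and must examine every key, while the ordered structure finds the range with two binary searches.

```python
from bisect import bisect_left, bisect_right

# Hypothetical column: row_id -> age.
ages = {1: 31, 2: 25, 3: 40, 4: 25, 5: 18}

# Hash-style index: key -> list of row ids.
hash_index = {}
for row_id, age in ages.items():
    hash_index.setdefault(age, []).append(row_id)

# Ordered index: (key, row_id) pairs kept in key order.
ordered_index = sorted((age, row_id) for row_id, age in ages.items())
ordered_keys = [age for age, _ in ordered_index]


def hash_equal(age):
    """Equality lookup: one direct probe of the hash table."""
    return sorted(hash_index.get(age, []))


def hash_range(low, high):
    """Range lookup on a hash index: every key has to be examined."""
    return sorted(
        row_id
        for age, row_ids in hash_index.items()
        if low <= age <= high
        for row_id in row_ids
    )


def ordered_range(low, high):
    """Range lookup on an ordered index: two binary searches bound one slice."""
    lo = bisect_left(ordered_keys, low)
    hi = bisect_right(ordered_keys, high)
    return sorted(row_id for _, row_id in ordered_index[lo:hi])


print(hash_equal(25))
print(hash_range(20, 35))
print(ordered_range(20, 35))
```

Both range functions return the same rows; the difference is how much of the index each one has to touch.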

2. B-tree Index

The B-tree index is the most frequently used index type in MySQL, and all storage engines except the Archive storage engine support it. This is true not only of MySQL: in many other database management systems the B-tree index is also the most important index type, mainly because its storage structure performs very well for data retrieval.
In general, the physical files of a B-tree index in MySQL are stored in a balanced tree structure. All the data that is actually needed is stored in the leaf nodes of the tree, and the shortest path to any leaf node has exactly the same length, which is why it is called a B-tree index. Different databases (or different MySQL storage engines) may modify the storage structure slightly when storing their own B-tree indexes. For example, the B-tree index of the InnoDB storage engine actually uses a B+tree structure, a small modification of the B-tree data structure: in addition to the index key information, each leaf node stores a pointer to the next adjacent leaf node, mainly to speed up retrieval across multiple neighboring leaf nodes.
In the InnoDB storage engine there are two forms of index. One is the primary key index, which is stored in clustered form. The other is an ordinary B-tree index stored in basically the same form as in other storage engines (such as MyISAM); in InnoDB this kind of index is called a secondary index. The two storage forms are compared in a figure.
[Figure: storage forms of the InnoDB primary key (clustered) index and the secondary index]
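The leaf pointers of the B+tree can be sketched as a linked list of leaf nodes. This is a simplified illustration with hypothetical class and function names, not InnoDB's actual page layout. It omits the root and branch nodes, which in a real tree are used to find the starting leaf, and shows only how a range scan follows the pointers from leaf to leaf without returning to the upper levels.

```python
class Leaf:
    """A leaf node holding sorted keys and a pointer to the next adjacent leaf."""

    def __init__(self, keys):
        self.keys = keys
        self.next = None


def link_leaves(key_groups):
    """Build leaves from groups of sorted keys and chain each one to its right neighbor."""
    leaves = [Leaf(list(group)) for group in key_groups]
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return leaves


def range_scan(start_leaf, low, high):
    """Collect keys in [low, high] by walking the leaf chain from start_leaf."""
    result = []
    leaf = start_leaf
    while leaf is not None:
        for key in leaf.keys:
            if key > high:
                return result  # keys are sorted, so nothing further can match
            if key >= low:
                result.append(key)
        leaf = leaf.next
    return result


leaves = link_leaves([[1, 4], [7, 9], [12, 15], [20, 22]])
# Assume the tree descent has already located the second leaf for low = 8.
print(range_scan(leaves[1], 8, 15))
```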

I forgot to record which expert's blog this material was taken from. If there is any infringement, please contact me and I will delete it (this blog is only for my own study).

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
