A brief introduction to database indexing

Source: Internet
Author: User
Tags: create index, mysql query, mysql index

Database indexes came up a lot in my recent written tests and interviews. Because I had never reviewed the topic systematically, much of what I knew about indexes had become vague, so today I am organizing my notes. They do not dig deep, but beginners may still find them useful. For reference only.

What is an index

A database index is like the table of contents at the front of a book: it speeds up queries against the database.

Take a query such as SELECT * FROM table1 WHERE id = 44. Without an index, the database must traverse the whole table until it finds the row whose id equals 44. With an index (it must be an index built on the id column), the database looks up 44 in the index, which gives the position of the row directly. In short, an index is used for locating rows.
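The difference between the two lookups can be sketched in a few lines of Python. This is only an illustration under assumptions: the rows, the `full_scan` and `index_lookup` helpers, and the sorted-list "index" are all hypothetical, and a real database uses a tree structure rather than a flat sorted list.

```python
from bisect import bisect_left

# Hypothetical table: rows stored in arbitrary (insertion) order.
rows = [(id_, f"user{id_}") for id_ in (17, 3, 98, 61, 25, 44)]

def full_scan(rows, wanted_id):
    """Check every row until the id matches; returns (row, rows_examined)."""
    for examined, row in enumerate(rows, start=1):
        if row[0] == wanted_id:
            return row, examined
    return None, len(rows)

# The "index": (id, position) pairs kept sorted by id.
index = sorted((row[0], pos) for pos, row in enumerate(rows))
index_keys = [key for key, _ in index]

def index_lookup(rows, wanted_id):
    """Binary-search the sorted keys, then jump straight to the row."""
    i = bisect_left(index_keys, wanted_id)
    if i < len(index_keys) and index_keys[i] == wanted_id:
        return rows[index[i][1]]
    return None

print(full_scan(rows, 44))     # the scan examines all 6 rows before it finds id 44
print(index_lookup(rows, 44))  # the index finds the row position by binary search
```

The scan cost grows with the number of rows, while the binary search grows only with the logarithm of that number, which is the same reason a tree-shaped index pays off on large tables.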

Indexes are divided into clustered and non-clustered indexes. A clustered index stores the rows physically in index order, while a non-clustered index is a separate structure that points to the rows. A clustered index speeds up multi-row (range) retrieval, while a non-clustered index is very fast for single-row retrieval.

The purpose of indexing is to speed up the lookup or sorting of records in a table.

Indexing a table has a cost: it increases the storage space the database needs, and inserting and modifying data takes more time (because the indexes must change as well).

Why create an index

Creating an index can greatly improve the performance of your system.

First, by creating a unique index, you can guarantee the uniqueness of each row of data in a database table.

Second, it can greatly speed up the retrieval of data, which is the main reason for creating indexes.

Third, it can speed up joins between tables, which is particularly useful for enforcing referential integrity.

Fourth, when a query uses grouping and sorting clauses, an index can significantly reduce the time spent grouping and sorting.

Fifth, with indexes available, the query optimizer can choose better execution plans, which improves the performance of the system.

Someone may ask: if indexes have so many advantages, why not create one on every column of the table? Because adding indexes also has many disadvantages.

First, creating and maintaining indexes takes time, and this time grows with the amount of data.

Second, indexes occupy physical space. Besides the space taken by the table data, each index takes a certain amount of physical space, and a clustered index needs even more.

Third, whenever data in the table is inserted, deleted, or modified, the indexes must be maintained as well, which slows down data modification.
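The maintenance cost can be made concrete with a toy model. The sketch below is an assumption-laden illustration, not how MySQL stores data: `IndexedTable` is a hypothetical class that keeps one sorted list per indexed column and simply counts how many structures each insert has to touch.

```python
from bisect import insort

class IndexedTable:
    """Toy table: a row list plus one sorted (value, position) list per indexed column."""

    def __init__(self, indexed_columns):
        self.rows = []
        self.indexes = {col: [] for col in indexed_columns}
        self.writes = 0  # row writes plus index-entry writes

    def insert(self, row):
        pos = len(self.rows)
        self.rows.append(row)
        self.writes += 1
        for col, entries in self.indexes.items():
            insort(entries, (row[col], pos))  # keep every index sorted
            self.writes += 1

no_index = IndexedTable([])
three_indexes = IndexedTable(["username", "city", "age"])
for i in range(1000):
    row = {"username": f"user{i}", "city": f"city{i % 10}", "age": 20 + i % 50}
    no_index.insert(row)
    three_indexes.insert(row)

print(no_index.writes, three_indexes.writes)  # 1000 4000
```

Each extra index adds one more sorted structure to update on every insert, and also one more structure to store, which matches the time and space costs listed above.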

Where to build the index

Indexes are built on columns of a database table. When creating indexes, consider which columns should be indexed and which should not. In general, create indexes on these columns:

1. Columns that are searched often, to speed up the searches;

2. Primary key columns, to enforce uniqueness and to organize the arrangement of data in the table;

3. Columns often used in joins, which are mainly foreign keys, to speed up the joins;

4. Columns often searched by range, because the index is sorted and the specified range is contiguous;

5. Columns often used for sorting, because the index is already sorted and the query can reuse that order to reduce sorting time;

6. Columns often used in WHERE clauses, to speed up evaluation of the conditions.

Similarly, some columns should not be indexed. In general, such columns have the following characteristics:

First, do not index columns that are seldom used or referenced in queries. Because these columns are seldom used, indexing them does not improve query speed. On the contrary, the extra index slows down maintenance and increases the space requirement.

Second, do not index columns that have only a few distinct values, such as the gender column of a personnel table. With so few values, each query returns a large proportion of the rows in the table, so the index does not significantly speed up retrieval.

Third, do not index columns defined with the text, image, or bit data types. The data in these columns is either very large or takes very few distinct values, and neither case suits an index.

Fourth, do not create indexes when modification performance matters far more than retrieval performance. The two conflict: adding indexes improves retrieval but slows modification, and removing indexes does the opposite. So when modifications far outnumber retrievals, indexes should not be created.

Data structure of the index

The B-tree (the B is usually read as "balanced") is the data structure commonly used for database indexes. A B-tree significantly reduces the number of intermediate steps needed to locate a record, which speeds up access. The B+tree is a variant of the B-tree, and MySQL generally uses B+trees to implement its indexes.

Insert operation: to insert an element, first search for it in the B-tree. If it does not exist, the search ends at a leaf node, and the new element is inserted into that leaf. If the leaf has enough space, the elements larger than the new key are shifted right to make room. If the leaf is full, it is split: about half of its keys move to a new adjacent node on the right, and the middle key moves up into the parent node. If the parent is full as well, it must be split in the same way. Whenever keys in a node shift right, the associated pointers shift right too. If the split propagates all the way to a full root node, the middle key of the old root moves up into a new root, and the height of the tree increases by one level.
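The insert procedure can be sketched as follows. This is a minimal in-memory illustration, assuming "order" means the maximum number of children per node; the `Node` and `BTree` classes are hypothetical names, and real database B-trees add disk pages, locking, and (for B+trees) linked leaves.

```python
from bisect import bisect_left

class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []  # empty list for a leaf

    @property
    def is_leaf(self):
        return not self.children

class BTree:
    def __init__(self, order=4):
        self.order = order  # assumed meaning: maximum children per node
        self.root = Node()

    def insert(self, key):
        split = self._insert(self.root, key)
        if split is not None:
            # The root overflowed: a new root is created, height grows by one.
            median, right = split
            self.root = Node([median], [self.root, right])

    def _insert(self, node, key):
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return None  # key already present, nothing to do
        if node.is_leaf:
            node.keys.insert(i, key)  # shifts larger keys to the right
        else:
            split = self._insert(node.children[i], key)
            if split is not None:
                median, right = split
                node.keys.insert(i, median)
                node.children.insert(i + 1, right)
        if len(node.keys) > self.order - 1:  # node holds too many keys
            return self._split(node)
        return None

    def _split(self, node):
        """Keep the left half in place, return (middle key, new right node)."""
        mid = len(node.keys) // 2
        median = node.keys[mid]
        right = Node(node.keys[mid + 1:], node.children[mid + 1:])
        node.keys = node.keys[:mid]
        node.children = node.children[:mid + 1]
        return median, right

    def height(self):
        h, node = 1, self.root
        while not node.is_leaf:
            node, h = node.children[0], h + 1
        return h

    def in_order(self):
        out = []
        def walk(node):
            for i, key in enumerate(node.keys):
                if not node.is_leaf:
                    walk(node.children[i])
                out.append(key)
            if not node.is_leaf:
                walk(node.children[-1])
        walk(self.root)
        return out

tree = BTree(order=4)
for key in (10, 20, 30):
    tree.insert(key)
print(tree.root.keys, tree.height())  # [10, 20, 30] 1
tree.insert(40)  # the full root splits and 30 moves up
print(tree.root.keys, tree.height())  # [30] 2
```

With order 4 a node holds at most three keys, so the fourth insert forces the root split described above. This variant splits after the overflow happens; some textbooks split full nodes on the way down instead, which gives the same kind of tree.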

Delete operation: first search the B-tree for the element to delete. If it exists, remove it from its node. If the element has left and right child nodes (that is, it sits in an internal node), move an adjacent key up from a child to take its place and then handle the child node in the same way; if it has no children, delete it directly. After the deletion and any moves, if a node holds fewer than ceil(M/2)-1 elements (M is the order of the tree), check whether one of its adjacent siblings is "plump", meaning that it holds more than ceil(M/2)-1 elements. If so, borrow an element by way of the parent node to restore the condition. If not, merge the node with an adjacent sibling into a single node to restore the condition.

The following example explains in detail the use of indexes in MySQL

Indexing is the key to fast searching. MySQL indexing is important for the efficient operation of MySQL. Here are a few common types of MySQL indexes.

In a database table, indexing a field can greatly improve query speed. Suppose we create a mytable table:

CREATE TABLE mytable (ID INT NOT NULL, username VARCHAR(16) NOT NULL);

We randomly insert 10,000 records into it, including one record: 5555, admin.

To find the record with username = 'admin':

SELECT * FROM mytable WHERE username = 'admin';

If an index has been built on username, MySQL can locate the record directly through the index without scanning the table. Without the index, MySQL scans all records, that is, it examines 10,000 records.
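The effect can be observed with a small experiment. The sketch below uses SQLite through Python's standard sqlite3 module as a stand-in, because it needs no server; this is an assumption of the example, and MySQL's own EXPLAIN output looks different. The index name `idx_username` and the helper `query_plan` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INT NOT NULL, username VARCHAR(16) NOT NULL)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(i, "admin" if i == 5555 else f"user{i}") for i in range(1, 10001)],
)

def query_plan(sql):
    """Return SQLite's plan description for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return "; ".join(row[-1] for row in rows)  # last column is the description

query = "SELECT * FROM mytable WHERE username = 'admin'"

plan_without_index = query_plan(query)  # expected to report a scan of the table
conn.execute("CREATE INDEX idx_username ON mytable (username)")
plan_with_index = query_plan(query)     # expected to report a search using the index

print(plan_without_index)
print(plan_with_index)
print(conn.execute(query).fetchall())
```

The query returns the same row either way; only the access path changes. The exact wording of the plan text varies between SQLite versions, so the example prints it rather than assuming a fixed string.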

Indexes are divided into single-column indexes and composite indexes. A single-column index contains only one column. A table can have several single-column indexes, but together they are still not a composite index. A composite index is a single index that contains multiple columns.

MySQL index types include:

(1) Ordinary index

This is the most basic index and it has no limitations. It is created in the following ways:

Create an index

CREATE INDEX indexname ON mytable (username(length));

For CHAR and VARCHAR columns, length can be smaller than the actual length of the field; for BLOB and TEXT columns, length must be specified.

Modify Table Structure

ALTER TABLE mytable ADD INDEX [indexname] (username(length));

Specify directly when creating a table

CREATE TABLE mytable (ID INT NOT NULL, username VARCHAR(16) NOT NULL,
INDEX [indexname] (username(length)));

Syntax for dropping an index:

DROP INDEX indexname ON mytable;

(2) Unique index

It is similar to the ordinary index above, except that the values of the indexed column must be unique, although null values are allowed. For a composite index, the combination of column values must be unique. It is created in the following ways:

Create an index

CREATE UNIQUE INDEX indexname ON mytable (username(length));

Modify Table Structure

ALTER TABLE mytable ADD UNIQUE [indexname] (username(length));

Specify directly when creating a table

CREATE TABLE mytable (ID INT NOT NULL, username VARCHAR(16) NOT NULL,
UNIQUE [indexname] (username(length)));

(3) Primary key index

It is a special unique index that does not allow null values. A primary key index is typically created together with the table:

CREATE TABLE mytable (ID INT NOT NULL, username VARCHAR(16) NOT NULL,
PRIMARY KEY (ID));

Of course, you can also use the ALTER command. Remember: a table can have only one primary key.

(4) Combined index

To visually compare single-column and composite indexes, add multiple fields to the table:

CREATE TABLE mytable (ID INT NOT NULL, username VARCHAR(16) NOT NULL,
city VARCHAR(50) NOT NULL, -- the city length is an assumed value
age INT NOT NULL);

To get more efficiency out of MySQL, consider building a composite index, that is, building username, city, and age into one index:

ALTER TABLE mytable ADD INDEX name_city_age (username(10), city, age);

When the table was created, the username column was given a length of 16, but the index uses only 10. In general a name is no longer than 10 characters, so the shorter prefix speeds up index lookups, reduces the size of the index file, and speeds up INSERT operations.

If you instead create separate single-column indexes on username, city, and age, the table has three single-column indexes, and queries are much less efficient than with the composite index. Although three indexes exist, MySQL generally uses only the one single-column index that it considers most efficient.

Creating such a composite index is in fact equivalent to creating the following three indexes:

username, city, age
username, city
username

Why is there no composite index such as city, age? Because MySQL composite indexes follow the "leftmost prefix" rule: simply put, a usable combination always starts from the leftmost column. Not every query that involves these three columns uses the composite index. The following statements use it:

SELECT * FROM mytable WHERE username = 'admin' AND city = 'Zhengzhou';
SELECT * FROM mytable WHERE username = 'admin';

while the following do not:

SELECT * FROM mytable WHERE age = 20 AND city = 'Zhengzhou';
SELECT * FROM mytable WHERE city = 'Zhengzhou';
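The leftmost prefix rule follows from the way a composite index is sorted. The sketch below models the index as a sorted list of (username, city, age) tuples; the sample rows and the `prefix_range` helper are hypothetical and only illustrate the ordering argument, not MySQL internals.

```python
from bisect import bisect_left

# Hypothetical rows; the "composite index" is the same data sorted by
# username first, then city, then age.
rows = [
    ("admin", "Zhengzhou", 20),
    ("alice", "Beijing", 31),
    ("admin", "Beijing", 45),
    ("bob", "Zhengzhou", 20),
    ("carol", "Shanghai", 27),
    ("bob", "Beijing", 52),
]
index = sorted(rows)

def prefix_range(index, prefix):
    """Return the contiguous slice of entries whose leading columns equal prefix."""
    lo = bisect_left(index, prefix)  # one binary search finds the start
    hi = lo
    while hi < len(index) and index[hi][:len(prefix)] == prefix:
        hi += 1
    return index[lo:hi]

# Conditions on a leftmost prefix map to one contiguous block of the index.
print(prefix_range(index, ("admin",)))
print(prefix_range(index, ("admin", "Zhengzhou")))

# A condition on city alone matches entries scattered through the index,
# so there is no single block to jump to.
zhengzhou_positions = [i for i, entry in enumerate(index) if entry[1] == "Zhengzhou"]
print(zhengzhou_positions)  # [1, 4]
```

Because the entries are ordered by username first, all rows for one username sit next to each other, while rows for one city are spread across the whole list. That is why a condition on city or age alone cannot use this index efficiently.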

(5) Timing of index creation

Now that we know how to create indexes, when do we need them? In general, columns that appear in WHERE and JOIN clauses need to be indexed, but not always, because MySQL uses an index only for <, <=, =, >, >=, BETWEEN, IN, and in some cases LIKE. For example:

SELECT t.username FROM mytable t LEFT JOIN mytable m ON t.username = m.username
WHERE m.age = 20 AND m.city = 'Zhengzhou';

Here city and age need to be indexed because they appear in the WHERE clause, and the username column of mytable also appears in the JOIN clause, so it is worth indexing as well.

As mentioned, LIKE can use an index only in some cases, because MySQL does not use the index when the pattern begins with the wildcard % or _. For example, the following statement uses the index:

SELECT * FROM mytable WHERE username LIKE 'admin%';

while the following statement does not:

SELECT * FROM mytable WHERE username LIKE '%admin';

So pay attention to this difference when using LIKE.
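The reason for this difference is again the sort order of the index. The sketch below models an index on username as a sorted Python list; the names and the two helper functions are hypothetical and only emulate the two LIKE patterns.

```python
from bisect import bisect_left

usernames = sorted(
    ["root", "admin", "administrator", "badmin", "admin2", "guest", "sysadmin"]
)

def like_prefix(sorted_names, prefix):
    """Emulate LIKE 'prefix%': one binary search, then read adjacent entries."""
    start = bisect_left(sorted_names, prefix)
    matches = []
    for name in sorted_names[start:]:
        if not name.startswith(prefix):
            break  # sorted order guarantees no later entry can match
        matches.append(name)
    return matches

def like_suffix(sorted_names, suffix):
    """Emulate LIKE '%suffix': the sort order does not help, so test every entry."""
    return [name for name in sorted_names if name.endswith(suffix)]

print(like_prefix(usernames, "admin"))  # ['admin', 'admin2', 'administrator']
print(like_suffix(usernames, "admin"))  # ['admin', 'badmin', 'sysadmin']
```

Values that share a prefix are neighbours in sorted order, so the prefix search can stop early. Values that share a suffix can be anywhere, so every entry has to be examined.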

(6) Deficiencies of the index

The benefits of indexes are described above, but indexes should not be overused, because they have drawbacks as well:

Although indexes greatly improve query speed, they slow down updates to the table, such as INSERT, UPDATE, and DELETE, because when updating a table MySQL must save not only the data but also the index file.

Index files consume disk space. Usually this is not a serious problem, but if you create several composite indexes on a large table, the index files grow quickly.

Indexes are only one factor in efficiency. If your MySQL database has tables with large amounts of data, you need to spend time working out the best indexes or optimizing the query statements.

(7) Considerations for using Indexes

There are some tips and considerations when working with indexes:

Avoid null values in indexed columns

Declare indexed columns NOT NULL where possible. MySQL's B-tree indexes do store NULL values, but nullable columns need extra storage and make comparisons and index use harder to optimize, and this applies to every nullable column in a composite index. So when designing the database, do not let the default value of a field be NULL.

Use short indexes

When indexing a string column, specify a prefix length if possible. For example, if you have a CHAR(255) column and most values are unique within the first 10 or 20 characters, do not index the entire column. Short indexes not only improve query speed but also save disk space and I/O operations.
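One way to pick a prefix length is to measure how many distinct prefixes the data has at each length. The sketch below is a hypothetical helper working on an in-memory list of sample names, and the 0.9 target is an arbitrary assumption; on a real table the same ratio would be computed with a query over the column.

```python
def prefix_selectivity(values, length):
    """Fraction of distinct prefixes of the given length (1.0 means all unique)."""
    return len({v[:length] for v in values}) / len(values)

def shortest_prefix(values, target=0.9):
    """Smallest prefix length whose selectivity reaches the target."""
    longest = max(len(v) for v in values)
    for length in range(1, longest + 1):
        if prefix_selectivity(values, length) >= target:
            return length
    return longest

names = [
    "alexander", "alexandra", "alexis", "benjamin",
    "benedict", "charlotte", "charles", "christopher",
]
for length in (1, 3, 5, 7):
    print(length, prefix_selectivity(names, length))
# 1 0.375
# 3 0.5
# 5 0.75
# 7 0.875
print(shortest_prefix(names))  # 8
```

Once the selectivity of a prefix is close to that of the full column, indexing more characters mostly adds size without making lookups more precise.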

Sorting on indexed columns

A MySQL query generally uses only one index per table, so if the WHERE clause already uses an index, the columns in ORDER BY will not use one. So do not sort when the database's default order is acceptable, and avoid sorting on multiple columns; if you must, create a composite index on those columns.

LIKE operations

Using LIKE is generally discouraged. If you must use it, how you use it matters: LIKE "%aaa%" does not use the index, while LIKE "aaa%" can.

Do not perform calculations on columns

SELECT * FROM users WHERE YEAR(adddate) < 2007;

This evaluates the function on every row, so the index cannot be used and the query falls back to a full table scan. We can rewrite it as:

SELECT * FROM users WHERE adddate < '2007-01-01';
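The two forms can be compared with a query plan. As before, the sketch uses SQLite through Python's sqlite3 module as a stand-in for MySQL, which is an assumption of the example. SQLite has no YEAR() function, so strftime('%Y', ...) plays that role here; the table layout, the index name `idx_adddate`, and the `query_plan` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INT NOT NULL, name TEXT NOT NULL, adddate TEXT NOT NULL)"
)
conn.execute("CREATE INDEX idx_adddate ON users (adddate)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(i, f"user{i}", f"{2000 + i % 15}-06-15") for i in range(1, 1001)],
)

def query_plan(sql):
    """Return SQLite's plan description for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return "; ".join(row[-1] for row in rows)

# Function applied to the indexed column: the index order is of no use.
wrapped = "SELECT * FROM users WHERE CAST(strftime('%Y', adddate) AS INTEGER) < 2007"
# Bare column compared with a constant: a plain range condition on the index.
bare = "SELECT * FROM users WHERE adddate < '2007-01-01'"

plan_wrapped = query_plan(wrapped)
plan_bare = query_plan(bare)
count_wrapped = len(conn.execute(wrapped).fetchall())
count_bare = len(conn.execute(bare).fetchall())

print(plan_wrapped)
print(plan_bare)
print(count_wrapped == count_bare)  # both forms select the same rows
```

Both statements return the same rows, but only the second one lets the engine treat the condition as a range over the sorted index values.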

Do not use not in and <> operations

That concludes this introduction to MySQL index types.
