MySQL index types and optimization

Source: Internet
Author: User

Indexes are the key to fast lookups, and well-designed indexes are essential for MySQL to run efficiently. Here are a few common types of MySQL indexes.

In a database table, indexing a column can greatly improve query speed. Suppose we create a table named mytable:

CREATE TABLE mytable (
    id INT NOT NULL,
    username VARCHAR(16) NOT NULL
);

We insert 10,000 random records, one of which is (5555, 'admin').

To find the record with username = 'admin', we run:

SELECT * FROM mytable WHERE username = 'admin';

If an index has been created on username, MySQL can locate the record directly without scanning the table. Without an index, MySQL scans every record, that is, all 10,000 rows.

Indexes are divided into single-column indexes and composite indexes. A single-column index contains only one column; a table can have several single-column indexes, but together they do not make up a composite index. A composite index is a single index that contains multiple columns.

MySQL index types include:

(1) Normal index

This is the most basic index type and has no restrictions. It can be created in the following ways:

Create an index

CREATE INDEX indexname ON mytable (username(length));

For CHAR and VARCHAR columns, length can be less than the actual length of the column; for BLOB and TEXT columns, length must be specified.

Modify the table structure

ALTER TABLE mytable ADD INDEX [indexname] (username(length));

Specify it directly when creating the table

CREATE TABLE mytable (
    id INT NOT NULL,
    username VARCHAR(16) NOT NULL,
    INDEX [indexname] (username(length))
);

Syntax for dropping an index:

DROP INDEX indexname ON mytable;

(2) Unique index

It is similar to the normal index above, except that the values in the indexed column must be unique, although NULL values are allowed. For a composite index, the combination of column values must be unique. It can be created in the following ways:

Create an index

CREATE UNIQUE INDEX indexname ON mytable (username(length));

Modify the table structure

ALTER TABLE mytable ADD UNIQUE [indexname] (username(length));

Specify it directly when creating the table

CREATE TABLE mytable (
    id INT NOT NULL,
    username VARCHAR(16) NOT NULL,
    UNIQUE [indexname] (username(length))
);

(3) Primary key index

It is a special unique index that does not allow NULL values. The primary key index is usually created together with the table:

CREATE TABLE mytable (
    id INT NOT NULL,
    username VARCHAR(16) NOT NULL,
    PRIMARY KEY (id)
);

Of course, you can also add it with the ALTER command. Remember: a table can have only one primary key.

(4) Composite index

To compare single-column and composite indexes concretely, add more columns to the table:

-- The length of city is an assumed value.
CREATE TABLE mytable (
    id INT NOT NULL,
    username VARCHAR(16) NOT NULL,
    city VARCHAR(50) NOT NULL,
    age INT NOT NULL
);

To squeeze more efficiency out of MySQL, consider building a composite index, that is, putting username, city, and age into one index:

ALTER TABLE mytable ADD INDEX name_city_age (username(10), city, age);

When the table was created, username was given a length of 16, but only 10 is used here. This is because names generally do not exceed 10 characters; the shorter prefix speeds up index lookups, reduces the size of the index file, and makes INSERT operations faster.

If you instead create a single-column index on each of username, city, and age, the table has three single-column indexes, and queries are much less efficient than with the composite index above. Although three indexes exist, MySQL typically uses only the single-column index it considers most efficient. (MySQL 5.0 and later can merge several indexes in some cases, but that is usually still slower than one suitable composite index.)

Creating such a composite index is in effect equivalent to creating the following three indexes:

username, city, age

username, city

username

Why is there no index on a combination such as city, age? This is a consequence of the "leftmost prefix" rule of MySQL composite indexes: simply put, only combinations that start from the leftmost column are covered. Not every query that involves these three columns uses the composite index. The following statements use it:

SELECT * FROM mytable WHERE username = 'admin' AND city = 'Zhengzhou';

SELECT * FROM mytable WHERE username = 'admin';

The following statements do not:

SELECT * FROM mytable WHERE age = 20 AND city = 'Zhengzhou';

SELECT * FROM mytable WHERE city = 'Zhengzhou';

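
The leftmost-prefix rule can be illustrated outside MySQL with a minimal Python sketch. It models the composite index as a sorted list of (username, city, age) tuples. The sample rows and the helper names seek_prefix and scan_city are hypothetical, and a real B-tree is more elaborate than a sorted list.

```python
from bisect import bisect_left, bisect_right

# Hypothetical rows: (username, city, age). The composite index is modeled
# as a list of tuples kept sorted by (username, city, age).
rows = [
    ("admin", "Zhengzhou", 20),
    ("bob", "Beijing", 31),
    ("carol", "Zhengzhou", 20),
    ("dave", "Shanghai", 45),
    ("admin", "Beijing", 28),
]
index = sorted(rows)


def seek_prefix(index, prefix):
    """Binary-search all entries whose leading columns equal `prefix`."""
    n = len(prefix)
    # Copy of the leading columns; it stays sorted because the index is.
    # (A real B-tree compares keys in place; the copy keeps the sketch short.)
    keys = [entry[:n] for entry in index]
    lo = bisect_left(keys, prefix)
    hi = bisect_right(keys, prefix)
    return index[lo:hi]


def scan_city(index, city):
    """city is not a leftmost prefix, so every entry must be inspected."""
    return [entry for entry in index if entry[1] == city]


print(seek_prefix(index, ("admin",)))              # username only
print(seek_prefix(index, ("admin", "Zhengzhou")))  # username and city
print(scan_city(index, "Zhengzhou"))               # full scan
```

Entries with the same username sit next to each other in the sorted list, so a binary search finds them. Entries with the same city are scattered, which is why a condition on city alone cannot use this index.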
(5) When to create an index

Now that we know how to create indexes, when should we create one? In general, columns that appear in WHERE and JOIN clauses should be indexed, but not always, because MySQL uses an index only for the operators <, <=, =, >, >=, BETWEEN, IN, and, in some cases, LIKE. For example:

SELECT t.username
FROM mytable t LEFT JOIN mytable m
ON t.username = m.username WHERE m.age = 20 AND m.city = 'Zhengzhou';

Here city and age need an index because they appear in the WHERE clause, and username of mytable also needs one because it appears in the JOIN clause.

As noted above, LIKE can use an index only in some cases: MySQL does not use the index when the pattern starts with the wildcard % or _. For example, the following statement uses the index:

SELECT * FROM mytable WHERE username LIKE 'admin%';

And the following statement does not:

SELECT * FROM mytable WHERE username LIKE '%admin';

So pay attention to this difference when using LIKE.
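
Why a leading wildcard defeats the index can be sketched in Python, under the assumption that index entries behave like a sorted list of strings. The usernames and the helper names like_prefix and like_suffix are hypothetical; this is an illustration, not how MySQL is implemented.

```python
from bisect import bisect_left

# Hypothetical usernames, kept sorted the way index entries would be.
usernames = sorted(["admin", "admin2", "administrator", "badmin", "root", "sysadmin"])


def like_prefix(sorted_values, prefix):
    """Model of LIKE 'prefix%': a range scan over [prefix, upper)."""
    # Smallest string greater than every string that starts with `prefix`
    # (assumes the last character is not the maximum code point).
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    lo = bisect_left(sorted_values, prefix)
    hi = bisect_left(sorted_values, upper)
    return sorted_values[lo:hi]


def like_suffix(sorted_values, suffix):
    """Model of LIKE '%suffix': no usable range, so check every value."""
    return [v for v in sorted_values if v.endswith(suffix)]


print(like_prefix(usernames, "admin"))
print(like_suffix(usernames, "admin"))
```

A fixed prefix maps to one contiguous range of the sorted values, so two binary searches are enough. Values that merely end with the same text are spread across the whole list.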

(6) Drawbacks of indexes

The benefits of indexes are described above, but overusing them is an abuse. Indexes have drawbacks as well:

Although indexes greatly improve query speed, they slow down updates to the table, such as INSERT, UPDATE, and DELETE, because when updating a table MySQL must save not only the data but also the index file.

Indexes consume disk space in the form of index files. This is usually not serious, but if you create many composite indexes on a large table, the index file grows quickly.

Indexes are only one factor in efficiency. If your MySQL database has large tables, you need to spend time working out the best indexes or optimizing the query statements.

(7) Considerations when using indexes

There are some tips and caveats when working with indexes:

Avoid NULL values in indexed columns

MySQL can index columns that contain NULL, but nullable columns complicate index storage, index statistics, and comparisons, so such an index works less well. For this reason, do not make NULL the default value of a column when designing the database.

Use short indexes

When indexing a string column, specify a prefix length if possible. For example, for a CHAR(255) column, if most values are unique within the first 10 or 20 characters, do not index the entire column. Short indexes not only speed up queries but also save disk space and I/O operations.

Sorting on indexed columns

A MySQL query generally uses only one index per table, so if the WHERE clause already uses an index, the column in ORDER BY will not use one. Therefore, do not sort when the database's default order is acceptable; avoid sorting on multiple columns, and if you must, create a composite index on those columns.

LIKE operations

Using LIKE is generally discouraged; if you have to use it, how you use it matters. LIKE '%aaa%' does not use an index, while LIKE 'aaa%' can.

Do not perform calculations on columns

SELECT * FROM users WHERE YEAR(adddate) < 2007;

The function is evaluated on every row, which disables the index and forces a full table scan, so rewrite it as:

SELECT * FROM users WHERE adddate < '2007-01-01';

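
The two forms of the query return the same rows but do different amounts of work. A minimal Python sketch, assuming the index on adddate behaves like a sorted list of dates; the sample dates and the helper names year_filter and range_filter are hypothetical.

```python
from bisect import bisect_left
from datetime import date

# Hypothetical adddate values, sorted as they would be in an index on adddate.
adddates = sorted([
    date(2005, 3, 14), date(2006, 12, 31), date(2007, 1, 1),
    date(2007, 6, 30), date(2008, 2, 29),
])


def year_filter(values, year):
    """WHERE YEAR(adddate) < year: the function runs on every row."""
    evaluated = 0
    matches = []
    for d in values:
        evaluated += 1
        if d.year < year:
            matches.append(d)
    return matches, evaluated


def range_filter(sorted_values, year):
    """WHERE adddate < 'year-01-01': one binary search on the raw column."""
    cut = bisect_left(sorted_values, date(year, 1, 1))
    return sorted_values[:cut]


by_function, evaluated = year_filter(adddates, 2007)
by_range = range_filter(adddates, 2007)
print(by_function == by_range, evaluated)
```

The sort order of the index is defined on the stored value, so only a comparison against the raw column can be turned into a position in that order.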
Do not use NOT IN and <> operations

This concludes the introduction to MySQL index types.

Indexes have a critical impact on query speed, and understanding them is the starting point for database performance tuning. Consider the following scenario: a table has 10^6 records, the DBMS page size is 4 KB, and each page stores 100 records. Without an index, a query scans the entire table. In the worst case none of the data pages are in memory, so 10^4 pages must be read; if those 10^4 pages are scattered randomly on disk, that takes 10^4 I/O operations. Assuming each disk I/O takes 10 ms (ignoring data transfer time), the total is 100 s (in practice it is much better). With a B-tree index, only log_100(10^6) = 3 page reads are required, so the worst-case time is 30 ms. This is the effect of an index: often, when your application's SQL queries are very slow, you should ask whether an index can be built. Now to the main topic:
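
The arithmetic in this scenario can be checked with a short Python sketch. The numbers come from the paragraph above; the B-tree fanout of 100 keys per page is an assumption implied by the log base, and btree_height is a hypothetical helper name.

```python
# Figures taken from the paragraph above.
records = 10**6          # rows in the table
records_per_page = 100   # rows stored in one 4 KB page
io_ms = 10               # cost of one random page read, in milliseconds

# Full table scan: every data page is read once.
pages = records // records_per_page
scan_seconds = pages * io_ms / 1000


def btree_height(n, fanout):
    """Smallest h such that fanout**h >= n (integer math, no float log)."""
    height, reach = 0, 1
    while reach < n:
        reach *= fanout
        height += 1
    return height


# B-tree lookup: one page read per level, assuming a fanout of 100.
levels = btree_height(records, records_per_page)
lookup_ms = levels * io_ms

print(pages, scan_seconds, levels, lookup_ms)
```

Integer multiplication is used instead of a floating-point logarithm so that the exact power of 100 is not affected by rounding.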

Chapter II: Indexes and optimization

1. Choosing data types for indexes

MySQL supports many data types, and choosing the right one for your data has a significant impact on performance. In general, follow these guidelines:

(1) Smaller data types are usually better: they require less space on disk, in memory, and in the CPU cache, and are faster to process.
(2) Simple data types are better: integers are cheaper to process than characters, because string comparison is more complex. In MySQL, use the built-in date and time types rather than strings to store times, and use an integer type to store IP addresses.
(3) Avoid NULL where possible: declare columns NOT NULL unless you really need to store NULL. In MySQL, nullable columns are hard to optimize in queries because they complicate indexes, index statistics, and comparisons. Use 0, a special value, or an empty string instead of NULL.
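
Storing an IP address as an integer, as guideline (2) suggests, amounts to packing the four octets into one unsigned 32-bit number. A minimal Python sketch using the standard ipaddress module follows; ip_to_int and int_to_ip are hypothetical helper names. Inside MySQL, the built-in functions INET_ATON() and INET_NTOA() perform the same conversion for IPv4 addresses.

```python
import ipaddress


def ip_to_int(text):
    """Dotted-quad IPv4 string -> unsigned 32-bit integer."""
    return int(ipaddress.IPv4Address(text))


def int_to_ip(value):
    """Unsigned 32-bit integer -> dotted-quad IPv4 string."""
    return str(ipaddress.IPv4Address(value))


n = ip_to_int("192.168.1.1")
print(n)             # 3232235777
print(int_to_ip(n))  # 192.168.1.1
```

The integer form needs 4 bytes instead of up to 15 characters, and comparing two addresses becomes a single integer comparison.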

1.1. Choosing identifiers
Choosing an appropriate identifier is very important. Consider not only the storage type but also how MySQL computes and compares it. Once a data type is chosen, make sure all related tables use the same type.
(1) Integers: usually the best choice for identifiers, because they are fast to process and can be AUTO_INCREMENT.

(2) Strings: avoid strings as identifiers where possible; they consume more space and are slower to process. Moreover, strings are generally random, so their positions in the index are random too, which causes page splits, random disk access, and clustered index splits (for storage engines that use clustered indexes).

2. Getting started with indexes
For any DBMS, indexes are the most important factor in optimization. With a small amount of data, the lack of a proper index has little impact, but as the data grows, performance drops sharply.
If an index covers multiple columns (a composite index), the column order is very important, because MySQL can search efficiently only on a leftmost prefix of the index. For example:
Suppose there is a composite index it1c1c2 (c1, c2). The query SELECT * FROM t1 WHERE c1 = 1 AND c2 = 2 can use the index, and so can SELECT * FROM t1 WHERE c1 = 1. However, SELECT * FROM t1 WHERE c2 = 2 cannot use the index, because it does not include the leading column of the composite index: to search on c2 with this index, c1 must also be constrained to some value.

2.1. Index types
Indexes are implemented in the storage engine, not in the server layer. Therefore, indexes are not necessarily identical across storage engines, and not every storage engine supports every index type.
2.1.1. B-tree index
Suppose we have the following table:

-- Column lengths of 50 are assumed values.
CREATE TABLE People (
    last_name VARCHAR(50) NOT NULL,
    first_name VARCHAR(50) NOT NULL,
    dob DATE NOT NULL,
    gender ENUM('m', 'f') NOT NULL,
    KEY (last_name, first_name, dob)
);

Its index contains the last_name, first_name, and dob columns for each row in the table. The structure is broadly as follows:

[Figure: structure of the B-tree index on (last_name, first_name, dob)]

The values stored in the index are arranged in the order of the indexed columns. A B-tree index can be used for full key value, key range, and key prefix queries. Of course, to use the index you must query by a leftmost prefix of the index.
(1) Match the full value: specify a value for every column in the index. For example, the index can help you find Cuba Allen, born on 1960-01-01.
(2) Match a leftmost prefix: you can use the index to find everyone with the last name Allen, which uses only the first column of the index.
(3) Match a column prefix: for example, you can use the index to find people whose last name starts with J, which uses only the first column of the index.
(4) Match a range of values: you can use the index to find people whose last name is between Allen and Barrymore, which uses only the first column of the index.
(5) Match one part exactly and a range on another part: you can use the index to find people whose last name is Allen and whose first name starts with the letter K.
(6) Index-only queries: if all the queried columns are in the index, the rows themselves do not need to be read.
Because the nodes of a B-tree are stored in order, the index can be used for lookups (finding values) and also for ORDER BY on the query results. A B-tree index nevertheless has the following limitations:
(1) The lookup must start at the leftmost column of the index. This has been mentioned several times already. For example, you cannot use the index to find people born on a given day.
(2) You cannot skip a column in the index. For example, you cannot use the index to find people whose last name is Smith and who were born on a given day; in that case MySQL can use only the first column of the index.
(3) The storage engine cannot use columns to the right of the first range condition. For example, if the query is WHERE last_name = 'Smith' AND first_name LIKE 'j%' AND dob = '1976-12-23', it uses only the first two columns of the index, because LIKE is a range condition.
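
The three limitations can be condensed into one small rule. The Python sketch below is a simplified model of that rule, not MySQL's optimizer; the function name usable_columns and the "eq"/"range" labels are hypothetical.

```python
def usable_columns(index_columns, predicates):
    """Return the index columns a lookup can use, left to right.

    predicates maps a column name to "eq" (equality) or "range"
    (for example <, >, BETWEEN, or LIKE 'j%').
    """
    used = []
    for column in index_columns:
        kind = predicates.get(column)
        if kind is None:
            break              # a skipped column ends the usable prefix
        used.append(column)
        if kind == "range":
            break              # columns right of a range are not used
    return used


index_columns = ["last_name", "first_name", "dob"]

# Limitation (1): no condition on the leftmost column.
print(usable_columns(index_columns, {"dob": "eq"}))
# Limitation (2): first_name is skipped.
print(usable_columns(index_columns, {"last_name": "eq", "dob": "eq"}))
# Limitation (3): LIKE 'j%' on first_name is a range condition.
print(usable_columns(index_columns, {"last_name": "eq", "first_name": "range", "dob": "eq"}))
```

The three calls correspond to the three numbered limitations, in order.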

2.1.2. Hash index
In MySQL, only the MEMORY storage engine explicitly supports hash indexes. Hash is the default index type for MEMORY tables, although MEMORY tables can also use B-tree indexes. The MEMORY engine supports non-unique hash indexes, which is unusual in the database world: if several values have the same hash code, the index stores their row pointers in the same hash table entry as a linked list.
Suppose you create the following table:
-- Column lengths of 50 are assumed values.
CREATE TABLE testhash (
    fname VARCHAR(50) NOT NULL,
    lname VARCHAR(50) NOT NULL,
    KEY USING HASH (fname)
) ENGINE=MEMORY;
It contains the following data:

[Table: sample rows whose fname values are Arjen, Baron, Peter, and Vadim]

Suppose the index uses a hash function f(), as follows:

f('Arjen') = 2323

f('Baron') = 7437

f('Peter') = 8784

f('Vadim') = 2458

At this point, the structure of the index is roughly as follows:

[Figure: hash index structure, slots mapped to row pointers]

The slots are ordered, but the records are not. When you execute
mysql> SELECT lname FROM testhash WHERE fname = 'Peter';
MySQL computes the hash of 'Peter' and uses it to look up the row pointer in the index. Because f('Peter') = 8784, MySQL finds 8784 in the index and gets a pointer to record 3.
Because the index stores only short hash values, it is very compact. The size of a hash value does not depend on the column's data type: an index on a TINYINT column is as large as an index on a long string column.

The hash index has the following limitations:
(1) Because the index contains only hash codes and row pointers, MySQL cannot use the index to avoid reading the rows. However, access to in-memory rows is very fast, so this does not hurt performance much.
(2) A hash index cannot be used for sorting.
(3) A hash index does not support partial key matching, because the hash value is computed from the entire indexed value.
(4) A hash index supports only equality comparisons, such as =, IN(), and <=>. It cannot speed up a query such as WHERE price > 100.
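
The lookup path and the limitations above can be modeled with a Python dictionary. This is a minimal sketch: the hash codes are the ones quoted in the text, f() is a stand-in for the real hash function, and the row numbers 1 to 4 are assumed to follow the order in which the names are listed.

```python
# Hash codes quoted in the text; f() itself is hypothetical.
HASHES = {"Arjen": 2323, "Baron": 7437, "Peter": 8784, "Vadim": 2458}


def f(fname):
    return HASHES[fname]


# Row numbers are assumed to follow the order in which the names are listed.
table = {1: "Arjen", 2: "Baron", 3: "Peter", 4: "Vadim"}

# slot (hash code) -> list of row pointers; the list models the chain
# that holds rows whose values share one hash code.
hash_index = {}
for row_number, fname in table.items():
    hash_index.setdefault(f(fname), []).append(row_number)


def lookup(fname):
    """Equality lookup: hash the value, then re-check the stored rows."""
    candidates = hash_index.get(f(fname), [])
    return [r for r in candidates if table[r] == fname]


print(sorted(hash_index))  # slots in order; the rows they point to are not
print(lookup("Peter"))
```

The index holds only hash codes and row numbers, so the row must always be read to obtain lname, and the ordering of the slots says nothing about the ordering of the names. That is why sorting, prefix matching, and range conditions get no help from it.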
2.1.3. Spatial (R-tree) index
MyISAM supports spatial indexes, which are used mainly for geospatial data types such as GEOMETRY. InnoDB also supports them as of MySQL 5.7.
2.1.4. Full-text index
The full-text index is a special index type of MyISAM, used mainly for full-text search. InnoDB also supports it as of MySQL 5.6.

3. High-performance indexing strategies
3.1. Clustered indexes
A clustered index guarantees that tuples with similar key values are stored physically close together (so string columns, especially random strings, are a poor choice for a clustered index, because they force the system to move a lot of data around), and a table can have only one clustered index. Because indexes are implemented by the storage engine, not all engines support clustered indexes. Currently, only solidDB and InnoDB do.
The structure of a clustered index is roughly as follows:

[Figure: clustered index structure]

Note: the leaf pages contain the full tuples, while the inner node pages contain only the indexed columns (integers in this example). Some DBMSs let the user choose which index is clustered, but no MySQL storage engine supports this so far. InnoDB builds the clustered index on the primary key. If no primary key is specified, InnoDB uses a unique index on non-null columns instead. If no such index exists, InnoDB defines a hidden primary key and builds the clustered index on it. In general, the DBMS stores the actual data in the clustered index, which is the basis for the other, secondary indexes.

3.1.1. Comparison of the InnoDB and MyISAM data layouts
To better understand clustered and nonclustered indexes, or primary and secondary indexes (MyISAM does not support clustered indexes), compare the data layouts of InnoDB and MyISAM for the following table:

CREATE TABLE layout_test (
    col1 INT NOT NULL,
    col2 INT NOT NULL,
    PRIMARY KEY (col1),
    KEY (col2)
);

Assume the primary key values range from 1 to 10,000, are inserted in random order, and the table is then optimized with OPTIMIZE TABLE. col2 is assigned a random value between 1 and 100, so there are many duplicate values.
(1) MyISAM data layout
The layout is simple: MyISAM stores rows on disk in the order in which they were inserted, as follows:

[Figure: MyISAM data layout for layout_test]

Note: the row number is shown on the left, starting at 0. Because the tuples are fixed-size, MyISAM can find any row by seeking the required number of bytes from the beginning of the table.
The index structure built on the primary key is roughly as follows:

[Figure: MyISAM primary key index]

Note: MyISAM does not support clustered indexes; each leaf node of the index contains only a row number, and the leaf nodes are stored in col1 order.
Now look at the index structure of col2:

[Figure: MyISAM index on col2]

In fact, in MyISAM there is no structural difference between the primary key and other indexes. The primary key is simply a unique, non-null index named PRIMARY.

(2) InnoDB data layout
InnoDB stores data in the form of a clustered index, so its data layout is very different. It stores the table roughly as follows:

[Figure: InnoDB clustered index layout for layout_test]

Note: each leaf node of the clustered index contains the primary key value, the transaction ID and the rollback pointer (both used for transactions and MVCC), and the remaining columns (such as col2).

Also in contrast to MyISAM, InnoDB's secondary indexes are very different from its clustered index. The leaves of an InnoDB secondary index contain the primary key value rather than a row pointer. This reduces the cost of maintaining secondary indexes when rows move or data pages split, because InnoDB does not need to update row pointers in the index. The structure is roughly as follows:

[Figure: InnoDB secondary index on col2]
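
The difference between the two layouts can be sketched with plain Python containers. This is a simplified model under stated assumptions: the (col1, col2) sample rows are hypothetical, col2 values are kept unique so that each secondary index is a simple mapping, and dictionaries stand in for the B-trees.

```python
# Hypothetical (col1, col2) rows, in insertion order.
inserted = [(99, 8), (12, 56), (3000, 62), (4700, 13)]

# MyISAM model: rows stay in insertion order; both indexes store row numbers.
myisam_rows = list(inserted)
myisam_pk = {col1: n for n, (col1, _) in enumerate(myisam_rows)}
myisam_col2 = {col2: n for n, (_, col2) in enumerate(myisam_rows)}

# InnoDB model: the clustered index holds the whole row under its primary
# key; the secondary index stores the primary key, not a row pointer.
innodb_clustered = {col1: (col1, col2) for col1, col2 in sorted(inserted)}
innodb_col2 = {col2: col1 for col1, col2 in inserted}


def myisam_find_by_col2(value):
    # One index lookup yields a row number, then the row is read directly.
    return myisam_rows[myisam_col2[value]]


def innodb_find_by_col2(value):
    primary_key = innodb_col2[value]       # secondary index lookup
    return innodb_clustered[primary_key]   # clustered index lookup


print(myisam_find_by_col2(62))
print(innodb_find_by_col2(62))
```

In the InnoDB model a secondary lookup costs a second search in the clustered index, but the secondary entry stays valid wherever the row is physically stored. In the MyISAM model the stored row number would have to change if the row moved.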
