MySQL Database Index Optimization and Practice (I)


Preface

MySQL is one of the most widely used database systems. Working with databases is part of the daily work of every Java programmer, and index optimization is an essential skill.

Why learn about indexes? Two real cases:

Case 1: At university, after studying web crawlers for a while, I crawled on the order of a million user answer records and stored them in MySQL. At the time I knew nothing about indexes. A simple query, "find all answers by user name", took about half a minute, which is unusable in practice.

Case 2: Recently, slow-SQL risk warnings have been appearing frequently for our online application databases, yet I have learned little about database optimization since I started working. For example, a user data page has to run many database queries and is slow. Raising the timeout makes the page barely accessible, but the performance still needs to be optimized.

Advantages of Indexes

Appropriate indexes can greatly reduce the amount of data the MySQL server has to scan, avoid in-memory sorting and temporary tables, and improve the query performance of applications.

Index types

MySQL has several index types: primary key, unique, and normal (non-unique). All of them use a B-tree as the underlying data structure. Some storage engines also provide hash indexes and full-text indexes.

The B-tree is the index structure you will most often deal with when optimizing, and the rest of this article is based on B-tree indexes.

B-tree

The simplest, brute-force way to query data is to traverse all records. If the values are unique, they can instead be organized into a binary search tree and looked up with binary search, which greatly improves query performance. A B-tree is a more powerful search tree: each node has many branches, so the tree is shallower and inserts, deletes, and updates are faster.

The index files of modern databases and the file blocks of file systems are organized as B-trees.

Each B-tree node contains keys, data, and pointers to child nodes.

A B-tree has a degree d >= 1. In a B-tree of degree d, each internal node other than the root holds n keys, with d <= n <= 2d, and n + 1 child pointers. If the tree stores N keys and each internal node has at least b children (here b = d + 1), the height of the tree is at most about h = log_b((N + 1) / 2).

In indexes and file systems, the size of a B-tree node is chosen to be close to one memory page (typically matching the disk block size), so the degree of the tree is very large. The number of disk I/O operations for a lookup is then roughly the height h of the tree. With b = 100, a tree holding 1 million keys has only about three levels, so a lookup needs only about three disk I/Os, which is very fast.
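The height estimate can be checked with a few lines of Python. This is only a rough sketch: the function names are made up for illustration, and rounding the bound up to a whole number of levels is an assumption that follows the "about three levels" reading in the text.

```python
import math


def height_bound(num_keys: int, min_children: int) -> float:
    """Upper bound log_b((N + 1) / 2) on the height of a B-tree."""
    return math.log((num_keys + 1) / 2, min_children)


def levels_needed(num_keys: int, min_children: int) -> int:
    """Round the bound up to a whole number of levels (roughly disk reads)."""
    return math.ceil(height_bound(num_keys, min_children))


if __name__ == "__main__":
    # Compare a binary-like tree (b = 2) with a wide B-tree (b = 100).
    for b in (2, 100):
        print(f"b={b}: about {levels_needed(1_000_000, b)} levels")
```

For 1 million keys this gives about 3 levels with b = 100 and about 19 with b = 2, which is why wide nodes matter so much for disk-based indexes.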

Index Query

After an index is created, well-formed queries can take full advantage of it.

In addition, because the query optimizer parses the client's SQL, it can reorder the conditions of a query to match the most suitable index.

-- Create table statement
CREATE TABLE people (
    last_name  VARCHAR(20) NOT NULL,
    first_name VARCHAR(20) NOT NULL,
    gender     CHAR(1)     NOT NULL,
    birth      DATE        NOT NULL,
    KEY last_first_name_gender_key (last_name, first_name, gender)
);
1. Full value matching

The WHERE clause of the query matches every column in the index.

SELECT * FROM people WHERE last_name = 'zhang' AND first_name = 'yin' AND gender = 'm';
2. Matching the leftmost prefix

The query conditions match the leftmost columns of the index. Note the keyword "leftmost prefix".

-- Only the last_name part of the index can be used
SELECT * FROM people WHERE last_name = 'zhang' AND gender = 'M';
-- The index cannot be used
SELECT * FROM people WHERE first_name = 'zhang' AND gender = 'M';
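Why only a leftmost prefix works becomes clearer if the composite index is modeled as a sorted list of tuples. The following Python sketch is a simplified model, not how MySQL is implemented; the sample rows and function names are hypothetical.

```python
from bisect import bisect_left

# Hypothetical rows (last_name, first_name, gender); sorting the tuples
# mimics the left-to-right ordering of the composite index.
INDEX = sorted([
    ("zhang", "yin", "m"),
    ("zhang", "wei", "f"),
    ("wang", "fang", "f"),
    ("li", "yin", "m"),
    ("zhang", "yin", "f"),
])


def prefix_scan(index, prefix):
    """Seek to the first entry >= prefix, then read until the prefix changes."""
    matches = []
    for i in range(bisect_left(index, prefix), len(index)):
        if index[i][:len(prefix)] != prefix:
            break
        matches.append(index[i])
    return matches


def full_scan(index, first_name):
    """first_name alone is not a leftmost prefix: every entry is checked."""
    return [entry for entry in index if entry[1] == first_name]


if __name__ == "__main__":
    print(prefix_scan(INDEX, ("zhang",)))        # leftmost column only
    print(prefix_scan(INDEX, ("zhang", "yin")))  # two leftmost columns
    print(full_scan(INDEX, "yin"))               # no usable prefix
```

Entries that share a leftmost prefix sit next to each other in the sorted order, so one binary search plus a short forward read finds them. Entries with the same first_name are scattered across the list, so nothing short of a full pass can collect them.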
3. Column prefix matching

A LIKE condition can also use the index in some cases. For example, last_name LIKE 'zh%' can use the index, while last_name LIKE '%ing' cannot.

-- The index can be used, because B-tree nodes compare key values starting from the left
SELECT * FROM people WHERE last_name LIKE 'zhang%' AND gender = 'M';
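The difference between a leading and a trailing wildcard can be illustrated on a sorted list of keys. This is a minimal sketch with made-up sample names; it models the sort order of the index, not MySQL's actual execution.

```python
from bisect import bisect_left

# Hypothetical last_name values, kept sorted like index keys.
KEYS = sorted(["chen", "li", "wang", "zhang", "zhao", "zheng", "zhou"])


def like_prefix(keys, prefix):
    """LIKE 'prefix%': binary-search the start, stop at the first non-match."""
    result = []
    for i in range(bisect_left(keys, prefix), len(keys)):
        if not keys[i].startswith(prefix):
            break
        result.append(keys[i])
    return result


def like_suffix(keys, suffix):
    """LIKE '%suffix': the sort order does not help, so every key is examined."""
    return [key for key in keys if key.endswith(suffix)]


if __name__ == "__main__":
    print(like_prefix(KEYS, "zh"))
    print(like_suffix(KEYS, "ng"))
```

Keys that start with the same characters form one contiguous run in sorted order, while keys that merely end with the same characters can appear anywhere.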
4. Range query

Index columns also support range queries.

SELECT * FROM people WHERE last_name > 'wang' AND last_name < 'zhang';
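A range condition on an indexed column amounts to two binary searches over the sorted keys. The sketch below assumes a hypothetical list of last names and a helper named open_range; it is only a model of the idea.

```python
from bisect import bisect_left, bisect_right

# Hypothetical last_name values, kept sorted like index keys.
KEYS = sorted(["chen", "li", "wang", "xu", "yang", "zhang", "zhao"])


def open_range(keys, low, high):
    """Keys with low < key < high, located with two binary searches."""
    start = bisect_right(keys, low)  # first key strictly greater than low
    stop = bisect_left(keys, high)   # first key not less than high
    return keys[start:stop]


if __name__ == "__main__":
    print(open_range(KEYS, "wang", "zhang"))
    print(open_range(KEYS, "zhang", "wang"))
```

Note that the lower bound has to sort before the upper bound. With the bounds the other way round ('zhang' as the lower bound and 'wang' as the upper bound), the range is empty and no rows match.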
5. Sorting

In specific situations the ORDER BY clause can also use the index to sort, which improves performance.

EXPLAIN SELECT * FROM people WHERE last_name = 'zhang' ORDER BY first_name ASC;
6. Restrictions

1. The queried column must not be part of an expression; otherwise the index cannot be used.

-- The table does not contain an age column; it is used here for illustration only.
-- Assume age is part of the index. This query cannot use the index on age:
SELECT * FROM people WHERE last_name = 'zhang' AND age + 3 > 28;
-- This query can use the index:
SELECT * FROM people WHERE last_name = 'zhang' AND age > 25;

2. If the query does not start from the leftmost column of the index, the index cannot be used. For example, the index cannot be used for queries on first_name alone, on gender alone, or on both together.

-- The match does not start from last_name, so the index cannot be used
SELECT * FROM people WHERE first_name = 'zhang' AND gender = 'M';

3. Columns in the index cannot be skipped.

-- first_name is skipped, so only the last_name column uses the index
SELECT * FROM people WHERE last_name = 'zhang' AND gender = 'M';

4. If a column in the query has a range condition (LIKE, BETWEEN, >, <, etc.), the index cannot be used for any column to its right.

-- Because first_name uses a LIKE range condition, the gender column cannot use the index
SELECT * FROM people WHERE last_name = 'zhang' AND first_name LIKE 'yi%' AND gender = 'M';
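The last restriction can be seen in a small model of the composite index. In the Python sketch below (hypothetical sample rows and function name), the index narrows the search to a range on last_name and a first_name prefix, but inside that range the gender values are no longer in order, so they can only be filtered entry by entry.

```python
from bisect import bisect_left

# Hypothetical index entries (last_name, first_name, gender) in sorted order.
INDEX = sorted([
    ("zhang", "yan", "f"),
    ("zhang", "yin", "m"),
    ("zhang", "yu", "f"),
    ("zhang", "wei", "m"),
    ("li", "yin", "m"),
])


def range_then_filter(index, last_name, first_prefix, gender):
    """Use the index for last_name = ... AND first_name LIKE 'prefix%',
    then filter gender over every entry in that range."""
    examined = []
    for i in range(bisect_left(index, (last_name, first_prefix)), len(index)):
        entry_last, entry_first, _ = index[i]
        if entry_last != last_name or not entry_first.startswith(first_prefix):
            break
        examined.append(index[i])
    matched = [entry for entry in examined if entry[2] == gender]
    return examined, matched


if __name__ == "__main__":
    examined, matched = range_then_filter(INDEX, "zhang", "y", "m")
    print("examined:", examined)
    print("matched: ", matched)
```

In this sample the range contains three entries whose gender values alternate, so the sort order cannot be used to jump directly to the matching one.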
Efficient index strategies

We have discussed various queries that can use indexes. The following describes how to create efficient indexes.

1. Create multi-column indexes

Create one index over several columns instead of a separate index on each column. For a given table in a query, the MySQL server generally chooses a single index (or none) after analyzing the query. So if a separate index exists on each column, a query that filters on several columns will typically still use the index of only one of them.

Therefore, if your most common query filters on last_name, first_name, and gender, you should create one index containing all three columns.

ALTER TABLE people ADD INDEX idx_name_gender (last_name, first_name, gender);
2. Order of index Columns

In a multi-column B-tree index, entries are sorted from left to right, starting with the leftmost column. A rule of thumb is to place the most selective columns leftmost in the index, so that the index reaches the target rows with the fewest comparisons.

Index column selectivity: the ratio of distinct index values to the total number of rows in the table, 0 < T <= 1. The selectivity of a unique index column is 1. The higher the selectivity, the more efficient the query, because non-matching rows are filtered out earlier.

Suppose you want to create an index for the last_name, first_name, and gender columns.

T(last_name): SELECT COUNT(DISTINCT last_name) / COUNT(*) FROM people;

T(first_name): SELECT COUNT(DISTINCT first_name) / COUNT(*) FROM people;

T(gender): SELECT COUNT(DISTINCT gender) / COUNT(*) FROM people;

Clearly, last_name and first_name should come before gender in the index (verify this against your actual data).
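The selectivity formula can be tried out on a handful of rows. The Python sketch below uses invented sample data and helper names; real decisions should be based on the actual table and on the queries that will use the index.

```python
# Hypothetical sample rows: (last_name, first_name, gender).
ROWS = [
    ("zhang", "yin", "m"),
    ("zhang", "wei", "f"),
    ("wang", "fang", "f"),
    ("wang", "lei", "m"),
    ("li", "na", "f"),
    ("li", "jun", "m"),
    ("chen", "jie", "m"),
    ("chen", "min", "f"),
]
COLUMNS = ("last_name", "first_name", "gender")


def selectivity(rows, position):
    """Distinct values divided by total rows for one column."""
    values = [row[position] for row in rows]
    return len(set(values)) / len(values)


def columns_by_selectivity(rows, columns):
    """Return the columns ordered from most to least selective, with scores."""
    scores = {name: selectivity(rows, i) for i, name in enumerate(columns)}
    ordered = sorted(columns, key=lambda name: scores[name], reverse=True)
    return ordered, scores


if __name__ == "__main__":
    ordered, scores = columns_by_selectivity(ROWS, COLUMNS)
    for name in ordered:
        print(f"{name}: {scores[name]:.2f}")
```

In this sample gender has the lowest selectivity (two distinct values in eight rows), which matches the advice to place it last.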

 

End

We have covered common index strategies and query techniques, but how do you find and fix SQL performance problems in an existing database in a real project? The next article will introduce MySQL's EXPLAIN keyword and summarize techniques for analyzing common slow SQL.


 
