MySQL database index optimization and practice (i)

Source: Internet
Author: User

Preface

MySQL is one of the most widely used database systems today. Working with databases is part of the daily work of nearly every Java programmer, and index optimization is an essential skill.

Why understand indexes: real cases

Case one: At university I spent some time learning to write web crawlers and crawled about 3 million user answers, which I stored in MySQL. I did not understand indexes at the time, and a simple SQL query to fetch all answers by user name took about half a minute to run, which made it unusable in practice.

Case two: Recently the database behind one of our online applications has frequently triggered slow-SQL risk alerts, and since starting work I have learned very little about database optimization. For example, one user data page issues many database queries and is slow. Raising the timeout makes the page barely accessible, but the performance still needs to be optimized.

Benefits of indexing

An appropriate index can greatly reduce the amount of data the MySQL server scans, avoid in-memory sorting and temporary tables, and improve the application's query performance.

Types of index

MySQL has several kinds of index (primary key, unique, ordinary), but they all share the same underlying data structure, the B-tree. Some storage engines also provide hash indexes and full-text indexes.

The B-tree is the index structure most often encountered in optimization work, so the rest of this article is based on it.

B-tree

The most brute-force way to query data is to traverse all records. If the data contains no duplicates, it can be organized into a sorted binary tree and queried with binary search, which greatly improves query performance. A B-tree is a more powerful sorted tree: each node has multiple branches, so the tree is shorter and insertion, deletion, and update are faster.

The index files of modern databases and the file blocks of file systems are organized as B-trees.

Each B-tree node contains keys, data, and pointers to its child nodes.

A B-tree has a degree d >= 1. For a B-tree of degree d, each internal node holds n keys, with d+1 <= n < 2d+1, and n+1 child node pointers. For a tree with branching factor b that indexes N keys in total, the maximum height is h = log_b((N+1)/2).

In indexes and file systems, a B-tree node is usually sized close to one memory page (which also matches the disk block size), so the degree of the tree is very large. The number of disk I/Os for a lookup is therefore about equal to the height of the tree h. Suppose b = 100: a tree with 1 million entries has a height of only about 3. In other words, a record can be found in about 3 disk I/Os, which is very fast.
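
The arithmetic above can be checked with a short Python sketch. It uses the simplified estimate that a tree with branching factor b can hold about b^h entries after h levels, not the exact bound quoted earlier, and the function name levels_needed is made up for this illustration.

```python
def levels_needed(branching: int, entries: int) -> int:
    """Smallest h such that branching ** h >= entries (a rough height estimate)."""
    height, capacity = 0, 1
    while capacity < entries:
        capacity *= branching
        height += 1
    return height


# b = 100 as in the text, compared with a binary tree (b = 2).
print(levels_needed(100, 1_000_000))  # 3
print(levels_needed(2, 1_000_000))    # 20
```

Under this estimate a binary tree over the same data would be about 20 levels deep, which is why a wide node pays off when each level costs one disk read.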

Index Query

Once an index exists, well-written query statements get the most benefit from it.

In addition, when the query optimizer parses the client's SQL statement, it reorders the conditions of the query to match the most appropriate index, so the order in which the conditions are written does not matter.

-- Table creation statement
-- The VARCHAR lengths were lost from the original text; 50 is an assumed placeholder.
CREATE TABLE people (
    last_name  VARCHAR(50) NOT NULL,
    first_name VARCHAR(50) NOT NULL,
    gender     CHAR(1)     NOT NULL,
    birth      DATE        NOT NULL,
    KEY last_first_name_gender_key (last_name, first_name, gender)
);
I. Full-value match

The conditions in the WHERE clause match every column in the index.

SELECT * FROM people WHERE last_name = 'Zhang' AND first_name = 'Yin' AND gender = 'm';
II. Leftmost prefix match

The query criteria can match the leftmost columns of the index. Note the keyword "leftmost prefix".

-- Can use part of the index (the last_name column only)
SELECT * FROM people WHERE last_name = 'Zhang' AND gender = 'm';
-- Cannot use the index
SELECT * FROM people WHERE first_name = 'Zhang' AND gender = 'm';
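
The reason behind the leftmost prefix rule can be sketched in Python. This is a simplified model, not MySQL's implementation: the index is represented as a sorted list of tuples, and the sample rows and function names are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Hypothetical sample rows, kept sorted the way a (last_name, first_name, gender)
# index would order them: by last_name, then first_name, then gender.
INDEX = sorted([
    ("Li", "Na", "f"),
    ("Wang", "Fang", "f"),
    ("Zhang", "Wei", "m"),
    ("Zhang", "Yin", "m"),
    ("Zhao", "Lei", "m"),
])
LAST_NAMES = [row[0] for row in INDEX]  # leftmost column, in sorted order


def find_by_last_name(last_name):
    """Binary search works because last_name is the leading sort key."""
    lo = bisect_left(LAST_NAMES, last_name)
    hi = bisect_right(LAST_NAMES, last_name)
    return INDEX[lo:hi]


def find_by_first_name(first_name):
    """first_name is not sorted globally, so every entry must be checked."""
    return [row for row in INDEX if row[1] == first_name]


print(find_by_last_name("Zhang"))  # the two Zhang rows, located by binary search
print(find_by_first_name("Yin"))   # found only by checking every entry
print([row[1] for row in INDEX])   # first names alone are not in sorted order
```

Because the entries are ordered by last_name first, values of first_name are only sorted within each last_name group, which is why a condition on first_name alone cannot seek into the index.
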
III. Column prefix match

A LIKE condition can use the index in some cases. For example, last_name LIKE 'zh%' can use the index, but last_name LIKE '%ing' cannot.

-- Can use the index, because B-tree nodes compare key values starting from the leftmost character
SELECT * FROM people WHERE last_name LIKE 'zhang%' AND gender = 'm';
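
As a rough Python illustration of why a trailing wildcard works and a leading one does not, the sketch below models one indexed column as a sorted list. The sample values and function names are hypothetical, and this is not how MySQL is implemented internally.

```python
from bisect import bisect_left

# Hypothetical last_name values in index (sorted) order.
LAST_NAMES = sorted(["chen", "li", "wang", "zhang", "zhao", "zheng", "zhou"])


def like_prefix(prefix):
    """LIKE 'prefix%': seek to the first candidate, then read until the prefix stops matching."""
    matches = []
    i = bisect_left(LAST_NAMES, prefix)
    while i < len(LAST_NAMES) and LAST_NAMES[i].startswith(prefix):
        matches.append(LAST_NAMES[i])
        i += 1
    return matches


def like_suffix(suffix):
    """LIKE '%suffix': matching values are scattered, so all values are examined."""
    return [name for name in LAST_NAMES if name.endswith(suffix)]


print(like_prefix("zh"))  # ['zhang', 'zhao', 'zheng', 'zhou']
print(like_suffix("ng"))  # ['wang', 'zhang', 'zheng']
```

Values sharing a prefix sit next to each other in sorted order, while values sharing a suffix do not.
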
IV. Range query

Indexed columns also support range queries.

SELECT * FROM people WHERE last_name > 'Wang' AND last_name < 'Zhang';
V. Sorting

In specific cases ORDER BY can also use the index order for sorting, which improves performance.

EXPLAIN SELECT * FROM people WHERE last_name = 'Zhang' ORDER BY first_name ASC;
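
The following Python sketch shows, on hypothetical data, why this query needs no separate sort step: within one last_name value, the index entries are already ordered by first_name. The sorted list stands in for the index and is an assumption of the illustration.

```python
from bisect import bisect_left, bisect_right

# Hypothetical index entries in (last_name, first_name, gender) order.
INDEX = sorted([
    ("Zhang", "Yin", "m"),
    ("Wang", "Fang", "f"),
    ("Zhang", "Ai", "f"),
    ("Zhang", "Wei", "m"),
    ("Li", "Na", "f"),
])
LAST_NAMES = [row[0] for row in INDEX]

lo = bisect_left(LAST_NAMES, "Zhang")
hi = bisect_right(LAST_NAMES, "Zhang")
zhangs = INDEX[lo:hi]  # rows with last_name = 'Zhang', read in index order

first_names = [row[1] for row in zhangs]
print(first_names)                         # ['Ai', 'Wei', 'Yin']
print(first_names == sorted(first_names))  # True: no extra sort step is needed
```
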
VI. Limitations

1. An indexed column must not be part of an expression in the query; otherwise the index cannot be used.

-- The table definition above has no age column; assume for illustration that age is part of the index.
-- This query cannot use the index on age
SELECT * FROM people WHERE last_name = 'Zhang' AND age + 3 > 28;
-- This query can use the index
SELECT * FROM people WHERE last_name = 'Zhang' AND age > 25;

2. The index cannot be used if the lookup does not start from the leftmost column of the index. For example, a query on first_name, on gender, or on both cannot use the index.

-- Does not start matching from last_name, so the index cannot be used
SELECT * FROM people WHERE first_name = 'Zhang' AND gender = 'm';

3. You cannot skip columns in the index.

-- first_name is skipped, so only the last_name column of the index is used
SELECT * FROM people WHERE last_name = 'Zhang' AND gender = 'm';

4. If a column in the query has a range condition (LIKE, BETWEEN, >, <, and so on), none of the columns to its right can use the index.

-- first_name uses a LIKE (range) condition, so the gender column cannot use the index
SELECT * FROM people WHERE last_name = 'Zhang' AND first_name LIKE 'Yi%' AND gender = 'm';
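
Limitation 4 can be illustrated with a small Python sketch, again modelling the index as a sorted list of hypothetical rows. The rows matched by the range on first_name are adjacent in the index, but their gender values are not grouped, so each row in the range still has to be checked.

```python
# Hypothetical index entries in (last_name, first_name, gender) order.
INDEX = sorted([
    ("Zhang", "Yan", "f"),
    ("Zhang", "Yi", "m"),
    ("Zhang", "Yin", "f"),
    ("Zhang", "Ying", "m"),
    ("Zhang", "Yu", "f"),
])

# last_name = 'Zhang' AND first_name LIKE 'Yi%'
in_range = [row for row in INDEX if row[0] == "Zhang" and row[1].startswith("Yi")]

# The matching rows are adjacent in the index, so the range itself is one seek.
print([INDEX.index(row) for row in in_range])  # [1, 2, 3]

# Inside that range the gender values alternate, so gender = 'm' cannot be
# located by a further seek and is applied as a filter on every row.
genders = [row[2] for row in in_range]
print(genders)  # ['m', 'f', 'm']
```
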
Efficient indexing strategies

The sections above covered the query scenarios that can use an index. The following describes how to build efficient indexes.

1. Build multi-column indexes

Create multi-column indexes rather than a separate index on each column. After analyzing a query, the MySQL server generally matches it to a single index (or to none) and uses that one. So if several columns each have their own index, a query that combines conditions on all of them will typically end up using the index on only one column.

So, if your most common query filters on last_name, first_name, and gender, you should create one index that contains all three columns.

ALTER TABLE people ADD INDEX idx_name_gender (last_name, first_name, gender);
2. Order of indexed columns

A multi-column B-tree index is sorted from left to right, starting with the leftmost column. A common rule of thumb is to place the most selective column leftmost in the index, which helps the index find the target rows with the fewest comparisons.

Index column selectivity T is the ratio of distinct index values to the total number of records in the table, 0 < T <= 1. The selectivity of a unique index column is 1. The higher the selectivity, the more efficient the query, because unmatched records are filtered out earlier.

Suppose you want to build an index on the three columns last_name, first_name, and gender.

T(last_name) = SELECT COUNT(DISTINCT last_name) / COUNT(*) FROM people;

T(first_name) = SELECT COUNT(DISTINCT first_name) / COUNT(*) FROM people;

T(gender) = SELECT COUNT(DISTINCT gender) / COUNT(*) FROM people;

Clearly, last_name and first_name should be placed at the front of the index (although the actual data should be the deciding factor).
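
The selectivity calculation can be tried out in Python on a handful of rows. The sample data and the selectivity helper are hypothetical and serve only to show the ratio; on a real table the SQL queries above would be run instead.

```python
# Hypothetical rows: (last_name, first_name, gender).
ROWS = [
    ("Zhang", "Wei", "m"),
    ("Zhang", "Yin", "m"),
    ("Wang", "Fang", "f"),
    ("Li", "Na", "f"),
    ("Li", "Wei", "m"),
    ("Zhao", "Lei", "m"),
]
COLUMNS = ("last_name", "first_name", "gender")


def selectivity(rows, column_index):
    """Distinct values divided by total rows, as in COUNT(DISTINCT col) / COUNT(*)."""
    distinct = {row[column_index] for row in rows}
    return len(distinct) / len(rows)


for i, name in enumerate(COLUMNS):
    print(f"T({name}) = {selectivity(ROWS, i):.2f}")
```

On this sample the two name columns score 0.67 and 0.83 while gender scores 0.33, which matches the intuition that gender belongs at the end of the index.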

Conclusion

This article covered common indexing strategies and query techniques. But how do you apply them, and how do you troubleshoot SQL performance problems in an existing database on a real project? The next article will cover MySQL's EXPLAIN keyword and summarize common techniques for analyzing slow SQL.
