MySQL Performance Optimization Tutorial

Source: Internet
Author: User
Understanding why indexes improve efficiency

Index data is stored in sorted order.

Because it is sorted, a lookup through an index does not need to traverse every index record.

In the extreme case, the efficiency of an index lookup is that of a binary search, close to log2(N) comparisons.
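To get a feel for the log2(N) figure, the following minimal Python sketch counts the probes a binary search makes over a sorted list. The function name binary_search_steps and the sample size are illustrative assumptions, not part of the original material.

```python
import math


def binary_search_steps(sorted_keys, target):
    """Return (found, steps): whether target is present and how many probes were made."""
    lo, hi = 0, len(sorted_keys) - 1
    steps = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        steps += 1
        if sorted_keys[mid] == target:
            return True, steps
        if sorted_keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return False, steps


keys = list(range(100_000))                 # an already sorted "index"
found, steps = binary_search_steps(keys, 99_999)
# Worst-case number of probes for a binary search over len(keys) items.
upper_bound = math.floor(math.log2(len(keys))) + 1
```

For 100,000 keys the loop never needs more than 17 probes, whereas a linear scan may need up to 100,000.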

How to understand the data index structure

By default, indexes are B-tree indexes (memory tables can also use hash indexes).

A single sorted sequence gives the most efficient lookup (binary search). The purpose of using a tree structure is to keep updates, inserts, and deletes fast.

In extreme cases (for example, very heavy query demand, very light update demand, loose real-time requirements, and a limited data size), a single sorted sequence can be used directly, and binary search over it is the fastest option.

Practical example: IP address lookup

Resource: a table that maps IP ranges to regions. The source data format is startip, endip, area.

The source data has about 100,000 entries, and the ranges are highly scattered.

Objective: look up the region of a given IP address.

Performance requirement: more than 1,000 queries per second.

Challenge: doing this with database operations (such as a range condition using ... and ...) cannot use indexes effectively.

Traversing 100,000 records for every query request is not feasible.

Method: sort once (only during data preparation; the data can be kept in memory as a sorted sequence).

Binary search (each request is served by one binary search over the sorted sequence).
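The sort-once, search-per-request method can be sketched in Python as follows. This is a minimal illustration, assuming IPv4 addresses and non-overlapping ranges; the class name IpRegionTable and the sample rows are hypothetical.

```python
import bisect
import ipaddress


def ip_to_int(ip):
    """Convert dotted IPv4 text to an integer so ranges can be compared numerically."""
    return int(ipaddress.IPv4Address(ip))


class IpRegionTable:
    def __init__(self, rows):
        # rows: iterable of (startip, endip, area) strings; sorted once at load time.
        self._ranges = sorted((ip_to_int(s), ip_to_int(e), area) for s, e, area in rows)
        self._starts = [r[0] for r in self._ranges]

    def lookup(self, ip):
        """Return the area whose range contains ip, or None if no range covers it."""
        n = ip_to_int(ip)
        # Binary search for the rightmost range whose start is <= n.
        i = bisect.bisect_right(self._starts, n) - 1
        if i < 0:
            return None
        start, end, area = self._ranges[i]
        return area if start <= n <= end else None


# Hypothetical sample data for illustration only.
table = IpRegionTable([
    ("10.0.0.0", "10.0.0.255", "Guangzhou"),
    ("192.168.1.0", "192.168.1.255", "Shenzhen"),
    ("172.16.0.0", "172.16.255.255", "Beijing"),
])
```

Calling table.lookup("10.0.0.7") returns "Guangzhou", and an address outside every range returns None. Each lookup costs one binary search over the start addresses, so no per-request traversal of the table is needed.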

When performing index analysis and SQL optimization, you can think of the indexed fields as a single ordered sequence and use that as the basis for analysis.

Practical example: composite index query optimization, a same-city opposite-sex user list

Resource: user table with the fields sex, area, lastlogin (last login time), and others

Objective: list users of the opposite sex in the same region, ordered by last login time, newest first.

In a high-traffic community this is a high-frequency query. How should it be optimized?

Query SQL: select * from user where area='$region' and sex='$sex' order by lastlogin desc limit 30;

Challenge: creating a composite index is not difficult, but how should the area + sex + lastlogin composite index be understood?

First, forget about the B-tree and regard the indexed fields as a sorted sequence.

What if only area is indexed? The lookup finds all entries that match area, traverses them to pick out the ones that match sex, and then sorts them. It traverses all area='$region' rows!

If area + sex is indexed, it is somewhat better, but the lookup still traverses all rows with area='$region' and sex='$sex' and then sorts them.

With an area + sex + lastlogin composite index (remember that lastlogin comes last), the index is sorted on the combined value of area + sex + lastlogin. The list can be pictured as follows.

Guangzhou Female $time1

Guangzhou Female $time2

Guangzhou Female $time3

...

Guangzhou Male

...

Shenzhen Female

...

The database easily locates the boundaries of the area + sex range and walks back 30 records from the bottom boundary. All results are hit directly in the index, with no second traversal.
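The "sorted sequence" view of the composite index can be sketched in Python. This is only a model of the idea, not how the database stores a B-tree: a sorted list of tuples stands in for the index, and the row values, the user_id column, and the function name latest_logins are hypothetical.

```python
import bisect

# Hypothetical rows: (area, sex, lastlogin, user_id). Sorting the tuples mimics
# an index ordered by area, then sex, then lastlogin.
rows = (
    [("Guangzhou", "female", 100 + i, f"gf{i}") for i in range(50)]
    + [("Guangzhou", "male", 200 + i, f"gm{i}") for i in range(50)]
    + [("Shenzhen", "female", 300 + i, f"sf{i}") for i in range(50)]
)
index = sorted(rows)


def latest_logins(index, area, sex, limit=30):
    """Return up to `limit` entries for (area, sex), newest lastlogin first."""
    # Lower boundary: first entry with this (area, sex) prefix.
    lo = bisect.bisect_left(index, (area, sex))
    # Upper boundary: first entry past the prefix. float("inf") sorts after
    # every numeric lastlogin value.
    hi = bisect.bisect_left(index, (area, sex, float("inf")))
    # Walk back from the upper boundary; only `limit` entries are touched.
    start = max(lo, hi - limit)
    return index[start:hi][::-1]


result = latest_logins(index, "Guangzhou", "female")
```

Two binary searches find the boundaries of the area + sex block, and the slice touches only the 30 newest entries, no matter how many rows the block holds.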

How to understand the affected result set

The affected result set is an important intermediate quantity in query optimization.

The relationship between the query conditions and the indexes determines the affected result set.

As the preceding example shows, even when a query uses an index, if the query and sort targets cannot be hit directly in the index, the affected result set may be large. This directly affects query efficiency.

Microsecond-level optimization

Query optimization should not rely on the slow query log alone. As a rule of thumb, any query that takes more than 0.01 seconds is not well enough optimized.

Practical example

Similar to the previous case, a game community needs to display user activity feeds: select * from userfeed where uid=$uid order by lastlogin desc limit 30; In the initial stage only uid was indexed, so the query hit all rows with uid=$uid in the index and then sorted them by lastlogin. When a user is very active, the index hit for this SQL affects hundreds or even thousands of records. The query then takes more than 0.01 seconds, and under high concurrency the database comes under heavy pressure.

Solution: change the index to a uid + lastlogin composite index. The index then hits exactly the 30 result rows, query efficiency improves about tenfold to an average of 0.001 seconds, and database pressure drops sharply.
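The difference between the two indexes can be made concrete by counting how many entries each approach touches. The following Python sketch models both indexes as sorted lists; the feed data and the function names are hypothetical, and the counts are a model of the affected result set, not measurements from a database.

```python
import bisect
import heapq

# Hypothetical feed rows (uid, lastlogin): one very active user plus a few others.
feed = [(7, t) for t in range(2000)] + [(uid, t) for uid in (1, 2, 3) for t in range(10)]


def query_with_uid_index(feed, uid, limit=30):
    """Index on uid only: every row for uid is touched, then sorted by lastlogin."""
    by_uid = sorted(feed, key=lambda r: r[0])        # stands in for the uid index
    uids = [r[0] for r in by_uid]
    lo, hi = bisect.bisect_left(uids, uid), bisect.bisect_right(uids, uid)
    touched = by_uid[lo:hi]
    result = heapq.nlargest(limit, touched, key=lambda r: r[1])
    return result, len(touched)


def query_with_composite_index(feed, uid, limit=30):
    """Index on (uid, lastlogin): only the newest `limit` entries are touched."""
    index = sorted(feed)
    lo = bisect.bisect_left(index, (uid,))
    hi = bisect.bisect_left(index, (uid + 1,))       # first entry of the next uid
    touched = index[max(lo, hi - limit):hi]
    return touched[::-1], len(touched)


rows_a, touched_a = query_with_uid_index(feed, 7)
rows_b, touched_b = query_with_composite_index(feed, 7)
```

Both functions return the same 30 rows, but the first touches all 2,000 entries for the active user while the second touches only 30.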

Common misunderstandings about the affected result set

The affected result set is not the number of rows a query returns or the number of rows an operation changes. It is the number of rows hit through the index by the query conditions.

Practical example

A game database uses innodb, which uses row-level locks and in theory rarely locks whole tables. One SQL statement (delete from tabname where xid=…) was not commonly used: it ran only under certain circumstances and at low frequency (about 10 times a day). The table held millions of rows, but xid was not indexed, and so the disaster happened. When the delete statement ran, it actually deleted very few records, perhaps one or two, perhaps none. But because xid was not indexed, the delete traversed every record in the table, the whole table was locked by the delete, and all select operations were blocked. Because traversing millions of records takes a long time, a large number of select statements piled up during that period, and the database crashed with too many connections.

This kind of request is infrequent and targets very few rows, yet it blocked queries across the whole database because an index was missing. It calls for great vigilance.
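The traversal itself is easy to observe on a small scale. The sketch below uses Python's built-in sqlite3 module as a stand-in, because it runs without a server; SQLite's locking and plan output differ from MySQL's, so it only illustrates the full scan versus index search distinction. The schema, the index name idx_xid, and the helper delete_plan are hypothetical.

```python
import sqlite3


def delete_plan(conn, xid):
    """Return SQLite's query plan text for the delete statement (nothing is deleted)."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN DELETE FROM tabname WHERE xid = ?", (xid,)
    ).fetchall()
    # The last column of each plan row is a human-readable detail string.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tabname (id INTEGER PRIMARY KEY, xid INTEGER, payload TEXT)")
conn.executemany(
    "INSERT INTO tabname (xid, payload) VALUES (?, ?)",
    [(i % 1000, "x") for i in range(10_000)],
)

plan_without_index = delete_plan(conn, 42)   # describes a scan of the whole table
conn.execute("CREATE INDEX idx_xid ON tabname (xid)")
plan_with_index = delete_plan(conn, 42)      # describes a search using idx_xid
```

Before the index exists the plan is a scan of the whole table even though only a handful of rows match. After the index is created the plan becomes a search on idx_xid. In MySQL the equivalent check would be done with explain on the statement.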

Summary:

The affected result set is the set of rows hit through the index by the search conditions, not the rows that are output or operated on.

The closer the affected result set is to the rows actually output or operated on, the more efficient the index.

Note that I will not discuss optimization of foreign keys and joins here, because in our system they are not allowed at all. The architecture optimization section explains why.

Understanding execution status: common analysis methods

Slow query log. Focus on the following:

Whether a lock was involved, and the lock time

If a lock was involved, the slow query is usually caused by locking. The query itself does not need to be optimized, but the locking problem must be solved.

Affected result set

If the affected result set is large, the index is clearly not being hit well, and this must be taken seriously.

Explain

Index usage

We do not recommend forcing an index with hints such as use index. If the expected index is not used, reconsider the table structure and index design.

Affected result set

The numbers shown here are not necessarily accurate. Recall the earlier discussion of index structure: treat the index as an ordered sequence and reason about the SQL from there.

Set profiling, show profiles, and show profile for query

Execution overhead

Note: if the problematic SQL statement has already been executed several times, its result may be in the cache, so take care to rule out cache effects. The execution overhead can be seen here.

We recommend analyzing any SQL statement that takes more than 0.005 seconds.

This gives a deeper understanding of how the database executes a statement and how the overhead is distributed.

Show processlist

Status List

Sleep status: usually means that a connection has not been released. With a connection pool, the number of sleeping connections should stay within a fixed range.

Practical example: while the front end is outputting data (especially to the user's terminal), the database connection is not closed in time, so slow network connections produce a large number of sleeping connections. When network speed becomes abnormal, the database fails with too many connections.

Simple explanation: querying and executing usually takes less than 0.01 seconds, while network output usually takes about 1 second or longer. The connection could have been released after 0.01 seconds, but because the front-end program outputs the result directly without first closing the connection, the connection stays in the sleep state until the result has reached the user's desktop.
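The fix is an ordering change in the front-end code: read everything, release the connection, and only then start the slow output. A minimal Python sketch of that ordering follows; it uses the built-in sqlite3 module as a stand-in for a MySQL driver, and the function names fetch_rows and render_slowly are hypothetical.

```python
import sqlite3
import time
from contextlib import closing


def fetch_rows(db_path, sql, params=()):
    """Open a connection, read all rows, and close it before returning."""
    with closing(sqlite3.connect(db_path)) as conn:
        rows = conn.execute(sql, params).fetchall()
    # The connection is closed at this point, before any slow output starts.
    return rows


def render_slowly(rows, write, delay=0.0):
    """Stand-in for network output to a slow client; holds no database connection."""
    for row in rows:
        time.sleep(delay)
        write(repr(row) + "\n")


rows = fetch_rows(":memory:", "SELECT 1, 'a'")
output = []
render_slowly(rows, output.append)
```

With this ordering the database connection is held only for the query itself, however long the client takes to receive the page.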

Waiting for net, reading from net, writing to net

Occasional appearances are harmless.

If these states appear in large numbers, quickly check the network connection status and traffic between the database and the front end.

Case: because of plug-ins, the intranet database was hit with a large volume of reads, and the Mbit/s switch used on the intranet quickly saturated. As a result, a large number of connections were stuck in waiting for net, and the database crashed with too many connections.

Locked status

UPDATE operation lock

Using innodb can effectively reduce locked states, but remember that update operations must use indexes correctly, and even infrequent update operations must not be neglected, as the affected result set example above shows.

In the myisam era, locked was a nightmare for many highly concurrent applications, which is why mysql officially began to recommend innodb.

Copy to tmp table

The indexes and existing structures cannot cover the query conditions, so a temporary table is created to satisfy the query, which generates heavy I/O pressure.

A badly written search statement can cause this. If it comes from a data analysis job or a periodic late-night cleanup task, an occasional occurrence is acceptable. If it appears frequently, it must be optimized.

Copy to tmp table is usually related to table join queries. We recommend that you avoid table join queries.

Practical example:

A community's database was blocked and they asked for help. Investigation showed that the server hosted several database applications and websites, and the database of one rarely used small website had produced a horrible copy to tmp table operation, which overloaded the disk I/O and CPU of the whole machine. Killing that operation restored everything.

Sending data

Do not be misled by the name: Sending data is not simply sending data. It covers reading and processing rows, which means fetching data from the physical disk. If the affected result set is large, the data has to be pulled from many different disk fragments.

Occasional connections in this status are harmless.

Back to the affected result set discussed above: if there are many connections in sending data, it is usually because the affected result set of some query is too large, that is, the query's indexes are not well enough optimized.

If a large number of similar SQL statements appear in the show processlist output, all in the sending data state, optimize the indexes for that query, and remember to reason in terms of the affected result set.

Freeing items

In theory this status should not appear often. Occasional appearances are harmless.

If it appears in large numbers, there is a memory or hard disk problem, for example the hard disk is full or damaged.

Sorting...

Similar to sending data: the result set is too large and the sort condition is not indexed, so the data must be sorted in memory, or a temporary structure must even be created for sorting.

Others

There are many other statuses. Look them up when you encounter them. In our experience, blocking in other states is rare, so we do not cover them here.
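When the process list is long, a quick per-state tally makes the dominant state obvious. The following Python sketch summarizes rows shaped like show processlist output (columns Id, User, Host, db, Command, Time, State, Info); the sample rows and the function names are hypothetical, and fetching the rows from the server is left out.

```python
from collections import Counter

# Column order of show processlist output.
COLUMNS = ("Id", "User", "Host", "db", "Command", "Time", "State", "Info")
STATE = COLUMNS.index("State")
INFO = COLUMNS.index("Info")


def summarize_states(process_rows):
    """Count connections per State, most common first."""
    counts = Counter((row[STATE] or "(none)") for row in process_rows)
    return counts.most_common()


def repeated_statements(process_rows, state, min_count=3):
    """Return statements that appear at least min_count times in the given state."""
    infos = Counter(r[INFO] for r in process_rows if r[STATE] == state)
    return [sql for sql, n in infos.items() if n >= min_count]


# Hypothetical sample rows for illustration only.
feed_sql = "select * from userfeed where uid = 7 order by lastlogin desc"
sample = [
    (1, "app", "10.0.0.5:50001", "game", "Query", 2, "Sending data", feed_sql),
    (2, "app", "10.0.0.5:50002", "game", "Query", 2, "Sending data", feed_sql),
    (3, "app", "10.0.0.6:50003", "game", "Query", 3, "Sending data", feed_sql),
    (4, "app", "10.0.0.6:50004", "game", "Sleep", 30, "", None),
    (5, "app", "10.0.0.7:50005", "game", "Query", 1, "Locked", "update user set sex = 1"),
]
```

For the sample rows, summarize_states(sample) lists Sending data first with 3 connections, and repeated_statements(sample, "Sending data") returns the feed query, which is the statement whose index should be examined.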

Analysis process

Basic process

Understand the problem in detail

Too many connections is a common symptom with many possible causes.

Index corruption rarely occurs in innodb.

In other cases, trace the logs and error messages.

Understand the basic load and operating status

Basic operation status

Read requests per second

Write requests per second

Current online users

Current data capacity

Basic load

Learn to use these commands

top

vmstat

uptime

iostat

df

CPU load composition

Pay special attention to I/O pressure (wa%)

Multi-core load distribution

Memory usage

Whether swap is being used

If swap is in use, check whether physical memory is still free.

Disk status

When the hard disk is full or the inodes are exhausted, locate and handle the problem quickly.
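The disk and load checks above can also be scripted. The sketch below is a minimal Python equivalent of reading df, df -i, and uptime, assuming a Unix-like system; the function names disk_report and load_report are hypothetical, and any alert thresholds are left to the reader.

```python
import os
import shutil


def disk_report(path="/"):
    """Return percent of space and inodes used on the filesystem holding path."""
    usage = shutil.disk_usage(path)
    space_pct = 100.0 * usage.used / usage.total if usage.total else 0.0
    st = os.statvfs(path)                       # Unix only
    if st.f_files:                              # some filesystems report 0 inodes
        inode_pct = 100.0 * (st.f_files - st.f_ffree) / st.f_files
    else:
        inode_pct = None
    return {"space_pct": space_pct, "inode_pct": inode_pct}


def load_report():
    """Return the 1, 5 and 15 minute load averages, as shown by uptime."""
    one, five, fifteen = os.getloadavg()        # Unix only
    return {"load1": one, "load5": five, "load15": fifteen}


report = disk_report("/")
```

Checking inode usage separately matters because a filesystem can run out of inodes while df still shows free space.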
