MySQL Performance Optimization (1)

MySQL Performance Optimization (1) _ MySQL

Last Update:2018-04-18 Source: Internet

Author: User

Tags mysql query optimization

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

This article summarizes and introduces MySQL Performance Optimization in detail. For more information, see the following code:

SELECT id, nick_name FROM user, user_group WHERE user_group.group_id = 1 and user_group.user_id = user. id order by user_group.gmt_create desc limit 100,20;

Solution 2:

SELECT user.id,user.nick_name FROM(SELECT user_idFROM user_groupWHERE user_group.group_id=1ORDER BY gmt_create desclimit 100,20)t,userWHERE t.user_id=user.id;

By comparing the execution plans of the two solutions, we can see that the number of records that need to Join with the user table in solution 1 MySQL estimates that it is 31156 through statistical data, that is, the number of records that meet group_id = 1 returned through the user_group table (the actual data in the system is 20000 ). In the execution plan of the second solution, there are only 20 pieces of data that the user table participates in the Join operation, and there is a big difference between the two. we think that the solution in the second solution should be significantly better than the first solution.

4. Impact of Schema design on system performance

Minimize database access requests.

Minimize useless data query requests.

5. Impact of hardware environment on system performance

1. typical OLTP application system

The OLTP system, which is the most common in various database system environments, features a large concurrency and a large amount of data. However, each access to the OLTP system is relatively small and the accessed data is relatively discrete, the proportion of active data to the total data is not too large. Databases of such systems are actually the most difficult to maintain and optimize, and have the highest overall performance requirements on hosts. Because not only is the traffic high, but the data volume is not small.

Based on the above features and analysis, we can draw a general direction for OLTP.

Although the total data volume of the system is large, but the proportion of the system's active data in the total data volume is not large, we can increase the memory capacity to cache as much active data as possible into the memory;

Although I/O access is very frequent, but the amount of data accessed each time is small and discrete, our requirement for disk storage is that the IOPS performance is good, and the throughput is a secondary factor;

The concurrency is high, and the number of requests processed by the CPU per second is naturally large, so the CPU processing capability needs to be strong;

Although the amount of data for each interaction with the client is not very large, but the network interaction is very frequent, so the network device for interaction between the host and the client must not be too weak in traffic capabilities.

2. typical OLAP application systems

The main feature of the OLAP system for data analysis is that the data volume is very large and there are not many concurrent accesses. However, each access requires a large amount of data to be retrieved, and the data access is relatively concentrated, there is no obvious concept of active data.

Based on the characteristics of the OLAP system and the corresponding analysis, the general strategy for OLAP system hardware optimization is as follows:

The data volume is very large, so the unit capacity of the disk storage system should be as large as possible;

If the data volume for a single access is large and the access data is concentrated, the performance requirement for the IO system should be as high as the IO throughput per second. Therefore, the disk with the maximum throughput per second should be selected;

Although I/O performance requirements are relatively high, but the number of concurrent requests is small, so the CPU processing capability is more difficult to become a performance bottleneck, so the CPU processing capability is not too demanding;

Although each request has a large access volume, most of the data in the execution process is not returned to the client, and the data returned to the client is small, therefore, the requirement for network devices to interact with clients is not too high;

In addition, because the OLAP system has a long operation process and can be parallel, the OLAP system is generally a cluster composed of multiple hosts, in general, the amount of data interaction between hosts in the cluster is very large, so the network equipment between hosts in the cluster is highly required.

3. in addition to the preceding two typical applications, there is also a special type of application system. their data volume is not very large, but they are frequently accessed, and most of them are read requests. Tens of thousands or even tens of thousands of requests may be required per second. each request is very simple. most requests may only return one or several small Records, for example, the database-based DNS service is such a type of service.

Although the data size is small, but access is extremely frequent, you can use a large memory to cache most of the data, which can ensure a very high hit rate, the disk IO volume is relatively small, therefore, disks do not require high performance;

The number of concurrent requests is very frequent, which requires a strong CPU processing capability;

Although the amount of interaction between applications and databases is very large, there is little interaction data each time, and the overall traffic is also large, but generally the general Gigabit Nic is enough.

V. MySQL locking mechanism

Row-level locking)

Table-level locking)

Page-level locking)

In MySQL databases, table-level locking mainly involves some non-transactional storage engines such as MyISAM, Memory, and CSV, while row-level locking mainly involves the Innodb storage engine and NDBCluster storage engine, page-level locking mainly refers to the locking method of the BerkeleyDB storage engine.

VI. MySQL Query optimization

The optimization ideas and principles of Query statements are described in the following aspects:

1. optimize the Query that needs optimization;

2. locate the performance bottleneck of the optimization object;

3. clear optimization objectives;

4. start with Explain;

5. use profile more

6. always use a small result set to drive a large result set;

7. complete sorting in the index as much as possible;

8. retrieve only the Columns you need;

9. use only the most effective filter conditions;

10. avoid complex Join and subqueries as much as possible;

Rational design and use of indexes

1) B-Tree index

Generally, the physical files of the B-Tree index in MySQL are mostly stored in the BalanceTree structure, that is, all the actual data is stored in the LeafNode of the Tree, in addition, the shortest path length to any LeafNode is exactly the same, so we all call it a B-Tree index. of course, there may be various databases (or various storage engines of MySQL) when you store your own B-Tree indexes, the storage structure is slightly modified. For example, the actual storage structure used by the B-Tree index of the Innodb storage engine is actually B + Tree, that is, a small transformation has been made on the basis of the B-Tree data structure, in addition to information related to the index key, each LeafNode also stores pointer information pointing to the next LeafNode adjacent to the LeafNode, this is mainly to accelerate the efficiency of retrieving multiple adjacent leafnodes.

2) Hash index

Not many Hash indexes are used in MySQL. Currently, they are mainly used by the Memory storage engine, and the Memory storage engine uses the Hash index as the default index type. The so-called Hash index actually uses a certain Hash algorithm to Hash the key values to be indexed, and then stores the obtained Hash values in a Hash table. Then, each time the search is required, the search conditions are Hash calculated based on the same algorithm, and then compared with the Hash value in the Hash table to obtain the corresponding information.

Hash indexes can only meet "=", "IN", and "<=>" queries, but cannot use range queries;

Hash indexes cannot be used to avoid data sorting;

Hash indexes cannot be queried using some index keys;

Hash indexes cannot avoid table scanning at any time;

When the Hash index encounters a large number of equal Hash values, the performance is not necessarily higher than the B-Tree index;

3) Full-text index

Full-text indexes are also the Full-text indexes we often call. Currently, only the MyISAM storage engine supports Full-text indexes in MySQL, and not all data types support Full-text indexes. Currently, Full-TEXT indexes can be created for columns with only CHAR, VARCHAR, and text data types.

Indexes can greatly improve the efficiency of data retrieval and improve the performance of sorting and grouping operations. However, we cannot ignore the problem that indexes are completely independent of a part of data except basic data, updating data results in IO volume and resource consumption caused by index adjustment.

Whether to create an index or not. several principles are as follows: you should create an index for fields that are frequently used as query conditions; fields with poor uniqueness cannot be created separately, even if they are frequently used as query conditions; frequently updated fields are not suitable for index creation;

Fields that do not appear in the WHERE clause should not be indexed;

Optimization of Join statements

Minimize the total number of NestedLoop cycles in the Join statement. "Always use a small result set to drive a large result set ".

The inner loop of NestedLoop is optimized first;

Make sure that the Join condition field in the drive table has been indexed;

When the Join condition field of the driven table cannot be indexed and the memory resources are sufficient, do not ignore JoinBuffer settings too much;

Order by, group by, and DISTINCT optimization

1) implementation and optimization of ORDER

When optimizing order by in a Query statement, we try to use existing indexes to avoid actual sorting calculation, which can greatly improve the performance of order by operations.

Optimized sorting:

1. set the max_length_for_sort_data parameter;

2. remove unnecessary returned fields;

3. increase the sort_buffer_size parameter settings;

2) implementation and optimization of GROUP

Group by actually requires sorting operations, and compared with order by, group by mainly only requires grouping operations after sorting. Of course, if other aggregate functions are used during grouping, some aggregate functions are required for calculation. Therefore, in the implementation of group by, indexes can also be used like order.

3) implementation and optimization of DISTINCT

DISTINCT is actually very similar to the group by operation, except that only one record is retrieved in each GROUP after group. Therefore, the implementation of DISTINCT is similar to that of group by, and there is no big difference. It can also be achieved through loose index scanning or compact index scanning. of course, MySQL can only complete DISTINCT by using a temporary table when the index cannot be used only. However, the difference from group by is that DISTINCT does not need to be sorted. That is to say, MySQL uses a temporary table to "cache" data when only the Query operated by DISTINCT cannot complete operations by using indexes ", but does not perform the filesort operation on the data in the temporary table.

The above is all the content of this article, hoping to help you learn.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More