Optimize SQL statements in MySQL

Source: Internet
Author: User

Common misunderstandings in MySQL SQL statement optimization

Misunderstanding 1: COUNT(1) and COUNT(primary_key) are better than COUNT(*). Many people count records with COUNT(1) or COUNT(primary_key) instead of COUNT(*) because they believe it performs better. This is a misunderstanding, and in some scenarios the performance may even be worse, because the database applies specific optimizations to COUNT(*).

Misunderstanding 2: COUNT(column) and COUNT(*) are the same. This misunderstanding is common even among senior engineers and DBAs, and many people take it for granted. In fact, COUNT(column) and COUNT(*) are different operations with different meanings. COUNT(column) returns the number of rows in the result set in which that column is not NULL. COUNT(*) returns the number of rows in the result set.

Misunderstanding 3: SELECT a, b FROM ... lets the database access less data than SELECT a, b, c FROM .... This misunderstanding is widespread among developers, mainly because they are not familiar with how databases store data. Most relational databases store data by row, and data access uses a fixed-size IO unit (called a block or page), usually 4 KB, 8 KB, and so on. In most cases each IO unit holds multiple rows, and each row holds all of its fields (except special fields such as LOBs). So whether we fetch one field or several, the amount of table data the database has to access is the same. There is one exception: when the query can be answered entirely from an index. If an index contains a and b but not c, a query that needs only a and b does not have to go back to the table, whereas a query that also needs c must fetch it from the table. In that case the IO volume of the two queries differs significantly. (This is the covering index case.)

Misunderstanding 4: ORDER BY always sorts. Index data is in fact stored in order.
If the order of the data we need matches the order of an index, and the query is executed through that index, the database will generally skip the sort and return the data directly, because it knows the data already satisfies the requested order. Using indexes to serve statements that require sorting is an important optimization technique. For more information, see the articles on the implementation of MySQL ORDER BY, GROUP BY, and DISTINCT. (Related: ORDER BY NULL.)

Misunderstanding 5: "filesort" in the execution plan means the sort is done in disk files. We are not to blame for this misunderstanding; it comes from the wording the MySQL developers chose. "Using filesort" is information shown in the "Extra" column when we view an execution plan with the EXPLAIN command. It appears whenever a statement needs a sort operation, and it does not mean that the sort is performed in files on disk. Read more: "Understand the filesort in the MySQL EXPLAIN command output". A better understanding of EXPLAIN output makes it clearer which execution plan is better.

Basic principles

Principle 1: Join as few tables as possible. MySQL's strength is its simplicity, but in some respects that is also a weakness. The MySQL optimizer is efficient, but because it has a limited amount of statistics to work with, it is more likely to make poor choices. For complex multi-table joins, the optimizer is limited on the one hand, and on the other hand less effort has gone into join processing, so performance still lags somewhat behind Oracle and other established relational databases. For simple single-table queries, however, the gap is very small, and in some scenarios MySQL even outperforms those older databases.
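Misunderstandings 2, 3, and 4 above can be checked with a small experiment. The following is a minimal sketch, not MySQL itself: it uses Python's built-in sqlite3 module as a stand-in because it runs anywhere, and the table `t`, its columns, and the index `idx_a_b` are hypothetical names chosen for illustration. The COUNT semantics are standard SQL; the query plan text is SQLite's and is worded differently from MySQL's EXPLAIN output.

```python
import sqlite3

# Assumption: SQLite (in memory) stands in for MySQL; all names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER, c TEXT)")
conn.execute("CREATE INDEX idx_a_b ON t (a, b)")
conn.executemany(
    "INSERT INTO t (a, b, c) VALUES (?, ?, ?)",
    [(1, 10, "x"), (2, 20, None), (3, 30, "z"), (1, 40, None)],
)

# Misunderstanding 2: COUNT(*) counts rows; COUNT(c) counts rows where c IS NOT NULL.
count_star, count_c = conn.execute("SELECT COUNT(*), COUNT(c) FROM t").fetchone()
print(count_star, count_c)  # prints: 4 2


def plan(sql):
    """Return the human-readable 'detail' column of SQLite's EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Misunderstanding 3: a and b are both in the index, so the table is never read;
# adding c forces a lookup in the table for every matching row.
plan_ab = plan("SELECT a, b FROM t WHERE a = 1")
plan_abc = plan("SELECT a, b, c FROM t WHERE a = 1")

# Misunderstanding 4: ORDER BY in index order needs no separate sort step,
# while ORDER BY on the unindexed column c does.
plan_sorted_by_index = plan("SELECT a, b FROM t ORDER BY a, b")
plan_sorted_by_c = plan("SELECT a, b, c FROM t ORDER BY c")

for p in (plan_ab, plan_abc, plan_sorted_by_index, plan_sorted_by_c):
    print(p)
```

In SQLite the covered query reports a covering index and the unindexed sort reports a temporary B-tree. In MySQL the corresponding signals are "Using index" and "Using filesort" in the Extra column of EXPLAIN.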
Principle 2: Sort as little as possible. Sorting consumes a large amount of CPU, so when the cache hit rate is high and IO capacity is sufficient, reducing sorting has a large effect on SQL response time. In MySQL there are several ways to reduce sorting:
1. Use index-based ordering, as described under Misunderstanding 4 above.
2. Reduce the number of records that take part in the sort.
3. Do not sort data unless it is necessary.

Principle 3: Try to avoid SELECT *. Many people find this hard to accept: didn't Misunderstanding 3 just say that the number of fields in the SELECT clause does not affect the amount of data read? That is true most of the time as far as IO volume is concerned, but when the query also has an ORDER BY, the number of fields in the SELECT clause strongly affects sorting efficiency. Misunderstanding 3 also noted the exception: when the query can be answered from the index alone, selecting fewer fields greatly reduces IO. (This is the covering index case.)

Principle 4: Try to use joins instead of subqueries. Although join performance is not MySQL's strong point, joins still have a large performance advantage over MySQL subqueries. The execution plan MySQL chooses for subqueries has long been a serious problem. It has existed for many years and is present in every stable version released so far, without significant improvement. The official team acknowledged the issue long ago and promised to fix it as soon as possible, but at the time of writing no released version has solved it.

Principle 5: Use OR as little as possible. When several conditions in the WHERE clause are combined with OR, the MySQL optimizer does not handle the optimization of the execution plan well. Combined with MySQL's layered architecture, which separates the SQL layer from the storage engine layer, this results in poor performance.
In many cases, rewriting the query with UNION ALL (or UNION, when necessary) instead of OR gives better results.

Principle 6: Prefer UNION ALL to UNION. The difference between UNION and UNION ALL is that UNION must merge the two (or more) result sets and then remove duplicates. This involves sorting, adds a lot of CPU work, and increases resource consumption and latency. So when we know the result sets contain no duplicates, or we do not care about duplicates, we should use UNION ALL instead of UNION.

Principle 7: Filter as early as possible. This strategy is best known from index design (put the most selective fields first). The same principle applies when writing SQL and can be used to optimize some join statements. For example, when paging through data from several tables, it is better to filter and paginate on one table first and then join the paginated result to the other tables. This avoids as much unnecessary IO as possible and greatly reduces the time spent on it. (The optimizer can usually work out the best join order by itself, but it is only a program, and with complex multi-table joins it does not always succeed. In such cases, controlling the join order manually with STRAIGHT_JOIN can make an obvious difference.)

Principle 8: Avoid type conversion. "Type conversion" here means the conversion that takes place when the type of a column in the WHERE clause does not match the type of the value passed in. Converting column_name manually with a conversion function directly prevents MySQL (and other databases, which have the same problem) from using the index. If a conversion is required, apply it to the input value instead. If the input type and the column type differ and no explicit conversion is made, MySQL may convert the type itself, or it may pass the value to the storage engine unconverted.
As a result, the index cannot be used, and the execution plan suffers. (Note: conditions that use <> or NOT often cannot use an index either. If the data volume is large, consider rewriting the query with a NOT IN subquery or, better still, an outer join.)

Principle 9: Optimize high-concurrency SQL first, rather than low-frequency "large" SQL. In terms of destructive potential, high-concurrency SQL always outweighs low-frequency SQL. If a high-concurrency statement has a problem, the system is overwhelmed before we have any chance to react. A statement that consumes a lot of IO and responds slowly but runs infrequently will, at worst, slow the whole system down; the system can usually hold out for a while, which gives us time to respond.

Principle 10: Optimize from a global perspective, not piecemeal. SQL optimization should not be done for one statement in isolation; all the SQL in the system has to be considered together. This matters most when adjusting indexes to change an execution plan: fixing one statement must not come at the expense of others.

Run EXPLAIN on every SQL statement that runs in the database wherever possible. To know whether a statement has room for optimization, or whether its execution plan has a problem, you need to know its execution plan. After the SQL in a database has been tuned for some time, statements with obvious problems become rare, and most of the remaining ones have to be dug out. At that point a large number of EXPLAIN runs are needed to collect execution plans and decide whether optimization is required. You can also find candidates for optimization through slow query log analysis and SQL execution monitoring.
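Principles 6 and 8 can be illustrated with a short experiment. This is a minimal sketch under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the `orders` table, its columns, and the index `idx_customer` are hypothetical. The UNION semantics are standard SQL; the plan text is SQLite's, so in MySQL you would read the equivalent information from EXPLAIN.

```python
import sqlite3

# Assumption: SQLite (in memory) stands in for MySQL; all names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT)"
)
conn.execute("CREATE INDEX idx_customer ON orders (customer_id)")
conn.executemany(
    "INSERT INTO orders (customer_id, status) VALUES (?, ?)",
    [(1, "paid"), (1, "paid"), (2, "new"), (3, "paid")],
)

paid = "SELECT customer_id FROM orders WHERE status = 'paid'"
new = "SELECT customer_id FROM orders WHERE status = 'new'"

# Principle 6: UNION removes duplicates (extra work); UNION ALL just concatenates.
union_rows = conn.execute(paid + " UNION " + new).fetchall()
union_all_rows = conn.execute(paid + " UNION ALL " + new).fetchall()
print(len(union_rows), len(union_all_rows))  # prints: 3 4


def plan(sql):
    """Return the human-readable 'detail' column of SQLite's EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Principle 8: comparing the bare column lets the index be searched;
# wrapping the column in a conversion forces every row to be examined.
by_column = "SELECT id FROM orders WHERE customer_id = 2"
by_converted_column = "SELECT id FROM orders WHERE CAST(customer_id AS TEXT) = '2'"

plan_plain = plan(by_column)
plan_wrapped = plan(by_converted_column)
print(plan_plain)
print(plan_wrapped)

# Both forms return the same rows; only the access path differs.
rows_plain = conn.execute(by_column).fetchall()
rows_wrapped = conn.execute(by_converted_column).fetchall()
```

The two WHERE clauses are equivalent in result, which is why the conversion is easy to overlook. Converting the input value to the column's type, rather than the column to the input's type, keeps the index usable.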
